Ethical Hacking - Automating Information Gathering - Perficient Blogs

Ethical Hacking – Automating Information Gathering

As consultants, we are constantly looking for gaps in industries that rely on technology. One of the most common gaps we see is in information security. Unfortunately, information security is often ignored and not taken seriously until after an incident occurs. To help fill this gap, I am studying for the Offensive Security Certified Professional (OSCP) certification, an extremely challenging hands-on ethical hacking certification. A major part of penetration testing (often called pentesting) is the information gathering step.

My original pentest methodology was essentially to scan all the ports on a server and then look through the results. Ports are like gates on a device: software needs certain ports to be open in order to communicate with other machines. The purpose of a port scan is to identify what software is running on a target machine, then look for misconfigurations or vulnerabilities in that software. While gathering information on more than 50 machines in the lab environment, I realized I was wasting a lot of time filtering through multiple scan result files.
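At its simplest, checking whether a port is open just means attempting a TCP connection to it. Here is a minimal sketch of that idea in Python using only the standard library (the host and port list are placeholders, not part of the actual tool):

```python
import socket

def check_port(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (port is open)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable: treat as closed.
        return False

# Illustrative: check a few well-known ports on the local machine.
common_ports = {22: "ssh", 80: "http", 443: "https"}
open_ports = [p for p in common_ports if check_port("127.0.0.1", p)]
```

Real scanners like nmap are far faster and stealthier than a plain connect loop, but this is the core operation they build on.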

To save time during my labs, I simplified my methodology by first running a scan to see whether the target was online and reachable. If the target device was online, I would execute another scan to check the top 100 or 200 most commonly used ports. When trying to hack into a specific device, I found it helpful to perform a full port scan (TCP and UDP, ports 1-65535), which takes a very long time. The reason for performing a full port scan is to identify every open port on the target. In labs and on enterprise networks, software is often configured to run on unexpected port numbers, which makes vulnerabilities on the machine harder to detect and predict. Now imagine running a full port scan on a group of machines simultaneously without first establishing that they were online; that adds up to hours of wasted time. Unfortunately, I learned this the hard way: I made the rookie mistake of running a full port scan on multiple machines within my first day in the labs. My scans ran for hours, and most of the devices I was checking were not even online. Luckily, I learned from my mistake, took thorough notes, and focused on process improvement.
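The phased approach above can be sketched as three small helpers: a liveness check, a quick top-ports scan, and a full scan reserved for targets that deserve it. This is a simplified illustration, not the tool's actual code; it assumes a Unix-like `ping` (with `-c` and `-W` flags) and an installed `nmap`:

```python
import subprocess

def host_is_up(target: str) -> bool:
    """Phase 1: one ICMP echo request to see whether the target responds.
    Assumes a Linux-style ping (-c count, -W timeout in seconds)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", target],
        capture_output=True,
    )
    return result.returncode == 0

def top_ports_cmd(target: str, n: int = 100) -> list:
    """Phase 2: nmap command for the n most commonly used ports."""
    return ["nmap", "--top-ports", str(n), target]

def full_scan_cmd(target: str) -> list:
    """Last resort: full TCP SYN + UDP scan of ports 1-65535 (very slow)."""
    return ["nmap", "-sS", "-sU", "-p", "1-65535", target]
```

Only targets that pass `host_is_up` ever reach the expensive full scan, which is exactly the lesson from the wasted hours described above. (Note that some hosts drop ICMP, so a real tool would fall back to a TCP-based liveness probe.)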

With my improved methodology, I was also able to develop an information gathering tool that heavily reduces the time spent scanning and parsing results. Those of you who are familiar with pentesting tools are probably aware that Metasploit has a feature similar to nmap. Because the use of Metasploit is restricted during the OSCP exam, I was encouraged to develop alternative methods of performing reconnaissance on the lab network. I used Python to run nmap and return all of the results in JSON format. Storing the results as JSON makes them easy to search, and they can be uploaded to a NoSQL database such as CouchDB. With CouchDB I can upload all of my results and use them as metrics: I can view scans over time, compare results, collaborate with coworkers, and quickly filter results to identify potential vulnerabilities on a target. When all is said and done, this new tool provides far more information than Metasploit's nmap feature, making it well suited to customized information gathering and metrics.
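One way to get nmap results into JSON is to ask nmap for XML output (`-oX -` writes XML to stdout) and convert it in Python before uploading to CouchDB over its HTTP API. The sketch below is my own illustration of that pipeline, not the actual recon tool; the document shape and the local CouchDB URL are assumptions:

```python
import json
import subprocess
import urllib.request
import xml.etree.ElementTree as ET

def nmap_xml_to_json(xml_text: str) -> dict:
    """Convert nmap's -oX XML output into a dict of host -> open ports."""
    root = ET.fromstring(xml_text)
    results = {}
    for host in root.iter("host"):
        addr = host.find("address").get("addr")
        results[addr] = [
            {
                "port": p.get("portid"),
                "proto": p.get("protocol"),
                "service": (p.find("service").get("name")
                            if p.find("service") is not None else None),
            }
            for p in host.iter("port")
            if p.find("state") is not None
            and p.find("state").get("state") == "open"
        ]
    return results

def scan_to_couchdb(target: str, db_url: str, doc_id: str) -> None:
    """Run nmap, convert the XML to JSON, and PUT it into CouchDB as a document.
    db_url is a hypothetical local instance, e.g. http://127.0.0.1:5984/recon."""
    xml_out = subprocess.run(["nmap", "-oX", "-", target],
                             capture_output=True, text=True).stdout
    doc = json.dumps(nmap_xml_to_json(xml_out)).encode()
    req = urllib.request.Request(f"{db_url}/{doc_id}", data=doc,
                                 headers={"Content-Type": "application/json"},
                                 method="PUT")
    urllib.request.urlopen(req)
```

CouchDB stores each scan as a plain JSON document, so no schema is needed up front, and every historical scan remains queryable later.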

This is a prime example of how a good understanding of a process helps when implementing automation. Now that I have a basic information gathering tool, I can build an automation framework to do things like automatically look up vulnerabilities, identify recommended fixes, and find known exploits for a vulnerability. This type of research is what pentesters spend the most time on.
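As one example of what that exploit-lookup step could look like, a tool might shell out to `searchsploit` (the local Exploit-DB search utility on Kali). This is a hedged sketch: it assumes searchsploit is installed and supports a `-j` JSON output flag, and the `RESULTS_EXPLOIT`/`Title`/`Path` key names are assumptions based on recent Exploit-DB releases rather than anything from the article:

```python
import json
import subprocess

def parse_searchsploit(raw: str) -> list:
    """Extract title and path from searchsploit's JSON output
    (key names are assumptions, not guaranteed by the article)."""
    data = json.loads(raw)
    return [{"title": e.get("Title"), "path": e.get("Path")}
            for e in data.get("RESULTS_EXPLOIT", [])]

def find_exploits(service: str, version: str) -> list:
    """Query the local Exploit-DB mirror for a service/version string,
    e.g. find_exploits("vsftpd", "2.3.4")."""
    result = subprocess.run(["searchsploit", "-j", f"{service} {version}"],
                            capture_output=True, text=True)
    return parse_searchsploit(result.stdout)
```

Feeding the service names and versions discovered by the scanner straight into a lookup like this is exactly the kind of research automation the paragraph above describes.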

Here are some screenshots of how the recon tool works.


The scanning framework supports different target address range specifications, several nmap scan types (including custom), an output directory, and a database name for storing results.
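A command-line interface with those options could be wired up with `argparse`; the flag names and defaults below are illustrative guesses at the tool's interface, not its actual code:

```python
import argparse
from datetime import date

def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the options described above (names are hypothetical)."""
    parser = argparse.ArgumentParser(description="nmap recon wrapper")
    parser.add_argument("targets",
                        help="target spec, e.g. 10.0.0.0/24 or 10.0.0.1-50")
    parser.add_argument("--scan-type", default="top100",
                        choices=["ping", "top100", "top200", "full", "custom"])
    parser.add_argument("--nmap-args", default="",
                        help="extra nmap flags when --scan-type=custom")
    parser.add_argument("--output-dir", default="./scans")
    parser.add_argument("--database", default=date.today().isoformat(),
                        help="CouchDB database name (defaults to today's date)")
    return parser
```

Defaulting the database name to today's date matches the behavior shown in the screenshots: each day's scans land in their own database unless you say otherwise.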


This view shows the local databases that store our results. The database name defaults to today's date if not specified, making it easy to run new scans and store the results.


This is a view of a database and all of the documents it contains. Clicking a document takes you to its JSON view.


This is the JSON view. These are the scan results, which you can filter and query.
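Querying those documents can be done through CouchDB's Mango `/_find` endpoint (available since CouchDB 2.0). The sketch below assumes each scan document stores an array field named `ports`; that field name is an assumption about the document shape, not taken from the screenshots:

```python
import json
import urllib.request

def port_selector(port: str) -> dict:
    """Mango selector matching docs whose ports array contains this port."""
    return {"selector": {"ports": {"$elemMatch": {"port": port}}}}

def find_hosts_with_port(db_url: str, port: str) -> list:
    """POST a Mango query to CouchDB's /_find endpoint and return matches.
    db_url is a hypothetical instance, e.g. http://127.0.0.1:5984/recon."""
    req = urllib.request.Request(
        f"{db_url}/_find",
        data=json.dumps(port_selector(port)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["docs"]
```

A query like this answers questions such as "which lab machines expose SMB on port 445?" in one request, instead of grepping through a pile of nmap output files.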

Hopefully this tool highlights how daily challenges can be resolved with a little process improvement. A big part of process improvement and innovation is being creative: review the information you have available and brainstorm how it can be used to simplify your tasks. The recon tool described in this article is still being developed and is by no means perfect or complete. I look forward to sharing this tool with the community and seeing what it grows into. Please make sure you have permission before running port scans against any target you do not personally own; I am not liable for how the tool is used. The recon tool can be found on my GitHub: .
