Inaccurate Design

SANS GWAPT Day 1: Reconnaissance

Monday, 13 May 2013

Recon is an important part of web application testing. If you get the recon wrong, you could miss something, or end up pentesting the wrong website/application/company, especially if they use shared hosting services!

  • Be sure to clarify scope with the client
  • Don’t skimp on the recon.

Light Information Gathering

  • Search Google, or use an automated tool such as FOCA (the tool is in English, the website is not), to grab documents from public websites. Use both the metadata and the content of these documents for recon.

  • Social networks, including Facebook, Twitter and LinkedIn, are great for researching specific targets within an organisation. However, be wary of sites such as LinkedIn that reveal who viewed your profile.

  • ‘Google Hacking’ can be very effective: that is, searching for content exposed to Google that should not be by default. Resources such as the Google Hacking Database on Exploit-DB have collections of search terms that can be mightily useful in uncovering this content.

    • Sidenote: you can use the search terms above in conjunction with Google Alerts to monitor your own websites (or clients’ sites) for anything unexpected.
  • gpscan is a good tool for scraping the Google Profiles of people at the target company (although I’m unsure how this fits in with Google+ profiles now).
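
As a quick sketch of the ‘Google Hacking’ idea above, the snippet below builds document-hunting queries for a target domain. The domain and the filetype list are placeholder assumptions, not search terms from the course or the Google Hacking Database:

```python
# Build Google dork queries for document discovery against a target domain.
# DOC_TYPES is an illustrative list, not an exhaustive one.
DOC_TYPES = ["pdf", "doc", "docx", "xls", "xlsx", "ppt"]

def doc_dorks(domain):
    """Return one 'site:' query per document type worth mining for metadata."""
    return [f"site:{domain} filetype:{ext}" for ext in DOC_TYPES]

for query in doc_dorks("example.com"):
    print(query)
```

Each resulting query (e.g. `site:example.com filetype:pdf`) can be pasted into Google, or into a Google Alert for ongoing monitoring.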

Technical Recon

  • Investigate DNS records, whois records and IP addresses. To help automate this you can use a tool such as Fierce Domain Scan, or do it manually with dig, host and nslookup.

  • Use tools such as sslthing or THCSSLCheck to get the SSL info for exposed hosts.

  • Explore IP addresses to find all possible sites (including Virtual Hosts - although there is no foolproof way to do this)

    • Reverse DNS lookups against the IP address/Zone transfers

    • Bing allows you to search for sites by IP address (e.g. with the ip: search operator)

  • Investigate if any load balancers are in place and how they affect your scope/testing targets

    • The Date and Last-Modified headers can be useful: if the servers behind a site are not perfectly in sync, the header values will jump around between requests, revealing that multiple servers are present.

    • Have a look at cookies. Some load balancers will include cookies that enable them to keep track of the session state, which can reveal their presence.

  • Confirm the identified servers with the client where possible, to ensure that they’re all in scope and you’re not testing the wrong boxes
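
The header-drift trick above can be checked programmatically. This is a minimal sketch: the Date header samples are fabricated, and a backwards jump is only a hint of a load balancer, not proof:

```python
# Heuristic load-balancer check: if repeated requests return Date values
# that jump backwards in time, more than one (out-of-sync) backend server
# is probably answering.
from email.utils import parsedate_to_datetime

def detect_clock_skew(date_headers):
    """Return True if the Date header ever moves backwards across requests."""
    times = [parsedate_to_datetime(h) for h in date_headers]
    return any(later < earlier for earlier, later in zip(times, times[1:]))

samples = [
    "Mon, 13 May 2013 10:00:05 GMT",
    "Mon, 13 May 2013 10:00:06 GMT",
    "Mon, 13 May 2013 10:00:02 GMT",  # earlier than the previous reply
]
print(detect_clock_skew(samples))  # prints True
```

In practice you would collect the headers from repeated requests to the same URL and feed them in.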

Webserver Configuration

  • Can use software such as Nikto

    • Nikto can output to an HTML file with links to OSVDB pages for the vulnerabilities it finds
  • HTTP methods supported by the web server (GET, POST, PUT, DELETE, etc). These can reveal edge cases in software that may not be accounted for otherwise

  • Check server software versions (web server, SSL, databases, PHP, modules, etc). Out of date versions can have vulnerabilities that can be exploited

    • You can use Nmap to do port scanning and service fingerprinting


  • Check ‘robots.txt’ to see what URLs are accessible publicly, but not available to be indexed by search engines

  • OWASP ZAP is a good spidering tool

    • Also see Burp and wget (with the -r, -l and -e robots=off flags)
  • CeWL can be used to spider websites and generate custom word lists for password attacks. You can then use another tool such as John the Ripper to mangle these words for ‘targeted’ bruteforcing (if there is such a thing).

  • Read through HTML pages and see if there are any comments that can reveal functionality, or code (such as JavaScript) that can be enabled or changed to work for us.

    • Burp Professional has an ‘Engagement tools’ function that can go through a spidered site to grab comments or interesting bits of information
  • Have a look at the HTML code on the pages - through changes in code style or comments, you might be able to spot whether pages have been developed by different developers, or generated by different applications.

  • Spidering tools only follow existing links on pages. Be sure to check the website code for JavaScript or CSS includes that point to other directories.

    • Additionally, check for common files such as ‘users.txt’ and ‘test.php’.

    • Tools such as OWASP Dirbuster are able to bruteforce common directories and files.
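
The CeWL-into-John workflow above can be approximated in a few lines of Python. The mangling rules and the seed word below are illustrative assumptions only - John the Ripper's real rule language is far richer:

```python
# A few John-the-Ripper-style mangling rules applied to one word from a
# CeWL-derived list: case changes, common suffixes, and basic leetspeak.
def mangle(word):
    """Return simple case/suffix/leet variations of one word, sorted."""
    variants = {word, word.capitalize(), word.upper()}
    variants |= {v + suffix for v in list(variants) for suffix in ("1", "123", "!")}
    variants |= {v.replace("a", "@").replace("o", "0") for v in list(variants)}
    return sorted(variants)

print(mangle("acme")[:5])
```

Mapping `mangle` over a whole CeWL output file would give a small, site-specific candidate list for the bruteforcing described above.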

Business Logic Recon

  • Have a think about the function of the site you’re testing, and about what the aim of an attacker would be. This is something a scanner will not be able to figure out.

    • For example, in a shopping cart application, are you able to add another product to the cart (using another tab) between when you’ve authorised the payment, and when you confirm the order?
  • Analyse the parameters in the URL to find functionality that may not be exposed by default. A given URL might have the parameter ‘action=addUser’; there might be a corresponding function ‘action=delUser’.
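
The ‘action=addUser’ guessing described above can be sketched like so; the prefix/complement table and the sample URL are assumptions for illustration, not a standard mapping:

```python
# Derive candidate hidden actions from a discovered URL parameter value,
# e.g. action=addUser -> delUser, removeUser, editUser, listUser.
from urllib.parse import urlsplit, parse_qs

COMPLEMENTS = {"add": ["del", "remove", "edit", "list"], "get": ["set"]}

def guess_actions(url):
    """Return candidate complementary values for the 'action' parameter."""
    action = parse_qs(urlsplit(url).query).get("action", [""])[0]
    for prefix, alternates in COMPLEMENTS.items():
        if action.startswith(prefix):
            rest = action[len(prefix):]
            return [alt + rest for alt in alternates]
    return []

print(guess_actions("https://example.com/admin?action=addUser"))
```

Each guessed value would then be requested (in scope, with permission) to see whether the server implements it.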

Session Tokens

  • Can be stored and transmitted a number of ways, including cookies, hidden form fields, or the URL

  • See if the session tokens are predictable - try logging in and out multiple times to get a sample

    • Remember that on a production website you might have other users logging in at the same time, throwing off your count.

    • Some frameworks will randomize the session IDs

    • Can be automated with Burp, using the ‘Repeater’ function to send multiple GET requests to simulate visiting the site multiple times, or you can custom-code a quick Python script

    • Burp also has a ‘Sequencer’ function that can be used to analyse session tokens and try to figure out a pattern.
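
A quick Python script of the kind mentioned above might start by checking which character positions never vary across sampled tokens. The sample tokens here are made up for illustration:

```python
# Per-character variability check over a sample of session tokens:
# positions that never change across logins suggest fixed structure
# rather than randomness.
def fixed_positions(tokens):
    """Return the indices where every sampled token has the same character."""
    return [i for i, chars in enumerate(zip(*tokens)) if len(set(chars)) == 1]

sample = ["AB1053", "AB1187", "AB1242"]
print(fixed_positions(sample))  # prints [0, 1, 2]: a static prefix hints at a pattern
```

A larger sample (remembering the concurrent-users caveat above) would make the result more trustworthy; Burp’s Sequencer performs a much more thorough statistical version of this analysis.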