
Web Reconnaissance

HTTP Headers Reference

Request Headers (Useful for Recon)

| Header | Description |
| --- | --- |
| Host | Target hostname - useful for vhost discovery |
| User-Agent | Client identifier - spoofable to bypass restrictions |
| Cookie | Session tokens, auth data |
| Authorization | Basic/Bearer tokens |
| Referer | Previous page - spoofable for access bypasses |
| X-Forwarded-For | Client IP (via proxy) - spoofable for IP bypasses |
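As a rough illustration of how these request headers can be set during recon, the sketch below builds (but does not send) a request with spoofed values using only the Python standard library. The function names, the IP address, and the hostname are hypothetical placeholders, not part of any tool.

```python
import urllib.request


def build_recon_request(url, vhost=None, user_agent=None,
                        referer=None, forwarded_for=None):
    """Return an unsent Request carrying the chosen recon headers."""
    req = urllib.request.Request(url, method="GET")
    if vhost:
        # A custom Host header lets you probe virtual hosts on one IP.
        req.add_header("Host", vhost)
    if user_agent:
        req.add_header("User-Agent", user_agent)
    if referer:
        req.add_header("Referer", referer)
    if forwarded_for:
        req.add_header("X-Forwarded-For", forwarded_for)
    return req


def header_map(req):
    """Lower-cased view of the request headers, for inspection."""
    return {name.lower(): value for name, value in req.header_items()}


# Hypothetical target: an IP that may serve several virtual hosts.
request = build_recon_request(
    "http://10.10.10.10/",
    vhost="dev.example.com",
    user_agent="Mozilla/5.0",
    forwarded_for="127.0.0.1",
)
print(header_map(request))
```

Passing the returned object to `urllib.request.urlopen(request, timeout=10)` would actually send it; only do that against targets you are authorized to test.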

Response Headers (Look For)

| Header | Description |
| --- | --- |
| Server | Web server software & version |
| X-Powered-By | Backend technology (PHP, ASP.NET, Express) |
| X-Redirect-By | CMS identification (WordPress, Drupal) |
| Set-Cookie | Session handling, flags (Secure, HttpOnly) |
| Content-Type | Response type and encoding |
| WWW-Authenticate | Auth mechanism required |

Security Headers (Missing = Potential Issue)

| Header | Description |
| --- | --- |
| Content-Security-Policy | XSS prevention - script sources |
| X-Frame-Options | Clickjacking protection |
| Strict-Transport-Security | HTTPS enforcement (HSTS) |
| X-Content-Type-Options | Prevent MIME sniffing |
| Referrer-Policy | Control referrer information |
| Permissions-Policy | Feature restrictions |
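A minimal sketch of checking a response for the security headers listed above. It assumes the response headers are already available as a dictionary; the function name and the sample values are illustrative only.

```python
# Security headers from the table above.
SECURITY_HEADERS = [
    "Content-Security-Policy",
    "X-Frame-Options",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Permissions-Policy",
]


def missing_security_headers(response_headers):
    """Return the security headers that are absent from a response.

    Header names are compared case-insensitively, as HTTP requires.
    """
    present = {name.lower() for name in response_headers}
    return [h for h in SECURITY_HEADERS if h.lower() not in present]


# Sample headers (made up for illustration).
sample = {
    "Server": "nginx/1.18.0",
    "x-frame-options": "DENY",
    "Strict-Transport-Security": "max-age=31536000",
}
print(missing_security_headers(sample))
```

A missing header is only a potential issue: whether it matters depends on what the application serves.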


Web Fingerprinting

Identify Server & Tech Stack (curl)

Look for:

  • Server: header (e.g., Apache/2.4.41, nginx/1.18.0)

  • X-Powered-By: header (e.g., PHP/7.4, Express)

  • X-Redirect-By: header (e.g., WordPress)
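The headers above can be pulled out of raw response text, for example the output of `curl -I https://example.com`. The following is a small sketch under that assumption; the function name and the sample response are hypothetical.

```python
# Headers that tend to reveal the server and tech stack.
FINGERPRINT_HEADERS = ("server", "x-powered-by", "x-redirect-by")


def fingerprint_from_raw(raw_headers):
    """Extract fingerprinting headers from raw HTTP response headers."""
    found = {}
    for line in raw_headers.splitlines():
        # Split on the first colon only; values may contain colons.
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        if sep and key in FINGERPRINT_HEADERS:
            found[key] = value.strip()
    return found


# Sample response headers (made up for illustration).
raw = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Server: Apache/2.4.41 (Ubuntu)\r\n"
    "X-Redirect-By: WordPress\r\n"
    "Location: https://example.com/\r\n"
)
print(fingerprint_from_raw(raw))
```

Status lines such as `HTTP/1.1 301 Moved Permanently` contain no colon and are skipped.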


WAF Detection (wafw00f)


Nikto Fingerprinting


robots.txt Analysis

Key Directives

| Directive | Description |
| --- | --- |
| Disallow: | Paths the bot shouldn't crawl (interesting for recon!) |
| Allow: | Explicitly permitted paths |
| Sitemap: | URL to sitemap.xml |
| Crawl-delay: | Seconds between requests |

Recon Value: Disallowed paths often reveal admin panels, backup directories, or sensitive endpoints.
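To make the directives concrete, here is a minimal sketch that groups the values in a robots.txt body by directive so the disallowed paths can be reviewed. It assumes the file content has already been downloaded; the function name and sample file are illustrative.

```python
def parse_robots(text):
    """Collect robots.txt directive values, grouped by directive name."""
    result = {"disallow": [], "allow": [], "sitemap": [], "crawl-delay": []}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        # Split on the first colon so sitemap URLs stay intact.
        key, value = line.split(":", 1)
        key, value = key.strip().lower(), value.strip()
        if key in result and value:
            result[key].append(value)
    return result


# Sample robots.txt (made up for illustration).
sample = """User-agent: *
Disallow: /admin/
Disallow: /backup/  # old dumps
Allow: /public/
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
"""
print(parse_robots(sample)["disallow"])
```

This sketch ignores which `User-agent` group a rule belongs to, which is usually fine for recon since every listed path is of interest.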


.well-known URIs

Useful .well-known URIs

| URI | Description |
| --- | --- |
| security.txt | Security contact info |
| openid-configuration | OAuth2/OIDC endpoints, supported scopes |
| assetlinks.json | Android app asset links |
| mta-sts.txt | Email MTA-STS policy |

Full list: https://www.iana.org/assignments/well-known-uris/well-known-uris.xhtml
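A short sketch that turns the URIs above into full URLs to request for a given domain. The function name is hypothetical. One detail worth noting: per RFC 8461, the MTA-STS policy is served from the `mta-sts` subdomain rather than from the main host.

```python
# Well-known URIs served from the site itself.
WELL_KNOWN_NAMES = ["security.txt", "openid-configuration", "assetlinks.json"]


def well_known_urls(domain):
    """Build candidate .well-known URLs for a domain."""
    urls = [f"https://{domain}/.well-known/{name}" for name in WELL_KNOWN_NAMES]
    # RFC 8461: the MTA-STS policy lives on the mta-sts.<domain> host.
    urls.append(f"https://mta-sts.{domain}/.well-known/mta-sts.txt")
    return urls


for url in well_known_urls("example.com"):
    print(url)
```

Each URL can then be fetched with any HTTP client; a 404 simply means the site does not publish that resource.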


Google Dorking

Operators

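| Operator | Description | Example |
| --- | --- | --- |
| site: | Limit to domain | site:example.com |
| inurl: | Term in URL | inurl:admin |
| intitle: | Term in page title | intitle:"index of" |
| filetype: | Specific file type | filetype:pdf |
| ext: | File extension | ext:conf |
| intext: | Term in page body | intext:password |
| cache: | Cached version (operator retired by Google in 2024) | cache:example.com |
| link: | Pages linking to URL (deprecated by Google, results unreliable) | link:example.com |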

Common Dorks
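The operators combine into a single query by separating them with spaces. The sketch below composes a few illustrative dorks from the operators in the table and turns them into search URLs. The helper names and the specific combinations are examples, not a canonical list.

```python
from urllib.parse import quote_plus


def build_dork(operators, keywords=""):
    """Join operator:value pairs (and optional keywords) into one query."""
    parts = []
    for op, value in operators.items():
        if " " in value:
            value = f'"{value}"'  # quote multi-word values
        parts.append(f"{op}:{value}")
    if keywords:
        parts.append(keywords)
    return " ".join(parts)


def search_url(query):
    """URL-encode a query for use in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Illustrative combinations against a placeholder domain.
examples = [
    {"site": "example.com", "inurl": "admin"},
    {"site": "example.com", "filetype": "pdf"},
    {"site": "example.com", "intitle": "index of"},
    {"site": "example.com", "ext": "conf"},
]
for ops in examples:
    query = build_dork(ops)
    print(query, "->", search_url(query))
```

Automated querying of Google is rate-limited and may violate its terms of service, so the URLs are best opened manually.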

Google Hacking Database

https://www.exploit-db.com/google-hacking-database


Wayback Machine

Web Interface

https://web.archive.org/

Enter URL to view historical snapshots.

Command Line
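One way to query the archive from a script is the Wayback Machine's CDX API, which lists captured URLs for a domain. The sketch below builds the query URL and parses a JSON response; the function names and sample data are assumptions for illustration, and the network fetch itself is left to the caller.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, limit=50):
    """Build a CDX query listing unique archived URLs under a domain."""
    params = {
        "url": f"{domain}/*",        # trailing wildcard = prefix match
        "output": "json",
        "fl": "timestamp,original",  # fields to return
        "collapse": "urlkey",        # one row per unique URL
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(cdx_json_text):
    """Turn a CDX JSON response into browsable snapshot URLs."""
    rows = json.loads(cdx_json_text)
    if not rows:
        return []
    header, *records = rows  # the first row holds the field names
    ts, orig = header.index("timestamp"), header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[orig]}" for r in records]


print(cdx_query_url("example.com", limit=5))

# Sample response body (made up for illustration).
sample = '[["timestamp","original"],["20200101000000","http://example.com/old-admin/"]]'
print(snapshot_urls(sample))
```

Fetching the query URL with `urllib.request.urlopen` and passing the decoded body to `snapshot_urls` completes the loop without touching the target itself.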

Recon Value

  • Find old pages, endpoints, files no longer linked

  • Discover removed sensitive content

  • Track tech stack changes over time

  • Passive (no direct target interaction)


Web Crawling

Burp Suite Spider

  1. Set scope to target domain

  2. Right-click target in Site Map → Spider this host

  3. Review discovered endpoints in Site Map

OWASP ZAP Spider

  1. Set target URL

  2. Right-click → Attack → Spider

  3. Review Sites tree for discovered content

ReconSpider (Custom Tool)

Output in results.json:

  • Emails, links, external files

  • JavaScript files

  • Form fields

  • Images, videos, audio

  • HTML comments
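To show how a crawler can collect categories like these from a single page, here is a standard-library sketch. It is not ReconSpider's implementation; the class name, result keys, and sample HTML are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ReconParser(HTMLParser):
    """Collect links, scripts, form fields, images, comments, and emails."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.results = {"links": set(), "js_files": set(), "form_fields": set(),
                        "images": set(), "emails": set(), "comments": []}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            href = attrs["href"]
            if href.startswith("mailto:"):
                # Strip the scheme and any ?subject=... suffix.
                self.results["emails"].add(href[len("mailto:"):].split("?")[0])
            else:
                self.results["links"].add(urljoin(self.base_url, href))
        elif tag == "script" and attrs.get("src"):
            self.results["js_files"].add(urljoin(self.base_url, attrs["src"]))
        elif tag in ("input", "select", "textarea") and attrs.get("name"):
            self.results["form_fields"].add(attrs["name"])
        elif tag == "img" and attrs.get("src"):
            self.results["images"].add(urljoin(self.base_url, attrs["src"]))

    def handle_data(self, data):
        self.results["emails"].update(EMAIL_RE.findall(data))

    def handle_comment(self, data):
        self.results["comments"].append(data.strip())


def extract(html, base_url):
    """Parse one HTML document and return the collected findings."""
    parser = ReconParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.results


# Sample page (made up for illustration).
page = ('<!-- TODO: remove /old-admin --><a href="/login">Login</a>'
        '<script src="/static/app.js"></script>'
        '<form><input name="username"></form>'
        '<p>Contact: support@example.com</p>')
print(extract(page, "https://example.com/"))
```

A full crawler would feed each discovered link back into a queue, restricted to in-scope hosts, and serialize the results to JSON.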

Scrapy (Python)

Custom spider for large-scale crawling.


Automation Frameworks

| Tool | Description |
| --- | --- |
| FinalRecon | Python-based, modular (SSL, WHOIS, headers, crawl) |
| Recon-ng | Framework with modules for DNS, subdomains, ports, etc. |
|  | Email, subdomain, host gathering from multiple sources |
|  | Comprehensive subdomain enumeration |
|  | Fast crawler extracting URLs, emails, files, endpoints |

FinalRecon

Recon-ng
