Passive Reconnaissance

Gathering Without Touching

Passive reconnaissance means collecting information about a target without directly interacting with it.

No packets sent. No connections made. No logs generated on the target’s systems.

You’re using publicly available information. Search engines, public databases, cached pages, leaked repositories. The target has no way to know you’re looking.


Why Start Here?

Two reasons:

  1. It’s invisible. No firewall alerts, no IDS triggers, no suspicious log entries
  2. It’s surprisingly powerful. Organizations leak more information than they realize

Before you ever touch a target, you should know its domains, subdomains, employee names, technology stack, and sometimes even credentials. All from public sources.


Domain Registration (WHOIS)

Every domain on the internet is registered. That registration is public record.

A WHOIS lookup reveals:

  • Registrar and registration dates
  • Name servers (tells you who handles their DNS)
  • Contact information (sometimes names, emails, phone numbers)
  • Organization name and address

whois acmecorp.com

Many organizations use privacy protection to hide contact details. But even then, the name servers and registrar are visible.

The name servers are important. They tell you who manages the DNS infrastructure, which feeds into active enumeration later.

Even “redacted” WHOIS records leak useful data. Registration dates, name servers, and registrar choice all provide context.
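
To make the point about redacted records concrete, here is a minimal Python sketch that pulls the still-visible fields out of WHOIS text. The sample record, registrar name, and name servers are invented for illustration, and real WHOIS output varies by registry, so treat the "Key: value" layout as an assumption.

```python
from collections import defaultdict

# Hypothetical, abbreviated WHOIS output; every value here is made up.
SAMPLE_WHOIS = """\
Domain Name: ACMECORP.COM
Registrar: Example Registrar, Inc.
Creation Date: 2003-04-17T14:02:11Z
Registry Expiry Date: 2026-04-17T14:02:11Z
Name Server: NS1.EXAMPLE-DNS.NET
Name Server: NS2.EXAMPLE-DNS.NET
Registrant Organization: REDACTED FOR PRIVACY
"""


def parse_whois(text: str) -> dict[str, list[str]]:
    """Collect 'Key: value' lines into lists, since keys such as Name Server repeat."""
    fields: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        # Split on the first colon only, so timestamps keep their own colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip().lower()].append(value.strip())
    return dict(fields)


record = parse_whois(SAMPLE_WHOIS)
name_servers = [ns.lower() for ns in record.get("name server", [])]

print("Name servers:", name_servers)
print("Registrar:   ", record.get("registrar"))
print("Created:     ", record.get("creation date"))
```

Even with the registrant redacted, the name servers, registrar, and dates survive parsing, which is the data the later DNS work builds on.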


Search Engine Reconnaissance

Google indexes far more than organizations intend. With the right search operators, you can find things that were never meant to be public.

This technique is called Google dorking.


Useful Operators

| Operator  | What it does              | Example                              |
|-----------|---------------------------|--------------------------------------|
| site:     | Limit results to a domain | site:acmecorp.com                    |
| filetype: | Find specific file types  | filetype:pdf site:acmecorp.com       |
| intitle:  | Search page titles        | intitle:"index of" site:acmecorp.com |
| inurl:    | Search within URLs        | inurl:admin site:acmecorp.com        |
| -         | Exclude terms             | site:acmecorp.com -www               |
| ext:      | Match a file extension    | ext:php site:acmecorp.com            |

What You’re Hunting For

  • Login portals: intitle:"login" site:target.com
  • Exposed directories: intitle:"index of" site:target.com
  • Configuration files: filetype:conf OR filetype:env site:target.com
  • Documents with metadata: filetype:pdf OR filetype:docx site:target.com
  • Subdomains: site:target.com -www (shows pages on non-www subdomains)

PDFs and Office documents often contain metadata: author names, software versions, internal paths, usernames. All valuable.

Google dorking is often the highest-value passive technique. A single well-crafted query can reveal admin panels, credentials in config files, or internal documents.
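
The hunting queries above follow a fixed pattern, so they are easy to generate per target. The Python sketch below fills the same templates for a given domain and turns each into a search URL. The function names are hypothetical, and the URL format is simply the standard Google search query string.

```python
from urllib.parse import quote_plus

# Templates copied from the list above; {d} is replaced by the target domain.
DORK_TEMPLATES = {
    "login portals": 'intitle:"login" site:{d}',
    "exposed directories": 'intitle:"index of" site:{d}',
    "configuration files": "filetype:conf OR filetype:env site:{d}",
    "documents with metadata": "filetype:pdf OR filetype:docx site:{d}",
    "non-www subdomains": "site:{d} -www",
}


def build_dorks(domain: str) -> dict[str, str]:
    """Return one ready-to-paste query per hunting goal."""
    return {label: tpl.format(d=domain) for label, tpl in DORK_TEMPLATES.items()}


def search_url(query: str) -> str:
    """URL-encode a query so it can be opened directly in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


dorks = build_dorks("acmecorp.com")
for label, query in dorks.items():
    print(f"{label:24} {query}")
    print(f"{'':24} {search_url(query)}")
```

Keeping the templates in one table makes it easy to add industry-specific queries later without touching the rest of the code.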


Web Infrastructure Analysis

Tools like Netcraft and BuiltWith analyze a target’s technology stack without you sending a single packet.

They tell you:

  • Web server software (Apache, Nginx, IIS)
  • Hosting provider and IP history
  • Frameworks (React, Django, WordPress)
  • Analytics and tracking services
  • Historical changes to the infrastructure

Why does this matter?

Knowing the tech stack narrows your search. If you know the target runs WordPress 5.8 on Apache, you know exactly which CVEs to look for. If they recently migrated from IIS to Nginx, the old server might still be accessible.


Source Code Mining

Developers accidentally commit secrets to public repositories. Constantly.

What to search for on GitHub:

  • API keys and tokens
  • Database connection strings
  • Hardcoded passwords
  • Internal hostnames and IP addresses
  • Configuration files with credentials
  • .env files that should have been gitignored

Searching Effectively

Search by organization name, domain name, and employee names:

org:acmecorp password
"acmecorp.com" api_key
"acmecorp" filename:.env

Even if the repository is private now, old commits may have been forked or cached before it was locked down.

One leaked API key can be the entire way in. Credentials committed to source repositories have been the entry point in several high-profile breaches.
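
Once a search turns up a candidate file, the next step is spotting the secrets inside it. The following Python sketch scans text line by line for three of the categories listed above. The regular expressions are illustrative assumptions rather than a complete ruleset, and the sample file is invented (the AWS key is the placeholder value from AWS's own documentation).

```python
import re

# Illustrative patterns only; dedicated secret scanners ship far larger rulesets.
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Assignments to password-like names, e.g. DB_PASSWORD="...".
    "hardcoded_secret": re.compile(
        r"(?i)(password|passwd|secret|api_key|token)\w*\s*[:=]\s*['\"]?[^\s'\"]{6,}"
    ),
    # Database URLs that embed user:password@host.
    "connection_string": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb)://[^\s:@/]+:[^\s@/]+@[^\s/]+"
    ),
}

# Hypothetical .env contents; none of these values are real.
SAMPLE_ENV = """\
DEBUG=false
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
DB_PASSWORD="hunter2-example"
DATABASE_URL=postgres://app:s3cr3t@db.internal.example:5432/app
"""


def scan(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


findings = scan(SAMPLE_ENV)
for lineno, label in findings:
    print(f"line {lineno}: possible {label}")
```

The scanner reports only the line number and category, not the matched value, so its own output does not become another copy of the secret.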


Internet-Connected Device Discovery

Shodan and Censys are search engines, but not for web pages. They continuously scan the public internet and index the devices, services, and banners they find.


What Shodan Reveals

Search for an organization by name, domain, or IP range:

  • Open ports and services across their infrastructure
  • Software versions and banners
  • SSL certificate details
  • Default credentials on exposed devices
  • IoT devices, webcams, printers, industrial control systems

Why It’s Devastating

Shodan doesn’t just show web servers. It shows everything connected to the internet.

Forgotten development servers. Unpatched database instances. Network equipment with default passwords. Industrial control systems that should never be internet-facing.

If it’s connected to the internet and has an open port, Shodan has probably already found it.
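
Searching by organization name, domain, or IP range maps onto Shodan's search filters. This Python sketch only builds the query strings; it does not contact Shodan. It assumes the `org:`, `hostname:`, `ssl.cert.subject.cn:`, and `net:` filters, and the function name and sample values (including the documentation-reserved 203.0.113.0/24 range) are hypothetical.

```python
import ipaddress
from typing import Optional


def shodan_queries(
    org: Optional[str] = None,
    domain: Optional[str] = None,
    cidr: Optional[str] = None,
) -> list[str]:
    """Build one Shodan query string per known identifier of the target."""
    queries = []
    if org:
        # Quote the organization name because it usually contains spaces.
        queries.append(f'org:"{org}"')
    if domain:
        queries.append(f"hostname:{domain}")
        # Certificates often name hosts that reverse DNS does not.
        queries.append(f"ssl.cert.subject.cn:{domain}")
    if cidr:
        # Raises ValueError on a malformed range before it reaches a search box.
        ipaddress.ip_network(cidr)
        queries.append(f"net:{cidr}")
    return queries


queries = shodan_queries(org="Acme Corp", domain="acmecorp.com", cidr="203.0.113.0/24")
for q in queries:
    print(q)
```

Running the same identifiers through several filters matters because each one catches assets the others miss: an IP range finds hosts with no hostname, while a certificate search finds hosts outside the known ranges.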


TLS Certificates and Security Headers

Certificate Transparency

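Publicly trusted certificate authorities record the TLS certificates they issue in public Certificate Transparency (CT) logs, because major browsers require it. This means you can find almost every hostname an organization has obtained a public certificate for. Wildcard certificates and private CAs are the main blind spots.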

Tools like crt.sh let you search CT logs by domain:

%.acmecorp.com

This reveals subdomains that might not show up in DNS brute-forcing: staging servers, internal tools, forgotten services.
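
CT search results contain many duplicates, wildcard entries, and mixed-case names, so they need cleaning before use. The Python sketch below deduplicates a result set into a sorted subdomain list. It assumes the results arrive as a JSON array whose objects carry a `name_value` field with newline-separated hostnames, which is how crt.sh's JSON output is commonly shaped; the sample data is invented.

```python
import json

# Hypothetical sample shaped like crt.sh JSON output (most fields omitted).
SAMPLE_CT_JSON = r"""
[
  {"common_name": "acmecorp.com", "name_value": "acmecorp.com\nwww.acmecorp.com"},
  {"common_name": "staging.acmecorp.com", "name_value": "staging.acmecorp.com"},
  {"common_name": "*.dev.acmecorp.com", "name_value": "*.dev.acmecorp.com\ndev.acmecorp.com"},
  {"common_name": "www.acmecorp.com", "name_value": "WWW.acmecorp.com"}
]
"""


def subdomains_from_ct(entries: list[dict], apex: str) -> list[str]:
    """Return the unique, lowercased hostnames under `apex` found in CT entries."""
    names = set()
    for entry in entries:
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            # A wildcard proves the parent label exists; keep that instead.
            if name.startswith("*."):
                name = name[2:]
            # Drop names that belong to unrelated domains on shared certificates.
            if name == apex or name.endswith("." + apex):
                names.add(name)
    return sorted(names)


entries = json.loads(SAMPLE_CT_JSON)
subdomains = subdomains_from_ct(entries, "acmecorp.com")
print(subdomains)
```

The suffix check uses "." + apex rather than a plain `endswith(apex)` so that a lookalike such as notacmecorp.com is not mistaken for a subdomain.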


Security Headers

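Inspecting the HTTP response headers of a target’s website reveals their security posture. This does send a request to the target, but a single page load is indistinguishable from ordinary visitor traffic: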

| Header                          | What it tells you                      |
|---------------------------------|----------------------------------------|
| X-Powered-By                    | Backend technology (PHP, ASP.NET)      |
| Server                          | Web server software and version        |
| Missing X-Frame-Options         | Potentially vulnerable to clickjacking |
| Missing Content-Security-Policy | Potentially vulnerable to XSS          |
| Strict-Transport-Security       | Whether they enforce HTTPS             |

The absence of security headers is just as informative as their presence.
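
The table translates directly into a small check over a set of response headers. This Python sketch works on headers you have already captured (for example from a browser's developer tools) and makes no requests itself. The function name and the sample header values are hypothetical.

```python
# Headers whose absence is a finding, with the note from the table above.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "HTTPS not enforced via HSTS",
    "Content-Security-Policy": "potentially vulnerable to XSS",
    "X-Frame-Options": "potentially vulnerable to clickjacking",
}

# Headers whose presence discloses the technology stack.
DISCLOSURE_HEADERS = ("Server", "X-Powered-By")


def assess_headers(headers: dict[str, str]) -> dict:
    """Split a header set into disclosed technology and missing protections."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower(): value for name, value in headers.items()}
    disclosed = {
        name: present[name.lower()]
        for name in DISCLOSURE_HEADERS
        if name.lower() in present
    }
    missing = {
        name: note
        for name, note in SECURITY_HEADERS.items()
        if name.lower() not in present
    }
    return {"disclosed": disclosed, "missing": missing}


# Hypothetical captured response headers.
sample_headers = {
    "Server": "Apache/2.4.41 (Ubuntu)",
    "X-Powered-By": "PHP/7.4.3",
    "Content-Type": "text/html; charset=UTF-8",
    "strict-transport-security": "max-age=31536000",
}

report = assess_headers(sample_headers)
for name, value in report["disclosed"].items():
    print(f"discloses  {name}: {value}")
for name, note in report["missing"].items():
    print(f"missing    {name} ({note})")
```

For this sample, the report shows the server and PHP versions as disclosures and flags the missing Content-Security-Policy and X-Frame-Options, while the lowercase HSTS header is correctly recognized as present.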


AI-Assisted Reconnaissance

LLMs can accelerate passive recon by helping you:

  • Generate Google dork queries tailored to a target’s industry
  • Analyze WHOIS data and identify patterns across related domains
  • Summarize large amounts of public information quickly
  • Identify naming patterns in subdomains or email formats
  • Cross-reference findings from multiple OSINT sources

The key is writing specific, context-rich prompts. Don’t ask “find info about target.com.” Instead:

“Based on what’s publicly known about AcmeCorp’s organizational structure and industry, generate a list of likely subdomain naming patterns including infrastructure, departmental, and regional conventions.”

LLMs don’t replace manual OSINT. They amplify it. Always verify their output, as they can hallucinate details that look convincing but are wrong.