Passive Reconnaissance

Gathering Without Touching

Passive reconnaissance means collecting information about a target without directly interacting with it.

No packets sent. No connections made. No logs generated on the target’s systems.

You’re using publicly available information. Search engines, public databases, cached pages, leaked repositories. The target has no way to know you’re looking.

Why Start Here?

Two reasons:

It’s invisible. No firewall alerts, no IDS triggers, no suspicious log entries
It’s surprisingly powerful. Organizations leak more information than they realize

Before you ever touch a target, you should know its domains, subdomains, employee names, technology stack, and sometimes even credentials. All from public sources.

Domain Registration (WHOIS)

Every domain on the internet is registered, and at least part of that registration record is public.

A WHOIS lookup reveals:

Registrar and registration dates
Name servers (tells you who handles their DNS)
Contact information (sometimes names, emails, phone numbers)
Organization name and address

whois acmecorp.com

Many organizations use privacy protection to hide contact details. But even then, the name servers and registrar are visible.

The name servers are important. They tell you who manages the DNS infrastructure, which feeds into active enumeration later.

Even “redacted” WHOIS records leak useful data. Registration dates, name servers, and registrar choice all provide context.
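To make this concrete, here is a minimal sketch of pulling the fields that usually survive redaction out of raw WHOIS text. The sample record is made up for illustration, and real output varies by registry and registrar, so treat the field names as assumptions.

```python
from collections import defaultdict

# Made-up WHOIS text for illustration; real output differs between registries.
SAMPLE_WHOIS = """\
Domain Name: ACMECORP.COM
Registrar: Example Registrar, Inc.
Creation Date: 2004-03-15T12:00:00Z
Registry Expiry Date: 2026-03-15T12:00:00Z
Name Server: NS1.EXAMPLE-DNS.NET
Name Server: NS2.EXAMPLE-DNS.NET
Registrant Organization: REDACTED FOR PRIVACY
"""

def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists (keys can repeat)."""
    fields = defaultdict(list)
    for line in text.splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip().lower()].append(value.strip())
    return dict(fields)

def summarize(fields):
    """Keep the fields that tend to remain visible on redacted records."""
    return {
        "registrar": fields.get("registrar", [None])[0],
        "created": fields.get("creation date", [None])[0],
        "name_servers": sorted(ns.lower() for ns in fields.get("name server", [])),
    }

summary = summarize(parse_whois(SAMPLE_WHOIS))
print(summary)
```

Feeding it the saved output of several related domains makes shared name servers or registrars easy to spot.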

Search Engine Reconnaissance

Google indexes far more than organizations intend. With the right search operators, you can find things that were never meant to be public.

This technique is called Google dorking.

Useful Operators

Operator	What it does	Example
`site:`	Limit results to a domain	`site:acmecorp.com`
`filetype:`	Find specific file types	`filetype:pdf site:acmecorp.com`
`intitle:`	Search page titles	`intitle:"index of" site:acmecorp.com`
`inurl:`	Search within URLs	`inurl:admin site:acmecorp.com`
`-`	Exclude terms	`site:acmecorp.com -www`
`ext:`	File extension	`ext:php site:acmecorp.com`
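The operators in the table combine freely, so it can help to generate queries instead of typing them by hand. The helper below is a hypothetical sketch, not part of any tool; `build_dork` and its parameters are names invented for this example.

```python
def build_dork(domain, filetypes=(), intitle=None, inurl=None, exclude_hosts=()):
    """Compose a Google query string from the operators in the table."""
    parts = [f"site:{domain}"]
    if filetypes:
        # OR lets one query cover several file types; group them when there is more than one.
        joined = " OR ".join(f"filetype:{ft}" for ft in filetypes)
        parts.append(f"({joined})" if len(filetypes) > 1 else joined)
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if inurl:
        parts.append(f"inurl:{inurl}")
    # A leading minus excludes a term or, combined with site:, a whole host.
    parts.extend(f"-site:{host}" for host in exclude_hosts)
    return " ".join(parts)

queries = [
    build_dork("acmecorp.com", intitle="index of"),
    build_dork("acmecorp.com", filetypes=("pdf", "docx")),
    build_dork("acmecorp.com", inurl="admin", exclude_hosts=("www.acmecorp.com",)),
]
for query in queries:
    print(query)
```

The function only builds strings. Paste the results into the search engine yourself, since automated querying tends to trigger rate limits and CAPTCHAs.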

What You’re Hunting For

Login portals: intitle:"login" site:target.com
Exposed directories: intitle:"index of" site:target.com
Configuration files: filetype:conf OR filetype:env site:target.com
Documents with metadata: filetype:pdf OR filetype:docx site:target.com
Subdomains: site:target.com -site:www.target.com shows pages on non-www subdomains

PDFs and Office documents often contain metadata: author names, software versions, internal paths, usernames. All valuable.
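As an illustration of where that metadata lives, a .docx file is a ZIP archive whose `docProps/core.xml` part holds the author fields. The sketch below reads two of them with the standard library only. It assumes the document has that part, and it covers .docx only, not PDF.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def docx_core_properties(source):
    """Return author-related metadata from a .docx path or file-like object."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text(path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    return {
        "creator": text("dc:creator"),
        "last_modified_by": text("cp:lastModifiedBy"),
    }
```

Calling `docx_core_properties("report.docx")` on a downloaded file (the filename here is hypothetical) returns a small dict. Creator values are often login names, which hints at the organization's username format.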

Google dorking is often the highest-value passive technique. A single well-crafted query can reveal admin panels, credentials in config files, or internal documents.

Web Infrastructure Analysis

Tools like Netcraft and BuiltWith analyze a target’s technology stack without you sending a single packet to the target. They have already crawled the site, so you query their records instead.

They tell you:

Web server software (Apache, Nginx, IIS)
Hosting provider and IP history
Frameworks (React, Django, WordPress)
Analytics and tracking services
Historical changes to the infrastructure

Why does this matter?

Knowing the tech stack narrows your search. If you know the target runs WordPress 5.8 on Apache, you can limit your CVE research to those products and versions. If they recently migrated from IIS to Nginx, the old server might still be accessible.

Source Code Mining

Developers accidentally commit secrets to public repositories. Constantly.

What to search for on GitHub:

API keys and tokens
Database connection strings
Hardcoded passwords
Internal hostnames and IP addresses
Configuration files with credentials
.env files that should have been gitignored

Searching Effectively

Search by organization name, domain name, and employee names:

org:acmecorp password
"acmecorp.com" api_key
"acmecorp" filename:.env

Even if the repository is private now, old commits may have been forked or cached before it was locked down.

One leaked API key can be the entire way in. Credentials left in source repositories have been the starting point for several major breaches.
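Once you have text from a public repository in hand, simple pattern matching can flag lines worth a closer look. The sketch below is a rough illustration, not a complete scanner: the three patterns are assumptions chosen for this example, the sample file is made up, and the AWS key shown is the placeholder value from AWS's own documentation.

```python
import re

# A few illustrative patterns; dedicated secret scanners ship far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "credential_assignment": re.compile(
        r"(?:password|passwd|api_key|secret|token)\s*[:=]\s*['\"]?[^\s'\"]+",
        re.IGNORECASE,
    ),
}

def scan_text(text):
    """Return (line_number, pattern_name) for each line that looks like a secret."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits

# Made-up .env content for illustration.
SAMPLE_ENV = """\
DB_HOST=db.internal.acmecorp.com
DB_PASSWORD=hunter2
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
DEBUG=false
"""

hits = scan_text(SAMPLE_ENV)
print(hits)
```

Expect false positives. Each hit is a reason to read the surrounding lines, not proof of a working credential.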

Internet-Connected Device Discovery

Shodan and Censys are search engines, but not for web pages. They continuously scan the publicly routable address space on common ports and index the devices, services, and banners that respond.

What Shodan Reveals

Search for an organization by name, domain, or IP range:

Open ports and services across their infrastructure
Software versions and banners
SSL certificate details
Exposed devices whose banners suggest default credentials are still in use
IoT devices, webcams, printers, industrial control systems

Why It’s Devastating

Shodan doesn’t just show web servers. It shows everything connected to the internet.

Forgotten development servers. Unpatched database instances. Network equipment with default passwords. Industrial control systems that should never be internet-facing.

If it’s connected to the internet and has an open port, Shodan has probably already found it.
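Searching by organization name, domain, or IP range maps onto Shodan's `org:`, `hostname:`, `ssl.cert.subject.cn:`, and `net:` search filters. The sketch below only builds query strings for the web interface. `shodan_queries` is a hypothetical helper, and the organization name and address range are placeholders.

```python
def shodan_queries(org=None, domain=None, cidr=None, port=None):
    """Build one Shodan search string per available pivot."""
    pivots = []
    if org:
        pivots.append(f'org:"{org}"')
    if domain:
        # Match on reverse DNS names and on the certificate's common name.
        pivots.append(f"hostname:{domain}")
        pivots.append(f"ssl.cert.subject.cn:{domain}")
    if cidr:
        pivots.append(f"net:{cidr}")
    if port is not None:
        # Narrow every pivot to a single service of interest.
        pivots = [f"{pivot} port:{port}" for pivot in pivots]
    return pivots

queries = shodan_queries(
    org="Acme Corp", domain="acmecorp.com", cidr="203.0.113.0/24", port=3389
)
for query in queries:
    print(query)
```

Running several pivots matters because no single one is complete: hosts on cloud providers rarely appear under the organization's own `org:` name but may still present its certificate.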

TLS Certificates and Security Headers

Certificate Transparency

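Publicly trusted certificate authorities record the TLS certificates they issue in public Certificate Transparency (CT) logs, because major browsers require it. This means you can find nearly every subdomain an organization has obtained a publicly trusted certificate for. Wildcard certificates and certificates from private CAs will not reveal individual hostnames.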

Tools like crt.sh let you search CT logs by domain:

%.acmecorp.com

This reveals subdomains that might not show up in DNS brute-forcing: staging servers, internal tools, forgotten services.
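crt.sh can return results as JSON (append `output=json` to the query URL), where each record's `name_value` field may hold several newline-separated names. The sketch below deduplicates those names into a subdomain list. The records are made up for illustration, and no request is sent.

```python
# Made-up records shaped like crt.sh JSON output; real records carry more fields.
SAMPLE_RECORDS = [
    {"name_value": "www.acmecorp.com\nacmecorp.com"},
    {"name_value": "*.staging.acmecorp.com"},
    {"name_value": "vpn.acmecorp.com"},
    {"name_value": "WWW.acmecorp.com"},
]

def subdomains_from_ct(records, domain):
    """Return the sorted, deduplicated hostnames under `domain`."""
    names = set()
    suffix = "." + domain
    for record in records:
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                # A wildcard entry still reveals the parent label.
                name = name[2:]
            if name == domain or name.endswith(suffix):
                names.add(name)
    return sorted(names)

found = subdomains_from_ct(SAMPLE_RECORDS, "acmecorp.com")
print(found)
```

The suffix check drops unrelated names that share a certificate with the target, which is common on shared hosting and CDNs.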

Security Headers

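Inspecting the HTTP response headers from a target’s website reveals their security posture. Requesting the page yourself does touch the target, although a single request looks like ordinary browsing. To stay strictly passive, read the headers from a third-party scan or cached copy instead: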

Header	What it tells you
`X-Powered-By`	Backend technology (PHP, ASP.NET)
`Server`	Web server software and version
Missing `X-Frame-Options`	Potentially vulnerable to clickjacking
Missing `Content-Security-Policy`	No browser-enforced second line of defense against XSS
`Strict-Transport-Security`	Whether they enforce HTTPS

The absence of security headers is just as informative as their presence.
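The table can be applied mechanically to a set of captured headers. Here is a minimal sketch, assuming the headers are already available as a dict (for example, copied from a third-party scan). The sample values are made up, and the two header lists mirror the table rather than any standard.

```python
# Headers that reveal technology, and headers whose absence is worth noting.
DISCLOSURE_HEADERS = ("server", "x-powered-by")
EXPECTED_HEADERS = (
    "strict-transport-security",
    "content-security-policy",
    "x-frame-options",
)

def review_headers(headers):
    """Split captured headers into what they disclose and what is missing."""
    # Header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        "discloses": {h: normalized[h] for h in DISCLOSURE_HEADERS if h in normalized},
        "missing": [h for h in EXPECTED_HEADERS if h not in normalized],
    }

# Made-up response headers for illustration.
SAMPLE_HEADERS = {
    "Server": "Apache/2.4.41 (Ubuntu)",
    "X-Powered-By": "PHP/7.4.3",
    "Strict-Transport-Security": "max-age=31536000",
}

report = review_headers(SAMPLE_HEADERS)
print(report)
```

A missing header is a lead to investigate, not a confirmed vulnerability.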

AI-Assisted Reconnaissance

LLMs can accelerate passive recon by helping you:

Generate Google dork queries tailored to a target’s industry
Analyze WHOIS data and identify patterns across related domains
Summarize large amounts of public information quickly
Identify naming patterns in subdomains or email formats
Cross-reference findings from multiple OSINT sources

The key is writing specific, context-rich prompts. Don’t ask “find info about target.com.” Instead:

“Based on what’s publicly known about AcmeCorp’s organizational structure and industry, generate a list of likely subdomain naming patterns including infrastructure, departmental, and regional conventions.”

LLMs don’t replace manual OSINT. They amplify it. Always verify their output, as they can hallucinate details that look convincing but are wrong.
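One simple way to apply that verification step is to treat every LLM suggestion as a hypothesis and check it against names you have already observed in an independent source such as CT logs. The sketch below is an assumption about workflow, not a prescribed method, and all hostnames in it are made up.

```python
def triage_candidates(candidates, observed):
    """Split suggested hostnames into confirmed and still-unverified lists."""
    observed_set = {name.lower() for name in observed}
    confirmed, unverified = [], []
    for name in candidates:
        bucket = confirmed if name.lower() in observed_set else unverified
        bucket.append(name)
    return confirmed, unverified

# Made-up data: names proposed by an LLM vs. names seen in CT logs.
llm_suggestions = ["vpn.acmecorp.com", "jira.acmecorp.com", "mail.acmecorp.com"]
seen_in_ct_logs = ["vpn.acmecorp.com", "mail.acmecorp.com"]

confirmed, unverified = triage_candidates(llm_suggestions, seen_in_ct_logs)
print("confirmed:", confirmed)
print("unverified:", unverified)
```

Only the confirmed list belongs in your findings. The unverified names stay on a to-check list until another source backs them up.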