Why Web Apps?
Web applications are the most common attack surface in modern networks. Every organization has them, and most are built under time pressure with frameworks that trade security for convenience.
The usual culprits are insecure dependencies, misconfigured servers, weak authentication, and unsanitized input fields. These flaws repeat across every tech stack, every framework, every language.
If it has a login page, it has an attack surface.
Testing Methodologies
There are three approaches to assessing a web application, depending on what information you’re given.
Black-Box Testing
You know nothing. Just a URL. No source code, no credentials, no documentation.
This is:
- The most realistic testing approach
- How bug bounty programs work
- The hardest approach (everything must be discovered)
You start from zero and enumerate everything: tech stack, directories, parameters, APIs, input fields.
White-Box Testing
You have everything. Source code, infrastructure docs, architecture diagrams, credentials.
This approach:
- Is the most thorough
- Requires code review skills (reading PHP, Python, Java, etc.)
- Takes longer, but finds more subtle bugs (logic flaws, race conditions)
Grey-Box Testing
Somewhere in between. You might get:
- Valid credentials
- The framework name
- Partial documentation
- Network diagrams
Most real-world pentests are grey-box. The client gives you some info to focus your efforts.
The OWASP Top 10
The OWASP Foundation maintains a list of the ten most critical web application security risks. This is the industry standard for what to test. The table below follows the 2021 edition, with the category names shortened.
| Rank | Vulnerability | What it means |
|---|---|---|
| 1 | Broken Access Control | Users can access things they shouldn’t |
| 2 | Cryptographic Failures | Sensitive data exposed (weak encryption, plaintext) |
| 3 | Injection | SQL injection, command injection, XSS |
| 4 | Insecure Design | Flawed architecture, missing security controls |
| 5 | Security Misconfiguration | Default credentials, unnecessary features enabled |
| 6 | Vulnerable Components | Outdated libraries with known CVEs |
| 7 | Authentication Failures | Weak passwords, broken session management |
| 8 | Data Integrity Failures | Insecure deserialization, unsigned updates |
| 9 | Logging Failures | No audit trail, attacks go undetected |
| 10 | Server-Side Request Forgery | Server tricked into making requests to internal services |
The OWASP Top 10 is your checklist. When you find a web app, mentally walk through this list. Most vulnerabilities you’ll encounter in the wild fall into one of these categories.
The Web App Testing Workflow
Every web application test follows roughly the same flow:
- Fingerprint the stack - What web server? What framework? What language?
- Discover content - Directory brute-force, sitemap, robots.txt
- Map the application - Pages, forms, APIs, parameters, authentication
- Identify inputs - Every field, header, cookie, and URL parameter is a potential injection point
- Test for vulnerabilities - Injection, XSS, access control, authentication flaws
- Exploit and escalate - Use findings to gain access, then pivot deeper
Fingerprinting the Stack
Before testing for vulnerabilities, you need to know what you’re attacking.
With Nmap
sudo nmap -p80,443 -sV 10.10.10.50
The version detection reveals the web server software (Apache, Nginx, IIS) and sometimes the backend language.
For deeper enumeration:
sudo nmap -p80 --script=http-enum 10.10.10.50
The http-enum script checks for common directories, admin panels, login pages, and known application paths.
With Wappalyzer
Wappalyzer is a browser extension that passively identifies the technology stack by analyzing what the page serves:
- Web server (Apache, Nginx, IIS)
- CMS (WordPress, Drupal, Joomla)
- Frameworks (React, Django, Laravel)
- JavaScript libraries (jQuery version, Bootstrap)
- CDN and hosting (CloudFlare, AWS)
No extra traffic generated. It just reads what’s already in the page source and headers.
Knowing the stack narrows your search. WordPress? Check for plugin vulnerabilities. jQuery older than 3.5.0? Check for known XSS. Apache 2.4.49? Check for path traversal.
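To make the "version string to lead" idea concrete, here is a minimal sketch that parses a server banner (the kind of string `nmap -sV` or a `Server` header reports) and looks it up in a small table. The names `parse_banner`, `lead_for`, and `KNOWN_LEADS` are hypothetical, and the table holds only the Apache 2.4.49 example from the text; a real lookup would lean on a vulnerability database.

```python
import re

# Hypothetical lookup table: the single entry mirrors the Apache example above.
KNOWN_LEADS = {
    ("apache", (2, 4, 49)): "check for path traversal",
}


def parse_banner(banner):
    """Split a banner such as 'Apache/2.4.49 (Unix)' into (product, version tuple)."""
    match = re.match(r"\s*([A-Za-z][\w.-]*)(?:/(\d+(?:\.\d+)*))?", banner)
    if match is None:
        return None
    product, version = match.group(1), match.group(2)
    if version is None:
        # Many servers hide the version; keep the product name anyway.
        return product, None
    return product, tuple(int(part) for part in version.split("."))


def lead_for(banner):
    """Return a suggested check for an exact product/version match, else None."""
    parsed = parse_banner(banner)
    if parsed is None:
        return None
    product, version = parsed
    return KNOWN_LEADS.get((product.lower(), version))
```

Calling `lead_for("Apache/2.4.49 (Unix)")` returns the path traversal hint, while a banner with no version or an unlisted version returns `None`, which is your cue to research it manually.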
Directory Discovery with Gobuster
Hidden files and directories are everywhere. Admin panels, backup files, configuration dumps, upload endpoints. Gobuster finds them by brute-forcing paths with a wordlist.
gobuster dir -u http://10.10.10.50 -w /usr/share/wordlists/dirb/common.txt -t 5
- dir - directory/file brute-force mode
- -u - target URL
- -w - wordlist
- -t 5 - threads (lower = less noise)
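If you want to see what a directory brute-forcer does under the hood, the core loop is small. The sketch below is not Gobuster's implementation, just an illustration of the idea using Python's standard library; `candidate_urls` and `probe` are hypothetical names, and skipping `#` comment lines is an assumption about how the wordlist is formatted.

```python
from urllib.parse import quote
import urllib.error
import urllib.request


def candidate_urls(base_url, words):
    """Build one candidate URL per non-empty, non-comment wordlist entry."""
    base = base_url.rstrip("/")
    urls = []
    for word in words:
        word = word.strip()
        if not word or word.startswith("#"):
            continue  # skip blank lines and comment headers
        urls.append(f"{base}/{quote(word.lstrip('/'))}")
    return urls


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError, so 301/302 stay visible.
        return None


def probe(url, timeout=5.0):
    """Return the HTTP status code for url without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

In use, you would read the wordlist file, pass its lines to `candidate_urls("http://10.10.10.50", lines)`, and call `probe` on each result. This version is sequential; the threading that `-t` controls in Gobuster is left out to keep the sketch short.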
Reading the Output
| Status Code | Meaning | Action |
|---|---|---|
| 200 | Accessible | Investigate immediately |
| 301 | Redirect | Follow it, see where it goes |
| 302 | Temporary redirect | Often indicates authentication |
| 403 | Forbidden | Exists but blocked, try bypasses |
| 404 | Not found | Move on |
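The table above translates directly into a triage step for scan results. This is a minimal sketch, assuming you already have `(path, status)` pairs from whatever tool you ran; `ACTIONS` and `triage` are hypothetical names, and the fallback label for unlisted codes is my own addition.

```python
# Suggested actions, copied from the status code table above.
ACTIONS = {
    200: "accessible: investigate immediately",
    301: "redirect: follow it, see where it goes",
    302: "temporary redirect: often indicates authentication",
    403: "forbidden: exists but blocked, try bypasses",
}


def triage(results):
    """Drop 404s and attach the suggested action to every other hit."""
    findings = []
    for path, status in results:
        if status == 404:
            continue  # not found: move on
        action = ACTIONS.get(status, "not in the table: review manually")
        findings.append((path, status, action))
    return findings
```

Feeding it `[("/admin", 403), ("/nope", 404)]` leaves only the `/admin` entry, tagged with the bypass hint.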
Useful Wordlists
| Wordlist | Use case |
|---|---|
| common.txt | Quick scan, most common paths |
| big.txt | Thorough scan |
| directory-list-2.3-medium.txt | Very thorough, takes longer |
| raft-medium-words.txt | Good general purpose list |
Always run Gobuster early. While you’re manually exploring the app, let Gobuster run in the background. It often finds the path that leads to your initial foothold.
Inspecting the Application
URL Clues
File extensions in URLs reveal the backend language:
| Extension | Language |
|---|---|
| .php | PHP |
| .asp, .aspx | ASP.NET |
| .jsp, .do | Java |
| .py | Python |
| No extension | Modern framework with routing |
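The extension table is easy to automate when you have a long list of discovered URLs. Here is a small sketch that applies it; `EXTENSION_HINTS` and `guess_backend` are hypothetical names, and the result is only a hint, since servers can map any extension to any backend.

```python
import posixpath
from urllib.parse import urlparse

# Mapping taken from the extension table above.
EXTENSION_HINTS = {
    ".php": "PHP",
    ".asp": "ASP.NET",
    ".aspx": "ASP.NET",
    ".jsp": "Java",
    ".do": "Java",
    ".py": "Python",
}


def guess_backend(url):
    """Guess the backend language from the file extension in the URL path."""
    path = urlparse(url).path  # drops the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    if not extension:
        return "Modern framework with routing"
    return EXTENSION_HINTS.get(extension, "unknown")
```

For example, `guess_backend("http://10.10.10.50/login.php?next=/admin")` reports PHP, while a path like `/users/42` falls into the "no extension" row.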
Page Source and DevTools
Open the browser DevTools (F12) and explore:
- Inspector - right-click any element to see its HTML. Find hidden form fields, input names, comments
- Debugger/Sources - see all JavaScript files. Look for library versions, hardcoded secrets, API endpoints
- Network - watch all HTTP requests in real time. See response headers, cookies, redirects
- Console - execute JavaScript directly. Test payloads, inspect variables
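Much of what you look for in the Inspector can also be pulled out of saved page source with a script. The sketch below uses Python's built-in `html.parser` to collect hidden form fields, HTML comments, and external script paths; `SourceClues` and `find_clues` are hypothetical names, and it only sees the static HTML, not anything JavaScript adds at runtime.

```python
from html.parser import HTMLParser


class SourceClues(HTMLParser):
    """Collect hidden inputs, comments, and script sources from HTML."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = []
        self.comments = []
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            self.hidden_fields.append((attributes.get("name"), attributes.get("value")))
        elif tag == "script" and attributes.get("src"):
            # File names often leak library versions.
            self.scripts.append(attributes["src"])

    def handle_comment(self, data):
        self.comments.append(data.strip())


def find_clues(html):
    """Parse html and return the collected clues as a dictionary."""
    parser = SourceClues()
    parser.feed(html)
    parser.close()
    return {
        "hidden_fields": parser.hidden_fields,
        "comments": parser.comments,
        "scripts": parser.scripts,
    }
```

Run it against the body of any page you have fetched and skim the three lists before moving on to the Network tab.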
Response Headers
Click a request in the Network tab and inspect the response headers:
| Header | What it reveals |
|---|---|
| Server | Web server software and version |
| X-Powered-By | Backend language (PHP, ASP.NET) |
| X-AspNet-Version | Specific .NET version |
| x-amz-cf-id | Amazon CloudFront CDN |
| Set-Cookie | Session management details, security flags |
The absence of security headers is just as telling. No X-Frame-Options? Possible clickjacking. No Content-Security-Policy? Easier XSS exploitation.
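Both checks, what the headers disclose and which security headers are missing, can be scripted once you have the response headers as a dictionary. A minimal sketch, assuming the two header lists below (they cover only the headers named in this section); `audit_headers` is a hypothetical helper.

```python
# Headers from this section only; extend both tuples as needed.
FINGERPRINT_HEADERS = ("Server", "X-Powered-By", "X-AspNet-Version")
SECURITY_HEADERS = ("X-Frame-Options", "Content-Security-Policy")


def audit_headers(headers):
    """Return (disclosed, missing) for a mapping of response headers.

    Header names are compared case-insensitively, as HTTP requires.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    disclosed = {
        name: lowered[name.lower()]
        for name in FINGERPRINT_HEADERS
        if name.lower() in lowered
    }
    missing = [name for name in SECURITY_HEADERS if name.lower() not in lowered]
    return disclosed, missing
```

A response carrying `Server: Apache/2.4.49` and `x-powered-by: PHP/7.4.3` with no security headers would come back with both values disclosed and both security headers listed as missing.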
robots.txt and sitemap.xml
robots.txt tells search engine crawlers which paths to stay out of. For pentesters, this is a roadmap of hidden paths:
curl http://10.10.10.50/robots.txt
Disallow entries are often admin panels, sensitive directories, or internal tools. The site is literally telling you what it’s trying to hide.
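Once you have the file, pulling out the Disallow entries is a few lines of parsing. This sketch assumes the text of robots.txt is already in a string (for example, saved from the curl command above); `disallowed_paths` is a hypothetical helper that ignores which user agent each rule belongs to.

```python
def disallowed_paths(robots_txt):
    """Return the unique Disallow paths from robots.txt text, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, separator, value = line.partition(":")
        if not separator or field.strip().lower() != "disallow":
            continue
        value = value.strip()
        # An empty Disallow means "nothing is blocked", so skip it.
        if value and value not in paths:
            paths.append(value)
    return paths
```

Each returned path is worth visiting by hand and adding to your wordlist for the next brute-force run.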
sitemap.xml lists all pages the site wants indexed. It can reveal pages you wouldn’t find through browsing or brute-forcing.
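A sitemap can be turned into a target list in the same way. The sketch below reads every `<loc>` element from sitemap XML that you have already downloaded; `sitemap_urls` is a hypothetical helper, and note that Python's standard XML parser is not hardened against maliciously crafted XML, so treat files from untrusted targets with care.

```python
import xml.etree.ElementTree as ET


def sitemap_urls(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}loc"; compare the local name only.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls
```

Because it matches on the local tag name, the same function also lists the child sitemaps referenced by a sitemap index file.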