The classic
The intended request: /download?file=invoice.pdf → serves /var/www/files/invoice.pdf.
The attacker's request: /download?file=../../../../etc/passwd → serves /var/www/files/../../../../etc/passwd, which the OS resolves to /etc/passwd.
The Unix shell, the C library's fopen, Python's open, Java's FileInputStream: all of them hand the path to the operating system, which resolves .. the same way every time. Path traversal is a feature of the operating system, and the developer's "+" (concatenating the base directory with user input) handed it directly to the user.
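A minimal sketch of the vulnerable pattern, assuming a hypothetical handler that builds the path by concatenation. The names BASE_DIR and build_path are illustrative and not taken from any framework. posixpath.normpath is used only to show where the OS lookup ends up; it collapses .. lexically and does not follow symlinks.

```python
import posixpath

BASE_DIR = "/var/www/files/"  # hypothetical base directory from the example above

def build_path(user_file: str) -> str:
    """Vulnerable: joins the base directory and user input with '+'."""
    return BASE_DIR + user_file

intended = build_path("invoice.pdf")
attack = build_path("../../../../etc/passwd")

print(intended)                    # /var/www/files/invoice.pdf
print(attack)                      # /var/www/files/../../../../etc/passwd
print(posixpath.normpath(attack))  # /etc/passwd
```

Nothing in build_path is exotic. The bug is that the user controls part of a string the OS will later interpret as a path.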
What the attacker reads
| Linux target | Why it matters |
|---|---|
| /etc/passwd | User list. The polite "are you vulnerable?" probe. Contains no hashes (those are in /etc/shadow, which is root-only). |
| /etc/shadow | Password hashes, readable only if the web user is root (yikes) or has been granted read permission. |
| /home/USER/.ssh/id_rsa | Private SSH key. Pivot to any host the user has access to. |
| /home/USER/.aws/credentials | AWS keys. Full account access if the keys are long-lived. |
| /var/www/app/config.py | Database credentials, API keys, signing secrets. Often holds the master key. |
| /proc/self/environ | Environment variables of the current process. Often holds the DB password, secret keys, and OAuth secrets injected by the deployment system. |
| /proc/self/cmdline | Full command line of the running app. Sometimes contains credentials. |
| /var/log/apache2/access.log | Log poisoning. If the attacker can write a known string into the log (User-Agent, URL), they can read it back via path traversal, and if the application interprets the file rather than just serving it, read becomes execute (LFI-to-RCE). |
Bypass tricks
Developers try to filter the obvious. Encodings, normalization quirks, and recursive parsing all defeat the filters.
| Filter | Bypass |
|---|---|
| Reject literal ../ | URL-encode: %2e%2e%2f |
| Reject URL-encoded too | Double-encode: %252e%252e%252f — if the server URL-decodes twice (once by the proxy, once by the app), the literal ../ emerges after the second decode. |
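| Reject ../ and its standard encodings | Use a non-standard encoding: ..%c0%af, where %c0%af is an overlong UTF-8 encoding of /, got past the path checks in Microsoft IIS 4.0 and 5.0 (the "Unicode bug" of 2000). Or use an alternative separator: ..\ on Windows. |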
| Strip ../ once | ....// becomes ../ after the strip. A single pass is not enough; the filter would have to repeat until the string stops changing. |
| Whitelist by suffix (.pdf) | Null byte: ../../../../etc/passwd%00.pdf. Older runtimes (notably PHP before 5.3.4) passed the string to C file APIs that stop at \0, so the OS opened /etc/passwd while the suffix check saw a name ending in .pdf. |
| Block absolute paths starting with / | Use ../ traversal that starts relative. Or use Windows drive letters: C:\Windows\System32\config\SAM. |
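To see why two of the filters in the table fail, the sketch below runs hypothetical naive filters against the matching payloads. strip_once and reject_literal are invented for illustration. Only str.replace and urllib.parse.unquote from the standard library are used.

```python
from urllib.parse import unquote

def strip_once(path: str) -> str:
    """Hypothetical filter: remove every '../' in a single pass."""
    return path.replace("../", "")

def reject_literal(path: str) -> bool:
    """Hypothetical filter: True if the path looks safe (no literal '../')."""
    return "../" not in path

# Single-pass stripping reassembles '../' from the leftovers.
nested = "....//....//etc/passwd"
print(strip_once(nested))  # ../../etc/passwd

# Double encoding: the filter sees no '../' until the second decode.
double = "%252e%252e%252fetc/passwd"
once = unquote(double)   # after the first decode (e.g. at the proxy)
twice = unquote(once)    # after the second decode (e.g. in the app)
print(reject_literal(double), reject_literal(once), reject_literal(twice))
# True True False
```

In both cases the filter inspects a different string from the one that finally reaches the file system, which is the common thread across the table.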
The fix
Pattern 1: resolve, then verify
Resolve the user-supplied path against your intended base directory using the OS's own normalizer. Then check that the resolved absolute path actually lives under the base.
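One way to sketch resolve-then-verify in Python, assuming a POSIX-style file system. The function name safe_resolve is hypothetical. os.path.realpath and os.path.commonpath are standard library calls; realpath also follows symlinks, so a symlink inside the base directory that points outside it is rejected too.

```python
import os

def safe_resolve(base_dir: str, user_path: str) -> str:
    """Return the resolved path, or raise ValueError if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses '..' and follows symlinks, as the OS lookup would.
    candidate = os.path.realpath(os.path.join(base, user_path))
    # commonpath compares whole path components, so '/srv/files-old'
    # is not mistaken for a child of '/srv/files'.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate

for requested in ("invoice.pdf", "../../../../etc/passwd", "/etc/passwd"):
    try:
        print("serve", safe_resolve("/var/www/files", requested))
    except ValueError as err:
        print("reject:", err)
```

The comparison uses commonpath rather than a plain string prefix test because a prefix check would accept /var/www/files-backup as if it were inside /var/www/files.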
Pattern 2: don't accept paths at all
Often the cleanest fix is to never let users supply file names. Use IDs that map to filenames server-side.
The user can enumerate IDs (a different bug — see IDOR) but they cannot escape the file directory because the filename never comes from them.
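The ID-mapping idea above can be sketched as follows. FILES_DIR, FILES_BY_ID, and path_for_id are hypothetical names, and the in-memory dictionary stands in for what would usually be a database lookup.

```python
import os

FILES_DIR = "/var/www/files"  # hypothetical base directory

# Hypothetical server-side table; in practice this would be a database lookup.
FILES_BY_ID = {
    "1001": "invoice.pdf",
    "1002": "receipt.pdf",
}

def path_for_id(file_id: str) -> str:
    """Map a client-supplied ID to a server-chosen path; reject unknown IDs."""
    try:
        filename = FILES_BY_ID[file_id]
    except KeyError:
        raise LookupError(f"unknown file id: {file_id!r}") from None
    return os.path.join(FILES_DIR, filename)

print(path_for_id("1001"))
try:
    path_for_id("../../../../etc/passwd")
except LookupError as err:
    print("reject:", err)
```

A traversal string is just an unknown key here, so it never reaches the file system. An authorization check on the ID is still needed to address the enumeration problem.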
Pattern 3: chroot / container / read-only mount
Defense in depth. The web user shouldn't have read access to /etc/shadow, /home/*/.ssh/, or app config files in the first place. Run the web service as a dedicated low-privilege user; mount only the directories it needs.
Apache's famous one: CVE-2021-41773
Apache httpd 2.4.49 shipped with a path normalization regression. URL-encoded path components like .%2e/ were not normalized correctly before the access check, so a request could reach files outside the document root whenever those files were not protected by the default "Require all denied" configuration. The escape room earlier in the course uses this exact bug as Stage 2's CVE.
The patched version, 2.4.50, shipped within days and turned out to be an incomplete fix, leading to CVE-2021-42013 and a second patch in 2.4.51. Path normalization is genuinely hard.
The takeaway
Path traversal is an operating system feature, weaponized by a developer's "+". You cannot filter your way out — encoding tricks beat filter logic. The fix is resolve and verify: ask the OS to normalize the path, then confirm the result is inside the directory you expected.
Where possible, don't accept user-supplied filenames at all. Pass IDs; let the server map IDs to filenames. The vulnerability class disappears entirely.
References
Formatted in APA 7.
- MITRE. (2024). CWE-22: Improper limitation of a pathname to a restricted directory ('path traversal'). Common Weakness Enumeration. https://cwe.mitre.org/data/definitions/22.html
- National Institute of Standards and Technology. (2021). CVE-2021-41773 detail. National Vulnerability Database. https://nvd.nist.gov/vuln/detail/CVE-2021-41773
- OWASP. (2024). Path traversal. https://owasp.org/www-community/attacks/Path_Traversal
- PortSwigger. (2024). Directory traversal. Web Security Academy. https://portswigger.net/web-security/file-path-traversal