Path Traversal Vulnerability and Protection
1. Vulnerability Description
Path Traversal (also known as Directory Traversal) is a security vulnerability that exploits file operation interfaces to access unauthorized directories. Attackers construct special paths (e.g., ../) to bypass system restrictions, enabling them to read or write arbitrary files (such as system configuration files, logs, source code, etc.). This vulnerability is commonly found in functionalities like file download, upload, and display.
2. Vulnerability Principle
Core Issue: Improper Filtering of User Input
For instance, a file download function concatenates the path directly based on the user-supplied filename:
http://example.com/download?file=report.pdf
Actual Path: /var/www/files/report.pdf
However, if the user inputs file=../../../etc/passwd, the concatenated path might become:
/var/www/files/../../../etc/passwd
Equivalent Path: /etc/passwd
If the application neither filters ../ nor checks the resolved path, the request reads a file outside the intended directory.
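The equivalence between the concatenated path and /etc/passwd can be checked with a small Python sketch. It reuses the base directory from the example; the helper names naive_path and normalized are hypothetical, and posixpath.normpath only collapses segments lexically without touching the filesystem.

```python
import posixpath

BASE_DIR = "/var/www/files/"  # base directory taken from the example above


def naive_path(user_filename: str) -> str:
    """Concatenate the way the vulnerable code does, with no validation."""
    return BASE_DIR + user_filename


def normalized(path: str) -> str:
    """Collapse '..' and '.' segments lexically (no filesystem access)."""
    return posixpath.normpath(path)


print(normalized(naive_path("report.pdf")))           # /var/www/files/report.pdf
print(normalized(naive_path("../../../etc/passwd")))  # /etc/passwd
```

Three ../ segments are needed here because the base directory is three levels below the filesystem root.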
3. Attack Scenario Examples
Scenario 1: File Download Function
Assuming the code logic is as follows:
$filename = $_GET['file'];
$filepath = "/var/www/files/" . $filename;
readfile($filepath);
The attacker requests:
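download?file=../../../etc/passwd
Because /var/www/files/ sits three levels below the root, three ../ segments are enough to escape it, and the server might return the contents of sensitive files such as /etc/passwd.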
Scenario 2: Archive Extraction
If the server extracts user-uploaded archives, and the archive contains malicious paths (e.g., ../../script.sh), it may overwrite system files.
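As a rough illustration, the sketch below builds an in-memory ZIP archive with a traversal entry and computes where each entry would land if a hand-written extractor joined names to the target directory without checks. The directory /var/www/uploads/extracted and both function names are assumptions made for the example. Python's own zipfile.extractall sanitizes such names, so the naive join stands in for custom or less careful extraction code.

```python
import io
import posixpath
import zipfile

EXTRACT_DIR = "/var/www/uploads/extracted"  # hypothetical extraction directory


def build_malicious_zip() -> bytes:
    """Create an in-memory archive whose entry name climbs out of the target."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("../../script.sh", "echo pwned\n")
        archive.writestr("docs/readme.txt", "harmless\n")
    return buffer.getvalue()


def naive_targets(zip_bytes: bytes) -> list:
    """Return where each entry would land if names were joined without checks."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            posixpath.normpath(posixpath.join(EXTRACT_DIR, name))
            for name in archive.namelist()
        ]


for target in naive_targets(build_malicious_zip()):
    print(target)
```

The first entry resolves to /var/www/script.sh, two levels above the intended directory, while the second stays inside it.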
4. Vulnerability Discovery Methods
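- Manual Testing: Attempt the following payloads in file parameters:
  - ../../../etc/passwd
  - ....//....//etc/passwd (defeats filters that strip ../ only once), as well as URL-encoded or double-encoded variants such as %2e%2e%2f and %252e%252e%252f
  - Absolute paths (e.g., /etc/passwd)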
- Tool Assistance: Use Burp Suite's Scanner module or specialized path traversal tools (e.g., dotdotpwn).
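To see why the bypass payloads work, the following sketch models two weak defenses: a filter that removes ../ in a single pass, and a stack that percent-decodes the input more than once. The functions strip_once and decode_once are hypothetical stand-ins for such defenses, not code from any particular product.

```python
from urllib.parse import unquote


def strip_once(value: str) -> str:
    """A weak filter: remove '../' in a single, non-recursive pass."""
    return value.replace("../", "")


def decode_once(value: str) -> str:
    """Percent-decode a single time, as one layer of the stack might."""
    return unquote(value)


# Removing '../' from the middle of '....//' leaves a fresh '../' behind.
print(strip_once("....//....//etc/passwd"))  # ../../etc/passwd

# A double-encoded payload survives a check made after only one decode.
double_encoded = "%252e%252e%252fetc%252fpasswd"
first_pass = decode_once(double_encoded)
second_pass = decode_once(first_pass)
print(first_pass)   # %2e%2e%2fetc%2fpasswd
print(second_pass)  # ../etc/passwd
```

Both cases suggest that validation should run on the fully decoded, normalized path rather than on the raw input string.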
5. Protection Strategies
Strategy 1: Input Validation and Normalization
- Allowlist Validation: Only permit specific characters or file extensions (e.g., the anchored pattern ^[a-zA-Z0-9]+\.pdf$).
- Path Normalization: Use functions like realpath() or Path.GetFullPath() to resolve the full path, then verify that it is still inside the allowed directory:
$basePath = "/var/www/files/";
$userPath = realpath($basePath . $filename);
// realpath() returns false when the file does not exist
if ($userPath === false || strpos($userPath, $basePath) !== 0) {
    die("Illegal path!");
}
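The same two checks can be sketched in Python. This is a minimal illustration, assuming a POSIX filesystem and an anchored form of the allowlist pattern; the function names is_within and resolve_safe are hypothetical. os.path.realpath also resolves symbolic links, so a link inside the base directory that points elsewhere is rejected as well.

```python
import os
import re

# Anchored allowlist: one or more alphanumerics followed by ".pdf" (assumption)
ALLOWED_NAME = re.compile(r"[a-zA-Z0-9]+\.pdf")


def is_within(base_dir: str, candidate: str) -> bool:
    """True if candidate resolves to base_dir itself or a path beneath it."""
    base_real = os.path.realpath(base_dir)
    cand_real = os.path.realpath(candidate)
    # commonpath compares whole components, so /files2 is not inside /files
    return os.path.commonpath([base_real, cand_real]) == base_real


def resolve_safe(base_dir: str, filename: str) -> str:
    """Return the resolved path for filename, or raise ValueError."""
    if not ALLOWED_NAME.fullmatch(filename):
        raise ValueError("filename is not on the allowlist")
    candidate = os.path.join(base_dir, filename)
    if not is_within(base_dir, candidate):
        raise ValueError("path escapes the base directory")
    return os.path.realpath(candidate)
```

Comparing whole path components avoids a pitfall of plain prefix checks, where /var/www/files2 would pass a test against the prefix /var/www/files.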
Strategy 2: File Index Mapping
- Avoid using user-supplied filenames directly; instead, map numeric IDs to actual files:
download?id=123 → Actual File: safe_name_xyz.pdf
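A minimal sketch of this mapping follows. The FILE_INDEX dictionary and the lookup_file function are hypothetical; a real application would more likely keep the index in a database and also check that the requesting user may access the file.

```python
from typing import Optional

BASE_DIR = "/var/www/files"  # base directory from the earlier example

# Hypothetical index; the ID and filename mirror the example above.
FILE_INDEX = {
    123: "safe_name_xyz.pdf",
}


def lookup_file(file_id: str) -> Optional[str]:
    """Map the request's id parameter to a server-chosen path, or None."""
    # Accept plain ASCII digits only; anything else cannot be a valid ID.
    if not (file_id.isascii() and file_id.isdigit()):
        return None
    stored_name = FILE_INDEX.get(int(file_id))
    if stored_name is None:
        return None
    return f"{BASE_DIR}/{stored_name}"
```

Because the filename comes from the server-side index, no part of the user input ever reaches the filesystem path.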
Strategy 3: Web Server Configuration
- Restrict access directories (e.g., configure Nginx's root directive with least-privilege paths).
- Disable HTTP access to sensitive directories (e.g., /etc/).
Strategy 4: Secure Coding Practices
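- Avoid building paths by raw string concatenation; use the language's path APIs (e.g., Python's os.path.join) and still validate the normalized result. os.path.join alone does not remove ../ segments, and it discards the base directory when the user-supplied component is an absolute path.
- Perform path validation during archive extraction, and reject entries whose names contain ../ or resolve outside the extraction directory.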
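The archive check might look like the sketch below, which validates every entry before extracting anything. It assumes ZIP input held in memory and a POSIX filesystem; the function names find_unsafe_entries and safe_extract are hypothetical. Absolute entry names are caught too, because os.path.join drops the destination when the entry name is absolute.

```python
import io
import os
import zipfile


def find_unsafe_entries(names: list, dest_dir: str) -> list:
    """Return the entry names that would resolve outside dest_dir."""
    dest_real = os.path.realpath(dest_dir)
    unsafe = []
    for name in names:
        target = os.path.realpath(os.path.join(dest_real, name))
        if os.path.commonpath([dest_real, target]) != dest_real:
            unsafe.append(name)
    return unsafe


def safe_extract(zip_bytes: bytes, dest_dir: str) -> list:
    """Extract the archive only if every entry stays inside dest_dir."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
        unsafe = find_unsafe_entries(names, dest_dir)
        if unsafe:
            # Refuse the whole archive so nothing is written on failure.
            raise ValueError(f"blocked entries: {unsafe!r}")
        archive.extractall(dest_dir)
        return names
```

Rejecting the whole archive, rather than skipping only the bad entries, is a design choice made here for simplicity; it treats any traversal attempt as a sign the upload is hostile.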
6. Vulnerability Cases
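- Apache Tomcat CVE-2020-1938 (Ghostcat): A flaw in the AJP connector allows an attacker who can reach the AJP port to read arbitrary files from anywhere in the web application, including those under WEB-INF and META-INF.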
- A Cloud Storage Application Path Traversal: Enables downloading other users' private files by modifying URL parameters.
Summary
The core of path traversal vulnerabilities lies in excessive trust in user input. Protection requires a multi-layered approach, combining input validation, path normalization, the principle of least privilege, and other measures to ensure file operations are strictly confined to the intended scope.