Path Traversal Vulnerability and Protection

1. Vulnerability Description

Path Traversal (also known as Directory Traversal) is a security vulnerability in which an attacker abuses a file operation interface to reach files and directories outside the intended location. By supplying crafted path segments (e.g., ../), the attacker escapes the intended base directory and can read or write arbitrary files, such as system configuration files, logs, and source code. The vulnerability is most common in file download, upload, and display features.


2. Vulnerability Principle

Core Issue: Improper Filtering of User Input

For instance, a file download function concatenates the path directly based on the user-supplied filename:

http://example.com/download?file=report.pdf  
Actual Path: /var/www/files/report.pdf  

However, if the user inputs file=../../../etc/passwd, the concatenated path might become:

/var/www/files/../../../etc/passwd  
Equivalent Path: /etc/passwd  

If the application neither rejects such input nor checks the resolved path against the intended directory, the attacker gains unauthorized access to the file.
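
The path resolution described above can be reproduced in a few lines. This is only an illustrative sketch: `BASE_DIR` and `naive_path` are hypothetical names, and `posixpath.normpath` stands in for the resolution the operating system performs when the file is opened.

```python
import posixpath

# Hypothetical base directory, taken from the example above.
BASE_DIR = "/var/www/files/"


def naive_path(filename: str) -> str:
    """Build the path by plain concatenation, as the vulnerable code does."""
    return BASE_DIR + filename


joined = naive_path("../../../etc/passwd")
# normpath collapses each ".." with the directory in front of it.
resolved = posixpath.normpath(joined)

print(joined)    # /var/www/files/../../../etc/passwd
print(resolved)  # /etc/passwd
```

Each ../ cancels one directory of the base path, so three of them are needed to climb out of /var/www/files/.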


3. Attack Scenario Examples

Scenario 1: File Download Function

Assuming the code logic is as follows:

// Vulnerable: the user-controlled value is concatenated into the path unchecked.
$filename = $_GET['file'];
$filepath = "/var/www/files/" . $filename;
readfile($filepath);

The attacker requests:

download?file=../../../etc/passwd

The server might return the contents of sensitive files.

Scenario 2: Archive Extraction

If the server extracts user-uploaded archives, and the archive contains malicious paths (e.g., ../../script.sh), it may overwrite system files.


4. Vulnerability Discovery Methods

  1. Manual Testing: Attempt the following payloads in file parameters:
    • ../../../etc/passwd
    • ....//....//etc/passwd (bypasses filters that strip ../ only once)
    • URL-encoded or double-encoded sequences (e.g., %2e%2e%2f, %252e%252e%252f)
    • Absolute paths (e.g., /etc/passwd)
  2. Tool Assistance: Use Burp Suite's Scanner module or specialized path traversal tools (e.g., dotdotpwn).
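
To see why these payloads are worth trying, consider a filter that removes ../ in a single pass. The sketch below is a hypothetical weak filter (`weak_filter` and `weak_resolve` are illustrative names, not taken from any real application) and shows which payloads get past it.

```python
import posixpath

BASE_DIR = "/var/www/files"  # hypothetical base directory


def weak_filter(filename: str) -> str:
    """Strip "../" once, without re-checking the result (deliberately flawed)."""
    return filename.replace("../", "")


def weak_resolve(filename: str) -> str:
    """Join the filtered name onto the base directory and normalize it."""
    return posixpath.normpath(posixpath.join(BASE_DIR, weak_filter(filename)))


# The plain payload is neutralized by the filter.
print(weak_resolve("../../../etc/passwd"))           # /var/www/files/etc/passwd
# Removing "../" from "....//" leaves "../" behind, so the traversal survives.
print(weak_resolve("....//....//....//etc/passwd"))  # /etc/passwd
# An absolute path makes join() discard the base directory entirely.
print(weak_resolve("/etc/passwd"))                   # /etc/passwd
```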

5. Protection Strategies

Strategy 1: Input Validation and Normalization

  • Allowlist Validation: Only permit specific characters and file extensions (e.g., filenames matching ^[a-zA-Z0-9_-]+\.pdf$).
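  • Path Normalization: Use functions like realpath() or Path.GetFullPath() to resolve the full path, then verify that the result lies within the allowed directory:
    // Resolve the base directory too, in case it contains symlinks.
    $basePath = realpath("/var/www/files");
    $userPath = realpath($basePath . DIRECTORY_SEPARATOR . $filename);
    // realpath() returns false when the target does not exist.
    // Compare against the base plus a separator so "/var/www/files_old" cannot match.
    if ($userPath === false || strpos($userPath, $basePath . DIRECTORY_SEPARATOR) !== 0) {
        die("Illegal path!");
    }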
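
The same normalize-then-verify check can be sketched in Python. This is a minimal sketch, assuming Python 3.9 or later (for `Path.is_relative_to`); `resolve_inside` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_name: str) -> Path:
    """Resolve user_name under base_dir, or raise if the result escapes it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks.
    candidate = (base / user_name).resolve()
    # Reject anything that is not the base directory or a descendant of it.
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the base directory")
    return candidate
```

For example, `resolve_inside("/var/www/files", "report.pdf")` returns a path under the base directory, while `resolve_inside("/var/www/files", "../../../etc/passwd")` raises `ValueError`. An absolute input such as `/etc/passwd` is rejected as well, because joining an absolute path replaces the base.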
    

Strategy 2: File Index Mapping

  • Avoid using user-supplied filenames directly; instead, map numeric IDs to actual files:
    download?id=123  → Actual File: safe_name_xyz.pdf  
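
A minimal sketch of this mapping follows. The in-memory table `FILES_BY_ID`, the base directory, and the helper `path_for_id` are all assumptions for illustration; a real application would more likely keep the mapping in a database.

```python
import posixpath
from typing import Optional

BASE_DIR = "/var/www/files"                # hypothetical storage directory
FILES_BY_ID = {123: "safe_name_xyz.pdf"}   # hypothetical ID-to-filename table


def path_for_id(raw_id: str) -> Optional[str]:
    """Translate a request parameter into a server-chosen path, or None."""
    try:
        file_id = int(raw_id)
    except ValueError:
        return None  # not a number, so it cannot be a valid ID
    name = FILES_BY_ID.get(file_id)
    if name is None:
        return None  # unknown ID
    # The filename comes from the server-side table, never from the request.
    return posixpath.join(BASE_DIR, name)
```

Because the request only ever supplies a lookup key, a payload such as `../../etc/passwd` fails the integer conversion and never reaches the file system.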
    

Strategy 3: Web Server Configuration

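  • Restrict the directories the server can serve (e.g., point Nginx's root directive at a dedicated directory that contains only public files).
  • Deny HTTP access to sensitive locations, and run the server process as a low-privilege user so that it cannot read sensitive files elsewhere on the system (e.g., under /etc/).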

Strategy 4: Secure Coding Practices

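  • Avoid raw string concatenation, but do not rely on path-joining helpers alone: Python's os.path.join, for example, does not remove ../ and discards the base directory when the second argument is an absolute path. Normalize the joined path and check that it stays within the base directory.
  • Validate every entry path during archive extraction and reject entries that resolve outside the destination directory (e.g., names containing ../ or absolute paths).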
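
The archive check can be sketched as follows. This is a minimal sketch, not a complete extractor: `is_safe_member` and `safe_extract` are hypothetical names, and the check is purely lexical, so it does not account for symlinks stored inside the archive.

```python
import os
import zipfile


def is_safe_member(dest_dir: str, member_name: str) -> bool:
    """Return True if member_name, extracted under dest_dir, stays inside it."""
    dest = os.path.abspath(dest_dir)
    # join() discards dest for absolute names; abspath() collapses ".." segments.
    target = os.path.abspath(os.path.join(dest, member_name))
    return os.path.commonpath([dest, target]) == dest


def safe_extract(zip_path: str, dest_dir: str) -> None:
    """Extract a zip archive only if every entry stays inside dest_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if not is_safe_member(dest_dir, info.filename):
                raise ValueError(f"unsafe archive entry: {info.filename}")
        archive.extractall(dest_dir)
```

Checking every entry before extracting anything means a malicious archive is rejected as a whole rather than partially unpacked.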

6. Vulnerability Cases

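  • Apache Tomcat CVE-2020-1938 (Ghostcat): A flaw in the AJP connector that lets an attacker who can reach the AJP port read files from anywhere in the web application directories, including under WEB-INF and META-INF, via specially crafted requests.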
  • Path traversal in a cloud storage application: Modifying a URL parameter allowed attackers to download other users' private files.

Summary

The core of path traversal vulnerabilities lies in excessive trust in user input. Protection requires a multi-layered approach, combining input validation, path normalization, the principle of least privilege, and other measures to ensure file operations are strictly confined to the intended scope.