Deserialization Vulnerabilities and Protection

Description:
Deserialization vulnerabilities occur when an application restores serialized data (such as JSON, XML, or binary formats) into objects without validating where the data came from or which types it is allowed to create. Attackers can craft malicious serialized data to trigger arbitrary code execution, privilege escalation, or data tampering during deserialization. These vulnerabilities are most common in languages with native object serialization (e.g., Java's ObjectInputStream, Python's pickle, PHP's unserialize).

Background Knowledge:

Serialization: The process of converting an object into a storable or transmittable data format (e.g., byte stream, JSON string).
Deserialization: The process of restoring serialized data back into an object instance.
Root Cause of Vulnerability: Deserialization can automatically invoke special methods on the objects being restored (e.g., Java's readObject, Python's __reduce__). If an attacker controls which objects are restored and what state they carry, these methods can be driven to perform dangerous operations.
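
To make the root cause concrete, here is a minimal sketch showing that Java calls a class's private readObject method on its own while restoring an object. The class names ReadObjectDemo and Tracked and the counter field are hypothetical, chosen only for illustration; the side effect is a harmless counter increment standing in for whatever logic a real class might run.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ReadObjectDemo {
    /** Hypothetical serializable class with a custom deserialization hook. */
    static final class Tracked implements Serializable {
        private static final long serialVersionUID = 1L;
        static int readObjectCalls = 0; // counts how often the hook has run
        String name;

        Tracked(String name) {
            this.name = name;
        }

        // Invoked by ObjectInputStream itself; application code never calls it.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restore the regular fields
            readObjectCalls++;      // side effect that runs during deserialization
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Tracked restored = (Tracked) deserialize(serialize(new Tracked("demo")));
        System.out.println("restored=" + restored.name + ", hookCalls=" + Tracked.readObjectCalls);
    }
}
```

The constructor is not what runs here: the hook fires purely because the byte stream named this class, which is why the content of the stream has to be treated as untrusted input.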

Attack Principle:

Malicious Payload Construction: Attackers tamper with serialized data, embedding executable instructions where the format allows it (as in Python's pickle) or specifying objects whose methods perform dangerous operations when restored.
Trigger Point: During deserialization, the application automatically invokes object methods (e.g., initialization, destruction, custom logic).
Exploitation Scenarios:
- In Java, chaining method calls across classes already on the classpath (gadget chains), such as those in Apache Commons Collections, to execute commands.
- In PHP, using methods like __destruct or __wakeup to include malicious files.
- In Python, exploiting the __reduce__ method to execute system commands.

Example (Java):
Assume an application receives serialized objects and directly deserializes them:

// Vulnerable code example: inputStream carries attacker-controlled bytes
try (ObjectInputStream ois = new ObjectInputStream(inputStream)) {
    Object obj = ois.readObject(); // Dangerous: no restriction on which classes get instantiated
}

An attacker may submit malicious serialized data containing chained objects whose methods, invoked during deserialization, ultimately call Runtime.getRuntime().exec("malicious command").
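
One reason this pattern is dangerous is that casting the result to an expected type does not help: the stream decides which class is rebuilt, and the cast is only checked afterwards. The sketch below illustrates this under simplified assumptions. Settings and Gadget are hypothetical classes, and Gadget only sets a flag where a real gadget chain would do something harmful.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class UnexpectedTypeDemo {
    /** Hypothetical type the application believes it is loading. */
    static final class Settings implements Serializable {
        private static final long serialVersionUID = 1L;
        String theme = "light";
    }

    /** Hypothetical, harmless stand-in for a gadget class on the classpath. */
    static final class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;
        static boolean triggered = false;

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            triggered = true; // a real gadget would perform a dangerous action here
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Mirrors the vulnerable pattern: deserialize first, check the type afterwards. */
    static Settings load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Settings) in.readObject(); // the cast is evaluated after the object is rebuilt
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = serialize(new Gadget()); // the sender supplies a Gadget, not Settings
        try {
            load(payload);
        } catch (ClassCastException e) {
            System.out.println("cast failed, but hook already ran: " + Gadget.triggered);
        }
    }
}
```

The ClassCastException arrives too late to prevent the hook from running, so type checks have to happen before or during deserialization rather than after it.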

Protective Measures:

Avoid Deserializing Untrusted Data: Prefer pure data formats like JSON over object serialization.
Whitelist Validation: Restrict the classes that can be instantiated during deserialization to an explicit allow-list (e.g., with Java's ObjectInputFilter).
Log Monitoring: Record deserialization exceptions to detect attack attempts.
Code Security: Avoid writing dangerous logic in special methods of serialized objects.
Library Updates: Promptly update libraries that have known deserialization flaws or contain known gadget classes (e.g., Apache Commons Collections).
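
As an illustration of whitelist validation, the following sketch attaches an allow-list filter to a stream before reading from it. It assumes Java 9 or later, where java.io.ObjectInputFilter is available. The names FilterDemo, Profile, and readFiltered are hypothetical, and a real application would likely allow several classes and also set limits on depth and size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

final class FilterDemo {
    /** Hypothetical data class that the application is willing to accept. */
    static final class Profile implements Serializable {
        private static final long serialVersionUID = 1L;
        final String displayName;

        Profile(String displayName) {
            this.displayName = displayName;
        }
    }

    // Pattern syntax: entries are separated by ';', and "!*" rejects every class not matched earlier.
    static final ObjectInputFilter ALLOW_LIST =
            ObjectInputFilter.Config.createFilter(Profile.class.getName() + ";!*");

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Deserializes with the allow-list installed before any object is read. */
    static Object readFiltered(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(ALLOW_LIST);
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Profile profile = (Profile) readFiltered(serialize(new Profile("alice")));
        System.out.println("accepted: " + profile.displayName);

        try {
            readFiltered(serialize(new ArrayList<String>())); // not on the allow-list
        } catch (InvalidClassException e) {
            // In line with the log-monitoring advice, a rejection like this is worth recording.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the filter runs while the stream is being read, a rejected class is never instantiated and its deserialization hooks never execute, which is the property that a cast after readObject cannot provide.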

Summary:
At root, deserialization vulnerabilities are a failure to enforce a trust boundary: untrusted input is allowed to decide which objects are created and which of their methods run. Protection rests on strictly validating input data, restricting what deserialization is allowed to do, and limiting the damage of any compromise through the principle of least privilege.