Insecure deserialization is the class on the web application security vulnerabilities map that turns "we just hydrate the object back" into "we hand the attacker a shell". It is catalogued as CWE-502 and folded into A08 Software and Data Integrity Failures in the 2021 OWASP Top 10 after standing on its own as A8:2017. The bug is older than JSON, the defences are well understood, and it is still everywhere because every modern stack ships a native serializer that happily reconstructs arbitrary class instances from bytes the attacker controls.
This article is the deep dive for the security cluster. I cover what deserialization actually is, why object reconstruction creates attack surface, the language-by-language picture (Java, .NET, Python, PHP, Ruby, Node), what gadget chains are and why ysoserial works, the real-world incidents that defined the class, how to detect and exploit it in practice, and the defences that actually move the needle. No working lab in this one: the surface is spread across half a dozen ecosystems and the cluster's exploitation walkthroughs live in language-specific spokes.
In short: what is insecure deserialization?
Insecure deserialization is what happens when an application reconstructs an in-memory object from bytes supplied by an untrusted party using a serializer that runs constructors, magic methods, or type-driven side effects during reconstruction. The attacker does not need to read or write a single line of the application's code: they craft a payload that, when the application "just unpickles" or "just deserializes" it, executes attacker-chosen code as a side effect of the rehydration. The result is typically remote code execution, occasionally an authentication bypass (forged session objects), occasionally a denial-of-service (object expansion, infinite recursion). The defining feature of the bug is that the application's own intended logic never has to run; the harm is done before the deserialize call returns.
What is deserialization, mechanically?
Serialization is the process of turning an in-memory object (a graph of references, with types, fields, and sometimes private state) into a stream of bytes that can be persisted to disk, written to a cookie, posted to a queue, or sent over the wire. Deserialization is the inverse: bytes go in, an object graph comes out. Two flavours exist:
- Native / binary serializers preserve type identity. Java's ObjectInputStream, .NET's BinaryFormatter, Python's pickle, Ruby's Marshal, PHP's unserialize. The byte stream carries class names; the deserializer locates the class on the runtime classpath, allocates an instance, and populates its fields. Many of them also invoke constructors, finalizers, or magic methods (readObject, __wakeup, __destruct, __reduce__) as part of the reconstruction. That side-effect-during-reconstruction is the entire attack surface.
- Schema-based / text serializers preserve data, not types. JSON, XML, MessagePack, Protocol Buffers. The bytes describe values; the application's code decides what type to coerce them into. These are safer by default, but several JSON libraries opt back into type information for "polymorphic" objects (Json.NET's TypeNameHandling, Jackson's default typing) and re-introduce the same class of bug.
The mental model that defenders need: any serializer that lets the byte stream choose which class gets instantiated is a code-execution primitive waiting for a gadget chain. Treat the deserialize call exactly like an eval() call on attacker-controlled input.
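To make the contrast concrete, here is a minimal Python sketch of the two flavours side by side. The Marker class is hypothetical, and print stands in for the dangerous callable so the demo is harmless; the point is only that the JSON parser returns plain values while the pickle stream names a callable that runs during the load.

```python
import json
import pickle


class Marker:
    """Hypothetical class; its pickle form names a callable to run on load."""

    def __reduce__(self):
        # pickle records a reference to builtins.print plus these arguments.
        # print is a harmless stand-in for something like os.system.
        return (print, ("side effect: ran inside pickle.loads",))


# JSON: the text describes values only; the parser decides the Python types.
parsed = json.loads('{"__class__": "Marker", "args": ["ignored"]}')
print(type(parsed).__name__)    # a plain dict; nothing was instantiated

# pickle: the byte stream names the callable, and loading invokes it.
blob = pickle.dumps(Marker())
restored = pickle.loads(blob)   # the message is printed as a side effect
print(type(restored).__name__)  # NoneType: print's return value, not a Marker
```

Note that the pickle never mentions Marker at all. The stream carries only "call builtins.print with these arguments", which is exactly the eval-like behaviour the mental model warns about.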
A note on OWASP Top 10 placement
Insecure deserialization was its own category as A8:2017 in the OWASP Top 10. In the 2021 revision it was merged into A08 Software and Data Integrity Failures alongside auto-update integrity, unsigned packages, and CI/CD trust failures, on the reasoning that all of these collapse to "the application trusted data it should not have trusted". The category move is not a downgrade: deserialization bugs are still routinely the highest-CVSS findings in any pentest report that surfaces them, because the typical outcome is unauthenticated RCE.
Java: ObjectInputStream and the ysoserial era
Java's java.io.ObjectInputStream.readObject() accepts a byte stream and returns an object graph. The stream begins with the four magic bytes \xac\xed\x00\x05 (the serialization protocol marker plus version), followed by tagged records: class descriptors, then field data, then nested objects. During reconstruction, the JVM:
1. Reads the class name from the stream.
2. Loads the class from the application's classpath (whatever JARs are on it).
3. Allocates an instance.
4. Restores fields.
5. Calls the class's readObject method if one is defined.
Step 5 is the foothold. readObject is a normal Java method; it can do anything Java code can do. If an attacker can reach a class on the classpath whose readObject performs a sensitive operation, the bytes-to-shell chain is open.
The breakthrough was ysoserial. The underlying research, "Marshalling Pickles", was presented by Chris Frohoff and Gabriel Lawrence at AppSecCali in 2015; the ysoserial tool itself was released by Chris Frohoff. ysoserial is a library of pre-built gadget chains: sequences of method calls, beginning with the deserialization entry point and ending in Runtime.exec, that exist purely from classes shipping in widely-used libraries like Apache Commons Collections, Spring, Groovy, and Hibernate. The Commons Collections chain (CommonsCollections1 and its descendants), corresponding to CVE-2015-7501, used the InvokerTransformer class to turn a TransformedMap.put operation, triggered during the deserialization of an AnnotationInvocationHandler, into reflection-driven Runtime.exec("calc.exe") or whatever command you wanted. The attacker did not need the application to use Commons Collections; the library just had to be on the classpath.
Generate a payload:
java -jar ysoserial.jar CommonsCollections1 'id > /tmp/pwned' > payload.bin
Any endpoint that calls ObjectInputStream.readObject on the body of an HTTP request, the value of a cookie, the contents of a JMS message, the RMI invocation arguments, or anywhere else attacker bytes meet readObject, is exploitable whenever a usable gadget library is on the classpath. FoxGlove Security's 2015 writeup, which built on the "Marshalling Pickles" research, demonstrated working exploits against WebSphere, JBoss, Jenkins, WebLogic, and OpenNMS, all from the same primitive. Java deserialization stopped being a theoretical concern that afternoon.
.NET: BinaryFormatter and the type-confusion JSON traps
.NET's System.Runtime.Serialization.Formatters.Binary.BinaryFormatter is the direct analogue of Java's ObjectInputStream: it serialises and deserialises arbitrary CLR types, runs constructors and OnDeserialized hooks, and is gadget-chain-exploitable in the same way. The .NET equivalent of ysoserial (ysoserial.net) ships chains for TypeConfuseDelegate, ObjectDataProvider (a WPF class that, charmingly, exists to invoke arbitrary methods reflectively), WindowsIdentity, and others. Microsoft has explicitly marked BinaryFormatter as dangerous, deprecated it in .NET 5, and removed it from .NET 9. If your codebase still imports it, the surface is open.
The trap that catches modern .NET apps is JSON. Newtonsoft.Json (Json.NET) has a setting called TypeNameHandling that emits a "$type" field in the JSON describing the runtime type of each value. Set it to Auto, All, or Objects and Json.NET, on deserialize, loads the class named in $type from any loaded assembly and instantiates it. An attacker posts JSON like:
{
  "$type": "System.Windows.Data.ObjectDataProvider, PresentationFramework",
  "MethodName": "Start",
  "ObjectInstance": {
    "$type": "System.Diagnostics.Process, System",
    "StartInfo": { "FileName": "cmd", "Arguments": "/c calc" }
  }
}
and the JSON deserialize call launches a process. Same primitive, JSON-shaped. Jackson's enableDefaultTyping in the Java world has the same shape: friendly polymorphism, hostile deserialization. The Newtonsoft docs now warn against TypeNameHandling other than None for any input that crosses a trust boundary.
Python: pickle, the canonical "do not deserialize untrusted data"
Python's pickle module is the textbook example. The first line of its documentation is a warning that the module "is not secure" and that untrusted pickles can execute arbitrary code. The mechanism is the __reduce__ protocol: any class can define a __reduce__ method that returns a tuple (callable, args). On unpickle, the runtime calls callable(*args) to reconstruct the object. The attacker payload:
import os
import pickle

class Exploit:
    def __reduce__(self):
        # On unpickle, the runtime calls os.system('id > /tmp/pwned').
        return (os.system, ('id > /tmp/pwned',))

# Building the payload is harmless; loading it is what runs the command.
print(pickle.dumps(Exploit()))
Anything that calls pickle.loads on attacker-controlled bytes is RCE. Pickle protocol 4 (default in Python 3.8 through 3.13) starts with the magic bytes \x80\x04; protocol 5 (default from Python 3.14 onwards, released October 2025) starts with \x80\x05; the leading \x80 is the protocol-version opcode and is a reliable detection signal in HTTP bodies, cookies, and file uploads. The same warning applies to cPickle, dill, cloudpickle, shelve (built on pickle), and joblib (the model-loading library used across the ML world, also built on pickle). A scikit-learn model.pkl downloaded from an untrusted source is a remote code execution waiting to happen; this is why HuggingFace now scans for pickle imports in uploaded models and warns the user.
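Scanning for pickle imports can be done without ever loading the file, because the stream is just a list of opcodes. The sketch below uses the standard pickletools module to list the globals a pickle would import. It is a heuristic of my own for illustration, not how any particular registry does it: the SUSPICIOUS_MODULES set is a hypothetical deny-list, and the STACK_GLOBAL handling assumes the module and name strings are pushed directly rather than fetched from the memo.

```python
import pickle
import pickletools

# Hypothetical deny-list for illustration; a real scanner would be broader.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "socket", "builtins"}


def imported_globals(data: bytes):
    """List (module, name) pairs a pickle would import, without loading it."""
    found = []
    strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" in one string.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are the two strings pushed before.
            found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


class Probe:
    """Stand-in payload; it is only dumped and inspected, never loaded."""

    def __reduce__(self):
        return (eval, ("1 + 1",))


blob = pickle.dumps(Probe())
flagged = [g for g in imported_globals(blob) if g[0] in SUSPICIOUS_MODULES]
print(flagged)
```

A clean scan is not proof of safety. Treat this as triage before a sandboxed load, not as a replacement for one.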
PHP: unserialize and PHAR
PHP's unserialize accepts a string and returns an object graph. The wire format is human-readable: O:6:"Person":1:{s:4:"name";s:5:"alice";} is an Object named Person with one field name set to alice. The leading O: is a reliable magic signal in cookies, hidden form fields, and stored user data. During unserialize, PHP invokes:
- __wakeup() on every object instantiated.
- __destruct() when the object is later garbage-collected.
- __toString() if the object is coerced to a string later in the request.
Frameworks ship classes that do useful work in those magic methods, which means gadget chains are easy to find. PHPGGC (PHP Generic Gadget Chains) is the PHP ysoserial: pre-built payloads for Laravel, Symfony, WordPress plugin classes, Drupal, Magento, and Joomla.
The PHP-specific twist is PHAR deserialization. PHAR is PHP's archive format. When any filesystem function (file_exists, is_file, file_get_contents, filesize, fopen) is called with a phar:// URL, PHP parses the archive's manifest, whose user-defined metadata is stored in serialized form, and unserializes that metadata. The attacker does not need to find a literal unserialize call in the codebase: they need to find any file-handling code that touches a path they can influence. Upload a PHAR archive disguised as an image (PHAR's binary format is forgiving of leading bytes), trigger the application to file_exists on it via phar://uploads/avatar.jpg/foo, and the metadata deserializes. Sam Thomas's 2018 Black Hat USA presentation on PHAR deserialization rewrote the threat model for every PHP app that handled user-supplied filenames. PHP 8.0 stopped unserializing PHAR metadata automatically when a phar:// stream is opened (it now happens only on an explicit getMetadata() call), but plenty of LTS-pinned PHP 7.x systems are still live.
Ruby: Marshal and the Rails cookie incidents
Ruby's Marshal.load is the native binary serializer; it starts with the magic bytes \x04\x08 (protocol version 4.8) and, like every other native serializer on this page, can be coerced into RCE via gadget chains in widely-deployed libraries. The classic vector was Rails: through Rails 3.x, the default session store kept marshalled Ruby objects in a signed (not encrypted) cookie. The signing key (the secret_token) leaked through misconfigured Capistrano deploys, accidentally committed secrets.yml files, and developer machines often enough that "decode the cookie, swap in a Marshal-encoded gadget, re-sign with the leaked key" became a routine post-exploitation step on any Rails 3 app. Rails 4 added an encrypted cookie store and Rails 4.1 made JSON the default cookie serializer for new applications, which closed the most obvious version of the bug, but plenty of long-lived Rails apps still carry Marshal-backed caches, ActiveJob queues, or Memcached values that an attacker who reaches the cache can poison.
YAML in Ruby has the same issue with a friendlier face: YAML.load (under both the older syck and the newer psych backends) supported !ruby/object tags that instantiate arbitrary classes. Since Psych 4.0 (bundled with Ruby 3.1), YAML.load is safe by default and refuses object tags, as YAML.safe_load always did; the old behaviour now lives under YAML.unsafe_load, which makes the danger explicit.
Node.js: node-serialize, the IIFE trick
Node has no first-class native serializer in core; JSON.parse does not instantiate user-defined types and is safe by default. The bugs come from third-party packages that re-invent native serialization, most notably node-serialize. Its unserialize passes any string value that starts with the _$$ND_FUNC$$_ marker to eval in order to rebuild the function; append () to the function body and it becomes an immediately-invoked function expression that runs during unserialize:
const serialize = require('node-serialize');
const payload = '{"rce":"_$$ND_FUNC$$_function(){require(\'child_process\').exec(\'id > /tmp/pwned\', () => {});}()"}';
serialize.unserialize(payload);
Ajin Abraham's 2017 writeup popularised the technique. The same shape exists in serialize-javascript (older versions), funcster, and any homegrown "let's persist a function across processes" helper. Any time a Node codebase serialises a function reference, the unserialize side is suspect.
What gadget chains actually are
A gadget chain is the deserialization equivalent of ROP (return-oriented programming) in binary exploitation. The attacker does not write the exploit code; they assemble a chain of calls from pre-existing code already present in the target's libraries. Each "gadget" is a class that, when its readObject/__wakeup/__reduce__/OnDeserialized runs, performs one useful operation: a hash-table lookup, a reflection call, a comparator invocation, a Method.invoke. The chain begins with a class whose magic method is automatically triggered by the deserializer and ends with the JVM's Runtime.exec, the CLR's Process.Start, or os.system. Building one is mostly graph search across the classpath; that is what ysoserial, ysoserial.net, PHPGGC, and the various pickle-payload generators automate.
The unintuitive consequence is that fixing the application's own code does not fix the bug. Runtime.exec ships with the JDK itself, so if Commons Collections is on the classpath and the application calls readObject on attacker bytes, the chain works regardless of what the application itself does after deserialization returns. The fix has to be at the deserialize call, not in the surrounding logic.
Real-world incidents
A focused tour of incidents where the deserialization primitive defined the outcome. Version-specific and CVSS-specific details age fast; verify against the linked advisory before quoting.
- Apache Struts 2, CVE-2017-5638 (March 2017). Often filed as the Equifax CVE. Strictly speaking this is an OGNL expression-injection bug in the Jakarta multipart parser rather than classical object-deserialization, but it sits in the same family: untrusted input drives object/expression evaluation server-side, ending in Runtime.exec. Unauthenticated RCE, CVSS 10.0. Equifax's failure to patch within the window led to the breach of about 147 million U.S. consumer records over May to July 2017.
- Oracle WebLogic Server, CVE-2017-10271 (October 2017). The WLS Security component deserialized SOAP request bodies via XMLDecoder without validation, accepting java.beans.XMLDecoder payloads that instantiated arbitrary classes. The primary attack surface is the wls-wsat HTTP endpoint (/wls-wsat/CoordinatorPortType), not T3. Mass exploitation through 2018 for Monero mining; the CVE is one of the most actively scanned-for Java RCEs in the wild and still shows up on internet-exposed WebLogic instances years later.
- Liferay Portal, CVE-2020-7961 (March 2020). The JSON web services endpoint deserialised JSON with type information via Jodd JSON, allowing an unauthenticated attacker to instantiate arbitrary classes and reach RCE. Affected Liferay Portal CE 6.x and 7.x and Liferay DXP 7.x prior to the fix. Exploited in the wild within weeks of disclosure.
- Apache Commons Collections gadget era (CVE-2015-7501 and family, November 2015). Not a single product CVE so much as a primitive that made every Java application server with Commons Collections on the classpath and an exposed readObject sink trivially exploitable. WebSphere, JBoss, Jenkins, WebLogic, and OpenNMS all had to ship deserialization-filtering fixes; the long tail of internal Java apps that never patched is still out there.
- Insecure pickle in ML model files (ongoing). Not a single CVE, a class. Every public model registry that hosts .pkl, .pt (PyTorch), or joblib files has had to add scanning for embedded __reduce__ payloads. HuggingFace's "Pickle scanning" feature ships flags on uploads that import os, subprocess, socket, or builtins.eval. The PyTorch .pt format is built on pickle; "just load this checkpoint" is, in the general case, RCE.
Detecting deserialization in the wild
The serializers leave fingerprints. Five signals worth knowing:
- Java ObjectInputStream: stream begins with \xac\xed\x00\x05. Base64-encoded, that prefix is rO0AB. If you see a cookie, form field, or header beginning with rO0AB, the application is sending serialized Java; probe with a ysoserial DNS-OOB payload before assuming anything about whether it deserialises with class filtering.
- Python pickle: protocol 4 streams (default in Python 3.8 through 3.13) start with \x80\x04, protocol 5 (default from Python 3.14 onwards) with \x80\x05. Base64-encoded, gASV or gAWV are tell-tale prefixes. If a Django app stores a session that decodes to one of these, the session store is pickle-backed.
- PHP unserialize: human-readable. The leading token is O: for an object, a: for an array, s: for a string. If a cookie or hidden form field contains O:6:"User": you are looking at a PHP serialized object.
- Ruby Marshal: binary, leads with \x04\x08. Less common in HTTP bodies, common in Memcached/Redis values for older Rails apps.
- .NET BinaryFormatter: binary stream begins with \x00\x01\x00\x00\x00\xFF\xFF\xFF\xFF (the header record type byte, followed by the root ID and header ID). Less common externally; more common inside .NET Remoting, MSMQ queues, and ViewState.
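The prefixes above are easy to turn into a triage helper. This is a rough sketch under my own assumptions: the fingerprint function and its labels are hypothetical names, it only tries standard base64 (not URL-safe alphabets or compressed wrappers), and short prefixes like the two-byte Ruby marker will produce false positives on arbitrary binary data.

```python
import base64
import binascii
import re

# Magic prefixes as listed above (raw bytes, before any base64 wrapping).
BINARY_SIGNATURES = [
    (b"\xac\xed\x00\x05", "java-objectinputstream"),
    (b"\x80\x04", "python-pickle-protocol-4"),
    (b"\x80\x05", "python-pickle-protocol-5"),
    (b"\x04\x08", "ruby-marshal"),
    (b"\x00\x01\x00\x00\x00\xff\xff\xff\xff", "dotnet-binaryformatter"),
]
# PHP's text format: O:<len>: for an object, a:<len>: for an array.
PHP_PATTERN = re.compile(rb'^[Oa]:\d+:')


def fingerprint(value: bytes):
    """Return a label for the first matching prefix, or None."""
    candidates = [value]
    try:
        # Cookies and form fields usually carry the stream base64-encoded.
        padded = value + b"=" * (-len(value) % 4)
        candidates.append(base64.b64decode(padded, validate=True))
    except (binascii.Error, ValueError):
        pass  # not standard base64; only the raw form is checked
    for blob in candidates:
        for prefix, label in BINARY_SIGNATURES:
            if blob.startswith(prefix):
                return label
        if PHP_PATTERN.match(blob):
            return "php-unserialize"
    return None


print(fingerprint(b"rO0ABXNyABFqYXZhLnV0aWwuSGFzaE1hcA"))
print(fingerprint(b'O:6:"Person":1:{s:4:"name";s:5:"alice";}'))
print(fingerprint(b"plain session id"))
```

A match only tells you which serializer produced the bytes. Whether the server deserializes them unfiltered still has to be probed.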
Burp Suite's Hackvertor extension recognises these prefixes and offers transforms. ysoserial in DNS-OOB mode is the standard probe for a suspected Java endpoint: send a payload whose only side effect is a DNS lookup of a unique subdomain (Burp Collaborator works), then watch whether the lookup arrives. No lookup is not "safe", it just means the chain you sent did not match the classpath; try a different one.
Exploitation patterns to recognise
Any of these is the start of an audit, not the conclusion of one:
- Session cookies that are not opaque random IDs. A cookie that looks like base64-of-binary is probably a serialized object. Old Rails, old Java app servers, .NET ViewState, Express apps using cookie-session with a serializer.
- Hidden form fields that round-trip "complex" state. A field named state, context, prefs, or __VIEWSTATE carrying a long base64 string is often a serialized object the server expects to receive back unmodified.
- JSON with $type or @type fields. Json.NET with TypeNameHandling, Jackson with default typing. The polymorphism feature is the bug.
- JWT alg: none and weak-secret abuse. Not deserialization in the native sense, but in the same family: the JWT payload is JSON the server trusts after a signature check. If the signature check is bypassable (alg: none accepted, secret guessable, key confusion via RS256/HS256), the attacker controls every field the server later trusts. Treat it as a sibling threat.
- File-handling code that accepts attacker-supplied paths in PHP. Any file_exists($_GET['file']) is a PHAR deserialization candidate if the attacker can also upload a PHAR archive anywhere under the document root.
- Model loaders, plugin loaders, deserialization-as-a-feature. Anything that loads a .pkl, .pt, .joblib, .bin, or .pickle from a path the user supplies. Treat the load call as eval.
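For the JSON pattern in particular, a quick way to audit captured traffic is to walk each document and report every type-discriminator key. The sketch below is an illustration only: find_type_hints is a hypothetical helper, and the key set covers just the two names mentioned above.

```python
import json

# Type-discriminator keys named above; extend for other libraries as needed.
TYPE_KEYS = {"$type", "@type"}


def find_type_hints(document, path="body"):
    """Yield (path, declared_type) for every type-discriminator key found."""
    if isinstance(document, dict):
        for key, value in document.items():
            if key in TYPE_KEYS:
                yield (f"{path}.{key}", value)
            # Recurse into nested objects and arrays.
            yield from find_type_hints(value, f"{path}.{key}")
    elif isinstance(document, list):
        for index, item in enumerate(document):
            yield from find_type_hints(item, f"{path}[{index}]")


captured = """
{
  "$type": "System.Windows.Data.ObjectDataProvider, PresentationFramework",
  "MethodName": "Start",
  "ObjectInstance": {
    "$type": "System.Diagnostics.Process, System",
    "StartInfo": { "FileName": "cmd", "Arguments": "/c calc" }
  }
}
"""
hits = list(find_type_hints(json.loads(captured)))
for location, declared in hits:
    print(location, "->", declared)
```

Seeing a type key in a request or response does not prove the server honours it, but it is a strong hint that polymorphic deserialization is switched on somewhere.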
Defences
Do not deserialize untrusted data
This is the first item in the OWASP Deserialization Cheat Sheet and it is correct. Native binary serializers (ObjectInputStream, BinaryFormatter, pickle, Marshal, PHP unserialize) should never see bytes that crossed a trust boundary. For data the application has to accept, use a schema-driven format that does not encode types: JSON via a schema-validated parser, Protocol Buffers, FlatBuffers. The deserialize step then has no class-resolution primitive to abuse.
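What "schema-driven" looks like in practice: the set of fields and types lives in code, and the bytes can only fill them in. A minimal sketch using only the standard library follows; the Preferences type, its fields, and the allowed values are hypothetical, and a real service would more likely use a schema library than hand-written checks.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Preferences:
    """Hypothetical payload type; the field set is fixed in code."""
    theme: str
    page_size: int


def parse_preferences(raw: str) -> Preferences:
    """Parse untrusted JSON into Preferences, rejecting anything unexpected."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != {"theme", "page_size"}:
        raise ValueError("unexpected or missing fields")
    theme, page_size = data["theme"], data["page_size"]
    if not isinstance(theme, str) or theme not in {"light", "dark"}:
        raise ValueError("invalid theme")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(page_size, bool) or not isinstance(page_size, int):
        raise ValueError("invalid page_size")
    if not 1 <= page_size <= 200:
        raise ValueError("page_size out of range")
    return Preferences(theme=theme, page_size=page_size)


print(parse_preferences('{"theme": "dark", "page_size": 50}'))
```

The important property is that nothing in the input can name a class. A stray "$type" field is just an unexpected key and gets rejected.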
If you must deserialize, allowlist the classes
For ecosystems where ripping out the native serializer is impractical, the next-best defence is a class allow-list at the deserialize boundary:
- Java: JEP 290 (Java 9+) and ObjectInputFilter. Set a process-wide filter via -Djdk.serialFilter=... or a per-stream ObjectInputStream.setObjectInputFilter. The filter sees every class about to be resolved and can reject it before instantiation.
- .NET: prefer System.Text.Json (no type information by default). If Newtonsoft.Json with TypeNameHandling is unavoidable, set a SerializationBinder (now ISerializationBinder) that allow-lists the small set of types the application actually expects.
- Python: subclass pickle.Unpickler and override find_class to reject anything not on a tight allow-list. Better, switch to json or msgpack for cross-trust-boundary traffic.
- PHP: pass an allowed_classes option to unserialize: unserialize($data, ['allowed_classes' => ['ExpectedClass']]). Available since PHP 7.0. Setting it to false deserialises objects as __PHP_Incomplete_Class, which has no magic methods and is inert.
- Ruby: YAML.safe_load over YAML.load; restrict Marshal.load to internal trust zones only.
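The Python item is the easiest to show end to end. The sketch below follows the find_class override pattern from the pickle documentation; the ALLOWED_GLOBALS set and the restricted_loads name are assumptions for illustration, and the allow-list would need to match whatever your application actually stores.

```python
import datetime
import io
import pickle

# Hypothetical allow-list: (module, name) pairs the application expects.
ALLOWED_GLOBALS = {("collections", "OrderedDict"), ("datetime", "date")}


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the stream asks for, before it is used.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def restricted_loads(data: bytes):
    """Like pickle.loads, but only resolves allow-listed globals."""
    return AllowListUnpickler(io.BytesIO(data)).load()


class Probe:
    """Stand-in payload that asks the unpickler to call builtins.eval."""

    def __reduce__(self):
        return (eval, ("1 + 1",))


safe_blob = pickle.dumps({"when": datetime.date(2024, 1, 2), "tags": ["a", "b"]})
print(restricted_loads(safe_blob))

try:
    restricted_loads(pickle.dumps(Probe()))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

An allow-list is only as safe as the classes on it, so keep it to inert data types. Anything whose construction has side effects is a gadget you have approved yourself.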
Sign and verify the payload
If the application has a legitimate reason to send serialized state out and accept it back (Rails-style signed cookies, ASP.NET ViewState with MAC), the integrity check has to happen before deserialization. The pattern:
- Serialize the object to bytes.
- Compute an HMAC of the bytes with a server-side secret.
- Emit bytes || hmac.
- On receive, recompute the HMAC and compare with hmac.compare_digest. Only if it matches do you call deserialize.
Constant-time compare matters; a normal == invites a timing oracle. The secret must be long enough (256 bits) and stored outside the codebase (env var, KMS, never secrets.yml in git). The Rails 3 cookie incidents above are the cautionary tale: HMAC was in place, the key was leaked, the chain executed.
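A minimal sketch of the verify-then-deserialize pattern in Python, using only the standard library. The seal and unseal names are hypothetical, the tag is simply appended to the body, and the key is generated inline for the demo where a real deployment would load it from the environment or a KMS. I use JSON for the body on purpose: if the key ever leaks, a forged token is still only data.

```python
import hashlib
import hmac
import json
import secrets

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def seal(state: dict, key: bytes) -> bytes:
    """Serialize state and append an HMAC-SHA256 tag: body || tag."""
    body = json.dumps(state, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body + tag


def unseal(token: bytes, key: bytes) -> dict:
    """Verify the tag first; deserialize only if it matches."""
    if len(token) < TAG_LEN:
        raise ValueError("token too short")
    body, tag = token[:-TAG_LEN], token[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("bad signature")
    return json.loads(body)


key = secrets.token_bytes(32)  # demo only; load the real key from env or a KMS
token = seal({"user": "alice", "role": "viewer"}, key)
print(unseal(token, key))
```

The ordering inside unseal is the whole point: json.loads is unreachable until compare_digest has passed.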
Run deserialization in a sandbox
For the unavoidable cases (loading user-supplied ML models, accepting plugin packages), run the load in a sandboxed process with no network, no filesystem outside a tempdir, no privileges to escalate. The Linux primitives are seccomp filters, namespaces, and capability dropping; the container primitives are gVisor, Kata, or Firecracker. The HuggingFace approach is to scan first and then load in a constrained worker; that is the production pattern worth copying.
Integrity at the supply chain
Deserialization is sometimes only the proximate cause; the root cause is that the bytes came from a source the app should not have trusted. Sign the artefacts the app loads (cosign for container images, Sigstore for packages), pin dependency hashes, fail the build on hash drift. This is exactly the territory A08:2021 covers, and the reason the OWASP category was widened from "insecure deserialization" to "software and data integrity failures".
Common defence mistakes I still see
- Encrypting the payload and assuming that is enough. Encryption gives confidentiality, not integrity, and not against the holder of the key. ASP.NET ViewState encryption with MAC validation disabled is the canonical failure case: the attacker cannot read the ViewState, but can submit any ViewState they construct themselves with the same encryption key (often leaked) and the server will decrypt-then-deserialise. Always MAC, always verify the MAC before the deserialize call.
- Sanitising the deserialized object. Sanitisation runs after deserialize, so the gadget chain has already fired by the time the sanitiser sees the result. The defence has to be at or before the deserialize boundary.
- Relying on "the bytes are not user input, they came from our own session store". If the session store is Memcached or Redis and an attacker reaches the cache (via a separate bug, a misconfigured firewall, a shared multi-tenant cache), every cached serialized object is now attacker-controlled. Memcached poisoning to RCE via Marshal is a well-trodden Rails exploitation path.
- Removing BinaryFormatter but leaving Newtonsoft.Json with TypeNameHandling.Auto. Same primitive, JSON-shaped. The migration off BinaryFormatter is necessary; it is not sufficient if the replacement is a JSON library configured to carry type information.
- Trusting WAF signatures for deserialization payloads. Some WAFs flag rO0AB or O:[0-9]+:". The signature catches naive payloads; it does not catch gzip-then-base64, alternate base alphabets, chunked deliveries, or any of the dozen encodings ysoserial supports. WAF is one layer.
- Forgetting the non-HTTP entry points. The deserialize call is sometimes nowhere near an HTTP handler: JMS, AMQP, MSMQ queues, RMI endpoints, JDWP debug ports, MongoDB BSON with custom types, Memcached values, Redis values, ActiveJob/Sidekiq jobs. Map every place serialized bytes cross a trust boundary, not just the HTTP surface.
- Treating __VIEWSTATE as opaque. ViewState is serialized .NET. If MAC validation is off (the enableViewStateMac setting, removed in newer .NET but still possible on legacy IIS), it is a deserialization sink the attacker can write directly to.
Where to go next
The deserialization cluster fans out from this hub into language-specific spokes. For a tooling-focused walkthrough of payload generators, deserializer fingerprinters, and gadget-chain libraries, the best deserialization tools 2026 listicle covers ysoserial, ysoserial.net, PHPGGC, marshalsec, and the pickle-payload generators side by side, with which one to reach for in which engagement.
For the broader outcome these bugs almost always reach, the remote code execution deep dive covers the full RCE landscape end to end, of which deserialization is one of the most reliable paths in. For the wider map of the class, back up to the web application security vulnerabilities taxonomy.
Sources
Authoritative references this article was fact-checked against.
- OWASP Top 10, A08:2021 Software and Data Integrity Failures (owasp.org)
- OWASP Top 10, A8:2017 Insecure Deserialization (owasp.org)
- ysoserial, proof-of-concept tool for generating Java deserialization payloads (github.com)
- OWASP Deserialization Cheat Sheet (cheatsheetseries.owasp.org)
- NVD, CVE-2017-5638 Apache Struts 2 remote code execution (nvd.nist.gov)
- NVD, CVE-2017-10271 Oracle WebLogic WLS Security component (nvd.nist.gov)
- CWE-502: Deserialization of Untrusted Data (cwe.mitre.org)





