Regex Lookaheads and Lookbehinds: Syntax, Examples, Engine Support (2026)

Lookahead (?=...) asserts that what follows the current position matches without consuming any characters. Lookbehind (?<=...) asserts that what precedes the current position matches, also without consuming. Their negatives (?!...) and (?<!...) assert the opposite: that what follows or precedes does NOT match. These four "zero-width assertions" turn regex from a search tool into a context-aware matcher: "find every digit followed by kg, but don't include the kg in the match", or "match a password that contains at least one digit and one uppercase letter, without saying where they are".

The reason most regex tutorials introduce these last is that they sit on top of basic matching as a layer. Once you have them, you can write patterns that would otherwise need multiple regex passes or post-processing.

Jump to:

Lookahead: match X only if followed by Y
Lookbehind: match X only if preceded by Y
Negative lookahead and lookbehind
The password-validation use case
Examples in JavaScript, Python, and PHP
Engine compatibility (especially JavaScript lookbehind)
FAQ

Lookahead: match X only if followed by Y

Syntax: X(?=Y) matches X only if Y follows immediately after, but the match returned is X (the lookahead consumes nothing).

Example: extract numeric weights followed by kg:

code

\d+(?=kg)

Against input 42kg, 7lb, 150kg, this matches 42 and 150 (the 7 is rejected because lb follows, not kg). The kg itself is not in the matched string.

The classic where-this-is-useful case: extract a substring that's adjacent to a marker without capturing the marker. Without lookahead, you'd capture the marker and then strip it in post-processing.

Lookbehind: match X only if preceded by Y

Syntax: (?<=Y)X matches X only if Y immediately precedes it. Again, Y is not in the returned match.

Example: extract prices written with a leading dollar sign:

code

(?<=\$)\d+(\.\d{2})?

Against $42, €15, $99.50, this matches 42 and 99.50 but not 15 (it's preceded by €, not $). The $ is not in the result.

Negative lookahead and lookbehind

(?!...) and (?<!...) invert the assertion. They succeed when the pattern does NOT match.

Example: match digits NOT followed by a percent sign:

code

\d+(?!%)

Against 25%, 30, 40%, this matches 30 (25 is followed by %, so the lookahead fails; 40 is followed by %, same).

Negative lookbehind example: capitalised words NOT preceded by Mr.:

code

(?<!Mr\. )[A-Z][a-z]+

Against Mr. Smith and Alice met John, this matches Alice and John, but not Smith (because Mr. precedes it).

The password-validation use case

The classic real-world use case for lookahead: enforce that a password contains at least one digit AND at least one uppercase letter AND at least one special character AND is at least 8 characters. You can't easily do this with a positional regex because the order of those character types is unconstrained.

With lookaheads:

code

^(?=.*[0-9])(?=.*[A-Z])(?=.*[!@#$%^&*])[A-Za-z0-9!@#$%^&*]{8,}$

Each (?=...) is a zero-width assertion that anchors back to the start of the string (because of the ^). They all check the same range. The actual match ([A-Za-z0-9!@#$%^&*]{8,}$) is what consumes the characters and ensures the right length and character set.

This pattern is the foundation of every "strong password regex". The password strength validation walkthrough covers tighter variants and the cases where regex stops being the right tool.

Examples in JavaScript, Python, and PHP

JavaScript:

javascript

// Match numbers followed by "kg" (without including kg)
const weights = "42kg, 7lb, 150kg".match(/\d+(?=kg)/g);
// ['42', '150']

// Match prices preceded by $ (without including $)
const prices = "$42, €15, $99.50".match(/(?<=\$)\d+(?:\.\d{2})?/g);
// ['42', '99.50']

// Password validation
const strongPassword = /^(?=.*[0-9])(?=.*[A-Z])(?=.*[!@#$%^&*])[A-Za-z0-9!@#$%^&*]{8,}$/;
strongPassword.test("Hello123!");   // true
strongPassword.test("hello123!");   // false (no uppercase)

Python:

python

import re

# Numbers followed by kg
weights = re.findall(r"\d+(?=kg)", "42kg, 7lb, 150kg")
# ['42', '150']

# Prices preceded by $
prices = re.findall(r"(?<=\$)\d+(?:\.\d{2})?", "$42, €15, $99.50")
# ['42', '99.50']

# Password validation
strong = re.compile(r"^(?=.*[0-9])(?=.*[A-Z])(?=.*[!@#$%^&*])[A-Za-z0-9!@#$%^&*]{8,}$")
bool(strong.match("Hello123!"))  # True

PHP:

php

// Numbers followed by kg
preg_match_all('/\d+(?=kg)/', "42kg, 7lb, 150kg", $matches);
// $matches[0] = ['42', '150']

// Prices preceded by $
preg_match_all('/(?<=\$)\d+(?:\.\d{2})?/', "\$42, €15, \$99.50", $matches);
// $matches[0] = ['42', '99.50']

// Password validation
$strong = '/^(?=.*[0-9])(?=.*[A-Z])(?=.*[!@#$%^&*])[A-Za-z0-9!@#$%^&*]{8,}$/';
preg_match($strong, "Hello123!");  // 1 (match)

Engine compatibility (especially JavaScript lookbehind)

Lookahead (?=...) and negative lookahead (?!...) are supported in every modern regex engine that supports assertions: JavaScript, Python, PHP/PCRE, Java, .NET, Ruby, and Rust. Go's standard-library regexp does NOT support lookarounds at all because RE2 omits them to guarantee linear-time matching.

Lookbehind (?<=...) and negative lookbehind (?<!...) had patchier historical support:

Engine	Lookahead	Lookbehind	Variable-width lookbehind
JavaScript (V8)	All versions	ES2018+ (Chrome 62, Firefox 78, Safari 16.4, Node 10)	Yes, since ES2018
Python (`re`)	All versions	All versions	No, fixed-width only at every Python version including 3.12. Use the third-party `regex` package for variable width
Python (`regex` package)	All versions	All versions	Yes
PCRE	All versions	All versions	PCRE1 fixed-width only; PCRE2 (PHP 7.3+) supports variable width
Java	All versions	All versions	Limited (alternation of fixed widths) since Java 9+
.NET	All versions	All versions	Yes, always (the only engine that's had variable-width since v1)
Ruby (Onigmo)	All versions	1.9+	Yes
Rust (`regex` crate)	All versions	No	n/a (no lookbehind at all)
Go (`regexp`, RE2)	No	No	n/a (RE2 omits all lookaround)

In Go specifically, if you need lookbehind, use the third-party github.com/dlclark/regexp2 package which implements the .NET regex flavor and supports the full feature set. Same advice applies in Rust: the stdlib-style regex crate is linear-time RE2-style; use fancy-regex or regress if you need lookarounds.

Variable-width vs fixed-width lookbehind

The most common cross-engine surprise. Fixed-width lookbehind means the assertion has to match a specific number of characters. (?<=abc) is fixed-width 3. (?<=ab|cd) is fixed-width 2. (?<=a*) or (?<=https?:\/\/) is variable-width because the matched length can vary.

The breakdown:

JavaScript V8 (since ES2018): variable-width lookbehind is supported. (?<=https?:\/\/)\w+ works. This was a deliberate spec choice and one of the headline ES2018 regex features.
Python re: fixed-width only. (?<=https?:\/\/)\w+ raises error: look-behind requires fixed-width pattern. The workaround is to use alternation of fixed-width options, (?<=http:\/\/|https:\/\/)\w+, which Python accepts because each alternative is itself fixed-width.
Python regex package: variable-width supported. Drop-in replacement for re when you need this.
PCRE2 (PHP 7.3+): supports variable-width lookbehind. PCRE1 (older PHP) is fixed-width.
Java: limited variable-width support since Java 9. The engine accepts alternation of fixed-widths and capped quantifiers ({0,5}), but rejects unbounded ones (*, +, {0,}).
.NET: variable-width lookbehind has always worked.

If you're writing cross-engine regex, prefer fixed-width or alternation-of-fixed-width lookbehinds. That's the only form that compiles everywhere.

Common mistakes

Mistake 1: capturing inside a lookaround and expecting it in the result. A lookahead like (?=(\d+)) does capture the digits into group 1, but the lookaround itself contributes zero characters to the main match. If you only check match[0], you won't see them. Use the explicit capture group reference (match[1]) or restructure the pattern.

Mistake 2: writing impossible lookarounds. q(?=u)i can never match because it asks the engine to match i at the same position where u was asserted to follow. The lookahead consumed no characters, so after the assertion succeeds the engine is still trying to match i at the u position. Always write the assertion at the position you actually want.

Mistake 3: variable-width lookbehind in Python re. Python's stdlib re will refuse to compile (?<=https?:\/\/)\w+ and throw error: look-behind requires fixed-width pattern. Either rewrite as alternation of fixed-widths ((?<=http:\/\/|https:\/\/)) or switch to the third-party regex package.

Mistake 4: using lookarounds in Go. Go's regexp package will not even compile patterns with (?=, (?<=, (?!, or (?<!. The compile call returns an error rather than failing silently. Reach for the third-party regexp2 package or restructure the pattern to avoid lookarounds.

Mistake 5: assuming the password pattern enforces character order. ^(?=.*[0-9])(?=.*[A-Z])... says "the string must contain at least one digit AND at least one uppercase letter somewhere". It does NOT enforce that the digit comes before the uppercase letter, or any other order. Each lookahead is an independent zero-width assertion anchored at the start.

Mistake 6: forgetting that lookarounds are zero-width. Stacking lookaheads at the same position is fine and is the foundation of the password-validation pattern. Stacking lookbehinds at the same position is also fine. Mixing them with consuming patterns at the same position requires the engine to back up and re-try, which is the right behavior but worth knowing for performance.

What to do next

For the regex features that pair most naturally with lookarounds:

Regex Anchors: ^, $, \A, \z, the line-and-string boundary assertions you'll use alongside lookarounds for full-input validation.
Regex Word Boundaries: \b and \B, the other zero-width assertion family. Lookarounds are the variable-width alternative when \b isn't precise enough.
Regex Capturing Groups and Backreferences: parentheses you use to capture and reference parts of a match, which combine with lookarounds to write surprisingly capable patterns.
Validate Password Strength with Regex: the production application of the lookahead-stacking pattern shown above.

For specific real-world patterns that lean heavily on lookarounds:

Match URLs with Regex: scheme + host + path extraction uses lookbehind for the scheme assertion.
Match Email Addresses with Regex: negative lookahead is essential for excluding non-email tokens that look superficially like addresses.

For the wider regex syntax reference, see the Regex Cheat Sheet.

And for a war story about exactly this trade-off, RE2-style engines like Go's dropping lookarounds in exchange for linear-time matching: the interview question that cost me the job, where a Go log-parsing take-home turned on knowing which patterns Go's regexp will and won't compile.

External reference: the MDN regex assertions page covers JavaScript's implementation; test interactively at regex101.com (set the flavor to PCRE2 or JavaScript depending on your target).

FAQ

Lookahead (?=...) asserts that a pattern follows the current position; lookbehind (?<=...) asserts that a pattern precedes it. Neither consumes characters; they only assert.

Use lookahead when you want to match X but only when Y comes after, without capturing Y. Use lookbehind for the opposite: match X only when Y comes before.

Positive lookahead (?=Y) succeeds when Y matches at the current position. Negative lookahead (?!Y) succeeds when Y does NOT match. Same idea applies to lookbehind: (?<=Y) requires Y before, (?<!Y) requires Y to NOT be before.

Common use: positive lookahead to find tokens adjacent to a marker without capturing the marker. Negative lookahead to find tokens that explicitly avoid a context (e.g., \d+(?!%) for numbers not followed by a percent sign).

Yes, since ES2018. Browser support: Chrome 62+, Firefox 78+, Safari 16.4+, Node.js 10+. Older runtimes throw a SyntaxError at regex-compile time.

JavaScript's lookbehind also supports variable-width patterns since ES2018, which is unusual: most other engines require fixed-width or limited alternation. If you need to support pre-2018 browsers, restructure the pattern using non-capturing groups and post-processing, or wrap the regex construction in try/catch.

Python's stdlib re module requires fixed-width lookbehind at every version including 3.12. Patterns like (?<=https?:\/\/) are rejected because the alternation can match 7 or 8 characters depending on which side wins.

Two fixes. Rewrite as alternation of fixed-widths: (?<=http:\/\/|https:\/\/)\w+ (each branch is now fixed). Or install the third-party regex package, which is a drop-in replacement for re with full variable-width lookbehind support.

Not in the standard library's regexp package, which uses RE2 and omits all backreferences and lookarounds to guarantee linear-time matching.

If you need lookbehind in Go, use the third-party regexp2 package, which implements the .NET regex flavor and supports the full feature set. Same situation in Rust: the stdlib regex crate is RE2-style and has no lookarounds; use fancy-regex for the full PCRE-style feature set.

An assertion that matches a position rather than characters. Anchors (^, $, \b) and lookarounds are all zero-width: they constrain where a match can happen but contribute no characters to the matched text.

The practical implication is that you can stack them without affecting the match's length or position. Multiple lookaheads at the start of a pattern (like in password validation) all check the same range from the same anchor point.

Use negative lookbehind: (?<!preceding\s)word. In engines that require fixed-width lookbehind (Python stdlib re, older PCRE), the inside of the lookbehind has to match an exact length. In engines that support variable-width (JavaScript V8, .NET, PCRE2, the Python regex package), the lookbehind can match any width.

If your engine has fixed-width-only lookbehind and you need variable width, restructure as a positive match plus filtering in code rather than fighting the regex.

Yes. Capturing groups inside a lookaround do capture text into numbered groups, even though the lookaround itself contributes nothing to the main match. A pattern like q(?=(\d+)) matches a literal q at position 0 (length 1), but group 1 captures the digits that follow.

The common bug is checking match[0] and expecting the captured digits there. They're in match[1]. This is the technique people use to "match" a length-zero context and pull out a substring that wasn't consumed.

Recommended books

Lookarounds are the most powerful and most misunderstood part of regex. The books that explain them properly:

Mastering Regular Expressions (Jeffrey Friedl, 3rd edition). The definitive deep-dive on how regex engines actually work: backtracking, NFA versus DFA, and the optimisation that makes a pattern fast or catastrophic. Dense, and unmatched once you are past the basics.
Regular Expressions Cookbook (Jan Goyvaerts and Steven Levithan, 2nd edition). Problem-then-solution recipes across eight languages (JavaScript, Python, PHP, Java, .NET, Ruby, Perl, VB). The one to keep next to the keyboard.

How to Use Regex Lookaheads and Lookbehinds

Lookahead: match X only if followed by Y

Lookbehind: match X only if preceded by Y

Negative lookahead and lookbehind

The password-validation use case

Examples in JavaScript, Python, and PHP

Engine compatibility (especially JavaScript lookbehind)

Variable-width vs fixed-width lookbehind

Common mistakes

What to do next

FAQ

Recommended books

Sources

Ishan Karunaratne

Related posts

How to Use Regex in Nginx (location and rewrite)

How to Use Regex in .htaccess (Apache mod_rewrite)

How to Give a User sudo Access on Linux

What is the difference between lookahead and lookbehind in regex?

What is the difference between positive and negative lookahead?

Does JavaScript support regex lookbehind?

Why does my Python lookbehind throw "requires fixed-width pattern"?

Can I use lookbehind in Go?

What is a zero-width assertion?

How do I write a regex that matches a word NOT preceded by another word?

Can I capture text inside a lookaround?

Sources

Ishan Karunaratne