grep does not have one regex engine. It has three, and which one you get depends on the flag. The default is basic regex (BRE). Add -E and you get extended regex (ERE). Add -P and you get Perl-compatible regex (PCRE). The same pattern string can match different text, or fail to compile, depending on which mode is active.
The mode that surprises everyone is the default. In BRE, the characters +, ?, |, (, ), {, and } are literal text. They match themselves. To use them as metacharacters you have to backslash-escape them: \+, \?, \|, \(, \), \{, \}. (Strictly, POSIX BRE defines only \( \) and \{ \}; the escaped \+, \?, and \| are GNU extensions.) That inversion (escape the metacharacter to make it special, leave it bare to make it literal) is the single biggest reason people give up on plain grep and reach for grep -E.
This article is the deep dive on the three modes. For the full flag reference, see the grep cheat sheet.
The three modes at a glance
| Mode | Flag | Metacharacters | Best for |
|---|---|---|---|
| Basic (BRE) | none (default) | bare: . * ^ $ [...]; escaped: \{ \} \( \) \+ \? and backslash-pipe | Simple literal-ish searches; portable scripts |
| Extended (ERE) | -E (or egrep) | adds + ? ( ) { } and the pipe, all bare | Everyday searches; the recommended default |
| Perl-compatible (PCRE) | -P | adds lookaround, \d \w \s, non-greedy, named groups | Anything BRE and ERE cannot express |
The practical advice: reach for -E by default. Use plain grep only when the pattern is genuinely basic, and use -P only when you need something PCRE-exclusive and you are on GNU grep.
BRE: the default, where metacharacters are literal
In basic regex, this list of characters means themselves, not their regex function:
+ matches a literal plus sign
? matches a literal question mark
| matches a literal pipe character
( matches a literal open paren
) matches a literal close paren
{ matches a literal open brace
} matches a literal close brace
To get the regex behavior, you escape them. So in BRE, "one or more digits" is written with an escaped plus:
grep '[0-9]\+' app.log
That \+ is "one or more of the preceding". Without the backslash, [0-9]+ would match a digit followed by a literal + character. Grouping and alternation work the same way, escaped:
grep '\(error\|warn\)' app.log
The escaped \( and \) form a group; the escaped \| is alternation. Interval quantifiers also need escaping. To match "between 2 and 4 of the preceding", you write the braces escaped:
grep 'a\{2,4\}' app.log
What does work bare in BRE: . (any character), * (zero or more of the preceding), ^ (start of line), $ (end of line), [...] (character class), and [^...] (negated class). Those six are the BRE toolkit. Everything else is escape-to-activate.
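The bare-versus-escaped difference is easy to check on a few lines of input. The sketch below is only an illustration: the sample text and the variable names (sample, bare, escaped, interval) are invented here, and the \+ form is assumed to run on GNU grep, where it is an extension. The \{1,\} interval is the POSIX way to say the same thing.

```shell
# Hypothetical sample input, made up for this illustration.
sample='latency 120ms
formula 2+2
no digits here'

# BRE with a bare +: a digit followed by a literal plus sign.
bare=$(printf '%s\n' "$sample" | grep '[0-9]+')

# BRE with an escaped \+: one or more digits (GNU extension syntax).
escaped=$(printf '%s\n' "$sample" | grep '[0-9]\+')

# POSIX BRE spelling of "one or more": an escaped interval.
interval=$(printf '%s\n' "$sample" | grep '[0-9]\{1,\}')

printf 'bare +:\n%s\n\nescaped \\+:\n%s\n' "$bare" "$escaped"
```

With this input, the bare pattern should select only the line containing 2+2, while the escaped and interval forms select both lines that contain digits.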
BRE exists because it is the original 1970s grep behavior, frozen by POSIX for backward compatibility. The default stays, and -E is the opt-in to sanity.
ERE: extended regex, the one you actually want
Extended regex flips the rule. In ERE, + ? | ( ) { } are metacharacters directly, no backslash needed. To match them literally you escape them, which is what every other regex flavor does and what your instincts expect.
The same three patterns from above, rewritten for ERE:
grep -E '[0-9]+' app.log
grep -E '(error|warn)' app.log
grep -E 'a{2,4}' app.log
Cleaner, and it matches how regex works in Python, JavaScript, Perl, and every editor's find dialog. This is why ERE is the right default for interactive use.
grep -E ':pattern' :search_path/*.log
Here :pattern and :search_path are placeholders for your own pattern and directory.
egrep is the historical shorthand for grep -E. It still works on most systems, but GNU grep 3.8 and later print a warning that egrep is obsolescent and tell you to use grep -E. Treat egrep as legacy; write grep -E in anything you commit.
One thing ERE does not add: the Perl shorthand classes. POSIX ERE defines none of \d, \w, and \s. GNU grep accepts \w and \s as extensions, but \d works only under -P. More on that below.
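To see the three ERE patterns do something observable, here is a small sketch that runs them against inline text instead of app.log. The sample lines and variable names are made up for the illustration; only the grep flags (-E, -c, -o) are real.

```shell
# Hypothetical sample input, made up for this illustration.
sample='2024-01-01 error: disk full
2024-01-01 warn: retry 3
2024-01-01 info: ok
aaa marker'

# Count lines containing one or more digits.
with_digits=$(printf '%s\n' "$sample" | grep -cE '[0-9]+')

# Print only the matched alternative, one per line.
levels=$(printf '%s\n' "$sample" | grep -oE '(error|warn)')

# Print runs of two to four consecutive a characters.
run_of_a=$(printf '%s\n' "$sample" | grep -oE 'a{2,4}')

printf 'lines with digits: %s\nlevels:\n%s\nrun: %s\n' \
    "$with_digits" "$levels" "$run_of_a"
```

None of the metacharacters needs a backslash here, which is the whole argument for -E.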
PCRE: the full Perl engine
grep -P switches to PCRE, a separate library that implements Perl-style regex syntax (recent GNU grep releases link against PCRE2). This is a genuinely different and far larger engine. It supports everything ERE has, plus:
- Lookahead (?=...) and negative lookahead (?!...)
- Lookbehind (?<=...) and negative lookbehind (?<!...)
- Non-greedy quantifiers: *?, +?, ??, {n,m}?
- Shorthand classes: \d (digit), \w (word char), \s (whitespace), and their negations \D, \W, \S
- Named groups: (?<name>...)
- Backreferences by number \1 and by name \k<name>
- Word boundaries \b that work reliably across the engine
Lookbehind is the headline feature. To extract the value after user= without including user= itself in the match, you anchor with a lookbehind:
grep -oP '(?<=user=)\w+' app.log
The (?<=user=) lookbehind asserts "preceded by user=" without consuming those characters, so -o prints just the username. There is no way to write that in BRE or ERE. The closest you get is a capture group plus sed or awk to pull the group out.
Non-greedy matching is the other one ERE cannot do. .* is greedy and grabs as much as possible; .*? stops at the first opportunity:
grep -oP '".*?"' data.json
-P is GNU only. It is a compile-time option in GNU grep, and even on Linux some minimal builds omit it (you get grep: support for the -P option has not been compiled in). It does not exist at all in BSD grep, which is what macOS ships. That platform gap is covered in the macOS section further down.
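Because -P can be missing, a script that wants the lookbehind result can probe for it first and fall back to the capture-group-plus-sed approach mentioned above. This is a minimal sketch under stated assumptions: the sample lines and the users variable are invented, and the sed fallback assumes usernames consist of ASCII letters, digits, and underscores with at most one user= field per line.

```shell
# Hypothetical sample input, made up for this illustration.
sample='ts=1 user=alice action=login
ts=2 user=bob action=logout
ts=3 action=ping'

# Probe: does this grep accept -P at all?
if printf 'x1\n' | grep -qP '\d' 2>/dev/null; then
    # PCRE path: lookbehind keeps user= out of the printed match.
    users=$(printf '%s\n' "$sample" | grep -oP '(?<=user=)\w+')
else
    # Fallback: BRE capture group in sed, printing only lines that matched.
    users=$(printf '%s\n' "$sample" |
        sed -n 's/.*user=\([A-Za-z0-9_]*\).*/\1/p')
fi

printf '%s\n' "$users"
```

Both branches should print the same two usernames for this input; the sed branch is just wordier and less general.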
The same match in all three flavors
Here is one task (find lines with one or more digits followed by ms) written three ways:
BRE: grep '[0-9]\+ms' app.log
ERE: grep -E '[0-9]+ms' app.log
PCRE: grep -P '\d+ms' app.log
All three match the same lines. BRE escapes the +; ERE uses it bare; PCRE uses it bare and swaps [0-9] for the \d shorthand. ERE is the portable choice that still reads cleanly.
A second example, "a word repeated 2 to 3 times", shows the brace difference:
BRE: grep '\(foo\)\{2,3\}' app.log
ERE: grep -E '(foo){2,3}' app.log
PCRE: grep -P '(foo){2,3}' app.log
ERE and PCRE are identical here; only BRE needs the escaping.
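The "same lines in every flavor" claim can be checked mechanically. The following sketch compares the output of the three digits-then-ms commands on invented sample text; the variable names are hypothetical, the BRE form assumes GNU grep's \+ extension, and the PCRE comparison is skipped when -P is unavailable.

```shell
# Hypothetical sample input, made up for this illustration.
sample='GET /a 200 35ms
GET /b 500 timeout
POST /c 201 7ms
ms only'

bre=$(printf '%s\n' "$sample" | grep '[0-9]\+ms')
ere=$(printf '%s\n' "$sample" | grep -E '[0-9]+ms')

if printf 'x1\n' | grep -qP '\d' 2>/dev/null; then
    pcre=$(printf '%s\n' "$sample" | grep -P '\d+ms')
    [ "$pcre" = "$ere" ] && pcre_status='same' || pcre_status='different'
else
    pcre_status='skipped (no -P support)'
fi

[ "$bre" = "$ere" ] && echo 'BRE and ERE agree'
echo "PCRE: $pcre_status"
```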
Feature comparison
| Feature | BRE | ERE | PCRE |
|---|---|---|---|
| Anchors ^ $ | Yes | Yes | Yes |
| Any char ., star * | Yes | Yes | Yes |
| Character class [...] | Yes | Yes | Yes |
| Grouping | \( \) | ( ) | ( ) |
| Alternation | backslash-pipe (GNU extension) | bare pipe | bare pipe |
| One-or-more, zero-or-one | \+ \? | + ? | + ? |
| Interval quantifier | \{n,m\} | {n,m} | {n,m} |
| Shorthand \d | No | No | Yes |
| Shorthand \w \s | GNU extension only | GNU extension only | Yes |
| POSIX class [[:digit:]] | Yes | Yes | Yes |
| Backreference \1 | Yes | No (POSIX), GNU adds it | Yes |
| Non-greedy *? | No | No | Yes |
| Lookahead, lookbehind | No | No | Yes |
| Named groups | No | No | Yes |
The two things that catch people: \d is PCRE-only (GNU grep accepts \w and \s outside -P, but only as an extension), and ERE actually drops backreference support that BRE has (GNU re-adds it as an extension, but POSIX ERE has no \1).
\d \w \s are not in BRE or ERE
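This is the most common false assumption. \d looks universal because it works in Python, JavaScript, and PCRE. But POSIX BRE and ERE define no shorthand classes at all. In GNU grep, \d outside -P is just an escaped d, which matches a literal d (recent GNU versions also warn about the stray backslash). So grep -E '\d' finds the letter d, not digits. GNU grep does accept \w and \s as extensions, but a script that has to run on other greps should not count on them.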
The portable replacement is a POSIX character class or an explicit range:
| Perl shorthand | POSIX class (BRE/ERE) | Explicit range |
|---|---|---|
| \d | [[:digit:]] | [0-9] |
| \w | [[:alnum:]_] | [A-Za-z0-9_] |
| \s | [[:space:]] | (no clean range) |
| \D | [^[:digit:]] | [^0-9] |
So "three digits" in ERE is:
grep -E '[[:digit:]]{3}' app.log
POSIX classes have an advantage over [0-9]: they are locale-aware. In a non-ASCII locale, [[:alpha:]] matches accented letters that [A-Za-z] misses. For pure ASCII data the explicit ranges are fine. If you genuinely want \d and \w, that is your signal to use -P on GNU grep.
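A short sketch of the POSIX classes in use, on invented input. One detail worth seeing: [[:digit:]]{3} finds three digits anywhere, so it also matches inside a longer number. The second pattern is one possible way (an illustration, not the only one) to require exactly three. Variable names are hypothetical.

```shell
# Hypothetical sample input, made up for this illustration.
ids='id=123
id=12
id=4567'

# At least three consecutive digits somewhere on the line.
three_plus=$(printf '%s\n' "$ids" | grep -E '[[:digit:]]{3}')

# Exactly three: a non-digit (or line edge) on both sides of the run.
exactly_three=$(printf '%s\n' "$ids" |
    grep -E '(^|[^[:digit:]])[[:digit:]]{3}([^[:digit:]]|$)')

# [[:space:]] covers tabs as well as spaces; count matching lines.
with_space=$(printf 'a b\nab\na\tb\n' | grep -c '[[:space:]]')

printf 'three or more:\n%s\nexactly three: %s\nlines with whitespace: %s\n' \
    "$three_plus" "$exactly_three" "$with_space"
```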
macOS BSD grep vs GNU grep
macOS ships BSD grep, not GNU grep. They agree on BRE and ERE. They diverge hard on PCRE.
| Capability | GNU grep | BSD grep (macOS default) |
|---|---|---|
| BRE (default) | Yes | Yes |
| ERE (-E) | Yes | Yes |
| PCRE (-P) | Yes (if compiled in) | Not supported at all |
| \d in -E | Literal d | Not POSIX; do not rely on it |
| \w \s in -E | Work (GNU extension) | Not POSIX; do not rely on them |
| POSIX classes [[:digit:]] | Yes | Yes |
| Backreference \1 in ERE | Yes (GNU extension) | No |
On macOS, grep -P fails immediately with grep: invalid option -- P. There is no PCRE engine behind BSD grep to enable. Three fixes:
- Install GNU grep. brew install grep puts it on PATH as ggrep. Run ggrep -P '...', or alias grep='ggrep' in your shell rc.
- Use pcregrep. A separate Homebrew package (brew install pcre) that is purpose-built for PCRE and also does multi-line matching with -M.
- Use ripgrep. brew install ripgrep, then rg --pcre2 '...'. ripgrep defaults to Rust's regex engine, which understands \d, \w, and non-greedy quantifiers but not lookaround or backreferences; the --pcre2 flag switches to PCRE2 for those.
grep -oP '(?<=v)[0-9]+' :search_path/*.log
(:search_path is a placeholder for your own directory.)
PowerShell's Select-String uses the .NET regex engine, which supports lookaround and \d natively, so most PCRE-style patterns work on Windows without any extra install.
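For a script that has to run on both Linux and macOS, one option is to probe for whichever PCRE-capable tool is installed instead of hard-coding grep -P. The helper below is a hypothetical sketch, not a standard utility: the function name pick_pcre_tool and the candidate order are assumptions, and it relies on sh or bash word-splitting an unquoted variable (zsh does not do this by default).

```shell
# Print the first command line that can run a lookbehind pattern.
pick_pcre_tool() {
    for candidate in "grep -P" "ggrep -P" "rg --pcre2" "pcregrep"; do
        # Unquoted on purpose: $candidate holds a command plus an optional flag.
        if printf 'v42\n' | $candidate '(?<=v)[0-9]+' >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

version=''
pcre_tool=$(pick_pcre_tool) || pcre_tool=''

if [ -n "$pcre_tool" ]; then
    # -o (print only the match) is accepted by all four candidates.
    version=$(printf 'release v42\n' | $pcre_tool -o '(?<=v)[0-9]+')
    echo "using: $pcre_tool, extracted: $version"
else
    echo 'no PCRE-capable tool found; install GNU grep, pcregrep, or ripgrep'
fi
```

Probing with a real lookbehind, not just checking that the binary exists, also catches GNU builds compiled without -P support.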
Common mistakes
1. Using + in BRE and expecting one-or-more. Plain grep '[0-9]+' looks for a digit followed by a literal plus sign, because in BRE the + is literal. You wanted grep '[0-9]\+' or, better, grep -E '[0-9]+'. This is the number-one BRE trap.
2. Expecting \d to work under -E. grep -E '\d{3}' does not match three digits. ERE has no \d; the engine reads it as a literal d. Use grep -E '[0-9]{3}' or grep -P '\d{3}'.
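3. Reaching for lookahead without -P. (?=...) and (?<=...) are PCRE constructs. Under plain grep the (, ?, and = are all literal characters. Under grep -E the ( opens an ordinary group and the ? right after it has nothing to repeat, which POSIX leaves undefined, so the result varies by build. Either way you do not get lookaround. If you need it, you need -P, full stop.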
4. Running grep -P on macOS. BSD grep has no -P. The command fails with invalid option. Install GNU grep, pcregrep, or ripgrep instead of fighting it.
5. Escaping in the wrong direction. In ERE, \( matches a literal paren and ( starts a group. People coming from BRE escape their groups out of habit, then wonder why the grouping vanished. Pick a mode and commit to its rules.
6. Forgetting POSIX intervals need a closing brace. grep -E 'a{2,' with an unterminated {2, is sometimes accepted as literal text and sometimes errors, depending on the build. Always close the interval.
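Mistake 5 is the hardest to spot because nothing errors out; the pattern just matches different text. A small sketch on invented input, assuming GNU grep, where an escaped paren or pipe under -E is a literal character. The variable names are hypothetical.

```shell
# Hypothetical sample input, made up for this illustration.
sample='error: disk full
warn: retry
docs mention (error|warn) levels'

# BRE-style escapes under -E: searches for the literal text "(error|warn)".
bre_style_in_ere=$(printf '%s\n' "$sample" | grep -E '\(error\|warn\)')

# Proper ERE: a group with alternation; count the matching lines.
ere_count=$(printf '%s\n' "$sample" | grep -cE '(error|warn)')

printf 'escaped under -E matched: %s\nbare under -E matched %s lines\n' \
    "$bre_style_in_ere" "$ere_count"
```

The escaped version should select only the line that literally contains the parenthesized text, while the bare version matches every line containing either word.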
When NOT to use this
Regex is not always the right tool. Skip it when:
- The pattern is a fixed literal string. If you are searching for 192.168.1.1 or Cmd+Shift+P, use grep -F (fixed strings). It is faster, and it means the . and + in your search term are treated as literal characters with zero escaping. No regex mode needed.
- You need to actually parse structured data. Regex is a poor JSON, HTML, or CSV parser. For JSON use jq; for columnar text use awk; for real grammar use a proper parser. A regex that "mostly works" on structured input is a bug waiting for the one edge case that breaks it.
- You need fields, arithmetic, or multi-line logic. That is awk territory. grep finds lines; awk processes them. If your pattern is growing capture groups just to pull out a column, switch tools.
- You are matching across newlines. grep is line-oriented and no regex mode changes that. Use pcregrep -M, rg --multiline, or preprocess with tr.
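The fixed-string point from the first bullet is easy to demonstrate. In this sketch (sample text and variable names invented for the illustration), the same search terms are run once as a regex and once with -F.

```shell
# Hypothetical sample input, made up for this illustration.
sample='192.168.1.1
192x168y1z1
Cmd+Shift+P
CmdShiftP'

# As a regex, each . matches any character, so both address-like lines match.
regex_ip=$(printf '%s\n' "$sample" | grep -c '192.168.1.1')

# With -F the dots are literal, so only the real address matches.
fixed_ip=$(printf '%s\n' "$sample" | grep -cF '192.168.1.1')

# Under -E the + is a quantifier, so the shortcut text itself does not match.
ere_keys=$(printf '%s\n' "$sample" | grep -E 'Cmd+Shift+P')

# With -F the + is literal and the shortcut is found as written.
fixed_keys=$(printf '%s\n' "$sample" | grep -F 'Cmd+Shift+P')

printf 'regex ip lines: %s, fixed ip lines: %s\n' "$regex_ip" "$fixed_ip"
printf 'ERE found: %s, -F found: %s\n' "$ere_keys" "$fixed_keys"
```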
See also
- grep cheat sheet: the full flag reference, the hub this article expands on
- Match multiple patterns with grep: alternation, multiple -e flags, and grep -f pattern files
- regex cheat sheet: the full regex syntax reference for the patterns you feed into grep -E and -P
- External: GNU grep manual, FreeBSD grep(1) man page





