grep -w: Match a Whole Word, Not a Substring (2026)

grep cat file matches more than you want. It hits category, concatenate, scatter, wildcat: anything with the three letters cat somewhere inside it. By default grep matches substrings, and a short search word turns into a wall of false positives.

grep -w 'cat' file fixes it. The -w flag tells grep to match only when the pattern stands alone as a whole word, with a word boundary on each side. cat matches; category does not. This is the flag I reach for whenever the search term is a real word that also lives inside longer words, and it is the difference between a clean result set and a manual scroll-and-skim.

Set your values

The examples below use two placeholders: :pattern is the word you want to match, and :search_path is the file or directory to search. Substitute your own values before running them. The commands are written for GNU grep on Linux unless noted otherwise.

The one-liner

bash· Linux (GNU)

grep -w ':pattern' :search_path

That returns every line where :pattern appears as a standalone word. With id as the pattern, grep -w 'id' matches the literal word id but skips width, valid, idea, android, and kid. Without -w, a search for id in any real codebase is close to useless.
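A quick way to see the difference is to run both forms against a throwaway file. The sketch below is illustrative only: the sample lines and the temp file are invented for the demonstration.

```bash
# Build a small sample file (hypothetical contents, for illustration only).
sample=$(mktemp)
printf '%s\n' 'id = 42' 'width: 100' 'valid input' \
              'android build' 'the kid' 'user id: 7' > "$sample"

# Substring search: every one of the six lines contains "id" somewhere.
substring_hits=$(grep -c 'id' "$sample")

# Whole-word search: only the lines where "id" stands alone.
word_hits=$(grep -w 'id' "$sample")

echo "substring search matched $substring_hits lines"
echo "whole-word search matched:"
echo "$word_hits"

rm -f "$sample"
```

Only the first and last sample lines survive the -w filter; the other four contain id only inside a longer word.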

PowerShell has no direct -w equivalent. On Windows, Select-String takes a .NET regex, so write the word boundaries explicitly with \b, as in Select-String -Pattern '\bid\b'.

What counts as a word boundary

-w is not magic. It is shorthand for wrapping your pattern in word boundaries, and a word boundary is defined by what grep considers a word character:

Letters: a-z, A-Z
Digits: 0-9
Underscore: _

Everything else (spaces, punctuation, slashes, dots, the start and end of the line) is a non-word character, and the transition between a word character and a non-word character is a boundary.

So -w requires that the character immediately before the match is a non-word character or the line start, and the character immediately after is a non-word character or the line end. That is the whole rule.

The underscore inclusion is the part people forget. grep -w 'user' will not match user_id or _user, because _ is a word character, so there is no boundary between user and _. If you are searching code where identifiers use snake_case, -w on a single segment of the name silently misses every compound identifier. More on that in the mistakes section.
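The underscore rule is easy to check on a handful of identifiers. This is a minimal sketch with made-up names; the point is only which separators count as boundaries.

```bash
# Hypothetical identifiers, one per line.
ids=$(mktemp)
printf '%s\n' 'user' 'user_id' '_user' 'user.name' 'user-name' 'username' > "$ids"

# Underscore is a word character, so user_id and _user are skipped.
# Dot and hyphen are not, so user.name and user-name still match.
matched=$(grep -w 'user' "$ids")
echo "$matched"

rm -f "$ids"
```

The output is user, user.name, and user-name: three of the six lines.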

The manual equivalent with regex word boundaries

-w is a convenience flag. You can write the same thing by hand with regex word-boundary metacharacters, which is useful when you need a boundary on only one side, or when you are composing a larger pattern.

GNU grep supports the Perl-style \b boundary in both BRE and ERE mode. The ERE form looks like this:

bash· Linux (GNU)

grep -E '\b:pattern\b' :search_path

The classic BRE (basic regex) form uses \< for "start of word" and \> for "end of word":

bash· Linux (GNU)

grep '\<:pattern\>' :search_path

The \< and \> anchors are zero-width: they match a position, not a character, just like ^ and $. \< is supported on both GNU and BSD grep, which makes it the most portable choice when you need an explicit boundary. \b matches either side of a boundary, so it is more flexible, but BSD grep does not honor it.
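The one-sided case is where the explicit anchors earn their keep. The following sketch, using an invented word list, puts a boundary on the left only, on the right only, and on both sides.

```bash
words=$(mktemp)
printf '%s\n' 'cat' 'category' 'concatenate' 'wildcat' 'the cat sat' > "$words"

# \<cat : "cat" must start a word; anything may follow it.
starts_with=$(grep '\<cat' "$words")

# cat\> : "cat" must end a word; anything may precede it.
ends_with=$(grep 'cat\>' "$words")

# Both anchors together give the same result as -w for a plain word.
both_sides=$(grep '\<cat\>' "$words")
dash_w=$(grep -w 'cat' "$words")

printf 'starts a word:\n%s\n\n' "$starts_with"
printf 'ends a word:\n%s\n\n' "$ends_with"
printf 'whole word:\n%s\n' "$both_sides"

rm -f "$words"
```

Left-anchored, the search keeps category but drops wildcat; right-anchored, it does the opposite. concatenate fails both because cat sits in the middle of the word.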

For the full tour of which metacharacters belong to which regex mode, see the BRE vs ERE vs PCRE guide.

-w with a multi-word or regex pattern

When the pattern is more than a single literal word, -w applies the boundary to the whole pattern, not to each piece inside it. This is the rule that surprises people most.

grep -w 'hot dog' file requires a boundary before hot and after dog. It matches a hot dog please but not hot dogs (the s makes dog not end on a boundary). The space between hot and dog is internal to the pattern; -w does not touch it.

With an alternation, the boundary still wraps the entire expression:

bash· Linux (GNU)

grep -wE '(cat|dog|fish)' pets.txt

GNU's documentation describes -w precisely: the matched substring must either be at the start of the line or preceded by a non-word character, and must either be at the end of the line or followed by a non-word character. For an alternation, "the matched substring" is whichever branch matched, so each of cat, dog, fish is independently required to sit on boundaries. That behavior is what you usually want, but it is worth knowing it is the match, not the pattern source, that gets the boundary.
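Both behaviors, the multi-word phrase and the alternation, can be checked against a small file. The contents below are invented for the example; pets.txt is replaced by a temp file so the sketch cleans up after itself.

```bash
pets=$(mktemp)
printf '%s\n' 'a hot dog please' 'two hot dogs' 'my cat naps' \
              'catfish stew' 'dogfish and fish' > "$pets"

# The boundary wraps the whole phrase: before "hot" and after "dog".
phrase=$(grep -w 'hot dog' "$pets")

# With -E, whichever branch matched must sit on boundaries.
any_pet=$(grep -wE '(cat|dog|fish)' "$pets")

printf 'phrase:\n%s\n\n' "$phrase"
printf 'alternation:\n%s\n' "$any_pet"

rm -f "$pets"
```

catfish stew is rejected because neither cat nor fish stands alone in it, while dogfish and fish is accepted on the strength of the final fish.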

-w combined with -i, -r, -v, -c

-w composes cleanly with the rest of the grep flag set.

Case-insensitive whole-word match:

bash· Linux (GNU)

grep -wi ':pattern' :search_path

Recursive whole-word search through a directory tree:

bash· Linux (GNU)

grep -rwn ':pattern' :search_path

Invert the match: lines that do not contain the word as a standalone token:

bash· Linux (GNU)

grep -wv ':pattern' :search_path

One subtlety with -wv: a line containing width (but never the bare word id) counts as a non-match for grep -w 'id', so -wv keeps it. That is correct, but if you mentally model -w as "lines mentioning id", the inverted result can look wrong. It is not; -v inverts the whole-word test, not a substring test.
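A three-line sample, made up for illustration, shows which lines -wv keeps:

```bash
css=$(mktemp)
printf '%s\n' 'id: main' 'width: 10px' 'color: red' > "$css"

# -v inverts the whole-word test, so the "width" line is kept:
# it contains the letters "id" but never the standalone word.
kept=$(grep -wv 'id' "$css")
echo "$kept"

rm -f "$css"
```

Only the id: main line is dropped.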

Count whole-word matches:

bash· Linux (GNU)

grep -wc ':pattern' :search_path

-c counts matching lines, not matches. A line with the word twice counts once. For a true occurrence count use grep -ow ':pattern' file | wc -l.
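The gap between the two counts is easiest to see on a line that repeats the word. A minimal sketch with invented input:

```bash
log=$(mktemp)
printf '%s\n' 'id id id' 'no match here' 'valid id' > "$log"

# -c counts lines that contain at least one whole-word match.
line_count=$(grep -wc 'id' "$log")

# -o prints each match on its own line, so wc -l counts occurrences.
# tr strips the padding some wc implementations add.
occurrence_count=$(grep -ow 'id' "$log" | wc -l | tr -d ' ')

echo "matching lines: $line_count"
echo "occurrences:    $occurrence_count"

rm -f "$log"
```

Here two lines match, but the word occurs four times: three on the first line and once on the last.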

-x for whole-LINE match (the stricter cousin)

-w has a stricter sibling: -x matches only when the pattern equals the entire line, not just a whole word inside it.

Flag	Matches	`cat` matches the line...
(none)	substring anywhere	`concatenate`, `the cat sat`
`-w`	whole word, boundaries on both sides	`the cat sat`, not `concatenate`
`-x`	the whole line, nothing else on it	`cat`, not `the cat sat`

grep -x 'cat' file matches a line that is exactly cat and nothing more: no leading spaces, no trailing text. It is the right tool when you are checking config files or allow-lists where a line must be an exact value.

bash· Linux (GNU)

grep -x ':pattern' :search_path

A common pairing is grep -Fx to check membership: grep -Fxq 'value' allowlist.txt exits 0 if value is a whole line in the file, treating the pattern as a fixed string. That is the canonical "is this entry in the list" test in shell scripts.
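One way to package that test is a small shell function. The name in_allowlist and the sample entries are hypothetical; the only real interface here is grep -Fxq itself.

```bash
# in_allowlist VALUE FILE
# Hypothetical helper: succeeds only if VALUE is an entire line of FILE.
in_allowlist() {
    # -F fixed string, -x whole line, -q quiet.
    # "--" stops option parsing in case VALUE starts with a dash.
    grep -Fxq -- "$1" "$2"
}

allow=$(mktemp)
printf '%s\n' 'alice' 'bobXsmith' > "$allow"

# An exact line is found.
if in_allowlist 'alice' "$allow"; then full_line=yes; else full_line=no; fi
# A prefix of a line is not.
if in_allowlist 'ali' "$allow"; then partial=yes; else partial=no; fi
# With -F the dot is literal, so bob.smith does not match bobXsmith.
if in_allowlist 'bob.smith' "$allow"; then dotted=yes; else dotted=no; fi

echo "alice: $full_line, ali: $partial, bob.smith: $dotted"

rm -f "$allow"
```

Without -F, the third check would succeed, because an unescaped dot in a regex matches the X.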

Think of it as a scale: no flag matches anywhere on the line, -w matches a token on the line, -x matches the whole line. Pick the tightest one that still catches what you need.

macOS BSD grep vs GNU grep

-w and -x themselves are portable: both work identically on GNU grep (Linux) and BSD grep (the macOS default). The divergence is entirely in the regex boundary metacharacters.

Feature	GNU grep	BSD grep (macOS default)
`-w` (whole word)	Supported	Supported
`-x` (whole line)	Supported	Supported
`\<` `\>` (BRE word anchors)	Supported	Supported
`\b` (Perl-style boundary)	Supported in BRE and ERE	Treated as a literal backspace
`\w` `\W` (word char classes)	Supported	Not supported (use `[[:alnum:]_]`)

The practical takeaway: if a script has to run on both Linux and macOS, use -w for whole-word matching, or use \< \> when you need an explicit one-sided boundary. Avoid \b and \w in portable scripts. On macOS, brew install grep gives you GNU grep as ggrep if you genuinely need \b. The grep cheat sheet has the full BSD vs GNU divergence table.
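As a sketch of what the portable spelling looks like in practice, the example below swaps \w for [[:alnum:]_] and \b for \< and \>. The sample line and the _id naming convention are invented for illustration.

```bash
sql=$(mktemp)
printf '%s\n' 'select user_id, order_id from orders where valid_ids = 1' > "$sql"

# GNU-only spelling would be:  grep -oE '\b\w+_id\b'
# Portable spelling: a POSIX bracket class plus the \< \> anchors.
id_columns=$(grep -oE '\<[[:alnum:]_]+_id\>' "$sql")
echo "$id_columns"

rm -f "$sql"
```

This prints user_id and order_id. valid_ids is skipped because the trailing s means the match would not end on a word boundary.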

Common grep -w mistakes

1. Assuming -w adds boundaries inside the pattern. -w wraps the whole pattern (or whichever alternation branch matched), not each word in it. grep -w 'foo|bar' without -E matches the literal seven-character string foo|bar as a whole word, which is almost never the intent. Use grep -wE '(foo|bar)' so the alternation is parsed and each branch gets the boundary.

2. Forgetting underscore is a word character. grep -w 'user' does not match user_id, user_name, or _user, because _ is a word character and there is no boundary between user and _. In snake_case codebases this silently drops every compound identifier. If you want user and user_*, drop -w and use an explicit pattern like grep -E '\buser\b|\buser_' (GNU), or just search the substring and accept the noise.

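3. A pattern that starts or ends with a non-word character. grep -w '.config' goes wrong in two ways. First, the unescaped . is a regex metacharacter that matches any character, so the pattern also matches xconfig; escape it as '\.config' or add -F. Second, -w does not look for a word inside the pattern. By the rule GNU documents, it tests the characters on either side of the whole match. On GNU grep, grep -w '\.config' therefore matches edit .config now (a space precedes the dot) but not app.config (a word character precedes the dot). That is the opposite of \b\.config\b, which needs a word character before the dot. If the edges of your pattern are punctuation, drop -w and anchor manually so the intent is explicit.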

4. Using -w when the word is glued to punctuation you care about. grep -w 'error' matches error followed by a comma and error followed by a colon, because the comma and the colon are non-word characters. The hyphen is a non-word character too, so the same search also matches the first half of error-code. To match error-code as a hyphenated unit, put the hyphen inside the pattern.

5. Confusing -w with -x. -w matches a word on the line; -x matches the whole line. If grep -w 'cat' returns lines with extra text and you wanted exact lines only, you wanted -x.

6. Expecting -w to do language-aware tokenization. -w only knows the [A-Za-z0-9_] rule. It has no concept of word stems, contractions, or non-ASCII word characters in some locales. don't is two grep-words (don and t) split by the apostrophe.
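The first two mistakes can be reproduced in a few lines. This is a sketch over invented input; the last pattern is a variant of the explicit one above, written with \< and \> instead of \b.

```bash
src=$(mktemp)
printf '%s\n' 'foo' 'bar' 'user = 1' 'user_id = 2' \
              'username = 3' 'superuser = 4' > "$src"

# Mistake 1: without -E the pipe is a literal character, so nothing matches.
# "|| true" keeps the script going when grep exits non-zero on zero matches.
literal_pipe=$(grep -wc 'foo|bar' "$src" || true)
alternation=$(grep -wcE '(foo|bar)' "$src")

# Mistake 2: -w on one segment of a snake_case name misses user_id.
word_only=$(grep -w 'user' "$src")

# Explicit pattern: "user" starts a word and is followed by a word end or "_".
with_suffix=$(grep -E '\<user(\>|_)' "$src")

echo "literal pipe: $literal_pipe lines, alternation: $alternation lines"
printf -- '-w user:\n%s\n\n' "$word_only"
printf 'user or user_*:\n%s\n' "$with_suffix"

rm -f "$src"
```

The explicit pattern picks up user_id while still rejecting username and superuser.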

When NOT to use this

-w is the wrong tool when:

You genuinely want substring matches. Searching for a partial identifier, a filename fragment, or a prefix is a legitimate use of plain grep. Forcing -w there just makes you miss results. grep 'config' to find configure, config.json, and reconfigured is correct as-is.
You need language-aware tokenization. Splitting prose into real words (handling contractions, hyphenated compounds, Unicode letters, stemming) is a job for a text-processing library, not grep's [A-Za-z0-9_] boundary rule. If don't or co-operate must count as single words, grep cannot do it.
The "word" contains boundary characters. A version string like 1.2.3, a path like /etc/hosts, or an email address are not single grep-words; the dots, slashes, and @ are all boundaries. Match these with -F (fixed string) or an anchored regex instead.
You are matching a whole line. Use -x. -w still allows other text on the line.
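For the version-string case, a short sketch shows how a plain regex, -F, and an anchored regex differ. The file contents are invented, and the anchored pattern assumes the version is delimited by whitespace or the line edges, which may not hold for your data.

```bash
versions=$(mktemp)
printf '%s\n' 'release 1.2.3' 'build 1x2y3' 'release 11.2.3' > "$versions"

# Plain regex: each dot matches any character, so all three lines hit.
regex_hits=$(grep -c '1.2.3' "$versions")

# -F: dots are literal, but 11.2.3 still contains 1.2.3 as a substring.
fixed_hits=$(grep -F '1.2.3' "$versions")

# Anchored regex: escaped dots, delimited by whitespace or line start/end.
exact=$(grep -E '(^|[[:space:]])1\.2\.3([[:space:]]|$)' "$versions")

echo "plain regex: $regex_hits lines"
printf -- '-F:\n%s\n\n' "$fixed_hits"
printf 'anchored:\n%s\n' "$exact"

rm -f "$versions"
```

Only the anchored form narrows the result to the single release 1.2.3 line.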

FAQ

What does grep -w actually do?

The -w flag makes grep match the pattern only when it appears as a whole word, with a word boundary on each side. A boundary is the transition between a word character (a letter, a digit, or underscore) and a non-word character (anything else, including the start and end of the line).

So grep -w 'cat' matches the standalone word cat but skips category, concatenate, and scatter, because each of those has a word character directly touching cat.

What is the difference between grep -w and grep -x?

-w matches the pattern as a whole word anywhere on a line; other text can be present. -x matches only when the pattern equals the entire line, with nothing else on it.

For the line the cat sat, grep -w 'cat' matches it but grep -x 'cat' does not, because the line is not exactly cat. Use -x for exact line membership checks, often as grep -Fxq in scripts.

Does grep -w treat underscore as part of a word?

Yes. grep counts letters, digits, and underscore as word characters. That means grep -w 'user' will not match user_id or _user, because there is no boundary between user and the underscore.

This catches people off guard in snake_case codebases. If you need to match a name segment that may be glued to an underscore, drop -w and write the boundaries explicitly, or accept a substring search.

How do I match a whole word without using the -w flag?

Wrap the pattern in regex word boundaries. On GNU grep, grep -E '\bword\b' file works in extended-regex mode. The portable form, which also works on macOS BSD grep, uses the basic-regex anchors: grep '\<word\>' file, where \< marks the start of a word and \> marks the end.

BSD grep does not honor \b (it reads it as a literal backspace character), so prefer \< and \> when the script has to run on macOS.

Why does grep -w with an alternation match unexpected results?

-w applies the boundary to the whole pattern, and without -E a pipe is a literal character. grep -w 'cat|dog' looks for the literal seven-character string cat|dog as a whole word, which usually matches nothing.

Add -E so the alternation is parsed: grep -wE '(cat|dog)'. Then each branch independently must sit on word boundaries, which is the behavior you wanted.

Does grep -w work the same on macOS as on Linux?

Yes. The -w and -x flags behave identically on GNU grep (Linux) and BSD grep (the macOS default). The portability problem is only with the boundary metacharacters: \b works on GNU but is treated as a literal backspace on BSD, while \< and \> work on both.

For cross-platform scripts, stick to -w for whole-word matching, or use \< and \> when you need an explicit boundary.

Ishan Karunaratne