TechEarl

How to Find Files Containing Specific Text (find + grep)

find ... -print0 | xargs -0 grep -l 'PATTERN' finds every file containing a piece of text. The combo handles weird filenames, scales to huge trees, and replaces three other common but broken pipelines. This guide covers when to use grep -r alone, when to add find as a pre-filter, and the BSD vs GNU pitfalls.

Ishan Karunaratne · 11 min read

find . -type f -name '*.log' -print0 | xargs -0 grep -l 'ERROR' lists every .log file under the current directory that contains the word ERROR. The -l flag tells grep to print just the filename of each match (not the matching line). The -print0 | xargs -0 pair makes the pipeline safe for filenames with spaces and newlines.

For most "find files containing X" tasks, plain grep -r 'PATTERN' . is the shorter answer. The find-plus-grep combo earns its place when you need to narrow by filename, file type, or age before grep starts reading content, which matters a lot on big trees with mixed file types.

Placeholders used in the examples

In the commands below, :search_path is the directory to search, :pattern is the filename glob (for example *.log), and :text is the text to search for. Substitute your own values before running them.

The one-liner (find + grep)

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -l ':text'

Returns just the filenames of files containing the text. Drop -l to also see the matching lines (then add -n to get line numbers).
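To see the one-liner cope with awkward filenames, here is a minimal sketch that builds a throwaway directory and runs the pipeline against it. The directory layout and file names are hypothetical, chosen only to include spaces.

```shell
#!/usr/bin/env bash
# Throwaway sandbox; every name below is made up for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/app logs"
printf 'ok\nERROR disk full\n' > "$demo/app logs/web server.log"
printf 'all good\n'            > "$demo/app logs/worker.log"
printf 'ERROR in notes\n'      > "$demo/notes.txt"   # wrong extension, filtered out by -name

# find emits NUL-terminated paths; xargs -0 splits on NUL only,
# so the spaces in the directory and file names stay intact.
matches=$(find "$demo" -type f -name '*.log' -print0 | xargs -0 grep -l 'ERROR')
printf '%s\n' "$matches"

rm -rf "$demo"
```

Only the .log file that contains ERROR is listed; notes.txt is never opened because find filtered it out first.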

When grep -r is enough

If you're searching for a string across all files under a directory with no filename filtering, grep -r is shorter and equally fast:

bash· Linux (GNU)
grep -rl ':text' :search_path

-r makes grep recursive; -l makes it list filenames only. This skips the find pipeline entirely and is the right tool when the question is just "what files contain this string".

You'd switch to the find-plus-grep combo when you need to narrow by name, type, size, or age before grep starts reading contents. Pre-filtering with find is a real speedup on big trees with lots of binary files.

When to add find as a pre-filter

The combo wins on three concrete cases.

Case 1: Filename filter (only search .log files under a tree with thousands of files of other types):

bash· Linux (GNU)
# find narrows to .log files first; grep only opens those
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -l ':text'

grep -r --include='*.log' does the same thing in one tool. The find form is still useful when your grep lacks the --include flag (older BSD grep versions do not have it).
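As a quick check that the two approaches select the same files, the sketch below runs both against a small sandbox and compares the sorted results. It assumes a grep that supports --include; the file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sandbox with two .log files and one .txt file (illustrative names).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'ERROR one\n'   > "$demo/a.log"
printf 'ERROR two\n'   > "$demo/sub/b.log"
printf 'ERROR three\n' > "$demo/sub/c.txt"

# Same question asked two ways; sort so the comparison ignores traversal order.
with_include=$(grep -rl --include='*.log' 'ERROR' "$demo" | sort)
with_find=$(find "$demo" -type f -name '*.log' -print0 | xargs -0 grep -l 'ERROR' | sort)

if [ "$with_include" = "$with_find" ]; then
  echo "same file set"
fi

rm -rf "$demo"
```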

Case 2: Time filter (only search files modified in the last 7 days):

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -mtime -7 -print0 | xargs -0 grep -l ':text'

There's no grep -r equivalent for this; the find pre-filter is necessary.
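A small sketch of the time filter in action, assuming GNU touch (the -d '10 days ago' syntax is GNU-specific) and hypothetical file names:

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
printf 'ERROR fresh\n' > "$demo/fresh.log"
printf 'ERROR stale\n' > "$demo/stale.log"

# Backdate one file so it falls outside the 7-day window (GNU touch syntax).
touch -d '10 days ago' "$demo/stale.log"

# -mtime -7 keeps files modified less than 7 days ago; grep only reads those.
recent=$(find "$demo" -type f -name '*.log' -mtime -7 -print0 | xargs -0 grep -l 'ERROR')
printf '%s\n' "$recent"

rm -rf "$demo"
```

Both files contain ERROR, but only fresh.log is reported because stale.log never reaches grep.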

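Case 3: Size filter (skip files of 1 MiB or more to avoid grep'ing big binaries):

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -size -1048576c -print0 | xargs -0 grep -l ':text'

The byte suffix (c) is deliberate. GNU find rounds sizes up to whole units before comparing, so -size -1M matches only empty files, while -size -1048576c matches everything under 1 MiB. For codebases with mixed text and binary files, a size cap can be the difference between a search that takes seconds and one that takes minutes.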
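One detail worth checking on your own system is how find rounds sizes. GNU find rounds a file's size up to whole units before comparing, so a unit suffix and a byte count can behave differently. The sketch below, which assumes GNU find and uses hypothetical file names, counts what each form selects and then runs a byte-exact cap.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
: > "$demo/empty.log"                                   # 0 bytes
printf 'ERROR small\n' > "$demo/small.log"              # a few bytes
{ printf 'ERROR big\n'; head -c 2097152 /dev/zero; } > "$demo/big.log"   # just over 2 MiB

# With GNU find, -size -1M counts only empty.log: any non-empty file rounds up to 1 MiB.
rounded=$(find "$demo" -type f -name '*.log' -size -1M | wc -l)

# A byte count is compared exactly: everything under 1048576 bytes qualifies.
exact=$(find "$demo" -type f -name '*.log' -size -1048576c | wc -l)
echo "with -1M: $rounded file(s); with -1048576c: $exact file(s)"

# The byte-exact cap keeps small.log searchable and skips big.log entirely.
capped=$(find "$demo" -type f -name '*.log' -size -1048576c -print0 | xargs -0 grep -l 'ERROR')
printf '%s\n' "$capped"

rm -rf "$demo"
```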

Drop -l to see what grep matched, and add -n for line numbers:

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -n ':text'

Output is <file>:<line>:<text>. For viewing context around each match, add -C N (N lines before AND after) or -A N / -B N (after / before only):

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -n -C 2 ':text'

-C 2 shows 2 lines on either side of the match, separated by -- between groups.
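To make the output format concrete, here is a sketch with one hypothetical log file and one line of context. It adds -H, which forces the filename prefix even when xargs happens to hand grep a single file (without it, grep omits the filename in that case).

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
printf 'start\nconnect\nERROR timeout\nretry\ndone\n' > "$demo/app.log"

# -H: always print the filename; -n: line numbers; -C 1: one line of context each side.
out=$(find "$demo" -type f -name '*.log' -print0 | xargs -0 grep -H -n -C 1 'ERROR')
printf '%s\n' "$out"

rm -rf "$demo"
```

The matching line is printed as file:3:ERROR timeout, while the context lines use a dash separator instead of a colon (file-2-connect and file-4-retry), which makes matches easy to tell apart from context.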

Multi-pattern search (any of N strings)

grep -E enables extended regex, which supports | for OR:

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -El 'ERROR|FATAL|CRITICAL'

Alternative form using -e (one pattern per flag, no alternation syntax needed):

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -l -e 'ERROR' -e 'FATAL' -e 'CRITICAL'

The -e form avoids having to escape | inside a pattern. Each -e pattern is still a regular expression, so when patterns might contain regex metacharacters and you want literal matching, add -F (fixed-string mode).
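The difference between regex and fixed-string matching is easiest to see side by side. In this sketch (hypothetical files and patterns), the same -e pattern is run with and without -F:

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
printf 'host 192.168.1.1 up\n' > "$demo/exact.log"
printf 'host 192x168y1z1 up\n' > "$demo/lookalike.log"
printf 'status [FATAL] stop\n' > "$demo/fatal.log"

# Without -F each dot matches any character, so the lookalike file matches too.
regex_hits=$(find "$demo" -type f -name '*.log' -print0 \
  | xargs -0 grep -l -e '192.168.1.1' | sort)

# With -F every pattern is a literal string; brackets and dots lose their special meaning.
literal_hits=$(find "$demo" -type f -name '*.log' -print0 \
  | xargs -0 grep -lF -e '192.168.1.1' -e '[FATAL]' | sort)

printf 'regex:\n%s\nliteral:\n%s\n' "$regex_hits" "$literal_hits"

rm -rf "$demo"
```

The regex run lists exact.log and lookalike.log; the -F run lists exact.log and fatal.log.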

Find files containing ALL of N patterns (AND logic)

grep itself only does OR. For AND, chain greps:

bash· Linux (GNU)
# Files matching ALL three patterns
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -l --null 'ERROR' | xargs -0 grep -l --null 'database' | xargs -0 grep -l 'timeout'

Each grep filters the file list produced by the previous one, leaving only files that match every pattern. --null makes grep end each filename with a NUL byte so the next xargs -0 can split the list safely. This is inefficient for many patterns (surviving files are re-read by each grep), but conceptually simple. For complex AND logic over a large tree, switch to ripgrep or a custom script.
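Another way to express AND, sketched here as an alternative rather than a replacement, is to let find do the chaining: each -exec grep -q acts as a test, and -print only fires when all of them succeed. This forks one grep per file per pattern, so it trades speed for filename safety and short-circuiting. The file names and contents are hypothetical.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
printf 'ERROR x\ndatabase y\ntimeout z\n' > "$demo/all three.log"
printf 'ERROR x\ndatabase y\n'            > "$demo/two.log"
printf 'timeout z\n'                      > "$demo/one.log"

# grep -q exits 0 on the first match and prints nothing; find evaluates the
# -exec tests left to right and stops at the first one that fails.
all_match=$(find "$demo" -type f -name '*.log' \
  -exec grep -q 'ERROR' {} \; \
  -exec grep -q 'database' {} \; \
  -exec grep -q 'timeout' {} \; \
  -print)
printf '%s\n' "$all_match"

rm -rf "$demo"
```

The patterns may appear on different lines; the test is per file, not per line.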

Find files NOT containing a pattern

grep -L (capital L) inverts: list files where the pattern does not appear:

bash· Linux (GNU)
find :search_path -type f -name ':pattern' -print0 | xargs -0 grep -L ':text'

Useful for audit checks like "list every Python file that doesn't have a __future__ import" or "list every config file without an Authorization header".
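The __future__ audit mentioned above could look like the following sketch, using two hypothetical Python files:

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
printf 'from __future__ import annotations\nprint(1)\n' > "$demo/modern.py"
printf 'print(2)\n'                                     > "$demo/legacy.py"

# -L lists files with zero matching lines, i.e. the ones missing the import.
missing=$(find "$demo" -type f -name '*.py' -print0 | xargs -0 grep -L '__future__')
printf '%s\n' "$missing"

rm -rf "$demo"
```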

Use ripgrep if you have it

ripgrep (rg) is a modern replacement that's faster than grep -r and respects .gitignore by default:

bash· Linux (GNU)
# Install: apt install ripgrep / brew install ripgrep
rg -l ':text' :search_path
# With filename glob
rg -l --glob ':pattern' ':text' :search_path

For interactive code search, ripgrep is usually noticeably faster than grep -r on large trees because it searches files in parallel and skips hidden, binary, and ignored files by default. For shell scripts that need to work on minimal containers without extra binaries, fall back to find + grep.
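For scripts that may or may not have ripgrep available, one option is a small wrapper that prefers rg and falls back to find + grep. This is a sketch: search_files is a hypothetical helper name, and the two branches are not strictly equivalent, because rg also skips hidden and ignored files and uses its own regex syntax.

```shell
#!/usr/bin/env bash
# search_files TEXT DIR GLOB: list files under DIR matching GLOB that contain TEXT.
# Hypothetical helper for illustration only.
search_files() {
  if command -v rg >/dev/null 2>&1; then
    rg -l --glob "$3" -- "$1" "$2"
  else
    find "$2" -type f -name "$3" -print0 | xargs -0 grep -l -- "$1"
  fi
}

demo=$(mktemp -d)
printf 'ERROR here\n' > "$demo/a.log"
printf 'fine\n'       > "$demo/b.log"

# sort, because rg searches in parallel and does not guarantee output order.
hits=$(search_files 'ERROR' "$demo" '*.log' | sort)
printf '%s\n' "$hits"

rm -rf "$demo"
```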

macOS BSD vs GNU grep (the gotchas)

Feature | GNU grep | BSD grep (macOS default)
-r recursive | Supported | Supported
-l list filenames | Supported | Supported
-L list non-matching | Supported | Supported
-n line numbers | Supported | Supported
-i case-insensitive | Supported | Supported
-E extended regex | Supported | Supported
-P Perl regex | Supported | NOT supported (BSD grep lacks PCRE)
-Z NUL after each filename | Supported (same as --null) | Use --null instead (-Z is documented as --decompress in BSD grep)
--include=GLOB | Supported | NOT supported in older BSD; macOS 12+ added it
--exclude-dir=DIR | Supported | Supported in macOS 11+
Color output (--color=auto) | Supported | Supported

The big one is -P (PCRE). If your grep pattern uses lookarounds, named groups, or anything else beyond POSIX extended regex, BSD grep will reject it. Either install GNU grep via Homebrew (brew install grep, which installs it as ggrep) or switch to ripgrep (which has its own regex engine, with optional PCRE2 support via rg -P).
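A script that has to run on both GNU and BSD systems can probe for -P support before relying on it. The sketch below is one way to do that, with a hypothetical data file and an extended-regex fallback that produces the same result in two steps.

```shell
#!/usr/bin/env bash
# Probe: a grep without PCRE support exits non-zero on -P, so this test fails cleanly.
if printf 'probe\n' | grep -qP 'pro(?=be)' 2>/dev/null; then
  has_pcre=yes
else
  has_pcre=no
fi

demo=$(mktemp -d)
printf 'price: 42 USD\nprice: 7 EUR\n' > "$demo/prices.txt"

if [ "$has_pcre" = yes ]; then
  # Lookahead: match the digits only when " USD" follows.
  usd=$(grep -oP '[0-9]+(?= USD)' "$demo/prices.txt")
else
  # ERE fallback: match "digits USD", then keep just the leading digits.
  usd=$(grep -oE '[0-9]+ USD' "$demo/prices.txt" | grep -oE '^[0-9]+')
fi
echo "PCRE available: $has_pcre, extracted: $usd"

rm -rf "$demo"
```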

Common find + grep mistakes

1. Unsafe pipe (find | xargs grep). Without -print0 and -0, xargs splits its input on whitespace and interprets quotes, so the pipeline breaks on filenames containing spaces, quotes, or newlines. Always use the NUL-delimited form: find ... -print0 | xargs -0 grep.

2. Using -exec grep ... {} \; instead of -exec grep ... {} + or xargs. Each -exec ... \; forks a new grep per file. On big trees this can be orders of magnitude slower than batching. Use -exec grep ... {} + or pipe through xargs.

3. Forgetting -type f. Without it, find also passes directories to grep (which fails noisily with "Is a directory") and symlinks (which grep follows, so it may read targets outside the tree or report the same content twice). Always use -type f for content searches.

4. Searching binary files unintentionally. grep treats a NUL byte in a file as a sign that the file is binary and prints "Binary file FOO matches" instead of the matching line. For text-only searches, pre-filter with find ... -name '*.txt', use grep -I to skip binary files, or use grep --binary-files=text to force text mode.

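5. Searching huge files when a size cap would do. find ... -size -10M -print0 | xargs -0 grep limits the search to smaller files. GNU find rounds sizes up to whole MiB before comparing, so -10M actually matches files up to 9 MiB; use a byte count such as -size -10485760c for an exact 10 MiB cap. Useful when the tree contains video, image, or build-output blobs that aren't relevant to text searches.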

6. Not using -l when you only need filenames. Without -l, grep prints every matching line, which can be a lot of output. grep -l stops at the first match per file and only prints the filename.

7. Regex metacharacters in the pattern without -F. A pattern like 192.168.1.1 is regex; the dots match any character. For literal-string matching, use grep -F (fixed-string mode) or escape the dots: 192\.168\.1\.1.

8. Forgetting to exclude .git or node_modules. Recursive searches in code trees are dominated by these directories. Add find ... -not -path '*/.git/*' -not -path '*/node_modules/*' (or use -prune so find does not descend into them at all), or use ripgrep (which respects .gitignore by default).
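Several of these fixes combine naturally. The sketch below, using a hypothetical project layout, prunes .git and node_modules so find never descends into them, restricts to regular files, and batches grep with -exec ... {} + instead of forking once per file.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/node_modules/lib" "$demo/.git"
printf '// TODO fix\n'    > "$demo/src/app.js"
printf '// TODO vendor\n' > "$demo/node_modules/lib/index.js"
printf 'TODO\n'           > "$demo/.git/hook.js"

# Left of -o: directories named .git or node_modules are pruned (not entered).
# Right of -o: remaining regular .js files are handed to grep in batches.
todo=$(find "$demo" \( -name .git -o -name node_modules \) -prune \
  -o -type f -name '*.js' -exec grep -l 'TODO' {} +)
printf '%s\n' "$todo"

rm -rf "$demo"
```

Unlike -not -path, which still walks the excluded directories and merely discards the results, -prune skips the traversal itself.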

When to skip find and grep entirely

Reach for a different tool when:

  • Searching a code repository. git grep 'PATTERN' searches only tracked files and is faster than both grep -r and find+grep because git already has the file list cached.
  • You want fast interactive search across a project. ripgrep, ag (silver searcher), and ack are all faster than grep on large trees and have better defaults (skip .git, respect .gitignore).
  • You're indexing for repeated queries. Tools like mlocate (filename) or Recoll / Elasticsearch (full-text) build a persistent index. One-shot find+grep makes sense for ad-hoc queries; for queries you'll run many times, index once.
  • You need to extract structured data, not just match lines. awk, jq (for JSON), xmlstarlet (for XML), or pup (for HTML) parse the file format and let you query by field. grep is line-based; the other tools are structure-aware.

Ishan Karunaratne

Tech Architect · Software Engineer · AI/DevOps

Tech architect and software engineer with 20+ years across software, Linux systems, DevOps, and infrastructure — and a more recent focus on AI. Currently Chief Technology Officer at a tech startup in the healthcare space.
