TechEarl
Topic · DevOps

Automate, integrate, accelerate your success

59 articles · Written by Ishan Karunaratne
More in DevOps
Why find scripts break between macOS and Linux: -printf and -regextype are GNU only, regex flavors and stat format strings differ. The portable find subset, the gotchas, and brew install findutils for gfind.

BSD find vs GNU find: Every macOS vs Linux Difference That Matters

macOS ships BSD find; Linux ships GNU find. The two share a name and most of an interface, but -printf, -regextype, and the stat format strings diverge hard enough to break scripts shipped between platforms. The full divergence list, the portable subset that works on both, and how to get GNU find on a Mac.

The grep -o | sort | uniq -c | sort -rn pipeline counts unique matches and ranks them. Why sort comes before uniq, worked log-analysis examples, sort -u, uniq -d, and the awk one-pass alternative.

How to Count Unique Matches with grep, sort, and uniq

The grep -o 'pattern' file | sort | uniq -c | sort -rn pipeline is the classic log-analysis one-liner. Why sort must come before uniq, how each stage works, worked examples for top IPs and status codes, the awk one-pass alternative for huge files, and the BSD vs GNU notes.
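A minimal sketch of why sort has to come before uniq, assuming a made-up access log (the IPs and paths are hypothetical, not from the article). uniq only collapses adjacent duplicates, so the unsorted run leaves every line in its own group.

```shell
# Hypothetical sample log; the IPs and paths are made up for illustration.
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 GET /' \
  '10.0.0.2 GET /about' \
  '10.0.0.1 GET /login' \
  '10.0.0.3 GET /' \
  '10.0.0.1 GET /logout' \
  '10.0.0.2 GET /' > "$log"

# Without sort, uniq only merges adjacent duplicates, so the counts fragment.
unsorted=$(grep -oE '^[0-9.]+' "$log" | uniq -c | wc -l)

# With sort first, identical matches sit next to each other and are counted together.
ranked=$(grep -oE '^[0-9.]+' "$log" | sort | uniq -c | sort -rn)

echo "groups without sort: $unsorted"
echo "$ranked"
rm -f "$log"
```

With this sample the unsorted run reports six groups for three distinct IPs, while the sorted pipeline ranks 10.0.0.1 first with three hits.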

find walks the live filesystem every time; locate and plocate query a prebuilt updatedb database. Compare freshness, speed, permission-awareness, and filtering, plus the mlocate vs plocate split and the macOS mdfind alternative.

find vs locate vs mlocate: Which File Search Tool to Use

find walks the live filesystem every time it runs: always current, sometimes slow. locate queries a prebuilt database: instant, but stale until the next updatedb. This breaks down the locate family (mlocate, plocate, slocate), the macOS situation, and exactly when to reach for each one.

Use grep 'pattern' file | awk '{print $2}' to filter lines and print a specific column. awk field basics, custom separators with -F, multi-column output, grep -o and cut alternatives, and when awk alone replaces the pipe.

How to grep and Print a Specific Column (grep + awk)

grep filters lines, awk extracts fields. The classic pipe is grep 'pattern' file | awk '{print $2}'. This covers awk field basics ($1, $NF), custom separators with -F, multi-column output, the cases grep -o and cut cover on their own, and the fact that awk's own pattern match makes the grep half optional.

rsync files modified in the last N days by piping find -print0 into rsync --files-from=- --from0. The relative-path gotcha, the dry run, BSD vs GNU notes, and when rsync filters replace find.

How to rsync Only the Files find Selected

rsync has no native time filter, so the standard trick is to let find pick the files and feed the list to rsync. The one-liner is find ... -print0 | rsync --files-from=- --from0, and the failure mode is always the same: the paths in the list have to be relative to the rsync source argument. The breakdown, the dry run habit, and when rsync's own filters make find unnecessary.

Archive every file matching a find pattern with tar. The safe find -print0 | tar --null --files-from=- one-liner, the macOS BSD tar -T difference, archiving by modification time, and gzip vs bzip2 vs xz vs zstd.

How to Archive Files Matching a find Pattern with tar

find locates the files, tar archives them. The safe pairing is find -print0 piped into tar reading a NUL-delimited list from stdin: no breakage on spaces or newlines. The flag breakdown, the macOS BSD tar vs GNU tar difference, the -exec append alternative, archiving by modification time, and the compression choices.

grep is universal and searches everything you point it at; ripgrep (rg) is the fast Rust default that skips .gitignore'd and binary files; ag is the older fast-grep now superseded. Compare speed, defaults, and regex engines.

grep vs ripgrep vs ag: Which Search Tool to Use

grep is on every system and searches exactly what you point it at. ripgrep (rg) is the fast Rust-based default for code search: it skips .gitignore'd, hidden, and binary files unless told otherwise. ag (the_silver_searcher) was the older fast-grep, now largely superseded by ripgrep. This breaks down speed, defaults, regex engines, and exactly when to reach for each one.

Use xargs -P to run find results in parallel: find ... -print0 | xargs -0 -P 4 -n 1 cmd. Set -P to the core count, why -n 1 matters, CPU-bound vs IO-bound work, and xargs -P vs GNU parallel.

How to Run find in Parallel with xargs -P

find . -type f -name '*.log' -print0 | xargs -0 -P 4 -n 1 gzip compresses every matched file four at a time. The flags that make it work: -P for parallel workers, -n 1 so each worker gets one job, -0 paired with find's -print0 for safety. When parallelism helps (CPU-bound work) and when it just thrashes the disk.

find -name uses shell globs on the basename; find -regex matches a full regular expression against the whole path. The -regextype flavors, the GNU emacs vs BSD basic default drift, and when each one is the right tool.

find -regex vs -name: When to Use Regex in find

find -name takes a shell glob and matches the basename; find -regex takes a full regular expression and matches the whole path. That whole-path detail is the number one surprise: -regex '.*\.txt' works but -regex '.txt' matches nothing. The flag reference, -regextype flavors, the GNU vs BSD default-flavor drift, and when -name is the better tool.
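A small sketch of the whole-path behavior, assuming a throwaway directory with hypothetical file names. The same two patterns from the paragraph above are run against it.

```shell
# Throwaway directory with hypothetical file names.
dir=$(mktemp -d)
touch "$dir/notes.txt" "$dir/todo.txt" "$dir/image.png"

# The regex must match the entire path (for example ./notes.txt), hence the leading .*
full_path=$(cd "$dir" && find . -regex '.*\.txt' | wc -l)

# '.txt' would only match a path that is exactly four characters long, so nothing is found.
bare=$(cd "$dir" && find . -regex '.txt' | wc -l)

echo "with leading .*: $full_path, bare pattern: $bare"
rm -rf "$dir"
```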

Skip node_modules, .git, and build output from grep with --exclude-dir, exclude filename globs with --exclude, and search only matching files with --include. The one-liner, brace expansion, --exclude-from, and the BSD grep fallback.

How to Exclude Files and Directories from grep

grep does not read .gitignore, so skipping node_modules, .git, and build output is on you. The flags that do it: --exclude for filename globs, --exclude-dir for whole directories, --include for the inverse, --exclude-from to read the list from a file, plus the find -prune fallback for older macOS grep.

Exclude a directory in find with -path './node_modules' -prune -o ... -print. Why the trailing -print is mandatory, the multi-directory form, the slower -not -path alternative, and BSD vs GNU notes.

How to Exclude a Directory in find (the -prune Pattern Explained)

find -path './node_modules' -prune -o -type f -print skips a directory subtree instead of walking into it. The pattern looks strange because -prune is an action, not a test, and the trailing -print is mandatory once you write an explicit action. The breakdown, the multi-directory form, the slower -not -path alternative, and when each one is the right call.
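A minimal sketch of the pattern on a scratch tree; the directory layout and file names are assumptions for illustration. It also shows what happens when the trailing -print is dropped: the implicit print then applies to the whole expression, so the pruned directory itself shows up in the output.

```shell
# Scratch tree with a hypothetical layout.
tree=$(mktemp -d)
mkdir -p "$tree/node_modules/pkg" "$tree/src"
touch "$tree/node_modules/pkg/index.js" "$tree/src/app.js"

# With the explicit -print, only files outside node_modules are listed.
kept=$(cd "$tree" && find . -path './node_modules' -prune -o -type f -print)

# Without it, the pruned directory name is printed alongside the files.
no_print=$(cd "$tree" && find . -path './node_modules' -prune -o -type f | sort)

echo "with -print:"
echo "$kept"
echo "without -print:"
echo "$no_print"
rm -rf "$tree"
```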

Use grep -l 'pattern' files to list only the filenames that contain a match. The inverted grep -L, the recursive grep -rl one-liner, the NUL-safe xargs pipeline for find-and-replace, and the macOS BSD vs GNU notes.

How to List Only Filenames with grep -l

grep -l prints the name of each file that contains a match and stops reading at the first hit, which makes it the fast answer to 'which files contain this string'. The lowercase -l, the inverted -L for files missing a pattern, the grep -rl one-liner, the NUL-safe xargs pipeline for find-and-replace, and the BSD vs GNU notes.

Rank the biggest files on a full disk with find -printf '%s %p' piped to sort -rn. The GNU one-liner, the BSD stat variant for macOS, why -xdev matters, human-readable sizes, and when du or ncdu beats find.

How to Find the Largest Files on Disk (find, sort, du)

find / -xdev -type f -printf '%s %p\n' | sort -rn | head -20 gives you a ranked list of the biggest files on a full disk. The GNU one-liner, the BSD/macOS stat variant, why -xdev matters, human-readable output with numfmt, when to switch to du or ncdu for per-directory totals, and the mistakes that send a scan into /proc.

grep -E vs grep -P explained: basic regex (BRE) treats + ? | ( ) { } as literal text, extended regex (ERE) makes them metacharacters, and PCRE adds lookaround and \d. Plus why macOS BSD grep has no -P.

grep Regex: BRE vs ERE vs PCRE Explained

grep has three regex engines and the default one surprises everyone: in basic regex (BRE) the characters + ? | ( ) { } are literal text until you backslash-escape them. -E switches to extended regex (ERE) where they work bare, and -P unlocks Perl-compatible regex with lookaround and \d. The full BRE vs ERE vs PCRE comparison, the same pattern in all three, and why -P does not exist on macOS.

Use find -user, -group, and -perm to locate files by ownership and mode. The -perm -mode vs -perm /mode distinction explained, world-writable and SUID/SGID audit recipes, orphaned-file checks, and the BSD vs GNU find differences on macOS.

How to Find Files by Owner, Group, or Permission with find

find -user www-data lists every file owned by a user; -group developers filters by group; -perm matches the mode bits. The subtle part is -perm -mode (all of these bits set) versus -perm /mode (any of these bits set). Plus the security-audit recipes for world-writable files and SUID/SGID binaries, the BSD vs GNU divergences, and the orphaned-file checks.

Use grep -w to match a whole word instead of a substring. What grep counts as a word boundary, the \b and \< \> regex equivalents, -x for whole-line match, and BSD vs GNU differences.

How to Match a Whole Word with grep -w

grep cat also matches category, concatenate, and scatter. grep -w cat matches only the standalone word. The whole-word flag, what grep counts as a word boundary, the regex equivalents with \b and \< \>, the stricter -x whole-line cousin, and the BSD vs GNU differences that bite on macOS.
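A quick sketch of the difference, using the words from the paragraph above plus one hypothetical sentence. It counts matching lines with and without -w.

```shell
# Word list taken from the text, plus one made-up sentence.
words=$(mktemp)
printf '%s\n' 'cat' 'category' 'concatenate' 'scatter' 'the cat sat' > "$words"

# Plain grep matches the substring anywhere in the line.
substring_hits=$(grep -c 'cat' "$words")

# -w only matches when cat stands alone as a word.
whole_word_hits=$(grep -cw 'cat' "$words")

echo "substring: $substring_hits, whole word: $whole_word_hits"
rm -f "$words"
```

Here the substring search hits all five lines, and -w keeps only "cat" and "the cat sat".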

Use find ... -print0 | xargs -0 grep -l 'PATTERN' to find every file containing a string. When grep -r is enough, when to add find as a pre-filter for performance, multi-pattern matching with -E, and the safe NUL-delimited pipeline.

How to Find Files Containing Specific Text (find + grep)

find ... -print0 | xargs -0 grep -l 'PATTERN' finds every file containing a piece of text. The combo handles weird filenames, scales to huge trees, and replaces three other common but broken pipelines. When to use grep -r alone, when to add find as a pre-filter, and the BSD vs GNU pitfalls.

Use find -type d -empty to list empty directories and find -type f -empty for empty files. The -depth trap for deleting nested empty trees, the hidden-file gotcha, the safe two-pass cleanup, and BSD vs GNU find notes.

How to Find (and Delete) Empty Directories and Files

find . -type d -empty lists every empty directory; find . -type f -empty lists every empty file. The catch is what 'empty' means (a hidden file makes a directory not empty) and the -depth trap that lets find -delete collapse whole nested empty trees in one pass. The flag reference, the safe two-pass cleanup, the BSD vs GNU notes, and the mistakes that bite.

Connect to a GCP Compute Engine VM with plain OpenSSH and no gcloud CLI. Add a public key via instance metadata, ssh to the external IP, configure ~/.ssh/config, plus OS Login and IAP.

How to SSH into a Google Cloud VM Without gcloud

Connect to a GCP VM using plain OpenSSH, no gcloud required. Add a public key to instance metadata, fetch the external IP, and ssh in like any normal Linux box. Plus OS Login, IAP, and a Windows PuTTY path.

Search multiple patterns with grep: grep -e 'A' -e 'B', grep -E 'A|B' alternation, and grep -f patterns.txt. Covers -F fixed strings, AND logic with chained greps and PCRE lookahead, and BSD vs GNU differences on macOS.

How to Search Multiple Patterns with grep

grep can OR several patterns three ways: -e per pattern, -E with alternation, or -f reading the list from a file. The one-liner is grep -E 'ERROR|WARN|FATAL' file. Here is when to pick each, how -F speeds up literal multi-pattern search, why grep has no single-pass AND, and the BSD vs GNU differences that bite on macOS.

Use grep -C 3 'pattern' file to print 3 lines before and after each match. The -A, -B, -C context flags, the -- group separator, asymmetric context, recursive search, and BSD vs GNU grep differences.

How to Show Lines Before and After a grep Match (Context)

grep -C 3 'pattern' file prints the matching line plus 3 lines on each side. The three context flags (-A after, -B before, -C both), how the -- group separator works between match blocks, asymmetric context, recursive context search, and the macOS BSD vs GNU differences that bite.

find -exec ... {} + batches arguments into one command (fast). find ... -exec ... {} \; forks per file (slow). xargs adds shell flexibility but needs -0 for safety. The decision matrix and performance comparison.

find -exec vs xargs: Which to Use (and the {} + Trick That Beats Both)

find -exec ... {} + and find -print0 | xargs -0 are roughly equivalent for batch operations on matched files. find -exec ... {} \; forks once per match and is much slower. The decision matrix: when -exec is enough, when xargs adds value, and the safety rules for filenames with spaces, newlines, and quotes.
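A rough sketch of the fork-count difference, assuming three hypothetical files in a scratch directory. Each command invocation prints one line, so the line count equals the number of times the command was started.

```shell
# Scratch directory with three hypothetical files.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# \; starts the command once per matched file.
per_file=$(find "$dir" -type f -name '*.txt' -exec sh -c 'echo run' sh {} \; | wc -l)

# + packs the matched paths onto a single command line.
batched=$(find "$dir" -type f -name '*.txt' -exec sh -c 'echo run' sh {} + | wc -l)

echo "invocations with \\;: $per_file, with +: $batched"
rm -rf "$dir"
```

Three files mean three invocations with \; and one with +. On a very large match list, + may still split into several batches to stay under the argument-length limit.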

grep -c counts matching lines, not occurrences. Use grep -o piped into wc -l for the true count, grep -rc for per-file counts, grep -vc to count non-matching lines, plus the macOS BSD vs GNU differences.

How to Count Matches with grep -c (and the Line-vs-Occurrence Trap)

grep -c counts matching LINES, not occurrences. A line with three hits still counts as 1. The fix is grep -o piped into wc -l, which puts every match on its own line first. Per-file counts, filtering out the :0 noise, counting non-matching lines, and the BSD vs GNU differences.
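A short sketch of the trap on a three-line sample (the file contents are made up for illustration). The first line holds three hits but still counts once under -c.

```shell
# Made-up sample: one line with three hits, one with none, one with a single hit.
sample=$(mktemp)
printf '%s\n' \
  'error error error' \
  'all good' \
  'one error here' > "$sample"

# -c counts lines that contain at least one match.
line_count=$(grep -c 'error' "$sample")

# -o prints each match on its own line, so wc -l counts occurrences.
hit_count=$(grep -o 'error' "$sample" | wc -l)

echo "matching lines: $line_count, occurrences: $hit_count"
rm -f "$sample"
```

The sample yields two matching lines but four occurrences.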

Use find -type f -name '*.txt' for one extension, or group -name tests in escaped parens joined by -o for many (.jpg, .png, .gif). Case-insensitive -iname, files with no extension, the -regex shortcut, and BSD vs GNU find differences.

How to Find Files by Extension (One or Many) with find

find . -type f -name '*.txt' lists every file with one extension. For many extensions you group -name tests with escaped parens and join them with -o. This covers the single one-liner, the multi-extension OR pattern, why the parens are mandatory, case-insensitive -iname, files with no extension at all, the -regex shortcut, and the BSD vs GNU divergence that bites on macOS.

find -delete removes every matched file with no confirmation. The safe -print-first dry-run pattern, depth-first directory deletion, when to use -exec rm vs xargs rm -f, and the BSD vs GNU differences.

How to Find and Delete Files Safely with find -delete

find -delete removes every matched file with no confirmation and no undo. The safe pattern is to write the command with -print first, eyeball the list, then swap -print for -delete. Plus the directory-depth-first trap, when to use -exec rm instead, and the find -delete vs xargs rm -f tradeoff.

How to grep case-insensitively with grep -i. Combine -i with -r, -w, -v, -c, the locale caveat for non-ASCII case folding, the PCRE (?i) inline flag, and BSD vs GNU grep differences.

How to grep Case-Insensitively (grep -i)

grep -i 'pattern' file matches regardless of case. The flag pairs with -r, -w, -v, and -c the way you would expect, but -i only folds ASCII case reliably. Non-ASCII case folding (accented characters, the Turkish dotted-i) depends on your locale. The combinations, the locale caveat, the PCRE per-pattern (?i) flag, and the BSD vs GNU differences.

Use find -size +100M to list files larger than 100 megabytes. Unit suffixes (c/k/M/G), +/- sign convention, combine with sort -rn to surface the biggest files on disk, and BSD vs GNU rendering differences.

How to Find Files Larger Than a Size with find -size

find . -size +100M lists every file larger than 100 megabytes. The unit suffixes (c, k, M, G), the +/- sign convention, how to combine with sort to find the biggest files on disk, the BSD vs GNU divergence for printing sizes, and the wc -c trick for byte-exact thresholds.

Use grep -v 'pattern' file to print every line that does not match. Exclude multiple patterns with -e or -vE, strip comments and blank lines, count with -vc, and avoid the OR-becomes-AND double-negative trap.

How to Exclude Matches with grep -v (Invert Match)

grep -v 'pattern' file prints every line that does NOT match. The flag reference, how to exclude multiple patterns, the strip-comments-and-blank-lines pipeline, the double-negative trap where -v of an OR becomes an AND of negations, and the macOS BSD vs GNU differences.

Use find -mtime -7 to list files modified in the last 7 days. The off-by-one (-7 means under 7 days, +7 means over 7 days), -mmin for minute resolution, -newer for exact timestamps, and the BSD rounding gotcha on macOS.

How to Find Files Modified in the Last 7 Days (find -mtime)

find -mtime -7 lists every file modified in the last 7 days. The catch is the off-by-one: -7 means less than 7 days ago, +7 means more than 7 days ago, and exact-7 almost never matches what people expect. The flag reference, worked variations for hours and minutes, the BSD vs GNU rounding difference, and the safe cleanup patterns.

Use grep -r 'pattern' . to search every file in a directory tree. The -r vs -R symlink difference, --include and --exclude-dir filters, -rl and -rn, and the macOS BSD vs GNU grep gaps.

How to grep Recursively Through a Directory

grep -r 'pattern' . searches every file under a directory tree. The catch is the path argument people forget, the -r vs -R symlink difference, and the unfiltered crawl into node_modules and .git. The flag reference, the --include and --exclude-dir filters, the macOS BSD vs GNU gaps, and when to reach for ripgrep or git grep instead.

Bash for loop reference: brace-range {1..10}, sequence (seq), array, glob, C-style, nested, parallel with xargs. Plus safe file iteration with find -print0, globbing pitfalls, and macOS Bash 3.2 vs Linux Bash 4+ differences.

Bash For Loops: Syntax, Examples, and One-Liners

Every form of the Bash for loop with working examples: brace-range, sequence-expression, array, glob, C-style, nested, and parallel. Plus the safe file-iteration patterns, common pitfalls, and macOS Bash 3.2 vs Linux Bash 4+ gotchas.

AWS IAM policy examples by use case: S3 read-only with prefix, S3 read-write with delete denied, EC2 admin scoped to a region via aws:RequestedRegion, Lambda execute and read env vars but not write, iam:PassRole for service-linked roles, MFA-required via aws:MultiFactorAuthPresent, IP-restricted via aws:SourceIp, VPC-endpoint-only via aws:SourceVpce, tag-based prod-vs-dev isolation via aws:ResourceTag, plus the anatomy of a policy document and IAM Access Analyzer for least-privilege validation.

AWS IAM Policy Examples: S3, EC2, Lambda, and Least-Privilege Patterns

A working library of AWS IAM policy examples: S3 read-only with prefix, EC2 admin scoped to a region, Lambda execute-but-not-write, MFA-required, IP-restricted, VPC-endpoint-only, tag-based prod-vs-dev isolation, and the iam:PassRole pattern. Plus the anatomy of a policy document and how Access Analyzer narrows over-permissive Resource: "*" grants.

Bash arrays reference: declaration, indexing, [@] vs [*] quoting, iteration, appending, slicing, mapfile/readarray for lines, IFS-based string splitting, plus macOS Bash 3.2 limits.

Bash Arrays: Indexed, Associative, and Iteration Patterns

Bash array reference: indexed and associative declaration, the [@] vs [*] quoting gotcha, iterating values and indexes, appending, slicing, deleting, mapfile/readarray for reading lines, and the macOS Bash 3.2 vs Linux Bash 4+ differences.

AWS S3 CLI cheat sheet: aws s3 cp local-to-S3, S3-to-local, S3-to-S3 cross-region; aws s3 sync incremental with --delete; --exclude and --include patterns; --storage-class STANDARD_IA / INTELLIGENT_TIERING / GLACIER; --sse AES256 and --sse aws:kms; --acl bucket-owner-full-control; --dryrun for safety; concurrency tuning with max_concurrent_requests and multipart_chunksize; the trailing-slash gotcha that ruins half of all aws s3 cp invocations.

AWS S3 cp and sync Cheat Sheet: Copy, Move, and Sync Files with the CLI

A scannable AWS S3 CLI reference: aws s3 cp, sync, mv, rm, ls; recursive uploads and downloads; --exclude / --include filters; storage classes (STANDARD_IA, GLACIER, INTELLIGENT_TIERING); SSE encryption (AES256, aws:kms); --dryrun safety; the trailing-slash gotcha; concurrency tuning via max_concurrent_requests and multipart_chunksize; cross-account profiles.

Change an EC2 instance type without data loss: stop the instance, run aws ec2 modify-instance-attribute, start it again. Covers Nitro vs Xen compatibility, ENA and NVMe driver requirements, the instance-store ephemeral-data trap, and the zero-downtime ASG rolling-replace pattern.

How to Change an AWS EC2 Instance Type (Resize Without Data Loss)

Stop the instance, modify the instance type, start it. The exact AWS CLI syntax, the compatibility matrix for moving between families and generations, the Nitro vs Xen gotcha, the instance-store data-loss trap, and the production sequence (snapshot AMI, scale out, replace) that gets you to a new instance type with zero downtime.

Grow an AWS EBS volume with zero downtime: aws ec2 modify-volume to enlarge, wait for the optimizing state, then sudo growpart to extend the partition and sudo resize2fs (ext4) or sudo xfs_growfs (XFS) to stretch the filesystem. No detach, no reboot, on a live EC2 instance.

How to Extend an AWS EBS Volume Without a Restart

Grow an EBS volume on a running EC2 instance in four steps. Modify the volume, wait for the optimizing state, expand the partition with growpart, then stretch the filesystem with resize2fs or xfs_growfs. No detach, no reboot.

Create an EBS volume with aws ec2 create-volume, attach it to a running EC2 instance, format with mkfs.ext4 or mkfs.xfs, mount it, and persist across reboots with a UUID-based /etc/fstab entry. Console, AWS CLI, and Terraform walkthroughs.

How to Add an EBS Volume to an EC2 Instance

Create an EBS volume, attach it to a running EC2 instance, format and mount it, and survive reboots with a UUID-based fstab entry. Console, AWS CLI, and Terraform walkthroughs plus the Nitro device-naming gotcha that trips everyone.

Connect to an AWS EC2 instance using plain SSH with a key pair, EC2 Instance Connect, AWS Systems Manager Session Manager, or an EC2 Instance Connect Endpoint for private instances. Default usernames, security group rules, and troubleshooting Permission denied and Connection timed out.

How to SSH into an AWS EC2 Instance

Connect to an EC2 instance four ways: plain SSH with a key pair, EC2 Instance Connect, Session Manager, and EC2 Instance Connect Endpoint. Default usernames, security group rules, and the troubleshooting matrix that fixes Permission denied and Connection timed out.

Using regex in Nginx with location blocks and the rewrite directive: location modifier priority, the rewrite directive flags, return-based redirects, and copy-paste config for HTTPS redirects, www normalization, trailing slashes, 301 redirects, clean URLs, and blocking by user-agent or IP.

How to Use Regex in Nginx (location and rewrite)

Use regex in Nginx with location blocks and the rewrite directive: how location modifiers and matching priority work, why return beats rewrite for redirects, and copy-paste config for HTTPS, www, trailing slashes, 301s, clean URLs, and access blocking.

Using regex in Apache .htaccess with mod_rewrite: RewriteRule and RewriteCond pattern syntax, rewrite flags, and copy-paste rules for HTTPS redirects, www normalization, trailing slashes, 301 redirects, clean URLs, and blocking by user-agent or IP.

How to Use Regex in .htaccess (Apache mod_rewrite)

Use regex in .htaccess with Apache mod_rewrite: how RewriteRule and RewriteCond patterns work, the per-directory quirk that breaks everyone, and copy-paste rules for HTTPS, www, trailing slashes, 301s, clean URLs, and access blocking.

Stand up a v3 .onion hidden service with Tor. HiddenServiceDir, HiddenServicePort, key backup, permissions, onion-location, single-onion mode, and operational gotchas.

Host a v3 .onion Hidden Service with Tor

End-to-end setup for a v3 .onion hidden service: torrc HiddenServiceDir and HiddenServicePort, key backup, permissions, onion-location header, single-onion mode, and the operational mistakes that get addresses leaked or lost.

Route curl, Python requests, and Node fetch through Tor's SOCKS5 proxy. Avoid DNS leaks with socks5-hostname and socks5h://. Working examples and verification.

Use Tor as a SOCKS5 Proxy with curl, Python, and Node

Route a single command, script, or HTTP client through Tor's SOCKS5 proxy (curl with --socks5-hostname, Python requests with socks5h://, Node with socks-proxy-agent) and avoid the DNS leak that catches everyone the first time.