find is good at selecting files. tar is good at packing them into one archive. The job of gluing them together has one safe answer and several unsafe ones. The safe answer is to have find emit a NUL-delimited list of paths and have tar read that list from stdin:
find . -type f -name '*.log' -print0 | tar --null -czf logs.tar.gz --files-from=-
That finds every .log file under the current directory and writes a gzip-compressed tar archive containing exactly those files. No temporary file list, no breakage when a filename has a space or a newline in it. This page is the reference I keep open whenever a deploy script or a log-rotation job needs to bundle a filtered set of files.
Set your values
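The commands below use two placeholders: :search_path is the directory to search and :archive_name is the archive file to write. Substitute your own values, and pick the Linux or macOS spelling of the list option as described in the next section.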
The safe one-liner
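find :search_path -type f -name '*.log' -print0 | tar --null -czf :archive_name --files-from=-
On macOS the same pipeline is usually written with the short option:
find :search_path -type f -name '*.log' -print0 | tar --null -czf :archive_name -T -
The Linux and macOS commands differ only in how the list option is spelled: --files-from=- is the long form and -T - is the short form. GNU tar accepts both, and the bsdtar that macOS ships documents both as well; -T - is the spelling to prefer in scripts that must run on either. Both read the file list from stdin (-), and both accept --null to say that list is NUL-delimited. Everything else is identical.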
Breaking down the flags
The pipeline has two halves. find produces the list, tar consumes it.
| Flag | Side | What it does |
|---|---|---|
| -print0 | find | Emit each matched path followed by a NUL byte instead of a newline |
| --null | tar | Tell tar the incoming list is NUL-delimited, not newline-delimited |
| --files-from=- | tar | Read the list of files to archive from this file; - means stdin |
| -T - | tar | Short form of --files-from=-; the portable spelling for GNU tar and BSD tar |
| -c | tar | Create a new archive |
| -z | tar | Compress the archive with gzip |
| -f :archive_name | tar | Write to this file (the f flag always takes the next argument as the filename) |
The -czf cluster is just -c -z -f collapsed. Order inside the cluster matters only for -f: whatever follows the cluster is taken as the archive filename, so -f must be last in the group.
Why -print0 and --null go together
A Unix filename can contain any byte except / and NUL. That includes spaces, tabs, and newlines. The naive pipeline:
find . -name '*.log' | tar -czf logs.tar.gz --files-from=-
splits the list on newlines. The moment one of your log files has a newline embedded in its name, such as weird\nname.log, tar either archives the wrong path or fails outright. A name with an interior space, such as app log.txt, survives a newline-delimited list, but GNU tar by default also trims leading and trailing whitespace from each line and treats a line that starts with a dash as an option, so other odd names can still be misread. -print0 separates entries with NUL, which is the one byte that cannot appear in a filename, so the split is always unambiguous. --null tells tar to expect that separator. This is the exact same principle as find -print0 | xargs -0: NUL in, NUL out, nothing in between can corrupt the list.
If you only ever archive files you named yourself and you are certain none contain newlines or leading or trailing whitespace, the newline version works. I still use -print0 everywhere because the cost is one flag and the failure mode is a silently wrong archive.
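To see the difference rather than take it on faith, here is a minimal sketch that builds a throwaway directory with one awkward filename and runs both pipelines against it. The directory layout and file names are made up for the demonstration, and it assumes a POSIX shell with mktemp and either GNU tar or bsdtar on the PATH.

```shell
#!/bin/sh
# Sketch: compare a newline-delimited and a NUL-delimited file list.
# All paths here are hypothetical and live in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

nl='
'
mkdir logs
printf 'one\n' > logs/plain.log
printf 'two\n' > "logs/weird${nl}name.log"   # filename with an embedded newline

# Newline-delimited list: tar receives "logs/weird" and "name.log" as two
# separate paths, neither of which exists, so it reports errors.
find logs -type f -name '*.log' | tar -czf naive.tar.gz -T - 2>/dev/null
naive_status=$?

# NUL-delimited list: every path arrives intact.
find logs -type f -name '*.log' -print0 | tar --null -czf safe.tar.gz -T -
safe_status=$?

echo "naive pipeline exit status: $naive_status"
echo "safe pipeline exit status:  $safe_status"
```

The naive pipeline still writes an archive containing plain.log, which is exactly the "silently wrong" outcome: a script that ignores the exit status would ship an incomplete archive.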
The macOS BSD tar vs GNU tar difference
macOS ships bsdtar (from libarchive) as /usr/bin/tar. Linux distributions ship GNU tar. They agree on the common short flags (-c, -z, -f, -x, -t) but diverge on the long options.
| Behavior | GNU tar (Linux) | BSD tar (macOS) |
|---|---|---|
| Read file list from a file | --files-from=FILE or -T FILE | -T FILE or --files-from FILE |
| Read file list from stdin | --files-from=- or -T - | -T - (the portable spelling) |
| NUL-delimited input list | --null | --null |
| Append to existing archive | -r / --append | -r / --append |
| Create with gzip | -czf | -czf |
| Create with xz | -cJf | -cJf |
| Create with zstd | --zstd | --zstd (newer libarchive) |
The portable choice is -T - plus --null: GNU tar accepts -T as a synonym for --files-from, so a script using -T - runs unchanged on both platforms. That is why the macOS variant above uses -T - and you can safely use it on Linux too. If you want GNU tar's behavior on macOS, install it with brew install gnu-tar and call gtar.
The -exec alternative (append mode)
You can skip the pipe entirely and have find invoke tar directly with -exec:
find . -type f -name '*.log' -exec tar -rvf logs.tar {} +
-r is append mode: tar adds each batch of files to an existing (or new) archive. The {} + form batches many paths into one tar call, so this is not one fork per file.
The catch: you cannot append to a compressed archive. tar -r needs to seek to the end of the archive, and a gzip or xz stream is not seekable. So this is a two-step process:
find . -type f -name '*.log' -exec tar -rvf logs.tar {} +
gzip logs.tar
First build the uncompressed logs.tar, then compress it to logs.tar.gz as a separate step. For most jobs the find -print0 | tar --null pipeline is simpler because it creates the compressed archive in one pass. Reach for -exec ... -r only when you genuinely need to append to an archive that already exists.
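The two-step flow can be sketched end to end as below. This is an illustration under assumptions, not a drop-in script: the logs directory and file names are invented, and the rm -f is my addition, there because -r appends to whatever logs.tar is already on disk and a stale one would leave duplicate members.

```shell
#!/bin/sh
# Sketch: build an uncompressed archive with -exec ... -r, then compress it.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir logs
printf 'a\n' > logs/a.log
printf 'b\n' > logs/b.log

# Step 1: append every match to an uncompressed archive.
# Remove any stale logs.tar first, otherwise -r adds to the old contents.
rm -f logs.tar
find logs -type f -name '*.log' -exec tar -rf logs.tar {} +

# Step 2: compress the finished archive (gzip replaces logs.tar with logs.tar.gz).
gzip logs.tar

# Asking tar to append to the gzip-compressed result is refused.
printf 'c\n' > logs/c.log
tar -rzf logs.tar.gz logs/c.log 2>/dev/null
append_status=$?

entry_count=$(tar -tzf logs.tar.gz | wc -l)
echo "members: $entry_count, append-to-gzip exit status: $append_status"
```

The last tar call uses -rzf so that tar knows the archive is gzip-compressed and rejects the request up front instead of touching the file.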
Archive files modified today or in the last N days
Because the file selection is just a find expression, any find test composes in. Add -mtime to archive by modification time:
find :search_path -type f -mtime -1 -print0 | tar --null -czf :archive_name -T -
-mtime -1 matches files modified in the last 24 hours, which is usually what people mean by "everything changed today" (it is a rolling window, not "since midnight"). For the last 7 days use -mtime -7; for minute resolution use -mmin -60 (last hour). The sign convention and the off-by-one rounding rule are covered in find files modified in the last 7 days. You can stack tests freely: find . -type f -name '*.log' -mtime -7 -print0 archives only the log files touched this week.
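A quick way to convince yourself the time filter selects what you expect is to backdate one file and archive the rest. The sketch below assumes GNU touch, whose -d option accepts relative dates such as '3 days ago'; on macOS you would need touch -t with an explicit timestamp instead. File names are hypothetical.

```shell
#!/bin/sh
# Sketch: archive only files modified within the last 24 hours.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir logs
printf 'new\n' > logs/fresh.log
printf 'old\n' > logs/stale.log

# Assumption: GNU touch. Backdate one file so it falls outside the window.
touch -d '3 days ago' logs/stale.log

find logs -type f -mtime -1 -print0 | tar --null -czf recent.tar.gz -T -

# Only the fresh file should be listed.
recent_entries=$(tar -tzf recent.tar.gz)
echo "$recent_entries"
```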
Directory structure: preserved vs flattened
tar stores whatever path string find hands it, verbatim. If find . emits ./var/log/app.log, the archive stores ./var/log/app.log, and extracting recreates var/log/app.log under your current directory. The structure is preserved because the paths are relative and include their directories.
Two things to know:
- Run find from the directory you want as the archive root. find . -type f ... gives you relative paths; find /var/log -type f ... gives you absolute paths starting /var/log/..., and GNU tar strips the leading / with a warning. Use cd /var/log && find . -type f ... so the archive is rooted cleanly.
- To flatten (strip directories), the basic create flags will not do it for a file list. If you genuinely need every file at the archive's top level, you need a copy step first or a name-rewriting option: tar --transform (GNU only; bsdtar's counterpart is -s). Flattening risks name collisions, so I avoid it unless the job specifically calls for it.
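The effect of where you run find is easiest to see side by side. This sketch archives the same file twice, once from inside the intended root and once from outside it, and lists the stored paths. The tree/ directory is a stand-in for whatever root you care about.

```shell
#!/bin/sh
# Sketch: the stored path is whatever string find printed.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p tree/var/log
printf 'x\n' > tree/var/log/app.log

# Rooted at tree/: run find from inside it. The subshell keeps the
# directory change local, and the archive is written outside the tree.
(cd tree && find . -type f -print0 | tar --null -czf ../rooted.tar.gz -T -)
rooted_entries=$(tar -tzf rooted.tar.gz)

# Run from outside: the stored path keeps the tree/ prefix.
find tree -type f -print0 | tar --null -czf prefixed.tar.gz -T -
prefixed_entries=$(tar -tzf prefixed.tar.gz)

echo "rooted:   $rooted_entries"     # ./var/log/app.log
echo "prefixed: $prefixed_entries"   # tree/var/log/app.log
```

Writing the archive to ../rooted.tar.gz rather than into the directory being searched also keeps find from ever seeing the archive it is building.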
Extract and list contents
To see what is inside without unpacking:
tar -tzf logs.tar.gz
-t lists, -z says it is gzip, -f names the file. To extract everything back:
tar -xzf logs.tar.gz
-x extracts. Add -C /target/dir to extract somewhere other than the current directory: tar -xzf logs.tar.gz -C /tmp/restore. To pull out a single file, name it after the archive: tar -xzf logs.tar.gz ./var/log/app.log (the path must match exactly what tar -t shows).
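As a round trip, the sketch below creates a small archive, lists it, and restores a single member into a separate directory. The src/ and restore/ directories and the file contents are invented for the example.

```shell
#!/bin/sh
# Sketch: list an archive, then extract one member into another directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p src/var/log restore
printf 'hello\n' > src/var/log/app.log
printf 'other\n' > src/var/log/db.log
(cd src && find . -type f -name '*.log' -print0 | tar --null -czf ../logs.tar.gz -T -)

# List the stored paths; these are the exact strings to use when extracting.
tar -tzf logs.tar.gz

# Extract a single member under restore/ using the path exactly as listed.
tar -xzf logs.tar.gz -C restore ./var/log/app.log
restored=$(cat restore/var/log/app.log)
echo "restored content: $restored"
```

Only app.log appears under restore/; db.log stays in the archive because it was not named on the command line.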
Compression choice: gzip, bzip2, xz, zstd
tar itself does not compress; it pipes the archive through a compressor selected by a flag. The four common choices:
| Compressor | tar flag | Speed | Ratio | Notes |
|---|---|---|---|---|
| gzip | -z | Fast | Moderate | Universal, the safe default |
| bzip2 | -j | Slow | Better than gzip | Largely superseded by xz and zstd |
| xz | -J | Slowest | Best ratio | Great for archives you store and rarely touch |
| zstd | --zstd | Very fast | Near xz at high levels | Best speed-to-ratio balance; needs a recent tar |
For day-to-day log bundling I use -z (gzip): it is everywhere, decompresses fast, and the ratio is fine for text. For archives I am shipping over a slow link or storing long-term, --zstd is the modern pick. -J (xz) wins on pure ratio if archive size is the only thing that matters and you do not mind the CPU cost. -j (bzip2) has no real niche left.
The file extension is just convention: .tar.gz / .tgz for gzip, .tar.bz2 for bzip2, .tar.xz for xz, .tar.zst for zstd. tar does not enforce it, but matching the extension to the compressor keeps everyone sane.
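If you want numbers for your own data rather than a general ranking, a rough comparison is a few lines of shell. This sketch generates repetitive sample text as a stand-in for a log file and archives it with each compressor that is available. The command -v checks assume GNU tar, which runs bzip2 and xz as external programs; sizes will differ on real logs.

```shell
#!/bin/sh
# Sketch: compare archive sizes across compressors on sample text.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir logs
# Repetitive sample text stands in for a real log file.
i=1
while [ "$i" -le 2000 ]; do
    echo "line $i: request served"
    i=$((i + 1))
done > logs/app.log

# Uncompressed baseline, then gzip.
find logs -type f -print0 | tar --null -cf logs.tar -T -
find logs -type f -print0 | tar --null -czf logs.tar.gz -T -

# bzip2 and xz only if the helper programs are installed.
if command -v bzip2 >/dev/null 2>&1; then
    find logs -type f -print0 | tar --null -cjf logs.tar.bz2 -T -
fi
if command -v xz >/dev/null 2>&1; then
    find logs -type f -print0 | tar --null -cJf logs.tar.xz -T -
fi

plain_size=$(wc -c < logs.tar)
gzip_size=$(wc -c < logs.tar.gz)
echo "uncompressed: $plain_size bytes, gzip: $gzip_size bytes"
ls -l logs.tar*
```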
Common mistakes
1. Newline-delimited list with awkward filenames. find . | tar -T - without -print0 and --null breaks on any filename containing a newline, and GNU tar can also misread names with leading or trailing whitespace. Always pair -print0 with --null.
2. Trying to append to a .tar.gz. tar -rf logs.tar.gz newfile fails: append mode needs a seekable archive and a gzip stream is not seekable. Build the .tar uncompressed, append to it, then compress as a final step.
3. Forgetting --null after using -print0. If find emits NUL-delimited paths but tar still expects newlines, tar sees the whole stream as one giant filename. The two flags are a matched pair.
4. Path vs basename confusion. The archive stores the exact path string find produced. find /var/log ... puts var/log/... in the archive (leading slash stripped); cd /var/log && find . ... puts ./.... Decide where the archive should be rooted and run find from there.
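5. Relying on GNU-only long options on macOS. bsdtar accepts --files-from as the long form of -T, but other GNU long options (for example --transform) do not exist there. Stick to -T - and the common short flags for portable scripts.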
6. Archiving an absolute-path tree and being surprised on extract. GNU tar strips leading / on create and warns; on extract it lands relative to the current directory. If you expected files to restore to their original absolute locations, they will not (by design, for safety).
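Several of these mistakes produce an archive that exists but is incomplete, so a cheap sanity check after the fact can be worth having. The sketch below compares the number of paths find matched with the number of members tar lists. It is an illustration, not a guarantee: it assumes nothing changes on disk between the two find runs, and that the listing shows one member per line (GNU tar escapes newlines in listed names by default; I have not checked other implementations).

```shell
#!/bin/sh
# Sketch: check that the archive holds as many members as find matched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir logs
printf '1\n' > logs/a.log
printf '2\n' > "logs/with space.log"
printf '3\n' > logs/notes.txt          # not a match, must not be archived

# Count matches by counting the NUL terminators find emits.
match_count=$(find logs -type f -name '*.log' -print0 | tr -cd '\0' | wc -c)

find logs -type f -name '*.log' -print0 | tar --null -czf logs.tar.gz -T -

# Count archive members, one per listed line.
member_count=$(tar -tzf logs.tar.gz | wc -l)

if [ "$match_count" -eq "$member_count" ]; then
    echo "ok: $member_count files archived"
else
    echo "mismatch: find matched $match_count, archive holds $member_count" >&2
fi
```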
When NOT to use find + tar
This pipeline is for "pack a filtered snapshot of files into one archive". It is the wrong tool when:
- You need incremental sync. rsync copies only what changed and can mirror a directory efficiently. For keeping two locations in step, use find and rsync for selective transfers, not a fresh tar every time.
- You need a cross-platform archive. Windows has no native tar in older versions. If a non-technical recipient or a Windows machine has to open the archive, zip is the safer interchange format. On Windows itself, PowerShell's Compress-Archive produces a .zip.
- You need a real backup. tar is an archiver, not a backup system. It has no deduplication, no encryption, no retention policy, no integrity verification across snapshots. For actual backups use a tool built for it (restic, borg, or your platform's backup service). tar is fine as one building block inside a backup script, not as the whole thing.
- The selection is trivial. If you just want to archive a whole directory, tar -czf out.tar.gz mydir/ needs no find at all. Bring in find only when the selection is a real filter.
For Windows, Compress-Archive is the closest built-in equivalent:
Get-ChildItem -Path . -Recurse -File -Filter '*.log' | Compress-Archive -DestinationPath archive.zip
See also
- find Command Cheat Sheet: the full find reference for name, type, size, time, and exec patterns
- find -exec vs xargs: when to invoke a command per match versus piping a NUL-delimited list
- find and rsync for selective transfers: the right tool when you need sync, not a one-shot archive
- find files modified in the last 7 days: the -mtime reference for archiving by modification time
- External: GNU tar manual, GNU findutils manual.





