find / -xdev -type f -printf '%s %p\n' 2>/dev/null | sort -rn | head -20 gives you a ranked list of the 20 biggest files on the disk, largest first. This is the command I reach for the moment a server alerts on a full root filesystem and I need to know what is actually eating the space.
It is three tools doing one job. find walks the tree and prints each file's size and path. sort -rn ranks those lines by the leading number, descending. head -20 trims the output to something I can scan in one screen. The whole thing runs in seconds on a normal server and tells me, concretely, which files to investigate first.
This article is the focused "what is using my disk space" version. If you want the threshold-filter mechanics (the +100M sign convention, unit suffixes, size ranges), the find files larger than a size article covers that. Here the goal is different: I do not have a threshold in mind, I just want the disk's worst offenders ranked.
The one-liner
find / -xdev -type f -printf '%s %p\n' 2>/dev/null | sort -rn | head -20
The output is <size_in_bytes> <path>, one file per line, biggest at the top. On GNU systems -printf '%s %p\n' prints the raw byte count and the path with almost no overhead because it never forks a subprocess. The BSD variant has to call stat once per file (more on that gap below). Bump head -20 to head -50 for a wider sweep.
A few of those flags carry real weight, so it is worth being precise about what each one buys.
Why -xdev matters
-xdev tells find to stay on a single filesystem and never cross a mount point. When you anchor a scan at /, that one flag is the difference between a clean result and a scan that runs for an hour and returns garbage.
Without -xdev, find / descends into every mounted filesystem under the root tree:
- /proc and /sys are virtual filesystems. The "files" in them are kernel interfaces, not disk data. find will report nonsense sizes (/proc/kcore famously reports a size tied to the kernel's address space, not to anything on disk, and can read as many terabytes) and waste time enumerating thousands of process entries.
- Network mounts (NFS, SMB, sshfs) get walked over the wire. A scan that should take five seconds locally now takes minutes and hammers a remote server.
- /dev is a device filesystem. Its entries are device nodes rather than stored data. -type f already filters the nodes themselves out, but anything that does look like a regular file there (for example under /dev/shm, typically a RAM-backed tmpfs) occupies memory, not your disk.
-xdev skips all of it. The scan walks only the filesystem that / itself lives on. If you have separate partitions for /var or /home, that means you run the command once per partition (find /var -xdev ..., find /home -xdev ...), which is exactly what you want when you are diagnosing which partition is full.
The 2>/dev/null is a shell redirection rather than a find flag, but it is the companion in spirit: it discards the permission-denied errors find prints when it hits a directory you cannot read. Without it, those errors interleave with your results and clutter the output.
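To decide which paths need their own run, it can help to check whether a directory sits on the same filesystem as /. The following is a minimal sketch, assuming GNU stat (the -c '%d' format is GNU; BSD stat spells it differently). The helper name same_fs and the candidate list /var /home /tmp are illustrative, not part of the original one-liner.

```shell
# Hypothetical helper: succeed if two paths live on the same filesystem.
# Assumes GNU stat; %d prints the device number of the filesystem holding the path.
same_fs() {
  [ "$(stat -c '%d' "$1")" = "$(stat -c '%d' "$2")" ]
}

# Report which of the usual candidates are separate filesystems from /.
for dir in /var /home /tmp; do
  [ -d "$dir" ] || continue
  if same_fs / "$dir"; then
    echo "$dir: same filesystem as /, covered by a / scan"
  else
    echo "$dir: separate filesystem, scan it on its own"
  fi
done
```

Any directory reported as separate is one that find / -xdev will skip, so it needs its own find <dir> -xdev run.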
Human-readable sizes
Raw byte counts are precise but hard to eyeball. 4823949312 does not register as "4.5 GB" at a glance. On GNU systems, numfmt fixes that:
find / -xdev -type f -printf '%s %p\n' 2>/dev/null | sort -rn | head -20 | numfmt --to=iec --field=1 --padding=8
The sort happens on raw bytes first, then numfmt --to=iec converts only the displayed output to 4.5G, 812M, and so on. Doing it in that order matters: if you converted to human-readable strings before sorting, sort -rn would rank 9K above 8G because 9 is numerically larger than 8.
numfmt is part of GNU coreutils and is not installed on macOS by default. The BSD variant of this pipeline uses an inline awk block to do the same conversion, which works on any system with awk (so, everything).
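One way such an awk conversion could look is sketched below. This is an illustration, not the article's exact block: the function name human_sizes is hypothetical, and it rounds to the nearest tenth, so the last digit can differ slightly from what numfmt prints.

```shell
# Hypothetical helper: turn a leading byte count into an IEC-style size using only awk.
# Reads "<bytes> <path>" lines on stdin and keeps paths containing spaces intact.
human_sizes() {
  awk '{
    size = $1 + 0
    unit = 1
    split("B K M G T P", units, " ")
    # Divide by 1024 until the value fits the current unit.
    while (size >= 1024 && unit < 6) { size /= 1024; unit++ }
    # Drop the original byte count; what remains in $0 is the path.
    sub(/^[0-9]+ /, "")
    printf "%.1f%s %s\n", size, units[unit], $0
  }'
}
```

It goes at the end of the pipeline, after sort -rn and head, so the ranking is still done on raw bytes.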
The du companion: per-directory totals
find answers "which individual file is biggest". That is the wrong question about as often as it is the right one. Frequently the disk is full not because of one huge file but because of ten thousand medium ones in a directory you forgot about: a runaway log directory, a Docker image cache, months of database backups. find would spread those across the bottom of the ranking and you would never see the pattern.
That is du's job. du sums sizes per directory subtree:
du -xh --max-depth=1 / 2>/dev/null | sort -rh | head -20
du -h prints human-readable sizes, -x is du's equivalent of find's -xdev (stay on one filesystem), and --max-depth=1 (GNU) or -d 1 (BSD) limits the output to immediate children rather than every nested directory. sort -rh does a human-readable numeric sort that understands the K, M, G suffixes, so 2.1G ranks above 900M correctly.
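The difference between the two sort modes is easy to see on a few hand-written sizes. A small sketch, assuming a sort that supports -h (GNU coreutils does); the sample values are made up for illustration:

```shell
# Three human-readable sizes, deliberately out of order.
sizes='900M
2.1G
9K'

# -n compares only the leading number, so the suffix is ignored: 900M, 9K, 2.1G.
numeric_order=$(printf '%s\n' "$sizes" | LC_ALL=C sort -rn)

# -h weighs the K/M/G suffix, giving the true order: 2.1G, 900M, 9K.
human_order=$(printf '%s\n' "$sizes" | LC_ALL=C sort -rh)

printf 'sort -rn:\n%s\n\nsort -rh:\n%s\n' "$numeric_order" "$human_order"
```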
The decision is simple:
- Use find ... -printf '%s %p' when you suspect one or a few large files (a forgotten core dump, a giant log, a stray ISO). It gives you per-file granularity.
- Use du -xh --max-depth=1 when you suspect accumulation: a directory that is large because of its contents in aggregate, not any single file. It gives you per-directory totals.
In practice I run du first to find the heavy directory, then cd into it and run du again one level deeper, drilling down until I hit either a single fat file (switch to find) or a directory full of small files I can now explain.
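The manual drill-down can be approximated in a loop. This is a rough sketch under my own assumptions, not the author's workflow verbatim: the function name drill_down is hypothetical, it always follows the single heaviest child, and it does not handle paths containing newlines.

```shell
# Hypothetical helper: from a start path, keep descending into the heaviest
# immediate subdirectory and print each step. The body runs in a subshell
# so its variables do not leak into the caller.
drill_down() (
  dir=${1:-/}
  while :; do
    # Largest immediate subdirectory by allocated size in KiB; empty if there is none.
    next=$(find "$dir" -mindepth 1 -maxdepth 1 -xdev -type d \
             -exec du -xsk {} + 2>/dev/null |
           sort -rn | head -n 1 | cut -f2-)
    [ -n "$next" ] || break
    printf '%s\n' "$next"
    dir=$next
  done
)
```

Calling drill_down /var prints one path per level, ending at the deepest heavy directory. It is a starting point for the final find ranking, not a replacement for looking at the du output at each level, since the second-heaviest child is sometimes the interesting one.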
ncdu: the interactive option
When the drill-down is more than two or three levels deep, repeating du gets tedious. ncdu ("NCurses disk usage") does the same scan once and gives you an interactive, navigable tree: arrow keys to move, Enter to descend, sizes recalculated as you go, and a delete key bound right in the interface.
# Debian/Ubuntu
sudo apt install ncdu
# macOS
brew install ncdu
# Scan a path, stay on one filesystem
ncdu -x /
ncdu -x / scans once, then hands you a browsable view sorted by size at every level. For an unfamiliar server where I do not yet know the shape of the disk, ncdu is the fastest way to build a mental model. The -x flag is the same one-filesystem guard as find's -xdev and du's -x. The trade-off versus the one-liners: ncdu is interactive, so it does not compose into a script or a cron job. For automation, stay with find and du.
The full-server disk audit recipe
When a production box alerts on disk and I need to diagnose it quickly, this is the sequence I run, top to bottom:
# 1. Which filesystem is actually full?
df -h
# 2. Which top-level directories are heaviest on that filesystem?
du -xh --max-depth=1 / 2>/dev/null | sort -rh | head -20
# 3. Drill into the heavy directory (repeat as needed)
du -xh --max-depth=1 /var 2>/dev/null | sort -rh | head -20
# 4. Once narrowed down, rank individual files in the suspect path
find /var/log -xdev -type f -printf '%s %p\n' 2>/dev/null | sort -rn | head -20
# 5. Check for deleted-but-open files holding space (the silent killer)
lsof -nP 2>/dev/null | awk '/deleted/ && $7 ~ /^[0-9]+$/ && $7 > 104857600'
Step 5 catches a case neither find nor du can see: a process that has deleted a large file but still holds the file descriptor open. The disk space is not freed until that process closes the handle or restarts, and the file no longer appears in the directory tree at all. df reports the space as used, du never counts it because there is no directory entry left to walk, and the two disagreeing is the tell. When that happens, restart the process holding the descriptor (lsof names it) and the space comes back.
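If lsof is not installed, the same information is visible through /proc on Linux. The sketch below is an assumption-laden alternative rather than the article's method: it relies on /proc being mounted and on find supporting -lname and -printf (GNU find), it needs root to see other users' processes, and the function name deleted_open_files is hypothetical.

```shell
# Hypothetical helper: list open file descriptors whose target file has been deleted.
# Each /proc/<pid>/fd/<n> entry is a symlink; the kernel appends " (deleted)"
# to the link target once the file has been unlinked.
deleted_open_files() {
  find /proc/[0-9]*/fd -lname '*(deleted)' -printf '%p -> %l\n' 2>/dev/null
}
```

Each output line starts with /proc/<pid>/fd/<n>, so the PID of the process to restart is the third path component.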
macOS BSD vs GNU find
The headline difference for this task is -printf. GNU find has it; BSD find (the default on macOS) does not. That single gap is why every example in this article ships a separate mac variant.
| Feature | GNU find (Linux) | BSD find (macOS default) |
|---|---|---|
| -printf '%s %p' (size + path, no fork) | Supported | Not supported |
| -xdev (stay on one filesystem) | Supported | Supported (also spelled -x) |
| Size ranking without -printf | n/a | -exec stat -f '%z %N' (forks per file) |
| numfmt for human sizes | In coreutils | Not installed by default |
| du --max-depth=N | Supported | Use -d N instead |
| du -x (one filesystem) | Supported | Supported |
Because BSD find lacks -printf, the macOS one-liner falls back to -exec stat -f '%z %N' {} \;, which forks a stat process for every file. On a tree with hundreds of thousands of files that is noticeably slower than the GNU version. Two ways around it: install GNU findutils (brew install findutils, then use gfind with -printf), or skip per-file ranking entirely and lean on du, which is fast and behaves the same on both platforms apart from the spelling of the depth flag (--max-depth=N on GNU, -d N on BSD; GNU du also accepts -d N).
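A script that has to run on both platforms can probe for -printf and pick the right branch. The following is a sketch under stated assumptions: the function name biggest_files is hypothetical, and the BSD branch uses {} + instead of the {} \; form quoted above so that stat is started once per batch of files rather than once per file.

```shell
# Hypothetical helper: rank the largest files under a path on GNU or BSD systems.
# Usage: biggest_files [path] [count]
biggest_files() (
  path=${1:-.}
  count=${2:-20}
  # Probe: GNU find accepts -printf, BSD find rejects it as an unknown primary.
  if find /dev/null -maxdepth 0 -printf '' 2>/dev/null; then
    find "$path" -xdev -type f -printf '%s %p\n' 2>/dev/null
  else
    # BSD stat: %z is the size in bytes, %N the file name.
    find "$path" -xdev -type f -exec stat -f '%z %N' {} + 2>/dev/null
  fi | sort -rn | head -n "$count"
)
```

For example, biggest_files /var/log 10 prints the ten largest files under /var/log in the same <size_in_bytes> <path> format as the one-liner.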
Common mistakes
1. Forgetting -xdev on a / scan. This is the big one. Without it, find walks /proc, /sys, /dev, and every network mount. You get bogus multi-terabyte "files" from kernel interfaces and a scan that takes an order of magnitude longer. Always pair a /-anchored find with -xdev.
2. Confusing logical size with physical size. find -printf '%s' reports the logical file size, the same number ls -l shows. That is not always how much disk the file consumes. The next two mistakes are specific cases of this.
3. Sparse files. A sparse file has a large logical size but holds far fewer actual blocks on disk (the unwritten regions are not allocated). VM disk images and database files are commonly sparse. find -printf '%s' shows the logical size and overstates the real disk impact. For allocated blocks, use find -printf '%b' (512-byte block count, GNU) or du, which reports actual allocation by default.
4. Compression and copy-on-write filesystems. On ZFS or btrfs with transparent compression, a file's logical size and its on-disk footprint can diverge sharply. find reports logical bytes; du reports allocated blocks. When the two disagree, trust du for "how much disk is this costing me".
5. Sorting human-readable output with sort -rn. If you run numfmt or du -h before sorting, sort -rn reads 9K and 8G as 9 and 8 and ranks the kilobyte file higher. Either sort raw bytes then format (the find | sort -rn | head | numfmt order), or use sort -rh which understands the suffixes.
6. Running the scan without 2>/dev/null. Permission-denied errors interleave with results and bury the actual data. Redirect stderr. (Conversely, if you suspect a permissions issue is hiding files, drop the redirect once to see the errors.)
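The logical-versus-physical gap from mistakes 2 and 3 is easy to reproduce with a scratch sparse file. A small demonstration, assuming GNU coreutils (truncate) and GNU find (-printf); the file name sparse.img is arbitrary:

```shell
# Create a 100 MiB sparse file: large logical size, almost no allocated blocks.
scratch=$(mktemp -d)
truncate -s 100M "$scratch/sparse.img"

# %s is the logical size in bytes, %b the allocated size in 512-byte blocks.
logical=$(find "$scratch/sparse.img" -printf '%s')
blocks=$(find "$scratch/sparse.img" -printf '%b')
allocated=$((blocks * 512))

echo "logical:   $logical bytes"
echo "allocated: $allocated bytes"

rm -rf "$scratch"
```

A per-file ranking by %s would place this file near the top even though it costs the disk next to nothing, which is why the allocated figure is the one to trust when freeing space.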
When NOT to use this
find -printf '%s %p' | sort -rn is the per-file tool. Reach for something else when:
- You want per-directory totals, not per-file. Use du -xh --max-depth=1. A disk filled by accumulation (thousands of small files in one tree) is invisible to a per-file ranking. du is the right lens.
- You are exploring an unfamiliar disk interactively. Use ncdu. Repeatedly re-running du to drill down is slower and more error-prone than a single ncdu scan you can navigate.
- You need physical disk usage, not logical size. Use du (allocated blocks) or find -printf '%b'. On filesystems with sparse files or compression, logical size from -printf '%s' is misleading.
- The space is genuinely gone but nothing shows up. df says full, du adds up to far less: that is a deleted-but-open file. Use lsof | grep deleted, not find. find cannot see a file that no longer has a directory entry.
- You want a fixed-size threshold filter. "Everything over 1 GB" is find -size +1G, covered in find files larger than a size. This article is for ranking when you do not have a threshold in mind.
See also
- find Command Cheat Sheet: the full find reference covering name, type, size, time, and exec patterns
- Find files larger than a size with find -size: the threshold-filter companion (+100M, unit suffixes, size ranges)
- Find files modified in the last 7 days: the -mtime companion for time-based searches
- External: GNU findutils manual, BSD find(1) man page, ncdu





