A password cracker is only as good as the candidates it feeds in, and the most important source of candidates is the wordlist. The right list cracks half a leaked table in minutes; the wrong one wastes a GPU-week finding nothing. The skill is not hoarding terabytes of lists; it is knowing which list to run first, how to mutate it with rules, and how to build a target-specific list when the generic ones run dry. This is the wordlist playbook.
TL;DR
rockyou.txt (14 million real passwords from the 2009 RockYou breach) is the universal starting list and still cracks a large fraction of any modern dump, because people keep reusing the same passwords. Beyond it, get SecLists (curated, well-organised) and a breach-derived collection like weakpass. Order matters: run your highest-yield list first, mutate with best66 rules, then go bigger only if needed. When generic lists fail, generate a target-specific list with cewl (scrape the target's website), crunch (pattern-based), or hashcat --stdout (rule and mask expansion). Quality and ordering beat raw size every time.
rockyou.txt: the universal starting point
In 2009 the social-app company RockYou was breached, exposing about 32 million accounts whose passwords were stored in plaintext. The de-duplicated list of those passwords, rockyou.txt, is roughly 14 million entries and has been the default cracking wordlist ever since.
It is still effective more than fifteen years later for one depressing reason: password choices have barely changed. 123456, password, iloveyou, and a long tail of names, dates, and keyboard walks are exactly what people still pick. Run rockyou against any real-world dump of fast hashes and it clears a meaningful slice before you have configured anything fancier.
On Kali Linux it ships at /usr/share/wordlists/rockyou.txt.gz (gunzip it first). Otherwise it is one of the most mirrored files on the internet; the SecLists repository includes it under Passwords/Leaked-Databases/.
# Kali: decompress the bundled copy (the directory is root-owned, hence sudo)
sudo gunzip -k /usr/share/wordlists/rockyou.txt.gz
# Anyone: it lives in SecLists
git clone https://github.com/danielmiessler/SecLists.git
ls SecLists/Passwords/Leaked-Databases/rockyou*
The curated lists worth having
- SecLists is the one repository to clone. It is a well-organised collection of passwords, usernames, fuzzing payloads, and discovery lists, maintained and sane. Most engagements never need anything outside it.
- Weakpass aggregates breach-derived password lists at every size from a few megabytes to enormous, with stats on cracking efficiency so you can pick by yield, not guesswork.
- Breach compilations (the various "RockYou2021/2024" mega-lists circulating as multi-billion-line files) are mostly noise: deduplicated dumps padded with junk and brute-force keyspaces. They look impressive at 8 billion lines but cost far more time than they return. Reach for them last, if ever.
The lesson buried in that last point: bigger is not better. A focused 14-million-word list plus good rules out-cracks an 8-billion-line megablob in a fraction of the time, because the megablob is mostly strings nobody ever chose.
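The time side of that trade-off can be sketched with back-of-envelope arithmetic. Every figure below is an assumption chosen for illustration, not a measurement: the rule count, the megalist size, and especially the guess rate, which here stands in for a slow hash. Substitute your own benchmark numbers.

```shell
# Back-of-envelope comparison of total candidates and run time.
# All four inputs are illustrative assumptions.
words=14000000       # approximate size of rockyou.txt
rules=66             # assumed number of rules applied to each word
mega=8000000000      # assumed size of a multi-billion-line compilation
rate=20000           # assumed guesses per second against a slow hash

focused=$((words * rules))                 # candidates from list x rules
focused_hours=$((focused / rate / 3600))   # integer hours, rounded down
mega_hours=$((mega / rate / 3600))

echo "focused candidates: $focused"
echo "focused run:  ~${focused_hours} h"
echo "megalist run: ~${mega_hours} h"
```

Under these assumptions the focused run finishes in roughly half a day while the megalist takes several days. The sketch only covers time; it says nothing about yield, which depends on how many of those candidates a human would ever have chosen.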
Optimise and order your wordlists
Two lists in front of you, which runs first? The one with the higher hit rate per candidate. A few habits:
# De-duplicate and drop blank lines (smaller, no wasted guesses)
sort -u messy.txt | grep -v '^$' > clean.txt
# Keep only candidates within a realistic length band (e.g. 6 to 16)
awk 'length>=6 && length<=16' clean.txt > banded.txt
# Merge several lists in priority order, de-duplicated, keeping first occurrence
# (awk preserves order; sort -u would destroy your highest-yield-first ordering)
awk '!seen[$0]++' best.txt good.txt > final.txt
hashcat's companion hashcat-utils has purpose-built tools: rli removes entries already present in another list (so you never re-test cracked passwords), and splitlen buckets a list by length. The single highest-value habit is --loopback: feed your already-cracked passwords back as a wordlist, because if one user picked Summer2024!, another picked Winter2024!, and cracked passwords are the best predictor of the uncracked ones.
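One way to put a number on "hit rate per candidate" is to score each list against passwords you have already recovered. The following is a minimal sketch using only grep, awk, and sort; the file names and sample contents are hypothetical stand-ins, and in practice cracked.txt would hold your recovered plaintexts, one per line.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical sample data standing in for real files.
printf '%s\n' 'Summer2024!' 'password' 'iloveyou' 'dragon' > cracked.txt
printf '%s\n' 'password' 'iloveyou' 'dragon' 'monkey' > small.txt
printf '%s\n' 'password' 'aaaaaa' 'bbbbbb' 'cccccc' \
              'dddddd' 'eeeeee' 'ffffff' 'gggggg' > big.txt

# Score each list: hits = lines that exactly equal a cracked password.
# grep -F fixed strings, -x whole-line match, -c count, -f patterns from file.
for list in small.txt big.txt; do
  total=$(awk 'END { print NR }' "$list")
  hits=$(grep -Fxc -f cracked.txt "$list" || true)
  # Columns: list, hits, candidates, hits per 1000 candidates
  echo "$list $hits $total $((hits * 1000 / total))"
done | sort -k4,4nr > ranking.txt

cat ranking.txt
```

With the sample data, small.txt scores 750 hits per 1,000 candidates and big.txt scores 125, so the smaller list sorts to the top and runs first. The score is only as representative as the cracked sample it is measured against.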
Generate a target-specific list
When the generic lists run dry, the cracks left are usually about the target: the company name, product names, local sports teams, the founder's dog. You build those candidates yourself.
cewl scrapes a website and builds a wordlist from the words it actually uses, perfect for a corporate target whose passwords reference their own products:
cewl https://target.example -d 2 -m 5 -w target-words.txt
crunch generates every string matching a pattern or character set, the standalone version of a mask, useful when you know a rigid format:
# "company" followed by three digits: company000 .. company999
# crunch placeholders: @ = lowercase, , = uppercase, % = digit, ^ = symbol
crunch 10 10 -t company%%% -o company-list.txt
hashcat --stdout turns a small seed list into a big candidate list by applying rules or hybrid masks without cracking anything, which is often the cleanest generator of all:
# Expand a seed list with best66, write the candidates to a file
hashcat --stdout seeds.txt -r rules/best66.rule > generated.txt
# Seed + 2 trailing digits (admin00 .. root99)
hashcat --stdout -a 6 seeds.txt ?d?d > seed-plus-digits.txt
Tools like CUPP go further, building a personalised list from facts about a specific person (names, dates, pet names) for targeted password recovery. Used responsibly, on your own accounts or an authorised engagement, this is how the last stubborn hashes fall.
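If none of those tools is to hand, the same seed-expansion idea can be approximated with plain awk. This is a rough sketch, not a replacement for hashcat's rule engine: the seed words are hypothetical, and the only mutations applied are lowercasing, capitalising the first letter, and appending two digits (the same shape as the ?d?d mask).

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical seed words, as a site scraper might produce.
printf '%s\n' 'acme' 'rocket' > seeds.txt

# For each seed emit the lowercase and Capitalised forms, each on its own
# and followed by every two-digit suffix 00..99. The second awk drops
# duplicates while preserving order.
awk '{
  forms[1] = tolower($0)
  forms[2] = toupper(substr(forms[1], 1, 1)) substr(forms[1], 2)
  for (i = 1; i <= 2; i++) {
    print forms[i]
    for (d = 0; d < 100; d++) printf "%s%02d\n", forms[i], d
  }
}' seeds.txt | awk '!seen[$0]++' > generated.txt

awk 'END { print NR " candidates written" }' generated.txt
```

Two seeds expand to 404 candidates here (two forms per seed, each bare plus 100 suffixes). Keeping the seed list short and specific to the target matters more than adding further mutations.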
Where to go next
- Mutate your lists for far more cracks: hashcat rules.
- Where wordlists sit among the attacks: dictionary vs brute force vs mask vs hybrid.
- Run them: how to use hashcat · hashcat cheat sheet.
- The defender's takeaway: long, non-dictionary passphrases beat every wordlist. See validating password strength.
Sources
Authoritative references this article was fact-checked against.
- SecLists, the security tester's companion (GitHub): github.com
- hashcat-utils, wordlist tools (official): hashcat.net
- Weakpass, wordlist collection: weakpass.com





