The most common real-world password-recovery job is not a database dump; it is a single locked file. An encrypted ZIP from years ago, a password-protected PDF, an Office document whose password is long forgotten. John the Ripper is purpose-built for this: it cannot read the file directly, but its *2john extractors pull a crackable hash out of almost any encrypted format, and then you crack that hash like any other. This is the extract-then-crack workflow, with real output. Tested on John the Ripper 1.9.0-jumbo-1.
TL;DR
The workflow is always two steps: extract the hash with the matching *2john tool, then crack it. For a ZIP: zip2john secret.zip > hash.txt, then john --wordlist=rockyou.txt hash.txt, then john --show hash.txt. Swap zip2john for rar2john, pdf2john.pl, or office2john.py for other formats. Most of these are slow hashes (the encryption key is derived from the password with a deliberately iterated function), so a wordlist plus rules is the only sensible attack, not brute force. hashcat can also crack the extracted hash if you want GPU speed. Only do this on files you own or are authorised to recover.
The pattern: extract, then crack
John cannot open a .zip and start guessing. Instead, a small helper reads the encrypted file's headers and produces a single hash string that encodes everything John needs (the cipher, the salt, a verifier). You crack that.
secret.zip ──zip2john──▶ hash.txt ──john──▶ password
Every format has its own extractor, but the second step is identical for all of them.
Crack a ZIP password
Here is the full workflow against a ZIP encrypted with the password infected:
zip2john secret.zip > zip.hash
zip2john prints a hash line: a filename label, then the hash itself, which starts with the format tag:
secret.zip/secret.txt:$pkzip2$1*2*2*0*16*a*4c25ff83*0*44*0*16*...
Now crack it with a wordlist:
john --wordlist=rockyou.txt zip.hash
Loaded 1 password hash (PKZIP [32/64])
infected (secret.zip/secret.txt)
Session completed
And read it back cleanly:
john --show zip.hash
# secret.zip/secret.txt:infected:secret.txt:secret.zip::secret.zip
That infected is the recovered password. The same three commands, with a different extractor, handle every other format.
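If you want the recovered password in a script rather than on screen, the second colon-separated field of the john --show line is the password. A minimal sketch, assuming the output layout shown above and a password that contains no colon; first_password is a hypothetical helper name, not part of John:

```shell
# Hypothetical helper: print the password from `john --show` output on stdin.
# Assumes lines of the form label:password:... and a password without ':'.
# The trailing summary line has no colon, so it is skipped.
first_password() {
    awk -F: 'NF > 1 { print $2; exit }'
}
```

Typical use would be john --show zip.hash | first_password. A password with a colon in it would be cut short, so check the raw --show output if the result looks wrong.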
Homebrew gotcha: on macOS, Homebrew does not put the *2john tools on your PATH. They live in the John share directory (e.g. /opt/homebrew/opt/john-jumbo/share/john/zip2john). Call them by full path, or add that directory to your PATH. On Kali and most Linux distros they are on the PATH already.
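A script that has to run on both macOS and Linux can check the PATH first and fall back to the share directory. This is a sketch, not a John feature: find_extractor is a hypothetical helper, and the default directory is the Homebrew example path above, which may differ on your machine.

```shell
# Hypothetical helper: print the full path of a *2john tool.
# $1 = tool name, $2 = fallback directory (defaults to the Homebrew example path).
find_extractor() {
    tool=$1
    share_dir=${2:-/opt/homebrew/opt/john-jumbo/share/john}
    if command -v "$tool" >/dev/null 2>&1; then
        # Already on PATH (Kali and most Linux distros).
        command -v "$tool"
    elif [ -x "$share_dir/$tool" ]; then
        # Not on PATH, but present in the share directory.
        printf '%s\n' "$share_dir/$tool"
    else
        echo "$tool not found on PATH or in $share_dir" >&2
        return 1
    fi
}
```

For example, "$(find_extractor zip2john)" secret.zip > zip.hash runs whichever copy was found.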
The other formats
Same pattern, different first command:
# RAR (RAR3 and RAR5)
rar2john archive.rar > rar.hash
john --wordlist=rockyou.txt rar.hash
# PDF (password-protected)
pdf2john.pl locked.pdf > pdf.hash
john --wordlist=rockyou.txt pdf.hash
# Office (Word, Excel, PowerPoint, 2007 onward)
office2john.py report.xlsx > office.hash
john --wordlist=rockyou.txt office.hash
# 7-Zip
7z2john.pl archive.7z > 7z.hash
john --wordlist=rockyou.txt 7z.hash
# KeePass database
keepass2john db.kdbx > keepass.hash
john --wordlist=rockyou.txt keepass.hash
Add --rules to any of these to mutate the wordlist and catch decorated passwords. The full extractor list and flag reference is in the John the Ripper cheat sheet.
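Because only the first command changes, the choice of extractor can be reduced to a lookup on the file extension. A minimal sketch under that assumption; extractor_for is a hypothetical helper, it covers only the formats listed above, and it trusts the extension rather than inspecting the file:

```shell
# Hypothetical helper: map a file name to the matching *2john extractor.
# Covers only the formats listed in this article; matches on extension alone.
extractor_for() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        *.zip)  echo zip2john ;;
        *.rar)  echo rar2john ;;
        *.pdf)  echo pdf2john.pl ;;
        *.docx|*.xlsx|*.pptx) echo office2john.py ;;
        *.7z)   echo 7z2john.pl ;;
        *.kdbx) echo keepass2john ;;
        *) echo "no extractor known for $1" >&2; return 1 ;;
    esac
}
```

With that in place, "$(extractor_for report.xlsx)" report.xlsx > office.hash picks office2john.py, and the john --wordlist step stays the same for every format.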
Prefer hashcat? Use the matching mode
The hash that *2john produces also feeds hashcat, if you want GPU speed. Map the format to the mode:
| File | hashcat -m |
|---|---|
| ZIP (PKZIP) | 17200 / 17210 |
| WinZip (AES) | 13600 |
| RAR5 | 13000 |
| PDF 1.7 (Acrobat 9 / 10-11) | 10600 / 10700 |
| MS Office 2013 | 9600 |
| 7-Zip | 11600 |
| KeePass | 13400 |
For a single file, John's auto-detection and one-line workflow is usually quicker to drive; for a pile of files or a serious wordlist run, hashcat on a GPU wins.
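One practical wrinkle when handing the hash to hashcat: the *2john output wraps the hash in a filename label (and, for ZIP, some trailing fields), while hashcat typically expects just the hash itself. A sketch, assuming the hash is the second colon-separated field and contains no colons, as in the ZIP line shown earlier; hash_only is a hypothetical helper name:

```shell
# Hypothetical helper: keep only the hash field of *2john output on stdin.
# Assumes label:hash[:extra fields] and a hash with no ':' inside it.
hash_only() {
    cut -d: -f2
}
```

Something like hash_only < zip.hash > zip.hc followed by hashcat -m 17200 -a 0 zip.hc rockyou.txt would then run a wordlist attack; use whichever mode from the table matches your file.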
Why brute force is the wrong move here
These formats do not use a fast hash. Modern ZIP (AES), RAR, Office, and PDF encryption derives the encryption key from the password with a deliberately slow, iterated key-derivation function (many rounds of hashing per guess), which is the same slow-hash principle as bcrypt. So the guess rate is low, and brute force is hopeless for anything but a trivially short password.
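To put a number on it, multiply out the keyspace and divide by the guess rate. The sketch below does that with integer arithmetic; the helper names are hypothetical, and the 1,000 guesses per second used in the example is an illustrative assumption, not a benchmark of any format or machine.

```shell
# Hypothetical helpers for a back-of-envelope brute-force estimate.
# keyspace CHARSET_SIZE LENGTH -> number of candidate passwords
keyspace() {
    n=1
    i=0
    while [ "$i" -lt "$2" ]; do
        n=$((n * $1))
        i=$((i + 1))
    done
    echo "$n"
}

# days_to_exhaust CHARSET_SIZE LENGTH GUESSES_PER_SECOND -> whole days
days_to_exhaust() {
    total=$(keyspace "$1" "$2")
    echo $((total / $3 / 86400))
}
```

days_to_exhaust 26 8 1000 prints 2416: about six and a half years to try every 8-letter lowercase password at the assumed rate, before adding digits, capitals, or a ninth character.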
The right attack is a wordlist with rules: people protect files with passwords they can remember, which means real words, names, and dates, the exact thing a good wordlist catches. If you have any context about the file's owner, a custom wordlist (their company, their projects, relevant dates) is your best shot at the last stubborn ones.
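A custom wordlist can start as a handful of context words combined with plausible years. A minimal sketch, assuming one base word per line on standard input; build_candidates is a hypothetical helper, and the words and year range are whatever fits the file's owner:

```shell
# Hypothetical helper: read base words on stdin, print each word on its own
# and with every year from $1 to $2 appended.
build_candidates() {
    first=$1
    last=$2
    while IFS= read -r word; do
        [ -n "$word" ] || continue      # skip blank lines
        printf '%s\n' "$word"
        y=$first
        while [ "$y" -le "$last" ]; do
            printf '%s%s\n' "$word" "$y"
            y=$((y + 1))
        done
    done
}
```

Run it as build_candidates 2015 2024 < context.txt > custom.txt, then john --wordlist=custom.txt --rules zip.hash, so John's rules handle capitalisation and other decorations on top of the word-plus-year candidates.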
Where to go next
- The tool in full: John the Ripper and the John cheat sheet.
- The candidates that crack these: wordlists and rules.
- The GPU alternative: how to use hashcat.
- The big picture: how password cracking works.
Sources
Authoritative references this article was fact-checked against.
- John the Ripper documentation (official), openwall.com
- John the Ripper jumbo, source (official), github.com
- hashcat, example hashes and modes (official), hashcat.net





