Speed Up Hashcat: Workload, Optimized Kernels, and Tuning

"hashcat is slow" almost always means one of two things: you are running it with conservative defaults, or you are running the wrong attack. The flags below recover real speed, but the single biggest lever is not a flag at all: it is the order you run your attacks in. A well-ordered run on default settings beats a badly-ordered run with every performance flag set. This is how to get the most out of hashcat, and where the hard ceiling is. Tested on hashcat 7.1.2.

TL;DR

The fast path: set the workload profile with -w 3 (or -w 4 on a headless box), add -O for optimized kernels when your passwords are short, and let hashcat autotune the rest. Confirm your real speed with hashcat -b -m <mode>. But the biggest win is attack ordering: run wordlist, then wordlist + rules, then hybrids, then masks, so you spend cycles where cracks actually are. And accept the ceiling: no flag makes a slow hash fast or a strong password weak. Optimization buys you speed within the attack; it does not change the maths.

What actually controls your speed

Three things, in order of impact:

1. The hash. A fast hash runs at billions per second; a slow one at thousands. You cannot change this, but it dictates everything else (it decides which attacks are even viable).
2. The attack you run. Running an exhaustive mask when a wordlist would crack it is the most common way to waste a GPU-week. This is the lever you control most.
3. The flags. Workload, kernels, device selection. Real, but smaller than the first two.

Most "hashcat is slow" problems are actually number 2 wearing number 3's clothing.
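To see why the hash dominates, it can help to turn a keyspace and a speed into a time. The sketch below is only illustrative: the two speeds (10 billion guesses per second for a fast hash, 10 thousand for a slow one) are assumed placeholders, not measurements, and `seconds_to_exhaust` is a hypothetical helper name. Substitute your own benchmark figures.

```shell
#!/usr/bin/env bash
# Rough worst-case time to exhaust a keyspace at a given speed.
# Both speeds below are illustrative placeholders, not benchmarks.

seconds_to_exhaust() {
  local keyspace=$1 speed=$2
  echo $(( keyspace / speed ))   # integer division; fine for a rough estimate
}

keyspace=$(( 26 ** 8 ))          # every 8-character lowercase password

fast=$(seconds_to_exhaust "$keyspace" 10000000000)  # assumed fast hash: 1e10 guesses/s
slow=$(seconds_to_exhaust "$keyspace" 10000)        # assumed slow hash: 1e4 guesses/s

echo "fast hash: ${fast} seconds"
echo "slow hash: $(( slow / 86400 )) days"
```

With these assumed speeds the same mask takes about 20 seconds against the fast hash and about 241 days against the slow one, which is why an exhaustive mask is only viable for the former.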

The workload profile (-w)

The workload profile trades desktop responsiveness for throughput. The four levels, verified from hashcat --help:

`-w`	Profile	Use when
1	Low	You are actively using the desktop and want it responsive
2	Default	General use
3	High	The machine is dedicated to cracking
4	Nightmare	A headless rig you never touch directly

On a dedicated cracking box, -w 3 (or -w 4) is free speed you are leaving on the table at the default:

```bash
hashcat -m 0 -a 0 hashes.txt rockyou.txt -w 3
```

Optimized kernels (-O) and their catch

-O switches hashcat to optimized kernels, which are meaningfully faster, sometimes dramatically so. The catch: they cap the maximum password length (often to 31 or fewer characters, mode-dependent). For most real passwords that limit never bites, so -O is close to free speed:

```bash
hashcat -m 0 -a 0 hashes.txt rockyou.txt -w 3 -O
```

When to drop -O: if you are attacking long candidates (passphrases, long combinator output) that exceed the cap, the optimized kernel skips them without testing them (they are only counted as Rejected on the status screen). For those runs, leave -O off so hashcat uses the pure kernels, which accept candidates up to 256 characters.
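One way to make that decision mechanical is to check the longest line in the wordlist before adding -O. This is a minimal sketch, not a hashcat feature: `longest_candidate` and `optimized_flag` are hypothetical helper names, and the default cap of 31 is an assumption, since the real limit depends on the hash mode. It also only looks at the raw wordlist, so rules or hybrid masks that lengthen candidates are not accounted for.

```shell
#!/usr/bin/env bash
# Decide whether -O is safe for a wordlist by checking its longest line.
# The cap of 31 is an assumption; the real limit depends on the hash mode.

longest_candidate() {
  # Print the length of the longest line in the file given as $1.
  awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }' "$1"
}

optimized_flag() {
  # Print "-O" if every candidate in $1 fits under the cap $2, else nothing.
  local wordlist=$1 cap=${2:-31}
  if [ "$(longest_candidate "$wordlist")" -le "$cap" ]; then
    echo "-O"
  fi
}
```

It could then be spliced into a run as `hashcat -m 0 -a 0 hashes.txt rockyou.txt -w 3 $(optimized_flag rockyou.txt)`, which adds -O only when nothing in the list would be rejected.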

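For very slow hashes, the companion flag is -S (--slow-candidates), which switches to slower but more advanced candidate generators. It can improve throughput on the likes of bcrypt, particularly when the wordlist is small.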

Pick the right device

By default hashcat picks its devices automatically, and it normally leaves CPU devices out when a GPU is present. To check what it can see, and to force a device type:

```bash
hashcat -I                  # list backends and devices
hashcat -m 0 ... -D 2       # GPU only (device type 2; 1 is CPU)
hashcat -m 0 ... -d 1       # use only device number 1 (e.g. one of several GPUs)
```

On a multi-GPU rig, -d lets you dedicate specific cards to a job. On a laptop, forcing -D 2 ensures you are not accidentally cracking on the CPU.

If you are speccing a rig rather than tuning one, the current NVIDIA flagships are the fastest single-GPU options for hashcat: the GeForce RTX 5090 and the still-excellent RTX 4090. More cards scale almost linearly, so two mid-range GPUs often beat one flagship for the money.

Benchmark to know your real numbers

Do not guess at your speed; measure it. The benchmark gives you the guesses-per-second figure for a mode on your exact hardware, which is what you use to estimate whether an attack is feasible:

```bash
hashcat -b -m 0       # benchmark MD5
hashcat -b -m 3200    # benchmark bcrypt (watch how much slower it is)
hashcat -b            # benchmark a broad set of modes
```

Comparing your benchmark to published numbers (the GTX 1080 Ti benchmark deep dive has a full table across modes) tells you whether your setup is performing as it should or whether a driver or thermal issue is holding it back.
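The comparison itself is simple arithmetic. As a rough sketch, assuming you have copied your measured speed and a published reference speed for the same mode and card as plain hashes-per-second integers, a hypothetical helper like `percent_of_reference` shows how far off you are. The two numbers in the example call are placeholders, not real benchmark results.

```shell
#!/usr/bin/env bash
# Express a measured speed as a percentage of a published reference speed.
# Both arguments are plain integers in hashes per second.

percent_of_reference() {
  local measured=$1 reference=$2
  echo $(( measured * 100 / reference ))
}

# Placeholder figures for illustration only.
percent_of_reference 18000000000 20000000000
```

A result well below 100 on the same card and mode is the cue to look at drivers, power limits, and temperatures before touching any hashcat flag.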

The real optimization: attack ordering

No flag matters as much as running attacks in the right order. The principle is "cheapest, highest-yield first," so you crack the easy passwords immediately and only spend expensive cycles on what is left:

1. Wordlist (rockyou.txt). Seconds to set up, catches reused passwords.
2. Wordlist + rules (-r best66.rule). The highest-yield attack.
3. Hybrid (-a 6). For word + digits patterns.
4. Targeted masks for known shapes.
5. Bigger wordlists and rule stacks, then incremental masks, only if needed.

If your hash file is in user:hash format, add --username so hashcat ignores the user column, and feed cracked passwords back with --loopback, because one cracked password predicts others. This ordering will out-crack a brute force with every performance flag set, in a fraction of the time. The full reasoning is in the attack types.
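The first four stages of that ordering can be written down as a dry-run plan that prints the commands instead of running them. This is a sketch under stated assumptions: `print_attack_plan` is a hypothetical helper, the file names are the ones used in this article, and the hybrid and mask patterns ('?d?d?d?d' and '?u?l?l?l?l?l?d?d') are illustrative guesses at common shapes, not recommendations for your hash set.

```shell
#!/usr/bin/env bash
# Print the attack plan, cheapest and highest-yield first, without running it.
# The hybrid and mask patterns and the file names are illustrative assumptions.

print_attack_plan() {
  local mode=$1 hashes=$2 wordlist=$3 rules=$4
  local opts="-m $mode -w 3 -O"
  echo "hashcat $opts -a 0 $hashes $wordlist"                       # 1. wordlist
  echo "hashcat $opts -a 0 $hashes $wordlist -r $rules --loopback"  # 2. wordlist + rules
  echo "hashcat $opts -a 6 $hashes $wordlist '?d?d?d?d'"            # 3. hybrid: word + 4 digits
  echo "hashcat $opts -a 3 $hashes '?u?l?l?l?l?l?d?d'"              # 4. targeted mask
}

print_attack_plan 0 hashes.txt rockyou.txt best66.rule
```

Printing rather than executing keeps the plan reviewable; each line can be run in turn, stopping as soon as the remaining hashes stop being worth the next, more expensive stage.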

Keep the rig stable

Speed is worthless if the run crashes or the hardware throttles. On a long job, guard the temperature so a card backs off cleanly instead of overheating or producing errors:

```bash
hashcat -m 0 ... -w 3 --hwmon-temp-abort=90
```

Good airflow and a sane abort temperature keep throughput consistent across a multi-hour run, which matters more than squeezing out the last few percent with manual kernel tuning. (The manual knobs -n, -u, and -T exist, but hashcat's autotune is good; only touch them if you know your hardware well and have benchmarked the difference.)

The ceiling you cannot tune past

Be clear-eyed about what optimization buys you. It makes a given attack run faster. It does not:

Make a slow hash fast. bcrypt at cost 12 is thousands of guesses per second no matter what flags you set.
Make a strong password weak. A long, random passphrase is out of reach at any speed.
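The second point is easy to check with arithmetic. The sketch below assumes a 16-character password drawn at random from the 95 printable ASCII characters and a deliberately generous, made-up speed of 100 billion guesses per second; `years_to_exhaust` is a hypothetical helper, and awk is used because the keyspace overflows shell integer arithmetic.

```shell
#!/usr/bin/env bash
# Years needed to exhaust a random-password keyspace at a given speed.
# Uses awk floating point because alphabet^length overflows 64-bit integers.

years_to_exhaust() {
  # $1 = alphabet size, $2 = password length, $3 = guesses per second
  awk -v a="$1" -v n="$2" -v s="$3" \
    'BEGIN { printf "%.1e\n", (a ^ n) / s / 31557600 }'   # 31557600 s per year
}

# Assumed: 95 printable ASCII characters, 16 of them, 1e11 guesses per second.
years_to_exhaust 95 16 100000000000
```

The result is on the order of ten trillion years, so doubling or even multiplying the speed by a thousand changes nothing that matters.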

When the estimated time is still years after you have tuned everything, the answer is not more tuning; it is a smarter attack (wordlist + rules) or accepting that this particular hash is not coming out. Knowing the difference is the real skill.

Where to go next

The attack ordering that matters most: the attack types and rules.
Real per-GPU numbers: hashcat GPU benchmarks.
The reference: hashcat cheat sheet.
The big picture: how password cracking works.

Why is hashcat so slow?

Usually one of three reasons: you are attacking a slow hash (bcrypt, Argon2, WPA) where thousands per second is normal; you are running the wrong attack (an exhaustive mask instead of a wordlist); or you are on conservative defaults. Set -w 3, add -O for short passwords, and fix your attack order before blaming the hardware.

What does -O do in hashcat?

-O enables optimized kernels, which are significantly faster but cap the maximum password length (often to 31 characters or fewer). For typical passwords this never matters, so it is close to free speed. Drop it only when attacking long candidates that would exceed the cap.

What is the best workload profile for hashcat?

Use -w 3 (high) on a machine dedicated to cracking, -w 4 (nightmare) on a headless rig you do not interact with, and -w 1 on a desktop you are actively using so it stays responsive. The default is -w 2.

Can optimization make hashcat crack any password?

No. Optimization speeds up a given attack but cannot change the underlying maths. A deliberately slow hash like bcrypt stays slow, and a long random password stays out of reach, at any speed. When the estimate is still years after tuning, switch to a smarter attack or accept the hash will not crack.

Ishan Karunaratne
