grep and Print a Specific Column: grep + awk Field Extraction (2026)

grep 'pattern' file | awk '{print $2}' is the one-liner: grep keeps only the lines that match your pattern, and awk pulls a single column out of each surviving line. It is the combination I reach for constantly when a log line or a command's output has the value I want buried in the third or fourth field.

The reason both tools show up is division of labour. grep is a line filter and knows nothing about columns. awk understands columns natively but writing its pattern-match syntax is slightly more typing than grep 'pattern'. Piping the two together is the readable middle ground. As you will see below, awk can actually do the filtering itself, so the pipe is a convenience, not a requirement.

Set your values

The commands below use two placeholders: :pattern stands for the text or regex you are searching for, and :search_path stands for the file to search. Substitute your own values before running them.

The one-liner

bash· Linux (GNU)

grep ':pattern' :search_path | awk '{print $2}'

grep ':pattern' :search_path emits every matching line. awk '{print $2}' then prints the second whitespace-separated field of each line grep handed it. Change $2 to whichever column you need.
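
To see the pipe end to end, here is a minimal sketch against a throwaway file. The file contents and the ERROR pattern are invented for illustration; substitute your own file and pattern.

```shell
# Hypothetical sample log; the contents are made up for this example.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
2026-01-05 ERROR disk full
2026-01-05 INFO backup started
2026-01-06 ERROR timeout
EOF

# grep keeps the two ERROR lines; awk prints field 3 of each.
errors=$(grep 'ERROR' "$sample" | awk '{print $3}')
printf '%s\n' "$errors"
```

Only the third field of the two matching lines comes out; the INFO line never reaches awk.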

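On Windows, PowerShell has no awk. Select-String does the filtering, then ForEach-Object splits each matched line on whitespace and indexes into it. PowerShell arrays are zero-based, so the second column is index 1. A sketch of the equivalent, using the same placeholders: Select-String -Path :search_path -Pattern ':pattern' | ForEach-Object { ($_.Line.Trim() -split '\s+')[1] }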

awk field basics

awk splits every input line into fields and numbers them from 1:

$1 is the first field, $2 the second, and so on.
$0 is the entire line, unsplit.
$NF is the last field. NF is a built-in variable holding the number of fields, and $NF dereferences it, so it always points at the final column no matter how many there are.
$(NF-1) is the second-to-last field, useful when lines have a variable column count but the value you want is always near the end.

By default awk splits on runs of whitespace (spaces and tabs). That default is forgiving: two spaces, a tab, or a mix all count as one separator, and leading or trailing whitespace is trimmed rather than producing empty fields.

bash

# Print the last field of every line
awk '{print $NF}' access.log

# Print the whole line (same as cat, but proves $0 works)
awk '{print $0}' access.log
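
The forgiving whitespace handling is easiest to believe when you see it on an ugly line. This sketch uses an invented line with leading spaces, a tab, and a run of spaces; the variable names are only for the example.

```shell
# A hypothetical line: three leading spaces, a tab, a run of spaces, trailing spaces.
line=$(printf '   alpha\tbeta    gamma delta  ')

first=$(printf '%s\n' "$line" | awk '{print $1}')
last=$(printf '%s\n' "$line" | awk '{print $NF}')
penultimate=$(printf '%s\n' "$line" | awk '{print $(NF-1)}')
count=$(printf '%s\n' "$line" | awk '{print NF}')

printf 'first=%s last=%s penultimate=%s count=%s\n' \
  "$first" "$last" "$penultimate" "$count"
```

awk sees four fields here, with no empty field for the leading spaces or the repeated separators.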

Custom field separator with -F

When the file is not whitespace-delimited, tell awk what the separator is with -F. The classic case is /etc/passwd, which is colon-delimited:

bash· Linux (GNU)

grep ':pattern' /etc/passwd | awk -F: '{print $1}'

-F: sets the field separator to a colon, so $1 is the username. For CSV, use -F','. For tab-delimited files, -F'\t'. The separator can be a regex: -F'[,;]' splits on either a comma or a semicolon.

bash

# Username and shell from passwd: fields 1 and 7
awk -F: '{print $1, $7}' /etc/passwd

# First column of a CSV
awk -F',' '{print $1}' data.csv
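
The regex and tab separators mentioned above can be tried without any file on disk. The records below are invented; the point is only how -F changes where the fields break.

```shell
# Invented records that mix commas and semicolons as separators.
mixed=$(printf '7,ana;admin\n9,raj;viewer\n')

# A bracket expression makes either character a separator.
names=$(printf '%s\n' "$mixed" | awk -F'[,;]' '{print $2}')
printf '%s\n' "$names"

# Tab-delimited input: a field may contain spaces and still stays in one piece.
who=$(printf 'Ana Silva\tPorto\n' | awk -F'\t' '{print $1}')
city=$(printf 'Ana Silva\tPorto\n' | awk -F'\t' '{print $2}')
printf '%s lives in %s\n' "$who" "$city"
```

With the default separator, "Ana Silva" would have been split into two fields; with -F'\t' it is one.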

Print multiple columns

List the field references separated by commas inside print:

bash· Linux (GNU)

grep ':pattern' :search_path | awk '{print $1, $3}'

The comma between $1 and $3 inserts awk's output field separator, which is a single space by default. Drop the comma ({print $1 $3}) and awk concatenates the two fields with no separator, which is almost never what you want. To join with something else, set OFS: awk -v OFS=',' '{print $1, $3}' produces comma-separated output.
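
The three variants side by side, run on one made-up line so the difference in output is easy to compare:

```shell
# Hypothetical input line with three whitespace-separated fields.
line='alice 1001 /home/alice'

with_comma=$(printf '%s\n' "$line" | awk '{print $1, $3}')
no_comma=$(printf '%s\n' "$line" | awk '{print $1 $3}')
with_ofs=$(printf '%s\n' "$line" | awk -v OFS=',' '{print $1, $3}')

# One result per line: space-joined, concatenated, comma-joined.
printf '%s\n' "$with_comma" "$no_comma" "$with_ofs"
```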

grep -o covers the simple cases

If all you want is the matched text itself, not a positional column, grep -o does the job without awk. -o prints only the part of the line that matched the pattern:

bash· Linux (GNU)

grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' :search_path

That extracts every IPv4-looking string regardless of which column it sits in. grep -o is the right tool when the value is identifiable by shape (an IP, a UUID, an email) rather than by position. When the value is "the third field", you need awk or cut, because position is something grep cannot express.
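
A small sketch of the "shape, not position" idea. The log lines are invented, and the address sits in a different column on each of them, which is exactly the case where a fixed $N would fail.

```shell
# Hypothetical log lines; the IP is field 7 on the first and field 6 on the second.
log=$(printf '%s\n' \
  'Jan 5 sshd: Failed password from 203.0.113.7 port 22' \
  'Accepted publickey for deploy from 198.51.100.24' \
  'connection closed')

# -o prints only the matched text, one match per output line.
ips=$(printf '%s\n' "$log" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')
printf '%s\n' "$ips"
```

The third line contains nothing IP-shaped, so it contributes no output at all.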

Real examples

Extract IPs from access-log lines that matched a pattern. A combined-format access log puts the client IP in field 1. Filter for the status code you care about, then print that field:

bash· Linux (GNU)

grep ' :pattern ' :search_path | awk '{print $1}' | sort -u

sort -u collapses the result to the unique set of IPs that hit a :pattern response.
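
Here is that pipeline on a throwaway log, with 404 standing in for :pattern. The log lines are invented and assume the combined format, where the status code is field 9. The second command is an alternative that is not part of the pipe above: it compares field 9 directly, which avoids matching a 404 that appears elsewhere on the line, such as in the byte count.

```shell
# Hypothetical combined-format access log.
access=$(mktemp)
trap 'rm -f "$access"' EXIT
cat > "$access" <<'EOF'
203.0.113.7 - - [05/Jan/2026:10:00:01 +0000] "GET /a HTTP/1.1" 404 153 "-" "curl/8.5"
198.51.100.24 - - [05/Jan/2026:10:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.5"
203.0.113.7 - - [05/Jan/2026:10:00:03 +0000] "GET /b HTTP/1.1" 404 153 "-" "curl/8.5"
192.0.2.55 - - [05/Jan/2026:10:00:04 +0000] "GET /404 HTTP/1.1" 200 98 "-" "curl/8.5"
EOF

# The surrounding spaces keep the "/404" URL on the last line from matching.
not_found=$(grep ' 404 ' "$access" | awk '{print $1}' | sort -u)
printf '%s\n' "$not_found"

# Stricter variant: test the status field itself (field 9 in this format).
strict=$(awk '$9 == 404 {print $1}' "$access" | sort -u)
printf '%s\n' "$strict"
```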

Pull the PID column from filtered ps output. ps aux puts the PID in the second column. To get the PIDs of every matching process:

bash· Linux (GNU)

ps aux | grep ':pattern' | grep -v grep | awk '{print $2}'

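The grep -v grep step drops the grep process itself, which would otherwise show up in its own output. (pgrep does this cleanly in one step on Linux and macOS and is worth knowing: pgrep ':pattern' matches against the process name, and pgrep -f ':pattern' matches against the full command line the way the ps | grep chain does. The ps | grep | awk chain is still the pattern people recognise.)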

Get a specific field of matching /etc/passwd lines. Field 2 of passwd is the password placeholder, field 6 is the home directory:

bash

grep 'sshd' /etc/passwd | awk -F: '{print $6}'

When awk alone is enough

awk can filter and extract. A pattern written between slashes before the action block makes awk run that block only on matching lines:

bash· Linux (GNU)

awk '/:pattern/ {print $2}' :search_path

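For a plain pattern, awk '/:pattern/ {print $2}' gives the same output as grep ':pattern' | awk '{print $2}', with one fewer process and one fewer pipe. The /.../ part is awk's own regex match against $0. Two differences to keep in mind: awk uses extended regular expressions where plain grep uses basic ones, so characters such as +, ?, and | are operators in awk without a backslash, and a literal / inside the pattern must be escaped as \/.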
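
A quick way to convince yourself is to run both forms on the same input and compare. The host list below is invented, and web is a plain pattern with no regex metacharacters.

```shell
# Hypothetical status file: host, state, connection count.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'web01 up 12' 'db01 down 0' 'web02 up 7' > "$sample"

# Two processes: grep filters, awk extracts.
piped=$(grep 'web' "$sample" | awk '{print $3}')

# One process: awk filters and extracts.
single=$(awk '/web/ {print $3}' "$sample")

printf 'piped:\n%s\nsingle:\n%s\n' "$piped" "$single"
```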

So why does the grep+awk pipe still show up everywhere? Two honest reasons. First, readability: grep 'pattern' reads as plain English, and many people add the awk step interactively after they have already typed the grep. Second, regex flavour: GNU grep -P supports Perl-compatible constructs that awk's regex engine does not. If you need a lookbehind, keep grep -P in the pipe.

But for a plain pattern, prefer the single awk. It is one process instead of two, and when the input is ps output you can fold the grep -v grep self-match workaround into the same program: ps aux | awk '!/awk/ && /pattern/ {print $2}' skips awk's own entry. The pipe is fine and I still use it; just know it is a convenience and not a necessity.

cut: the simpler alternative for fixed delimiters

When the delimiter is a single fixed character and you only need to slice a column out, cut is lighter than awk:

bash· Linux (GNU)

grep ':pattern' /etc/passwd | cut -d: -f1

-d: sets the delimiter, -f1 selects field 1. cut -d: -f1,3 selects multiple fields, cut -d: -f2- selects field 2 to the end.

The catch that sends people back to awk: cut does not collapse repeated delimiters. With whitespace-aligned output like ps or ls -l, columns are padded with a variable number of spaces, and cut -d' ' -f2 treats every single space as its own delimiter, so a run of three spaces produces two empty fields. awk's default whitespace splitting handles that correctly. Use cut only when the delimiter is exactly one character every time (:, ,, \t); use awk for anything whitespace-aligned.
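
The difference is easy to reproduce on a single padded row. The row is invented to look like ps output; the tr -s step is a common workaround that squeezes runs of spaces before cut sees them, shown here as an option and not as something the pipe above requires.

```shell
# Hypothetical padded row: "root", seven spaces, then the PID.
row='root       1234  0.0'

cut_field=$(printf '%s\n' "$row" | cut -d' ' -f2)
awk_field=$(printf '%s\n' "$row" | awk '{print $2}')
squeezed=$(printf '%s\n' "$row" | tr -s ' ' | cut -d' ' -f2)

# cut alone returns an empty field; awk and tr+cut return the PID.
printf 'cut: [%s]\nawk: [%s]\ntr+cut: [%s]\n' "$cut_field" "$awk_field" "$squeezed"
```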

macOS BSD awk vs GNU awk

macOS ships the original "one true awk" (BSD awk, also called nawk). Many Linux distributions ship gawk (GNU awk) as awk, while Debian and Ubuntu default to mawk. For the field extraction in this article the behaviour is identical: $1, $2, $NF, -F, print, and /pattern/ work the same on all of them.

The differences appear in features beyond basic columns. gawk adds gensub(), true multidimensional arrays, the --posix and --re-interval switches, network and coprocess support, and a handful of extension functions. If a script uses gensub() or gawk-specific I/O, it will fail on macOS. The fix is brew install gawk, which installs GNU awk as gawk (and as awk if you opt into linking it). For the grep | awk '{print $N}' patterns here, no install is needed; system awk on macOS handles them fine.

Common mistakes

1. Forgetting -F for non-whitespace files. Running awk '{print $1}' on /etc/passwd prints each line up to its first space, which for most entries is the whole line, because awk is still splitting on whitespace. The colon is the separator, and awk will not guess it. Add -F:.

2. Writing "$2" instead of $2. Inside an awk program, $2 is the second field. "$2" is the literal two-character string $2. print "$2" prints $2 on every line, which is a confusing result that looks almost right. No quotes around field references.

3. Assuming leading spaces create an empty $1. With the default separator, awk trims leading and trailing whitespace before splitting, so a line that starts with three spaces still has the first visible word as $1. This is the opposite of cut -d' ', which would give you empty fields. The two tools behave differently here, and it surprises people who switch between them.

4. Using grep+awk when awk alone suffices. grep 'pattern' file | awk '{print $2}' spends a whole process on something awk '/pattern/ {print $2}' does in one. Not a bug, but worth collapsing in scripts that run in a loop.

5. Dropping the comma in multi-column print. print $1 $3 concatenates the two fields with no gap. print $1, $3 inserts the output field separator. The comma is load-bearing.

6. Quoting the whole awk program in double quotes. awk "{print $2}" lets the shell expand $2 (a positional shell parameter, usually empty) before awk ever sees it, so awk runs {print }, which prints the whole line. Always single-quote the awk program so $2 reaches awk intact.
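
Mistakes 2 and 6 look similar on the page and behave very differently, so here is a sketch that runs the correct form and both wrong forms on one invented line. The set -- line only clears the script's positional parameters so that the shell's $2 is empty, as it usually is in an interactive session.

```shell
line='alpha beta gamma'

# Correct: single quotes, so awk receives $2 untouched.
right=$(printf '%s\n' "$line" | awk '{print $2}')

# Mistake 2: "$2" inside the awk program is a literal string, not a field.
literal=$(printf '%s\n' "$line" | awk '{print "$2"}')

# Mistake 6: double quotes let the shell expand $2 first, leaving {print }.
set --
wrong=$(printf '%s\n' "$line" | awk "{print $2}")

printf 'right=[%s]\nliteral=[%s]\nwrong=[%s]\n' "$right" "$literal" "$wrong"
```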

When NOT to use grep + awk

Reach for a structured-data tool instead when the input is not line-and-column text:

JSON. Use jq. JSON values span lines, nest, and contain quoted delimiters. awk has no concept of nesting, and a {print $2} against pretty-printed JSON is meaningless. jq '.items[].id' is the correct tool.
Real CSV with quoted fields. A CSV cell can legally contain a comma inside double quotes ("Smith, John"). awk -F',' splits that into two fields and corrupts the row. Use a CSV-aware tool: csvkit (csvcut), xsv, mlr (Miller), or a few lines of Python's csv module. awk -F',' is only safe on CSV you have personally verified has no quoted commas.
Fixed-width columns with no delimiter. Some legacy reports align columns by character position with no consistent separator. cut -c5-12 (character ranges) or awk with substr($0, 5, 8) is the right approach, not field splitting.
XML or HTML. Use an XML/HTML parser (xmllint --xpath, pup, htmlq). Tag-based markup is not columnar.

The rule of thumb: grep + awk is for text that is genuinely lines of whitespace-or-character-separated columns. The moment the format has structure, quoting, or nesting, switch to a parser built for it.
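
Two of those failure modes can be shown in a few lines. Both inputs are invented: a CSV row with a quoted comma, and a fixed-width header where the second column occupies characters 5 to 12.

```shell
# Hypothetical CSV row: three cells to a human, one of them containing a comma.
row='"Smith, John",42,London'

fields=$(printf '%s\n' "$row" | awk -F',' '{print NF}')
first=$(printf '%s\n' "$row" | awk -F',' '{print $1}')
# awk counts four fields and cuts the name in half.
printf 'fields=%s first=%s\n' "$fields" "$first"

# Hypothetical fixed-width line: slice by character position, not by field.
report='ID  NAME    QTY'
name_cut=$(printf '%s\n' "$report" | cut -c5-12)
name_awk=$(printf '%s\n' "$report" | awk '{print substr($0, 5, 8)}')
printf 'cut: [%s]\nawk: [%s]\n' "$name_cut" "$name_awk"
```

Both slicing commands return the same eight characters, padding included, which is what you want from a positional layout.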

FAQ

How do I grep and print a specific column?

Pipe grep into awk: grep 'pattern' file | awk '{print $2}'. grep keeps the matching lines, and awk '{print $2}' prints the second whitespace-separated field of each one. Change the field number to whichever column you need; $1 is the first, $NF is the last.

What does $NF mean in awk?

NF is a built-in variable holding the number of fields on the current line. Prefixing it with $ dereferences it, so $NF is always the last field regardless of how many columns the line has. $(NF-1) is the second-to-last. This is the reliable way to grab the final column when line length varies.

Can awk filter lines without grep?

Yes. Put a regex between slashes before the action block: awk '/pattern/ {print $2}' matches the line and prints field 2 in one step. For a plain pattern it gives the same output as grep 'pattern' | awk '{print $2}' with one fewer process. Keep the grep in the pipe only when you need a regex feature awk lacks, such as a PCRE lookbehind from GNU grep -P.

How do I print a column from a colon-delimited file like /etc/passwd?

Set the field separator with -F: awk -F: '{print $1}' /etc/passwd prints the username. For CSV use -F',', for tab-delimited use -F'\t'. Without -F, awk splits on whitespace, so a colon-delimited line with no spaces in it becomes a single field equal to the whole line.

When should I use cut instead of awk?

Use cut when the delimiter is a single fixed character every time, such as cut -d: -f1 on /etc/passwd. It is lighter and the syntax is shorter. Use awk for whitespace-aligned output like ps or ls -l, because cut treats each space as its own delimiter and a run of spaces produces empty fields, while awk collapses runs of whitespace into one separator.

Why does grep + awk print nothing or the wrong text?

The two usual causes: the awk program was double-quoted, so the shell expanded the field reference before awk saw it; always single-quote the program. Or the file is not whitespace-delimited and -F was omitted, so the field you asked for is either empty or holds most of the line. Check the delimiter first, then confirm the program is in single quotes.

Does awk behave the same on macOS and Linux?

For column extraction, yes. macOS ships BSD awk (the original "one true awk", also called nawk) and Linux distributions ship gawk or mawk, but $1, $NF, -F, print, and slash-pattern matching are identical across them. Differences only appear in gawk-specific features such as gensub() and true multidimensional arrays. Install gawk on macOS with brew install gawk if a script needs them.

Ishan Karunaratne
