File Filters | head, tail, cut, sort & uniq

I built one-line log analytics by chaining filter tools: cut to extract fields, sort and uniq to count occurrences, a second numeric sort to rank the counts, and head to keep the top entries. The classic sort | uniq -c | sort -rn idiom turned raw logs into top-N reports without a script.

Objective & Context

Filters are the workhorses of log triage. This lab assembles the standard idioms for field extraction and frequency analysis, the manual counterpart to the Python parsing and SIEM labs.

Environment & Prerequisites

Linux shell with GNU coreutils.
A sample access or auth log.
Knowledge of the log's field delimiter.

flowchart LR
    L[Log] --> Cut[cut field]
    Cut --> S[sort]
    S --> U[uniq -c]
    U --> R["sort -rn | head"]

Step-by-Step Execution

1. Top source IPs in an access log

cut -d' ' -f1 access.log | sort | uniq -c | sort -rn | head
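
To try the pipeline without a real log, a minimal sketch is shown here. It assumes a space-delimited access log in the common log format; the sample lines, the documentation-range addresses, and the variable names log and top are made up for illustration.

```shell
# Build a tiny sample log; addresses and paths are invented for the demo.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.45 - - [10/Oct/2024:13:55:36 +0000] "GET /login HTTP/1.1" 401 512
198.51.100.7 - - [10/Oct/2024:13:55:37 +0000] "GET / HTTP/1.1" 200 1024
203.0.113.45 - - [10/Oct/2024:13:55:38 +0000] "GET /login HTTP/1.1" 401 512
203.0.113.45 - - [10/Oct/2024:13:55:39 +0000] "GET /admin HTTP/1.1" 403 256
198.51.100.7 - - [10/Oct/2024:13:55:40 +0000] "GET /about HTTP/1.1" 200 2048
192.0.2.10 - - [10/Oct/2024:13:55:41 +0000] "GET / HTTP/1.1" 200 1024
EOF

# Field 1 is the client address: extract it, group identical values,
# count each group, rank by count, and keep the two busiest sources.
top=$(cut -d' ' -f1 "$log" | sort | uniq -c | sort -rn | head -n 2)
printf '%s\n' "$top"

rm -f "$log"
```

The first sort exists only to make duplicates adjacent for uniq -c; the second sort, with -rn, does the actual ranking by count.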

2. Tail-follow with filtering

tail -f /var/log/syslog | grep -i error
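
When more filters follow grep in a live pipeline, matches can appear late because grep buffers its output when it writes to a pipe. The sketch below assumes GNU grep, whose --line-buffered option flushes after every line. The helper name errors_only and the 120-character width are hypothetical choices, not part of the original lab.

```shell
# Hypothetical helper: keep lines that mention "error" (any case) and
# truncate each one to 120 characters so long entries stay readable.
# --line-buffered makes grep flush every match right away, which matters
# when grep sits in the middle of a pipeline fed by tail -f.
errors_only() {
    grep -i --line-buffered error | cut -c1-120
}

# Live usage (runs until interrupted with Ctrl-C):
#   tail -f /var/log/syslog | errors_only
# With GNU tail, -F instead of -f keeps following across log rotation.
```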

3. Count matching lines

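grep -c ' 401 ' access.log

The spaces around 401 restrict the match to a standalone field, such as the status code in a space-delimited access log. A bare 401 would also count lines where those digits appear inside a byte count, timestamp, or IP address.

Sample output:

  842 203.0.113.45
  311 198.51.100.7
   27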
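
A pattern-only count can overstate the result, because grep -c counts every line that contains the digits anywhere. The sketch below contrasts that with an exact match on the status field. It assumes the common log format, where the status code is the ninth space-separated field; the sample lines and variable names are invented for illustration.

```shell
# Sample log: two real 401 responses plus one 200 response whose
# byte count happens to be 1401.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.45 - - [10/Oct/2024:13:55:36 +0000] "GET /login HTTP/1.1" 401 512
198.51.100.7 - - [10/Oct/2024:13:55:37 +0000] "GET / HTTP/1.1" 200 1401
203.0.113.45 - - [10/Oct/2024:13:55:38 +0000] "GET /login HTTP/1.1" 401 512
192.0.2.10 - - [10/Oct/2024:13:55:41 +0000] "GET / HTTP/1.1" 200 1024
EOF

# Loose count: any line containing "401", including the 1401-byte response.
loose=$(grep -c 401 "$log")

# Exact count: isolate field 9, then match whole lines equal to 401.
exact=$(cut -d' ' -f9 "$log" | grep -cx 401)

# Full breakdown of every status code, ranked by frequency.
status_counts=$(cut -d' ' -f9 "$log" | sort | uniq -c | sort -rn)

printf 'loose=%s exact=%s\n' "$loose" "$exact"
printf '%s\n' "$status_counts"

rm -f "$log"
```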

Validation & Testing

Run the top-N pipeline against a known log and verify the counts match a manual spot-check. Pass criteria: correct field extraction, accurate frequency counts, and a ranked top-N list.
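
One way to automate the spot-check is to recount the top entry with an independent command and compare the two numbers. This is a sketch under the same assumptions as the steps above (space-delimited log, client address in field 1); the sample data and the variable names are illustrative only.

```shell
# Known log with hand-countable contents: 3, 2, and 1 requests per source.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.45 - - [10/Oct/2024:13:55:36 +0000] "GET /login HTTP/1.1" 401 512
198.51.100.7 - - [10/Oct/2024:13:55:37 +0000] "GET / HTTP/1.1" 200 1024
203.0.113.45 - - [10/Oct/2024:13:55:38 +0000] "GET /login HTTP/1.1" 401 512
203.0.113.45 - - [10/Oct/2024:13:55:39 +0000] "GET /admin HTTP/1.1" 403 256
198.51.100.7 - - [10/Oct/2024:13:55:40 +0000] "GET /about HTTP/1.1" 200 2048
192.0.2.10 - - [10/Oct/2024:13:55:41 +0000] "GET / HTTP/1.1" 200 1024
EOF

# Take the top line of the ranked report and split it into count and address.
top_line=$(cut -d' ' -f1 "$log" | sort | uniq -c | sort -rn | head -n 1)
set -- $top_line          # intentional word splitting: $1=count, $2=address
count=$1
ip=$2

# Independent recount: exact, fixed-string match on the address field.
manual=$(cut -d' ' -f1 "$log" | grep -cxF "$ip")

if [ "$count" -eq "$manual" ]; then
    result="PASS"
else
    result="FAIL"
fi
printf '%s: %s pipeline=%s manual=%s\n' "$result" "$ip" "$count" "$manual"

rm -f "$log"
```

The recount uses grep -xF so that the dots in the address are treated literally and a shorter address cannot match as a prefix of a longer one.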

Advanced: Troubleshooting

Wrong field: confirm the delimiter; some logs use tabs, not spaces.
uniq misses duplicates: uniq only collapses adjacent lines, so sort first.
Huge files slow: pre-filter with grep before sorting.
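
The three fixes above can be reproduced on a small tab-separated file. This is a sketch with invented data; the file layout (address, method, status) and the variable names are assumptions for the demo.

```shell
# Tab-separated sample: address, method, status.
tsv=$(mktemp)
printf '203.0.113.45\tGET\t401\n198.51.100.7\tGET\t200\n203.0.113.45\tGET\t401\n' > "$tsv"

# Wrong field: with -d' ' there is no space to split on, so cut prints the
# whole line. cut uses tab as its default delimiter, so dropping -d works.
with_space=$(cut -d' ' -f1 "$tsv" | head -n 1)
with_tab=$(cut -f1 "$tsv" | head -n 1)

# uniq misses duplicates: the repeated address is not adjacent until sorted.
unsorted_groups=$(cut -f1 "$tsv" | uniq | wc -l)
sorted_groups=$(cut -f1 "$tsv" | sort | uniq | wc -l)

# Huge files: filter first so sort only handles the lines of interest.
# Here the status is the last field, so the pattern is anchored with $.
failed=$(grep '401$' "$tsv" | cut -f1 | sort | uniq -c | sort -rn)

printf 'tab-split first field: %s\n' "$with_tab"
printf 'groups without sort: %s, with sort: %s\n' "$unsorted_groups" "$sorted_groups"
printf '%s\n' "$failed"

rm -f "$tsv"
```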

Key Results

Produced top-N source reports from raw logs in one line.
Counted error and status-code occurrences instantly.
Followed live logs filtered to relevant events.
Avoided scripting for tasks the filter idioms already solve.
