
sort -u vs sort | uniq: a tiny Linux fork in the road

I recently fell into one of those algorithmic rabbit holes that only the internet can provide. The spark was a YouTube Short by @TechWithHazem: a rapid-fire terminal demo showing a neat little text-processing trick built entirely out of classic Linux tools. No frameworks, no dependencies, just pipes, filters, and decades of accumulated wisdom compressed into under two minutes.

That’s the modern paradox of Unix & Linux culture: tools older than many of us are being rediscovered through vertical videos and autoplay feeds. A generation raised on Shorts and Reels is bumping into sort, uniq, and friends, often for the first time, and asking very reasonable questions like: wait, why are there two ways to do this?

So let’s talk about one of those deceptively small choices.


The question

Which is better?

sort -u

or

sort | uniq

At first glance, they seem equivalent. Both give you sorted, unique lines of text. Both appear in scripts, blog posts, and Stack Overflow answers. Both are “correct”.

But Linux has opinions, and those opinions are usually encoded in flags.


The short answer

sort -u is almost always better.

The longer answer is where the interesting bits live.


What actually happens

sort -u tells sort to do two things at once:

  • sort the input
  • suppress duplicate lines

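That’s one program, one process, one set of buffers, and, when the input is too large to fit in memory, one round of temporary files. Duplicates are dropped inside sort itself, so they never have to travel through a pipe to a second program. Fewer processes, less data sloshing around, and fewer opportunities for your CPU to sigh quietly.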

By contrast, sort | uniq is a two-step relay race. sort does the sorting, then hands everything to uniq, which removes duplicates — but only if they’re adjacent. That adjacency requirement is why the sort is mandatory in the first place.
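
To see the adjacency rule in action, here is a minimal sketch using made-up fruit names as input (the variable names are just for illustration). It runs uniq on its own first, then after sort:

```shell
# Sample input with duplicates that are NOT next to each other (made-up data).
input='banana
apple
banana
apple
cherry'

# uniq alone only collapses runs of identical neighbouring lines.
# No two equal lines are adjacent here, so nothing is removed.
unsorted_result=$(printf '%s\n' "$input" | uniq)

# Sorting first brings equal lines together, so uniq can collapse them.
sorted_result=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq alone:\n%s\n\n' "$unsorted_result"
printf 'sort | uniq:\n%s\n' "$sorted_result"
```

The first result still contains all five lines; the second is down to apple, banana, and cherry.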

This pipeline works because Linux tools compose beautifully. But composition has a cost: an extra process, an extra pipe, and extra I/O.

On small inputs, you’ll never notice. On large ones, especially those with many repeated lines, sort -u usually wins on both performance and simplicity.
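
If you want to convince yourself that the two forms agree, a quick check along these lines should do. It is only a sketch: the sample data is invented, and it assumes plain whole-line sorting with no key options such as -k (with keys, sort -u keeps one line per key while uniq still compares whole lines, so the results can differ). LC_ALL=C is set so both commands compare bytes the same way regardless of your locale.

```shell
# Hypothetical sample data; any line-oriented text would do.
input='pear
apple
pear
fig
apple'

# One process: sort and deduplicate in a single step.
one_step=$(printf '%s\n' "$input" | LC_ALL=C sort -u)

# Two processes: sort, then let uniq drop the adjacent duplicates.
two_step=$(printf '%s\n' "$input" | LC_ALL=C sort | LC_ALL=C uniq)

if [ "$one_step" = "$two_step" ]; then
    echo "identical output"
else
    echo "outputs differ"
fi
```

To compare speed rather than output, you could prefix each command with time and point it at a large file of your own. The numbers will depend heavily on the data, the machine, and the sort implementation, so treat any single measurement with caution.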


Clarity matters too

There’s also a human factor.

When you see sort -u, the intent is explicit: “I want sorted, unique output.”
When you see sort | uniq, you have to remember a historical detail: uniq only removes adjacent duplicates.

That knowledge is common among Linux people, but it’s not obvious. sort -u encodes the idea directly into the command.


When uniq still earns its keep

All that said, uniq is not obsolete. It just has a narrower, sharper purpose.

Use sort | uniq when you want things that sort -u cannot do, such as:

  • counting duplicates (uniq -c)
  • showing only duplicated lines (uniq -d)
  • showing only lines that occur once (uniq -u)

In those cases, uniq isn’t redundant — it’s the point.
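
Here is a small sketch of those three flags side by side, again on invented sample data. The input is sorted once and then fed to uniq with each flag in turn:

```shell
# Made-up input: apple appears 3 times, banana twice, cherry once.
input='apple
banana
apple
cherry
apple
banana'

# uniq needs equal lines to be adjacent, so sort first.
sorted=$(printf '%s\n' "$input" | sort)

# -c prefixes each distinct line with its number of occurrences.
counts=$(printf '%s\n' "$sorted" | uniq -c)

# -d prints one copy of each line that appears more than once.
repeated=$(printf '%s\n' "$sorted" | uniq -d)

# -u prints only the lines that appear exactly once.
singles=$(printf '%s\n' "$sorted" | uniq -u)

printf 'counts:\n%s\n\n' "$counts"
printf 'repeated:\n%s\n\n' "$repeated"
printf 'singles:\n%s\n' "$singles"
```

The exact padding in front of the counts varies between uniq implementations, so if you parse that output in a script, split on whitespace rather than relying on fixed columns. A common follow-up is to pipe the counts into sort -rn to rank lines by frequency.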


A small philosophical note

This is one of those Linux moments that looks trivial but teaches a bigger lesson. Linux tools evolve. Sometimes functionality migrates inward, from pipelines into flags, because common patterns deserve first-class support.

sort -u is not “less Linuxy” than sort | uniq. It’s Linux noticing a habit and formalizing it.

The shell still lets you build LEGO castles out of pipes. It just also hands you pre-molded bricks when the shape is obvious.


The takeaway

If you just want unique, sorted lines:

sort -u

If you want insight about duplication:

sort | uniq …

Same ecosystem, different intentions.

And yes, it’s mildly delightful that a 1’30” YouTube Short can still provoke a discussion about tools designed in the 1970s. The terminal endures. The format changes. The ideas keep resurfacing — sorted, deduplicated, and ready for reuse.
