Cool Shell One-Liner of the Day
awk -F, '{print $1}' CSV | sort | uniq -c | grep -vw 1 | tee /dev/tty | wc -l
UPDATE: I came back to this post and thought to myself, “Self, why didn’t you annotate this garbage, you cheeky bastard?” OK, so the first part is pretty clear: pull the first (or whichever) column you want out of a simple (unquoted) CSV file, then count the dupes with sort | uniq -c. The grep is where we drop the non-dupes, and it should really be grep -Ev '^ *1 ' so it anchors on uniq -c’s leading count instead of accidentally matching a 1 somewhere in the CSV data. Now here’s the magic: the pipe to tee /dev/tty echoes everything straight to the terminal, while a second copy of the output keeps flowing through the rest of the pipeline. So the wc -l is actually counting the number of entries which have duplicates (not the total number of duplicate rows!), and that count shows up at the bottom.
Here’s the tail end of what I get from this on a sample csv:
6 WOOD
4 WRIGHT
3 YOUNG
2 ZIMMERMAN
360