The count from uniq is preceded by spaces unless there are more than 7 digits in the count, so you need to do something like:
uniq -c | sort -nr | cut -c 9-
to get columns (character positions) 9 upwards. Or you can use sed:
uniq -c | sort -nr | sed 's/^.\{8\}//'
or:
uniq -c | sort -nr | sed 's/^ *[0-9]* //'
This second option is robust in the face of a repeat count of 10,000,000 or more; if you think that might be a problem, it is probably better than the cut alternative. And there are undoubtedly other options available too.
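To see where the fixed-width approach breaks down, the sketch below fakes uniq -c output with printf instead of generating ten million duplicate lines. The '%7d ' padding is an assumption that mirrors the GNU layout described above, and strip_cut and strip_sed are hypothetical helper names used only for illustration.

```shell
# Hypothetical helpers wrapping the two approaches from the text.
strip_cut() { cut -c 9-; }                 # drop a fixed 8 characters
strip_sed() { sed 's/^ *[0-9]* //'; }      # drop blanks, digits, one space

# Simulated "uniq -c" lines (assumed GNU-style width-7 padding).
printf '%7d %s\n' 3 'foo'        | strip_cut   # prints "foo"
printf '%7d %s\n' 10000000 'foo' | strip_cut   # prints " foo" (stray space)
printf '%7d %s\n' 10000000 'foo' | strip_sed   # prints "foo"
```

With an 8-digit count the cut version leaves a stray leading space, and with 9 digits it would start leaking a digit of the count; the sed version is unaffected by the width.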
Caveat: the counts were determined by experimentation on Mac OS X 10.7.3 but using GNU uniq from coreutils 8.3. The BSD uniq -c produced 3 leading spaces before a single digit count. The POSIX spec says the output from uniq -c shall be formatted as if with:
printf("%d %s", repeat_count, line);
which would not have any leading blanks. Given this possible variance in output formats, the sed script with the [0-9] regex is the most reliable way of dealing with the variability in observed and theoretical output from uniq -c:
uniq -c | sort -nr | sed 's/^ *[0-9]* //'
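As a rough end-to-end illustration, the pipeline can be wrapped in a small function and tried on sample data. The name top_lines is hypothetical, and the leading sort is an addition made here because uniq only collapses adjacent duplicates; the printf lines merely simulate the BSD-like and POSIX count layouts mentioned above rather than invoking those implementations.

```shell
# Hypothetical helper: list distinct input lines, most frequent first,
# with the uniq -c counts stripped off.
top_lines() {
    sort | uniq -c | sort -nr | sed 's/^ *[0-9]* //'
}

printf 'b\na\nb\nc\nb\na\n' | top_lines
# prints b, a, c on separate lines (counts 3, 2, 1)

# Same sed expression against simulated count layouts. Only the count
# is removed, even when the line itself starts with digits.
printf '%4d %s\n' 3 '42 apples' | sed 's/^ *[0-9]* //'   # 3 leading spaces
printf '%d %s\n'  3 '42 apples' | sed 's/^ *[0-9]* //'   # no leading blanks
# both print "42 apples"
```

Because the substitution is anchored at the start of the line and applied once, digits that belong to the line's own content are left alone.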