Python's simplicity philosophy

Curt curty at freeze.invalid
Sun Nov 23 06:39:10 EST 2003


"Michael Geary" <Mike at DeleteThis.Geary.com> writes:

> Curt...
 
> It is unfortunate that both the program name and the one-line description of
> uniq are misleading:
 
>        uniq - remove duplicate lines from a sorted file


The GNU/Linux man page is not the only document in the world that's misleading
about "uniq".

http://www.gnu.org/manual/textutils-2.0/html_chapter/textutils_7.html

7.2 uniq: Uniquify files

uniq writes the unique lines in the given `input', or standard input
if nothing is given or for an input name of `-'. Synopsis:

uniq [option]... [input [output]]

By default, uniq prints the unique lines in a sorted file, i.e.,
                                              ******
discards all but one of identical successive lines. Optionally, it can
instead show only lines that appear exactly once, or lines that appear
more than once.

The input must be sorted. If your input is not sorted, perhaps you
                  ****** 
want to use sort -u.


http://publib16.boulder.ibm.com/pseries/en_US/cmds/aixcmds5/uniq.htm

Description

The uniq command deletes repeated lines in a file. The uniq command
reads either standard input or a file specified by the InFile
parameter. The command first compares adjacent lines and then removes
the second and succeeding duplications of a line. Duplicated lines
must be adjacent. (Before issuing the uniq command, use the sort
                                                            ****
command to make all duplicate lines adjacent.)

http://www.tldp.org/LDP/abs/html/textproc.html

uniq

This filter removes duplicate lines from a sorted file. It is often
                                           ******  
seen in a pipe coupled with sort.
                            ****
cat list-1 list-2 list-3 | sort | uniq > final.list
# Concatenates the list files,
# sorts them,
# removes duplicate lines,
# and finally writes the result to an output file.
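
For what it's worth, the behaviour all three descriptions are circling
around can be sketched in a few lines of Python. This is only an
illustration, not how uniq itself is implemented: the function name
uniq() and the sample data are made up here, and sorted() is just a
rough stand-in for the sort command (whose ordering depends on locale).

```python
from itertools import groupby


def uniq(lines):
    """Collapse each run of identical adjacent lines to a single line."""
    # groupby() starts a new group whenever the value changes, so it only
    # ever merges neighbours; it never looks back at earlier lines.
    return [key for key, _run in groupby(lines)]


# Hypothetical sample input with non-adjacent repeats.
lines = ["b", "a", "a", "b", "c", "c", "a"]

adjacent_only = uniq(lines)         # unsorted input: non-adjacent repeats survive
sorted_first = uniq(sorted(lines))  # rough analogue of `sort | uniq`

print(adjacent_only)
print(sorted_first)
```

On unsorted input the repeated "b" and "a" survive because they are not
neighbours; only after sorting does every line come out exactly once.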

=======================================================================

It seems to me that this utility was developed to "uniquify" files; that
is, to remove any and all duplicate lines so that every line is
"unique".  I can only assume that this was the primary, fundamental use
case in the mind of the author when he developed the program, and that's
why he called it "uniq".  If you find this misleading, maybe you should
file a bug report. ;-)

Anyway, I'm having lunch with him next Friday so I'll ask him what he
had in mind and let you know if he remembers just exactly what that was.