[melbourne-pug] Question about adding members to list : hypothetical just for interest kind of question

Ben Finney ben+python at benfinney.id.au
Thu Feb 20 23:56:22 CET 2014


David Crisp <david.crisp at gmail.com> writes:

> name  | value
> ==========
> ItemOne : 10
> ItemOne : 10
> ItemOne : 10
> ItemOne : 10
> ItemTwo : 20
> ItemTwo : 20
> ItemTwo : 20
> ItemTwo : 20
> ItemThree : 30
> ItemThree : 30
> ItemThree : 30
> ItemThree : 30

If you're confident there will frequently be duplicated lines, and you
want to ignore the duplicates, I'd recommend (on Unix) filtering the
list to remove them. The ‘sort’ step matters, because ‘uniq’ only
collapses duplicate lines that are adjacent::

    $ cat items | sort | uniq > items_dedup

Then you can read the ‘items_dedup’ file in your Python program.
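
As a minimal sketch of that reading step, assuming each line looks like
‘ItemOne : 10’ and every value is an integer (as in the sample above),
something like the following could work. The names ‘parse_items’ and
‘load_items’ are hypothetical, chosen only for illustration.

```python
def parse_items(lines):
    """Build a name -> value dict from lines shaped like 'ItemOne : 10'."""
    items = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            # No colon on this line (blank line, header, ruler): skip it.
            continue
        # Assumption: values are integers, as in the sample data.
        items[name.strip()] = int(value.strip())
    return items


def load_items(path):
    """Read the de-duplicated file and return its contents as a dict."""
    with open(path) as infile:
        return parse_items(infile)
```

Calling ‘load_items("items_dedup")’ would then return a dict keyed by
item name. If a name ever appears with two different values, the later
line wins in this sketch.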

You can even write your Python program as a filter (read the input lines
from ‘sys.stdin’, write the result to ‘sys.stdout’) and just hook it
into that command pipeline. If the program you're writing is named
‘do_more_processing’::

    $ cat items | sort | uniq | do_more_processing > outputfile
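
A rough sketch of what such a filter might look like follows. The
actual processing is not specified here, so the body of ‘filter_lines’
(a hypothetical name) is only a placeholder: it assumes the
‘name : value’ line format from the sample and rewrites each line as
tab-separated fields.

```python
import sys


def filter_lines(infile, outfile):
    """Read 'name : value' lines from infile, write 'name<TAB>value' to outfile."""
    for line in infile:
        name, sep, value = line.partition(":")
        if not sep:
            # Lines without a colon are dropped in this sketch.
            continue
        # Placeholder processing: replace this with the real work.
        outfile.write(f"{name.strip()}\t{value.strip()}\n")


if __name__ == "__main__":
    # Acting as a filter: standard input in, standard output out.
    filter_lines(sys.stdin, sys.stdout)
```

Passing the streams as arguments, rather than using ‘sys.stdin’ and
‘sys.stdout’ directly inside the function, keeps the processing easy to
exercise with in-memory streams.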

-- 
 \         “Science is a way of trying not to fool yourself. The first |
  `\     principle is that you must not fool yourself, and you are the |
_o__)               easiest person to fool.” —Richard P. Feynman, 1964 |
Ben Finney


