removing duplication from a huge list.

bearophileHUGS at bearophileHUGS at
Fri Feb 27 03:58:39 EST 2009

> How big of a list are we talking about? If the list is so big that the
> entire list cannot fit in memory at the same time, this approach won't
> work, e.g. removing duplicate lines from a very large file.

If the data are lines of a file, and keeping the original order isn't
important, then the first thing to try may be the Unix (or Cygwin on
Windows) commands sort and uniq.
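
A minimal sketch of that approach, assuming a POSIX shell. The file
names (input.txt, deduped.txt, deduped_u.txt) and the sample data are
made up for illustration. uniq only drops adjacent repeated lines, which
is why the data must be sorted first.

```shell
# Hypothetical sample data in a scratch directory; use your own file instead.
workdir=$(mktemp -d)
printf 'pear\napple\npear\nbanana\napple\n' > "$workdir/input.txt"

# sort brings identical lines next to each other;
# uniq then drops the adjacent repeats.
sort "$workdir/input.txt" | uniq > "$workdir/deduped.txt"

# sort -u does both steps in a single command.
sort -u "$workdir/input.txt" > "$workdir/deduped_u.txt"

cat "$workdir/deduped.txt"
```

The result is in sorted order, so the original line order is lost. As
far as I know, GNU sort spills to temporary files when the input does
not fit in memory, which is what makes this usable on very large files;
other sort implementations may behave differently.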


More information about the Python-list mailing list