removing duplication from a huge list.
Stefan Behnel
stefan_ml at behnel.de
Fri Feb 27 04:18:06 EST 2009
bearophileHUGS at lycos.com wrote:
> odeits:
>> How big a list are we talking about? If the list is so big that the
>> entire list cannot fit in memory at the same time, this approach won't
>> work, e.g. when removing duplicate lines from a very large file.
>
> If the data are lines of a file, and keeping the original order isn't
> important, then the first thing to try may be the Unix (or Cygwin on
> Windows) commands sort and uniq.
Or preferably "sort -u", if your sort supports it; that sorts and drops
the duplicate lines in a single step.
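
For comparison, if the original order does matter, something along these
lines might work from Python. This is only a rough sketch, not a
replacement for sort: it assumes the set of distinct lines (or rather
their digests) fits in memory, and the name unique_lines and the file
names in the usage comment are made up for illustration.

```python
import hashlib


def unique_lines(lines):
    """Yield each distinct line once, in first-seen order."""
    seen = set()
    for line in lines:
        # Keep a fixed-size digest instead of the line itself, so memory use
        # depends on the number of distinct lines, not on their length.
        # Assumption: SHA-1 collisions are rare enough to ignore here.
        digest = hashlib.sha1(line.encode("utf-8")).digest()
        if digest not in seen:
            seen.add(digest)
            yield line


# Hypothetical usage, streaming one file into another:
#
#     with open("input.txt") as src, open("output.txt", "w") as dst:
#         dst.writelines(unique_lines(src))
```

Since it is a generator, the input is read one line at a time; only the
set of digests grows. If even that is too large, the external sort is
probably still the better route.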
Stefan