removing duplication from a huge list.
stefan_ml at behnel.de
Fri Feb 27 10:18:06 CET 2009
bearophileHUGS at lycos.com wrote:
>> How big of a list are we talking about? If the list is so big that the
>> entire list cannot fit in memory at the same time, this approach won't
>> work, e.g. when removing duplicate lines from a very large file.
> If the data are lines of a file, and keeping the original order isn't
> important, then the first thing to try may be the Unix (or Cygwin on
> Windows) commands sort and uniq.
Or preferably "sort -u", if that's supported.
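
Where neither sort nor uniq is at hand, the same idea can be approximated in Python. What follows is only a sketch and not something proposed in this thread: the function name unique_sorted_lines and the chunk_lines parameter are invented for illustration. It assumes the input is a text file in the default encoding, that each chunk of chunk_lines lines fits in memory, and that the number of chunks stays below the open-file limit. Lines are ordered by plain string comparison with the trailing newline included, so the order can differ from what a locale-aware sort prints.

```python
import heapq
import itertools
import os
import tempfile


def unique_sorted_lines(src_path, dst_path, chunk_lines=100_000):
    """Write the sorted, de-duplicated lines of src_path to dst_path.

    Hypothetical helper: sorts fixed-size chunks in memory, spills each
    chunk to a temporary file, then merges the sorted chunk files.
    """
    chunk_paths = []
    try:
        with open(src_path, "r") as src:
            while True:
                chunk = list(itertools.islice(src, chunk_lines))
                if not chunk:
                    break
                # Make sure every line ends in "\n" so that a final line
                # without a newline compares equal to its duplicates.
                unique = {
                    line if line.endswith("\n") else line + "\n"
                    for line in chunk
                }
                fd, path = tempfile.mkstemp(text=True)
                chunk_paths.append(path)
                with os.fdopen(fd, "w") as tmp:
                    tmp.writelines(sorted(unique))

        files = [open(path, "r") for path in chunk_paths]
        try:
            # heapq.merge lazily merges the already sorted chunk files.
            merged = heapq.merge(*files)
            with open(dst_path, "w") as dst:
                # Duplicates are adjacent after the merge; keep one each.
                for line, _group in itertools.groupby(merged):
                    dst.write(line)
        finally:
            for f in files:
                f.close()
    finally:
        for path in chunk_paths:
            os.remove(path)
```

As with sort and uniq, the original line order is lost. A call would look like unique_sorted_lines("in.txt", "out.txt"), where both file names are placeholders.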