Sorting Large File (Code/Performance)

Martin Marcher martin at marcher.name
Thu Jan 24 17:49:52 EST 2008


On Thursday 24 January 2008 20:56 John Nagle wrote:

> Ira.Kovac at gmail.com wrote:
>> Hello all,
>> 
>> I have a Unicode text file with 1.6 billion lines (~2GB) that I'd like
>> to sort based on first two characters.
> 
>     Given those numbers, the average number of characters per line is
> less than 2.  Please check.

which would be true if 1.599.999.999 lines had 2 chars and the rest of the
lines just one :)

(but yes, how to sort a 1-character line based on its first 2 characters
would be an interesting question)
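
For what it's worth, here is a rough sketch of both points. It assumes
"~2GB" means 2 * 10**9 bytes and uses a hypothetical sort_key helper
(neither comes from the original post). Slicing a short string in Python
does not raise, so a 1-character line simply gets a 1-character key:

```python
def avg_bytes_per_line(total_bytes, line_count):
    """Average bytes per line, counting the newline as part of the line."""
    return total_bytes / line_count


def sort_key(line):
    """Hypothetical key: the first two characters of the line.

    Slicing never raises on short input, so a one-character line
    yields a one-character key and an empty line yields ''.
    """
    return line[:2]


# Assumed figures: 2 * 10**9 bytes spread over 1.6 * 10**9 lines.
print(avg_bytes_per_line(2 * 10**9, 16 * 10**8))  # 1.25

# Shorter keys sort before longer keys that share the same prefix.
print(sorted(["ba", "b", "ab", "a", ""], key=sort_key))
# ['', 'a', 'ab', 'b', 'ba']
```

This only illustrates the key; it says nothing about how to sort a file
that does not fit in memory.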

martin





-- 
http://noneisyours.marcher.name
http://feeds.feedburner.com/NoneIsYours

You are not free to read this message,
by doing so, you have violated my licence
and are required to urinate publicly. Thank you.



