[Tutor] sorting a 2 gb file
Andrew D. Fant
fant at pobox.com
Tue Jan 25 17:05:58 CET 2005
Alan Gauld wrote:
>>My data set the below is taken from is over 2.4 gb so speed and
>>memory considerations come into play.
>
> To be honest, if this were my problem, I'd probably dump all the data
> into a database and use SQL to extract what I needed. That's a much
> more effective tool for this kind of thing.
>
> You can do it with Python, but I think we need more understanding
> of the problem. For example what the various fields represent, how
> much of a comparison (ie which fields, case sensitivity etc) leads
> to "equality" etc.
>
And if the idea of setting up a full-blown SQL server for the problem
seems like a lot of work, you might try prototyping the sort and your
solution with sqlite, and only migrate to a full-fledged RDBMS of your
choice if the prototype works as you want it to but sqlite seems too
slow for your needs.
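
For what it's worth, a minimal sketch of such a prototype might look
something like the following. It uses the sqlite3 module from the
standard library of current Python versions. Everything about the data
layout here is my assumption, since we haven't seen the real fields: I
am guessing at tab-separated lines whose first field is the sort key,
and the names build_db, sorted_rows, records, sort_key and rest are
made up purely for illustration.

```python
import csv
import sqlite3
from itertools import islice


def build_db(data_path, db_path, batch_size=50_000):
    """Load a tab-separated file into a SQLite table, one batch at a time.

    Assumed layout (hypothetical): the first field is the sort key and
    the remaining fields are stored together as one text column.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE records (sort_key TEXT, rest TEXT)")
    with open(data_path, newline="") as fh:
        reader = csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        # Skip blank lines; keep the key separate from the other fields.
        rows = ((r[0], "\t".join(r[1:])) for r in reader if r)
        while True:
            # Only batch_size rows are held in memory at any one time.
            batch = list(islice(rows, batch_size))
            if not batch:
                break
            conn.executemany("INSERT INTO records VALUES (?, ?)", batch)
    conn.commit()
    # An index on the key lets ORDER BY walk it instead of sorting each time.
    conn.execute("CREATE INDEX idx_sort_key ON records (sort_key)")
    conn.commit()
    return conn


def sorted_rows(conn):
    """Return a cursor that yields (sort_key, rest) tuples in key order."""
    return conn.execute(
        "SELECT sort_key, rest FROM records ORDER BY sort_key"
    )
```

You would call build_db("data.tsv", "prototype.db") once (both file
names are placeholders) and then loop over sorted_rows(conn). Loading
in batches keeps memory use flat, which is the point with a file this
size. Whether a plain text comparison on the first field is the right
notion of "equality" depends on the answers to Alan's questions above.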
Andy