[Tutor] sorting a 2 gb file

Andrew D. Fant fant at pobox.com
Tue Jan 25 17:05:58 CET 2005


Alan Gauld wrote:
>>My data set the below is taken from is over 2.4 gb so speed and memory
>>considerations come into play.
> 
> 
> To be honest, if this were my problem, I'd probably dump all the data
> into a database and use SQL to extract what I needed. That's a much
> more effective tool for this kind of thing.
> 
> You can do it with Python, but I think we need more understanding
> of the problem. For example what the various fields represent, how
> much of a comparison (ie which fields, case sensitivity etc) leads
> to "equality" etc.
>
And if setting up a full-blown SQL server for the problem seems like a
lot of work, you might try prototyping the sort and your solution with
sqlite, and only migrate to the full-fledged RDBMS of your choice if the
prototype works as you want it to but sqlite proves too slow for your
needs.
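
As a rough sketch of what such a prototype could look like, here is one
way to load records into sqlite from Python and read them back in sorted
order. The original data layout was not posted, so the two-column,
tab-separated format and the names `records`, `key`, `value`,
`load_records` and `sorted_records` are all assumptions made up for
illustration. `COLLATE NOCASE` is used on the guess that case should not
matter for "equality"; drop it if it should.

```python
import csv
import sqlite3


def load_records(lines, db_path=":memory:"):
    """Load tab-separated lines into a (hypothetical) 'records' table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE records (key TEXT, value TEXT)")
    # Generator: rows are fed to sqlite one at a time, so the input
    # never has to fit in memory all at once.
    rows = (row[:2] for row in csv.reader(lines, delimiter="\t")
            if len(row) >= 2)
    with conn:  # commits the inserts as a single transaction
        conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    # Build the index after the bulk load so the sorted read can use it.
    conn.execute("CREATE INDEX idx_key ON records (key COLLATE NOCASE)")
    return conn


def sorted_records(conn):
    """Return a cursor over all rows, ordered case-insensitively by key."""
    return conn.execute(
        "SELECT key, value FROM records ORDER BY key COLLATE NOCASE, value"
    )


# Tiny made-up sample standing in for the real 2.4 gb file.
sample = ["banana\t3\n", "Apple\t1\n", "cherry\t2\n", "apple\t0\n"]
conn = load_records(sample)
result = list(sorted_records(conn))
```

For the real data you would pass an open file object instead of `sample`
and a filename for `db_path` so the database lives on disk, then iterate
over the cursor rather than calling `list()` on it.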

Andy

