Processing large CSV files - how to maximise throughput?
Roy Smith
roy at panix.com
Fri Oct 25 20:22:44 EDT 2013
In article <mailman.1560.1382744694.18130.python-list at python.org>,
Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> Memory is cheap -- I/O is slow. <G> Just how massive are these CSV
> files?
Actually, these days, the economics of hardware are more like, "CPU is
cheap, memory is expensive".
I suppose it all depends on what kinds of problems you're solving, but
in my experience I'm much more likely to run out of memory on big
problems than I am to peg the CPU. Also, pegging the CPU leads to
well-behaved performance degradation; running out of memory leads to
falling off a performance cliff as you start to page.
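For what it's worth, the usual way to stay on the right side of that
cliff with a big CSV is to stream it with the csv module rather than
slurping the whole file into memory. A minimal sketch (the per-row work
here is just a placeholder for whatever processing the OP actually
needs):

import csv

def process(path):
    # Stream the file one row at a time; memory use stays flat
    # no matter how large the CSV is, so you never start paging.
    count = 0
    with open(path) as f:
        for row in csv.reader(f):
            count += 1  # placeholder for real per-row work
    return count

Since csv.reader is an iterator over the open file, only one row is in
memory at a time, and throughput is bounded by I/O rather than by RAM.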
And, with the advent of large-scale SSDs (you can get a 1.6 TB SSD in a
2.5 inch form factor!), I/O is as fast as you're willing to pay for :-)