Reading a large csv file

python at bdurham.com
Tue Jun 23 14:05:34 EDT 2009


Mag,

If your source data is clean, it may also be faster to parse your
input files directly rather than use the csv module, which may(?) add
some overhead.

Check out the struct module and/or use the split() method of strings.
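
A rough sketch of both approaches, in case it helps. The function names
and the fixed-width layout (4-byte id, 10-byte name, 6-byte amount) are
made up for illustration, and the split() version assumes the data has
no quoted fields or embedded delimiters:

```python
import struct


def parse_delimited(line, delimiter=","):
    """Split one clean line into fields.

    Assumes no quoting and no delimiters embedded in field values;
    if the data has either, the csv module is the safer choice.
    """
    return line.rstrip("\r\n").split(delimiter)


# Hypothetical fixed-width layout: 4-byte id, 10-byte name, 6-byte amount.
RECORD = struct.Struct("4s10s6s")


def parse_fixed_width(record):
    """Unpack one fixed-width record (bytes) into stripped text fields."""
    return [field.decode("ascii").strip() for field in RECORD.unpack(record)]
```

For a delimited file you would call parse_delimited() on each line as
you iterate over the open file, so only one line is held in memory at a
time. Timing both this and the csv module on a sample of your data is
the only way to know which is faster in your case.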

We do a lot of ETL processing with flat files, and on a slow single-core
workstation we can typically process 2 GB of data in ~5 minutes. I would
think a worst-case processing time would be less than an hour for 14 GB
of data.
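
The back-of-the-envelope arithmetic behind that estimate, assuming
processing time scales linearly with file size (an assumption, not
something I have measured at 14 GB). The function name is just for
illustration:

```python
def estimate_minutes(size_gb, ref_gb=2.0, ref_minutes=5.0):
    """Scale a reference timing linearly to a different file size."""
    return size_gb * ref_minutes / ref_gb


print(estimate_minutes(14))  # 35.0
```

That leaves a fair margin under the one-hour worst case for slower
disks or heavier per-row work.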

Malcolm 


