8 Jul 2010, 8:26 a.m.
Dear NumPy developers,

I have to process some big data files with high-frequency financial data. I am trying to load a delimited text file of ~700 MB with ~10 million lines using numpy.genfromtxt(). The machine is a Debian Lenny server (32-bit, 3 GB of memory). Since the file is only 700 MB, I naively assumed it would fit into memory in its entirety. However, when I attempt to load it, Python fills all available memory and then fails with:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python2.6/site-packages/numpy/lib/io.py", line 1318, in genfromtxt
        errmsg = "\n".join(errmsg)
    MemoryError

Is there a way to load this file without crashing?

Thanks,
Hannes
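For reference, a minimal sketch of the kind of call described above; the actual filename, delimiter, and column layout are not stated in the post, so the values below are placeholders:

    import numpy as np

    # Placeholder filename and delimiter -- the post does not give them.
    # genfromtxt accumulates each parsed line in intermediate Python lists
    # before building the array, so peak memory use is typically several
    # times the size of the text file itself.
    data = np.genfromtxt("ticks.csv", delimiter=",", dtype=float)
    print(data.shape)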