[Numpy-discussion] memory-efficient loadtxt
chris.barker at noaa.gov
Mon Oct 1 15:07:19 EDT 2012
Nice to see someone working on these issues, but:
I'm not sure what problem you are trying to solve -- accumulating in a
list is pretty efficient anyway -- not a whole lot of overhead.
But if you do want to improve that, it may be better to change the
accumulating method, rather than doing the double-read thing. I've
written, and posted here, code that provides an Accumulator that uses
numpy internally, so not much memory overhead. In the end, it's not
any faster than accumulating in a list and then converting to an
array, but it does use less memory.
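For illustration, here is a minimal sketch of what such an accumulator could look like. It is not the code referred to above: the class name, method names, initial capacity, and doubling growth factor are all assumptions made for this example.

```python
import numpy as np


class Accumulator:
    """Growable 1-D buffer backed by a numpy array (hypothetical sketch)."""

    def __init__(self, dtype=float, capacity=16):
        # Start with a small typed buffer; capacity is an assumed default.
        self._buf = np.empty(max(1, capacity), dtype=dtype)
        self._n = 0  # number of valid items stored so far

    def append(self, value):
        if self._n == self._buf.shape[0]:
            # Buffer is full: allocate one twice as large and copy over.
            # Doubling keeps appends amortized O(1).
            bigger = np.empty(2 * self._buf.shape[0], dtype=self._buf.dtype)
            bigger[:self._n] = self._buf
            self._buf = bigger
        self._buf[self._n] = value
        self._n += 1

    def __len__(self):
        return self._n

    def to_array(self):
        # Copy the filled part so the oversized buffer can be freed.
        return self._buf[:self._n].copy()
```

Each value is stored directly in a typed buffer rather than as a separate Python object in a list, which is where the memory saving comes from.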
I also have a Cython version that is not quite done (darn regular job
getting in the way) that is both faster and more memory efficient.
Also, frankly, just writing the array pre-allocation and resizing
code into loadtxt would not be a whole lot of code either, and would
be both fast and memory efficient.
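As a rough sketch of the pre-allocate-and-resize idea applied to text parsing (this is not loadtxt's actual implementation; the function name `load_rows`, the `initial_rows` default, and the whitespace-delimited float format are assumptions for illustration):

```python
import io

import numpy as np


def load_rows(lines, initial_rows=1024):
    """Parse whitespace-delimited float rows into a 2-D array in one pass."""
    arr = None
    n = 0  # number of rows filled so far
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if arr is None:
            # Column count is only known after the first data row.
            arr = np.empty((max(1, initial_rows), len(fields)), dtype=float)
        elif n == arr.shape[0]:
            # Out of room: double the row capacity and copy existing rows.
            bigger = np.empty((2 * arr.shape[0], arr.shape[1]), dtype=float)
            bigger[:n] = arr
            arr = bigger
        # Raises ValueError if a row has the wrong number of fields.
        arr[n] = [float(f) for f in fields]
        n += 1
    if arr is None:
        return np.empty((0, 0), dtype=float)
    # Trim to the rows actually read.
    return arr[:n].copy()


data = io.StringIO("1 2 3\n4 5 6\n\n7 8 9\n")
table = load_rows(data, initial_rows=1)
```

The input is read only once, so this works on non-seekable streams as well, and no intermediate list of Python objects is built.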
Let me know if you want any of my code to play with.
> However, I got the impression that someone was
> working on a More Advanced (TM) C-based file reader, which will
> replace loadtxt;
yes -- I wonder what happened with that? Anyone?
> this patch is intended as a useful thing to have
> while we're waiting for that to appear.
> The patch passes all tests in the test suite, and documentation for
> the kwarg has been added. I've modified all tests to include the
> seekable kwarg, but that was mostly to check that all tests are passed
> also with this kwarg. I guess it's a bit too late for 1.7.0 though?
> Should I make a pull request? I'm happy to take any and all
> suggestions before I do.
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov