[Numpy-discussion] Memory error with numpy.loadtxt()

Joe Kington jkington at wisc.edu
Fri Feb 25 12:52:27 EST 2011

Do you expect to have very large integer values, or only values over a
limited range?

If your integer values fit into the 16-bit range (or even the 32-bit
range), you can cut your memory usage considerably. The default dtype
for loadtxt is float64, so int32 halves the footprint and int16 quarters
it.

I.e. something like:
data = numpy.loadtxt(filename, dtype=numpy.int16)
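
To make the saving concrete, here is a minimal sketch. The small table
in `text` is a made-up stand-in for the real file, and the range check
is only one way of confirming that the narrower dtype is safe:

```python
import io
import numpy as np

# Hypothetical 3x4 integer table standing in for the real file.
text = "0 0 7 0\n0 0 0 0\n12 0 0 -3\n"

# Default dtype is float64: 8 bytes per value.
as_float = np.loadtxt(io.StringIO(text))
# int16 needs only 2 bytes per value.
as_int16 = np.loadtxt(io.StringIO(text), dtype=np.int16)

# Confirm that every value fits in int16 before trusting the result.
info = np.iinfo(np.int16)
fits = bool(info.min <= as_float.min() and as_float.max() <= info.max)

print(as_float.nbytes, as_int16.nbytes, fits)
```

On a real file you would not want to load the float64 version just to
check the range; the comparison is only there to show the difference in
size.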

Alternately, if you're already planning on using a (scipy) sparse array
anyway, it's easy to do something like this:

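import numpy as np
import scipy.sparse

# Row indices, column indices and values of the nonzero entries
I, J, V = [], [], []
with open('infile.txt') as infile:
    for i, line in enumerate(infile):
        # np.int was removed from recent NumPy; the builtin int is equivalent
        line = np.array(line.strip().split(), dtype=int)
        nonzeros, = line.nonzero()
        I.extend([i] * nonzeros.size)
        J.extend(nonzeros)
        V.extend(line[nonzeros])
data = scipy.sparse.coo_matrix((V, (I, J)), dtype=int,
                               shape=(i + 1, line.size))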

This will be much slower than numpy.loadtxt(...), but if you're just
converting the output of loadtxt to a sparse array anyway, this would
avoid the memory usage problems (assuming the array is mostly sparse, of
course).
Hope that helps,

On Fri, Feb 25, 2011 at 9:37 AM, Jaidev Deshpande <
deshpande.jaidev at gmail.com> wrote:

> Hi
> Is it possible to load a text file 664 MB large with integer values and
> about 98% sparse? numpy.loadtxt() shows a memory error.
> If it's not possible, what alternatives could I have?
> The usable RAM on my machine running Windows 7 is 3.24 GB.
> Thanks.