[Numpy-discussion] Create numpy array from a list error

Dave Wood davejwood at gmail.com
Wed Sep 23 10:06:46 EDT 2009


Hi all,

I've got a fairly large (but not huge, 58 MB) tab-separated text file, with
approximately 200 columns and 56k rows of numbers and strings.

Here's a snippet of my code to create a numpy matrix from the data file...

####

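import sys
from numpy import array

# one list of fields per input line (a list, so array() sees rows, not an iterator)
data = [line.strip().split('\t') for line in sys.stdin]
data = array(data)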

###

It causes the following error:

data = array(data)
  ValueError: setting an array element with a sequence

If I take the first 40,000 lines of the file, it works fine.
If I take the last 40,000 lines of the file, it also works fine. Between them
those two slices cover every row, so it doesn't look like a problem with any
particular line of the file.
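
One way to double-check that the rows really are uniform is to count the
fields on every line, since array() needs each row to have the same length.
This is only a minimal sketch: `field_count_histogram` is a name made up for
illustration, and it assumes the same tab-separated input as above. It uses
rstrip('\n') rather than strip(), because strip() with no arguments also
removes trailing tabs, so a row ending in empty fields would come out shorter
than the rest.

```python
from collections import Counter


def field_count_histogram(lines):
    """Map 'number of tab-separated fields' -> 'number of rows with that count'."""
    return Counter(len(line.rstrip('\n').split('\t')) for line in lines)


# Tiny made-up sample: two rows with 3 fields, one row with 2 fields.
sample = ["a\tb\tc\n", "1\t2\t3\n", "4\t5\n"]
for n_fields, n_rows in sorted(field_count_histogram(sample).items()):
    print(n_fields, "fields:", n_rows, "rows")
```

If more than one field count shows up for the real file, the rows are ragged;
if only one shows up, the shape of the data is probably not the cause.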

I've found a few other posts complaining of the same problem, but none of
their fixes work.

It seems like a memory problem to me. This was reinforced when I tried to
break the dataset into 3 chunks and stack the resulting arrays - I got an
error message saying "memory error".
Also, I don't really understand why reading in this 57 MB text file is taking
up ~2 GB of RAM.
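
A rough way to see where some of that memory might go: a NumPy string array
stores every element at a fixed width equal to the longest string in it, so
one long field inflates all of them. The sketch below is an estimate under
stated assumptions, not a measurement. `string_array_nbytes` is a
hypothetical helper, and the 50-character longest field is an invented figure
used only for illustration.

```python
import numpy as np


def string_array_nbytes(n_rows, n_cols, longest_field, bytes_per_char=1):
    """Estimate the buffer size of a fixed-width string array.

    bytes_per_char=1 matches a byte-string ('S') dtype; a unicode ('U')
    dtype uses 4 bytes per character.
    """
    return n_rows * n_cols * longest_field * bytes_per_char


# Check the estimate against a real (tiny) byte-string array.
small = np.array([[b"1", b"2.5"], [b"abcdefghij", b"x"]])
assert small.itemsize == 10  # width of the longest element
assert small.nbytes == string_array_nbytes(2, 2, 10)

# 56k rows x 200 columns, ASSUMING the longest field is 50 characters.
print(string_array_nbytes(56000, 200, 50))
```

Under that assumed width the array buffer alone would be about 560 MB. On top
of that, the intermediate list of lists holds roughly 11.2 million separate
Python string objects, each with its own per-object overhead, and
readlines() keeps the raw lines in memory at the same time.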

Any advice? Thanks in advance.

Dave