[Numpy-discussion] Numpy 2D array from a list error

Christopher Barker Chris.Barker at noaa.gov
Wed Sep 23 12:36:08 EDT 2009


Dave Wood wrote:
> Well, I suppose they are all considered to be strings here. I haven't 
> tried to convert the numbers to floats yet.

This could be an issue. For strings, numpy creates an array of strings, 
all of the same length, so each element is as big as the largest one:

In [13]: l
Out[13]: ['5', '34', 'this is a much longer string']

In [14]: np.array(l)
Out[14]:
array(['5', '34', 'this is a much longer string'],
       dtype='|S28')


Note that each element is 28 bytes (that's what the S28 means).

This means that your array would be much larger than the text file if 
you have even one long string in it. Also, as mentioned in this thread, 
in order to figure out how big to make each string element, the array() 
constructor has to scan through your entire list first, and I don't know 
how much intermediate memory it may use in that process.
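
To put rough numbers on the padding cost, here is a minimal sketch using the same three strings. It assumes byte strings (b'...') so that the dtype comes out as S28 like in the session above; on Python 3 a list of plain str objects would give a unicode dtype that uses 4 bytes per character instead.

```python
import numpy as np

# Same three values as above, written as byte strings so the dtype is S28.
items = [b'5', b'34', b'this is a much longer string']
arr = np.array(items)

# Every element is padded to the length of the longest one.
print(arr.dtype)     # fixed-width byte-string dtype
print(arr.itemsize)  # bytes per element
print(arr.nbytes)    # total bytes held by the array

# Bytes actually needed for the text itself, for comparison.
raw = sum(len(s) for s in items)
print(raw)
```

The gap between arr.nbytes and raw grows with the number of rows, so a single long string in a large file can dominate the memory use.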

This really isn't how numpy is meant to be used -- why would you want a 
big ol' array of mixed numbers and strings, all stored as strings?

Structured arrays were meant for this, and np.loadtxt() is the easiest 
way to get one.
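
As a rough sketch of what that could look like: the column names, field types and sample data below are made up for illustration, since the layout of the actual file isn't shown in this thread. An in-memory io.StringIO stands in for the text file.

```python
import io
import numpy as np

# Hypothetical whitespace-separated data: a label, a float and an int per line.
text = io.StringIO("a 1.5 3\nb 2.5 4\n")

# One field per column; names and types here are assumptions for the example.
dt = np.dtype([('name', 'U8'), ('value', 'f8'), ('count', 'i4')])

data = np.loadtxt(text, dtype=dt)

# Each column is available by field name with its own numeric or string type.
print(data['name'])
print(data['value'].sum())
print(data['count'])
```

This way the numeric columns are stored as real numbers rather than as padded strings, and only the text column pays the fixed-width cost.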

> I just tried preallocating the array and updating it one line at a time, 
> and that works fine.

What dtype do you end up with?

> This doesn't seem like the expected behaviour though and the error 
> message seems wrong.

Yes, not a good error message at all -- it's hard to make sure good 
errors get triggered every time!


HTH,

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


