[Numpy-discussion] Numpy 2D array from a list error
Christopher Barker
Chris.Barker at noaa.gov
Wed Sep 23 12:36:08 EDT 2009
Dave Wood wrote:
> Well, I suppose they are all considered to be strings here. I haven't
> tried to convert the numbers to floats yet.
This could be an issue. For strings, numpy creates an array of strings,
all of the same length, so each element is as big as the largest one:
In [13]: l
Out[13]: ['5', '34', 'this is a much longer string']
In [14]: np.array(l)
Out[14]:
array(['5', '34', 'this is a much longer string'],
      dtype='|S28')
Note that each element is 28 bytes (that's what the S28 means).
This means that your array would be much larger than the text file if
you have even one long string in it. Also, as mentioned in this thread,
in order to figure out how big to make each string element, the array()
constructor has to scan through your entire list first, and I don't know
how much intermediate memory it may use in that process.
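
To put numbers on the padding, here is a minimal sketch that compares the
array's storage with the total length of the original strings. It assumes
byte strings (the b'' prefix) so that the dtype matches the S28 shown above;
with plain Python 3 str objects numpy would pick a unicode dtype ('<U28')
at 4 bytes per character instead. The variable names are only illustrative.

```python
import numpy as np

# Byte strings, so numpy picks a fixed-width bytes dtype (S28).
items = [b'5', b'34', b'this is a much longer string']
arr = np.array(items)

# Every element is padded to the width of the longest string.
itemsize = arr.itemsize                    # bytes per element
array_bytes = arr.nbytes                   # itemsize * number of elements
text_bytes = sum(len(s) for s in items)    # bytes in the strings themselves

print(arr.dtype, itemsize, array_bytes, text_bytes)
```

Here the array takes 84 bytes to hold 31 bytes of text, and the gap grows
with every short string that sits next to a long one.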
This really isn't how numpy is meant to be used -- why would you want a
big ol' array of mixed numbers and strings, all stored as strings?
Structured arrays were meant for this, and np.loadtxt() is the easiest
way to get one.
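
As a rough sketch of the structured-array route: the field names, the
dtypes, and the two-line sample below are all made up for illustration,
since the layout of your actual file isn't shown in this thread. An
in-memory StringIO stands in for the file.

```python
import io
import numpy as np

# Hypothetical whitespace-delimited data: an int, a float and a label per row.
text = io.StringIO("5 34.5 short\n7 1.25 longer_label\n")

# One field per column; only the label field is stored as a string.
dt = np.dtype([("count", "i4"), ("value", "f8"), ("label", "U16")])

data = np.loadtxt(text, dtype=dt)

print(data.dtype)
print(data["count"])    # numeric column, usable directly in arithmetic
print(data["label"])    # string column
```

Each column keeps its own type, so the numbers stay numbers and only the
label column pays the fixed-width string cost.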
> I just tried preallocating the array and updating it one line at a time,
> and that works fine.
What dtype do you end up with?
> This doesn't seem like the expected behaviour though and the error
> message seems wrong.
Yes, not a good error message at all -- it's hard to make sure good
errors get triggered every time!
HTH,
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov