[Numpy-discussion] Numpy 2D array from a list error

Dave Wood davejwood at gmail.com
Wed Sep 23 13:03:19 EDT 2009


Ignore that last mail, I hit send instead of save by mistake.

Between you, you seem to be right: it's a problem with loading the array of
strings. There must be some very long strings in the first 'rowname' column.
If this column is left out, it works fine (even as strings).
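
A minimal sketch of that workaround, assuming a whitespace-delimited file
with the row names in column 0 and numbers in the remaining columns. The
in-memory sample data, the column indices and the variable names are
hypothetical stand-ins, not the actual input file:

```python
import io
import numpy as np

# Hypothetical stand-in for the input file: one row-name column, then numbers.
sample = io.StringIO(
    "gene_a 1.0 2.0 3.0\n"
    "a_much_longer_row_name_than_the_others 4.0 5.0 6.0\n"
)

# Read only the numeric columns, leaving out the row names in column 0.
values = np.loadtxt(sample, usecols=(1, 2, 3))

# Read the row names on their own, as a separate 1-D string array.
sample.seek(0)
names = np.loadtxt(sample, dtype=str, usecols=(0,))

# A string array is fixed-width, so one long name widens every element.
print(values.dtype, values.shape)
print(names.dtype, names.shape)
```

Keeping the names in their own array means the numeric data stays as floats
and only the (much smaller) name array pays for the longest string.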

Many thanks, sorry for all the emails.

Dave




On 9/23/09, Dave Wood <davejwood at gmail.com> wrote:
>
> Apologies for the multiple posts, people. My posting to the forum was
> pending for a long time, so I deleted it and tried emailing directly. I
> didn't think they'd all be sent out.
> Gokan, thanks for the reply, I hope you get this one.
>
> "Here I use loadtxt to read ~89 MB txt file. Can you use loadtxt and share
> your results?
>
> I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float', skiprows=83).T
>
> I[15]: len data
> -----> len(data)
> O[15]: 66
>
> I[16]: len data[0]
> -----> len(data[0])
> O[16]: 117040
>
> I[17]: whos
> Variable   Type        Data/Info
> --------------------------------
> data       ndarray     66x117040: 7724640 elems, type `float64`, 61797120 bytes (58 Mb)
>
>
>
> [gsever at ccn various]$ python sysinfo.py
>
> ================================================================================
> Platform     :
> Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas
> Python       : ('CPython', 'tags/r26', '66714')
> IPython      : 0.10
> NumPy        : 1.4.0.dev
> Matplotlib   : 1.0.svn
>
> ================================================================================
>
>
> --
> Gökhan"
>
>
>
>
> I tried using loadtxt and got the same error as before (with a little more
> information).
>
> "
>
> Traceback (most recent call last):
>   File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 140, in <module>
>     main()
>   File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 45, in main
>     data = loadtxt("inputfile.txt",dtype='string')
>   File "/apps/python/2.5.4/rhel4/lib/python2.5/site-packages/numpy/lib/io.py", line 505, in loadtxt
>     X = np.array(X, dtype)
> ValueError: setting an array element with a sequence
> "
>
> @Christopher Barker
> Thanks for the information. To fix my problem, I tried taking out the row
> names (leaving only numerical information), and converting the 2D list to
> floats. I still had the same problem.
>
>
> On 9/23/09, Christopher Barker <Chris.Barker at noaa.gov> wrote:
>>
>> Dave Wood wrote:
>> > Well, I suppose they are all considered to be strings here. I haven't
>> > tried to convert the numbers to floats yet.
>>
>> This could be an issue. For strings, numpy creates an array of strings,
>> all of the same length, so each element is as big as the largest one:
>>
>> In [13]: l
>> Out[13]: ['5', '34', 'this is a much longer string']
>>
>> In [14]: np.array(l)
>> Out[14]:
>> array(['5', '34', 'this is a much longer string'],
>>       dtype='|S28')
>>
>>
>> Note that each element is 28 bytes (that's what the S28 means).
>>
>> This means that your array would be much larger than the text file if
>> you have even one long string in it. Also, as mentioned in this thread,
>> in order to figure out how big to make each string element, the array()
>> constructor has to scan through your entire list first, and I don't know
>> how much intermediate memory it may use in that process.
>>
>> This really isn't how numpy is meant to be used -- why would you want a
>> big ol' array of mixed numbers and strings, all stored as strings?
>>
>> structured arrays were meant for this, and np.loadtxt() is the easiest
>> way to get one.
>>
>> > I just tried preallocating the array and updating it one line at a time,
>> > and that works fine.
>>
>> what dtype do you end up with?
>>
>> > This doesn't seem like the expected behaviour though and the error
>> > message seems wrong.
>>
>> yes, not a good error message at all -- it's hard to make sure good
>> errors get triggered every time!
>>
>>
>> HTH,
>>
>> -Chris
>>
>>
>>
>> --
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R            (206) 526-6959   voice
>> 7600 Sand Point Way NE   (206) 526-6329   fax
>> Seattle, WA  98115       (206) 526-6317   main reception
>>
>> Chris.Barker at noaa.gov
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>