[Numpy-discussion] Help using numPy to create a very large multi dimensional array
Vincent Nijs
v-nijs at kellogg.northwestern.edu
Thu Apr 19 17:59:59 EDT 2007
I think it would be a great idea to have pylab.load in numpy. It also seems
to be a lot faster than scipy.io.
One thing that is very nice about pylab.load is that it can read in dates.
However, it can't, as far as I know, handle other non-float data.
I played around with python's csv module and pylab.load for a while
resulting in a database class I posted in the cookbook section:
http://www.scipy.org/Cookbook/dbase
This class can read any type of data in a csv file, including dates, into a
dictionary, and is built on both pylab.load and the csv module. I use cPickle
for storing the data once it has been read in. I haven't tried PyTables but
hear a lot of good things about it.
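
A rough sketch of the general idea (not the Cookbook class itself), using only
the standard library. The names convert and read_columns, the sample data, and
the date format are assumptions made for illustration; pickle is the Python 3
name for cPickle.

```python
import csv
import io
import pickle
from datetime import datetime


def convert(value, date_format="%Y-%m-%d"):
    """Try float first, then a date, and fall back to the raw string."""
    try:
        return float(value)
    except ValueError:
        pass
    try:
        return datetime.strptime(value, date_format).date()
    except ValueError:
        return value


def read_columns(fileobj, date_format="%Y-%m-%d"):
    """Read a CSV with a header row into a dict of column name -> list."""
    reader = csv.reader(fileobj)
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            columns[name].append(convert(value, date_format))
    return columns


# Made-up sample data standing in for a file on disk.
sample = io.StringIO(
    "date,ticker,price\n"
    "2007-04-18,ABC,10.5\n"
    "2007-04-19,XYZ,11.25\n"
)
data = read_columns(sample)

# Pickle the parsed dictionary so later runs can skip the parsing step.
blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)
```

Trying float before the date format keeps the common numeric case cheap; a
column that mixes types simply ends up as a list of mixed Python objects.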
Vincent
On 4/19/07 10:58 AM, "Christopher Barker" <Chris.Barker at noaa.gov> wrote:
> Lisandro Dalcin wrote:
>> I am also +1 on this, but this functionality should be implemented in
>> C, I think.
>
> well, maybe.
>
>> I've just tested numpy.fromfile('name.txt', sep=' ')
>> against pylab.load('name.txt') for a 35MB text file; the numbers are:
>>
>> numpy.fromfile: 2.66 sec.
>> pylab.load: 16.64 sec.
>
> Exactly, that's expected. fromfile is designed to do the easy cases as
> fast as possible; pylab.load is designed to be flexible. I'm not sure
> you need both the speed and flexibility at the same time.
>
> By the way, I haven't looked at pylab.load() for a while, but it could
> perhaps be sped up by using fromfile() and/or fromstring() internally.
> There may be some opportunity to special-case the easy ones too (e.g.
> when all columns are desired).
>
> -Chris
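
The special-casing idea in the quoted reply could look roughly like the sketch
below. It is only an illustration under stated assumptions: load_table is a
hypothetical helper, numpy.loadtxt stands in for the flexible loader, and the
fast path assumes a purely numeric, whitespace-separated file with no header
or comment lines.

```python
import os
import tempfile

import numpy as np


def load_table(path, usecols=None):
    """Hypothetical helper: fast path for easy files, flexible path otherwise."""
    if usecols is None:
        # Easy case: all columns wanted. Count the columns on the first
        # line, then let fromfile parse the whole file as one flat array.
        with open(path) as f:
            ncols = len(f.readline().split())
        flat = np.fromfile(path, sep=" ")
        if ncols and flat.size % ncols == 0:
            return flat.reshape(-1, ncols)
    # Anything else (column selection, ragged parse) goes to the slower,
    # more flexible loader.
    return np.loadtxt(path, usecols=usecols, ndmin=2)


# Small made-up file to exercise both paths.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 2 3\n4 5 6\n")
    tmp_path = tmp.name

full = load_table(tmp_path)                   # fast path via fromfile
subset = load_table(tmp_path, usecols=(0, 2))  # flexible path via loadtxt
os.remove(tmp_path)
```

The fast path deliberately gives up as soon as the element count does not
divide evenly by the column count, so a file it cannot handle still reaches
the flexible loader.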