[Numpy-discussion] Help using numPy to create a very large multi dimensional array

Vincent Nijs v-nijs at kellogg.northwestern.edu
Thu Apr 19 17:59:59 EDT 2007


I think it would be a great idea to have pylab.load in numpy. It also seems
to be a lot faster than scipy.io.

One thing that is very nice about pylab.load is that it can read in dates.
However, it can't, as far as I know, handle other non-float data.

I played around with python's csv module and pylab.load for a while
resulting in a database class I posted in the cookbook section:

http://www.scipy.org/Cookbook/dbase

This class can read any type of data in a csv file, including dates, into a
dictionary. It is built on both pylab.load and the csv module. I use cPickle
to store the data once it has been read in. I haven't tried PyTables but
hear a lot of good things about it.
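
A minimal sketch of the general idea (this is not the Cookbook class itself):
read a csv file with mixed column types into a dictionary of columns, then
pickle the result so it does not have to be parsed again. The names convert()
and read_columns(), the sample data, and the "%Y-%m-%d" date format are all
assumptions made for illustration. It uses Python 3, where the plain pickle
module takes the place of cPickle.

```python
import csv
import io
import pickle
from datetime import datetime


def convert(value):
    """Try int, then float, then a YYYY-MM-DD date; else keep the string."""
    casts = (int, float, lambda s: datetime.strptime(s, "%Y-%m-%d").date())
    for cast in casts:
        try:
            return cast(value)
        except (ValueError, TypeError):
            continue
    return value


def read_columns(fileobj):
    """Read a csv file with a header row into a dict: column name -> list."""
    reader = csv.DictReader(fileobj)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            columns[name].append(convert(row[name]))
    return columns


# Hypothetical sample data standing in for a real csv file on disk.
sample = io.StringIO(
    "date,ticker,price\n"
    "2007-04-19,ABC,12.5\n"
    "2007-04-20,XYZ,13\n"
)
data = read_columns(sample)

# Serialize the parsed dictionary; writing these bytes to a file lets
# later sessions skip the csv parsing step entirely.
blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)
```

Guessing the type value by value is simple but can give a column mixed
types; a real loader would more likely settle on one type per column.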

Vincent


On 4/19/07 10:58 AM, "Christopher Barker" <Chris.Barker at noaa.gov> wrote:

> Lisandro Dalcin wrote:
>> I am also +1 on this, but this functionality should be implemented in
>> C, I think.
> 
> Well, maybe.
> 
>> I've just tested numpy.fromfile('name.txt', sep=' ')
>> against pylab.load('name.txt') for a 35MB text file, the numbers are:
>> 
>> numpy.fromfile: 2.66 sec.
>> pylab.load:  16.64 sec.
> 
> Exactly, that's expected. fromfile is designed to do the easy cases as
> fast as possible, pylab.load is designed to be flexible. I'm not sure
> you need both the speed and flexibility at the same time.
> 
> By the way, I haven't looked at pylab.load() for a while, but it could
> perhaps be sped up by using fromfile() and/or fromstring() internally.
> There may be some opportunity to special-case the easy ones too (e.g.
> all columns desired).
> 
> -Chris

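A rough sketch of the special-casing idea quoted above, under some loud
assumptions: the file is purely numeric, whitespace-delimited, has no header
or comment lines, and has the same number of columns on every row.
load_table() is a hypothetical name, and numpy.loadtxt stands in here for the
flexible loader; this is not how pylab.load is actually implemented.

```python
import os
import tempfile

import numpy as np


def load_table(path, usecols=None):
    """Load a whitespace-delimited numeric table as a 2-D float array."""
    if usecols is None:
        # Easy case: every column is wanted, so let fromfile parse the
        # whole file in one pass and recover the shape afterwards.
        with open(path) as f:
            ncols = len(f.readline().split())
        flat = np.fromfile(path, sep=" ")
        if ncols and flat.size % ncols == 0:
            return flat.reshape(-1, ncols)
    # General case: fall back to the slower but more flexible parser.
    return np.loadtxt(path, usecols=usecols, ndmin=2)


# Small demonstration with a temporary file (hypothetical data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 2 3\n4 5 6\n")
    demo_path = tmp.name

full = load_table(demo_path)                  # takes the fromfile fast path
subset = load_table(demo_path, usecols=(0, 2))  # takes the loadtxt path
os.remove(demo_path)
```

The divisibility check is only a weak guard. A file with ragged rows or stray
text could still slip through the fast path, so a production loader would
need stricter validation before trusting the reshape.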