On 4/18/07, numpy-discussion-request@scipy.org wrote:
Message: 5
Date: Wed, 18 Apr 2007 09:11:32 -0700
From: Christopher Barker Chris.Barker@noaa.gov
Subject: Re: [Numpy-discussion] Help using numPy to create a very large multi dimensional array
To: Discussion of Numerical Python numpy-discussion@scipy.org
Message-ID: 46264334.8080304@noaa.gov
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Bruno Santos wrote:
Finally I was able to read the data by using the command you said, with some small changes:
matrix = numpy.array([[float(x) for x in line.split()[1:]] for line in vecfile])
It doesn't sound like you're concerned about the speed of reading the files, but you can still use fromfile() or maybe fromstring() to do this. You just need to read past the text part first, then process it.
Using fromstring:
matrix = numpy.vstack([numpy.fromstring(line.split(" ", 1)[1], sep=" ") for line in vecfile])
or something like that.
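To make the two one-liners above concrete, here is a minimal sketch that runs both against an in-memory stand-in for vecfile. The sample data and the function names are assumptions for illustration only; the thread does not show the real file, just that each line starts with a text label followed by numbers.

```python
import io

import numpy as np

# Hypothetical sample: a text label, then whitespace-separated numbers.
SAMPLE = "rowA 1.0 2.0 3.0\nrowB 4.0 5.0 6.0\n"


def read_with_float(vecfile):
    # Bruno's approach: drop the first token, convert the rest with float().
    return np.array([[float(x) for x in line.split()[1:]] for line in vecfile])


def read_with_fromstring(vecfile):
    # Chris's suggestion: split off the label once, then let NumPy parse
    # the remaining text. strip() removes the trailing newline.
    rows = [
        np.fromstring(line.split(None, 1)[1].strip(), sep=" ")
        for line in vecfile
    ]
    return np.vstack(rows)


a = read_with_float(io.StringIO(SAMPLE))
b = read_with_fromstring(io.StringIO(SAMPLE))
print(a.shape, np.array_equal(a, b))
```

Both functions should produce the same 2-D float array; which one is faster on a real file would have to be measured.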
-Chris
I would strongly recommend pylab.load. It handles comments, selects columns, and is legible.
Examples from the docstring:
t,y = load('test.dat', unpack=True) # for two column data
x,y,z = load('somefile.dat', usecols=(3,5,7), unpack=True)
A more advanced example from examples/load_converter.py:
dates, closes = load('data/msft.csv', delimiter=',', converters={0:datestr2num}, skiprows=1, usecols=(0,2), unpack=True)
Devs, is there any possibility of moving/copying pylab.load to numpy? I don't see anything in the source that requires the rest of matplotlib. Among convenience functions, I think that this function ranks pretty highly in convenience.
Take care, Nick
Nick Fotopoulos wrote:
Devs, is there any possibility of moving/copying pylab.load to numpy? I don't see anything in the source that requires the rest of matplotlib. Among convenience functions, I think that this function ranks pretty highly in convenience.
I'm supportive of this. But, it can't be named numpy.load.
How about
numpy.loadtxt
numpy.savetxt
-Travis
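NumPy does provide functions under these two names today. Below is a minimal sketch of the pylab.load-style calls from Nick's message, written against numpy.loadtxt and numpy.savetxt. The in-memory sample data is made up for illustration, and the sketch says nothing about how the port from pylab was actually carried out.

```python
import io

import numpy as np

# Made-up two-column data with a comment line, standing in for 'test.dat'.
TWO_COL = "# t y\n0.0 1.0\n1.0 2.5\n2.0 4.0\n"
t, y = np.loadtxt(io.StringIO(TWO_COL), unpack=True)

# Made-up CSV with a header row; keep only columns 0 and 2.
CSV = "a,b,c\n1,10,100\n2,20,200\n"
first, third = np.loadtxt(
    io.StringIO(CSV), delimiter=",", skiprows=1, usecols=(0, 2), unpack=True
)

# Round trip: write an array as text, then read it back.
buf = io.StringIO()
np.savetxt(buf, np.column_stack([t, y]))
buf.seek(0)
restored = np.loadtxt(buf)
```

Lines starting with '#' are treated as comments by default, which matches the comment handling Nick mentions for pylab.load.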
On Thursday 19 April 2007 10:17, Travis Oliphant wrote:
Nick Fotopoulos wrote:
Devs, is there any possibility of moving/copying pylab.load to numpy? I don't see anything in the source that requires the rest of matplotlib. Among convenience functions, I think that this function ranks pretty highly in convenience.
I'm supportive of this. But, it can't be named numpy.load.
How about
numpy.loadtxt
numpy.savetxt
+1
On 4/19/07, Travis Oliphant oliphant.travis@ieee.org wrote:
Nick Fotopoulos wrote:
Devs, is there any possibility of moving/copying pylab.load to numpy? I don't see anything in the source that requires the rest of matplotlib. Among convenience functions, I think that this function ranks pretty highly in convenience.
I'm supportive of this. But, it can't be named numpy.load.
I am also +1 on this, but this functionality should be implemented in C, I think. I've just tested numpy.fromfile('name.txt', sep=' ') against pylab.load('name.txt') for a 35MB text file; the numbers are:
numpy.fromfile: 2.66 sec.
pylab.load: 16.64 sec.
Lisandro Dalcin wrote:
I am also +1 on this, but this functionality should be implemented in C, I think.
well, maybe.
I've just tested numpy.fromfile('name.txt', sep=' ') against pylab.load('name.txt') for a 35MB text file; the numbers are:
numpy.fromfile: 2.66 sec.
pylab.load: 16.64 sec.
Exactly, that's expected. fromfile is designed to do the easy cases as fast as possible; pylab.load is designed to be flexible. I'm not sure you need both the speed and the flexibility at the same time.
By the way, I haven't looked at pylab.load() for a while, but it could perhaps be sped up by using fromfile() and/or fromstring() internally. There may be some opportunity to special-case the easy ones too (i.e. all columns desired, etc.).
-Chris
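One way to picture the "special-case the easy ones" idea is sketched below. The function load_table and its parameters are hypothetical, and numpy.loadtxt stands in for the flexible loader purely as an assumption of this sketch; the thread itself is about pylab.load, and nothing here claims to be how that function works internally.

```python
import os
import tempfile

import numpy as np


def load_table(path, usecols=None, comments=None):
    """Hypothetical loader: fast path for plain numeric files."""
    if usecols is None and comments is None:
        # Easy case: every column wanted, no comments to skip.
        # fromfile returns a flat array, so count the columns on the
        # first line and reshape afterwards.
        with open(path) as fh:
            ncols = len(fh.readline().split())
        return np.fromfile(path, sep=" ").reshape(-1, ncols)
    # Flexible case: fall back to the slower, more general parser.
    return np.loadtxt(path, usecols=usecols, comments=comments)


# Small demonstration on a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 2 3\n4 5 6\n")
try:
    full = load_table(tmp.name)
    picked = load_table(tmp.name, usecols=(0, 2))
finally:
    os.remove(tmp.name)
```

The fast path assumes a rectangular file of plain numbers; anything with labels, comments, or ragged rows needs the general branch.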
I think it would be a great idea to have pylab.load in numpy. It also seems to be a lot faster than scipy.io.
One thing that is very nice about pylab.load is that it can read in dates. However, it can't, as far as I know, handle other non-float data.
I played around with Python's csv module and pylab.load for a while, resulting in a database class I posted in the cookbook section:
http://www.scipy.org/Cookbook/dbase
This class can read any type of data in a CSV file, including dates, into a dictionary; it is based on both pylab.load and the csv module. I use cPickle to store the data once it has been read in. I haven't tried PyTables but hear a lot of good things about it.
Vincent
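The general idea of reading mixed-type CSV columns into a dictionary and caching the result can be sketched as follows. This is not the cookbook dbase class: the function read_columns, the converter table, and the sample rows are all made up for illustration, and pickle is used because cPickle is the Python 2 name of the same facility.

```python
import csv
import io
import pickle
from datetime import datetime

# Made-up CSV with a date, a float and a text column.
SAMPLE = "date,close,ticker\n2007-04-18,28.60,MSFT\n2007-04-19,28.69,MSFT\n"

# Hypothetical per-column converters; unlisted columns stay as strings.
CONVERTERS = {
    "date": lambda s: datetime.strptime(s, "%Y-%m-%d").date(),
    "close": float,
}


def read_columns(fh, converters):
    # Build one list per column, keyed by the header names.
    reader = csv.DictReader(fh)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(converters.get(name, str)(value))
    return columns


table = read_columns(io.StringIO(SAMPLE), CONVERTERS)

# Cache the parsed result so later runs can skip the parsing step.
cached = pickle.dumps(table)
restored = pickle.loads(cached)
```

In practice the pickled bytes would be written to a file next to the CSV and reloaded when the CSV has not changed.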
On 4/19/07 10:58 AM, "Christopher Barker" Chris.Barker@noaa.gov wrote:
Lisandro Dalcin wrote:
I am also +1 on this, but this functionality should be implemented in C, I think.
well, maybe.
I've just tested numpy.fromfile('name.txt', sep=' ') against pylab.load('name.txt') for a 35MB text file; the numbers are:
numpy.fromfile: 2.66 sec.
pylab.load: 16.64 sec.
Exactly, that's expected. fromfile is designed to do the easy cases as fast as possible; pylab.load is designed to be flexible. I'm not sure you need both the speed and the flexibility at the same time.
By the way, I haven't looked at pylab.load() for a while, but it could perhaps be sped up by using fromfile() and/or fromstring() internally. There may be some opportunity to special-case the easy ones too (i.e. all columns desired, etc.).
-Chris
What's wrong with scipy.io.read_array?
On 19.04.2007 at 15:50, Lisandro Dalcin wrote:
On 4/19/07, Travis Oliphant oliphant.travis@ieee.org wrote:
Nick Fotopoulos wrote:
Devs, is there any possibility of moving/copying pylab.load to numpy? I don't see anything in the source that requires the rest of matplotlib. Among convenience functions, I think that this function ranks pretty highly in convenience.
I'm supportive of this. But, it can't be named numpy.load.
I am also +1 on this, but this functionality should be implemented in C, I think. I've just tested numpy.fromfile('name.txt', sep=' ') against pylab.load('name.txt') for a 35MB text file; the numbers are:
numpy.fromfile: 2.66 sec.
pylab.load: 16.64 sec.
-- Lisandro Dalcín
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
It seems to be a lot slower than pylab.load for large arrays. Also, it doesn't handle dates.
Vincent
On 4/22/07 10:33 AM, "Markus Rosenstihl" markusro@element.fkp.physik.tu-darmstadt.de wrote:
What's wrong with scipy.io.read_array?