
I find myself writing things like

    x = []; y = []; t = []
    for line in open(filename).readlines():
        xstr, ystr, tstr = line.split()
        x.append(float(xstr))
        y.append(float(ystr))
        t.append(dateutil.parser.parse(tstr))  # or something similar
    x = asarray(x)
    y = asarray(y)
    t = asarray(t)

I think it would be nice to be able to create empty arrays, and append the values onto the end as I loop through the file without creating the intermediate list. Is this reasonable? Is there a way to do this with existing methods or functions that I am missing? Is there a better way altogether?

-Rob.

-----
Rob Hetland, Assistant Professor
Dept of Oceanography, Texas A&M University
p: 979-458-0096, f: 979-845-6331
e: hetland@tamu.edu, w: http://pong.tamu.edu

Robert Hetland wrote:
> I find myself writing things like
>
>     x = []; y = []; t = []
>     for line in open(filename).readlines():
>         xstr, ystr, tstr = line.split()
>         x.append(float(xstr))
>         y.append(float(ystr))
>         t.append(dateutil.parser.parse(tstr))  # or something similar
>     x = asarray(x)
>     y = asarray(y)
>     t = asarray(t)
>
> I think it would be nice to be able to create empty arrays, and append the values onto the end as I loop through the file without creating the intermediate list. Is this reasonable?

Not in the core array object, no. We can't make the underlying pointer point to something else (which is what happens when the whole memory block is reallocated to add an item to the array) without invalidating all of the views on that array. This is also the reason that numpy arrays can't use the standard library's array module as their storage. That said:

> Is there a way to do this with existing methods or functions that I am missing? Is there a better way altogether?

We've done performance tests before. The fastest way that I've found is to use the stdlib array module to accumulate values (it uses the same preallocation strategy that Python lists use, and you can't create views from them, so you are always safe) and then create the numpy array using fromstring on that object (stdlib arrays obey the buffer protocol, so they will be treated like strings of binary data). I posted timings one or two or three years ago on one of the scipy lists.

However, lists are fine if you don't need blazing speed/low memory usage.

--
Robert Kern
robert.kern@gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
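
A minimal sketch of the accumulate-then-convert approach described above, written against current NumPy rather than the 2006 API. It assumes np.frombuffer in place of fromstring (whose binary mode has since been deprecated). The helper name read_xy and the two-column input are hypothetical and only there for illustration.

```python
from array import array

import numpy as np


def read_xy(lines):
    """Accumulate two float columns in stdlib arrays, then convert to NumPy.

    `read_xy` and the two-column layout are illustrative assumptions.
    """
    # Typecode 'd' is a C double, which matches NumPy's float64.
    x = array('d')
    y = array('d')
    for line in lines:
        xstr, ystr = line.split()
        x.append(float(xstr))
        y.append(float(ystr))
    # frombuffer reads the raw bytes without copying; .copy() then detaches
    # the result from the stdlib array so the two no longer share memory.
    return (np.frombuffer(x, dtype=np.float64).copy(),
            np.frombuffer(y, dtype=np.float64).copy())


xs, ys = read_xy(["1.0 2.0", "3.5 4.5", "5.0 6.0"])
```

The copy costs one extra pass over the data, but it means the stdlib arrays can be discarded or reused afterwards without affecting the NumPy results.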

On 4/21/06, Robert Hetland <hetland@tamu.edu> wrote:
> [...] I think it would be nice to be able to create empty arrays, and append the values onto the end as I loop through the file without creating the intermediate list. Is this reasonable? Is there a way to do this with existing methods or functions that I am missing? Is there a better way altogether?

Numpy arrays cannot grow in place because there is no way for an array to tell if its data is shared with other arrays. You can use Python's standard library arrays instead of lists:

    >>> from numpy import *
    >>> import array as a
    >>> x = a.array('i',[])
    >>> x.append(1)
    >>> x.append(2)
    >>> x.append(3)
    >>> ndarray(len(x), dtype=int, buffer=x)
    array([1, 2, 3])

Note that data is not copied:

    >>> ndarray(len(x), dtype=int, buffer=x)[1] = 20
    >>> x
    array('i', [1, 20, 3])
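
On many current 64-bit platforms dtype=int is 8 bytes wide while typecode 'i' is usually 4, so the element sizes in a session like the one above may not match. A hedged sketch of the same zero-copy idea that takes the dtype from the stdlib array's own typecode follows. The variable names are illustrative, and np.frombuffer is an assumption about what is convenient today, not what the post used.

```python
from array import array

import numpy as np

x = array('i', [1, 2, 3])

# NumPy accepts the same single-character code 'i' (C int), so the
# element size matches the stdlib array whatever the platform.
view = np.frombuffer(x, dtype=x.typecode)

# The NumPy array is a window onto the stdlib array's memory: no copy,
# so a write through the view is visible in x.
view[1] = 20
print(x.tolist())
```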

Hi,

On 4/21/06, Robert Hetland <hetland@tamu.edu> wrote:
> I find myself writing things like
>
>     x = []; y = []; t = []
>     for line in open(filename).readlines():
>         xstr, ystr, tstr = line.split()
>         x.append(float(xstr))
>         y.append(float(ystr))
>         t.append(dateutil.parser.parse(tstr))  # or something similar
>     x = asarray(x)
>     y = asarray(y)
>     t = asarray(t)

I think you can read the ascii file directly into an array with numeric conversions (fromfile) then just reshape it to have x,y,z columns. For example:

    $[charris@E011704 ~]$ cat input.txt
    1 2 3
    4 5 6
    7 8 9

Then after importing numpy into ipython:

    In [6]:fromfile('input.txt',sep=' ').reshape(-1,3)
    Out[6]:
    array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]])

Chuck
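
A self-contained sketch of the fromfile approach, assuming a whitespace-separated file of purely numeric columns (a date column such as t would still need separate parsing). The temporary file is a stand-in for input.txt.

```python
import os
import tempfile

import numpy as np

# Write a small stand-in for input.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("1 2 3\n4 5 6\n7 8 9\n")
    path = f.name

try:
    # With sep=' ', fromfile parses text and treats any run of whitespace
    # (including newlines) as a separator; the default dtype is float.
    data = np.fromfile(path, sep=' ').reshape(-1, 3)
finally:
    os.remove(path)

x, y, z = data.T  # one 1-D array per column
```

Because fromfile returns a flat array, the reshape(-1, 3) step is what recovers the three columns; it will raise if the number of values is not a multiple of three.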