tim.hochberg at cox.net
Fri Jun 2 23:15:33 EDT 2006
Some time ago some people, myself included, were making some noise
about having 'array' iterate over iterable objects, producing ndarrays in
a manner analogous to the way sequences are treated. I finally got
around to looking at it seriously and came to the following three
conclusions:
1. All I really care about is the 1D case where dtype is specified.
This case should be relatively easy to implement so that it's
fast. Most other cases are not likely to be particularly faster
than converting the iterators to lists at the Python level and
then passing those lists to array.
2. 'array' already has plenty of special cases. I'm reluctant to add
more.
3. Adding this to 'array' would be non-trivial. The more cases we
tried to make fast, the more likely that some of the paths would
be buggy. Regardless of how we did it though, some cases would be
much slower than others, which would probably be surprising.
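
For the cases outside point 1, the Python-level fallback mentioned there might look like the sketch below. The 'rows' generator is a made-up stand-in for a non-1D iterable; it is only an illustration, not part of any proposal.

```python
import numpy as np

def rows():
    # Hypothetical iterator of iterators, standing in for a non-1D input.
    for i in range(3):
        yield (i * 10 + j for j in range(4))

# Materialize every level as a list at the Python level,
# then pass the nested lists to array as usual.
nested = [list(row) for row in rows()]
a = np.array(nested, dtype=float)
```

Here 'a' ends up with shape (3, 4), exactly as if the nested lists had been written out by hand.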
So, with that in mind, I retreated a little and implemented the simplest
thing that did the stuff that I cared about:
fromiter(iterable, dtype, count) => ndarray of type dtype and length
count
This is essentially the same interface as fromstring except that the
values of dtype and count are always required. Some primitive
benchmarking indicates that 'fromiter(generator, dtype, count)' is about
twice as fast as 'array(list(generator))' for medium to large arrays.
When producing very large arrays, the advantage of fromiter is larger,
presumably because 'list(generator)' causes things to start swapping.
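
A minimal sketch of the two approaches being compared, assuming fromiter as it exists in current NumPy (where count is an optional keyword rather than required as described here). The 'squares' generator and the size n are made up for illustration, and no timings are claimed.

```python
import numpy as np

def squares(n):
    # Hypothetical generator used only for this example.
    for i in range(n):
        yield i * i

n = 1000

# Fill the array directly from the iterator; passing count lets the
# output be allocated once up front.
a = np.fromiter(squares(n), dtype=np.int64, count=n)

# The alternative: build an intermediate Python list first.
b = np.array(list(squares(n)), dtype=np.int64)
```

Both calls produce the same 1D array; the difference is only whether an intermediate list of Python objects is created along the way.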
Anyway I'm about to bail out of town till the middle of next week, so
it'll be a while till I can get it clean enough to submit in some form
or another. Plenty of time for people to think of why it's a terrible
idea.