[Numpy-discussion] fromiter
David M. Cooke
cookedm at physics.mcmaster.ca
Sat Jun 10 17:42:03 EDT 2006
On Sat, Jun 10, 2006 at 01:18:05PM -0700, Tim Hochberg wrote:
>
> I finally got around to cleaning up and checking in fromiter. As Travis
> suggested, this version does not require that you specify count. From
> the docstring:
>
> fromiter(...)
> fromiter(iterable, dtype, count=-1) returns a new 1d array
> initialized from iterable. If count is nonnegative, the new array
> will have count elements, otherwise its size is determined by the
> generator.
>
> If count is specified, it allocates the full array ahead of time. If it
> is not, it periodically reallocates space for the array, allocating 50%
> extra space each time and reallocating back to the final size at the end
> (to give realloc a chance to reclaim any extra space).
>
> Speedwise, "fromiter(iterable, dtype, count)" is about twice as fast as
> "array(list(iterable),dtype=dtype)". Omitting count slows things down by
> about 15%; still much faster than using "array(list(...))". It also is
> going to chew up more memory than if you include count, at least
> temporarily, but still should typically use much less than the
> "array(list(...))" approach.
Can this be integrated into array() so that array(iterable, dtype=dtype)
does the expected thing?
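
For reference, the calls being compared look roughly like this. This is only
a sketch against the signature in the quoted docstring; the squares()
generator is a made-up example, not anything from the checked-in code.

```python
import numpy as np

def squares(n):
    """Hypothetical generator, used only for illustration."""
    for i in range(n):
        yield i * i

n = 5

# count given: per the quoted docstring, the array is allocated up front
a = np.fromiter(squares(n), dtype=np.int64, count=n)

# count omitted (defaults to -1): size is found by exhausting the iterable
b = np.fromiter(squares(n), dtype=np.int64)

# the list-based route that fromiter is being compared against
c = np.array(list(squares(n)), dtype=np.int64)
```

All three should produce the same 1d array; only the allocation strategy
differs.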
Can you try to find the length of the iterable, with PySequence_Size() on
the original object? This gets a bit iffy, as that might not be correct
(but it could be used as a hint).
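
A Python-level sketch of that idea, assuming len() as a rough stand-in for
PySequence_Size(). The wrapper name fromiter_with_hint is hypothetical. Note
that this version passes the length as an exact count rather than as a hint,
so it is only safe when the reported length is actually right.

```python
import numpy as np

def fromiter_with_hint(iterable, dtype):
    """Hypothetical wrapper: pass len(iterable) as count when available."""
    try:
        # rough Python-level analogue of calling PySequence_Size()
        count = len(iterable)
    except TypeError:
        # generators and other plain iterators have no length
        count = -1
    return np.fromiter(iterable, dtype=dtype, count=count)

from_list = fromiter_with_hint([1, 2, 3], np.int64)
from_gen = fromiter_with_hint((x for x in range(4)), np.int64)
```

A true hint would instead preallocate to the reported length and still fall
back to growing or shrinking the buffer if the iterable disagrees.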
What about iterables that return, say, tuples? Maybe add a shape argument,
so that fromiter(iterable, dtype, count, shape=(None, 3)) expects elements
from iterable that can be turned into arrays of shape (3,)? That could
replace count, too.
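
The shape argument above is only a proposal. As a sketch of what it would
save, here is one way to get the same result today by flattening the tuples
and reshaping afterwards; points() is a hypothetical iterable of 3-tuples.

```python
from itertools import chain

import numpy as np

def points():
    """Hypothetical iterable yielding 3-tuples."""
    for i in range(4):
        yield (i, 2 * i, 3 * i)

# flatten the tuples into one stream of scalars, then restore the rows
flat = np.fromiter(chain.from_iterable(points()), dtype=np.float64)
arr = flat.reshape(-1, 3)
```

With shape=(None, 3), the flattening and the reshape would both be handled
inside fromiter.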
--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca