[Numpy-discussion] FeatureRequest: support for array construction from iterators

Nathaniel Smith njs at pobox.com
Fri Dec 11 18:12:00 EST 2015

Constructing an array from an iterator is fundamentally different from
constructing an array from an in-memory data structure like a list,
because in the iterator case it's necessary to either use a
single-pass algorithm or else create extra temporary buffers that
cause much higher memory overhead. (Which is undesirable given that
iterators are mostly used exactly in the case where one wants to
reduce memory overhead.)

np.fromiter requires the dtype= argument because this is necessary if
you want to construct the array in a single pass.
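
To make the single-pass point concrete, here is a minimal sketch. The generator name squares is made up for illustration; np.fromiter and its dtype= and count= parameters are the real API.

```python
import numpy as np

def squares(n):
    # A generator: values are produced one at a time and never held in a list.
    for i in range(n):
        yield i * i

# dtype is mandatory; count is optional and lets NumPy preallocate the output.
a = np.fromiter(squares(5), dtype=np.int64, count=5)

# Omitting dtype fails immediately instead of guessing from the data.
try:
    np.fromiter(squares(5))
    dtype_required = False
except TypeError:
    dtype_required = True
```

With count given, the output buffer can be allocated once up front; without it, the array has to grow as items arrive.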

np.array(list(iter)) can avoid the dtype argument, because it creates
that large memory buffer. IMO this is better than making
np.array(iter) internally call list(iter) or equivalent, because the
workaround (adding an explicit call to list()) is trivial, while also
making it obvious to the user what the actual cost of their request
is. (Explicit is better than implicit.)
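
A small sketch of the explicit workaround, with a made-up generator expression as input:

```python
import numpy as np

gen = (x / 2 for x in range(4))

# The explicit list() call is where the full temporary copy gets built.
materialized = list(gen)

# With every element available, np.array can infer the dtype by itself.
b = np.array(materialized)
```

Here the memory cost shows up as one visible line of user code, not as something hidden inside np.array.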

In addition, the proposed API has a number of infelicities:
- We're generally trying to *reduce* the magic in functions like
np.array (e.g. the discussions of having less magic for lists with
mismatched numbers of elements, or non-list sequences)
- There's a strong convention in Python that when making a function like
np.array generic, it should accept any iter*able* rather than just any
iter*ator*. But it would be super confusing if np.array({1: 2})
returned array([1]), or if array("foo") returned array(["f", "o",
"o"]), so we don't actually want to handle all iterables the same.
It's somewhat dubious even for iterators (e.g. someone might want to
create an object array containing an iterator...)...
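
For reference, a quick sketch of what np.array does with these inputs as far as I know (it may vary between NumPy versions): each one is treated as a single scalar-like object rather than being iterated.

```python
import numpy as np

d = np.array({1: 2})             # 0-d object array holding the dict itself
s = np.array("foo")              # 0-d string array, not three characters
it = np.array(iter([1, 2, 3]))   # 0-d object array holding the iterator

shapes = (d.shape, s.shape, it.shape)
```

All three come back with shape (), so making np.array consume iterators would change what the last call means.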

hope that helps,

On Fri, Dec 11, 2015 at 2:27 PM, Stephan Sahm <Stephan.Sahm at gmx.de> wrote:
> numpy.fromiter is not numpy.array, nor does it work like
> numpy.array(list(...)), as the dtype argument is required.
> Is there a reason why np.array(...) should not work on iterators? I have
> the feeling that such requests get (repeatedly) dismissed, but so far I
> haven't found a compelling argument for leaving this feature missing (as a
> reminder, it is already implemented in a branch).
> Please let me know if you know about an argument,
> best,
> Stephan
> On 27 November 2015 at 14:18, Alan G Isaac <alan.isaac at gmail.com> wrote:
>> On 11/27/2015 5:37 AM, Stephan Sahm wrote:
>>> I like to request a generator/iterator support for np.array(...) as far
>>> as list(...) supports it.
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html
>> hth,
>> Alan Isaac
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion

Nathaniel J. Smith -- http://vorpus.org
