Devil's advocate here: np.array() has become the de-facto "constructor" for numpy arrays. Right now, passing it a generator results in what, IMHO, is a useless result:

>>> np.array((i for i in range(10)))

array(<generator object <genexpr> at 0x7f28b2beca00>, dtype=object)

Passing pretty much any dtype argument will cause that to fail:

>>> np.array((i for i in range(10)), dtype=np.int_)

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

TypeError: long() argument must be a string or a number, not 'generator'

Therefore, I think it is not out of the realm of reason that passing a generator object together with a dtype could delegate the work under the hood to np.fromiter(). I would even go so far as to raise an error when a generator is passed to np.array() without specifying a dtype. The point is to reduce the number of entry points for creating numpy arrays.
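For reference, a quick sketch of the delegation being proposed here -- np.fromiter already consumes a generator happily once a dtype is supplied, so np.array(gen, dtype=...) would just have to forward to it:

```python
import numpy as np

# np.fromiter handles a generator directly when given a dtype;
# np.array(gen, dtype=...) could delegate to this under the hood.
arr = np.fromiter((i for i in range(10)), dtype=np.int_)
print(arr)  # [0 1 2 3 4 5 6 7 8 9]
```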

By the way, any reason why this works?

>>> np.array(xrange(10))

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


Cheers!

Ben Root

On Sat, Dec 12, 2015 at 6:02 PM, Juan Nunez-Iglesias <jni.soma@gmail.com> wrote:

Hey Nathaniel,

Fascinating! Thanks for the primer! I didn't know that it would check the dtype of values in the whole array. In that case, I would agree that it would be bad to infer it magically from just the first value, and this can be left to the users.

Thanks!

Juan.

On Sat, Dec 12, 2015 at 7:00 PM, Nathaniel Smith <njs@pobox.com> wrote:

On Fri, Dec 11, 2015 at 11:32 PM, Juan Nunez-Iglesias

<jni.soma@gmail.com> wrote:

> Nathaniel,

>

>> IMO this is better than making np.array(iter) internally call list(iter)

>> or equivalent

>

> Yeah but that's not the only option:

>

> from itertools import chain
> def fromiter_awesome_edition(iterable):
>     elem = next(iterable)
>     dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem)
>     return np.fromiter(chain([elem], iterable), dtype=dtype)

>

> I think this would be a huge win for usability. Always getting tripped up by

> the dtype requirement. I can submit a PR if people like this pattern.
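A runnable rendering of the quoted sketch, for anyone who wants to try it -- the dtype of the first element (via np.asarray) stands in for the "whatever_numpy_does_to_infer_dtypes_from_lists" placeholder:

```python
import numpy as np
from itertools import chain

def fromiter_first_elem(iterable):
    it = iter(iterable)
    elem = next(it)
    # Infer the dtype from the first element only, then put it back
    # in front of the remaining items before handing off to fromiter.
    dtype = np.asarray(elem).dtype
    return np.fromiter(chain([elem], it), dtype=dtype)

print(fromiter_first_elem(i for i in range(5)))  # [0 1 2 3 4]
```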

This isn't the semantics of np.array, though -- np.array will look at

the whole input and try to find a common dtype, so this can't be the

implementation for np.array(iter). E.g. try np.array([1, 1.0])
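A quick illustration of that whole-input scan: the int and the float are promoted together, which first-element inference would miss:

```python
import numpy as np

# np.array looks at every element to find a common dtype;
# here the int 1 is promoted alongside the float 1.0.
a = np.array([1, 1.0])
print(a.dtype)  # float64
```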

I can see an argument for making the dtype= argument to fromiter

optional, with a warning in the docs that it will guess based on the

first element and that you should specify it if you don't want that.

It seems potentially a bit error prone (in the sense that it might

make it easier to end up with code that works great when you test it

but then breaks later when something unexpected happens), but maybe

the usability outweighs that. I don't use fromiter myself so I don't

have a strong opinion.

> btw, I think np.array(['f', 'o', 'o']) would be exactly the expected result

> for np.array('foo'), but I guess that's just me.

In general np.array(thing_that_can_go_inside_an_array) returns a

zero-dimensional (scalar) array -- np.array(1), np.array(True), etc.

all work like this, so I'd expect np.array("foo") to do the same.
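A quick check of that behavior -- each scalar-like input, including a string, becomes a zero-dimensional array:

```python
import numpy as np

# Things that can go inside an array become 0-d arrays themselves.
for x in (1, True, "foo"):
    a = np.array(x)
    print(repr(x), a.ndim, a.shape)  # ndim 0, shape () for each
```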

-n

--

Nathaniel J. Smith -- http://vorpus.org

_______________________________________________

NumPy-Discussion mailing list

NumPy-Discussion@scipy.org

https://mail.scipy.org/mailman/listinfo/numpy-discussion
