Devil's advocate here: np.array() has become the de facto "constructor" for numpy arrays. Right now, passing it a generator results in what, IMHO, is a useless result:
np.array((i for i in range(10)))
array(<generator object <genexpr> at 0x7f28b2beca00>, dtype=object)
Passing pretty much any dtype argument will cause that to fail:
np.array((i for i in range(10)), dtype=np.int_)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: long() argument must be a string or a number, not 'generator'
Therefore, I think it is not out of the realm of reason that passing a generator object and a dtype could delegate the work under the hood to np.fromiter(). I would even go so far as to raise an error if one passes a generator to np.array() without specifying a dtype. The point is to reduce the number of entry points for creating numpy arrays. Something like the sketch below is what I have in mind.
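(A rough sketch only -- the name array_from_gen is made up, and this is a wrapper to illustrate the idea, not a patch to np.array itself:)

import types
import numpy as np

def array_from_gen(obj, dtype=None):
    # Proposed behaviour: delegate generators to np.fromiter, and
    # refuse a generator when no dtype is given.
    if isinstance(obj, types.GeneratorType):
        if dtype is None:
            raise TypeError("a dtype is required when passing a generator")
        return np.fromiter(obj, dtype=dtype)
    return np.array(obj, dtype=dtype)

array_from_gen((i for i in range(10)), dtype=np.int_)
# -> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])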
By the way, any reason why this works?

np.array(xrange(10))
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Cheers!
Ben Root

On Sat, Dec 12, 2015 at 6:02 PM, Juan Nunez-Iglesias <jni.soma@gmail.com> wrote:
Hey Nathaniel,
Fascinating! Thanks for the primer! I didn't know that it would check the dtype of the values in the whole array. In that case, I would agree that it would be bad to infer it magically from just the first value, and this can be left to the users.
Thanks!
Juan.
On Sat, Dec 12, 2015 at 7:00 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Fri, Dec 11, 2015 at 11:32 PM, Juan Nunez-Iglesias <jni.soma@gmail.com> wrote:
> Nathaniel,
>
>> IMO this is better than making np.array(iter) internally call list(iter)
>> or equivalent
>
> Yeah but that's not the only option:
>
> from itertools import chain
>
> def fromiter_awesome_edition(iterable):
>     elem = next(iterable)
>     dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem)
>     return np.fromiter(chain([elem], iterable), dtype=dtype)
>
> I think this would be a huge win for usability. Always getting tripped up by the dtype requirement. I can submit a PR if people like this pattern.
This isn't the semantics of np.array, though -- np.array will look at the whole input and try to find a common dtype, so this can't be the implementation for np.array(iter). E.g. try np.array([1, 1.0])
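For the record, here's roughly what that example gives (exact repr spacing varies across NumPy versions) -- the int is upcast so everything fits a common float dtype:

np.array([1, 1.0])
array([ 1.,  1.])
np.array([1, 1.0]).dtype
dtype('float64')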
I can see an argument for making the dtype= argument to fromiter optional, with a warning in the docs that it will guess based on the first element and that you should specify it if you don't want that. It seems potentially a bit error prone (in the sense that it might make it easier to end up with code that works great when you test it but then breaks later when something unexpected happens), but maybe the usability outweighs that. I don't use fromiter myself so I don't have a strong opinion.
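To make the "works great when you test it but breaks later" worry concrete, here is a sketch of first-element inference (np.array([elem]).dtype stands in for whatever the real inference step would be). If I'm reading the casting behaviour right, a wider element arriving later gets silently narrowed rather than upcast:

import numpy as np
from itertools import chain

def fromiter_first_elem(iterable):
    # hypothetical helper: infer the dtype from the first element only
    it = iter(iterable)
    elem = next(it)
    dtype = np.array([elem]).dtype
    return np.fromiter(chain([elem], it), dtype=dtype)

fromiter_first_elem(x for x in [1.0, 2, 3])
# -> array([ 1.,  2.,  3.])   first element is a float, so this looks fine
fromiter_first_elem(x for x in [1, 2.5, 3])
# -> array([1, 2, 3])         first element is an int, so 2.5 is truncated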
> btw, I think np.array(['f', 'o', 'o']) would be exactly the expected result for np.array('foo'), but I guess that's just me.
In general np.array(thing_that_can_go_inside_an_array) returns a zero-dimensional (scalar) array -- np.array(1), np.array(True), etc. all work like this, so I'd expect np.array("foo") to do the same.
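i.e. something like this (the string dtype shown, '|S3', is what Python 2 / NumPy of that era prints; newer versions show a unicode dtype instead):

np.array(1).shape
()
np.array("foo")
array('foo', dtype='|S3')
np.array("foo").shape
()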
-n
--
Nathaniel J. Smith -- http://vorpus.org