Devil's advocate here: np.array() has become the de facto "constructor" for numpy arrays. Right now, passing it a generator results in what, IMHO, is a useless result:
np.array((i for i in range(10)))
array(<generator object <genexpr> at 0x7f28b2beca00>, dtype=object)
Passing pretty much any dtype argument will cause that to fail:
np.array((i for i in range(10)), dtype=np.int_)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: long() argument must be a string or a number, not 'generator'
Therefore, I think it is not out of the realm of reason that passing a generator object and a dtype could delegate the work under the hood to np.fromiter(). I would even go so far as to raise an error if one passes a generator to np.array() without specifying a dtype. The point is to reduce the number of entry points for creating numpy arrays. Something like the sketch below is what I have in mind.
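(A rough sketch only -- the name array_from_gen is made up, and this is a wrapper to illustrate the idea, not a patch to np.array itself:)

import types
import numpy as np

def array_from_gen(obj, dtype=None):
    # Proposed behaviour: delegate generators to np.fromiter, and
    # refuse a generator when no dtype is given.
    if isinstance(obj, types.GeneratorType):
        if dtype is None:
            raise TypeError("a dtype is required when passing a generator")
        return np.fromiter(obj, dtype=dtype)
    return np.array(obj, dtype=dtype)

array_from_gen((i for i in range(10)), dtype=np.int_)
# -> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])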
By the way, any reason why this works?

np.array(xrange(10))
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Cheers!
Ben Root

On Sat, Dec 12, 2015 at 6:02 PM, Juan Nunez-Iglesias <jni.soma@gmail.com> wrote:
Hey Nathaniel,
Fascinating! Thanks for the primer! I didn't know that it would check the dtype of the values in the whole array. In that case, I would agree that it would be bad to infer it magically from just the first value, and this can be left to the users.
Thanks!
Juan.
On Sat, Dec 12, 2015 at 7:00 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Fri, Dec 11, 2015 at 11:32 PM, Juan Nunez-Iglesias <jni.soma@gmail.com> wrote:
> Nathaniel,
>
>> IMO this is better than making np.array(iter) internally call list(iter)
>> or equivalent
>
> Yeah but that's not the only option:
>
> from itertools import chain
>
> def fromiter_awesome_edition(iterable):
>     elem = next(iterable)
>     dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem)
>     return np.fromiter(chain([elem], iterable), dtype=dtype)
>
> I think this would be a huge win for usability. Always getting tripped up by the dtype requirement. I can submit a PR if people like this pattern.
This isn't the semantics of np.array, though -- np.array will look at the whole input and try to find a common dtype, so this can't be the implementation for np.array(iter). E.g. try np.array([1, 1.0])
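For the record, here's roughly what that example gives (exact repr spacing varies across NumPy versions) -- the int is upcast so everything fits a common float dtype:

np.array([1, 1.0])
array([ 1.,  1.])
np.array([1, 1.0]).dtype
dtype('float64')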
I can see an argument for making the dtype= argument to fromiter optional, with a warning in the docs that it will guess based on the first element and that you should specify it if you don't want that. It seems potentially a bit error prone (in the sense that it might make it easier to end up with code that works great when you test it but then breaks later when something unexpected happens), but maybe the usability outweighs that. I don't use fromiter myself so I don't have a strong opinion.
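To make the "works great when you test it but breaks later" worry concrete, here is a sketch of first-element inference (np.array([elem]).dtype stands in for whatever the real inference step would be). If I'm reading the casting behaviour right, a wider element arriving later gets silently narrowed rather than upcast:

import numpy as np
from itertools import chain

def fromiter_first_elem(iterable):
    # hypothetical helper: infer the dtype from the first element only
    it = iter(iterable)
    elem = next(it)
    dtype = np.array([elem]).dtype
    return np.fromiter(chain([elem], it), dtype=dtype)

fromiter_first_elem(x for x in [1.0, 2, 3])
# -> array([ 1.,  2.,  3.])   first element is a float, so this looks fine
fromiter_first_elem(x for x in [1, 2.5, 3])
# -> array([1, 2, 3])         first element is an int, so 2.5 is truncated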
> btw, I think np.array(['f', 'o', 'o']) would be exactly the expected result for np.array('foo'), but I guess that's just me.
In general np.array(thing_that_can_go_inside_an_array) returns a zero-dimensional (scalar) array -- np.array(1), np.array(True), etc. all work like this, so I'd expect np.array("foo") to do the same.
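i.e. something like this (the string dtype shown, '|S3', is what Python 2 / NumPy of that era prints; newer versions show a unicode dtype instead):

np.array(1).shape
()
np.array("foo")
array('foo', dtype='|S3')
np.array("foo").shape
()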
-n
--
Nathaniel J. Smith -- http://vorpus.org