Nathaniel,
IMO this is better than making np.array(iter) internally call list(iter) or equivalent
Yeah but that's not the only option:

    from itertools import chain

    def fromiter_awesome_edition(iterable):
        elem = next(iterable)
        dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem)
        return np.fromiter(chain([elem], iterable), dtype=dtype)

I think this would be a huge win for usability. Always getting tripped up by the dtype requirement. I can submit a PR if people like this pattern.

btw, I think np.array(['f', 'o', 'o']) would be exactly the expected result for np.array('foo'), but I guess that's just me.

Juan.

On Sat, Dec 12, 2015 at 10:12 AM, Nathaniel Smith <njs@pobox.com> wrote:
Constructing an array from an iterator is fundamentally different from constructing an array from an in-memory data structure like a list, because in the iterator case it's necessary to either use a single-pass algorithm or else create extra temporary buffers that cause much higher memory overhead. (Which is undesirable given that iterators are mostly used exactly in the case where one wants to reduce memory overhead.)
np.fromiter requires the dtype= argument because this is necessary if you want to construct the array in a single pass.
np.array(list(iter)) can avoid the dtype argument, because it creates that large memory buffer. IMO this is better than making np.array(iter) internally call list(iter) or equivalent, because the workaround (adding an explicit call to list()) is trivial, while also making it obvious to the user what the actual cost of their request is. (Explicit is better than implicit.)
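As a rough illustration of the trade-off described above (a sketch, not code from the thread; the generator `squares` is a made-up helper for this example), the two approaches might look like this, along with what happens when a generator is handed straight to np.array:

```python
import numpy as np


def squares(n):
    """Hypothetical generator, used only for this illustration."""
    for i in range(n):
        yield i * i


# Single pass with no intermediate list, but the dtype must be given up front.
a = np.fromiter(squares(5), dtype=np.int64)

# The dtype is inferred, at the cost of materializing a full Python list first.
b = np.array(list(squares(5)))

# Passing the generator directly does not iterate it: the result is a
# 0-dimensional object array that simply wraps the generator.
c = np.array(squares(5))
```

The explicit list() call in the second form is the "trivial workaround" mentioned above; it makes the temporary buffer visible at the call site.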
In addition, the proposed API has a number of infelicities:

- We're generally trying to *reduce* the magic in functions like np.array (e.g. the discussions of having less magic for lists with mismatched numbers of elements, or non-list sequences).
- There's a strong convention in Python that when making a function like np.array generic, it should accept any iter*able* rather than any iter*ator*. But it would be super confusing if np.array({1: 2}) returned array([1]), or if array("foo") returned array(["f", "o", "o"]), so we don't actually want to handle all iterables the same. It's somewhat dubious even for iterators (e.g. someone might want to create an object array containing an iterator...)...
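To make the dict and string cases concrete, here is a small sketch (not from the thread; the variable names are chosen for illustration) contrasting what np.array does with these inputs, as far as I know, against what treating every iterable alike would give:

```python
import numpy as np

# np.array treats a dict and a str as single scalars, not as sequences.
d = np.array({1: 2})   # 0-d object array holding the dict itself
s = np.array("foo")    # 0-d string array holding "foo"

# Iterating them first, as a generic "accept any iterable" rule would imply,
# gives quite different results: only the dict keys, and separate characters.
d_iter = np.array(list({1: 2}))
s_iter = np.array(list("foo"))
```

Here d and s are 0-dimensional, while d_iter and s_iter are 1-dimensional arrays of the keys and of the individual characters respectively.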
hope that helps, -n
On Fri, Dec 11, 2015 at 2:27 PM, Stephan Sahm <Stephan.Sahm@gmx.de> wrote:
numpy.fromiter is not numpy.array, nor does it work like numpy.array(list(...)), since the dtype argument is required
Is there a reason why np.array(...) should not work on iterators? I have the feeling that such requests get (repeatedly) dismissed, but so far I haven't found a compelling argument for leaving this feature out (as a reminder, it is already implemented in a branch)
Please let me know if you know about an argument, best, Stephan
On 27 November 2015 at 14:18, Alan G Isaac <alan.isaac@gmail.com> wrote:
On 11/27/2015 5:37 AM, Stephan Sahm wrote:
I would like to request generator/iterator support for np.array(...), to the extent that list(...) supports it.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html
hth,
Alan Isaac
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
-- Nathaniel J. Smith -- http://vorpus.org