[Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

josef.pktd at gmail.com josef.pktd at gmail.com
Sat Feb 13 21:48:31 EST 2016


On Sat, Feb 13, 2016 at 9:43 PM, <josef.pktd at gmail.com> wrote:

>
>
> On Sat, Feb 13, 2016 at 8:57 PM, Antony Lee <antony.lee at berkeley.edu>
> wrote:
>
>> Compare (on Python3 -- for Python2, read "xrange" instead of "range"):
>>
>> In [2]: %timeit np.array(range(1000000), np.int64)
>> 10 loops, best of 3: 156 ms per loop
>>
>> In [3]: %timeit np.arange(1000000, dtype=np.int64)
>> 1000 loops, best of 3: 853 µs per loop
>>
>>
>> Note that while iterating over a range is not very fast, it is still much
>> better than the array creation:
>>
>> In [4]: from collections import deque
>>
>> In [5]: %timeit deque(range(1000000), 1)
>> 10 loops, best of 3: 25.5 ms per loop
>>
>>
>> On one hand, special cases are awful. On the other hand, the range
>> builtin is probably important enough to deserve a special case to make this
>> construction faster. Or not? I initially opened this as
>> https://github.com/numpy/numpy/issues/7233 but it was suggested there
>> that this should be discussed on the ML first.
>>
>> (The real issue which prompted this suggestion: I was building sparse
>> matrices using scipy.sparse.csc_matrix with some indices specified using
>> range, and that construction step turned out to take a significant portion
>> of the time because of the calls to np.array).
>>
>
>
> IMO: I don't see a reason why this should be supported. There is np.arange
> after all for this use case, and np.fromiter.
> range and the other guys are lazy iterables, and in several cases we can use
> larange = list(range(...)) as a shortcut to get a Python list, for Python 2/3
> compatibility.
>
> I think this might be partially a learning effect in the python 2 to 3
> transition. After using almost only python 3 for maybe a year, I don't
> think it's difficult to remember the differences when writing code that is
> py 2.7 and py 3.x compatible.
>
>
> It's just **another** thing to watch out for if milliseconds matter in
> your application.
>
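For reference, the alternatives mentioned above can be put side by side. This is only a minimal sketch: the size n is a small illustrative value (the timings quoted earlier used 1000000), and no timings are claimed here.

```python
import numpy as np

n = 1000  # illustrative size only, not the value used in the quoted timings

# Direct construction; no Python-level iteration is involved.
a = np.arange(n, dtype=np.int64)

# Consume an iterable item by item; passing count lets NumPy
# preallocate the output array.
b = np.fromiter(range(n), dtype=np.int64, count=n)

# Materialize a plain list first (works the same on Python 2 and 3),
# then convert it.
larange = list(range(n))
c = np.array(larange, dtype=np.int64)

# The construction discussed in this thread: pass the range object itself.
d = np.array(range(n), dtype=np.int64)

# All four routes produce the same array contents.
assert np.array_equal(a, b)
assert np.array_equal(a, c)
assert np.array_equal(a, d)
```

Wrapping each construction in %timeit is the way to see how they compare on a given machine.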


side question: Is there a simple way to distinguish an iterator or generator
from an iterable data structure?
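
One possible check, as a sketch rather than a definitive answer: the abstract base classes in collections.abc (Python 3; on Python 2 the same classes live in collections) separate the two cases. The helper name is_one_shot is made up for this illustration.

```python
from collections.abc import Iterable, Iterator


def is_one_shot(obj):
    """Return True if obj is an iterator or generator.

    Iterators return themselves from iter(), so they can only be
    consumed once; containers and range objects hand out a fresh
    iterator on every call to iter().
    """
    return isinstance(obj, Iterator)


r = range(5)

print(is_one_shot(r))                   # False: range can be iterated repeatedly
print(is_one_shot(iter(r)))             # True: an iterator over the range
print(is_one_shot(x * 2 for x in r))    # True: generators are iterators
print(is_one_shot([1, 2, 3]))           # False: a list is a container
print(isinstance(r, Iterable))          # True: all of these are iterable
```

An equivalent duck-typed test is iter(obj) is obj, which holds for iterators and generators but not for lists or range objects.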

Josef



>
> Josef
>
>
>>
>> Antony
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>