[New-bugs-announce] [issue24138] Speed up range() by caching and modifying long objects

Larry Hastings report at bugs.python.org
Wed May 6 18:18:56 CEST 2015


New submission from Larry Hastings:

This probably shouldn't be checked in.  But it was an interesting experiment, and I did get it to work.

My brother forwarded me this question from Stack Overflow:
    http://stackoverflow.com/questions/23453133/is-there-a-reason-python-3-enumerates-slower-than-python-2

The debate brought up a good point: a lot of the overhead of range() is in creating and destroying the long objects.  I wondered, could I get rid of that?  Long story short, yes.

rangeiterobject is a special-case range iterator object, used when all the values being iterated over fit inside C longs.  (Otherwise range() has to fall back to the much slower general-purpose longrangeiterobject.)  rangeiter_next is simple: it computes the new value, then returns PyLong_FromLong of that value.
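
For reference, the existing fast path looks roughly like this (a simplified sketch of the code in Objects/rangeobject.c, shown here for context; every call allocates a brand-new long object):

    static PyObject *
    rangeiter_next(rangeiterobject *r)
    {
        if (r->index < r->len)
            /* unsigned arithmetic sidesteps signed overflow in the intermediate step */
            return PyLong_FromLong((long)(r->start +
                                          (unsigned long)(r->index++) * r->step));
        return NULL;  /* exhausted; the caller raises StopIteration */
    }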

First thought: cache a reference to the previous value.  If its reference count is 1, we have the only reference.  Overwrite its value and return it.  But that doesn't help in the general case, because in "for x in range(1000)", x is still holding a reference to the previously yielded value at the time __next__ is called on the iterator, so its reference count is still 2.

The trick: cache *two* old yielded objects.  In the general case, by the second iteration, everyone else has dropped their references to the older cached object and we can modify it safely.

The second trick: if the value you're yielding is one of the interned "small ints", you *have* to return the interned small int.  (Otherwise you break 0 == 0, I kid you not.)
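
To make the shape of this concrete, here is a rough sketch of how the two tricks fit together.  It is only an illustration, not the attached patch: the two-slot "cache" field, the rangeiter_yield name, and the hard-coded -5..256 bounds are assumptions made for clarity, and those bounds are exactly the copy-and-pasted longobject.c knowledge discussed below.

    /* Illustration only -- not the attached patch.  Assumes the iterator grew
       a hypothetical two-element "cache" of previously yielded long objects,
       and that longintrepr.h is included for digit/PyLong_MASK.  The small-int
       bounds (-5 .. 256) are copied from Objects/longobject.c and aren't
       exposed anywhere public. */
    static PyObject *
    rangeiter_yield(rangeiterobject *r, long value)
    {
        PyObject **slot = &r->cache[r->index & 1];   /* alternate between the two slots */
        PyObject *prev = *slot;

        /* Interned small ints must be handed back as the interned objects;
           returning a private duplicate that we later mutate would be disastrous. */
        if (-5 <= value && value <= 256)
            return PyLong_FromLong(value);           /* returns the interned object */

        if (prev != NULL
            && Py_REFCNT(prev) == 1                  /* only our slot still references it */
            && Py_SIZE(prev) == 1                    /* one positive digit, simple layout */
            && 0 < value && value <= (long)PyLong_MASK)
        {
            /* Recycle: overwrite the digit in place instead of allocating. */
            ((PyLongObject *)prev)->ob_digit[0] = (digit)value;
            Py_INCREF(prev);                         /* reference for the caller */
            return prev;
        }

        /* Fall back: allocate normally and remember the result for recycling later. */
        Py_XDECREF(prev);
        *slot = PyLong_FromLong(value);
        Py_XINCREF(*slot);                           /* the slot keeps its own reference */
        return *slot;
    }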

With this patch applied, all regression tests pass.  And, on my desktop machine, the benchmark from the Stack Overflow question linked above:

./python -mtimeit -n 5 -r 2 -s"cnt = 0" "for i in range(10000000): cnt += 1"

drops from 410ms to 318ms ("5 loops, best of 2: 318 msec per loop").


This implementation requires the rangeiterobject to have intimate knowledge of the implementation of PyLongObject, including copy-and-pasting some information that isn't otherwise exposed (essentially the maximum interned small int).  At the very least, that information would need to be exposed properly, so rangeiterobject could use it correctly, before this could be checked in.

It might be cleaner for longobject.c to expose a private "set this long object to this value" function.  It would fail if the overwrite can't be done safely: if the value falls in the interned small-int range, or if the long object is the wrong size (Py_SIZE(o) != 1).
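
Purely as a hypothetical sketch of such a helper (the name and signature are made up here; it would live inside Objects/longobject.c, where NSMALLNEGINTS, NSMALLPOSINTS and the digit layout are already visible):

    /* Hypothetical private helper for Objects/longobject.c -- a sketch of the
       API suggested above, not anything that exists today.  Returns 1 if obj
       was overwritten in place, 0 if that can't be done safely. */
    static int
    _PyLong_SetValueInPlace(PyObject *obj, long value)
    {
        PyLongObject *v = (PyLongObject *)obj;

        if (!PyLong_CheckExact(obj))
            return 0;
        if (Py_REFCNT(obj) != 1)
            return 0;      /* someone else can still see it (this also rules out
                              the interned small ints, which the interpreter
                              itself keeps references to) */
        if (-NSMALLNEGINTS <= value && value < NSMALLPOSINTS)
            return 0;      /* caller should use the interned small int instead */
        if (Py_SIZE(v) != 1 || value <= 0 || value > (long)PyLong_MASK)
            return 0;      /* only handle the simple one-digit positive case */

        v->ob_digit[0] = (digit)value;
        return 1;
    }

rangeiter_next could then try the helper on its older cached object and fall back to PyLong_FromLong whenever it returns 0, without rangeobject.c needing to know anything about long internals.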


Is this interesting enough to pursue?  I'm happy to drop it.

----------
assignee: larry
components: Interpreter Core
files: larry.range.hack.1.txt
messages: 242685
nosy: larry, pitrou, rhettinger, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Speed up range() by caching and modifying long objects
type: performance
versions: Python 3.5
Added file: http://bugs.python.org/file39307/larry.range.hack.1.txt

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24138>
_______________________________________

