[Python-Dev] On the possibility of "optimizing" range() calls in for-loops

13 Jun 2003 21:09:03 -0700

I've been thinking about the possibility of an optimization of range()
calls in for-loops, replacing the list generation with a lazy range
object.

xrange(), in Python 2.3, now has an iterator, so it should be faster
than range() in for-loops, which is an incentive to use it more.
However, this does go against the overall desire to move away from
using xrange(), just as we have moved away from xreadlines(), etc.

So, I've been considering possible ways in which to 'optimize' the
following types of loops:

  for i in range( 1000 ):
      pass

or

  n = 1000
  for i in range( n ):
      pass

The main issue, as I see it, is substituting some form of lazy range
iterator, for the range() function, "behind the curtains" as it were.
People could gain the benefits of xrange() (less memory consumption,
probably faster looping), without having to promote the continued use
of xrange().

There would have to be logic to determine that the range function
being used is the builtin range, and not a shadowed version.  I assume
this can be done in a fairly straightforward manner at compile time.
I could use advice on whether this is feasible (but see below, first)

A further issue is the substitution itself.  As of 2.3b1 the range()
builtin is able to handle long integer objects larger than
sys.maxint.  xrange() is not so equipped, so it cannot be substituted
blindly for range() without examining the range() arguments.

However, as the second example above shows, the value of the arguments
would have to be determined at runtime.  In principle, a function
could be made which checked the arguments, and if they could be
handled by an xrange() object, it could be returned.  Otherwise, the
results of range() could be returned.  However, it is unclear whether
the results of this indirection (and double checking of the arguments)
would be a speed win (although the reduced memory requirements would
still exist).

Guido has already stated his opinion that xrange should NOT be
extended to handle values larger than sys.maxint (ie. he doesn't want
to give xrange() any more features).  A related option is to make a
kind of private lazy range-iterator object, would could be
unconditionally substituted for the range function, in for loops, as
compile time.  Based on my understanding of how the code generation
works, this could be made to work in principle.  However, without
introducing some new byte codes, it doesn't seem possible to have a
hidden, "private" iterator generator function.

So, without going into much further detail it seems there would be
only three feasible options.

1) Extend xrange() to handle long int arguments, to bring parity with
range(), and allowing unconditional substitution with range, in
for-loops.

2) Change byte code generation to allow substituting a private range
iterator (or range() iterator generator) that can handle all possible
ranges.

3) Forget it.  Not worth bothering about.

After thinking about it a lot, it seems number 3 has become the most
attractive (and likely) option. :)  Number 2 seems like a maintenance
nightmare, and number 1 would require the BDFL and community to change
the stance on extending the feature set of xrange().

I admit I haven't done extensive testing to determine whether this
optimization would really provide much benefit on the whole.
Certainly a grep through any large Python code base (say the standard
libraries) indicates that there are many uses of range() in for loops,
but typically for fairly small ranges, so the memory issue may not be
such a problem.  Also, should there ever be a Python 3, where range()
itself probably becomes lazy, the porting headaches, for this
particular issue, should be minimal compared to others.  (ie. xrange()
would become a synonym, of sorts, for the new range())

Anyway, I only bring it up because it is an issue that does get
discussed from time to time, usually with the idea that it "could"
possibly be done in the future.  I think if it should ever be done, it
should probably be done soon, just so that using xrange() will grow
less attractive, rather than more.  I'm interested to hear other
thoughts about this.

-- 
Chad Netzer <cnetzer (at) sonic (dot) net>