[Patches] [ python-Patches-1181334 ] range() in for loops, again

SourceForge.net noreply at sourceforge.net
Fri Apr 15 10:05:01 CEST 2005


Patches item #1181334, was opened at 2005-04-12 09:00
Message generated for change (Comment added) made by arigo
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1181334&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Armin Rigo (arigo)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: range() in for loops, again

Initial Comment:
Here is (yet) a(nother) patch to make range() lazy and fast in the context of for loops.

This implementation allows any built-in function or method to declare two C implementations if it wants to.  The second implementation, if present, is used when the caller can know for sure that it will only ever need an iterator over the result.  This is done with a new ml_flag_iter member of PyMethodDef.  To minimize compatibility issues a new METH_ITERABLE flag must be set when ml_flag_iter is valid.

The CALL_FUNCTION bytecode uses that if the immediately following bytecode is GET_ITER.

The patch updates range() to use this flag and build an xrange iterator object.

Of course the same technique could be applied to other built-in functions, but none spring to mind.  Maybe a simple patch that only special-cases range() would do just as well...

FWIW, I get a 5% speed-up on pystone with this patch.

----------------------------------------------------------------------

>Comment By: Armin Rigo (arigo)
Date: 2005-04-15 08:05

Message:
Logged In: YES 
user_id=4771

All these examples would imply a slight semantic change, I think:

   for x in filter(f, y):

if y is an iterator, the above line exhausts y before entering the loop.  The same for map and zip.

I want to avoid this at all costs, otherwise we'll see strange bug reports whose "solution" is: "well, just write instead

   seq=filter(f,y); for x in seq:

"... which would be very bad, in my opinion.

I'm already slightly concerned that

   r=range(10**7); for x in r: break

behaves quite differently than

   for x in range(10**7): break

with my patch.  I believe that in this case it is an acceptable tread-off, but I would certainly not check this patch in without explicit BDFL approval.

Guido, assigning to you for said principle approval or rejection...

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2005-04-15 05:11

Message:
Logged In: YES 
user_id=80475

There are several builtins that could offer an iterator
alternative:
  range defers to xrange
  filter to itertools.ifilter
  map to itertools.imap (1)
  zip to itertools.izip

All of these offer huge space savings and a little speed
(with izip providing the best improvement because it avoids
tons of tuple allocations).

(1) an alternate imap needs to be provided since the
itertools version does not provide the None fill-in feature.


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2005-04-13 19:56

Message:
Logged In: YES 
user_id=357491

No, I don't remember you ever mentioning that, but that is a
good idea since that would allow easy faking of things.

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2005-04-13 10:24

Message:
Logged In: YES 
user_id=4771

Ok, here is a patch special-casing range() only.  It's a bit simpler, and faster too.

Did I already mention around here that the class/type unification of Python 2.2 might have been better done, in my opinion, simply by replacing the built-in type() with lambda x: x.__class__?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2005-04-13 00:31

Message:
Logged In: YES 
user_id=357491

I vote for a patch to special-case range().  This won't be
an issue in Python 3000 when stuff like this become
generators.  Still wish there was a way to come up with a
range object that has ``type(range(1)) == list`` be true,
but check the refcount on the object so that if it is 1 it
can act as a generator instead.

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2005-04-12 11:34

Message:
Logged In: YES 
user_id=4771

Doesn't work, because the dict can be modified during iteration.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2005-04-12 11:03

Message:
Logged In: YES 
user_id=33168

dict.keys(), dict.values(), and dict.items() could also be
possibilities.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1181334&group_id=5470


More information about the Patches mailing list