[Python-ideas] Keep free list of popular iterator objects

Raymond Hettinger raymond.hettinger at gmail.com
Sun Sep 15 18:38:39 CEST 2013


On Sep 14, 2013, at 11:26 PM, Kyle Fisher <anthonyfk at gmail.com> wrote:

> I've realized that my original example is far too complex, so I've simplified it:
> 
> Status quo:
> ./python -m timeit -r 100 -s "a=[1]" "iter(a)"
> 10000000 loops, best of 100: 0.0662 usec per loop
> 
> With patch:
> ./python -m timeit -r 100 -s "a=[1]" "iter(a)"
> 10000000 loops, best of 100: 0.0557 usec per loop
> List iter allocations: 6
> List iter reuse through freelist: 1011111554
> 100.00% reuse rate
> 
> Which seems to show a 15% speedup.  I'd be curious what others get.

This 15% claim is incredibly deceptive.  You're looping over a list of length one and the "benefits" fall away immediately for anything longer.  It seems like it is intentionally ignoring that you're optimizing an O(1) setup step in an O(n) operation.  And the timing loop does not exercise the cases where the freelist misses.

More important effects are being masked by the tight timing loop that only exercises the most favorable case.  In real programs, your patch may actually make performance worse.  The default freelisting scheme is heavily used and tends to always be in cache.  In contrast, a freelist for less frequently used objects tend to not be in cache when you need them.   Similar logic applies to branch prediction here as well. (In short, I believe that the patch serves only to optimize an unrealistic benchmark and would make actual programs worse-off).

I'm -1 on adding freelisting to iterators.  I recently removed the freelist scheme from Objects/setobject.c because it provided no incremental benefit over the default freelisting scheme.

Please focus your optimization efforts elsewhere in the code.  There are real improvements to be had for operations that matter.  The time to create an iterator is one of the least important operations in Python.   If this had been a real win, I would have incorporated it into itertools long ago. 


Raymond


P.S.  If you want to help benchmark the effects of aligned versus unaligned memory allocations, that is an area this likely to bear fruit (for example, if integer objects with 32 byte aligned, it would guarantee that the object head and body would be in the same cache line).















-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20130915/033489ae/attachment.html>


More information about the Python-ideas mailing list