[Python-ideas] Keep free list of popular iterator objects

Sun Sep 15 07:04:53 CEST 2013

On Sat, Sep 14, 2013 at 9:28 PM, Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:

>
> It is surprising that you saw any performance gain at all.
>
> Python already has a default Python freelist scheme
> in the _PyObject_Malloc() function in Objects/obmalloc.c.
>
> Another thought is that this isn't an inner-loop optimization.
> The O(1) time for iterator creation is dominated by the O(n)
> time to actually iterate over the dict keys, values, and items.
>
> Raymond
>

Hi Raymond,

Taking a look at _PyObject_Malloc in Objects/obmalloc.c, I see that it
needs to do some lock and unlock operations.  Perhaps it's the avoidance of
this overhead that I'm seeing?  After all, there must be a reason that
dict, tuple and others are keeping their own free lists, right?

I'm curious what the overhead in creating the iterator is compared to the
time to iterate.  Obviously there's an O(1) / O(n) difference, but perhaps
the constant time outweighs smaller values of n?  In our case, we are often
doing something like the following (2.7):

def onNewData(datapoints):
    for dp in datapoints:
        for val in dp.outputs.itervalues():
            # Do things with val
        for status in dp.statuses.itervalues():
            # Do things with status

Where datapoints can have 100000 items and "outputs" and "statuses" tend to
be small.  So, while creating the iterator obviously isn't the slowest part
of the code, it does have some impact.

Cheers,
-Kyle

P.S. - I'm a newbie to the mailing list, so if I'm replying "wrong" sorry
about that!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20130914/4990b18f/attachment.html>