[Python-ideas] Keep free list of popular iterator objects
anthonyfk at gmail.com
Sun Sep 15 07:04:53 CEST 2013
On Sat, Sep 14, 2013 at 9:28 PM, Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:
> It is surprising that you saw any performance gain at all.
> Python already has a default Python freelist scheme
> in the _PyObject_Malloc() function in Objects/obmalloc.c.
> Another thought is that this isn't an inner-loop optimization.
> The O(1) time for iterator creation is dominated by the O(n)
> time to actually iterate over the dict keys, values, and items.
Taking a look at _PyObject_Malloc in Objects/obmalloc.c, I see that it
needs to do some lock and unlock operations. Perhaps it's the avoidance of
this overhead that I'm seeing? After all, there must be a reason that
dict, tuple and others are keeping their own free lists, right?
I'm curious how the overhead of creating the iterator compares to the
time spent iterating. Obviously there's an O(1) vs. O(n) difference, but
perhaps the constant cost dominates for small values of n? In our case, we
are often doing something like the following (2.7):
    for dp in datapoints:
        for val in dp.outputs.itervalues():
            # Do things with val
        for status in dp.statuses.itervalues():
            # Do things with status
Here datapoints can have 100,000 items, while "outputs" and "statuses" tend
to be small. So, while creating the iterators obviously isn't the slowest
part of the code, it does have some impact.
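A rough way to check how much of the cost is iterator creation is to time
creating a values iterator on its own against creating and exhausting it,
for a few small dict sizes. The sketch below is only an illustration: it
assumes Python 3 (so .values() stands in for the 2.7 itervalues()), the
helper names and the chosen sizes are made up, and the creation-only timing
also includes building the values view and one function call, so the
numbers are an upper bound on the iterator cost rather than an exact figure.

```python
import timeit


def time_create_only(d, number=100000):
    """Time creating (and discarding) a values iterator without advancing it."""
    return timeit.timeit(lambda: iter(d.values()), number=number)


def time_create_and_iterate(d, number=100000):
    """Time creating a values iterator and running it to exhaustion."""
    def run():
        for _ in d.values():
            pass
    return timeit.timeit(run, number=number)


def measure(sizes=(1, 2, 4, 8, 64), number=100000):
    """Return {n: (create_seconds, create_and_iterate_seconds)} per dict size."""
    results = {}
    for n in sizes:
        d = {i: i for i in range(n)}  # hypothetical stand-in for dp.outputs
        results[n] = (
            time_create_only(d, number),
            time_create_and_iterate(d, number),
        )
    return results


if __name__ == "__main__":
    for n, (create, total) in sorted(measure().items()):
        print("n=%3d  create=%.4fs  create+iterate=%.4fs  ratio=%.2f"
              % (n, create, total, create / total))
```

If the ratio stays close to 1 for the small sizes and drops as n grows,
that would suggest the constant creation cost matters for dicts like
"outputs" and "statuses". The absolute numbers will vary by interpreter
version and machine.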
P.S. - I'm a newbie to the mailing list, so if I'm replying "wrong", sorry.