[Python-Dev] Proposal to eliminate PySet_Fini
"Martin v. Löwis"
martin at v.loewis.de
Mon Jul 3 17:10:16 CEST 2006
Tim Peters wrote:
> With current trunk that printed
>
> [2.9363677646013846, 2.9489729031005703, 2.9689538729183949]
>
> After changing
>
> #define MAXSAVEDTUPLES 2000
>
> to
>
> #define MAXSAVEDTUPLES 0
>
> the times zoomed to
>
> [4.5894824930441587, 4.6023111649343242, 4.629560027293957]
>
> That's pretty dramatic.
Interesting. I ran this through gprof, and found the following
changes in the number of function calls:
                     with-cache    without-cache
  PyObject_Malloc         59058         24055245
  tupletraverse           33574         67863194
  visit_decref           131333        197199417
  visit_reachable        131333        197199417
  collect                    17            33006
  (for reference:)
  tuplerepeat          30000000         30000000
According to gprof, these functions (excluding tuplerepeat)
together account for 40% of the execution time in the without-cache
(i.e. MAXSAVEDTUPLES 0) case.
So it appears that much of the slowdown from disabling the fast
tuple allocator is due to the higher frequency of garbage collection
in your example.
Can you please re-run the example with gc disabled?
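Such a re-run could look like the sketch below. This is a hypothetical harness (the original benchmark code isn't quoted in this message), so workload() is a stand-in; the point is only to time the same code with and without the cyclic collector enabled:

```python
import gc
import time

def workload():
    # Stand-in for the actual benchmark: allocate lots of tuples
    # so tuplerepeat and the tuple allocator are exercised.
    t = (1, 2, 3)
    for _ in range(100_000):
        t * 2

def timed(runs=3, with_gc=True):
    """Time `workload` `runs` times, optionally with gc disabled."""
    times = []
    for _ in range(runs):
        if with_gc:
            gc.enable()
        else:
            gc.disable()
        start = time.perf_counter()
        workload()
        times.append(time.perf_counter() - start)
    gc.enable()  # always leave the collector on afterwards
    return times

print(timed(with_gc=True))
print(timed(with_gc=False))
```

If GC frequency is indeed the culprit, the gap between the cached and uncached builds should shrink sharply in the gc-disabled runs.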
Of course, it's really no surprise that GC is called more often:
a tuple handed out from the cache doesn't count as an allocation
as far as GC's counters are concerned. It so happens that your
example triggers gc thousands of times in its inner loop; I
wouldn't attribute that overhead to obmalloc per se.
Regards,
Martin