[Python-Dev] Tunable parameters in dictobject.c (was dictnotes.txt out of date?)

Mark Shannon mark at hotpy.org
Thu Jun 14 12:45:31 CEST 2012


Raymond Hettinger wrote:
> 
> On Jun 13, 2012, at 2:37 PM, Mark Shannon wrote:
> 
>> I think that for combined tables a growth factor of x2 is best,
>> but I don't have any hard evidence to back that up.
> 
> I believe that change should be reverted.  
> You've undone work that was based on extensive testing and timings of 
> many python apps.
> In particular, it harms the speed of building-up all large dictionaries,
> and it greatly harms apps with steady-size dictionaries with changing keys.
> 
> The previously existing parameters were well studied
> and have been well-reviewed by the likes of Tim Peters.
> They shouldn't be changed without deep thought and study.
> Certainly, "I think a growth factor of x2 is best" is insufficient.

Indeed, "I think a growth factor of x2 is best" is insufficient,
but so is "based on extensive testing and timings of many python apps"
unless you provide those timings and apps.

So here is some evidence.
I have compared tip (always resize by x2) with an x4 variant
(resize split dicts by x2 and combined dicts by x4).
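
To make the difference between the two policies concrete, here is a toy
model of a table that grows by a given factor. It is a sketch only: the
names (simulate, min_size), the 2/3 fill threshold, and the rounding up to
a power of two are simplifying assumptions for illustration, not a copy of
the logic in dictobject.c, and it ignores deletions and split tables.

```python
def simulate(n_inserts, growth, min_size=8):
    """Toy model: insert n_inserts distinct keys, with no deletions.

    Returns (final table size in slots, number of resizes).
    All parameters here are illustrative assumptions.
    """
    size = min_size
    used = 0
    resizes = 0
    for _ in range(n_inserts):
        used += 1
        # Assumed policy: resize once the table is at least 2/3 full.
        if used * 3 >= size * 2:
            target = used * growth
            # Assumed rounding: smallest power of two above the target.
            new_size = min_size
            while new_size <= target:
                new_size <<= 1
            size = new_size
            resizes += 1
    return size, resizes


for growth in (2, 4):
    size, resizes = simulate(100_000, growth)
    print("x%d: final size %d slots, %d resizes" % (growth, size, resizes))
```

In this model the x4 policy resizes roughly half as often but can end with
a table twice as large, which is the speed/memory trade-off in question.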

All benchmarks are from http://hg.python.org/benchmarks/
and were run on my old 32-bit machine; numbers are for the x4 variant relative to tip.

For the 2n3 suite (24 micro benchmarks):
Average speed-up: none (~0.05%).
Average memory use: +4%.

GC: 1% faster, no change to memory use.
Mako: 4% slower, 4% more memory.
2to3: 3% faster, 32% more memory.

Overall: No change to speed, 5% more memory.

The results seem to indicate that resizing is now sufficiently fast
that changing from x4 to x2 makes no measurable difference to speed.
However, for some programs (notably 2to3, which uses 32% more memory
with x4) the change from x4 to x2 can save a lot of memory.
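
Anyone who wants to check this against their own workload could time a
dict-heavy function and record its peak memory under each build. The
harness below is a minimal sketch, not the method used by the benchmark
suite: measure, many_small_dicts, and one_large_dict are hypothetical
names, and tracemalloc requires Python 3.4 or later.

```python
import time
import tracemalloc


def measure(build, repeats=3):
    """Return (best wall-clock seconds, peak traced bytes) for build()."""
    # Time without tracing, since tracemalloc slows allocation down.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        build()
        best = min(best, time.perf_counter() - start)
    # Run once more with tracing on to record peak memory.
    tracemalloc.start()
    result = build()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del result
    return best, peak


def many_small_dicts(count=20000, keys=6):
    # Hypothetical workload: lots of small dictionaries.
    return [{k: None for k in range(keys)} for _ in range(count)]


def one_large_dict(n=200000):
    # Hypothetical workload: building up one large dictionary.
    return {k: None for k in range(n)}


if __name__ == "__main__":
    for workload in (many_small_dicts, one_large_dict):
        seconds, peak = measure(workload)
        print("%s: %.4f s, peak %d bytes" % (workload.__name__, seconds, peak))
```

Running the same script under both interpreter builds gives a like-for-like
comparison; the absolute numbers will vary by machine and Python version.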

Cheers,
Mark.

