[Python-Dev] insertdict slower?
M.-A. Lemburg
mal@lemburg.com
Sat, 03 Feb 2001 13:23:54 +0100
Tim Peters wrote:
>
> [MAL]
> > It doesn't have anything to do with icache conflicts or
> > other esoteric magic at dye-level. The reason for the slowdown
> > is that the benchmark uses integers as keys and these have to
> > go through the whole rich compare machinery to find their way into
> > the dictionary.
>
> But insertdict doesn't do any compares at all (besides C pointer comparison
> to NULL). There's more than one mystery here. The one I was addressing is
> why the profiler said *insertdict* got slower. Jeremy's profile did not
> give any reason to suspect that lookdict got slower (which is where the
> compares are); to the contrary, it said lookdict got 4.5% *faster* in 2.1.
>
> > Please see my other post on the subject -- I think we need
> > an optimized API especially for checking for equality.
>
> Quite possibly, but if Jeremy remains keen to help with investigating timing
> puzzles, he needs to figure out why his profiling approach is pointing him
> at the wrong functions. That has long-term value far beyond patching the
> regression du jour.
>
> it's-not-either/or-it's-both-ly y'rs -tim
Ok, let's agree on "it's both" :)
I was referring to the slowdown which shows up in the DictCreation
benchmark. It uses bundles of these operations to check for
dictionary creation speed:
d1 = {}
d2 = {}
d3 = {}
d4 = {}
d5 = {}
d1 = {1:2,3:4,5:6}
d2 = {2:3,4:5,6:7}
d3 = {3:4,5:6,7:8}
d4 = {4:5,6:7,8:9}
d5 = {6:7,8:9,10:11}
Note that the number of inserted items is 3; the minimum size
of the allocated table is 4. Apart from the initial allocation
of the dictionary table, no further resizes are done.
One of the micro-optimizations which I used in my patched Python
version deals with these rather common situations: small dictionaries
are inlined (up to a certain size) in the object itself rather
than stored in a separatly malloced table. I found that a limit
of 8 slots gives you the best ratio between performance boost and
memory overhead.
This is another area where Valdimir's pymalloc could help, since it
favours small memory chunks.
--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/