[Patches] [ python-Patches-671454 ] Optimize dictionary resizing

SourceForge.net noreply@sourceforge.net
Mon, 20 Jan 2003 15:13:41 -0800


Patches item #671454, was opened at 2003-01-20 18:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671454&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Optimize dictionary resizing

Initial Comment:
Revisiting Oren’s fastnames patch, I think a simpler 
approach is warranted.  Using a setitem macro skips 
a function call but doesn’t get to the root of problem.  
Likewise, a pointer search is not helpful because 
lookupstring() already attempts a pointer 
comparison first and a hash comparison as a 
fallback (also, a pointer search fails to find non-
interned comparisons). There is more promise in 
optimizing the search/fail/fallback sequence of 
looking for locals, globals, and builtins, but the 
approach of using custom look-ups and having 
negative entries is too invasive.

The real issue is that the dictionaries are too full.  
Whenever, it hits 2/3’s full, a resize is triggered which 
doubles the size of the dictionary, still leaving it 
relatively full and relatively close to the next 
(expensive) resize operation.

This patch makes the setitem resize operation 
quadruple the size of the dictionary.  This leads to 
fewer total resize operations and leaves the dictionary 
more sparsely populated (not only leading to fewer 
collisions, but allowing immediate detection of NOT IN 
without doing *any* comparisons of pointers or 
hashcodes).

The patch is minimally invasive.  It is only one line!  
However, is does change the order of dictionaries, so 
doctest style test code will sense the change 
immediately.

Four timing suites are attached.  Two, directly 
measure speed-ups for accessing globals and 
__builtins__.  The other two are real world apps 
which tap into all facets of python including 
subclassing, lambdas and functionals, large and 
small dicts, a wide array of data structures, heavy 
processing of strings, tuples, and numbers.   Each of 
the suites shows improvements ranging from 5% to 
25%.  PyStone also shows a marginal improvement 
which is surprising because it is dominated by fast 
locals.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671454&group_id=5470