[Python-3000] Specializing the dicts in __dict__
Jack Diederich
jack at performancedrivers.com
Tue Apr 18 01:58:55 CEST 2006
In an old thread ("dictionary tuning", 2003[1]) Raymond H laid out the
typical use cases of dicts and found there were many subclasses of
"typical" and speeding one up hurt the others. The results are in
dist/Object/dictnotes.txt A couple of the particular use cases have
since been satisfied by sets and the new defaultdict. The dicts used
in symbol table lookup (global, local, class) have their own pattern.
I mentioned then and would like to resurrect now the idea of making
a dict-alike that has two extra properties:
1: the keys can only be str or unicode (py3k: just unicode)
2: the dict-alike is ordered
The str/unicode restriction allows for some creative implementation
details (maximizing cache hits on small dicts). I don't think this
would break much (if any) code.
The ordering would eliminate a frequent re-implementation in ORMs,
frameworks, and DSLs that use hackery to know in what order the
elements of a class/class-as-namespace were defined:
class Thing:
a = IntField() # increments a counter to 1
b = StringField() # counter == 2
c = StringField() # counter == 3
Thing.b = munge_field(Thing.b)
The suggested order would be by key creation so that reassigning
a key after manipulation would maintain the original order as in
the Thing.b case above.
There have been a number of suggested patches over the years to
speed up name access but they all impact general dict performance.
Isolating symbol table dicts would allow for easy experimentation.
To mimic the defaultdict class I would suggest the name symdict,
but (Big But) I don't much care what the name is and don't care
at all how big your thesaurus is. Easy and arbitrary decisions
should be made by proclamation; there may be benefits to driving
on one side of the road or the other but the most important thing
is that one side is chosen.
Guido has asked for working code recently, I'll have a chance
to attempt a patch at the "Need for Speed" sprint. One of the
suggested goals is an ordered dict, so that works out nicely.
afk-to-write-some-checks-to-Uncle-Sam-ly,
-Jack
[1] http://thread.gmane.org/gmane.comp.python.devel/12160/focus=12195
More information about the Python-3000
mailing list