[Python-ideas] Fast global cacheless lookup
Neil Toronto
ntoronto at cs.byu.edu
Fri Nov 23 09:26:19 CET 2007
Eyal Lotem wrote:
> Hey, I had a very similar idea and implementation back in June (that
> also passed all regression tests):
> http://mail.python.org/pipermail/python-ideas/2007-June/000902.html
>
> When I read Neil's mail I almost thought it was my old mail :-)
>
> Unfortunately, when I posted my optimization, it pretty much got
> ignored. Maybe I have not worded it properly.
>
> The main difference between our implementations, if I understand
> Neil's explanation correctly, is that you use direct ptrs into the
> dict and notify the ptr holders of relocations.
>
> I used a different method, where you call a new PyDict_ExportKey
> method and it creates a mediating element. The mediating element has
> a fixed position so it can be dereferenced directly. Direct access to
> it is just as fast, but it may be slightly affecting dict performance.
Nicely done. :)
> I think a hybrid approach similar to Neil's, but with a mediating
> object to represent the access to the dict and do the observing for
> its user could be nicer (hell, Neil might already be doing this).
I am, actually. I originally had the observer be the function object
itself, but that presented problems with generators, which create a
frame object from a function and then dump the function. I had assumed
that a frame would never outlast the function object it was created from
and ended up with dangling pointers. D'oh!
Anyway, it's correct now and the details are well-abstracted. The
mediating object is called PyFastGlobalsAdapter. ("Adapter" because it
allows you to getitem/setitem a dict like you do a list - using an
index.) It gets bunted about among functions, frames, and eval code. It
basically has the following members:
PyObject *globals; /* PyDictObject only */
PyObject *names; /* From func_code->co_names */
PyDictEntry **entries; /* Struct pointers into globals entries */
(I've omitted the details for builtins because I haven't got them
totally worked out yet. I'll probably have a PyObject *builtins and a
PyDictEntry *builtins_entry pointing at the globals dict so I can detect
when __builtins__ is replaced within globals.)
On init, it registers itself with the globals dict and starts keeping
track of pointers to dict entries in "entries". "entries" is the same
length as "names". Getting the value globals[names[i]] is done by just
referencing entries[i]->me_value. It's very quick. :)
There's a PyFastGlobals_GetItem(PyObject *fg, int index) that does it
for you and also does the necessary bookkeeping to update the
PyDictEntry pointers when there's a miss. (entries[i] == NULL; happens
when a key is anticipated but not in the dict at first.)
I agree that the dict observer interface + an adapter is a good way to
go. The dict part should be flexible, lean and fast (like dicts
themselves), and the simple observer interface does just that. The
adapter keeps it all correct and refcount-y, and provides a convenient
way to get values by index.
Would it be worth it to expose dict adapters as a Python object? Then
Python code could do this kind of crazy stuff:
d = {<some dict>}
a = dictadapter(d, ('keys', 'i', 'want', 'fast', 'access', 'to'))
a[0] == d['keys'] and a[1] == d['i'] #..., etc. => True
That could make it a lot easier to experiment with fast cacheless dict
lookups in other contexts. The problem is, I have no idea what those
contexts might be, at least within Python code. :)
> P.S: I also had a more ambitious plan, after eliminating
> globals/builtins dict lookups, to use mro caches more aggressively
> with this optimization and type-specialization on code objects, to
> also eliminate class-side dict lookups. The user can also eliminate
> instance-side dict lookupts via __slots__ - effectively allowing the
> conversion of virtually _all_ namespace dict lookups in pure Python
> code to be direct memory dereferences, isn't that exciting? :-)
Extremely! There's no reason it couldn't be done. But I'm not exactly
sure what you mean (seeing the idea only from the context of my own
hacks), so could you elaborate a bit? :D When you said "mro" I
immediately thought of this adapter layout:
PyList of MRO class dicts
PyTuple of common attribute names (or list of tuples)
PyDictEntry * array per MRO class
But where would the name index come from? The co_names tuples are
different for each method.
Neil
More information about the Python-ideas
mailing list