[Python-ideas] Exporting dict Items for direct lookups of specific keys

Josiah Carlson jcarlson at uci.edu
Mon Jun 11 18:45:41 CEST 2007


"Eyal Lotem" <eyal.lotem at gmail.com> wrote:
> On 6/11/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > "Eyal Lotem" <eyal.lotem at gmail.com> wrote:
> > > On 6/10/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > > > "Eyal Lotem" <eyal.lotem at gmail.com> wrote:
> > > > > On 6/10/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > > > > Only access of exported items is O(1) time (when accessed via your
> > > > > PyDictItem_obj->value), other items must be accessed normally and they
> > > > > take just as much time (or as I explained and you reiterated, a tad
> > > > > longer, as it requires a bitmap check and in the case of exported
> > > > > items another dereference).
> > > >
> > > > But you still don't explain *how* these exported keys are going to be
> > > > accessed.  Walk me through the steps required to improve access times in
> > > > the following case:
> > > >
> > > > def foo(obj):
> > > >     return obj.foo
> > > >
> > > >
> > > I think you missed what I said - I said that the functionality should
> > > probably not be exported to Python - as Python has little to gain from
> > > it (it would have to getattr a C method just to request the exported
> > > item -- which will nullify the speed benefit).
> > >
> > > It is the C code which can suddenly get direct access to the
> > > exported dict items - not Python code.
[snip]
> While extensions are an optimization target, the main target is
> global/builtin/attribute accessing code.

Or really, module globals and __builtin__ access.  Arbitrary attribute
access is one of those "things most commonly done in Python".  But just
for the sake of future readers of this thread, could you explicitly
enumerate *which* things you intend to speed up with this work?
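
To make the distinction between those two kinds of access concrete, here
is a small sketch using the standard dis module.  The function foo below
is only an illustration (it extends the earlier example with a builtin
call), and the exact opcode list varies between CPython versions; the
point is only that global/builtin access and attribute access compile to
different opcodes.

```python
import dis

def foo(obj):
    # `len` is reached through the global/builtin lookup path;
    # `.foo` is an attribute lookup on whatever object is passed in.
    return len(obj.foo)

def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]

ops = opnames(foo)
# Keep only the two lookup opcodes this thread is about.
print([op for op in ops if op in ("LOAD_GLOBAL", "LOAD_ATTR")])
```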


> > > Since a "static lookup" costs a dereference and a conditional, and a
> > > dynamic lookup entails at least 4 C function calls (including C stack
> > > setup/unwinds), a few C assignments and C conditionals, I believe it
> > > is likely that this will pay off as a serious improvement in Python's
> > > performance, when combined with a psyco-like system (not an
> > > architecture-dependent one).
> >
> > It's really only useful if you are accessing fixed attributes of a fixed
> > object many times.  The only case I can think of where this kind of
> > thing would be useful (sufficient accesses to make a positive difference)
> > is in the case of module globals, but in that case, we can merely change
> > how module globals are implemented (more or less like self.__dict__ = ...
> > in the module's __init__ method).
> 
> That's not true.
> 
> As I explained, getattr accesses the type's MRO dicts as well. So
> even if you are accessing a lot of different instances, and those have
> a shared (fixed) type, you can speed up the type-side dict lookup
> (even if you still pay for a whole instance-side lookup).  Also,

That's MRO caching, which you have already stated is orthogonal to this
particular proposal.
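
For readers unfamiliar with the type-side lookup being discussed, a
rough sketch of it in Python follows.  The helper name find_in_mro is
made up for this illustration, and the sketch deliberately ignores
descriptors, metaclasses and __getattr__; it only shows that a miss on
the first class means probing one dict per class along the MRO.

```python
def find_in_mro(tp, name):
    """Return (owner_class, value) for the first class in tp's MRO defining name."""
    for klass in tp.__mro__:
        namespace = vars(klass)  # the class's own dict, not inherited names
        if name in namespace:
            return klass, namespace[name]
    raise AttributeError(name)

class Base:
    greeting = "hi"

class Derived(Base):
    pass

# Derived's own dict misses, so the lookup falls through to Base.
owner, value = find_in_mro(Derived, "greeting")
print(owner.__name__, value)
```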


> "fixed-object" access can occur when you have a small number of
> objects whose attributes are looked up many times. In such a case, a
> psyco-like system can create a specialized code object specifically
> for _instances_ (not just for types), each code object using "static
> lookups" on the instance's dict as well, and not just on the class's
> dict.

If you re-read my last posting, which you quoted above and I re-quote,
you can easily replace 'fixed attributes of a fixed object' with 'fixed
attributes of a small set of fixed objects' and get what you say.  Aside
from module globals, when is this seen?
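
As a point of reference for the "exported item" idea itself, here is a
toy model in Python.  It is only a sketch under my own assumptions: the
names Item, ExportingDict and export are hypothetical, every value is
boxed (the proposal only treats exported keys specially), and the real
thing would live in C.  It shows the access pattern being argued about:
after one normal lookup, the holder of an item reads the value with a
dereference plus a validity check.

```python
class Item:
    """Hypothetical stand-in for an exported dict item: one shared value slot."""
    __slots__ = ("value", "valid")

    def __init__(self, value):
        self.value = value
        self.valid = True

class ExportingDict:
    """Toy mapping that boxes each value so a holder can skip the hash lookup."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, value):
        item = self._items.get(key)
        if item is None:
            self._items[key] = Item(value)
        else:
            item.value = value  # existing holders see the new value

    def __getitem__(self, key):
        return self._items[key].value

    def __delitem__(self, key):
        # Mark the box invalid so holders know to fall back to a normal lookup.
        self._items.pop(key).valid = False

    def export(self, key):
        """Do one normal lookup and hand back the box for direct access."""
        return self._items[key]

d = ExportingDict()
d["x"] = 1
cell = d.export("x")
d["x"] = 2
print(cell.valid, cell.value)
del d["x"]
print(cell.valid)
```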


> > You may want to change the name.  "Literal" implies a constant, like 1
> > or "hello", as in 'x = "hello"'.  LOAD_GLOBAL_FAST would seem to make
> > more sense to me, considering that is what it intends to do.
> 
> Well, LOAD_GLOBAL_FAST can only be used when the string that's being
> looked up is known at the code-object creation time, which means that
> the attribute name was indeed literal.

A literal is a value.  A name/identifier is a reference.

In:
    a = "hello"
... "hello" is a literal.

In:
    hello = 1
... hello is a name/identifier.

In:
    b.hello = 1
... hello is a named attribute of an object named/identified by b.
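
The same three cases can be checked against the parser with the standard
ast module.  This is just a sketch; the helper name classify is mine, and
it assumes Python 3.8 or later, where literals parse to ast.Constant.

```python
import ast

def classify(source):
    """Return the AST node type names of the target and value of one assignment."""
    stmt = ast.parse(source).body[0]
    return type(stmt.targets[0]).__name__, type(stmt.value).__name__

for source in ('a = "hello"', 'hello = 1', 'b.hello = 1'):
    # The target is a Name or an Attribute; the right-hand side is the literal.
    print(source, "->", classify(source))
```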


 - Josiah



