[Python-Dev] Idea: Dictionary references

Franklin? Lee leewangzhong+python at gmail.com
Thu Dec 17 18:41:29 EST 2015


I already know that we can't use recursion, because it bypasses MRO.
I'm also not yet sure whether it makes sense to use refs for classes
in the first place.

As I understood it, an attribute will resolve in this order:
- __getattribute__ up the MRO. (raises AttributeError)
- __dict__ up the MRO. (raises KeyError)
- __getattr__ up the MRO. (raises AttributeError)


My new understanding:
- __getattribute__. (raises AttributeError)
    - (default implementation:) __dict__.__getitem__. (raises KeyError)
- __getattr__ up the MRO. (raises AttributeError)

If this is the case, then (the default) __getattribute__ will be
making the repeated lookups, and might be the one requesting the
refcells (for the ones it wants).


Descriptors seem to work like this:
    A descriptor object is stored as an attribute. When it is
accessed through its owner, it gets unboxed and its methods are
used. Otherwise, it's a normal attribute.

Then Descriptors are in the dict, so MIGHT benefit from refcells. The
memory cost might be higher, though.
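
A small sketch of that behavior, with made-up names (`Ten`, `Holder`). It assumes only the documented descriptor protocol: the descriptor object sits in the class `__dict__`, and `__get__` is invoked when the attribute is found on the type, not when the object is stored on an instance.

```python
class Ten:
    """Hypothetical non-data descriptor: it only defines __get__."""

    def __get__(self, instance, owner=None):
        return 10


class Holder:
    value = Ten()  # the descriptor object is stored in Holder.__dict__


h = Holder()
raw = Holder.__dict__["value"]   # plain dict access: the descriptor itself
via_instance = h.value           # found on the type, so __get__ is called
via_class = Holder.value         # __get__ is called with instance=None

# A non-data descriptor can be shadowed by the instance dict.
h.__dict__["value"] = "shadow"
shadowed = h.value

# Stored on an instance, the same kind of object is a normal attribute.
other = Holder()
other.loose = Ten()
loose_is_descriptor_object = isinstance(other.loose, Ten)
```

So a refcell pointing at the class dict entry would hold the descriptor object, and the `__get__` call would still have to happen on every access.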


On Thu, Dec 17, 2015 at 5:17 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Dec 17, 2015, at 13:37, Andrew Barnert via Python-Dev <python-dev at python.org> wrote:
>>
>> On Thursday, December 17, 2015 11:19 AM, Franklin? Lee <leewangzhong+python at gmail.com> wrote:
>>
>>
>>> ...
>>> as soon as I figure out how descriptors actually work...
>>
>>
>> I think you need to learn what LOAD_ATTR and the machinery around it actually does before I can explain why trying to optimize it like globals-vs.-builtins doesn't make sense. Maybe someone who's better at explaining than me can come up with something clearer than the existing documentation, but I can't.
>
> I take that back. First, it was harsher than I intended. Second, I think I can explain things.

I appreciate it! Tracking function definitions in the source can make
one want to do something else.


> First, for non-attribute lookups:
>
> (Non-shared) locals just load and save from an array.
>
> Free variables and shared locals load and save by going through an extra dereference on a cell object in an array.

In retrospect, of course they do.

It sounds like my idea is what's already used there, except that
there the refs are tied to the locals array instead of a hash table.
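
For reference, the existing cell mechanism can be poked at from Python. This is only an illustration (the function names are made up); it uses the standard `__closure__` and `cell_contents` attributes.

```python
def make_counter():
    count = 0  # shared local: lives in a cell, not directly in the array

    def bump():
        nonlocal count
        count += 1
        return count

    return bump


bump = make_counter()
cell = bump.__closure__[0]      # the cell object shared with make_counter
free_names = bump.__code__.co_freevars

before = cell.cell_contents     # value seen through the cell before the call
bump()
after = cell.cell_contents      # the same cell now holds the updated value
```

The inner function never looks `count` up by name at run time. It holds the cell and dereferences it, which is the same shape as a refcell into a dict.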


> Globals do a single dict lookup.

A single dict lookup per function definition per name used? That's
what I'm proposing.

For example (and I only just remembered that comprehensions and
generator expressions create their own scope),

    [f(x) for x in range(10000)]

would look up the name `f` at most twice (once in globals(), once in
the builtins if needed), and would always see the latest version of `f`.

And if it's in a function, the refcell(s) would be saved by the function.
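
A toy model of what that could look like. Everything here is hypothetical (`RefCell`, `CellNamespace`, `make_loader` are names invented for this sketch, and a real version would live inside the C dict implementation). The point is only that the name is looked up once per namespace, and later loads go through the saved cells while still seeing rebinding and deletion.

```python
_MISSING = object()  # sentinel: the name is not bound in this namespace


class RefCell:
    """Hypothetical refcell: one slot that a namespace writes through."""

    __slots__ = ("value",)

    def __init__(self):
        self.value = _MISSING


class CellNamespace:
    """Toy namespace that keeps every value inside a RefCell."""

    def __init__(self):
        self._cells = {}

    def ref(self, name):
        # The only dict lookup; creates an empty cell for unbound names.
        return self._cells.setdefault(name, RefCell())

    def __setitem__(self, name, value):
        self.ref(name).value = value

    def __delitem__(self, name):
        cell = self._cells.get(name)
        if cell is None or cell.value is _MISSING:
            raise KeyError(name)
        cell.value = _MISSING  # keep the cell so saved refs stay valid


def make_loader(name, globals_ns, builtins_ns):
    """Resolve `name` once per namespace and return a cheap loader."""
    g = globals_ns.ref(name)
    b = builtins_ns.ref(name)

    def load():
        if g.value is not _MISSING:
            return g.value
        if b.value is not _MISSING:
            return b.value
        raise NameError(name)

    return load


builtins_ns = CellNamespace()
globals_ns = CellNamespace()
builtins_ns["f"] = len

load_f = make_loader("f", globals_ns, builtins_ns)
first = load_f()         # falls through to the builtin
globals_ns["f"] = abs    # rebinding the global is visible immediately
second = load_f()
del globals_ns["f"]      # deleting it exposes the builtin again
third = load_f()
```

The cost shows up in `CellNamespace`: every entry carries an extra object, which is the memory trade-off in question.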


> Builtins do two dict lookups.

Two?
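
My reading of "two", as a pure-Python simulation (the helper `load_global` is made up, and the real work happens in C): the globals dict is tried first, and only a miss there leads to a second lookup in the builtins dict.

```python
import builtins


def load_global(name, globals_dict, builtins_dict):
    """Return (value, number_of_dict_lookups) for a global-style load."""
    try:
        return globals_dict[name], 1       # hit in globals: one lookup
    except KeyError:
        pass
    try:
        return builtins_dict[name], 2      # miss in globals, hit in builtins
    except KeyError:
        raise NameError(f"name {name!r} is not defined") from None


module_globals = {"x": 42}
x_value, x_lookups = load_global("x", module_globals, vars(builtins))
len_value, len_lookups = load_global("len", module_globals, vars(builtins))
```

So a name like `len` pays for a failed globals lookup every time before the builtins lookup succeeds.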


> Class attributes (including normal methods, @property, etc.) do two or more dict lookups--first the instance, then the class, then each class on the class's MRO. Then, if the result has a __get__ method, it's called with the instance and class to get the actual value. This is how bound methods get created, property lookup functions get called, etc. The result of the descriptor call can't get cached (that would mean, for example, that every time you access the same @property on an instance, you'd get the same value).

Yeah, I would only try to save on the dict lookups that find the
descriptor, not on the __get__ call, and I'm not sure it's worth it.

(Victor's response says that class attributes are already efficient, though.)
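
A quick check of the part that can't be cached, using made-up classes (`Base`, `Child`). It shows that the method is only found in the base class's dict, that each access builds a fresh bound method, and that a property runs its function on every access.

```python
class Base:
    def greet(self):
        return "hello"


class Child(Base):
    calls = 0

    @property
    def ticket(self):
        # Runs on every access, so the result differs each time.
        Child.calls += 1
        return Child.calls


c = Child()

# Where the lookup has to search before it finds `greet`.
in_instance = "greet" in c.__dict__
in_child = "greet" in Child.__dict__
in_base = "greet" in Base.__dict__

# __get__ runs per access, so two accesses give two bound method objects.
same_bound_method = c.greet is c.greet

tickets = (c.ticket, c.ticket)
```

Only the step that finds `Base.__dict__["greet"]` or the property object could be shared through a refcell. The results of `__get__` could not.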


> So, if the globals->builtins optimization is worth doing, don't tie it to another optimization that's much more complicated and less useful like this, or we'll never get your simple and useful idea.

Sure. I couldn't figure out where to even save the refcells for
attributes, so I only really saw an opportunity for name lookups.
Since locals and nonlocals don't require dict lookups, this leaves
globals() and the builtins.

