[Python-Dev] Idea: Dictionary references

Andrew Barnert abarnert at yahoo.com
Thu Dec 17 17:17:39 EST 2015


On Dec 17, 2015, at 13:37, Andrew Barnert via Python-Dev <python-dev at python.org> wrote:
> 
> On Thursday, December 17, 2015 11:19 AM, Franklin? Lee <leewangzhong+python at gmail.com> wrote:
> 
> 
>> ...
>> as soon as I figure out how descriptors actually work...
> 
> 
> I think you need to learn what LOAD_ATTR and the machinery around it actually does before I can explain why trying to optimize it like globals-vs.-builtins doesn't make sense. Maybe someone who's better at explaining than me can come up with something clearer than the existing documentation, but I can't.

I take that back. First, it was harsher than I intended. Second, I think I can explain things.

First, for non-attribute lookups:

(Non-shared) locals just load and save from an array.

Free variables and shared locals load and save by going through an extra dereference on a cell object in an array.

Globals do a single dict lookup.

Builtins do two dict lookups.

So, the only thing you can optimize there is builtins. But maybe that's worth it.
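To make the four cases concrete, here is a small sketch that asks the compiler how it classified each name. The names x, outer, inner, and lookup_global are made up for illustration, and lookup_global only emulates the globals-then-builtins search with plain dict operations; it is not how the interpreter is actually written.

```python
import builtins

x = 1  # hypothetical module-level global


def outer():
    shared = 2                      # shared local: lives in a cell

    def inner(arg):
        plain = arg                 # non-shared local: a slot in the frame's array
        return plain + shared + x + len("ab")

    return inner


inner = outer()

print(inner.__code__.co_varnames)   # non-shared locals of inner
print(inner.__code__.co_freevars)   # free variables, reached through cells
print(outer.__code__.co_cellvars)   # shared locals of outer, stored in cells
print(inner.__code__.co_names)      # names looked up in globals, then builtins


def lookup_global(func, name):
    """Emulate the global-then-builtin search with two plain dict lookups."""
    try:
        return func.__globals__[name]    # first dict lookup
    except KeyError:
        return vars(builtins)[name]      # second dict lookup, only on a miss
```

Here x is found by the first lookup, while len misses in the globals dict and needs the second one, which is the only case with anything left to save.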

Next, for attribute lookups (not counting special methods):

Everything calls __getattribute__. Assuming that's not overridden and uses the object implementation:

Instance attributes do one dict lookup.

Class attributes (including normal methods, @property, etc.) do two or more dict lookups--the instance dict, then the class, then each class on the class's MRO until the name is found. (Strictly, the default implementation searches the MRO first, so that a data descriptor like @property takes precedence over the instance dict.) Then, if the result has a __get__ method, it's called with the instance and class to get the actual value. This is how bound methods get created, property lookup functions get called, etc. The result of the descriptor call can't get cached (that would mean, for example, that every time you access the same @property on an instance, you'd get the same value).

Dynamic attributes from a __getattr__ do all that plus whatever __getattr__ does.

If __getattribute__ is overridden, it's entirely up to that implementation to do whatever it wants.
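A rough illustration of those cases, assuming the default object.__getattribute__ is in use. The classes Base and Child and their attributes are invented for the example.

```python
class Base:
    def method(self):               # plain function stored in Base.__dict__
        return "base"


class Child(Base):
    calls = 0

    @property
    def counter(self):              # property object stored in Child.__dict__
        Child.calls += 1
        return Child.calls

    def __getattr__(self, name):    # only reached when the normal lookup fails
        return "dynamic:" + name


obj = Child()
obj.x = 1

# Instance attribute: found in the instance dict.
assert obj.__dict__["x"] == 1

# Method: found on Base via the MRO, then its __get__ is called with the
# instance and class to build the bound method.
func = Base.__dict__["method"]
bound = func.__get__(obj, Child)
assert bound() == obj.method() == "base"
assert obj.method is not obj.method    # a fresh bound method on each access

# Property: __get__ runs the getter every time, so the value can't be cached.
assert (obj.counter, obj.counter) == (1, 2)

# Dynamic attribute: the whole search fails first, then __getattr__ runs.
assert obj.missing == "dynamic:missing"
```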

Things are similar for set and del: they call __setattr__/__delattr__, and the default versions of those look for a descriptor on the class the same as with get, except that they call a different method on the descriptor (__set__ or __delete__). If what they find isn't a descriptor with that method, instead of using it, they ignore it and go to the instance dict.
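A minimal sketch of the difference on assignment, again assuming the default __setattr__/__delattr__. The class Temp and its attributes are hypothetical.

```python
class Temp:
    scale = "C"                      # plain class attribute, not a descriptor

    def __init__(self):
        self._deg = 0.0

    @property
    def deg(self):
        return self._deg

    @deg.setter
    def deg(self, value):            # reached through the property's __set__
        self._deg = float(value)


t = Temp()

t.deg = 5        # a descriptor with __set__ is found on the class and called
t.scale = "F"    # the class attribute is ignored; the instance dict is written

print(sorted(vars(t)))   # instance dict keys after both assignments
print(Temp.scale)        # the class attribute itself is untouched
```

Deleting t.scale afterwards removes only the instance dict entry, so the lookup falls back to the class attribute again.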

So, your mechanism can't significantly speed up method lookups, properties, or most other things. It could speed up lookups for class attributes that aren't descriptors, but only at the cost of increasing the size of every instance--and how often do those matter anyway?

A different mechanism that cached references to descriptors instead of to the resulting attributes could speed up method lookups, etc., but only by a very small amount, and with the same space cost.

A mechanism that didn't try to get involved with the instance dict, and just flattened out the MRO search once that failed (and was out of the way before the descriptor call or __getattr__ even entered the picture) might speed methods up in deeply nested hierarchies, and with only a per-class rather than a per-instance space cost. But how often do you have deeply nested hierarchies? And the speedup still isn't going to be that big: You're basically turning 5 dict lookups plus 2 method calls into 2 dict lookups plus 2 method calls. And it would still be much harder to guard than the globals dict: if any superclass changes its __bases__ or adds or removes a __getattribute__ or various other things, all of your references have to get re-computed. That's rare enough that the speed may not matter, but the code complexity probably does.
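To show what "flattening the MRO search" could look like and why guarding it is the hard part, here is a toy version. flatten_mro and the classes A through D are invented for this sketch; it is not the proposed mechanism, just a dict built once per class.

```python
def flatten_mro(cls):
    """Build one dict holding what an MRO walk would find for each name."""
    flat = {}
    # Walk from the most distant base to cls so nearer classes win.
    for klass in reversed(cls.__mro__):
        flat.update(vars(klass))
    return flat


class A:
    def f(self):
        return "A.f"


class B(A):
    pass


class C(B):
    pass


class D(C):
    pass


flat = flatten_mro(D)
assert flat["f"] is A.__dict__["f"]     # one lookup instead of walking D, C, B, A

# Any change to a superclass silently invalidates the flattened dict.
B.f = lambda self: "B.f"
assert D().f() == "B.f"                 # the real lookup sees the change
assert flat["f"] is A.__dict__["f"]     # the flattened copy is now stale

flat = flatten_mro(D)                   # has to be re-computed
assert flat["f"] is B.__dict__["f"]
```

The flattened dict still only yields the function; the descriptor call that turns it into a bound method happens afterwards either way.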

In short: if you can't cache the bound methods (and as far as I can tell, in general you can't--even though 99% of the time it would work), I don't think there's any other significant win here.
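One example of the 1% where a cached bound method goes wrong, with a made-up Greeter class; the variable cached stands in for whatever a hypothetical cache would hold.

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()
cached = g.greet                     # pretend this bound method was cached

# Rebinding the method on the class changes what a fresh lookup returns...
Greeter.greet = lambda self: "hi"

print(g.greet())                     # fresh lookup uses the new function
print(cached())                      # the cached bound method still calls the old one
```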

So, if the globals->builtins optimization is worth doing, don't tie it to another optimization that's much more complicated and less useful like this, or we'll never get your simple and useful idea. 


