[Python-ideas] PEP 447: Adding type.getdescriptor

Sun Dec 3 06:36:41 EST 2017

> On 3 Dec 2017, at 03:58, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> On 2 December 2017 at 01:12, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>> On 1 Dec 2017, at 12:29, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> 2. Define it as a class method, but have the convention be for the
>>> *caller* to worry about walking the MRO, and hence advise class
>>> implementors to *never* call super() from __getdescriptor__
>>> implementations (since doing so would nest MRO walks, and hence
>>> inevitably have weird outcomes). Emphasise this convention by passing
>>> the current base class from the MRO as the second argument to the
>>> method.
>> 
>> But that’s how I already define the method, that is the PEP proposes to change
>> the MRO walking loop to:
>> 
>>   for cls in mro_list:
>>        try:
>>            return cls.__getdescriptor__(name)  # was cls.__dict__[name]
>>        except AttributeError:                  # was KeyError
>>            pass
>> 
>> Note that classes on the MRO control how to try to fetch the name at that level. The code is the same for __getdescriptor__ as a classmethod and as a method on the  metaclass.
> 
> That's not exactly the same as what I'm suggesting, and it's the part
> that has Mark concerned about an infinite regression due to the
> "cls.__getdescriptor__" subexpression.
> 
> What I'm suggesting:
> 
>    try:
>        getdesc = base.__dict__["__getdescriptor__”]

>    except KeyError:
>        # Use existing logic
>    else:
>        try:
>            getdesc(cls, base, name)
>        except AttributeError:
>            pass

I honestly don’t understand why that’s better than what is in the PEP, other than the way to locate __getdescriptor__. In particular: why change the signature of __getdescriptor__ from __getdescriptor__(base, name) to __getdescriptor__(base, cls, name)? 

And to keep things clear, what are “cls” and “base”? Based on your initial proposal I’m assuming that “cls” is the type of the object whose attribute we’re looking up, and “base” is the current class on the MRO that we’re looking at, that is:

   def _PyType_Lookup(cls, name):
       mro = cls.mro()

       for base in mro:
           …

Getting back to the way __getdescriptor__ is accessed: Locating this using base.__dict__[‘__getdescriptor__’] instead of base.__getdescriptor__ makes it clear that this method is not accessed using normal attribute lookup, and matches the (currently non-functional) patch that access slots on the type object from C code. The current implementation access the slot on the meta type, that is type(base).__dict__[‘__getdescriptor__’], but that doesn’t fundamentally change the mechanics.

My proposal would then end up as:

for base in mro:
    try:
        getdescr = base.__dict__[“__getdescriptor__”]
    except KeyError:
        try:
              return base.__dict__[name]
        except KeyError:
              pass

    else:
        getdesc(base, name)

This is for the classmethod version of the PEP that I’m currently preferring.

The way the __getdescriptor__ implementation is located also might explain the confusion: I’m reasoning based on the implementation and that doesn’t match the PEP text in this regard. I didn’t get how this affected comprehension of the proposal, but think I’m getting there :-)

> 
> * Neither type nor object implement __getdescriptor__

I’m not convinced that this is strictly necessary, but also am not opposed to this.

Another reason for not implementing __getdescriptor__ on type and object is that this means that the existence of the methods cannot confuse users (as this is a pretty esoteric API that most users will never have to touch).

BTW. My patch more or less inlines the default __getdescriptor__, but that’s for performance reasons. Not implementing __getdescriptor__ on object and type would avoid possible confusion about this, and would make it clearer that the attribute lookup cache is still valid (and removing that cache would definitely be a bad idea)

> * Calling super() in __getdescriptor__ would be actively discouraged
> without a base class to define the cooperation rules
> * If it's missing in the base class dict, fall back to checking the
> base class dict directly for the requested attribute
> * cls is injected into the call by the MRO walking code *not* the
> normal bound method machinery

As mentioned above I don’t understand why the change in interface is needed, in particular why __getdescriptor__ needs to have access to the original class and not just the current class on the MRO. 

> * Only "base.__dict__" needs to be assured of getting a hit on every base class
> 
> What's currently in the PEP doesn't clearly define how it thinks
> "cls.__getdescriptor__" should work without getting itself into an
> infinite loop.

See above. The PEP doesn’t explain that “cls.__getdescriptor__” isn’t normal attribute access but uses slot access.

> 
>> I don’t think there’s a good technical reason to pick either option, other than that the metaclass option forces an exception when creating a class that inherits (using multiple inheritance) from two classes that have a custom __getdescriptor__. I’m not convinced that this is a good enough reason to go for the metaclass option.
> 
> I'm still not clear on how you're planning to break the recursive loop
> for "cls.__getdescriptor__" when using a metaclass.

> 
>>> The reason I'm liking option 2 is that it leaves the existing
>>> __getattribute__ implementations fully in charge of the MRO walk, and
>>> *only* offers a way to override the "base.__dict__[name]"  part with a
>>> call to "base.__dict__['__getdescriptor__'](cls, base, name)" instead.
>> 
>> Right. That’s why I propose __getdescriptor__ in the first place. This allows Python coders to do extra work (or different work) to fetch an attribute of a specific class in the MRO and works both with regular attribute lookup as well as lookup through super().
> 
> Yeah, I'm definitely in agreement with the intent of the PEP. I'm just
> thinking that we should aim to keep the expected semantics of the hook
> as close as we can to the semantics of the code it's replacing, rather
> than risking the introduction of potentially nested MRO walks (we
> can't outright *prevent* the latter, but we *can* quite clearly say
> "That's almost certainly a bad idea, avoid it if you possibly can”).

Does my updated proposal (base.__dict__[‘__getdescriptor__’]) adres this issue?  My intend with the PEP was indeed to stay as close as  possible to the current behaviour and just replace peeking into base.__dict__ by calling an overridable special method on base.

The alternative is to make it possible to use something that isn’t builtin.dict as the __dict__ for types, but that requires significant changes to CPython because the C code currently assumes that __dict__ is an instance of exactly builtin.dict and changing that is likely significantly more work than just replacing PyDict_GetItem by PyMapping_GetItem (different failure modes, different refcount rules, different re-entrancy, …)

BTW. Thanks for feedback, I’m finally starting to understand how the PEP isn’t clear enough and how to fix that.

Ronald

[Python-ideas] PEP 447: Adding type.__getdescriptor__

[Python-ideas] PEP 447: Adding type.getdescriptor