[Python-ideas] PEP 447: Adding type.__getdescriptor__

Nick Coghlan ncoghlan at gmail.com
Sun Dec 3 09:03:49 EST 2017


On 3 December 2017 at 21:36, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>
>
>> On 3 Dec 2017, at 03:58, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> On 2 December 2017 at 01:12, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>>> On 1 Dec 2017, at 12:29, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>>> 2. Define it as a class method, but have the convention be for the
>>>> *caller* to worry about walking the MRO, and hence advise class
>>>> implementors to *never* call super() from __getdescriptor__
>>>> implementations (since doing so would nest MRO walks, and hence
>>>> inevitably have weird outcomes). Emphasise this convention by passing
>>>> the current base class from the MRO as the second argument to the
>>>> method.
>>>
>>> But that’s how I already define the method, that is the PEP proposes to change
>>> the MRO walking loop to:
>>>
>>>   for cls in mro_list:
>>>        try:
>>>            return cls.__getdescriptor__(name)  # was cls.__dict__[name]
>>>        except AttributeError:                  # was KeyError
>>>            pass
>>>
>>> Note that classes on the MRO control how to try to fetch the name at that level. The code is the same for __getdescriptor__ as a classmethod and as a method on the  metaclass.
>>
>> That's not exactly the same as what I'm suggesting, and it's the part
>> that has Mark concerned about an infinite regression due to the
>> "cls.__getdescriptor__" subexpression.
>>
>> What I'm suggesting:
>>
>>    try:
>>        getdesc = base.__dict__["__getdescriptor__”]
>
>>    except KeyError:
>>        # Use existing logic
>>    else:
>>        try:
>>            getdesc(cls, base, name)
>>        except AttributeError:
>>            pass
>
> I honestly don’t understand why that’s better than what is in the PEP, other than the way to locate __getdescriptor__.

It's the specifically the way we locate __getdescriptor__ that I'm
interested in :)

> In particular: why change the signature of __getdescriptor__ from __getdescriptor__(base, name) to __getdescriptor__(base, cls, name)?

So that the __getdescriptor__ implementation in the base has access to
the class currently being accessed. That way you can have a common
algorithm in the base class, but then tune that algorithm based on
class level attributes on the class being defined.

My inspiration for the suggestion is a combination of the signature of
https://docs.python.org/3/reference/datamodel.html#object.__get__
(where you always get passed the class that holds the property, but
are only sometimes passed the instance being accessed), and the
signature of super() itself (which you have to pass both the dynamic
class, and the current position in the MRO).

> And to keep things clear, what are “cls” and “base”? Based on your initial proposal I’m assuming that “cls” is the type of the object whose attribute we’re looking up, and “base” is the current class on the MRO that we’re looking at, that is:
>
>    def _PyType_Lookup(cls, name):
>        mro = cls.mro()
>
>        for base in mro:
>
Yep (I wrote out the full version a few posts further up the thread,
but only quoted the inner snippet here).

> Getting back to the way __getdescriptor__ is accessed: Locating this using base.__dict__[‘__getdescriptor__’] instead of base.__getdescriptor__ makes it clear that this method is not accessed using normal attribute lookup, and matches the (currently non-functional) patch that access slots on the type object from C code. The current implementation access the slot on the meta type, that is type(base).__dict__[‘__getdescriptor__’], but that doesn’t fundamentally change the mechanics.
>
> My proposal would then end up as:
>
> for base in mro:
>     try:
>         getdescr = base.__dict__[“__getdescriptor__”]
>     except KeyError:
>         try:
>               return base.__dict__[name]
>         except KeyError:
>               pass
>
>     else:
>         getdesc(base, name)
>
> This is for the classmethod version of the PEP that I’m currently preferring.
>
> The way the __getdescriptor__ implementation is located also might explain the confusion: I’m reasoning based on the implementation and that doesn’t match the PEP text in this regard. I didn’t get how this affected comprehension of the proposal, but think I’m getting there :-)

Yep, I was going off the description in the PEP, which just shows a
normal "base.__getdescriptor__" access. If the implementation already
doesn't do that and instead works as you suggest above, then that's a
good thing, and we're actually already in agreement about how it
should work - the PEP just needs to be updated to say that :)

>> * Neither type nor object implement __getdescriptor__
>
> I’m not convinced that this is strictly necessary, but also am not opposed to this.
>
> Another reason for not implementing __getdescriptor__ on type and object is that this means that the existence of the methods cannot confuse users (as this is a pretty esoteric API that most users will never have to touch).
>
> BTW. My patch more or less inlines the default __getdescriptor__, but that’s for performance reasons. Not implementing __getdescriptor__ on object and type would avoid possible confusion about this, and would make it clearer that the attribute lookup cache is still valid (and removing that cache would definitely be a bad idea)

Right, I think the null case here is to *not* include them, and then
see if that causes any insurmountable problems. I don't expect it
will, which would leave us with the simpler option of omitting them.

>> * Calling super() in __getdescriptor__ would be actively discouraged
>> without a base class to define the cooperation rules
>> * If it's missing in the base class dict, fall back to checking the
>> base class dict directly for the requested attribute
>> * cls is injected into the call by the MRO walking code *not* the
>> normal bound method machinery
>
> As mentioned above I don’t understand why the change in interface is needed, in particular why __getdescriptor__ needs to have access to the original class and not just the current class on the MRO.

I think we could likely live without it, but it's easy to pass in, and
adding it later would be difficult. If you use the signature order
"(cls, base, name)", it also means that __getdescriptor__
implementations will behave more like regular class methods, even
though they're not actually looked up that way.

>> Yeah, I'm definitely in agreement with the intent of the PEP. I'm just
>> thinking that we should aim to keep the expected semantics of the hook
>> as close as we can to the semantics of the code it's replacing, rather
>> than risking the introduction of potentially nested MRO walks (we
>> can't outright *prevent* the latter, but we *can* quite clearly say
>> "That's almost certainly a bad idea, avoid it if you possibly can”).
>
> Does my updated proposal (base.__dict__[‘__getdescriptor__’]) adres this issue?  My intend with the PEP was indeed to stay as close as  possible to the current behaviour and just replace peeking into base.__dict__ by calling an overridable special method on base.

Aye, it's sounding like it really is just the "cls.__getdescriptor__"
short hand in the PEP that was confusing me, and the way you're
intending to implement it matches the way I'm suggesting it should
work.

> The alternative is to make it possible to use something that isn’t builtin.dict as the __dict__ for types, but that requires significant changes to CPython because the C code currently assumes that __dict__ is an instance of exactly builtin.dict and changing that is likely significantly more work than just replacing PyDict_GetItem by PyMapping_GetItem (different failure modes, different refcount rules, different re-entrancy, …)

I agree with that - changing the execution namespace is one thing, but
changing the actual storage underlying cls.__dict__ would be a far
more difficult prospect.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Python-ideas mailing list