special method lookup: how much do we care?

A while ago, Guido declared that all special method lookups on new-style classes bypass __getattr__ and __getattribute__. This is almost completely consistent now, and I've been working on patching up a few incorrect cases. I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
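To make the rule concrete, here is a minimal sketch of the behaviour being described, using len() as the implicitly invoked operation. The class name Dynamic is made up for illustration; it is not from any patch in this thread.

```python
class Dynamic:
    """Hypothetical class that supplies __len__ only through __getattr__."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name == "__len__":
            return lambda: 42
        raise AttributeError(name)


obj = Dynamic()

# Explicit attribute access goes through __getattr__ and succeeds.
explicit = obj.__len__()

# Implicit lookup by len() consults type(obj) only, so __getattr__ is never asked.
try:
    len(obj)
    implicit_ok = True
except TypeError:
    implicit_ok = False
```

The explicit call works, while len() reports that the object has no length, which is the "bypass" the original post refers to.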

Benjamin Peterson wrote:
A while ago, Guido declared that all special method lookups on new-style classes bypass __getattr__ and __getattribute__. This is almost completely consistent now, and I've been working on patching up a few incorrect cases. I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
1. More consistent attribute lookup is, to me, a feature of 3.x and I appreciate you working on this.
2. I am puzzled why those two methods should be extra special, but don't know enough to say more.
3. If there are only those two or a couple of other exceptions, I'd like them listed in the 'Special method lookup' ref doc section.
tjr

2009/5/8 Terry Reedy tjreedy@udel.edu:
- I am puzzled why those two methods should be extra special, but don't
know enough to say more.
They're not supposed to be special, which is the reason for this message. :) Currently the interpreter will call __getattr__ when looking them up. This is not the way it should be.
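A small probe, offered as a sketch, shows which behaviour a given interpreter has for the with statement. DynamicCM and with_consults_getattr are hypothetical names. On interpreters where the with statement uses the special lookup, the probe returns False; the exception type raised differs between CPython versions, so both AttributeError and TypeError are caught.

```python
class DynamicCM:
    """Hypothetical object offering __enter__/__exit__ only via __getattr__."""

    def __getattr__(self, name):
        if name == "__enter__":
            return lambda: self
        if name == "__exit__":
            return lambda *exc_info: False
        raise AttributeError(name)


def with_consults_getattr():
    """Return True if the with statement found the methods through __getattr__."""
    try:
        with DynamicCM():
            pass
    except (AttributeError, TypeError):
        # The interpreter looked only at type(obj) and found nothing.
        return False
    return True
```

A True result corresponds to the behaviour described in this message as incorrect.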

Benjamin Peterson wrote:
2009/5/8 Terry Reedy tjreedy@udel.edu:
- I am puzzled why those two methods should be extra special, but don't
know enough to say more.
They're not supposed to be special, which is the reason for this message. :) Currently the interpreter will call __getattr__ when looking them up. This is not the way it should be.
I was trying to ask the same question as Daniel did more clearly, and which you answered: they are special special methods because they are not in the PyTypeObject struct like the other special (name) methods. And that, I presume, is because they are specific to context manager objects, while all other 'special' methods (that I notice in 'Special method names') are more general in being applicable to multiple types.
Since built-in functions are compiled to load_global, call_function and operations to various special op codes, I could imagine that .__enter__ and .__exit__ are currently the only implicitly invoked special names that explicitly appear in code objects. I can see why you ask before burning an opcode (with parameter) to avoid that.
There are two issues: 1) bypass instance lookup; 2) bypass .__getattribute__() calling. I presume you have or can do at least the first with a custom .__getattribute__ method.
Terry Jan Reedy

On Fri, May 8, 2009 at 1:09 PM, Benjamin Peterson benjamin@python.org wrote:
I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
Why does this problem arise only with __enter__ and __exit__?
-- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com

2009/5/8 Daniel Stutzbach daniel@stutzbachenterprises.com:
On Fri, May 8, 2009 at 1:09 PM, Benjamin Peterson benjamin@python.org wrote:
I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
Why does this problem arise only with __enter__ and __exit__?
Normally special methods use slots of the PyTypeObject struct. typeobject.c looks up all those methods on Python classes correctly. In the case of __enter__ and __exit__, the compiler generates bytecode to look them up, and that bytecode uses PyObject_GetAttr.
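The bytecode the compiler emits for a with statement can be inspected with the standard dis module. This is only a sketch: use_with and opcode_names are illustrative names, and the opcodes listed depend heavily on the interpreter version, so no expected output is given.

```python
import dis


def use_with(cm):
    # A minimal with statement whose compiled form we want to inspect.
    with cm:
        pass


def opcode_names(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


names = opcode_names(use_with)
```

Printing names (or calling dis.dis(use_with)) on a particular build shows whether a generic attribute load or a dedicated opcode is used to fetch __enter__ and __exit__.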

On Fri, May 8, 2009 at 6:14 PM, Benjamin Peterson benjamin@python.org wrote:
Normally special methods use slots of the PyTypeObject struct. typeobject.c looks up all those methods on Python classes correctly. In the case of __enter__ and __exit__, the compiler generates bytecode to look them up, and that bytecode uses PyObject_GetAttr.
Would this problem apply to all special methods that don't use a slot in PyTypeObject, then? I know of several other examples:
__reduce__ __setstate__ __reversed__ __length_hint__ __sizeof__
(unless I misunderstand the definition of "special methods", which is possible)
-- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com

2009/5/8 Daniel Stutzbach daniel@stutzbachenterprises.com:
On Fri, May 8, 2009 at 6:14 PM, Benjamin Peterson benjamin@python.org wrote:
Normally special methods use slots of the PyTypeObject struct. typeobject.c looks up all those methods on Python classes correctly. In the case of __enter__ and __exit__, the compiler generates bytecode to look them up, and that bytecode uses PyObject_GetAttr.
Would this problem apply to all special methods that don't use a slot in PyTypeObject, then? I know of several other examples:
Yes. I didn't think of those.
__reduce__ __setstate__ __reversed__ __length_hint__ __sizeof__
(unless I misunderstand the definition of "special methods", which is possible)

Benjamin Peterson wrote:
2009/5/8 Daniel Stutzbach daniel@stutzbachenterprises.com:
On Fri, May 8, 2009 at 6:14 PM, Benjamin Peterson benjamin@python.org wrote:
Normally special methods use slots of the PyTypeObject struct. typeobject.c looks up all those methods on Python classes correctly. In the case of __enter__ and __exit__, the compiler generates bytecode to look them up, and that bytecode uses PyObject_GetAttr.
Would this problem apply to all special methods that don't use a slot in PyTypeObject, then? I know of several other examples:
Yes. I didn't think of those.
__reduce__ __setstate__ __reversed__ __length_hint__ __sizeof__
(unless I misunderstand the definition of "special methods", which is possible)
__reversed__, at least, is called by the reversed() builtin, so there is no LOAD_ATTR k (__reversed__) byte code. So for that, the problem is reduced to accessing type(it).__reversed__ without going thru type(it).__getattribute__. I would think that a function that did that would work for the others on the list (all 4?) that also have no LOAD_ATTR bytecode. Would a modified version of object.__getattribute__ work?
tjr

2009/5/8 Terry Reedy tjreedy@udel.edu:
Benjamin Peterson wrote:
2009/5/8 Daniel Stutzbach daniel@stutzbachenterprises.com:
On Fri, May 8, 2009 at 6:14 PM, Benjamin Peterson benjamin@python.org wrote:
Normally special methods use slots of the PyTypeObject struct. typeobject.c looks up all those methods on Python classes correctly. In the case of __enter__ and __exit__, the compiler generates bytecode to look them up, and that bytecode uses PyObject_GetAttr.
Would this problem apply to all special methods that don't use a slot in PyTypeObject, then? I know of several other examples:
Yes. I didn't think of those.
__reduce__ __setstate__ __reversed__ __length_hint__ __sizeof__
(unless I misunderstand the definition of "special methods", which is possible)
__reversed__, at least, is called by the reversed() builtin, so there is no LOAD_ATTR k (__reversed__) byte code. So for that, the problem is reduced to accessing type(it).__reversed__ without going thru type(it).__getattribute__. I would think that a function that did that would work for the others on the list (all 4?) that also have no LOAD_ATTR bytecode. Would a modified version of object.__getattribute__ work?
No, it's easier to just use _PyObject_LookupSpecial there.
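For readers who want a feel for what a type-only lookup involves, the following is a rough Python approximation. It is not the C implementation of _PyObject_LookupSpecial; lookup_special is a hypothetical helper that walks the MRO of type(obj), ignores the instance and any __getattr__ hook, and binds the result if it is a descriptor.

```python
def lookup_special(obj, name):
    """Approximate a type-only lookup of a special method (illustrative only)."""
    for klass in type(obj).__mro__:
        namespace = vars(klass)
        if name in namespace:
            attr = namespace[name]
            # Bind descriptors (such as plain functions) to the instance.
            getter = getattr(type(attr), "__get__", None)
            if getter is None:
                return attr
            return getter(attr, obj, type(obj))
    raise AttributeError(name)


class Seq:
    def __reversed__(self):
        return iter([3, 2, 1])


s = Seq()
# An instance attribute with the same name is ignored by the type-only lookup.
s.__reversed__ = lambda: iter([])
found = list(lookup_special(s, "__reversed__")())
```

Here found comes from the method defined on the class, not from the attribute stored on the instance.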

Benjamin Peterson wrote:
__reduce__ __setstate__ __reversed__ __length_hint__ __sizeof__
No, it's easier to just use _PyObject_LookupSpecial there.
Does that mean that the above 5 'work correctly' (or can easily be made to do so)? Leaving just __enter__ and __exit__ as problems?

2009/5/9 Terry Reedy tjreedy@udel.edu:
Benjamin Peterson wrote:
__reduce__ __setstate__ __reversed__ __length_hint__ __sizeof__
No, it's easier to just use _PyObject_LookupSpecial there.
Does that mean that the above 5 'work correctly' (or can easily be made to do so)? Leaving just __enter__ and __exit__ as problems?
Yes, __enter__ and __exit__ are the tricky ones.

Are we solving an actual problem by changing the behaviour here, or is it just a case of foolish consistency?
Seems to me that trying to pin down exactly what constitutes a "special method" is a fool's errand, especially if you want it to include __enter__ and __exit__ but not __reduce__, etc.

2009/5/9 Greg Ewing greg.ewing@canterbury.ac.nz:
Are we solving an actual problem by changing the behaviour here, or is it just a case of foolish consistency?
"No implementation detail is obscure enough."
For example, Maciek Fijalkowski of PyPy told me that he cares about this because someone is bound to eventually rely on it, and PyPy will have to follow CPython.
Seems to me that trying to pin down exactly what constitutes a "special method" is a fool's errand, especially if you want it to include __enter__ and __exit__ but not __reduce__, etc.
IMO, if it's a callable that begins with __ and ends with __, it's a special method.

Benjamin Peterson schrieb:
A while ago, Guido declared that all special method lookups on new-style classes bypass __getattr__ and __getattribute__. This is almost completely consistent now, and I've been working on patching up a few incorrect cases. I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
It's easier to introduce a separate opcode like SETUP_WITH; the compilation of a with statement produces quite a lot of bytecode which could be made more efficient that way.
Georg
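As background for the amount of work a with statement implies, here is a sketch of its semantics written as an ordinary function, loosely following the expansion given in PEP 343. run_with and body are illustrative names, and the methods are fetched from type(mgr), which is the nearest Python-level equivalent of a special lookup.

```python
import sys


def run_with(mgr, body):
    """Run body(value) as if it were the block of 'with mgr as value:'."""
    exit_ = type(mgr).__exit__           # fetched before __enter__ is called
    value = type(mgr).__enter__(mgr)
    normal_exit = True
    try:
        try:
            return body(value)
        except BaseException:
            normal_exit = False
            # __exit__ may suppress the exception by returning a true value.
            if not exit_(mgr, *sys.exc_info()):
                raise
    finally:
        if normal_exit:
            exit_(mgr, None, None, None)
```

A dedicated opcode would let the interpreter perform the two lookups and the __enter__ call in one step instead of spelling them out as generic instructions.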

Benjamin Peterson wrote:
A while ago, Guido declared that all special method lookups on new-style classes bypass __getattr__ and __getattribute__. This is almost completely consistent now, and I've been working on patching up a few incorrect cases. I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
As Georg pointed out, the expectation was that we would eventually add a SETUP_WITH opcode that used the special method lookup (and hopefully speed with statements up to a point where they're competitive with writing out the associated try statement directly). The current code is the way it is because there is no "LOAD_SPECIAL" opcode and adding type dereferencing logic to the expansion would have been difficult without a custom opcode.
For other special methods that are looked up from Python code, the closest we can ever get is to bypass the instance (i.e. using "type(obj).__method__(obj, *args)") to avoid metaclass confusion. The type slots are even *more* special than that because they bypass __getattribute__ and __getattr__ even on the metaclass for speed reasons.
There's a reason the docs already say that for a guaranteed override you *must* actually define the special method on the class rather than merely making it accessible via __getattr__ or even __getattribute__.
The PyPy guys are right to think that some developer somewhere is going to rely on these implementation details in CPython at some point. However lots of developers rely on CPython ref counting as well, no matter how many times they're told not to do that if they want to support alternative interpreters.
Cheers, Nick.
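The difference between the type(obj).__method__(obj, *args) idiom and a type slot can be observed with a metaclass that records attribute lookups. This is a sketch under the assumption of a CPython-like implementation; Meta and Box are hypothetical names.

```python
class Meta(type):
    """Hypothetical metaclass that records every attribute looked up on the class."""

    seen = []

    def __getattribute__(cls, name):
        Meta.seen.append(name)
        return super().__getattribute__(name)


class Box(metaclass=Meta):
    def __len__(self):
        return 3


b = Box()

Meta.seen.clear()
len(b)                       # goes through the type slot
seen_by_len = list(Meta.seen)

Meta.seen.clear()
type(b).__len__(b)           # ordinary attribute access on the class
seen_by_idiom = list(Meta.seen)
```

The slot path leaves the metaclass hook untouched, while the Python-level idiom still triggers it, which is why the idiom is described above as only the closest approximation available.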

Nick Coghlan wrote:
Benjamin Peterson wrote:
A while ago, Guido declared that all special method lookups on new-style classes bypass __getattr__ and __getattribute__. This is almost completely consistent now, and I've been working on patching up a few incorrect cases. I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
As Georg pointed out, the expectation was that we would eventually add a SETUP_WITH opcode that used the special method lookup (and hopefully speed with statements up to a point where they're competitive with writing out the associated try statement directly). The current code is the way it is because there is no "LOAD_SPECIAL" opcode and adding type dereferencing logic to the expansion would have been difficult without a custom opcode.
For other special methods that are looked up from Python code, the closest we can ever get is to bypass the instance (i.e. using "type(obj).__method__(obj, *args)") to avoid metaclass confusion. The type slots are even *more* special than that because they bypass __getattribute__ and __getattr__ even on the metaclass for speed reasons.
There's a reason the docs already say that for a guaranteed override you *must* actually define the special method on the class rather than merely making it accessible via __getattr__ or even __getattribute__.
The PyPy guys are right to think that some developer somewhere is going to rely on these implementation details in CPython at some point. However lots of developers rely on CPython ref counting as well, no matter how many times they're told not to do that if they want to support alternative interpreters.
It's actually very annoying for things like writing Mock or proxy objects when this behaviour is inconsistent (sorry should have spoken up earlier).
The Python interpreter bases some of its decisions on whether these methods exist at all - and when you have objects that provide methods through __getattr__ then you can accidentally get screwed if magic method lookup returns an object unexpectedly when it should have raised an AttributeError.
Of course for proxy objects it might be more convenient if *all* attribute access did go through __getattr__ - but with that not the case it is much better for it to be consistent rather than have to put in specific workaround code.
All the best,
Michael
Cheers, Nick.
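The proxy situation can be sketched as follows. BareProxy and Proxy are hypothetical classes, not taken from Mock or any other library. The bare version answers hasattr() for special names yet fails when the interpreter looks the method up implicitly; the second version forwards the needed special methods explicitly on the class.

```python
class BareProxy:
    """Forwards everything through __getattr__ only."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        return getattr(self._target, name)


class Proxy(BareProxy):
    """Adds explicit forwarding for implicitly invoked special methods."""

    def __len__(self):
        return len(self._target)

    def __iter__(self):
        return iter(self._target)


bare = BareProxy([1, 2, 3])
claims_len = hasattr(bare, "__len__")   # answered by __getattr__
try:
    len(bare)
    bare_len_works = True
except TypeError:
    bare_len_works = False

full = Proxy([1, 2, 3])
full_len = len(full)
```

The mismatch between claims_len and bare_len_works is the kind of inconsistency that forces workaround code in proxy implementations.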

Michael Foord wrote:
Nick Coghlan wrote:
Benjamin Peterson wrote:
A while ago, Guido declared that all special method lookups on new-style classes bypass __getattr__ and __getattribute__. This is almost completely consistent now, and I've been working on patching up a few incorrect cases. I've now hit __enter__ and __exit__. The compiler generates LOAD_ATTR instructions for these, so it uses the normal lookup. The only way I can see to fix this is to add a new opcode which uses _PyObject_LookupSpecial, but I don't think we really care this much. Opinions?
As Georg pointed out, the expectation was that we would eventually add a SETUP_WITH opcode that used the special method lookup (and hopefully speed with statements up to a point where they're competitive with writing out the associated try statement directly). The current code is the way it is because there is no "LOAD_SPECIAL" opcode and adding type dereferencing logic to the expansion would have been difficult without a custom opcode.
For other special methods that are looked up from Python code, the closest we can ever get is to bypass the instance (i.e. using "type(obj).__method__(obj, *args)") to avoid metaclass confusion. The type slots are even *more* special than that because they bypass __getattribute__ and __getattr__ even on the metaclass for speed reasons.
There's a reason the docs already say that for a guaranteed override you *must* actually define the special method on the class rather than merely making it accessible via __getattr__ or even __getattribute__.
The PyPy guys are right to think that some developer somewhere is going to rely on these implementation details in CPython at some point. However lots of developers rely on CPython ref counting as well, no matter how many times they're told not to do that if they want to support alternative interpreters.
It's actually very annoying for things like writing Mock or proxy objects when this behaviour is inconsistent (sorry should have spoken up earlier).
The Python interpreter bases some of its decisions on whether these methods exist at all - and when you have objects that provide methods through __getattr__ then you can accidentally get screwed if magic method lookup returns an object unexpectedly when it should have raised an AttributeError.
Of course for proxy objects it might be more convenient if *all* attribute access did go through __getattr__ - but with that not the case it is much better for it to be consistent rather than have to put in specific workaround code.
Suggestion: have something like "from __future__" but affecting compile-time behaviour (like pragmas in some other languages), such as causing Python to generate bytecodes which perform all attribute access through __getattr__.

On Sun, May 10, 2009 11:51PM, Nick Coghlan wrote:
However lots of developers rely on CPython ref counting as well, no matter how many times they're told not to do that if they want to support alternative interpreters.
Cheers, Nick.
From socket.py:
# Wrapper around platform socket objects. This implements
# a platform-independent dup() functionality. The
# implementation currently relies on reference counting
# to close the underlying socket object.
class _socketobject(object):
You don't know how much time I've spent trying to understand why test_httpserver.py hung indefinitely when I was experimenting with new opcodes in my VM.
Cheers, Cesare
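For contrast with code that depends on reference counting to release a resource, here is a sketch of deterministic cleanup using contextlib.closing from the standard library. Resource and use are hypothetical stand-ins and are unrelated to the socket module's actual implementation.

```python
from contextlib import closing


class Resource:
    """Hypothetical resource with an explicit close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def use(resource):
    # close() is called when the block exits, regardless of how the
    # interpreter manages object lifetimes.
    with closing(resource) as r:
        return r.closed
```

Code written this way behaves the same on interpreters without reference counting, since it never waits for the object to be collected.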
participants (9): Benjamin Peterson, Cesare Di Mauro, Daniel Stutzbach, Georg Brandl, Greg Ewing, Michael Foord, MRAB, Nick Coghlan, Terry Reedy