On Wed, Jun 23, 2021 at 03:47:05PM +1000, Chris Angelico wrote:
Okay. Lemme give it to you *even more clearly* since the previous example didn't satisfy.
    # file1.py
    @extend(list)
    def in_order(self):
        return sorted(self)

    def frob(stuff):
        return stuff.in_order()

    # file2.py
    from file1 import frob
    thing = [1, 5, 2]
    frob(thing)  # == [1, 2, 5]
    def otherfrob(stuff):
        return stuff.in_order()
    otherfrob(thing)  # AttributeError
Am I correct so far? The function imported from file1 has the extension method, the code in file2 does not. That's the entire point here, right?
Correct so far.
Okay. Now, what if getattr is brought into the mix?
To a first approximation (ignoring shadowing) every dot lookup can be replaced with getattr and vice versa:
obj.name <--> getattr(obj, 'name')
A simple source code transformation could handle that, and the behaviour of the code should be the same. Extension methods shouldn't change that.
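As a quick sanity check of that equivalence in today's Python, here is a minimal sketch. The `Point` class and the `fails` helper are made up purely for illustration; they are not part of the proposal.

```python
class Point:
    """Hypothetical class used only to compare the two spellings."""

    def __init__(self, x):
        self.x = x

    def double(self):
        return 2 * self.x


p = Point(21)

# Data attribute: both spellings return the very same object.
same_value = p.x is getattr(p, "x")

# Method: both spellings give a bound method that returns the same result.
same_call = p.double() == getattr(p, "double")()


def fails(lookup):
    """Return True if calling lookup() raises AttributeError."""
    try:
        lookup()
    except AttributeError:
        return True
    return False


# Missing attribute: both spellings fail the same way.
dot_fails = fails(lambda: p.missing)
getattr_fails = fails(lambda: getattr(p, "missing"))
```

Whatever rule extension methods add to attribute lookup, the argument here is that it should keep these pairs in agreement.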
    # file3.py
    @extend(list)
    def in_order(self):
        return sorted(self)

    def fetch1(stuff, attr):
        if attr == "in_order":
            return stuff.in_order
        if attr == "unordered":
            return stuff.unordered
        return getattr(stuff, attr)

    def fetch2(stuff, attr):
        return getattr(stuff, attr)
In file3's scope, there is no list.unordered method, so any call like
    some_list.unordered
    getattr(some_list, 'unordered')
will fail, regardless of which list some_list is, or where it was created. That implies that:
    fetch1(some_list, 'unordered')
    fetch2(some_list, 'unordered')
will also fail. It doesn't matter who is calling the functions, or what module they are called from. What matters is the context where the attribute lookup occurs, which in fetch1 and fetch2 is the file3 scope.
    # file4.py
    from file3 import fetch1, fetch2
Doesn't matter that fetch1 and fetch2 are imported into file4. They are still executed in the global scope of file3. If they called `globals()`, they would see file3's globals, not file4's. Same thing for extension methods.
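The claim about `globals()` can be checked in a single script. This is only a sketch: the two modules are built in memory with `types.ModuleType` and `exec` rather than as real files, and the names `MARKER` and `where_am_i` are invented for the demonstration.

```python
import types

# In-memory stand-in for file3.py.
file3 = types.ModuleType("file3")
exec(
    "MARKER = 'file3'\n"
    "def where_am_i():\n"
    "    return globals()['MARKER']\n",
    file3.__dict__,
)

# In-memory stand-in for file4.py.
file4 = types.ModuleType("file4")
file4.where_am_i = file3.where_am_i  # plays the role of: from file3 import where_am_i
exec(
    "MARKER = 'file4'\n"
    "result = where_am_i()\n",
    file4.__dict__,
)

# The function was called from file4's code but still saw file3's globals.
print(file4.result)
```

The imported function reports the marker of the module where it was defined, not the one it was called from.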
    @extend(list)
    def unordered(self):
        return random.shuffle(self[:])
I think that's going to always return None :-)
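To spell that out: `random.shuffle` shuffles its argument in place and returns None, so the quoted method throws away the shuffled copy. A minimal sketch of the problem and one possible fix, written as plain functions (the names `unordered_buggy` and `unordered_fixed` are made up here) since `@extend` doesn't exist:

```python
import random


def unordered_buggy(items):
    # Shuffles a temporary copy in place, then returns shuffle's result: None.
    return random.shuffle(items[:])


def unordered_fixed(items):
    # Keep a reference to the copy, shuffle it, and return the copy.
    shuffled = items[:]
    random.shuffle(shuffled)
    return shuffled


thing = [1, 5, 2]
buggy_result = unordered_buggy(thing)
fixed_result = unordered_fixed(thing)
```

Neither version modifies the original list; only the second one hands back the shuffled copy.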
    def fetch3(stuff, attr):
        if attr == "in_order":
            return stuff.in_order
        if attr == "unordered":
            return stuff.unordered
        return getattr(stuff, attr)

    def fetch4(stuff, attr):
        return getattr(stuff, attr)
In the scope of file4, there is no list method "in_order", but there is a list method "unordered". So
    some_list.in_order
    getattr(some_list, 'in_order')
will fail. That implies that:
    fetch3(some_list, 'in_order')
    fetch4(some_list, 'in_order')
will also fail. It doesn't matter who is calling the functions, or what module they are called from. What matters is the context where the attribute lookup occurs, which in fetch3 and fetch4 is the file4 scope.
(By the way, I think that your example here is about ten times more obfuscated than it need be, because of the use of generic, uninformative names with numbers.)
    thing = [1, 5, 2]
    fetch1(thing, "in_order")()
    fetch2(thing, "in_order")()
    fetch3(thing, "in_order")()
    fetch4(thing, "in_order")()
    fetch1(thing, "unordered")()
    fetch2(thing, "unordered")()
    fetch3(thing, "unordered")()
    fetch4(thing, "unordered")()
Okay. *NOW* which ones raise AttributeError, and which ones give the extension method?
Look at the execution context.
fetch1(thing, "in_order") and fetch2(thing, "in_order") execute in the scope of file3, where lists have an in_order extension method.
It doesn't matter that they are called from file4: the body of the fetchN functions, where the attribute access takes place, executes where the global scope is file3 and hence the extension method "in_order" is found and returned.
For the same reason, both fetch1(thing, "unordered") and fetch2(thing, "unordered") will fail.
It doesn't matter that they are called from file4: their execution context is their global scope, file3, and just as they see file3's globals, not the caller's, they will see file3's extension methods.
(I say "the module's extension methods", not necessarily to imply that the extension methods are somehow attached to the module, but only that there is some sort of registry that says, in effect, "if your execution context is module X, then these extension methods are in use".)
Similarly, the bodies of fetch3 and fetch4 execute in the execution context of file4, where list has been extended with an unordered method. So fetch3(thing, "unordered") and fetch4(thing, "unordered") both return that unordered method.
For the same reason (the execution context), fetch3(thing, "in_order") and fetch4(thing, "in_order") both fail.
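The rule can be simulated in current Python to check the answers above. Everything here is an assumption made for illustration: the `_EXTENSIONS` registry, this `extend` decorator, and the `ext_getattr` helper (standing in for real attribute lookup, which cannot be changed from pure Python) are hypothetical, the modules are built in memory, and `unordered` is a deterministic reversal rather than a shuffle.

```python
import sys
import types

# Hypothetical registry: {module name: {(class, method name): function}}.
_EXTENSIONS = {}


def extend(cls):
    """Register a function as an extension method of cls, for the calling module only."""
    module_name = sys._getframe(1).f_globals["__name__"]

    def register(func):
        _EXTENSIONS.setdefault(module_name, {})[(cls, func.__name__)] = func
        return func

    return register


def ext_getattr(obj, name):
    """Stand-in for attribute lookup that also consults the caller's extensions."""
    try:
        return getattr(obj, name)
    except AttributeError:
        # The execution context is the module whose code called this helper.
        module_name = sys._getframe(1).f_globals["__name__"]
        registered = _EXTENSIONS.get(module_name, {})
        for klass in type(obj).__mro__:
            func = registered.get((klass, name))
            if func is not None:
                return types.MethodType(func, obj)  # bind it like a method
        raise


def make_module(name, source):
    """Build an in-memory module so the demo fits in one script."""
    module = types.ModuleType(name)
    module.extend = extend
    module.ext_getattr = ext_getattr
    exec(source, module.__dict__)
    return module


file3 = make_module("file3", '''
@extend(list)
def in_order(self):
    return sorted(self)

def fetch2(stuff, attr):
    return ext_getattr(stuff, attr)
''')

file4 = make_module("file4", '''
@extend(list)
def unordered(self):
    return self[::-1]  # deterministic stand-in for a shuffle

def fetch4(stuff, attr):
    return ext_getattr(stuff, attr)
''')


def outcome(fetch, attr):
    """Fetch and call the method, or report the AttributeError."""
    try:
        return fetch([1, 5, 2], attr)()
    except AttributeError:
        return "AttributeError"


results = {
    ("fetch2", "in_order"): outcome(file3.fetch2, "in_order"),
    ("fetch2", "unordered"): outcome(file3.fetch2, "unordered"),
    ("fetch4", "in_order"): outcome(file4.fetch4, "in_order"),
    ("fetch4", "unordered"): outcome(file4.fetch4, "unordered"),
}
```

In this sketch the context is taken from the frame that performs the lookup, so `outcome` living in yet another module makes no difference: fetch2 only ever sees file3's extensions and fetch4 only file4's.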
What exactly are the semantics of getattr?
Oh gods, I don't know the exact semantics of attribute lookups now! Something like this, I think:
obj.attr (same as getattr(obj, 'attr')):

    if type(obj).__dict__['attr'] exists and is a data descriptor:
        # data descriptors are the highest priority
        return type(obj).__dict__['attr'].__get__(obj, type(obj))

    elif obj.__dict__ exists and obj.__dict__['attr'] exists:
        # followed by instance attributes in the instance dict
        return obj.__dict__['attr']

    elif type(obj) defines __slots__ and there is an 'attr' slot:
        # then instance attributes in slots
        if the slot is filled:
            return contents of slot 'attr'
        else:
            raise AttributeError

    elif type(obj).__dict__['attr'] exists:
        if it is a non-data descriptor:
            return type(obj).__dict__['attr'].__get__(obj, type(obj))
        else:
            return type(obj).__dict__['attr']

    elif type(obj) defines a __getattr__ method:
        return type(obj).__getattr__(obj, 'attr')

    else:
        # search the superclass hierarchy
        ...
        # if we get all the way to the end
        raise AttributeError
I've left out `__getattribute__`; I *think* that gets called right at the beginning. Also the result of calling `__getattr__` is checked for descriptor protocol too. And the lookups on classes are slightly different. Also when looking up on classes, metaclasses may get involved. And super() defines its own `__getattribute__` to customize the lookups. (As other objects may do too.) And some of the fine details may be wrong.
But, overall, the "big picture" should be more or less correct:
1. check for data descriptors;
2. check for instance attributes (dict or slot);
3. check for non-data descriptors and class attributes;
4. call __getattr__ if it exists;
5. search the inheritance hierarchy;
6. raise AttributeError if none of the earlier steps matched.
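The priorities can be observed directly in current CPython. The classes below (`DataDesc`, `NonDataDesc`, `Base`, `Demo`) are invented for the demonstration. One of the fine details it shows: as far as CPython goes, the base-class search is folded into the class-level lookups, so an inherited attribute is found before `__getattr__` is consulted.

```python
class DataDesc:
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only in this demo")


class NonDataDesc:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class Base:
    inherited = "base class attribute"


class Demo(Base):
    a = DataDesc()
    b = NonDataDesc()
    c = "class attribute"

    def __getattr__(self, name):
        # Only reached when the normal lookup fails.
        return "__getattr__ fallback"


obj = Demo()
# Write straight into the instance dict to bypass DataDesc.__set__.
obj.__dict__.update(a="instance", b="instance", c="instance")

results = {
    "a": obj.a,                  # data descriptor beats the instance dict
    "b": obj.b,                  # instance dict beats a non-data descriptor
    "c": obj.c,                  # instance dict beats a plain class attribute
    "inherited": obj.inherited,  # found on the base class, not via __getattr__
    "missing": obj.missing,      # nothing else matched, so __getattr__ runs
}
```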
If we follow C# semantics, extension methods would be checked after step 4 and before step 5:
    if the execution context is using extensions for this class
    and 'attr' is an extension method:
        return that method
Please explain exactly what the semantics of getattr are, and exactly which modules it is supposed to be able to see. Remember, it is not a compiler construct or an operator. It is a function, and it lives in its own module (the builtins).
You seem to think that getattr being a function makes a difference. Why?
Aside from the possibility that it might be shadowed or deleted from builtins, can you give me any examples where `obj.attr` and `getattr(obj, 'attr')` behave differently? Even *one* example?
Okay, this is Python. You could write a class with a `__getattr__` or `__getattribute__` method that inspected the call chain and did something different if it spotted a function called "getattr". Congratulations, you are very smart and Python is very dynamic.
You might even write a __getattr__ that, oh, I don't know, returned a method if the execution context had opted in to a system that provided extra methods to your class. But I digress.
But apart from custom-made classes that deliberately play silly buggers if they see that getattr is involved, can you give an example of where it behaves differently to dot syntax?
Not a rhetorical question: is that how it works in something like Swift, or Kotlin?
I have no idea. I'm just asking how you intend it to work in Python. If you want to cite other languages, go ahead, but I'm not assuming that they already have the solution, because they are different languages. Also not a rhetorical question: Is their getattr equivalent actually an operator or compiler construct, rather than being a regular function? Because if it is, then the entire problem doesn't exist.
I really don't know why you think getattr being a function makes any difference here. It's a builtin function, written in C, and can and does call the same internal C routines used by dot notation.
And what about this?
    f = functools.partial(getattr, stuff)
    f("in_order")
NOW which extension methods should apply? Those registered here? Those registered in the builtins? Those registered in functools?
partial is just a wrapper around its function argument, so that should behave *exactly* the same as `getattr(stuff, 'in_order')`.
So if it behaves exactly the same way that getattr would, then is it exactly the same as fetch2 and fetch4? If not, how is it different?
Okay, let's look at the partial object:
    >>> import functools
    >>> f = functools.partial(getattr, [10, 20])
    >>> f('index')(20)
    1
Partial objects like f don't seem to have anything like a __globals__ attribute that would allow me to tell what the execution context would be. I *think* that for Python functions (def or lambda) they just inherit the execution context from the function. For builtins, I'm not sure. I presume their execution context will be the current scope.
Right now, I've already spent multiple hours on these posts, and I have more important things to do now than argue about the minutiae of partial's behaviour. But if you wanted to do an experiment, you could do something like comparing the behaviour of:
    # module A.py
    from functools import partial
    f = lambda: globals()
    g = partial(globals)

    # module B.py
    from A import f, g
    f()
    g()
and see whether f and g behave identically. I expect that f would return A's globals regardless of where it was called from, but I'm not sure what g would do. It might very well return the globals of the calling site.
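That experiment fits in a single script if the two modules are built in memory instead of as A.py and B.py. This is a sketch under that assumption; the names `mod_a`, `mod_b`, `f_result` and `g_result` are made up. In CPython, the builtin `globals()` reports the globals of the innermost Python frame, and a partial object is implemented in C and adds no Python frame of its own, so g is expected to see the calling site's globals.

```python
import types

# In-memory stand-in for module A.
mod_a = types.ModuleType("A")
exec(
    "from functools import partial\n"
    "f = lambda: globals()\n"
    "g = partial(globals)\n",
    mod_a.__dict__,
)

# In-memory stand-in for module B.
mod_b = types.ModuleType("B")
mod_b.f = mod_a.f  # plays the role of: from A import f, g
mod_b.g = mod_a.g
exec(
    "f_result = f()\n"
    "g_result = g()\n",
    mod_b.__dict__,
)

# f is a Python function, so it carries A's globals with it.
f_sees_a = mod_b.f_result is mod_a.__dict__
# g wraps a builtin, so it reports the globals of the code that called it.
g_sees_b = mod_b.g_result is mod_b.__dict__

print("f() returned A's globals:", f_sees_a)
print("g() returned B's globals:", g_sees_b)
```

So f and g do not behave identically: the lambda is tied to the module that defined it, while the partial around a builtin follows whatever Python code is running when it is called.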
In any case, with respect to getattr, the principle would be the same: the execution context defines whether the partial object sees the extension methods or not. If the execution context is A, and A has opted in to use extension methods, then it will see extension methods. If the context is B, and B hasn't opted in, then it won't.
What about other functions implemented in C? If I write a C module that calls PyObject_GetAttr, does it behave as if dot notation were used in the module that called me, or does it use my module's extension methods?
That depends. If you write a C module that calls PyObject_GetAttr right now, is that *exactly* the same as dot notation in pure-Python code?
The documentation is terse:
but if it is correct that it is precisely equivalent to dot syntax, then the same rules will apply. Has the current module opted in? If so, then does the class have an extension method of the requested name?
Same applies to code objects evaluated without a function, or whatever other exotic corner cases you think of. Whatever you think of, the answer will always be the same:
- if the execution context is a module that has opted to use extension methods, then attribute access will see extension methods;
- if not, then it won't.
If you think of a scenario where you are executing code where there is no module scope at all, and all global lookups fail, then "no module" cannot opt in to use extension methods and so the code won't see them.
If you can think of a scenario where you are executing code where there are multiple module scopes that fight for supremacy using their two weapons of fear, surprise and a fanatical devotion to the Pope, then the winner will determine the result.