the error that raises an AttributeError should be passed to __getattr__

Hi everyone,

A while back I had a conversation with some folks over on python-list. I was having issues implementing error handling of `AttributeError`s using `__getattr__`.

My problem is that it is currently impossible for a `__getattr__` in Python to know which method raised the `AttributeError` that was caught by `__getattr__` if there are nested methods. For example, we cannot tell the difference between `A.x` not existing (which would raise an AttributeError) and some attribute inside `A.x` not existing (which also raises an AttributeError). This is evident from the stack trace that gets printed to screen, but `__getattr__` doesn't get that stack trace.

I propose that the error that triggers an `AttributeError` should get passed to `__getattr__` (if `__getattr__` exists of course). Then, when handling errors, users could dig into the problematic error if they so desire.

What do you think?

Best,
Jason

On Mon, Jun 19, 2017 at 04:06:56PM -0500, Jason Maldonis wrote:
I didn't understand what you were talking about here at first. If you write something like A.x.y where y doesn't exist, it's A.x.__getattr__ that is called, not A.__getattr__.

But I went and looked at the thread in Python-Ideas and discovered that you're talking about the case where A.x is a descriptor, not an ordinary attribute, and the descriptor leaks AttributeError.

Apparently you heavily use properties, and __getattr__, and find that the two don't interact well together when the property getters and setters themselves raise AttributeError. I think that's relevant information that helps explain the problem you are hoping to fix.

So I *think* this demonstrates the problem:

    class A(object):
        eggs = "text"
        def __getattr__(self, name):
            if name == 'cheese':
                return "cheddar"
            raise AttributeError('%s missing' % name)
        @property
        def spam(self):
            return self.eggs.uper()  # Oops.

    a = A()
    a.spam

Which gives us

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 6, in __getattr__
    AttributeError: spam missing

But you go on to say that:
I can't reproduce that! As you can see from the above, the stack trace doesn't say anything about the actual missing attribute 'uper'. So I must admit I don't actually understand the problem you are hoping to solve. It seems to be different from my understanding of it.
What precisely will be passed to __getattr__? The exception instance? The full traceback object? The name of the missing attribute? Something else? It is hard to really judge this proposal without more detail.

I think the most natural thing to pass would be the exception instance, but AttributeError instances don't record the missing attribute name directly (as far as I can tell). Given:

    try:
        ''.foo
    except AttributeError as e:
        print(e.???)

there's nothing in e we can inspect to get the name of the missing attribute, 'foo'. (As far as I can see.) We must parse the error message itself, which we really shouldn't do, because the error message is not part of the exception API and could change at any time.

So... what precisely should be passed to __getattr__, and what exactly are you going to do with it?

Having said that, there's another problem: adding this feature (whatever it actually is) to __getattr__ will break every existing class that uses __getattr__. The problem is that everyone who writes a __getattr__ method writes it like this:

    def __getattr__(self, name):

not:

    def __getattr__(self, name, error):

so the class will break when the method receives two arguments (excluding self) but only has one parameter.

*If* we go down this track, it would probably require a __future__ import for at least one release, probably more:

- in 3.7, use `from __future__ import extra_getattr_argument`
- in 3.8, deprecate the single-argument form of __getattr__
- in 3.9 or 4.0 no longer require the __future__ import.

That's a fairly long and heavy process, and will be quite annoying to those writing cross-version code using __getattr__, but it can be done. But only if it actually helps solve the problem. I'm not convinced that it does.

It comes down to the question of what this second argument is, and how do you expect to use it?

-- Steve
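A note for readers on newer interpreters: since Python 3.10, AttributeError instances do carry `name` and `obj` attributes, so the `e.???` question above has an answer there. This did not exist when the thread was written. A minimal sketch follows; the helper name `missing_attribute_name` is made up for illustration, and it falls back to None on older versions:

```python
def missing_attribute_name(exc):
    """Return the attribute name recorded on an AttributeError, or None.

    On Python 3.10+ the interpreter fills in exc.name for failed lookups.
    On older versions the attribute does not exist, hence the getattr default.
    """
    return getattr(exc, "name", None)


try:
    "".foo
except AttributeError as e:
    recorded = missing_attribute_name(e)  # the looked-up name on 3.10+, else None
```

An AttributeError raised by hand with only a message, such as `AttributeError('spam missing')`, leaves `name` unset (None) even on 3.10+, so code that relies on it still needs a fallback.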

On Tue, Jun 20, 2017 at 10:18 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I'm quoting Steven's post, but I'm addressing the OP.

One good solution to this is a "guard point" around your property functions.

    import functools

    def noleak(*exc):
        def deco(func):
            @functools.wraps(func)
            def wrapper(*a, **kw):
                try:
                    return func(*a, **kw)
                except exc:
                    raise RuntimeError
            return wrapper
        return deco

        @property
        @noleak(AttributeError)
        def spam(self):
            return self.eggs.uper()

In fact, you could make this into a self-wrapping system if you like:

    def property(func, *, _=property):
        return _(noleak(AttributeError)(func))

Now, all your @property functions will be guarded: any AttributeErrors they raise will actually bubble as RuntimeErrors instead.

Making this work with setters and deleters is left as an exercise for the reader.

ChrisA
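One possible answer to the exercise, as a hedged sketch rather than anything proposed in the thread: subclass `property` so that the getter, setter and deleter are all passed through the guard. The name `guarded_property` is hypothetical, and this variant of `noleak` also chains the original exception with `raise ... from` so the real cause stays visible in the traceback.

```python
import functools


def noleak(*exc):
    """Like the decorator above, but keeps the original error as __cause__."""
    def deco(func):
        @functools.wraps(func)
        def wrapper(*a, **kw):
            try:
                return func(*a, **kw)
            except exc as e:
                raise RuntimeError(
                    "%s leaked from %s" % (type(e).__name__, func.__name__)
                ) from e
        return wrapper
    return deco


class guarded_property(property):
    """Hypothetical property subclass that guards fget, fset and fdel."""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        guard = noleak(AttributeError)

        def wrap(f):
            return None if f is None else guard(f)

        # .setter()/.deleter() rebuild the property through this __init__,
        # so already-guarded functions get wrapped again; that is harmless
        # because the inner wrapper only ever lets RuntimeError out.
        super().__init__(wrap(fget), wrap(fset), wrap(fdel), doc)


class A:
    eggs = "text"

    def __getattr__(self, name):
        if name == 'cheese':
            return "cheddar"
        raise AttributeError('%s missing' % name)

    @guarded_property
    def spam(self):
        return self.eggs.uper()  # Oops.

    @spam.setter
    def spam(self, value):
        self.eggs = value.strip()  # AttributeError if value is not a str
```

With this in place, `A().spam` raises RuntimeError whose `__cause__` is the AttributeError about `uper`, and `__getattr__` is never consulted for `spam`.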

On Tue, Jun 20, 2017 at 11:31:34AM +1000, Chris Angelico wrote:
You've still got to write it in the first place. That's a pain, especially since (1) it doesn't do you any good before 3.7 if not later, and (2) even if this error parameter is useful (which is yet to be established), it's a pretty specialised use. Most of the time, you already know the name that failed (it's the one being looked up).

Perhaps a better approach is to prevent descriptors from leaking AttributeError in the first place? Change the protocol so that if descriptor.__get__ raises AttributeError, it is caught and re-raised as RuntimeError, similar to StopIteration and generators.

Or maybe we decide that it's actually a feature, not a problem, for an AttributeError inside self.attr.__get__ to look like self.attr is missing. I don't know.

(Also everything I said applies to __setattr__ and __delattr__ as well.)

-- Steve
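The protocol change floated above can be approximated today on an opt-in basis. The following is only a sketch under stated assumptions: the base class name `NoLeakProperties` is invented, it handles `property` objects only (not arbitrary descriptors), and it is not how CPython behaves or was proposed to behave in detail.

```python
class NoLeakProperties:
    """Opt-in base class: AttributeError escaping a property getter
    is re-raised as RuntimeError instead of falling through to __getattr__."""

    def __getattribute__(self, name):
        cls = type(self)
        # Find the first class in the MRO that defines this name.
        for klass in cls.__mro__:
            if name in vars(klass):
                attr = vars(klass)[name]
                if isinstance(attr, property):
                    try:
                        return attr.__get__(self, cls)
                    except AttributeError as e:
                        raise RuntimeError(
                            "AttributeError leaked from property %r" % name
                        ) from e
                break
        # Everything else uses the normal lookup rules.
        return object.__getattribute__(self, name)


class A(NoLeakProperties):
    eggs = "text"

    def __getattr__(self, name):
        if name == 'cheese':
            return "cheddar"
        raise AttributeError('%s missing' % name)

    @property
    def spam(self):
        return self.eggs.uper()  # Oops.
```

Here `A().spam` raises RuntimeError, while `A().cheese` still reaches `__getattr__` because that lookup fails in the ordinary way.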

On 06/19/2017 07:44 PM, Steven D'Aprano wrote:
On Mon, Jun 19, 2017 at 07:36:09PM -0700, Ethan Furman wrote:
On 06/19/2017 07:26 PM, Steven D'Aprano wrote:
value and name are attributes of every Enum member; to be specific, they are descriptors, and so live in the class namespace. Enum members also live in the class namespace, so how do you get both a member name value and the value descriptor to both live in the class namespace?

Easy. ;) Have the "value" and "name" descriptors check to see if they are being called on an instance, or on the class. If called on an instance they behave normally, returning the "name" or "value" data; but if called on the class the descriptor raises AttributeError, which causes Python to try the Enum class' __getattr__ method, which can find the member and return it... or raise AttributeError again if there is no "name" or "value" member.

Here's the docstring from the types.DynamicClassAttribute in question:

    class DynamicClassAttribute:
        """Route attribute access on a class to __getattr__.

        This is a descriptor, used to define attributes that act differently
        when accessed through an instance and through a class. Instance access
        remains normal, but access to an attribute through a class will be
        routed to the class's __getattr__ method; this is done by raising
        AttributeError.

        This allows one to have properties active on an instance, and have
        virtual attributes on the class with the same name (see Enum for an
        example).
        """

-- ~Ethan~
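To make the mechanism described above concrete, here is a small sketch using `types.DynamicClassAttribute` outside of Enum. The `Meta` and `Color` classes and the `_virtual` mapping are invented for illustration; only `types.DynamicClassAttribute` itself comes from the standard library.

```python
import types


class Meta(type):
    def __getattr__(cls, name):
        # Reached only when the normal lookup on the class raised
        # AttributeError, e.g. from DynamicClassAttribute.__get__.
        try:
            return cls._virtual[name]
        except KeyError:
            raise AttributeError(name) from None


class Color(metaclass=Meta):
    # Hypothetical class-level "virtual attributes".
    _virtual = {"name": "class-level name"}

    def __init__(self, label):
        self._label = label

    @types.DynamicClassAttribute
    def name(self):
        # Instance access behaves like an ordinary property.
        return self._label


instance_name = Color("red").name  # goes through the descriptor's getter
class_name = Color.name            # descriptor raises, Meta.__getattr__ answers
```

This is exactly the kind of deliberate AttributeError that a blanket "re-raise as RuntimeError" rule would break.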

On Tue, Jun 20, 2017 at 12:26 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Gotcha, yep. I was just confused by your two-parter that made it look like it would be hard (or impossible) to write code that would work on both 3.6 and the new protocol.
This can't be done globally, because that's how a descriptor can be made conditional (it raises AttributeError to say "this attribute does not, in fact, exist"). But it's easy enough - and safe enough - to do it just for your own module, where you know in advance that any AttributeError is a leak.

The way generators and StopIteration interact was more easily fixed, because generators have two legitimate ways to emit data (yield and return), but there's no easy way for a magic method to say "I don't have anything to return" other than an exception.

Well, that's not strictly true. In JavaScript, they don't raise StopIteration from iterators - they always return a pair of values ("done" and the actual value, where "done" is either false for a yield or true for a StopIteration). That complicates the normal case but it does make the unusual case a bit easier. Also, it's utterly and fundamentally incompatible with the current system, so it'd have to be a brand new competing protocol.

ChrisA
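For comparison, the JavaScript-style "pair of values" protocol mentioned above can be mimicked on top of Python iterators. This is only an illustration; the helper name `next_pair` is made up, and nothing like it exists in the standard library.

```python
def next_pair(iterator):
    """Return (done, value) instead of raising StopIteration."""
    try:
        return (False, next(iterator))
    except StopIteration as stop:
        # For generators, stop.value is the generator's return value.
        return (True, stop.value)


def gen():
    yield 1
    return "end"


g = gen()
first = next_pair(g)   # (False, 1)
second = next_pair(g)  # (True, "end")
```

Every caller now has to unpack and check `done` even in the common case, which is the cost alluded to above.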

First, I apologize for the poor post. Your corrections were exactly correct: This is only relevant in the context of properties/descriptors, and the property swallows the error message and it isn't printed to screen. I should not be typing without testing.

passed into __getattr__, but I don't have a strong opinion on that. I'll assume that's true for the rest of this post, however.

To clarify my mistakes in my first post, your example illustrates what I wanted to show:

    class A(object):
        eggs = "text"
        def __getattr__(self, name):
            if name == 'cheese':
                return "cheddar"
            raise AttributeError('%s missing' % name)
        @property
        def spam(self):
            return self.eggs.uper()  # Oops.

    a = A()
    a.spam

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 6, in __getattr__
    AttributeError: spam missing

This swallows the AttributeError from `eggs.uper()` and it isn't available.

Even if it were available, I see your point that it may not be especially useful. In one iteration of my code I was looking through the stack trace using the traceback module to find the error I wanted, but I quickly decided that was a bad idea because I couldn't reliably find the error. However, with the full error, it would (I think) be trivial to find the relevant error in the stack trace. With the full stack trace, I would hope that you could properly do any error handling you wanted.

However, if the error was available in __getattr__, we could at least `raise from` so that the error isn't completely swallowed. I.e. your example would be slightly modified like this:

    class A(object):
        eggs = "text"
        def __getattr__(self, name, error):
            if name == 'cheese':
                return "cheddar"
            raise AttributeError('%s missing' % name) from error
        ...

which I think is useful.

this non-backwards-compatible-change just isn't worth it.
Maybe if there are multiple small updates to error handling it could be worth it (which I believe I read was something the devs care quite a bit about atm), but I don't think that this change is a huge deal.

that I am simply "renaming" the error from one I can't handle (due to name conflicts) to one I can. But the modification is just a name change -- if I can handle the RuntimeError correctly, I feel like I should have just been able to handle the original AttributeError correctly (because in practice they should be raising an error in response to the exact same problem).

That said, your decorator works great and gives me the functionality I needed.
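The `raise ... from error` idea in this message can be emulated without any change to the language, at the cost of overriding `__getattribute__`. What follows is a sketch only: the base class `ErrorAwareGetattr` and the hook name `_getattr_with_error` are hypothetical stand-ins for the proposed two-argument `__getattr__`, and classes using it should not also define a regular `__getattr__`.

```python
class ErrorAwareGetattr:
    """Calls a two-argument fallback hook with the original AttributeError."""

    def __getattribute__(self, name):
        try:
            return super().__getattribute__(name)
        except AttributeError as error:
            # Hand the triggering exception to the hook, as proposed.
            return type(self)._getattr_with_error(self, name, error)

    def _getattr_with_error(self, name, error):
        # Default hook: re-raise, keeping the original error as the cause.
        raise AttributeError('%s missing' % name) from error


class A(ErrorAwareGetattr):
    eggs = "text"

    def _getattr_with_error(self, name, error):
        if name == 'cheese':
            return "cheddar"
        raise AttributeError('%s missing' % name) from error

    @property
    def spam(self):
        return self.eggs.uper()  # Oops.
```

Accessing `A().spam` still raises "spam missing", but the traceback now also shows the underlying error about `uper` as its direct cause.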
On Mon, Jun 19, 2017 at 10:10 PM, Chris Angelico <rosuav@gmail.com> wrote:

participants (4)
- Chris Angelico
- Ethan Furman
- Jason Maldonis
- Steven D'Aprano