Use `isinstance` check during exception handling

Hi all, TL;DR: Change exception checking logic from err.__class__ in exc.__mro__ to isinstance(err, exc) Because it is more powerful, more consistent and allows for cleaner code. The reason I am proposing this change is that quite often, one has to deal with libraries behaving 'badly' with regard to exception granularity. I will take the Python base library before PEP3151 [1] as an example. It is not hard to imagine there are a lot of other cases of libraries that behave with little granularity in the exception classes. try: some(operation) except OSError as err: if err.errno == errno.ENOENT: # handle file-not-exists elif err.errno == errno.EACCES: # handle permission stuff. else: raise After PEP3151 one can now write: try: some(operation) except FileNotFoundError: # handle file-not-exists except PermissionError: # handle permission stuff. Now, PEP3151 took over a year to be resolved, judging by the dates. If we have to go through changes like this more often, one will have to wait quite a while. Also, consider backwards compatibility issues. Code using FileNotFoundError will only run on Python3.3+. Thus I propose to have the exception handling code use the normal `isinstance` machinery, as it is more powerful, and allows one to use virtual base classes to check if the instance of the exception is relevant to catch. An example: import errno # Fake `FileNotFoundError` in Python3.2 class ENOENTCheck(type): def __instancecheck__(cls, inst): return isinstance(inst, EnvironmentError) and \ inst.errno == errno.ENOENT class FileNotFoundError(EnvironmentError, metaclass=ENOENTCheck): pass try: open("no-such-file") except Exception as e: print(isinstance(e, FileNotFoundError)) This would print `True`. But, using `FileNotFoundError` as a class to catch will not work, because the `__mro__` is checked instead. I proposed the change on the Python issue tracker [2] already, but it was (rightfully, I think) decided that it needs some more discussion because it will bring a (potentially huge) semantic change. As an opening, here are some pros and cons I can think of: Cons: - *Code already written might break*: Highly unlikely. If you have written code that overrides `isinstance` logic, you'd more than likely not write code that depends on there being a difference in matching logic between the exception handling and normal `isinstance` calls. - *Performance will go down*: Yes, instance-checking is probably a lot more expensive then subclass-checking. I haven't made the change and benchmarked it yet, but based on Martin Panter's comment [3], it might only cost a lot when actually checking against a virtual base class. If cost is limited to those cases, I think the cost is negligible. - *Exception handling like this is just magic*: Yes, anything regarding the metaclasses might be considered as magic. But metaclasses also allow you to build great abstractions, like for instance the ORM included with Django. I think this is one of the cases where the benefits outweigh the cost. Pros: - *Cleaner code*: The same benefit PEP 3151 brought for the base library and OS errors can be extended to all the Python libraries that lack granularity. - *Consistency*: Who would ever expect that you can not always catch an exception `foo` using class `Bar` when `isinstance(foo, Bar)`. Even though I can currently think of more downsides than upsides, I do think that the downsides are really minor, while the upsides are huge: clean code and consistency. Also, I set up a proof-of-concept PEP3151 compatibility layer for Python2.7 that sort-of fakes this feature. It fakes it by checking the instance `sys.exc_info()[1]` in the `__subclasscheck__`. For the proof of concept see [4]. I only found out during porting to Python3 that it does not actually work, because of the direct checking of the `__mro__`, but that's hopefully going to be fixed soon-ish [5]. I'd be interested to hear your thoughts about this. Any more pros and cons would be welcome, as well as your judgement as to whether the breaking of backwards compatibility is potentially a bad idea. Kind regards, Sjoerd Job Postmus [1] PEP 3151 -- Reworking the OS and IO exception hierarchy https://www.python.org/dev/peps/pep-3151/ [2] Call `isinstance` instead of `issubclass` during exception handling http://bugs.python.org/issue25537 [3] http://bugs.python.org/issue25537#msg253963 [4] https://github.com/sjoerdjob/exhacktion/ [5] Catching virtual subclasses in except clauses http://bugs.python.org/issue12029

On 11/4/2015 5:02 PM, sjoerdjob@sjec.nl wrote:
TL;DR: Change exception checking logic from err.__class__ in exc.__mro__
This is backwards. err.__class__ must be a (true) subclass, not a superclass, of exc, where err = exception instance, exc = exception class. There are two checks in the exception process. First, for 'raise err', is that err is a instance of StopIteration, which is to say, err.__class__ is a subclass of StopIteration. To avoid the fakery of __instancecheck__ and __subclasscheck__, I presume this is implemented as something like one of StopIteration in err.__class__.__mro__ StopIteration in type(err).__mro__ (There was a recent pydev discussion about when and why these may diverge.) The second, for 'except exc', is that exc is a superclass (other than object) of type(err). I presume this is implemented as something like exc in type(err)[:-1] # exclude object class This assume that err has already passed the first check above. The `try` doc says "An object is compatible with an exception if it is the class or a base class of the exception object". This text fails to exclude the object class as a compatible object. CPython code rejects 'except object:' with "TypeError: catching classes that do not inherit from BaseException is not allowed"
?? This predicate is True more often than the current one because it a) does not exclude exc = object
isinstance(Exception(), object) True
and b) accesses exc.__instancecheck__ (which is part of your point). This makes it a weaker predicate. I don't have strong opinions at the moment about the proposal itself, but semantic changes in core python syntax need a PEP (once you think there is *some* support). -- Terry Jan Reedy

Hi Sjoerd, I'm not clear on what problem this suggestion is solving. PEP 3151 was solving the problem of existing code failing to be specific with IOError and OSError error numbers. Can you provide some less abstract snippets as examples, including surrounding context or a link to a project that has the problem? If I saw `except Exception as e` followed by `isinstance(e, SomeError)` I would be very confused as to why `except SomeError` was not better. Is this simply for backporting FileNotFoundError? Michael On Wed, Nov 4, 2015 at 8:39 PM Terry Reedy <tjreedy@udel.edu> wrote:

After reading https://github.com/sjoerdjob/exhacktion/ I have a better understanding of what you want to accomplish. However, I'm still not clear on where the pain is coming from. If you find yourself writing an if in an except many times as you described: try: os.chown("README.md", 100, 100) except EnvironmentError as e: if e.errno == errno.ENOENT: pass else: raise Would using a context manager solve your problem? class IgnoreNotFound: def __enter__(self): return self def __exit__(self, etyp, e, etb): if issubclass(etyp, EnvironmentError) and e.message == errno.ENOENT: return True with IgnoreNotFound: os.chown("README.md", 100, 100) On Wed, Nov 4, 2015 at 9:42 PM Michael Selik <mike@selik.org> wrote:

Michael Selik <mike@selik.org> writes:
You know, it occurs to me, there are some languages that support this as a high-level feature. I.e. in Visual Basic Try [Statements] Catch ex As Exception When [Boolean-valued expression] [Statements] End Try This is in fact the *only* way that MS Windows "Structured Exception Handling" exceptions work in C: __try { ... } __except([BOOL-valued expression]) { ... } Both of these have the effect that the exception "was never caught" if it returns false.

Ah, a minor glitch when writing the e-mail. My apologies. It should be exc in err.__class__.__mro__
I can't actually find the logic which handles the `StopIteration`, unless you are talking specifically about the exception handling in for-loops and everything else that relates to iterables. (The code I'm looking at is Python/errors.c:PyErr_GivenExceptionMatches).
From what I can find (again: errors.c), the current logic is:
(note: `err` is what is raised, `exc` is what's matched against) * if `exc` is a tuple, recurse over the elements of the tuple. * if the `err` is an instance, replace it with its class * if both `err` and `exc` are subclasses of exception: check PyType_IsSubtype(err, exc) which basically is the same as (see Objects/typeobject.c) exc in err.__mro__ note: not err.__mro__[:-1].
My idea was basically to still demand that both `err` and `exc` must sub-class `BaseException`, but that only the check for `PyType_IsSubtype` would be replaced with `PyObject_IsInstance`.
Yes, the predicate is weaker. And that gives us more power.
It's not a syntax-change, but a semantics-change. Still I would not be surprised if it indeed does need a PEP, but there will only be a point in doing that when there is *some* support. Before writing that PEP, I was checking here to see if I could find some support, or if it the pros don't outweigh the cons for the 'masses'.
-- Terry Jan Reedy
Regards, Sjoerd Job Postmus

I think with good reason. It's impossible to enumerate all possible ways a program can error. Eventually one must fall back to inspecting various error payload information and perhaps even other state. Increasing error granularity is a slippery slope. The shift should be driven by evidence of annoying boilerplate code written in the standard library and major community projects. On Thu, Nov 5, 2015 at 3:13 AM Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
Definitely not! PEP3151 was just meant as an example of an improvement of exception granularity taking a long time.

On Thu, Nov 5, 2015 at 7:05 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
I think the point is to future-proof Python against further such incidents, and to allow some measure of control even over third-party libraries. For instance, psycopg2 dumps ProgrammingError on you for lots of all SQL errors, but has a few special cases like IntegrityError; if there's some special case that isn't handled by psycopg2 itself, you have no recourse but to catch something too broad and then manually filter. I do like the idea of being able to refactor exception specialization code: try: ... except BroadError as e: if is_narrower_error(e): # handle NarrowerError else: raise but I don't know whether it's worth complicating the core exception handling system with such a rare case. Also, what happens if you have a bug in your __instancecheck__ method? It's going to be *extremely* hard to track down. How often does this sort of except-level specialization come up? ChrisA

Chris Angelico <rosuav@gmail.com> writes:
The syntax could be this: except BroadError as e if is_narrower_error(e): I think the bytecode *already* evaluates a boolean expression which today happens to always be a comparison of the exception's type. Notably, the exception type itself can apparently be an arbitrary expression, which (I was mildly surprised to discover) has access to sys.exc_info. class NeverThrown(Exception): pass def exfilter(): etype, e = sys.exc_info()[:2] if is_narrower_error(e): return etype else: return NeverThrown try: ... except exfilter() as e: ... This is, of course, a *profoundly* ugly way to do this.

On Thu, Nov 5, 2015 at 8:58 AM, Random832 <random832@fastmail.com> wrote:
That's pretty neat. You could parameterize the condition and allow a library function to handle the ugly part. def conditional(cls, cond): exc_type, exc = sys.exc_info()[:2] if issubclass(exc_type, cls) and cond(exc): return exc_type return NeverThrown try: ... except conditional(ValueError, lambda e: e.args[0] == 42): ...

On Nov 4, 2015, at 14:02, sjoerdjob@sjec.nl wrote:
Correct me if I'm wrong, but doesn't Python allow you to raise a class and handle it without ever creating an instance? If an instance is needed (e.g., because the exception is caught by a handler with an as clause), one is created by calling exc(), but if it's not needed, that doesn't happen. Assuming I'm right, your proposal would mean the instance is _always_ needed, so if you raise a type, exc() is always called. So, for example: class MyException(Exception): def __init__(self, thingy, *a, **kw): super().__init__(*a, **kw) self.thingy = thingy try: raise MyException except MyException: print('whoops') Instead of printing "whoops", it would now (I think) abort with a TypeError about trying to call MyException with the wrong number of arguments. Plus, there's almost certainly code where someone skips creating the instance as an optimization—probably it's usually a useless premature one, but maybe someone has tested and it makes a difference in a real program. One obvious possibility is: if exc is itself a subclass of BaseException, you use the existing subclass check; if not, instead of using the subclass check on its type, you use the new instance check. But you'd have to think that through to make sure it makes sense.

I think the main concern here is not "Would that benefit the corner cases?" - it surely would. What we should be worrying is, "Will this impede the common cases?". I think that it should be possible, for all the (very) common cases, to use what CPython already uses internally. If it matches, I don't see why we'd even bother going through the __subclasscheck__ and __instancecheck__ hooks. IMO, it should only use those (whichever ends up being the prefered choice - I have no strong preference for now either way) if the standard exception lookup fails, and possibly including the fact that BaseException doesn't need to be a direct subclass (e.g. virtual subclasses). Just my 2 cents. -Emanuel

On Nov 5, 2015, at 07:00, sjoerdjob@sjec.nl wrote:
Then what's the point of the code at Python/errors.c#186 that deals with the fact that the given exception (err) can be either an instance or a type? If it must always be an instance, surely those two lines are unnecessary and misleading. In fact, looking into it a bit more: the normal way to check for an exception from C is with PyErr_ExceptionMatches, which calls PyErr_GivenExceptionMatches with the type, not the value, of the exception. So, even if an instance has been created, you don't usually have it inside PyErr_GivenExceptionMatches anyway, so the proposed change wouldn't work without further changes. At the very least you'd need to change PyErr_ExceptionMatches to pass the value instead of the type. But presumably there's a reason it passes the type in the first place (maybe in CPython it's not possible to raise a type without an instance from pure Python code, but it is from C API code?), which means there's probably other code that has to change as well. And that's assuming there's no other code than calls PyErr_GivenExceptionMatches with the value of PyErr_Occurred() (or tstate->curexc_type or any other way of getting an exception type) anywhere in CPython or in any extension module anyone's ever written. Since there's nothing about the docs that implies that this is invalid, and in fact the docs for PyErr_ExceptionMatches and the source code for PyErr_GivenExceptionMatches both strongly imply that passing a type is the normal way to check, I doubt that assumption is true.

I added an assert before the except, asserting that no instance is passed, and surprisingly that assert triggers. Also: in some cases, tstate->cur_value is actually a PyString object instead of an exception subclass (for instance raised by PyErr_FormatV (or something like that, don't have the code in front of me right now). So yes, it does appear that a lot of code might need changing. But it also seems odd to me that exceptions get raised with strings for values instead of exception instances. It might seem troublesome to change these locations to raise actual instances instead of strings, but those might actually just be hidden bugs just the same. Do you think anyone would object if I'd go over the code looking for places where type(tstate->cur_value) != tstate->cur_type and propose patches for these?

On Nov 5, 2015, at 12:18, Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
It would be really nice if someone remembered _why_ the code is this way. It may be a performance issue—if the exception is raised in C code and handled in C code and never displayed, the exception object never gets used, so why waste time (and complicate the reference graph) creating one? Or it may just be a holdover from Python 1.x's string exceptions. Or there may be some other reason neither of us can guess. Without knowing which it is, it's hard to know what to do about it. But meanwhile, even if you patch all of CPython, that doesn't change the fact that there are probably extension libraries out there setting exceptions with a null or string value, or checking the exception by using PyErr_ExceptionOccurred or other means that ignore the value (without going through PyErr_GivenExceptionMatches or anything else you could change), which are perfectly valid things to do according to the docs, and have worked for decades. How are you going to fix all of those? Also, even if it's true that pure Python code can't create an exception without a value, only C code can, the way the language reference is written, that seems to be something specific to CPython, not guaranteed by the language. So, other implementations might be doing something different, and you'd have to fix all of them as well. (That may not be a big deal—are there any other major 3.x implementations besides PyPy, which is still on 3.2?) The best thing I can think of is to add a new API around exception values instead of types (a replacement for ExceptionOccurred that returns the value instead of the type, a replacement for ExceptionMatches that requires an instance instead of taking an instance or type, etc.), change GivenExceptionMatches (and the except clause implementation) to use that API, change SetNone and friends to always create a type, deprecate ExceptionOccurred and friends, change the documentation to make it clear than raising a type in Python is the same as raising Type(), wait a couple versions for the C API deprecation, and only then can you assume that ExceptionMatches always gets an instance (and decide what you do if not), so you can make your proposed change.

On Fri, Nov 6, 2015 at 9:03 AM, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
It would be really nice if someone remembered _why_ the code is this way. It may be a performance issue—if the exception is raised in C code and handled in C code and never displayed, the exception object never gets used, so why waste time (and complicate the reference graph) creating one?
I have some recollection that it's most commonly hit when you're working with StopIteration - an internal iterator raises it, a 'for' loop catches it, and no instance is ever constructed. ChrisA

Andrew Barnert via Python-ideas writes:
Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
No, I'm sure mostly they're not. They're working code that has been there since the early days, perhaps a few from the 2.x days in modules. Python is generally very conservative about changing working code.
I think, yes, though I don't know enough about it to object myself. Code churn of this kind is generally frowned upon, unless you can show that the risk of hidden bugs of the kind you refer to above is substantially greater than the risk that *you* will introduce bugs with your patches.
It would be really nice if someone remembered _why_ the code is this way. It may be a performance issue—
Or it could be bootstrapping -- it may be that "something" can be raised before the exception machinery is fully initialized. (On reflection, that seems unlikely, if raise is available, BaseException probably is too. But you never know. :) I would guess it was discussed, decided, and pronounced in one of the early Python 3 PEPs. (I'm sure that for Python 2 it was purely an issue of backward compatibility.)

After digging more into the code, I concur. It's quite easy to make some changes that break some stuff in this logic.
To all: thank you for at least considering my suggestion, and helping me consider the pros and cons. Even though I myself still think it would be a major benefit to change from `issubclass` (or `exc in err.__mro__`) to `isinstance`, I do now see that the changes needed to the Python codebase are quite substantial. The consideration about third-party C-modules is also something to further think about. As I understand it now, the changes needed---and the possibility of introduced bugs---outweigh the added benefit of such a change, at least for now. (Still, it send me along a nice cursory browse of the Python code base, including the bytecode compiler and the evaluator. The compilation of the bytecode for exception handling is nicely documented. To the author(s) of that code: great job, it's very understandable!)

On Thursday, November 5, 2015 9:57 PM, "sjoerdjob@sjec.nl" <sjoerdjob@sjec.nl> wrote:
Looking at this a bit further, I think there may be something you can do that gets everything you actually wanted, without changes all over the place and serious risks. Basically, it's just two pretty small changes in errors.c: * Change PyErr_GivenExceptionMatches so that if given a value that's an instance of an exception type (PyExceptionInstance_Check(err) is true), instead of just getting its class and then ignoring the value, it does a standard isinstance check on the value. * Change PyErr_ExceptionMatches so that if there is an exception value, and its type matches the exception type, it is passed to PyErr_GivenExceptionMatches; otherwise, the exception type is passed. So, all Python code that raises and handles exceptions will get the benefit of isinstance checks. Any new C code that wants that benefit can get it very easily (only raise with PyErr_SetObject or something that uses it like PyErr_SetFromErrno; only handle with PyErr_GivenExceptionMatches). Any existing C code that skips creating an instance will continue to do so, and to get the same fast-path check used today, except for one extra call to PyExceptionInstance_Check(tstate->curexc_value). Of course it's not just that small patch to errors.c. You also need tests to exercise the new functionality. And a docs patch for the Exceptions section of the C API. And also a docs patch to the Exceptions section in the Reference. But no other code patches. And you'd definitely need to performance-test two things: First, that the extra PyExceptionInstance_Check doesn't slow down fast-path handling (e.g., in an empty for loops). Second, whatever performance tests your original proposal would have needed, to make sure the isinstance check doesn't noticeably slow down pure-Python code that handles a lot of exceptions. I think the only C API docs change needed is to PyErr_ExceptionMatches. However, a note explaining how to make sure to get and/or to bypass the isinstance check as desired might be useful. For the reference docs, I think just this note at the end of the section:
Note: Exceptions raised and handled by Python code use the same mechanism as the isinstance function for determining whether an exception matches. However, exceptions raised or handled by the implementation (or implementation-level extensions) may bypass that and check the type directly.
New in 3.6: In previous versions of Python, exception handling always bypassed the isinstance function and checked the type directly.

Sjoerd, glad we agree, happy to help. In case someone is still interested in making a patch, I'd like to clarify why it makes me nervous. Though Sjoerd led with a TL;DR that summarized his suggestion as a swap of isinstance for issubclass, a better summary would have been: "I prefer more specific exception types". The proposed solution was to enable custom exception types that conditionally insert themselves into the class hierarchy depending on data stored on a raised exception instance. Unfortunately, that solution requires changing an obscure error handling feature. As Sjoerd noted when he sent the first message in this thread, it is possible this would introduce a backwards-incompatibility. Perhaps no one in the standard library or major public projects would write such weird code as to get tripped up by the proposed change, but it's always possible that someone out there is using this feature for an important project. I think we've learned the lesson from the long slog to Python 3 adoption (and how Perl 6 harmed that community, and countless other examples) that backwards compatibility is of utmost importance. The current code pattern for a single try/except is to catch the general exception and have an if-condition in the except block.That's a frequent and readable practice. try: cursor.execute(query) except sqlite3.OperationalError as str(e): if 'syntax error' in str(e): print('Your SQL is bad') If that code or similar code appears many times, a good way to refactor with the current features of Python is to use a context manager. class SqlSyntaxErrorHandler: def __enter__(self): return self def __exit__(self, etyp, einstance, etraceback): if (issubclass(etyp, sqlite3.OperationalError) and 'syntax error' in str(einstance)): print('Your SQL is bad') with SqlSyntaxErrorHandler: cursor.execute(first_query) with SqlSyntaxErrorHandler: cursor.execute(second_query) While context managers may not be a beginner topic, they are typically easier to read and debug than using metaclasses for conditional runtime inheritance. Even if there would be some benefit to the metaclass style, it's not worth the risk of changing current error handling behavior. -- Michael On Fri, Nov 6, 2015, 2:49 AM Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:

On 11/7/2015 2:18 PM, Michael Selik wrote:
I like this example of how to use the error handling ability of __exit__ to set up an 'error handling context. This is much clearer than the examples in the pep. However, I checked and issubclass(None, whatever) does not work. True should be returned when the exception is handled. def __exit__(self, etyp, einstance, etraceback): if (etyp and issubclass(etyp, sqlite3.OperationalError) and 'syntax error' in str(einstance)): print('Your SQL is bad') return True
-- Terry Jan Reedy

Sorry about that. I also forgot to instantiate the class. Here's functioning code... class SqlSyntaxErrorHandler: def __enter__(self): return self def __exit__(self, etyp, einstance, etraceback): if (etyp is not None and issubclass(etyp, sqlite3.OperationalError) and 'syntax error' in str(einstance)): print('Your SQL is bad') return True with SqlSyntaxErrorHandler(): cursor.execute(query) On Sat, Nov 7, 2015 at 4:29 PM Terry Reedy <tjreedy@udel.edu> wrote:

Hi again everybody, So I've had a couple of days extra to think about it, and yes: it might still break some things, but I expect extremely little. After the discussions here, I came to think it would require a lot of changes to the codebase, and decided to let it rest. Last night, it dawned to me that there might be a fix with very little code-changes needed: * add an extra operator `PyCmp_EXC_VAL_MATCH` * in the compiler, change `DUP_TOP` to `DUP_TOP_TWO` just above the generation of the compare. I don't have the exact location, as I'm typing on my phone right now. * create an extra function PyErr_GivenExceptionValueMatches which calls PyObject_isInstance * call that from ceval for the `PyCmp_EXC_VAL_MATCH. * most important: update the docs. A little bit of code duplication, but not that much. As for the scary part. * existing code: Because this would only introduce changes for Python code, existing C-extensions would not be affected. Python code would, but I think the fallout would be very little, as the only noticeable changes are when a class overrides its __instancecheck__, which I believe is too negligible a corner-case. * performance: with the changes already proposed for the moving towards __subclasscheck__ from the current behaviour, there are some optimizations included which should also benefit this proposal. Exact measurements are of course needed. One of the changes I do see needed in Python-code is having the unit-testing libraries change their 'assertRaises' code to reflect the new behavior. I agree this might cause a backward-compatibility issue, however after some thinking about it, I don't think it will affect more than 1 percent of already existing code, probably even way less. I don't have any numbers to support that feeling, though. As far as an important project: if you're depending on obscure features, you should probably be ready to expect changes there. Yes, the current behavior is precisely documented, but in nearly all of the cases the actual implementation and the proposed change agree, unless one did magic. I made the changes I listed above and ran `make; make test`, and got only 1 error due to socket.socket running out of memory or something. I'll re-run the tests on a clean checkout as well, but for now I think it's unrelated. Regards, Sjoerd Job Postmus

I searched cpython/Lib/*.py for except blocks that have an if-statement as the first line. r'except \w+ as\s+\w+:\n\s+if .*:' Not the world's most thorough regex, but it seemed to do the trick. There are very few repeated except/if blocks that would be appropriate for refactoring with any technique. The majority were in two modules -- mailbox.py and pathlib.py -- checking if the errorno was ENOINT and/or ENOTDIR. I suppose those could be improved with FileNotFound and NotADirectoryError, but it doesn't look like we need anything more potent. Is there a different place we should look for refactoring candidates? On Mon, Nov 9, 2015, 12:41 AM <sjoerdjob@sjec.nl> wrote:

On 11/4/2015 5:02 PM, sjoerdjob@sjec.nl wrote:
TL;DR: Change exception checking logic from err.__class__ in exc.__mro__
This is backwards. err.__class__ must be a (true) subclass, not a superclass, of exc, where err = exception instance, exc = exception class. There are two checks in the exception process. First, for 'raise err', is that err is a instance of StopIteration, which is to say, err.__class__ is a subclass of StopIteration. To avoid the fakery of __instancecheck__ and __subclasscheck__, I presume this is implemented as something like one of StopIteration in err.__class__.__mro__ StopIteration in type(err).__mro__ (There was a recent pydev discussion about when and why these may diverge.) The second, for 'except exc', is that exc is a superclass (other than object) of type(err). I presume this is implemented as something like exc in type(err)[:-1] # exclude object class This assume that err has already passed the first check above. The `try` doc says "An object is compatible with an exception if it is the class or a base class of the exception object". This text fails to exclude the object class as a compatible object. CPython code rejects 'except object:' with "TypeError: catching classes that do not inherit from BaseException is not allowed"
?? This predicate is True more often than the current one because it a) does not exclude exc = object
isinstance(Exception(), object) True
and b) accesses exc.__instancecheck__ (which is part of your point). This makes it a weaker predicate. I don't have strong opinions at the moment about the proposal itself, but semantic changes in core python syntax need a PEP (once you think there is *some* support). -- Terry Jan Reedy

Hi Sjoerd, I'm not clear on what problem this suggestion is solving. PEP 3151 was solving the problem of existing code failing to be specific with IOError and OSError error numbers. Can you provide some less abstract snippets as examples, including surrounding context or a link to a project that has the problem? If I saw `except Exception as e` followed by `isinstance(e, SomeError)` I would be very confused as to why `except SomeError` was not better. Is this simply for backporting FileNotFoundError? Michael On Wed, Nov 4, 2015 at 8:39 PM Terry Reedy <tjreedy@udel.edu> wrote:

After reading https://github.com/sjoerdjob/exhacktion/ I have a better understanding of what you want to accomplish. However, I'm still not clear on where the pain is coming from. If you find yourself writing an if in an except many times as you described: try: os.chown("README.md", 100, 100) except EnvironmentError as e: if e.errno == errno.ENOENT: pass else: raise Would using a context manager solve your problem? class IgnoreNotFound: def __enter__(self): return self def __exit__(self, etyp, e, etb): if issubclass(etyp, EnvironmentError) and e.message == errno.ENOENT: return True with IgnoreNotFound: os.chown("README.md", 100, 100) On Wed, Nov 4, 2015 at 9:42 PM Michael Selik <mike@selik.org> wrote:

Michael Selik <mike@selik.org> writes:
You know, it occurs to me, there are some languages that support this as a high-level feature. I.e. in Visual Basic Try [Statements] Catch ex As Exception When [Boolean-valued expression] [Statements] End Try This is in fact the *only* way that MS Windows "Structured Exception Handling" exceptions work in C: __try { ... } __except([BOOL-valued expression]) { ... } Both of these have the effect that the exception "was never caught" if it returns false.

Ah, a minor glitch when writing the e-mail. My apologies. It should be exc in err.__class__.__mro__
I can't actually find the logic which handles the `StopIteration`, unless you are talking specifically about the exception handling in for-loops and everything else that relates to iterables. (The code I'm looking at is Python/errors.c:PyErr_GivenExceptionMatches).
From what I can find (again: errors.c), the current logic is:
(note: `err` is what is raised, `exc` is what's matched against) * if `exc` is a tuple, recurse over the elements of the tuple. * if the `err` is an instance, replace it with its class * if both `err` and `exc` are subclasses of exception: check PyType_IsSubtype(err, exc) which basically is the same as (see Objects/typeobject.c) exc in err.__mro__ note: not err.__mro__[:-1].
My idea was basically to still demand that both `err` and `exc` must sub-class `BaseException`, but that only the check for `PyType_IsSubtype` would be replaced with `PyObject_IsInstance`.
Yes, the predicate is weaker. And that gives us more power.
It's not a syntax-change, but a semantics-change. Still I would not be surprised if it indeed does need a PEP, but there will only be a point in doing that when there is *some* support. Before writing that PEP, I was checking here to see if I could find some support, or if it the pros don't outweigh the cons for the 'masses'.
-- Terry Jan Reedy
Regards, Sjoerd Job Postmus

I think with good reason. It's impossible to enumerate all possible ways a program can error. Eventually one must fall back to inspecting various error payload information and perhaps even other state. Increasing error granularity is a slippery slope. The shift should be driven by evidence of annoying boilerplate code written in the standard library and major community projects. On Thu, Nov 5, 2015 at 3:13 AM Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
Definitely not! PEP3151 was just meant as an example of an improvement of exception granularity taking a long time.

On Thu, Nov 5, 2015 at 7:05 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
I think the point is to future-proof Python against further such incidents, and to allow some measure of control even over third-party libraries. For instance, psycopg2 dumps ProgrammingError on you for lots of all SQL errors, but has a few special cases like IntegrityError; if there's some special case that isn't handled by psycopg2 itself, you have no recourse but to catch something too broad and then manually filter. I do like the idea of being able to refactor exception specialization code: try: ... except BroadError as e: if is_narrower_error(e): # handle NarrowerError else: raise but I don't know whether it's worth complicating the core exception handling system with such a rare case. Also, what happens if you have a bug in your __instancecheck__ method? It's going to be *extremely* hard to track down. How often does this sort of except-level specialization come up? ChrisA

Chris Angelico <rosuav@gmail.com> writes:
The syntax could be this: except BroadError as e if is_narrower_error(e): I think the bytecode *already* evaluates a boolean expression which today happens to always be a comparison of the exception's type. Notably, the exception type itself can apparently be an arbitrary expression, which (I was mildly surprised to discover) has access to sys.exc_info. class NeverThrown(Exception): pass def exfilter(): etype, e = sys.exc_info()[:2] if is_narrower_error(e): return etype else: return NeverThrown try: ... except exfilter() as e: ... This is, of course, a *profoundly* ugly way to do this.

On Thu, Nov 5, 2015 at 8:58 AM, Random832 <random832@fastmail.com> wrote:
That's pretty neat. You could parameterize the condition and allow a library function to handle the ugly part. def conditional(cls, cond): exc_type, exc = sys.exc_info()[:2] if issubclass(exc_type, cls) and cond(exc): return exc_type return NeverThrown try: ... except conditional(ValueError, lambda e: e.args[0] == 42): ...

On Nov 4, 2015, at 14:02, sjoerdjob@sjec.nl wrote:
Correct me if I'm wrong, but doesn't Python allow you to raise a class and handle it without ever creating an instance? If an instance is needed (e.g., because the exception is caught by a handler with an as clause), one is created by calling exc(), but if it's not needed, that doesn't happen. Assuming I'm right, your proposal would mean the instance is _always_ needed, so if you raise a type, exc() is always called. So, for example: class MyException(Exception): def __init__(self, thingy, *a, **kw): super().__init__(*a, **kw) self.thingy = thingy try: raise MyException except MyException: print('whoops') Instead of printing "whoops", it would now (I think) abort with a TypeError about trying to call MyException with the wrong number of arguments. Plus, there's almost certainly code where someone skips creating the instance as an optimization—probably it's usually a useless premature one, but maybe someone has tested and it makes a difference in a real program. One obvious possibility is: if exc is itself a subclass of BaseException, you use the existing subclass check; if not, instead of using the subclass check on its type, you use the new instance check. But you'd have to think that through to make sure it makes sense.

I think the main concern here is not "Would that benefit the corner cases?" - it surely would. What we should be worrying is, "Will this impede the common cases?". I think that it should be possible, for all the (very) common cases, to use what CPython already uses internally. If it matches, I don't see why we'd even bother going through the __subclasscheck__ and __instancecheck__ hooks. IMO, it should only use those (whichever ends up being the prefered choice - I have no strong preference for now either way) if the standard exception lookup fails, and possibly including the fact that BaseException doesn't need to be a direct subclass (e.g. virtual subclasses). Just my 2 cents. -Emanuel

On Nov 5, 2015, at 07:00, sjoerdjob@sjec.nl wrote:
Then what's the point of the code at Python/errors.c#186 that deals with the fact that the given exception (err) can be either an instance or a type? If it must always be an instance, surely those two lines are unnecessary and misleading. In fact, looking into it a bit more: the normal way to check for an exception from C is with PyErr_ExceptionMatches, which calls PyErr_GivenExceptionMatches with the type, not the value, of the exception. So, even if an instance has been created, you don't usually have it inside PyErr_GivenExceptionMatches anyway, so the proposed change wouldn't work without further changes. At the very least you'd need to change PyErr_ExceptionMatches to pass the value instead of the type. But presumably there's a reason it passes the type in the first place (maybe in CPython it's not possible to raise a type without an instance from pure Python code, but it is from C API code?), which means there's probably other code that has to change as well. And that's assuming there's no other code than calls PyErr_GivenExceptionMatches with the value of PyErr_Occurred() (or tstate->curexc_type or any other way of getting an exception type) anywhere in CPython or in any extension module anyone's ever written. Since there's nothing about the docs that implies that this is invalid, and in fact the docs for PyErr_ExceptionMatches and the source code for PyErr_GivenExceptionMatches both strongly imply that passing a type is the normal way to check, I doubt that assumption is true.

I added an assert before the except, asserting that no instance is passed, and surprisingly that assert triggers. Also: in some cases, tstate->cur_value is actually a PyString object instead of an exception subclass (for instance raised by PyErr_FormatV (or something like that, don't have the code in front of me right now). So yes, it does appear that a lot of code might need changing. But it also seems odd to me that exceptions get raised with strings for values instead of exception instances. It might seem troublesome to change these locations to raise actual instances instead of strings, but those might actually just be hidden bugs just the same. Do you think anyone would object if I'd go over the code looking for places where type(tstate->cur_value) != tstate->cur_type and propose patches for these?

On Nov 5, 2015, at 12:18, Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
It would be really nice if someone remembered _why_ the code is this way. It may be a performance issue—if the exception is raised in C code and handled in C code and never displayed, the exception object never gets used, so why waste time (and complicate the reference graph) creating one? Or it may just be a holdover from Python 1.x's string exceptions. Or there may be some other reason neither of us can guess. Without knowing which it is, it's hard to know what to do about it. But meanwhile, even if you patch all of CPython, that doesn't change the fact that there are probably extension libraries out there setting exceptions with a null or string value, or checking the exception by using PyErr_ExceptionOccurred or other means that ignore the value (without going through PyErr_GivenExceptionMatches or anything else you could change), which are perfectly valid things to do according to the docs, and have worked for decades. How are you going to fix all of those? Also, even if it's true that pure Python code can't create an exception without a value, only C code can, the way the language reference is written, that seems to be something specific to CPython, not guaranteed by the language. So, other implementations might be doing something different, and you'd have to fix all of them as well. (That may not be a big deal—are there any other major 3.x implementations besides PyPy, which is still on 3.2?) The best thing I can think of is to add a new API around exception values instead of types (a replacement for ExceptionOccurred that returns the value instead of the type, a replacement for ExceptionMatches that requires an instance instead of taking an instance or type, etc.), change GivenExceptionMatches (and the except clause implementation) to use that API, change SetNone and friends to always create a type, deprecate ExceptionOccurred and friends, change the documentation to make it clear than raising a type in Python is the same as raising Type(), wait a couple versions for the C API deprecation, and only then can you assume that ExceptionMatches always gets an instance (and decide what you do if not), so you can make your proposed change.

On Fri, Nov 6, 2015 at 9:03 AM, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
It would be really nice if someone remembered _why_ the code is this way. It may be a performance issue—if the exception is raised in C code and handled in C code and never displayed, the exception object never gets used, so why waste time (and complicate the reference graph) creating one?
I have some recollection that it's most commonly hit when you're working with StopIteration - an internal iterator raises it, a 'for' loop catches it, and no instance is ever constructed. ChrisA

Andrew Barnert via Python-ideas writes:
Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
No, I'm sure mostly they're not. They're working code that has been there since the early days, perhaps a few from the 2.x days in modules. Python is generally very conservative about changing working code.
I think, yes, though I don't know enough about it to object myself. Code churn of this kind is generally frowned upon, unless you can show that the risk of hidden bugs of the kind you refer to above is substantially greater than the risk that *you* will introduce bugs with your patches.
It would be really nice if someone remembered _why_ the code is this way. It may be a performance issue—
Or it could be bootstrapping -- it may be that "something" can be raised before the exception machinery is fully initialized. (On reflection, that seems unlikely, if raise is available, BaseException probably is too. But you never know. :) I would guess it was discussed, decided, and pronounced in one of the early Python 3 PEPs. (I'm sure that for Python 2 it was purely an issue of backward compatibility.)

After digging more into the code, I concur. It's quite easy to make some changes that break some stuff in this logic.
To all: thank you for at least considering my suggestion, and helping me consider the pros and cons. Even though I myself still think it would be a major benefit to change from `issubclass` (or `exc in err.__mro__`) to `isinstance`, I do now see that the changes needed to the Python codebase are quite substantial. The consideration about third-party C-modules is also something to further think about. As I understand it now, the changes needed---and the possibility of introduced bugs---outweigh the added benefit of such a change, at least for now. (Still, it send me along a nice cursory browse of the Python code base, including the bytecode compiler and the evaluator. The compilation of the bytecode for exception handling is nicely documented. To the author(s) of that code: great job, it's very understandable!)

On Thursday, November 5, 2015 9:57 PM, "sjoerdjob@sjec.nl" <sjoerdjob@sjec.nl> wrote:
Looking at this a bit further, I think there may be something you can do that gets everything you actually wanted, without changes all over the place and serious risks. Basically, it's just two pretty small changes in errors.c: * Change PyErr_GivenExceptionMatches so that if given a value that's an instance of an exception type (PyExceptionInstance_Check(err) is true), instead of just getting its class and then ignoring the value, it does a standard isinstance check on the value. * Change PyErr_ExceptionMatches so that if there is an exception value, and its type matches the exception type, it is passed to PyErr_GivenExceptionMatches; otherwise, the exception type is passed. So, all Python code that raises and handles exceptions will get the benefit of isinstance checks. Any new C code that wants that benefit can get it very easily (only raise with PyErr_SetObject or something that uses it like PyErr_SetFromErrno; only handle with PyErr_GivenExceptionMatches). Any existing C code that skips creating an instance will continue to do so, and to get the same fast-path check used today, except for one extra call to PyExceptionInstance_Check(tstate->curexc_value). Of course it's not just that small patch to errors.c. You also need tests to exercise the new functionality. And a docs patch for the Exceptions section of the C API. And also a docs patch to the Exceptions section in the Reference. But no other code patches. And you'd definitely need to performance-test two things: First, that the extra PyExceptionInstance_Check doesn't slow down fast-path handling (e.g., in an empty for loops). Second, whatever performance tests your original proposal would have needed, to make sure the isinstance check doesn't noticeably slow down pure-Python code that handles a lot of exceptions. I think the only C API docs change needed is to PyErr_ExceptionMatches. However, a note explaining how to make sure to get and/or to bypass the isinstance check as desired might be useful. For the reference docs, I think just this note at the end of the section:
Note: Exceptions raised and handled by Python code use the same mechanism as the isinstance function for determining whether an exception matches. However, exceptions raised or handled by the implementation (or implementation-level extensions) may bypass that and check the type directly.
New in 3.6: In previous versions of Python, exception handling always bypassed the isinstance function and checked the type directly.

Sjoerd, glad we agree, happy to help. In case someone is still interested in making a patch, I'd like to clarify why it makes me nervous. Though Sjoerd led with a TL;DR that summarized his suggestion as a swap of isinstance for issubclass, a better summary would have been: "I prefer more specific exception types". The proposed solution was to enable custom exception types that conditionally insert themselves into the class hierarchy depending on data stored on a raised exception instance. Unfortunately, that solution requires changing an obscure error handling feature. As Sjoerd noted when he sent the first message in this thread, it is possible this would introduce a backwards-incompatibility. Perhaps no one in the standard library or major public projects would write such weird code as to get tripped up by the proposed change, but it's always possible that someone out there is using this feature for an important project. I think we've learned the lesson from the long slog to Python 3 adoption (and how Perl 6 harmed that community, and countless other examples) that backwards compatibility is of utmost importance. The current code pattern for a single try/except is to catch the general exception and have an if-condition in the except block.That's a frequent and readable practice. try: cursor.execute(query) except sqlite3.OperationalError as str(e): if 'syntax error' in str(e): print('Your SQL is bad') If that code or similar code appears many times, a good way to refactor with the current features of Python is to use a context manager. class SqlSyntaxErrorHandler: def __enter__(self): return self def __exit__(self, etyp, einstance, etraceback): if (issubclass(etyp, sqlite3.OperationalError) and 'syntax error' in str(einstance)): print('Your SQL is bad') with SqlSyntaxErrorHandler: cursor.execute(first_query) with SqlSyntaxErrorHandler: cursor.execute(second_query) While context managers may not be a beginner topic, they are typically easier to read and debug than using metaclasses for conditional runtime inheritance. Even if there would be some benefit to the metaclass style, it's not worth the risk of changing current error handling behavior. -- Michael On Fri, Nov 6, 2015, 2:49 AM Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:

On 11/7/2015 2:18 PM, Michael Selik wrote:
I like this example of how to use the error handling ability of __exit__ to set up an 'error handling context. This is much clearer than the examples in the pep. However, I checked and issubclass(None, whatever) does not work. True should be returned when the exception is handled. def __exit__(self, etyp, einstance, etraceback): if (etyp and issubclass(etyp, sqlite3.OperationalError) and 'syntax error' in str(einstance)): print('Your SQL is bad') return True
-- Terry Jan Reedy

Sorry about that. I also forgot to instantiate the class. Here's functioning code... class SqlSyntaxErrorHandler: def __enter__(self): return self def __exit__(self, etyp, einstance, etraceback): if (etyp is not None and issubclass(etyp, sqlite3.OperationalError) and 'syntax error' in str(einstance)): print('Your SQL is bad') return True with SqlSyntaxErrorHandler(): cursor.execute(query) On Sat, Nov 7, 2015 at 4:29 PM Terry Reedy <tjreedy@udel.edu> wrote:

Hi again everybody, So I've had a couple of days extra to think about it, and yes: it might still break some things, but I expect extremely little. After the discussions here, I came to think it would require a lot of changes to the codebase, and decided to let it rest. Last night, it dawned to me that there might be a fix with very little code-changes needed: * add an extra operator `PyCmp_EXC_VAL_MATCH` * in the compiler, change `DUP_TOP` to `DUP_TOP_TWO` just above the generation of the compare. I don't have the exact location, as I'm typing on my phone right now. * create an extra function PyErr_GivenExceptionValueMatches which calls PyObject_isInstance * call that from ceval for the `PyCmp_EXC_VAL_MATCH. * most important: update the docs. A little bit of code duplication, but not that much. As for the scary part. * existing code: Because this would only introduce changes for Python code, existing C-extensions would not be affected. Python code would, but I think the fallout would be very little, as the only noticeable changes are when a class overrides its __instancecheck__, which I believe is too negligible a corner-case. * performance: with the changes already proposed for the moving towards __subclasscheck__ from the current behaviour, there are some optimizations included which should also benefit this proposal. Exact measurements are of course needed. One of the changes I do see needed in Python-code is having the unit-testing libraries change their 'assertRaises' code to reflect the new behavior. I agree this might cause a backward-compatibility issue, however after some thinking about it, I don't think it will affect more than 1 percent of already existing code, probably even way less. I don't have any numbers to support that feeling, though. As far as an important project: if you're depending on obscure features, you should probably be ready to expect changes there. Yes, the current behavior is precisely documented, but in nearly all of the cases the actual implementation and the proposed change agree, unless one did magic. I made the changes I listed above and ran `make; make test`, and got only 1 error due to socket.socket running out of memory or something. I'll re-run the tests on a clean checkout as well, but for now I think it's unrelated. Regards, Sjoerd Job Postmus

I searched cpython/Lib/*.py for except blocks that have an if-statement as the first line. r'except \w+ as\s+\w+:\n\s+if .*:' Not the world's most thorough regex, but it seemed to do the trick. There are very few repeated except/if blocks that would be appropriate for refactoring with any technique. The majority were in two modules -- mailbox.py and pathlib.py -- checking if the errorno was ENOINT and/or ENOTDIR. I suppose those could be improved with FileNotFound and NotADirectoryError, but it doesn't look like we need anything more potent. Is there a different place we should look for refactoring candidates? On Mon, Nov 9, 2015, 12:41 AM <sjoerdjob@sjec.nl> wrote:
participants (11)
-
Andrew Barnert
-
Chris Angelico
-
Emanuel Barry
-
Ian Kelly
-
Michael Selik
-
Random832
-
Serhiy Storchaka
-
Sjoerd Job Postmus
-
sjoerdjob@sjec.nl
-
Stephen J. Turnbull
-
Terry Reedy