#2651 - KeyError does not round trip strings
Hi guys, there's this 2 year old bug about making strings passed to KeyError round trip: http://bugs.python.org/issue2651 There are three things I like you to present opinions on. 0. The moratorium. Based on the old 2.x patch there I created a new one for py3k. It's been reviewed and it was actually quite close to it being committed when Georg reminded us that there's this moratorium situation. So, please -1 that change here or on the issue if you think it should be stopped until the moratorium ends. Georg, Antoine and Michael Foord seem to be +1 on it despite the moratorium. (guys, please correct me if I'm wrong) 1. The patch makes KeyError behave analogically to IOError so that the first arg is now a message and the second is the actual key.
raise KeyError("Key not found", "a Scotsman on a horse") Traceback (most recent call last): ... KeyError: 'Key not found: a Scotsman on a horse'
This is backwards incompatible (which has been addressed for the stdlib in the patch). Now, for non-empty e.args, the key is stored in e.args[-1] whereas it used to in e.args[0]. We could swap the args to make it backwards compatible but then we lose consistency with IOError and the issue on the tracker was originally targetting consistency. 2. Some people suggest adding e.key to KeyError. I like the idea but in my opinion currently it is not implementable in a reliable way. a) if the party raising the exception does not pass any arguments, what would the expected behaviour of e.key be? `None` is a valid key so returning this can be misleading. b) if the party raising the exception passes one argument, how do we know it's a key and not a message? Take for instance "Set is empty" and such. Presenting e.key = "Set is empty" is just wrong. c) if the party raising the exception passes two arguments, we already know which one is the key. So in this case it would work well but at the same time it would be somewhat redundant. Antoine and Michael Foord suggest that we simply do a "best-effort thing" and present `None` if no args are passed and always treat the only argument as a key. It would be consistent with what's IOError is doing at the moment. I'm on the fence here myself. -- Best regards, Łukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/
On Wed, 4 Aug 2010 15:39:16 +0200 Łukasz Langa <lukasz@langa.pl> wrote:
1. The patch makes KeyError behave analogically to IOError so that the first arg is now a message and the second is the actual key.
raise KeyError("Key not found", "a Scotsman on a horse") Traceback (most recent call last): ... KeyError: 'Key not found: a Scotsman on a horse'
I suppose you mean KeyError: Key not found: 'a Scotsman on a horse' ?
This is backwards incompatible (which has been addressed for the stdlib in the patch). Now, for non-empty e.args, the key is stored in e.args[-1] whereas it used to in e.args[0]. We could swap the args to make it backwards compatible but then we lose consistency with IOError and the issue on the tracker was originally targetting consistency.
I don't think consistency with IOError is very important. IOError and KeyError have basically nothing in common.
2. Some people suggest adding e.key to KeyError. I like the idea but in my opinion currently it is not implementable in a reliable way.
a) if the party raising the exception does not pass any arguments, what would the expected behaviour of e.key be? `None` is a valid key so returning this can be misleading.
b) if the party raising the exception passes one argument, how do we know it's a key and not a message? Take for instance "Set is empty" and such. Presenting e.key = "Set is empty" is just wrong.
As per your patch, all builtins will have been converted to the two argument form, though, and arguably they are the most common source of KeyErrors.
c) if the party raising the exception passes two arguments, we already know which one is the key. So in this case it would work well but at the same time it would be somewhat redundant.
What do you mean? You can certainly use e.args[-1] but it's an ugly and highly unintuitive notation. I wish the args stuff could die peacefully. Regards Antoine.
2010/8/4 Łukasz Langa <lukasz@langa.pl>:
1. The patch makes KeyError behave analogically to IOError so that the first arg is now a message and the second is the actual key.
I agree with Antoine; there's no point to this.
2. Some people suggest adding e.key to KeyError. I like the idea but in my opinion currently it is not implementable in a reliable way.
This is interesting and useful. I'd be really happy to see e.key be present if the key is known (because it was specifically provided to the constructor: KeyError(key=...)), or not present if the key isn't known. (The idea is much less interesting if code can't distinguish between the key-is-known and the key-not-known cases.) The runtime and standard library should be adjusted to provide the key whenever possible, of course. Though I doubt this would break anything, we've lived without this long enough that the it doesn't represent a sufficient failing that the moratorium should be broken. It can wait. -Fred -- Fred L. Drake, Jr. <fdrake at gmail.com> "A storm broke loose in my mind." --Albert Einstein
2010/8/5 Fred Drake <fdrake@acm.org>:
2010/8/4 Łukasz Langa <lukasz@langa.pl>:
1. The patch makes KeyError behave analogically to IOError so that the first arg is now a message and the second is the actual key.
I agree with Antoine; there's no point to this.
2. Some people suggest adding e.key to KeyError. I like the idea but in my opinion currently it is not implementable in a reliable way.
This is interesting and useful.
I'd be really happy to see e.key be present if the key is known (because it was specifically provided to the constructor: KeyError(key=...)), or not present if the key isn't known. (The idea is much less interesting if code can't distinguish between the key-is-known and the key-not-known cases.)
The runtime and standard library should be adjusted to provide the key whenever possible, of course.
Though I doubt this would break anything, we've lived without this long enough that the it doesn't represent a sufficient failing that the moratorium should be broken. It can wait.
+1 on what Fred said (i.e. post-moratorium, add a keyword-only "key" argument to KeyError, set "e.key" only if that argument is supplied, update the standard library to supply it and use a default message of "'Key not found: %r' % key" if the key argument is supplied without an explicit message). Also +1 for doing the equivalent with AttributeError and an "attr" keyword only argument. Since a keyword-only approach doesn't actually *break* any current code, I'm only -0 on doing that for 3.2 rather than -1. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, Aug 4, 2010 at 5:57 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
and use a default message of "'Key not found: %r' % key" if the key argument is supplied without an explicit message
I suspect you meant a default message of 'Key not found: %r' % (key,) since `key` might be a 1-tuple. :-) -Fred -- Fred L. Drake, Jr. <fdrake at gmail.com> "A storm broke loose in my mind." --Albert Einstein
On Thu, Aug 5, 2010 at 8:02 AM, Fred Drake <fdrake@acm.org> wrote:
On Wed, Aug 4, 2010 at 5:57 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
and use a default message of "'Key not found: %r' % key" if the key argument is supplied without an explicit message
I suspect you meant a default message of
'Key not found: %r' % (key,)
since `key` might be a 1-tuple. :-)
Gah, you're right. Maybe I should have said "'Key not found: {}'.format(key)" :) Crazy-overloaded-mod-operators'ly, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 8/4/2010 6:09 PM, Nick Coghlan wrote:
On Thu, Aug 5, 2010 at 8:02 AM, Fred Drake<fdrake@acm.org> wrote:
On Wed, Aug 4, 2010 at 5:57 PM, Nick Coghlan<ncoghlan@gmail.com> wrote:
and use a default message of "'Key not found: %r' % key" if the key argument is supplied without an explicit message
I suspect you meant a default message of
'Key not found: %r' % (key,)
since `key` might be a 1-tuple. :-)
Gah, you're right.
Maybe I should have said "'Key not found: {}'.format(key)" :)
'Key not found: {!r}'.format(key) Eric.
Wiadomość napisana przez Nick Coghlan w dniu 2010-08-04, o godz. 23:57:
2010/8/5 Fred Drake <fdrake@acm.org>:
2010/8/4 Łukasz Langa <lukasz@langa.pl>:
1. The patch makes KeyError behave analogically to IOError so that the first arg is now a message and the second is the actual key.
I agree with Antoine; there's no point to this.
So I'm proposing to close the original issue #2651 and not include what's there. (see below though)
2. Some people suggest adding e.key to KeyError. I like the idea but in my opinion currently it is not implementable in a reliable way.
This is interesting and useful.
I'd be really happy to see e.key be present if the key is known (because it was specifically provided to the constructor: KeyError(key=...)), or not present if the key isn't known. (The idea is much less interesting if code can't distinguish between the key-is-known and the key-not-known cases.)
The runtime and standard library should be adjusted to provide the key whenever possible, of course.
Though I doubt this would break anything, we've lived without this long enough that the it doesn't represent a sufficient failing that the moratorium should be broken. It can wait.
+1 on what Fred said (i.e. post-moratorium, add a keyword-only "key" argument to KeyError, set "e.key" only if that argument is supplied, update the standard library to supply it and use a default message of "'Key not found: %r' % key" if the key argument is supplied without an explicit message). Also +1 for doing the equivalent with AttributeError and an "attr" keyword only argument.
Good stuff guys. Shall we do an e.index for IndexErrors as well?
Since a keyword-only approach doesn't actually *break* any current code, I'm only -0 on doing that for 3.2 rather than -1.
I'm -1 because we would alter the standard library to use this functionality which would make it incompatible with other implementations. That is of course unless Guido says we should add it anyway ;) We might create a separate issue superseding #2651 which will be all about presenting sensible named arguments for built-in exceptions. Does this sound like a good plan? -- Best regards, Łukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/
2010/8/4 Łukasz Langa <lukasz@langa.pl>:
Shall we do an e.index for IndexErrors as well?
I don't recall stumbling over that need, but the parallel makes it tempting. I expect is should be a separate patch, though. Antoine's right about using keyword args from C, though. I'd expect a new helper function that handles this specific case, since it's likely to be common. Whether it used a keyword are or just performed a setattr after the exception is created doesn't really matter. -Fred -- Fred L. Drake, Jr. <fdrake at gmail.com> "A storm broke loose in my mind." --Albert Einstein
2010/8/5 Fred Drake <fdrake@acm.org>:
2010/8/4 Łukasz Langa <lukasz@langa.pl>:
Shall we do an e.index for IndexErrors as well?
I don't recall stumbling over that need, but the parallel makes it tempting. I expect is should be a separate patch, though.
Antoine's right about using keyword args from C, though. I'd expect a new helper function that handles this specific case, since it's likely to be common. Whether it used a keyword are or just performed a setattr after the exception is created doesn't really matter.
Yeah, helper functions for C code that accepted the extra arguments would do the trick. I actually had a look to see what IOError does with its attributes and I think it qualifies as being a little on the special side (replicated in both 2.6 and the py3k branch):
io1 = IOError(1) io2 = IOError(1, 2) io3 = IOError(1, 2, 3) io4 = IOError(1, 2, 3, 4) io1, io2, io3, io4 (IOError(1,), IOError(1, 2), IOError(1, 2), IOError(1, 2, 3, 4)) io1.errno, io2.errno, io3.errno, io4.errno (None, 1, 1, None) io1.strerror, io2.strerror, io3.strerror, io4.strerror (None, 2, 2, None) io1.filename, io2.filename, io3.filename, io4.filename (None, None, 3, None)
One argument = no attributes set Two arguments = errno, strerror set (including if second argument is explicitly None) Three arguments = errno, strerror, filename set Four or more arguments = no attributes set Keyword arguments are not supported by IOError (or EnvironmentError, which is where the above behaviour is actually implemented). That precedent would deem it acceptable to adopt a backwards compatible protocol that still allowed arbitrary positional arguments for Key/Attribute/IndexError, but treated the 2 argument case specially. However, the IOError behaviour really doesn't strike me as a particularly good example for us to be following, so PEP 3151 may want to consider the issue of tidying up those exception signatures. The "right" way to go still appears to be to allow arbitrary positional arguments (which go into .args) for backwards compatibility, then add appropriate keyword-only arguments. IOError could get some keyword only arguments as well, retaining the somewhat odd behaviour above for backwards compatibility reasons (although the repr in the 3 argument case should be fixed). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, 5 Aug 2010 07:57:07 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
+1 on what Fred said (i.e. post-moratorium, add a keyword-only "key" argument to KeyError, set "e.key" only if that argument is supplied, update the standard library to supply it and use a default message of "'Key not found: %r' % key" if the key argument is supplied without an explicit message). Also +1 for doing the equivalent with AttributeError and an "attr" keyword only argument.
Keyword-only arguments are rather annoying to use from C code, though. Regards Antoine.
participants (5)
-
Antoine Pitrou -
Eric Smith -
Fred Drake -
Nick Coghlan -
Łukasz Langa