More descriptive error message than "too many values to unpack"

Currently this code:

    d = {"key": "value"}
    for key, value in d:
        pass

produces this error:

    ValueError: too many values to unpack (expected 2)

I suggest that the error message should also have:

1. The name of the type of the unpacked value
2. The length of the unpacked value, if it exists and (to not execute arbitrary __len__ implementations) if the type belongs to a safe subset, e.g. only builtin types.

Then the error message could be:

    ValueError: too many values to unpack (expected 2, found 3) from object of type str
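A pure-Python sketch of the proposal (the real change would live in CPython's C unpacking code; the function name here is hypothetical): the richer message is only built when the type is exactly one of a whitelist of builtins, so no arbitrary __len__ runs.

```python
# Exact-type whitelist: we know these builtin __len__ implementations are safe.
SAFE_TYPES = (list, tuple, str, bytes, dict, set, frozenset)

def too_many_values_message(expected, obj):
    """Sketch of the proposed message; falls back to the current form."""
    if type(obj) in SAFE_TYPES:  # exact type check, deliberately no subclasses
        return (f"too many values to unpack (expected {expected}, "
                f"found {len(obj)}) from object of type {type(obj).__name__}")
    return f"too many values to unpack (expected {expected})"

print(too_many_values_message(2, "key"))
# too many values to unpack (expected 2, found 3) from object of type str
```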

On Sun, Mar 1, 2020 at 11:35 PM Alex Hall <alex.mojaki@gmail.com> wrote:
Adding the type name would be pretty safe to do. Adding the length, though, is a bit harder. Consider:
So the only way would be to call len(), and if it fails, fall back on the "expected 2" form. And I'm not sure if that would be worthwhile, given that it's going to have to run arbitrary code just for the sake of the error message. +0.5 for showing the type name, -0.5 for trying to show the length. ChrisA
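The try/except fallback described above could look like this sketch (function name hypothetical); note that, as objected, it does execute arbitrary __len__ code:

```python
def message_with_len(expected, obj):
    """Try len(); on any failure, fall back to the plain 'expected N' form."""
    try:
        n = len(obj)  # may run arbitrary user-defined __len__
    except Exception:
        return f"too many values to unpack (expected {expected})"
    return f"too many values to unpack (expected {expected}, got {n})"

class NoLen:
    pass

print(message_with_len(2, [1, 2, 3]))  # too many values to unpack (expected 2, got 3)
print(message_with_len(2, NoLen()))    # too many values to unpack (expected 2)
```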

Chris Angelico wrote:
I did address these:
The length of the unpacked value, if it exists and (to not execute arbitrary __len__ implementations) if the type belongs to a safe subset, e.g. only builtin types.
Is there anyone who thinks it's acceptable to run `len()` on arbitrary objects for an error message? Assuming 'no', then the length is only shown if the type is exactly one of list, tuple, str, etc. where we know __len__ exists and is safe. Alternatively, instead of checking the type, we could check if `__len__` itself belongs to a safe set, if we want to include e.g. subclasses of `list`.
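The alternative check mentioned above — trusting `__len__` itself rather than the exact type — might be sketched like this (names hypothetical); it also admits subclasses that don't override `__len__`:

```python
# The known-safe builtin __len__ implementations (slot wrappers).
SAFE_LEN_METHODS = {t.__len__ for t in (list, tuple, str, bytes, dict, set, frozenset)}

def safe_len(obj):
    """Return len(obj) only if its __len__ is a trusted builtin, else None."""
    meth = getattr(type(obj), "__len__", None)
    return len(obj) if meth in SAFE_LEN_METHODS else None

class MyList(list):          # doesn't override __len__, so still trusted
    pass

class Evil:
    def __len__(self):       # arbitrary user code: not trusted
        return 99

print(safe_len(MyList([1, 2, 3])))  # 3
print(safe_len(Evil()))             # None
print(safe_len(42))                 # None (no __len__ at all)
```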

On Mon, Mar 2, 2020 at 12:01 AM Alex Hall <alex.mojaki@gmail.com> wrote:
It seems very odd to say "built-in types only", IMO (or any other arbitrary whitelist). It'd be cleaner to either call it regardless of the type, or just not do that. The inconsistency seems like more trouble than it's worth. For your provided use-case, the length is almost irrelevant; the best way to spot the bug is to see "oh, it was trying to unpack a string, not a tuple". And that part can be done much more easily and safely. ChrisA

On Mar 1, 2020, at 05:03, Alex Hall <alex.mojaki@gmail.com> wrote:
Is there anyone who thinks it's acceptable to run `len()` on arbitrary objects for an error message? Assuming 'no', then the length is only shown if the type is exactly one of list, tuple, str, etc. where we know __len__ exists and is safe. Alternatively, instead of checking the type, we could check if `__len__` itself belongs to a safe set, if we want to include e.g. subclasses of `list`.
IIRC, the new unpacking code still works like the old in that it special-cases list and tuple (where it can use the fast indexing API that just accesses the C array underneath), but for everything else it calls a function with iter(obj). If so, adding the length for list and tuple would be trivial, but adding it for other builtin types would not. Is list and tuple good enough? It won’t help your example use, where it’s str, but I think for that case you don’t generally care that the string had 6 chars instead of 2; you already know the problem from it being a string in the first place. I suspect that almost every time the type is right but the length isn’t, the type is list, tuple, or a non-builtin type. How often do you accidentally unpack a length-6 dict where you wanted to unpack a length-2 dict?

Seems you're right: https://github.com/python/cpython/blob/baf29b221682be0f4fde53a05ea3f57c3c79f...
I agree with all of this, and I think this is a great balance of implementation complexity and value, especially when the code already accounts for it. I've made an issue here: https://bugs.python.org/issue39816

The problem here is that "iterable unpacking" (is that what we call it now?) is pretty general, and used all over Python. In the example given, it seemed that that would be a helpful message, but it wouldn't really solve the general problem: namely, that dicts iterate over keys, and people sometimes forget and expect that iteration goes over the items. If your keys happen to be length-2 iterables, then you won't get an error -- but you will also not get what is likely expected:

    In [3]: d = {"ab": 123,
       ...:      "cd": 345}

    In [4]: for string, number in d:
       ...:     print(string, number)
       ...:
    a b
    c d

Oops! So, in the general case, how helpful is it to provide the type of what is being unpacked? (And note that in this case a (key, value) tuple is expected, and a tuple is a perfectly valid dict key too, so again, the type isn't necessarily helpful.) Length would be a bit more helpful, but still not always. That being said, more information is better than less, so maybe an unpacking error should show the *value* of what was being unpacked:
Which, in fact, is what IPython already does:

    In [5]: a, b = 1, 2, 3
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-5-402193b14bdc> in <module>
    ----> 1 a,b = 1,2,3

    ValueError: too many values to unpack (expected 2)

But not in the dict iterating case:

    In [2]: for key, val in d:
       ...:     print(key, val)
       ...:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-2-8ac4fd3a14a8> in <module>
    ----> 1 for key, val in d:
          2     print(key, val)

    ValueError: too many values to unpack (expected 2)

(because it's showing the line of code, not the runtime values)

So it would be great if error messages did generally show the value(s) of the objects involved, if possible. Then that would be:

    ValueError: too many values to unpack (expected 2, got: 'this')

Given that it's a ValueError, it seems the *value* should be known, and generally relevant. And this is already done for some ValueErrors:

    In [3]: i = int('45f')
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-3-45f7234bd5b5> in <module>
    ----> 1 i = int('45f')

    ValueError: invalid literal for int() with base 10: '45f'

Maybe it should be for ALL ValueErrors?

-CHB

On Sun, Mar 1, 2020 at 11:38 AM Alex Hall <alex.mojaki@gmail.com> wrote:
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython

On Sun, Mar 1, 2020 at 6:51 PM Christopher Barker <pythonchb@gmail.com> wrote:

> [SNIP] the problem here is that "iterable unpacking" (is that what we call it
This information is certainly retrievable - but not by standard consoles.

    $ python -m friendly_traceback
    Friendly Console version 0.0.28a. [Python version: 3.7.3]

    >>> d = {"a": 1, "b": 2}
    >>> for key, value in d:
    ...     print(key, value)
    ...
    Python exception:
        ValueError: not enough values to unpack (expected 2, got 1)

    A ValueError indicates that a function or an operation received an
    argument of the right type, but an inappropriate value.

    Execution stopped on line 1 of file '<friendly-console:2>'.

    -->1: for key, value in d:

    d: {'a': 1, 'b': 2}

André Roberge

On Mon, Mar 2, 2020 at 12:47 AM Christopher Barker <pythonchb@gmail.com> wrote:
The only extra thing IPython is doing here is showing the source line, which it can do because it saves the input in linecache. The standard shell never saves its input so it never shows in tracebacks. I do think that's an issue, but if you ran these examples in files you'd get the same amount of information either way. (cause it's showing the line of code, not the run time values)
There are many tools which enhance tracebacks with local variables. Even the standard traceback module can do it. But of course improving the default experience with just the right amount of information would be ideal.
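The stdlib hook alluded to above: `traceback.TracebackException` (available since Python 3.5) can capture each frame's locals and include their reprs in the formatted traceback. A minimal sketch, with a hypothetical helper function reproducing the dict-iteration bug:

```python
import sys
import traceback

def iterate_wrongly():
    d = {"a": 1, "b": 2}
    for key, value in d:   # iterates over keys, not items -> ValueError
        print(key, value)

try:
    iterate_wrongly()
except ValueError:
    # capture_locals=True reprs each frame's locals into the output,
    # e.g. a line like:  d = {'a': 1, 'b': 2}
    te = traceback.TracebackException(*sys.exc_info(), capture_locals=True)
    print("".join(te.format()))
```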
In general Python error messages don't include the relevant values or much information about them, although I often wish they would. For example, when I get a KeyError I wish I could see which keys are present, unless there are too many for it to be practical. I'm speculating, but I think there are two main reasons for this:

1. To avoid executing arbitrary __repr__ methods.
2. To avoid the performance cost of creating a message for an exception that may be caught and thrown away.

Neither of these really applies to int('45f'). Are there any other major reasons for the general lack of information? It feels like an intentional decision beyond implementation difficulty.

I imagine we can work around both reasons:

1. There's a lot of information that can be extracted and displayed without running any user-defined code.
2. Objects can be saved (such as the dict that raised KeyError) while deferring the message rendering until it's requested.
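A sketch of idea 2 above (the class name is hypothetical): keep the offending objects on the exception and build the message lazily in __str__, so no __repr__ runs unless the message is actually displayed.

```python
class InformativeKeyError(KeyError):
    """KeyError that defers message rendering until str() is called."""

    def __init__(self, key, mapping):
        super().__init__(key)
        self.key = key
        self.mapping = mapping  # kept alive until (and unless) rendered

    def __str__(self):
        # Rendering, including any __repr__ calls, happens only on demand.
        available = list(self.mapping)[:10]
        return f"{self.key!r} (available keys include: {available!r})"

d = {"a": 1, "b": 2}
try:
    raise InformativeKeyError("x", d)
except InformativeKeyError as e:
    print(e)   # 'x' (available keys include: ['a', 'b'])
```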

There was a proposal (by me) some time ago to add structured information to some of the exceptions. See https://www.python.org/dev/peps/pep-0473/, but it was finally rejected for being too broad. I'd be happy to revive (parts of) the proposal if anyone is interested. I did, however, manage to create a PR adding a name attribute to NameError; see https://github.com/python/cpython/pull/6271 and https://bugs.python.org/issue37797. On Mon, Mar 2, 2020 at 6:01 PM Alex Hall <alex.mojaki@gmail.com> wrote:
-- Sebastian Kreft

I would also prefer richer exceptions, especially if it can be done without introducing any other problems. In the same vein, I once suggested a richer inheritance-failure message (this, basically: https://github.com/NeilGirdhar/inheritance_graph), for which, if I remember correctly, Guido was mildly supportive. I didn't have time to make a patch. On Monday, March 2, 2020 at 4:32:35 PM UTC-5, Sebastian Kreft wrote:

Yes there is. IPython used to display values of local frame variables by default, but for some users some of those values were secrets that ended up in log files. So even base types are sensitive, and I don't think we should print them by default. You might say, what about interactive sessions? Surely it's fine! ...Unless you are doing a demo at a conference and end up printing your AWS key in a live broadcast. That said, I'm definitely for having more information displayed, even if some of it is opt-in. With rich reprs in Jupyter we could actually show variable values on hover with the mouse, or in a collapsible HTML element. -- Matthias

participants (8)

- Alex Hall
- Andrew Barnert
- André Roberge
- Chris Angelico
- Christopher Barker
- Matthias Bussonnier
- Neil Girdhar
- Sebastian Kreft