Improve error message when missing 'self' in method definition

Hi all, Recently pypy received a patch that improves the error message one gets when 'self' is missing in a method's signature: https://mail.python.org/pipermail/pypy-dev/2016-September/014678.html Here are the commits that implement the change in pypy: https://bitbucket.org/pypy/pypy/commits/all?search=branch(better-error-missi...) I'm curious whether a similar improvement would also be received well in CPython. In particular, this guides one to the correct solution for a common programming mistake made by newcomers (and even not-so-newcomers). Here is a case study that found this was a common source of errors for newcomers: http://dl.acm.org/citation.cfm?id=2960327 -Nathan

On 5 October 2016 at 18:17, Lisa Roach <lisaroach14@gmail.com> wrote:
I'm not a new user by any means, and I still regularly make this mistake. Because I've got the experience, I recognise the error when I see it, but that's not much help for people who haven't already made the mistake hundreds of times :-) +1 on improving the message. Paul

+1. Python does need better error messages. This and the recent new import exception will really help. Will feature freeze prevent this to get into 3.6 if some champion it? I also really like https://github.com/SylvainDe/DidYouMean-Python and as a trainer, will use it in my next training sessions. Le 05/10/2016 à 20:09, Stephan Houben a écrit :

On Thu, Oct 6, 2016 at 5:27 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Given that it's not changing semantics at all, just adding info/hints to an error message, it could well be added in a point release. +1 on any feature that helps people to debug code. This doesn't look overly spammy or anything, and it's easy for someone coming from C++ to forget to include that key parameter. ChrisA

Chris Angelico writes:
Given that it's not changing semantics at all, just adding info/hints to an error message, it could well be added in a point release.
But it does change semantics, specifically for doctests. I seem to recall that that is considered a blocker for this kind of change in a maintenance-only branch. In the end that's probably up to the RM, but I would be mildly against it. FWIW YMMV of course.

On Tue, Oct 11, 2016 at 02:31:25PM +1100, Chris Angelico wrote:
Error messages are not part of Python's public API. We should be able to change error messages at any time, including point releases. Nevertheless, we shouldn't abuse that right. If it's only a change to the error message, and not a functional change, then maybe we can add it to the next 3.6 beta or rc. But its probably not worth backporting it to older versions. -- Steve

On 14 October 2016 at 00:04, Steven D'Aprano <steve@pearwood.info> wrote:
My working assumptions for this: - students will move to the latest Python relatively quickly* -> changes aimed at newcomers can just go in the next feature release - production systems migrate slowly* -> changes aimed at making obscure failures easier to debug go into maintenance releases Neither is a hard-and-fast rule, but they're my default starting points. Cheers, Nick. *"quickly" and "slowly" are truly relative here - Python 2.6 is still pretty widely supported and used for production services, but if students are learning on anything other than Python 3.5, it's likely to be 2.7. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Oct 5, 2016 at 1:27 PM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Speaking of, I'm not much of a C hacker, and messing with CPython internals is a little daunting. If anyone wants to take this on, you have my blessing. I also may take a shot at implementing this idea in the next couple weeks when I have some time.

On 2016-10-05 2:50 PM, Nathan Goldbaum wrote:
It would help if you could create an issue and write exhaustive unittests (or at least specifying how exactly the patch should work for all corner cases). Someone with the knowledge of CPython internals will later add the missing C code to the patch. Yury

On Wed, Oct 05, 2016 at 09:02:50PM +0200, Ivan Levkivskyi wrote:
I don't know... there's a lot of corner cases and I don't think we can improve them all. Here's the suggested exception from PyPy: TypeError: f() takes exactly 1 argument (2 given). Did you forget 'self' in the function definition? What happens if f() takes a mix of positional and keyword arguments? What if it takes arbitrary positional arguments? There are also classmethods, staticmethods, and any arbitrary descriptor. And don't forget ordinary functions too. How will this affect them? Before accepting this patch, I think that we need to ensure that it improves at least *some* cases (not necessarily all) while it does not make any of the remaining cases worse. I wonder whether an alternate approach might be better. Instead of trying to guess whether the method signature is wrong when the method is called, maybe the default metaclass (type) could introspect the namespace, inspect each callable in the namespace, and raise a warning (not an error) if the first argument is not `self` (for regular methods) or `cls` (for classmethods). It's not that self is mandatory, but it's conventional, and if we're going to guess that the name ought to be `self` at method call time, maybe we should guess that the name should be `self` when we build the class. -- Steve

Hi all, It always bothered me to write something like this when i want to strip keys from a dictionnary in Python: a = {"foo": 1, "bar": 2, "baz": 3, "foobar": 42} interesting_keys = ["foo", "bar", "baz"] b = {k, v for k,v in a.items() if k in interesting_keys} Wouldn't it be nice to have a syntactic sugar such as: b = a.subset(interesting_keys) I find this version more elegant/explicit. But maybe this feature is not "worth it" Cheers !

Looks like it was discussed before: https://mail.python.org/pipermail/python-ideas/2012-January/013252.html

That discussion seemed to mostly just conclude that dicts shouldn't have all set operations, and then it kind of just dropped off. No one really argued the subset part. -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong. http://kirbyfan64.github.io/ On Oct 12, 2016 11:33 AM, "Riley Banks" <waultah@gmail.com> wrote:

On 10/12/2016 12:06 PM, Enguerrand Pelletier wrote:
If the keys are hashable, this should be a set.
b = {k, v for k,v in a.items() if k in interesting_keys}
Test code before posting. The above is a set comprehension creating a set of tupes. For a dict, 'k, v' must be 'k:v'.
Wouldn't it be nice to have a syntactic sugar such as:
b = a.subset(interesting_keys)
It is pretty rare for the filter condition to be exactly 'key in explicit_keys'. If it is, one can directly construct the dict from a and explict_keys. b = {k:a[k] for k in interesting_keys} The syntactic sugar wrapping this would save 6 keypresses. Interesting_keys can be any iterable. To guarantee no KeyErrors, add 'if k in a'. -- Terry Jan Reedy

On 10/12/2016 5:52 PM, Terry Reedy wrote:
On 10/12/2016 12:06 PM, Enguerrand Pelletier wrote:
I should have followed my own advice. The above is a SyntaxError until 'k,v' is wrapped in parens, '(k,v)'. -- Terry Jan Reedy

On 10/12/2016 12:06 PM, Enguerrand Pelletier wrote:
b = {k, v for k,v in a.items() if k in interesting_keys}
Just my own opinion, I don't pretty much get you on adding a method generating Sets to the dictionary object. Best regards, Ares Ou *Software Engineer / Full-Stack Python Developer* *Blog:* http://aresou.net | *Github:* https://github.com/aresowj *Stack Overflow:* http://stackoverflow.com/users/5183727/ares-ou Ares Ou On Wed, Oct 12, 2016 at 3:03 PM, Terry Reedy <tjreedy@udel.edu> wrote:

On Wed, Oct 12, 2016 at 06:06:51PM +0200, Enguerrand Pelletier wrote:
(You have a small typo: should be "k: v" not "k, v".) Why does it bother you? It is simple, easy to understand, and explict.
Wouldn't it be nice to have a syntactic sugar such as:
Syntactic sugar is not really the right term, "syntactic sugar" means a special form of syntax as a short-cut for something longer. This is not special syntax, it is just a method.
b = a.subset(interesting_keys)
Personally, I have never missed this method, but if I did, it would be easy to make a helper function: def subset(adict, keys): """Return a new dict from existing keys.""" return {k: v for k,v in a.items() if k in keys} Not every simple helper function needs to be built-in to the class. That is the beauty of Python, you can make your own helper functions, once you give up the idea that everything needs to be a method. There are some problems with making this a method. To start with, it means that every dict and mapping would have to support it. Perhaps that is acceptible, but it does mean that the question is bigger than just dict. It also involves: ChainMap Counter Mapping MutableMapping OrderedDict UserDict defaultdict at the very least. (Perhaps this is easy to implement, by just adding this to the Mapping ABC and letting everything else inherit from that. But even so, it increases the complexity of the entire Mapping ABC and all its classes.) But a bigger problem with making this a built-in dict method is deciding exactly what it should do. Here are some options: - should the method return a new dict, or modify the existing dict? - should it keep the "interesting keys" or remove them? - is it an error if one of the interesting keys is missing? - or should it be silently skipped? - or automatically added? using what value? Whatever options we pick here, you can be sure that some people will want a different set of options. Unless we are sure that one combination is much more common than the other combinations, we're better off letting people write their own helper functions that behave exactly as they want: def subset(d, keys): # Version which raises if any of the keys are missing return {key: d[key] for key in keys} -- Steve

Oh, I forgot to mention... On Wed, Oct 12, 2016 at 06:06:51PM +0200, Enguerrand Pelletier wrote:
You've started a completely new discussion on an unrelated topic, but you appear to have done so by replying to an existing email conversation and editing the subject line. Specifically the thread with subject: "Improve error message when missing 'self' ..." Editing the subject line to indicate a change in topic is good! Thank you for doing that! But you should be aware that many email programs will thread this new, unrelated conversation under the "Improve error message" conversation. That means some people may see a threaded view something like this: Subject: [Python-ideas] Improve error message when missing 'self' ... ├─> Re: [Python-ideas] Improve error message ... │ └─> │ └─┬─> │ │ └─> │ └─> └─>[Python-ideas] Add a method to get the subset ... └─> Re: [Python-ideas] Add a method to get ... which may not matter if they are actually related threads, but anyone who has blocked or muted the original "Improve error message..." will not see your new conversation either. This may be considered part of the older conversation, and muted as well. When you plan to start a brand new conversation, unrelated to an existing thread, it is best to start a fresh email, don't reply to an existing one. Changing the subject line is not sufficient to break the link between your email and the previous conversation. You should use your email program's New Email command, type in the new subject line, and then set the To address to python-ideas@python.org. That's the only way to avoid the risk that some people won't see your email because they've muted the parent thread. Hope this helps, Steve

On 5 October 2016 at 18:17, Lisa Roach <lisaroach14@gmail.com> wrote:
I'm not a new user by any means, and I still regularly make this mistake. Because I've got the experience, I recognise the error when I see it, but that's not much help for people who haven't already made the mistake hundreds of times :-) +1 on improving the message. Paul

+1. Python does need better error messages. This and the recent new import exception will really help. Will feature freeze prevent this to get into 3.6 if some champion it? I also really like https://github.com/SylvainDe/DidYouMean-Python and as a trainer, will use it in my next training sessions. Le 05/10/2016 à 20:09, Stephan Houben a écrit :

On Thu, Oct 6, 2016 at 5:27 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Given that it's not changing semantics at all, just adding info/hints to an error message, it could well be added in a point release. +1 on any feature that helps people to debug code. This doesn't look overly spammy or anything, and it's easy for someone coming from C++ to forget to include that key parameter. ChrisA

Chris Angelico writes:
Given that it's not changing semantics at all, just adding info/hints to an error message, it could well be added in a point release.
But it does change semantics, specifically for doctests. I seem to recall that that is considered a blocker for this kind of change in a maintenance-only branch. In the end that's probably up to the RM, but I would be mildly against it. FWIW YMMV of course.

On Tue, Oct 11, 2016 at 02:31:25PM +1100, Chris Angelico wrote:
Error messages are not part of Python's public API. We should be able to change error messages at any time, including point releases. Nevertheless, we shouldn't abuse that right. If it's only a change to the error message, and not a functional change, then maybe we can add it to the next 3.6 beta or rc. But its probably not worth backporting it to older versions. -- Steve

On 14 October 2016 at 00:04, Steven D'Aprano <steve@pearwood.info> wrote:
My working assumptions for this: - students will move to the latest Python relatively quickly* -> changes aimed at newcomers can just go in the next feature release - production systems migrate slowly* -> changes aimed at making obscure failures easier to debug go into maintenance releases Neither is a hard-and-fast rule, but they're my default starting points. Cheers, Nick. *"quickly" and "slowly" are truly relative here - Python 2.6 is still pretty widely supported and used for production services, but if students are learning on anything other than Python 3.5, it's likely to be 2.7. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Oct 5, 2016 at 1:27 PM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Speaking of, I'm not much of a C hacker, and messing with CPython internals is a little daunting. If anyone wants to take this on, you have my blessing. I also may take a shot at implementing this idea in the next couple weeks when I have some time.

On 2016-10-05 2:50 PM, Nathan Goldbaum wrote:
It would help if you could create an issue and write exhaustive unittests (or at least specifying how exactly the patch should work for all corner cases). Someone with the knowledge of CPython internals will later add the missing C code to the patch. Yury

On 5 October 2016 at 20:55, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I agree with Yury here. There are corner cases (like what to do with classmethods etc). If behaviour for all of them are specified, it would be quite straightforward to implement this. -- Ivan

On Wed, Oct 05, 2016 at 09:02:50PM +0200, Ivan Levkivskyi wrote:
I don't know... there's a lot of corner cases and I don't think we can improve them all. Here's the suggested exception from PyPy: TypeError: f() takes exactly 1 argument (2 given). Did you forget 'self' in the function definition? What happens if f() takes a mix of positional and keyword arguments? What if it takes arbitrary positional arguments? There are also classmethods, staticmethods, and any arbitrary descriptor. And don't forget ordinary functions too. How will this affect them? Before accepting this patch, I think that we need to ensure that it improves at least *some* cases (not necessarily all) while it does not make any of the remaining cases worse. I wonder whether an alternate approach might be better. Instead of trying to guess whether the method signature is wrong when the method is called, maybe the default metaclass (type) could introspect the namespace, inspect each callable in the namespace, and raise a warning (not an error) if the first argument is not `self` (for regular methods) or `cls` (for classmethods). It's not that self is mandatory, but it's conventional, and if we're going to guess that the name ought to be `self` at method call time, maybe we should guess that the name should be `self` when we build the class. -- Steve

Hi all, It always bothered me to write something like this when i want to strip keys from a dictionnary in Python: a = {"foo": 1, "bar": 2, "baz": 3, "foobar": 42} interesting_keys = ["foo", "bar", "baz"] b = {k, v for k,v in a.items() if k in interesting_keys} Wouldn't it be nice to have a syntactic sugar such as: b = a.subset(interesting_keys) I find this version more elegant/explicit. But maybe this feature is not "worth it" Cheers !

Looks like it was discussed before: https://mail.python.org/pipermail/python-ideas/2012-January/013252.html

That discussion seemed to mostly just conclude that dicts shouldn't have all set operations, and then it kind of just dropped off. No one really argued the subset part. -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong. http://kirbyfan64.github.io/ On Oct 12, 2016 11:33 AM, "Riley Banks" <waultah@gmail.com> wrote:

On 10/12/2016 12:06 PM, Enguerrand Pelletier wrote:
If the keys are hashable, this should be a set.
b = {k, v for k,v in a.items() if k in interesting_keys}
Test code before posting. The above is a set comprehension creating a set of tupes. For a dict, 'k, v' must be 'k:v'.
Wouldn't it be nice to have a syntactic sugar such as:
b = a.subset(interesting_keys)
It is pretty rare for the filter condition to be exactly 'key in explicit_keys'. If it is, one can directly construct the dict from a and explict_keys. b = {k:a[k] for k in interesting_keys} The syntactic sugar wrapping this would save 6 keypresses. Interesting_keys can be any iterable. To guarantee no KeyErrors, add 'if k in a'. -- Terry Jan Reedy

On 10/12/2016 5:52 PM, Terry Reedy wrote:
On 10/12/2016 12:06 PM, Enguerrand Pelletier wrote:
I should have followed my own advice. The above is a SyntaxError until 'k,v' is wrapped in parens, '(k,v)'. -- Terry Jan Reedy

On 10/12/2016 12:06 PM, Enguerrand Pelletier wrote:
b = {k, v for k,v in a.items() if k in interesting_keys}
Just my own opinion, I don't pretty much get you on adding a method generating Sets to the dictionary object. Best regards, Ares Ou *Software Engineer / Full-Stack Python Developer* *Blog:* http://aresou.net | *Github:* https://github.com/aresowj *Stack Overflow:* http://stackoverflow.com/users/5183727/ares-ou Ares Ou On Wed, Oct 12, 2016 at 3:03 PM, Terry Reedy <tjreedy@udel.edu> wrote:

On Wed, Oct 12, 2016 at 06:06:51PM +0200, Enguerrand Pelletier wrote:
(You have a small typo: should be "k: v" not "k, v".) Why does it bother you? It is simple, easy to understand, and explict.
Wouldn't it be nice to have a syntactic sugar such as:
Syntactic sugar is not really the right term, "syntactic sugar" means a special form of syntax as a short-cut for something longer. This is not special syntax, it is just a method.
b = a.subset(interesting_keys)
Personally, I have never missed this method, but if I did, it would be easy to make a helper function: def subset(adict, keys): """Return a new dict from existing keys.""" return {k: v for k,v in a.items() if k in keys} Not every simple helper function needs to be built-in to the class. That is the beauty of Python, you can make your own helper functions, once you give up the idea that everything needs to be a method. There are some problems with making this a method. To start with, it means that every dict and mapping would have to support it. Perhaps that is acceptible, but it does mean that the question is bigger than just dict. It also involves: ChainMap Counter Mapping MutableMapping OrderedDict UserDict defaultdict at the very least. (Perhaps this is easy to implement, by just adding this to the Mapping ABC and letting everything else inherit from that. But even so, it increases the complexity of the entire Mapping ABC and all its classes.) But a bigger problem with making this a built-in dict method is deciding exactly what it should do. Here are some options: - should the method return a new dict, or modify the existing dict? - should it keep the "interesting keys" or remove them? - is it an error if one of the interesting keys is missing? - or should it be silently skipped? - or automatically added? using what value? Whatever options we pick here, you can be sure that some people will want a different set of options. Unless we are sure that one combination is much more common than the other combinations, we're better off letting people write their own helper functions that behave exactly as they want: def subset(d, keys): # Version which raises if any of the keys are missing return {key: d[key] for key in keys} -- Steve

Oh, I forgot to mention... On Wed, Oct 12, 2016 at 06:06:51PM +0200, Enguerrand Pelletier wrote:
You've started a completely new discussion on an unrelated topic, but you appear to have done so by replying to an existing email conversation and editing the subject line. Specifically the thread with subject: "Improve error message when missing 'self' ..." Editing the subject line to indicate a change in topic is good! Thank you for doing that! But you should be aware that many email programs will thread this new, unrelated conversation under the "Improve error message" conversation. That means some people may see a threaded view something like this: Subject: [Python-ideas] Improve error message when missing 'self' ... ├─> Re: [Python-ideas] Improve error message ... │ └─> │ └─┬─> │ │ └─> │ └─> └─>[Python-ideas] Add a method to get the subset ... └─> Re: [Python-ideas] Add a method to get ... which may not matter if they are actually related threads, but anyone who has blocked or muted the original "Improve error message..." will not see your new conversation either. This may be considered part of the older conversation, and muted as well. When you plan to start a brand new conversation, unrelated to an existing thread, it is best to start a fresh email, don't reply to an existing one. Changing the subject line is not sufficient to break the link between your email and the previous conversation. You should use your email program's New Email command, type in the new subject line, and then set the To address to python-ideas@python.org. That's the only way to avoid the risk that some people won't see your email because they've muted the parent thread. Hope this helps, Steve
participants (17)
-
Ares Ou
-
Chris Angelico
-
Enguerrand Pelletier
-
Ivan Levkivskyi
-
Lisa Roach
-
Michel Desmoulin
-
Nathan Goldbaum
-
Nick Coghlan
-
Paul Moore
-
Riley Banks
-
Ryan Gonzalez
-
Stephan Houben
-
Stephen J. Turnbull
-
Steven D'Aprano
-
Terry Reedy
-
Yury Selivanov
-
אלעזר