Mailman 3 How long the wrong type of argument should we limit (or not) in the error message (C-api)? - Python-Dev


How long the wrong type of argument should we limit (or not) in the error message (C-api)?

Vajrasky Kok

Dec. 13, 2013

10:56 p.m.

Greetings,

When fixing/adding error message for wrong type of argument in C code, I am always confused, how long the wrong type is the ideal?

The type of error message that I am talking about:

"Blabla() argument 1 must be integer not wrong_type".

We have inconsistency in CPython code, for example:

Python/sysmodule.c
===============
    PyErr_Format(PyExc_TypeError,
                 "can't intern %.400s", s->ob_type->tp_name);

Modules/_json.c
============
    PyErr_Format(PyExc_TypeError,
                 "first argument must be a string, not %.80s",
                 Py_TYPE(pystr)->tp_name);

Objects/typeobject.c
===============
    PyErr_Format(PyExc_TypeError,
                 "can only assign string to %s.__name__, not '%s'",
                 type->tp_name, Py_TYPE(value)->tp_name);

So is it %.400s or %.80s or %s? I vote for %s.

Other thing is which one is more preferable? Py_TYPE(value)->tp_name or value->ob_type->tp_name? I vote for Py_TYPE(value)->tp_name.

Or this is just a matter of taste?

Thank you.

Vajrasky Kok


David Hutto

December 2013

11:21 p.m.

Since Python is, to me, a prototyping language, every possible outcome should be presented to the end user. A full variety of explanations should be presented to the programmer.

On Fri, Dec 13, 2013 at 11:56 PM, Vajrasky Kok <sky.kok@speaklikeaking.com> wrote:

...

Greetings,

When fixing/adding error message for wrong type of argument in C code, I am always confused, how long the wrong type is the ideal?

The type of error message that I am talking about:

"Blabla() argument 1 must be integer not wrong_type".

We have inconsistency in CPython code, for example:

Python/sysmodule.c
===============
    PyErr_Format(PyExc_TypeError,
                 "can't intern %.400s", s->ob_type->tp_name);

Modules/_json.c
============
    PyErr_Format(PyExc_TypeError,
                 "first argument must be a string, not %.80s",
                 Py_TYPE(pystr)->tp_name);

Objects/typeobject.c
===============
    PyErr_Format(PyExc_TypeError,
                 "can only assign string to %s.__name__, not '%s'",
                 type->tp_name, Py_TYPE(value)->tp_name);

So is it %.400s or %.80s or %s? I vote for %s.

Other thing is which one is more preferable? Py_TYPE(value)->tp_name or value->ob_type->tp_name? I vote for Py_TYPE(value)->tp_name.

Or this is just a matter of taste?

Thank you.

Vajrasky Kok _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/dwightdhutto%40gmail.com

-- Best Regards, David Hutto, CEO: http://www.hitwebdevelopment.com

Greg Ewing

3:07 p.m.

David Hutto wrote:

...

Being that python is, to me, a prototyping language, then every possible outcome should be presented to the end user.

So we should produce a quantum superposition of error messages? :-)

(Sorry, I've been watching Susskind's lectures on QM and I've got quantum on the brain at the moment.)

-- Greg

David Hutto

10:09 p.m.

Susskind's... me too, but the refinement of the error messages is the point. We should be looking at the full assessment of the error, which the prototyping of Python should present.

I've seen others reply that Python wouldn't be around, or that there are other forms I've seen before that will take the forefront.

The point should be to align the prototyping of Python with the updates in technology taking place.

It should be like it usually is, line-for-line error assessments, even followed by further info to inform the prototyper who is looking to translate to a lower-level language.

On Sat, Dec 14, 2013 at 4:07 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:

...

David Hutto wrote:

...
Being that python is, to me, a prototyping language, then every possible outcome should be presented to the end user.

So we should produce a quantum superposition of error messages? :-)

(Sorry, I've been watching Susskind's lectures on QM and I've got quantum on the brain at the moment.)

-- Greg


-- Best Regards, David Hutto, CEO: http://www.hitwebdevelopment.com

David Hutto

10:18 p.m.

We all strive to be Python programmers, and some of the responses are that it might not be around in the future. Now we all probably speak conversationally in other languages, but I'm thinking of keeping around a great prototyping language. So the topic becomes how to integrate it with not just the expected, but the unexpected technologies. Although the topic is error messages, it should apply to all possibilities that could be derived from a prototyping language like Python.

On Sat, Dec 14, 2013 at 11:09 PM, David Hutto <dwightdhutto@gmail.com> wrote:

...

Susskind's... me too, but the refinement of the error messages is the point. We should be looking at the full assessment of the error, which the prototyping of Python should present.

I've seen others reply that Python wouldn't be around, or that there are other forms I've seen before that will take the forefront.

The point should be to align the prototyping of Python with the updates in technology taking place.

It should be like it usually is, line-for-line error assessments, even followed by further info to inform the prototyper who is looking to translate to a lower-level language.

On Sat, Dec 14, 2013 at 4:07 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:

...
David Hutto wrote:

...
Being that python is, to me, a prototyping language, then every possible outcome should be presented to the end user.

So we should produce a quantum superposition of error messages? :-)

(Sorry, I've been watching Susskind's lectures on QM and I've got quantum on the brain at the moment.)

-- Greg


-- Best Regards, David Hutto, CEO: http://www.hitwebdevelopment.com

Nick Coghlan

5:10 p.m.

On 14 December 2013 14:56, Vajrasky Kok <sky.kok@speaklikeaking.com> wrote:

...

Greetings,

When fixing/adding error message for wrong type of argument in C code, I am always confused, how long the wrong type is the ideal?

The type of error message that I am talking about:

"Blabla() argument 1 must be integer not wrong_type".

We have inconsistency in CPython code, for example:

Python/sysmodule.c
===============
    PyErr_Format(PyExc_TypeError,
                 "can't intern %.400s", s->ob_type->tp_name);

Modules/_json.c
============
    PyErr_Format(PyExc_TypeError,
                 "first argument must be a string, not %.80s",
                 Py_TYPE(pystr)->tp_name);

Objects/typeobject.c
===============
    PyErr_Format(PyExc_TypeError,
                 "can only assign string to %s.__name__, not '%s'",
                 type->tp_name, Py_TYPE(value)->tp_name);

So is it %.400s or %.80s or %s? I vote for %s.

Other thing is which one is more preferable? Py_TYPE(value)->tp_name or value->ob_type->tp_name? I vote for Py_TYPE(value)->tp_name.

Or this is just a matter of taste?

The idiom has shifted over time, but the preference more recently is definitely for length limiting user provided identifiers (which are generally type names) to limit the maximum length of error messages (to add another variant to the mix, PEP 7 has "%.100s" in an example about breaking long lines that happens to include reporting TypeError).

The question should probably be addressed directly in PEP 7, and I'd be inclined to just bless the "%.400s" variant for future code.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
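To make the effect of the precision concrete: Python's own %-formatting accepts the same "%.80s" precision syntax, so the truncation can be tried without compiling any C. This is only a sketch; the 500-character class name is a made-up worst case, and the message text is borrowed from the Modules/_json.c example quoted above.

```python
# Hypothetical worst case: a user-defined type with a 500-character name.
LongName = type("A" * 500, (), {})
obj = LongName()

prefix = "first argument must be a string, not "

# "%s" copies the whole type name into the message.
unlimited = (prefix + "%s") % type(obj).__name__

# "%.80s" keeps at most the first 80 characters of the name.
limited = (prefix + "%.80s") % type(obj).__name__

print(len(unlimited) - len(prefix))  # length of the name part without a limit
print(len(limited) - len(prefix))    # length of the name part with the limit
```

With a limit in place, the total message length is bounded no matter what name a user-defined type carries.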

Antoine Pitrou

5:16 p.m.

On Sun, 15 Dec 2013 09:10:08 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:

...

The question should probably be addressed directly in PEP 7, and I'd be inclined to just bless the "%.400s" variant for future code.

Shouldn't we have a special "%T" shortcut instead of trying to harmonize all the occurrences of `"%.400s", Py_TYPE(self)->tp_name` ?

Sprinkling the same magic number / convention everywhere doesn't sound very future-proof, nor convenient.

Regards

Antoine.

Victor Stinner

5:52 p.m.

2013/12/15 Antoine Pitrou <solipsis@pitrou.net>:

...

Shouldn't we have a special "%T" shortcut instead of trying to harmonize all the occurrences of `"%.400s", Py_TYPE(self)->tp_name` ?

Oh, I like this proposition! The following pattern is very common in Python:

    "... %.400s ...", Py_TYPE(self)->tp_name

Victor

Nick Coghlan

7:25 p.m.

On 15 December 2013 09:52, Victor Stinner <victor.stinner@gmail.com> wrote:

...

2013/12/15 Antoine Pitrou <solipsis@pitrou.net>:

...
Shouldn't we have a special "%T" shortcut instead of trying to harmonize all the occurrences of `"%.400s", Py_TYPE(self)->tp_name` ?

Oh, I like this proposition! The following pattern is very common in Python:

"... %.400s ...", Py_TYPE(self)->tp_name

Oh, yes, a %T shortcut for "length limited type name of the supplied object" would be brilliant. We need this frequently for C level error messages, and I almost always have to look at an existing example to remember the exact incantation :)

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Steven D'Aprano

9:51 p.m.


On Sun, Dec 15, 2013 at 11:25:10AM +1000, Nick Coghlan wrote:

...

Oh, yes, a %T shortcut for "length limited type name of the supplied object" would be brilliant. We need this frequently for C level error messages, and I almost always have to look at an existing example to remember the exact incantation :)

What are the chances that could be made available from pure Python too? Having to extract the name of the type is a very common need for error messages, and I never know whether I ought to write type(obj).__name__ or obj.__class__.__name__. A %T and/or {:T} format code could be the One Obvious Way to include the type name in strings:

    raise TypeError("Expected int but got '{:T}'".format(obj))

looks nicer to me than either of:

    raise TypeError("Expected int but got '{}'".format(type(obj).__name__))
    raise TypeError("Expected int but got '{}'".format(obj.__class__.__name__))

-- Steven

Ethan Furman

10:33 a.m.

On 12/14/2013 07:51 PM, Steven D'Aprano wrote:

...

On Sun, Dec 15, 2013 at 11:25:10AM +1000, Nick Coghlan wrote:

...
Oh, yes, a %T shortcut for "length limited type name of the supplied object" would be brilliant. We need this frequently for C level error messages, and I almost always have to look at an existing example to remember the exact incantation :)

What are the chances that could be made available from pure Python too? Having to extract the name of the type is a very common need for error messages, and I never know whether I ought to write type(obj).__name__ or obj.__class__.__name__. A %T and/or {:T} format code could be the One Obvious Way to include the type name in strings

+1 -- ~Ethan~

Nick Coghlan

3:30 p.m.

On 16 Dec 2013 02:58, "Ethan Furman" <ethan@stoneleaf.us> wrote:

...

On 12/14/2013 07:51 PM, Steven D'Aprano wrote:

...
On Sun, Dec 15, 2013 at 11:25:10AM +1000, Nick Coghlan wrote:

...
Oh, yes, a %T shortcut for "length limited type name of the supplied object" would be brilliant. We need this frequently for C level error messages, and I almost always have to look at an existing example to remember the exact incantation :)

What are the chances that could be made available from pure Python too? Having to extract the name of the type is a very common need for error messages, and I never know whether I ought to write type(obj).__name__ or obj.__class__.__name__. A %T and/or {:T} format code could be the One Obvious Way to include the type name in strings

+1

It's less obviously correct for Python code, though. In C, we're almost always running off slots, so type(obj).__name__ has a very high chance of being what we want, and is also preferred for speed reasons (since it's just a couple of pointer dereferences).

At the Python level, whether to display obj.__name__ (working with a class directly), type(obj).__name__ (working with the concrete type, ignoring any proxying) or obj.__class__.__name__ (which takes proxying into account) really depends on exactly what you're doing, and the speed differences between them aren't so stark.

Cheers, Nick.
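The three spellings only diverge in unusual cases, so a small illustration may help. The following is a sketch under stated assumptions: Proxy and Widget are hypothetical classes invented for this example, and the proxy overrides __class__ with a property, which is one common way proxy objects report the class of what they wrap.

```python
class Proxy:
    """Hypothetical proxy that reports the class of the object it wraps."""

    def __init__(self, target):
        self._target = target

    def _get_class(self):
        return type(self._target)

    # Overriding __class__ changes obj.__class__ but not type(obj).
    __class__ = property(_get_class)


class Widget:
    """Hypothetical class used to show the 'working with a class' case."""


p = Proxy(42)

concrete_name = type(p).__name__      # concrete type, ignores the proxying
proxied_name = p.__class__.__name__   # takes the proxying into account
class_name = Widget.__name__          # obj is itself a class

print(concrete_name, proxied_name, class_name)
```

For ordinary objects the first two spellings give the same answer, which is why the choice rarely matters in practice.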

...

-- ~Ethan~


Steven D'Aprano

7:23 p.m.


On Mon, Dec 16, 2013 at 07:30:56AM +1000, Nick Coghlan wrote:

...

On 16 Dec 2013 02:58, "Ethan Furman" <ethan@stoneleaf.us> wrote:

...
On 12/14/2013 07:51 PM, Steven D'Aprano wrote:

...
On Sun, Dec 15, 2013 at 11:25:10AM +1000, Nick Coghlan wrote:

...
Oh, yes, a %T shortcut for "length limited type name of the supplied object" would be brilliant. We need this frequently for C level error messages, and I almost always have to look at an existing example to remember the exact incantation :)

What are the chances that could be made available from pure Python too? Having to extract the name of the type is a very common need for error messages, and I never know whether I ought to write type(obj).__name__ or obj.__class__.__name__. A %T and/or {:T} format code could be the One Obvious Way to include the type name in strings

+1

It's less obviously correct for Python code, though. In C, we're almost always running off slots, so type(obj).__name__ has a very high chance of being what we want, and is also preferred for speed reasons (since it's just a couple of pointer dereferences).

At the Python level, whether to display obj.__name__ (working with a class directly), type(obj).__name__ (working with the concrete type, ignoring any proxying) or obj.__class__.__name__ (which takes proxying into account) really depends on exactly what you're doing, and the speed differences between them aren't so stark.

That's a good point, but I think most coders would have a very fuzzy idea of the difference between type(obj).__name__ and obj.__class__.__name__, and when to use one versus the other. I know I don't have a clue, and nearly always end up just arbitrarily picking one. It would be a good thing if the 95% of the time that I don't need to think about it, I don't need to think about it and just use {:T} formatting code. The other 5% of the time I can always extract the name manually, as before.

Likewise, I think that most of the time it is useful to distinguish between instances and classes, metaclasses notwithstanding. If obj is itself a class, we should see its name directly, and not the name of its class (which will generally be "type"). In other words, what I usually want to write is:

    raise TypeError("some error regarding type '%s'" %
                    (obj.__name__ if isinstance(obj, type)
                     else type(obj).__name__))

(modulo difference between type(obj) and obj.__class__, as above), but what I end up doing is take the lazy way out and unconditionally use type(obj).__name__.

But this is getting further away from the %T formatting code at the C level, so perhaps it needs to go to Python Ideas first?

-- Steven
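The class-versus-instance rule described here can be factored into a small helper. This is a sketch only: display_type_name and make_error are hypothetical names, not existing functions. Computing the name first also sidesteps an operator-precedence trap, because % binds more tightly than a conditional expression.

```python
def display_type_name(obj):
    """Hypothetical helper: the name of obj if it is a class, else of its type."""
    if isinstance(obj, type):
        return obj.__name__
    return type(obj).__name__


def make_error(obj):
    """Build (but do not raise) the TypeError from the example message."""
    return TypeError("some error regarding type '%s'" % display_type_name(obj))


print(make_error(3.5))    # obj is an instance
print(make_error(float))  # obj is itself a class
```

Both calls name the same type, whereas an unconditional type(obj).__name__ would report "type" for the second one.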

Stephen J. Turnbull

12:37 a.m.


Steven D'Aprano writes:

...

That's a good point, but I think most coders would have a very fuzzy idea of the difference between type(obj).__name__ and obj.__class__.__name__, and when to use one versus the other.

...

Likewise, I think that most of the time it is useful to distinguish between instances and classes, metaclasses notwithstanding.

To me trying to handle all of those distinctions in a single format code sounds like way confusing magic (and probably will fail to DWIM way too often).

Walter Dörwald

9:29 a.m.

On 15.12.13 17:33, Ethan Furman wrote:

...

On 12/14/2013 07:51 PM, Steven D'Aprano wrote:

...
On Sun, Dec 15, 2013 at 11:25:10AM +1000, Nick Coghlan wrote:

...
Oh, yes, a %T shortcut for "length limited type name of the supplied object" would be brilliant. We need this frequently for C level error messages, and I almost always have to look at an existing example to remember the exact incantation :)

What are the chances that could be made available from pure Python too? Having to extract the name of the type is a very common need for error messages, and I never know whether I ought to write type(obj).__name__ or obj.__class__.__name__. A %T and/or {:T} format code could be the One Obvious Way to include the type name in strings

+1

I'd vote for including the module name in the string and using __qualname__ instead of __name__, i.e. make "{:T}".format(obj) equivalent to "{0.__class__.__module__}.{0.__class__.__qualname__}".format(obj).

Servus, Walter
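The difference between __name__ and __qualname__ shows up for nested classes. A minimal sketch, where Outer and Inner are hypothetical classes made up for the illustration:

```python
class Outer:
    class Inner:
        pass


obj = Outer.Inner()

short_name = type(obj).__name__         # last component only
qualified_name = type(obj).__qualname__  # includes the enclosing class

# The expansion suggested for a "{:T}" code, written out by hand.
full_name = "{0.__class__.__module__}.{0.__class__.__qualname__}".format(obj)

print(short_name, qualified_name, full_name)
```

The module part of full_name depends on where the class is defined, so it varies between a script, an imported module, and an interactive session.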

Eric V. Smith

10:21 a.m.

On 12/16/2013 10:29 AM, Walter Dörwald wrote:

...

I'd vote for including the module name in the string and using __qualname__ instead of __name__, i.e. make "{:T}".format(obj) equivalent to "{0.__class__.__module__}.{0.__class__.__qualname__}".format(obj).

That's not possible in general. The format specifier interpretation is done by each type. So, you could add this to str.__format__ and int.__format__, but you can't add it to an arbitrary type's __format__. For example, types not in the stdlib would never know about it.

There's no logic for calling through to object.__format__ for unknown specifiers. Look at datetime, for example. It uses strftime, so "T" currently just prints a literal "T".

And for object.__format__, we recently made it an error to specify any format string. This is to prevent you from calling format(an_object, ".30") and "knowing" that it's interpreted by str.__format__ (because that's the default conversion for object.__format__). If in the future an_object's class added its own __format__, this code would break (or at least do the wrong thing).

But I really do like the idea! Maybe there's a way to just make obj.__class__ recognize "T", so you could at least do:

    format(obj.__class__, "T")

or equivalently:

    "{:T}".format(obj.__class__)

I realize that having to use .__class__ defeats some of the beauty of this scheme.

Eric.
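The per-type handling of format specifiers can be observed directly. The sketch below assumes a Python version in which object.__format__ rejects non-empty format strings (3.4 or later); try_format and Plain are hypothetical names introduced only for the demonstration.

```python
import datetime


def try_format(value, spec):
    """Return the formatted string, or the name of the exception raised."""
    try:
        return format(value, spec)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


class Plain:
    """Hypothetical user class that does not define __format__."""


# datetime.__format__ hands the spec to strftime(), so "T" is literal text.
stamp = try_format(datetime.datetime(2013, 12, 16), "T")

# int.__format__ does not know a "T" code and rejects it.
int_result = try_format(42, "T")

# Plain inherits object.__format__, which rejects any non-empty spec.
plain_result = try_format(Plain(), "T")

print(stamp, int_result, plain_result)
```

Three types give three different behaviours for the same specifier, which is the obstacle to adding a universal "T" code.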

Nick Coghlan

2:49 p.m.

On 17 Dec 2013 02:23, "Eric V. Smith" <eric@trueblade.com> wrote:

...

On 12/16/2013 10:29 AM, Walter Dörwald wrote:

...
I'd vote for including the module name in the string and using __qualname__ instead of __name__, i.e. make "{:T}".format(obj) equivalent to "{0.__class__.__module__}.{0.__class__.__qualname__}".format(obj).

That's not possible in general. The format specifier interpretation is done by each type. So, you could add this to str.__format__ and int.__format__, but you can't add it to an arbitrary type's __format__. For example, types not in the stdlib would never know about it.

That just suggests it would need to be a type coercion code, like !a, !r, and !s. However, that observation also suggests that starting with a "classname" or "typename" builtin would be more appropriate than jumping directly to a formatting code.

We've definitely drifted well into python-ideas territory at this point, though :)

Cheers, Nick.
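A conversion code is applied before any type-specific __format__ is consulted, which is why it avoids the per-type problem. As a rough sketch of the idea, string.Formatter lets a subclass add its own conversions by overriding convert_field. The "!t" code and the TypeNameFormatter class are assumptions made up for this example; str.format itself only accepts !a, !r and !s.

```python
import string


class TypeNameFormatter(string.Formatter):
    """Hypothetical formatter that adds a '!t' (type name) conversion."""

    def convert_field(self, value, conversion):
        if conversion == "t":
            # The conversion replaces the value with its type name before
            # any format spec is applied.
            return type(value).__name__
        # Fall back to the standard !a, !r and !s handling.
        return super().convert_field(value, conversion)


fmt = TypeNameFormatter()
message = fmt.format("Expected int but got '{!t}'", 3.5)
print(message)
```

Because the conversion yields a plain string, an ordinary spec such as a width or precision could still follow it.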

...

There's no logic for calling through to object.__format__ for unknown specifiers. Look at datetime, for example. It uses strftime, so "T" currently just prints a literal "T".

And for object.__format__, we recently made it an error to specify any format string. This is to prevent you from calling format(an_object, ".30") and "knowing" that it's interpreted by str.__format__ (because that's the default conversion for object.__format__). If in the future an_object's class added its own __format__, this code would break (or at least do the wrong thing).

But I really do like the idea! Maybe there's a way to just make obj.__class__ recognize "T", so you could at least do: format(obj.__class__, "T") or equivalently: "{:T}".format(obj.__class__)

I realize that having to use .__class__ defeats some of the beauty of this scheme.

Eric.


Eric V. Smith

1:06 p.m.

On 12/16/2013 03:49 PM, Nick Coghlan wrote:

...

On 17 Dec 2013 02:23, "Eric V. Smith" <eric@trueblade.com> wrote:

...
On 12/16/2013 10:29 AM, Walter Dörwald wrote:

...
I'd vote for including the module name in the string and using __qualname__ instead of __name__, i.e. make "{:T}".format(obj) equivalent to "{0.__class__.__module__}.{0.__class__.__qualname__}".format(obj).

That's not possible in general. The format specifier interpretation is done by each type. So, you could add this to str.__format__ and int.__format__, but you can't add it to an arbitrary type's __format__. For example, types not in the stdlib would never know about it.

That just suggests it would need to be a type coercion code, like !a, !r, and !s. However, that observation also suggests that starting with a "classname" or "typename" builtin would be more appropriate than jumping directly to a formatting code.

That's an excellent observation, Nick, including that it should be based on a builtin. But I'd suggest something like classof(), and have its __format__ "do the right thing". But it all seems like overkill for this problem.

...

We've definitely drifted well into python-ideas territory at this point, though :)

True enough! Eric.
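For illustration only, a classof() along these lines could be a thin wrapper whose __format__ understands a "T" code. Nothing below exists in Python: the classof name comes from the suggestion above, and what "the right thing" means (the short name by default, module plus __qualname__ for "T") is an assumption made for the sake of the sketch.

```python
class classof:
    """Hypothetical wrapper (not a real builtin) around an object's class."""

    def __init__(self, obj):
        self._cls = type(obj)

    def __format__(self, spec):
        if spec == "":
            # Assumed default: the short class name.
            return self._cls.__name__
        if spec == "T":
            # Assumed meaning of "T": module plus qualified name.
            return "{}.{}".format(self._cls.__module__, self._cls.__qualname__)
        raise ValueError("unknown format code {!r} for classof".format(spec))


text = "Expected int but got '{}'".format(classof(3.5))
qualified = "{:T}".format(classof(3.5))
print(text)
print(qualified)
```

Since the wrapper owns its __format__, the "T" code works for any wrapped object without touching the __format__ of the object's own type.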

...

Cheers, Nick.

...
There's no logic for calling through to object.__format__ for unknown specifiers. Look at datetime, for example. It uses strftime, so "T" currently just prints a literal "T".

And for object.__format__, we recently made it an error to specify any format string. This is to prevent you from calling format(an_object, ".30") and "knowing" that it's interpreted by str.__format__ (because that's the default conversion for object.__format__). If in the future an_object's class added its own __format__, this code would break (or at least do the wrong thing).

But I really do like the idea! Maybe there's a way to just make obj.__class__ recognize "T", so you could at least do: format(obj.__class__, "T") or equivalently: "{:T}".format(obj.__class__)

I realize that having to use .__class__ defeats some of the beauty of this scheme.

Eric.


Vajrasky Kok

11:30 p.m.

On Sun, Dec 15, 2013 at 7:52 AM, Victor Stinner <victor.stinner@gmail.com> wrote:

...

Oh, I like this proposition! The following pattern is very common in Python:

"... %.400s ...", Py_TYPE(self)->tp_name

Okay, I have created a ticket (and a preliminary patch) for this enhancement: http://bugs.python.org/issue19984

Vajrasky Kok


18 comments

11 participants

participants (11)

Antoine Pitrou
David Hutto
Eric V. Smith
Ethan Furman
Greg Ewing
Nick Coghlan
Stephen J. Turnbull
Steven D'Aprano
Vajrasky Kok
Victor Stinner
Walter Dörwald
