Mailman 3 Re: [Patches] [Patch #102955] bltinmodule.c warning fix - Python-Dev

Re: [Patches] [Patch #102955] bltinmodule.c warning fix

Andrew Kuchling

Dec. 20, 2000

8:40 a.m.

On Tue, Dec 19, 2000 at 07:02:05PM -0800, noreply@sourceforge.net wrote:

...

Date: 2000-Dec-19 19:02 By: tim_one

...

Is it OK to refer to 8-bit strings under that name? How about "expected an 8-bit string or Unicode string", when the object passed to ord() isn't of the right type?

Similarly, when the value is of the right type but has length>1, the message is "ord() expected a character, length-%d string found". Should that be "length-%d (string / unicode) found"?

And should the type names be changed to '8-bit string'/'Unicode string', maybe?

--amk

Tim Peters

December 2000

1:44 a.m.

New subject: [Patches] [Patch #102955] bltinmodule.c warning fix

[Andrew Kuchling]

...

Actually, upon reflection I think it was a mistake to add all these "or Unicode" clauses to the error msgs to begin with. Python used to have only one string type, we're saying that's also a hope for the future, and in the meantime I know I'd have no trouble understanding "string" as including both 8-bit strings and Unicode strings.

So we should say "8-bit string" or "Unicode string" when *only* one of those is allowable. So

    "ord() expected string ..."

instead of (even a repaired version of)

    "ord() expected string or Unicode character ..."

but-i'm-not-even-motivated-enough-to-finish-this-sig-
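The naming rule proposed in the message above can be sketched in a few lines of Python. This is only an illustration of the convention: `expected_phrase` and its two boolean parameters are hypothetical names and do not appear in bltinmodule.c.

```python
def expected_phrase(accepts_8bit, accepts_unicode):
    """Return the type name an error message should use.

    Hypothetical helper illustrating the rule from the message above:
    say plain "string" when both flavors are acceptable, and name a
    specific flavor only when it is the *only* one allowed.
    """
    if accepts_8bit and accepts_unicode:
        return "string"
    if accepts_8bit:
        return "8-bit string"
    if accepts_unicode:
        return "Unicode string"
    raise ValueError("function accepts no string type at all")


# ord() takes either flavor, so its message would just say "string".
print("ord() expected %s ..." % expected_phrase(True, True))
```

Under this rule the longer names show up only in messages from functions that really do reject one of the two types.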

M.-A. Lemburg

5:16 a.m.

New subject: [Patches] [Patch #102955] bltinmodule.c warning fix

Tim Peters wrote:

...

I think this has to do with understanding that there are two string types in Python 2.0 -- a novice won't notice this until she sees the error message.

My understanding is similar to yours, "string" should mean "any string object" and in cases where the difference between 8-bit string and Unicode matters, these should be referred to as "8-bit string" and "Unicode string".

Still, I think it is a good idea to make people aware of the possibility of passing Unicode objects to these functions, so perhaps the idea of adding both possibilities to error messages is not such a bad idea for 2.1. The next phases would be converting all messages back to "string" and then convert all strings to Unicode ;-)

--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

Tim Peters

1:15 a.m.

New subject: [Patches] [Patch #102955] bltinmodule.c warning fix

[Tim]

...

... So we should say "8-bit string" or "Unicode string" when *only* one of those is allowable. So

"ord() expected string ..."

instead of (even a repaired version of)

"ord() expected string or Unicode character ..."

[MAL]

...

I think this has to do with understanding that there are two string types in Python 2.0 -- a novice won't notice this until she sees the error message.

Except that this error msg has nothing to do with how many string types there are: they didn't pass *any* flavor of string when they get this msg. At the time they pass (say) a float to ord(), that there are currently two flavors of string is more information than they need to know.

...

My understanding is similar to yours, "string" should mean "any string object" and in cases where the difference between 8-bit string and Unicode matters, these should be referred to as "8-bit string" and "Unicode string".

In that happy case of universal harmony, the msg above should say just "string" and leave it at that.

...

Still, I think it is a good idea to make people aware of the possibility of passing Unicode objects to these functions,

Me too.

...

so perhaps the idea of adding both possibilities to error messages is not such a bad idea for 2.1.

But not that. The user is trying to track down their problem. Advertising an irrelevant (to their problem) distinction at that time of crisis is simply spam.

    TypeError: ord() requires an 8-bit string or a Unicode string.
               On the other hand, you'd be surprised to discover
               all the things you can pass to chr(): it's not just
               ints. Long ints are also accepted, by design, and
               due to an obscure bug in the Python internals, you
               can also pass floats, which get truncated to ints.

...

The next phases would be converting all messages back to "string" and then convert all strings to Unicode ;-)

Then we'll save a lot of work by skipping the need for the first half of that -- unless you're volunteering to do all of it <wink>.

Andrew Kuchling

12:37 p.m.

New subject: [Patches] [Patch #102955] bltinmodule.c warning fix

On Thu, Dec 21, 2000 at 02:44:19AM -0500, Tim Peters wrote:

...

So we should say "8-bit string" or "Unicode string" when *only* one of those is allowable. So

OK... how about this patch?

Index: bltinmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
retrieving revision 2.185
diff -u -r2.185 bltinmodule.c
--- bltinmodule.c 2000/12/20 15:07:34 2.185
+++ bltinmodule.c 2000/12/21 18:36:54
@@ -1524,13 +1524,14 @@
         }
     } else {
         PyErr_Format(PyExc_TypeError,
-                 "ord() expected string or Unicode character, " \
+                 "ord() expected string of length 1, but " \
                  "%.200s found", obj->ob_type->tp_name);
         return NULL;
     }
 
     PyErr_Format(PyExc_TypeError,
-             "ord() expected a character, length-%d string found",
+             "ord() expected a character, "
+             "but string of length %d found",
              size);
     return NULL;
 }
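To see the two messages from the patch side by side, here is a rough pure-Python model of the two error paths. It is a sketch, not the real code: the built-in ord() is implemented in C, `ord_model` is a made-up name, and Python 3's `str` and `bytes` stand in (as an assumption) for the Unicode and 8-bit string types discussed in this thread. The message texts are copied from the "+" lines of the diff.

```python
def ord_model(obj):
    """Pure-Python model of the two error paths touched by the patch.

    Hypothetical stand-in for the C code; str/bytes play the roles of
    the Unicode and 8-bit string types.
    """
    if not isinstance(obj, (str, bytes)):
        # Wrong type: report the name of the type that was found
        # (the C code formats obj->ob_type->tp_name with %.200s).
        raise TypeError(
            "ord() expected string of length 1, but "
            "%.200s found" % type(obj).__name__
        )
    if len(obj) != 1:
        # Right type, wrong length.
        raise TypeError(
            "ord() expected a character, "
            "but string of length %d found" % len(obj)
        )
    # Indexing a bytes object already yields an int in Python 3.
    return obj[0] if isinstance(obj, bytes) else ord(obj)


# Trigger each path once and show the resulting message.
for bad in (3.14, "ab"):
    try:
        ord_model(bad)
    except TypeError as exc:
        print(exc)
```

Neither message mentions "8-bit" or "Unicode": the first names only the offending type, and the second names only the offending length.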

Tim Peters

1:16 a.m.

New subject: [Patches] [Patch #102955] bltinmodule.c warning fix

[Tim]

...

So we should say "8-bit string" or "Unicode string" when *only* one of those is allowable.

[Andrew]

...

OK... how about this patch?

+1 from me. And maybe if you offer to send a royalty to Marc-Andre each time it's printed, he'll back down from wanting to use the error msgs as a billboard <wink>.

...

Index: bltinmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
retrieving revision 2.185
diff -u -r2.185 bltinmodule.c
--- bltinmodule.c 2000/12/20 15:07:34 2.185
+++ bltinmodule.c 2000/12/21 18:36:54
@@ -1524,13 +1524,14 @@
         }
     } else {
         PyErr_Format(PyExc_TypeError,
-                 "ord() expected string or Unicode character, " \
+                 "ord() expected string of length 1, but " \
                  "%.200s found", obj->ob_type->tp_name);
         return NULL;
     }
 
     PyErr_Format(PyExc_TypeError,
-             "ord() expected a character, length-%d string found",
+             "ord() expected a character, "
+             "but string of length %d found",
              size);
     return NULL;
 }
