Mailman 3 Inconsistency in 2.4.3 for __repr__() returning unicode - Python-Dev


Inconsistency in 2.4.3 for __repr__() returning unicode

Hye-Shik Chang

27 Mar 2006

1:10 p.m.

We got an inconsistency for __repr__() returning unicode, as reported in http://python.org/sf/1459029 :

    class s1:
        def __repr__(self):
            return '\\n'

    class s2:
        def __repr__(self):
            return u'\\n'

    print repr(s1()), repr(s2())

Until 2.4.2: \n \n
2.4.3: \n \\n

\\n looks a bit weird, but it's correct. As once discussed[1] on python-dev, if __repr__ returns a unicode object, PyObject_Repr encodes it via the unicode-escape codec, so non-Latin characters can also appear in a repr neutrally.

But our unicode-escape codec has had a bug ever since it was introduced: it didn't escape backslashes. Therefore backslashes weren't escaped in the repr, although they should have been, because we used the unicode-escape codec.

So, fixing the bug made a behavior inconsistency. How do we resolve the problem?

Hye-Shik

[1] http://mail.python.org/pipermail/python-dev/2000-July/005353.html
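The escaping behavior described in this message can be sketched as follows. This is only an illustration written for Python 3 (the thread concerns Python 2.4), and it assumes that the modern `unicode_escape` codec behaves like the fixed codec discussed here, that is, it doubles backslashes. The variable names are made up for the example.

```python
# Python 3 sketch; the thread itself is about Python 2.4.
# Two characters: a backslash followed by the letter 'n'.
text = "\\n"

# The codec doubles the backslash, which is the "fixed" behavior
# that produced \\n in the report above.
escaped = text.encode("unicode_escape")
print(escaped.decode("ascii"))         # prints \\n

# A non-Latin character becomes a \uXXXX escape, so the result is
# plain ASCII regardless of the default encoding.
hangul = "\ud55c"                      # U+D55C, a Hangul syllable
hangul_escaped = hangul.encode("unicode_escape")
print(hangul_escaped.decode("ascii"))  # prints \ud55c
```

The second half shows why the codec was attractive for repr(): the output never depends on the default encoding.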


M.-A. Lemburg

27 Mar

3:44 p.m.

Hye-Shik Chang wrote:


We got an inconsistency for __repr__() returning unicode as reported in http://python.org/sf/1459029 :

    class s1:
        def __repr__(self):
            return '\\n'

    class s2:
        def __repr__(self):
            return u'\\n'

    print repr(s1()), repr(s2())

Until 2.4.2: \n \n
2.4.3: \n \\n

\\n looks a bit weird, but it's correct. As once discussed[1] on python-dev, if __repr__ returns a unicode object, PyObject_Repr encodes it via the unicode-escape codec, so non-Latin characters can also appear in a repr neutrally.

I don't think that using unicode-escape is the right choice for converting a string returned by __repr__ to a string - why would you want to escape a Unicode string that was specifically prepared to provide the representation of an object ?


But our unicode-escape codec has had a bug ever since it was introduced: it didn't escape backslashes. Therefore backslashes weren't escaped in the repr, although they should have been, because we used the unicode-escape codec.

So, fixing the bug made a behavior inconsistency. How do we resolve the problem?

Change PyObject_Repr() to use the default encoding (again) which is also consistent with how PyObject_Str() works. To make repr() conversion more robust, we could have PyObject_Repr() apply the conversion using the 'replace' error strategy - after all, repr() is usually only used for debugging, where it's more important that you do get an output rather than an exception.
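The effect of the 'replace' error strategy proposed above can be sketched like this. It is a Python 3 illustration, not the C-level change itself, and it assumes ASCII as the default encoding (as in a stock Python 2 install). The sample string and variable names are invented for the example.

```python
# Python 3 sketch; "ascii" stands in for the Python 2 default encoding.
text = "caf\u00e9"  # contains one non-ASCII character (e with acute accent)

# Strict encoding (the default error strategy) raises an exception.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With 'replace', the unencodable character becomes '?' and the call
# always yields some output, which is the robustness argued for above.
lenient = text.encode("ascii", "replace")
print(strict_failed)            # prints True
print(lenient.decode("ascii"))  # prints caf?
```

The trade-off is that the replaced characters are lost, which may be acceptable for debugging output but not for round-tripping.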


Hye-Shik

[1] http://mail.python.org/pipermail/python-dev/2000-July/005353.html

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 27 2006)


Anthony Baxter

28 Mar

11:54 a.m.

New subject: Inconsistency in 2.4.3 for __repr__() returning unicode

On Monday 27 March 2006 21:14, M.-A. Lemburg wrote:


Change PyObject_Repr() to use the default encoding (again) which is also consistent with how PyObject_Str() works.

For 2.4.3, I plan to just revert the following patch (and supply a test case)

Index: Objects/object.c
===================================================================
--- Objects/object.c	(revision 16197)
+++ Objects/object.c	(revision 16198)
@@ -267,7 +267,7 @@
 		return NULL;
 	if (PyUnicode_Check(res)) {
 		PyObject* str;
-		str = PyUnicode_AsEncodedString(res, NULL, NULL);
+		str = PyUnicode_AsUnicodeEscapeString(res);
 		Py_DECREF(res);
 		if (str)
 			res = str;

Does anyone have any objections to this? The test suite passes with this (including the new test) as do various random tests I could string together.

I need to apply this in the next short while, so if you have an issue with it, please speak up now!

Thanks,
Anthony
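The two sides of the diff can be modeled roughly in Python 3 as below. This is a hypothetical sketch, not the CPython source: `convert_repr_result` and the `S2` class are names invented here, and ASCII is assumed as a stand-in for the default encoding used by `PyUnicode_AsEncodedString(res, NULL, NULL)`.

```python
# Hypothetical model of the branch touched by the diff (Python 3 sketch).
def convert_repr_result(res, use_unicode_escape):
    """Convert a text result of __repr__ to bytes in one of two ways."""
    if isinstance(res, str):
        if use_unicode_escape:
            # The '+' line: always escape, including backslashes.
            return res.encode("unicode_escape")
        # The '-' line: default encoding (assumed to be ASCII here).
        return res.encode("ascii")
    return res


class S2:
    def __repr__(self):
        return "\\n"  # a backslash followed by 'n'


raw = S2().__repr__()
default_encoded = convert_repr_result(raw, use_unicode_escape=False)
escape_encoded = convert_repr_result(raw, use_unicode_escape=True)
print(default_encoded.decode("ascii"))  # prints \n
print(escape_encoded.decode("ascii"))   # prints \\n
```

With the default encoding the backslash passes through unchanged, while the unicode-escape path doubles it, which matches the two outputs reported at the start of the thread.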


To make repr() conversion more robust, we could have PyObject_Repr() apply the conversion using the 'replace' error strategy - after all, repr() is usually only used for debugging, where it's more important that you do get an output rather than an exception.

-- Anthony Baxter It's never too late to have a happy childhood.

Anthony Baxter

1:11 p.m.

New subject: Inconsistency in 2.4.3 for __repr__() returning unicode

Never mind. For 2.4.3, I reverted perky's patch for the unicode-escape, and reverted the old patch for PyObject_Repr on the trunk. After talking to perky and Neal, this seemed like the safest option for 2.4.3.

Anthony
