[issue8821] Range check on unicode repr

Marc-Andre Lemburg report at bugs.python.org
Mon Aug 2 11:07:52 CEST 2010


Marc-Andre Lemburg <mal at egenix.com> added the comment:

Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou at free.fr> added the comment:
> 
> Well, the patch was technically useless since, as mentioned, unicode strings are terminated by a NUL character by design.

There are two things to keep in mind:

 * Unicode objects are NUL-terminated, but only external APIs rely
   on this (e.g. code using the Windows Unicode API). Please don't
   make the code in unicodeobject.c itself rely on this subtle
   detail.

 * The codecs work on Py_UNICODE* buffers which are *never* guaranteed
   to be NUL-terminated, so the problem in question is real.
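
To illustrate the second point, here is a minimal sketch of a length-based
range check when combining a surrogate pair. It is not the actual patch and
not code from unicodeobject.c: the type code_unit (a stand-in for Py_UNICODE
on a narrow build) and the function next_code_point are hypothetical names
made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE on a narrow (UCS-2) build. */
typedef uint16_t code_unit;

/*
 * Return the code point that starts at buf[i] and store the number of
 * code units consumed in *consumed.
 *
 * A high surrogate is combined with the following unit only if that unit
 * lies inside the buffer (i + 1 < len). The check uses the explicit
 * length and never assumes a trailing NUL.
 */
static uint32_t
next_code_point(const code_unit *buf, size_t len, size_t i, size_t *consumed)
{
    assert(buf != NULL && consumed != NULL && i < len);

    code_unit ch = buf[i];
    *consumed = 1;

    if (ch >= 0xD800 && ch <= 0xDBFF && i + 1 < len) {
        code_unit ch2 = buf[i + 1];
        if (ch2 >= 0xDC00 && ch2 <= 0xDFFF) {
            *consumed = 2;
            return ((((uint32_t)(ch - 0xD800)) << 10) |
                    (uint32_t)(ch2 - 0xDC00)) + 0x10000;
        }
    }
    /* Lone surrogate or BMP character: pass it through unchanged. */
    return ch;
}
```

With the two-unit buffer {0xDBC0, 0xDC00} and len = 2 this returns 0x100000
and consumes two units. With len = 1 it returns 0xDBC0 and never reads
buf[1], which is the behaviour a codec working on an unterminated buffer
needs.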

> Anyway, I now get the following error on the 2.7 branch. Perhaps it's related:
> 
> ======================================================================
> FAIL: test_ucs4 (test.test_unicode.UnicodeTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/antoine/cpython/27/Lib/test/test_unicode.py", line 941, in test_ucs4
>     self.assertEqual(x, y)
> AssertionError: '\\udbc0\\udc00' != '\\U00100000'
> 
> ----------
> nosy: +pitrou
> status: closed -> open
> 
> _______________________________________
> Python tracker <report at bugs.python.org>
> <http://bugs.python.org/issue8821>
> _______________________________________

----------
nosy: +lemburg

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8821>
_______________________________________

