[New-bugs-announce] [issue18183] Calling .lower() on certain unicode string raises SystemError

Dave Challis report at bugs.python.org
Mon Jun 10 16:20:36 CEST 2013


New submission from Dave Challis:

This occurs when decoding invalid UTF-8 bytes with "errors='replace'" and then lowercasing the resulting unicode string.
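
For readers unfamiliar with the "replace" error handler, here is a minimal sketch of what it does on a much shorter input. The byte string below is an illustrative assumption and is not taken from the report; it is not expected to trigger the error.

```python
# Illustrative input only: 0xff can never appear in valid UTF-8.
raw = b'abc\xff'

# errors='replace' substitutes U+FFFD (REPLACEMENT CHARACTER) for bytes
# that cannot be decoded, instead of raising UnicodeDecodeError.
decoded = raw.decode('utf-8', errors='replace')

print(ascii(decoded))  # 'abc\ufffd'
```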

I also tested this in Python 2.7, where the error does not occur.

Code to reproduce:

x = b'\xe2\xb3\x99\xb3\xd1\x9f\xe0vjGd|\x12\xf2\x84\xac\xae&$\xa4\xae+\xa4sbtf$&fG\xfb\xe6?.\xe2sbv\x14\xcb\x89\x98\xda\xd9\x99\xda\xb9d9\x1bY\x99\xb7\xb3\x1b9\xa2y*B\xa3\xba\xefj&g\xe2\x92Et\x85~\xbf\x8a\xe3\x919\x8bvc\xfb#$$.\xber6D&b.#4\xa4.\x13RtI\x10\xed\x9c\xd0\x98\xb8\x18\x91\x99\\\nC\x13\x8dV\xccL\xf4\x89\x9c\x90'

x = x.decode('utf-8', errors='replace')

x.lower()


Output:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: invalid maximum character passed to PyUnicode_New
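
One possible way to narrow down which part of the string triggers the error is to shrink the input greedily while the failure persists. The helpers below (raises_system_error, shrink) are hypothetical names introduced for this sketch and are not part of the report or of CPython; on an interpreter that does not raise the error, shrink() simply returns its input unchanged.

```python
def raises_system_error(op, text):
    """Return True if calling op(text) raises SystemError."""
    try:
        op(text)
    except SystemError:
        return True
    return False


def shrink(text, op):
    """Greedily drop characters while op(text) still raises SystemError.

    Hypothetical helper for illustration: it returns a shorter string
    that still fails, not necessarily the shortest possible one.
    """
    if not raises_system_error(op, text):
        return text  # nothing to shrink; op does not fail on this input
    i = 0
    while i < len(text):
        candidate = text[:i] + text[i + 1:]
        if raises_system_error(op, candidate):
            text = candidate  # character i was not needed for the failure
        else:
            i += 1  # character i is needed; keep it and move on
    return text
```

With the decoded string x from the report, print(ascii(shrink(x, str.lower))) would show the reduced input on an affected interpreter.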

----------
components: Unicode
messages: 190907
nosy: davechallis, ezio.melotti
priority: normal
severity: normal
status: open
title: Calling .lower() on certain unicode string raises SystemError
type: behavior
versions: Python 3.3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18183>
_______________________________________

