[issue6697] Python 3.1 segfaults when invalid UTF-8 characters are passed from command line

Amaury Forgeot d'Arc report at bugs.python.org
Wed Aug 19 14:34:00 CEST 2009


Amaury Forgeot d'Arc <amauryfa at gmail.com> added the comment:

The problem is actually wider::

    >>> getattr(None, "\udc80")
    Segmentation fault

An idea would be to change _PyUnicode_AsDefaultEncodedString and allow
unpaired surrogates (utf8+surrogateescape, as explained in PEP 383), but
I fear the consequences...
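
To make the difference concrete, here is a minimal Python-level sketch of
strict UTF-8 versus utf8+surrogateescape for the same lone surrogate. The
helper names encode_strict and encode_escaped are made up for illustration;
the sketch only shows codec behaviour and is not the C code in question.

```python
# A lone low surrogate: what surrogateescape produces when it decodes
# the undecodable byte 0x80 (PEP 383).
name = "\udc80"


def encode_strict(text):
    """Hypothetical helper: UTF-8 bytes, or None if strict encoding fails."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        # Strict UTF-8 refuses unpaired surrogates.
        return None


def encode_escaped(text):
    """Hypothetical helper: encode with the surrogateescape error handler."""
    return text.encode("utf-8", "surrogateescape")


strict_result = encode_strict(name)
escaped_result = encode_escaped(name)

# Decoding with the same handler restores the original string.
round_trip = escaped_result.decode("utf-8", "surrogateescape")
```

With surrogateescape the encoded result is no longer valid UTF-8, which is
presumably one of the consequences to worry about.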

The code that fails seems pretty common:

    PyErr_Format(PyExc_AttributeError,
                 "'%.50s' object has no attribute '%.400s'",
                 tp->tp_name, _PyUnicode_AsString(name));

It would be unfortunate to have to change every use of
_PyUnicode_AsString so that it checks the return value.
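
As a rough illustration of what such a check costs at each call site, here
is a Python model of the pattern. It is a sketch under assumptions, not the
CPython source: as_c_string is a hypothetical stand-in for
_PyUnicode_AsString, with None playing the role of a NULL return, and the
fallback message is invented for the example.

```python
def as_c_string(name):
    """Hypothetical stand-in for _PyUnicode_AsString.

    Returns the UTF-8 bytes, or None (modelling NULL) when the string
    cannot be encoded.
    """
    try:
        return name.encode("utf-8")
    except UnicodeEncodeError:
        return None


def attribute_error_message(type_name, name):
    """Build the error message, checking the conversion result first."""
    encoded = as_c_string(name)
    if encoded is None:
        # This is the extra branch every caller would need; the wording
        # of the fallback is an assumption made for this sketch.
        return ("'%.50s' object has no attribute (name is not encodable)"
                % type_name)
    return ("'%.50s' object has no attribute '%.400s'"
            % (type_name, encoded.decode("utf-8")))


ok_message = attribute_error_message("NoneType", "foo")
bad_message = attribute_error_message("NoneType", "\udc80")
```

The unchecked form is a single expression; the checked form needs a
temporary and a branch wherever the conversion is used.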

Martin, what do you think?

----------
nosy: +amaury.forgeotdarc, loewis

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6697>
_______________________________________

