[issue6697] Python 3.1 segfaults when invalid UTF-8 characters are passed from command line
Martin v. Löwis
report at bugs.python.org
Wed Aug 19 22:20:06 CEST 2009
Martin v. Löwis <martin at v.loewis.de> added the comment:
> It would be unfortunate to replace all usages of _PyUnicode_AsString to
> check the return value.
I agree with MAL: we do need to check for errors returned from
_PyUnicode_AsString, and it would be best if we created a fail-safe
version of it.
In the specific case (getattr), it might also be useful to create a
result that is unicode-escaped, i.e. with \u escapes for all non-ASCII
characters.
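To illustrate what an escaped result could look like, here is a minimal Python-level sketch. The helper name escape_non_ascii is hypothetical, and the real fix would live in C; this only shows the shape of the output. Note that the "backslashreplace" handler emits \x escapes for code points below U+0100 and \u escapes above that, so it is an approximation of the "\u escapes" idea rather than an exact match.

```python
def escape_non_ascii(name: str) -> str:
    """Return `name` with every non-ASCII code point replaced by a
    backslash escape (hypothetical helper, for illustration only)."""
    # Encoding to ASCII with "backslashreplace" never raises: anything
    # outside ASCII, including a lone surrogate, becomes an escape.
    return name.encode("ascii", "backslashreplace").decode("ascii")


# An attribute name containing a lone (half) surrogate stays printable.
print(escape_non_ascii("attr\udc80"))
print(escape_non_ascii("na\u00efve"))
```

Plain ASCII names pass through unchanged, so an error message built this way reads the same as before in the common case.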
For _PyUnicode_AsString, I'm uncertain whether supporting half
surrogates is a good idea. Unless there is a compelling reason to
support them, I think we leave that as-is. Your example is not
compelling: I think the unicode string should be escaped, anyway.
The OP's case is also not compelling: we should print an error
message saying that the source code is incorrectly encoded.
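As a rough Python-level analogue of the failure being discussed, the sketch below assumes the invalid byte reaches the interpreter as a half surrogate via the "surrogateescape" error handler (PEP 383). The helper utf8_or_none is hypothetical; it stands in for a caller that checks the result of the conversion instead of using it blindly.

```python
def utf8_or_none(text: str):
    """Return the strict UTF-8 encoding of `text`, or None on failure
    (hypothetical helper mirroring a checked NULL return in C)."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        # Half surrogates cannot be encoded as strict UTF-8.
        return None


raw = b"abc\xff"                              # 0xff is never valid UTF-8
arg = raw.decode("utf-8", "surrogateescape")  # invalid byte -> U+DCFF

print(ascii(arg))
print(utf8_or_none(arg))       # the error case a caller has to handle
print(utf8_or_none("abc"))     # the normal case
```

The point is only that the conversion can fail for such strings, so every caller needs either a check or a variant that cannot fail.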
Python tracker <report at bugs.python.org>