[issue13643] 'ascii' is a bad filesystem default encoding

STINNER Victor report at bugs.python.org
Wed Dec 21 21:04:44 CET 2011


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> it will still be passing values that can't be
> interpreted by other processes as you highlighted earlier.

On UNIX, data leaving Python has to be encoded: you pass byte strings, not Unicode strings directly. With the surrogateescape error handler, surrogates are encoded back to the original bytes.

Example:

>>> b'a\xff'.decode('ascii', 'surrogateescape')
'a\udcff'
>>> b'a\xff'.decode('ascii', 'surrogateescape').encode('ascii', 'surrogateescape')
b'a\xff'
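
The same round trip can be written as a small script. This is only a sketch: the helper names to_text() and to_bytes() are hypothetical, and the encoding is passed explicitly instead of being taken from the locale. It also shows that a strict encode of the escaped string fails, which is why the surrogateescape handler has to be used again on the way out.

```python
def to_text(raw, encoding='ascii'):
    # Hypothetical helper: decode bytes, mapping each undecodable byte
    # 0x80..0xFF to a lone surrogate U+DC80..U+DCFF.
    return raw.decode(encoding, 'surrogateescape')


def to_bytes(text, encoding='ascii'):
    # Hypothetical helper: encode text, turning lone surrogates
    # U+DC80..U+DCFF back into the original bytes 0x80..0xFF.
    return text.encode(encoding, 'surrogateescape')


raw = b'a\xff'            # 0xff is not valid ASCII
escaped = to_text(raw)    # 'a\udcff'
restored = to_bytes(escaped)

# Without the error handler, the lone surrogate cannot be encoded.
try:
    escaped.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(ascii(escaped), restored, strict_failed)
```

The other process receives b'a\xff', exactly the bytes Python was given; the surrogate only exists inside Python.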

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13643>
_______________________________________

