Author: martin.v.loewis
Date: Thu Apr 30 11:50:16 2009
New Revision: 72148

Log:
Restrict escapable bytes into the 128..255 range.

Modified:
   peps/trunk/pep-0383.txt

Modified: peps/trunk/pep-0383.txt
==============================================================================
--- peps/trunk/pep-0383.txt (original)
+++ peps/trunk/pep-0383.txt Thu Apr 30 11:50:16 2009
@@ -68,8 +68,9 @@
 
 On POSIX systems, Python currently applies the locale's encoding to
 convert the byte data to Unicode, failing for characters that cannot
-be decoded. With this PEP, non-decodable bytes will be represented as
-lone half surrogate codes U+DCxx.
+be decoded. With this PEP, non-decodable bytes >128 will be
+represented as lone half surrogate codes U+DC80..U+DCFF. Bytes below
+128 will produce exceptions; see the discussion below.
 
 To convert non-decodable bytes, a new error handler ([2])
 "python-escape" is introduced, which produces these half
@@ -109,6 +110,11 @@
 Data obtained from other sources may conflict with data produced
 by this PEP. Dealing with such conflicts is out of scope of the PEP.
 
+Encodings that are not compatible with ASCII are not supported by
+this specification; bytes in the ASCII range that fail to decode
+will cause an exception. It is widely agreed that such encodings
+should not be used as locale charsets.
+
 For most applications, we assume that they eventually pass data
 received from a system interface back into the same system
 interfaces. For example, an application invoking os.listdir() will
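
For illustration only, the behaviour this revision describes can be sketched in Python. This is not part of the commit. It assumes the handler name "surrogateescape", under which the mechanism later shipped in Python 3.1 and newer; the revision itself still calls it "python-escape". The sample file name and the helper escapes_ascii_byte() are hypothetical.

```python
import codecs

# Assumption: "surrogateescape" is the name under which this handler later
# shipped in Python 3.1+; the revision above still calls it "python-escape".
HANDLER = "surrogateescape"

# Hypothetical file name: 0xE9 on its own is not valid UTF-8.
raw = b"caf\xe9.txt"

# The undecodable byte 0xE9 becomes the lone surrogate U+DCE9 (0xDC00 + 0xE9).
name = raw.decode("utf-8", errors=HANDLER)

# Encoding with the same handler restores the original byte.
roundtrip = name.encode("utf-8", errors=HANDLER)


def escapes_ascii_byte() -> bool:
    """Return True if the handler escapes an ASCII byte instead of raising."""
    # Simulate a codec reporting that the ASCII byte b"a" failed to decode.
    err = UnicodeDecodeError("demo", b"a", 0, 1, "hypothetical failure")
    try:
        codecs.lookup_error(HANDLER)(err)
    except UnicodeDecodeError:
        return False
    return True
```

Here the high byte round-trips through U+DCE9, while a failing byte in the ASCII range is refused and the decode error propagates, matching the 128..255 restriction in the log message.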