[Python-checkins] r72233 - peps/trunk/pep-0383.txt

martin.v.loewis python-checkins at python.org
Sun May 3 10:16:59 CEST 2009


Author: martin.v.loewis
Date: Sun May  3 10:16:57 2009
New Revision: 72233

Log:
Remove utf-8b codec. Rename error handler to utf8b.


Modified:
   peps/trunk/pep-0383.txt

Modified: peps/trunk/pep-0383.txt
==============================================================================
--- peps/trunk/pep-0383.txt	(original)
+++ peps/trunk/pep-0383.txt	Sun May  3 10:16:57 2009
@@ -72,27 +72,17 @@
 represented as lone half surrogate codes U+DC80..U+DCFF. Bytes below
 128 will produce exceptions; see the discussion below.
 
-To convert non-decodable bytes, a new error handler ([2])
-"python-escape" is introduced, which produces these half
-surrogates. On encoding, the error handler converts the half surrogate
-back to the corresponding byte. This error handler will be used in any
-API that receives or produces file names, command line arguments, or
-environment variables.
+To convert non-decodable bytes, a new error handler ([2]) "utf8b" is
+introduced, which produces these half surrogates. On encoding, the
+error handler converts the half surrogate back to the corresponding
+byte. This error handler will be used in any API that receives or
+produces file names, command line arguments, or environment variables.
 
 The error handler interface is extended to allow the encode error
 handler to return byte strings immediately, in addition to returning
 Unicode strings which then get encoded again (also see the discussion
 below).
 
-If the locale's encoding is UTF-8, the file system encoding is set to
-a new encoding "utf-8b", as the regular UTF-8 codec would not
-re-encode half surrogates as single bytes. The UTF-8b codec decodes
-invalid bytes (which must be >= 0x80) into half surrogate codes
-U+DC80..U+DCFF. Unlike the utf-8 codec, the utf-8b codec follows the
-strict definition of UTF-8 to determine what an invalid byte is
-(which, among other restrictions, disallows to encode surrogate codes
-in UTF-8).
-
 Byte-orientied interfaces that already exist in Python 3.0 are not
 affected by this specification. They are neither enhanced nor 
 deprecated.
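
For readers following the diff above, here is a minimal sketch of the round trip the renamed error handler is meant to provide. It assumes a later Python 3 release, where this handler is available under the name "surrogateescape" rather than the "utf8b" name used in this revision. The helper names and the sample file name are hypothetical and do not come from the PEP text.

```python
# Sketch only: "surrogateescape" is the name this handler carries in
# released Python 3 versions; the revision above calls it "utf8b".

def decode_name(raw: bytes) -> str:
    """Decode bytes as UTF-8, mapping each undecodable byte >= 0x80
    to a lone half surrogate in the range U+DC80..U+DCFF."""
    return raw.decode("utf-8", errors="surrogateescape")

def encode_name(name: str) -> bytes:
    """Encode back to bytes, turning each escaped half surrogate
    into the original single byte."""
    return name.encode("utf-8", errors="surrogateescape")

# Hypothetical file name: 0xE9 is Latin-1 "e acute", invalid as UTF-8 here.
raw = b"caf\xe9.txt"
name = decode_name(raw)

assert name == "caf\udce9.txt"      # byte 0xE9 became U+DCE9
assert encode_name(name) == raw     # the original bytes are recovered
```

Without the handler, a strict UTF-8 encode of the decoded name raises UnicodeEncodeError, since lone surrogates may not be encoded in strict UTF-8.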

