[Python-checkins] r72145 - peps/trunk/pep-0383.txt

martin.v.loewis python-checkins at python.org
Thu Apr 30 10:34:04 CEST 2009


Author: martin.v.loewis
Date: Thu Apr 30 10:34:04 2009
New Revision: 72145

Log:
Add discussion of error handlers proposed by Glen
Linderman.


Modified:
   peps/trunk/pep-0383.txt

Modified: peps/trunk/pep-0383.txt
==============================================================================
--- peps/trunk/pep-0383.txt	(original)
+++ peps/trunk/pep-0383.txt	Thu Apr 30 10:34:04 2009
@@ -80,7 +80,8 @@
 
 The error handler interface is extended to allow the encode error
 handler to return byte strings immediately, in addition to returning
-Unicode strings which then get encoded again.
+Unicode strings which then get encoded again (also see the discussion
+below).
 
 If the locale's encoding is UTF-8, the file system encoding is set to
 a new encoding "utf-8b", as the regular UTF-8 codec would not
@@ -123,6 +124,17 @@
           # fn is now a str object
           yield fn.encode(fse, "python-escape")
 
+The encode error handler interface presently requires replacement
+Unicode to be provided in lieu of the non-encodable Unicode from the
+source string.  The encoder then promptly encodes that replacement
+Unicode.  For some error handlers, such as the python-escape handler
+proposed here, it is simpler and more efficient to provide a
+pre-encoded replacement byte string, rather than forcing the handler
+to calculate Unicode from which the encoder would create the desired
+bytes.  In fact, with python-escape, there are required byte sequences
+which cannot be generated from replacement Unicode.
+
+
 References
 ==========
 
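The interface extension discussed in the added paragraph can be sketched
as follows. This is an illustration only, not the PEP's implementation:
it assumes a Python 3 interpreter in which encode error handlers may
return bytes, and in which the handler proposed here as python-escape is
available under the name "surrogateescape". The handler name
"demo-escape" and the function demo_escape are hypothetical.

```python
import codecs


def demo_escape(exc):
    """Hypothetical encode-only handler that returns pre-encoded bytes."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    out = bytearray()
    for ch in exc.object[exc.start:exc.end]:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            # Lone surrogate standing in for an undecodable byte:
            # recover the original byte value directly.
            out.append(code - 0xDC00)
        else:
            # Anything else is a genuine encoding error.
            raise exc
    # Returning bytes (not str) lets the encoder copy them verbatim.
    return bytes(out), exc.end


codecs.register_error("demo-escape", demo_escape)

raw = b"caf\xe9.txt"  # not valid UTF-8
name = raw.decode("utf-8", "surrogateescape")
round_trip = name.encode("utf-8", "demo-escape")
```

No replacement Unicode string could make a UTF-8 encoder emit the lone
byte 0xE9, which is why the handler has to return the bytes itself.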

