[Python-checkins] r72145 - peps/trunk/pep-0383.txt
martin.v.loewis
python-checkins at python.org
Thu Apr 30 10:34:04 CEST 2009
Author: martin.v.loewis
Date: Thu Apr 30 10:34:04 2009
New Revision: 72145
Log:
Add discussion of error handlers proposed by Glen
Linderman.
Modified:
peps/trunk/pep-0383.txt
Modified: peps/trunk/pep-0383.txt
==============================================================================
--- peps/trunk/pep-0383.txt (original)
+++ peps/trunk/pep-0383.txt Thu Apr 30 10:34:04 2009
@@ -80,7 +80,8 @@
The error handler interface is extended to allow the encode error
handler to return byte strings immediately, in addition to returning
-Unicode strings which then get encoded again.
+Unicode strings which then get encoded again (also see the discussion
+below).
If the locale's encoding is UTF-8, the file system encoding is set to
a new encoding "utf-8b", as the regular UTF-8 codec would not
@@ -123,6 +124,17 @@
# fn is now a str object
yield fn.encode(fse, "python-escape")
+The encode error handler interface presently requires replacement
+Unicode to be provided in lieu of the non-encodable Unicode from the
+source string. It promptly encodes that replacement Unicode. In some
+error handlers, such as the python-escape proposed here, it is simpler
+and more efficient for the error handler to provide a pre-encoded
+replacement byte string, rather than forcing it to calculate Unicode
+from which the encoder would create the desired bytes. In fact, with
+python-escape, there are required byte sequences which cannot be
+generated from replacement Unicode.
+
+
References
==========
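
The paragraph added in this revision argues that an encode error handler should be able to return bytes directly. The following is a minimal sketch of that idea, not the PEP's implementation. It assumes a current Python 3, where codecs.register_error() accepts handlers that return a bytes replacement. The handler name "demo-escape-bytes" is hypothetical, and the U+DC80..U+DCFF mapping is borrowed from the "surrogateescape" handler that Python 3 ships, not taken from the text of this revision.

```python
import codecs


def escape_to_bytes(exc):
    """Hypothetical encode error handler that returns pre-encoded bytes.

    Assumption: undecodable bytes 0x80..0xFF were mapped to the lone
    surrogates U+DC80..U+DCFF when the string was decoded.
    """
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    out = bytearray()
    for ch in exc.object[exc.start:exc.end]:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            # Recover the original byte; no Unicode string would
            # encode to this lone byte under UTF-8.
            out.append(code - 0xDC00)
        else:
            # Not one of our escapes: report the original error.
            raise exc
    # Returning bytes tells the encoder to copy them verbatim
    # instead of encoding a replacement string again.
    return bytes(out), exc.end


codecs.register_error("demo-escape-bytes", escape_to_bytes)

# A Latin-1 encoded file name that is not valid UTF-8.
raw = b"caf\xe9.txt"
name = raw.decode("utf-8", "surrogateescape")
round_tripped = name.encode("utf-8", "demo-escape-bytes")
```

The lone byte 0xE9 illustrates the last sentence of the new paragraph: it is not a valid UTF-8 sequence on its own, so a handler restricted to returning replacement Unicode could not produce it.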