[Python-checkins] cpython (merge 3.3 -> default): merge 3.3

Sun Dec 2 17:33:27 CET 2012

http://hg.python.org/cpython/rev/239486c4e469
changeset:   80695:239486c4e469
parent:      80692:2181c37977d3
parent:      80694:221858c0d3b1
user:        Benjamin Peterson <benjamin at python.org>
date:        Sun Dec 02 11:33:14 2012 -0500
summary:
  merge 3.3

files:
  Doc/library/codecs.rst     |  17 ++++++++++-------
  Doc/library/exceptions.rst |  24 ++++++++++++++++++++++++
  2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -155,13 +155,16 @@
    when *name* is specified as the errors parameter.
 
    For encoding *error_handler* will be called with a :exc:`UnicodeEncodeError`
-   instance, which contains information about the location of the error. The error
-   handler must either raise this or a different exception or return a tuple with a
-   replacement for the unencodable part of the input and a position where encoding
-   should continue. The encoder will encode the replacement and continue encoding
-   the original input at the specified position. Negative position values will be
-   treated as being relative to the end of the input string. If the resulting
-   position is out of bound an :exc:`IndexError` will be raised.
+   instance, which contains information about the location of the error. The
+   error handler must either raise this or a different exception or return a
+   tuple with a replacement for the unencodable part of the input and a position
+   where encoding should continue. The replacement may be either :class:`str` or
+   :class:`bytes`.  If the replacement is bytes, the encoder will simply copy
+   them into the output buffer. If the replacement is a string, the encoder will
+   encode the replacement.  Encoding continues on original input at the
+   specified position. Negative position values will be treated as being
+   relative to the end of the input string. If the resulting position is out of
+   bound an :exc:`IndexError` will be raised.
 
    Decoding and translating works similar, except :exc:`UnicodeDecodeError` or
    :exc:`UnicodeTranslateError` will be passed to the handler and that the
diff --git a/Doc/library/exceptions.rst b/Doc/library/exceptions.rst
--- a/Doc/library/exceptions.rst
+++ b/Doc/library/exceptions.rst
@@ -377,6 +377,30 @@
    Raised when a Unicode-related encoding or decoding error occurs.  It is a
    subclass of :exc:`ValueError`.
 
+   :exc:`UnicodeError` has attributes that describe the encoding or decoding
+   error.  For example, ``err.object[err.start:err.end]`` gives the particular
+   invalid input that the codec failed on.
+
+   .. attribute:: encoding
+
+       The name of the encoding that raised the error.
+
+   .. attribute:: reason
+
+       A string describing the specific codec error.
+
+   .. attribute:: object
+
+       The object the codec was attempting to encode or decode.
+
+   .. attribute:: start
+
+       The first index of invalid data in :attr:`object`.
+
+   .. attribute:: end
+
+       The index after the last invalid data in :attr:`object`.
+
 
 .. exception:: UnicodeEncodeError
 

-- 
Repository URL: http://hg.python.org/cpython