codecs.register_error for "strict", unicode.encode() and str.decode()

Alan Franzoni
Fri Jul 27 01:03:01 CEST 2012

I think I'm missing some piece here.

I'm trying to register a default error handler so that encoding/decoding
errors don't raise exceptions. (I know how this works, and that making
it global is probably not good practice, but I found this strange
behaviour while writing a proof of concept of how to let Python work in
a more forgiving way.)

What I discovered is that register_error() for "strict" seems to work
the way I expect for string decoding, but not for unicode encoding.

This is what happens on a Mac, with Python 2.7.1 from Apple:

melquiades:tmp alan$ cat
# -*- coding: utf-8 -*-

import codecs

def handle_encode(e):
    return ("ASD", e.end)

codecs.register_error("strict", handle_encode)

print u"à".encode("ascii")

melquiades:tmp alan$ python
Traceback (most recent call last):
  File "", line 10, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in
position 0: ordinal not in range(128)

OTOH this works properly:

melquiades:tmp alan$ cat
# -*- coding: utf-8 -*-

import codecs

def handle_decode(e):
    return (u"ASD", e.end)

codecs.register_error("strict", handle_decode)

print "à".decode("ascii")

melquiades:tmp alan$ python

What piece am I missing? The documentation for codecs.register_error()
says: "For encoding /error_handler/ will be called with a
UnicodeEncodeError instance, which contains information about the
location of the error." Is there any reason why the standard "strict"
handler cannot be replaced?
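
For comparison, here is a minimal sketch of the same handler registered
under its own name instead of "strict" and passed explicitly as the
errors argument. The name "asd_replace" and the function handle_error
are made up for illustration, and the literals are written so that the
snippet should run on Python 2.7 as well as Python 3. It does not change
what a bare .encode("ascii") call does, which is the behaviour in
question above.

```python
import codecs

def handle_error(e):
    # e is a UnicodeEncodeError or a UnicodeDecodeError; the handler
    # returns the replacement text and the position at which to resume.
    return (u"ASD", e.end)

# "asd_replace" is a hypothetical name chosen for this sketch.
codecs.register_error("asd_replace", handle_error)

# The handler is requested by name through the errors argument.
encoded = u"\xe0".encode("ascii", "asd_replace")  # u"\xe0" is "à"
decoded = b"\xe0".decode("ascii", "asd_replace")  # a single non-ASCII byte

print(repr(encoded))
print(repr(decoded))
```

With an explicitly named handler, both the encode and the decode call
go through handle_error, so neither raises.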

Thanks for any clue.

Alan Franzoni
contact me at public@[mysurname].eu
