[Python-Dev] PEP 293, Codec Error Handling Callbacks

Martin v. Loewis martin@v.loewis.de
06 Aug 2002 12:12:54 +0200


Oren Tirosh <oren-py-d@hishome.net> writes:

> > > 2. Keep the old, limited functionality, let it fail, catch the
> > > error, re-use an argument originally intended for an error
> > > handling strategy to shoehorn a callback that can implement the
> > > missing functionality, add a new name-based registry to overcome
> > > the fact that the argument must be a string.

> > That is possible, but inefficient. 
> 
> I'm confused.
> 
> I have just described what PEP 293 is proposing and you say that it's 
> inefficient :-? 

Perhaps I have misunderstood your description. I was assuming an
algorithm like this:

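dispatch = {}  # maps an "errors" name to an encoding function

def new_encode(text, encoding, errors):
  return dispatch[errors](text, encoding)

def xml_encode(text, encoding):
  try:
    return text.encode(encoding, "strict")
  except UnicodeError:
    if len(text) == 1:
      # a single unencodable character becomes a character reference
      return ("&#%d;" % ord(text)).encode(encoding)
    # otherwise split the string in half and retry each half
    mid = len(text) // 2
    return xml_encode(text[:mid], encoding) + \
           xml_encode(text[mid:], encoding)

dispatch['xmlcharref'] = xml_encode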

This seems to match the description "keep the old, limited
functionality, let it fail, catch the error", and it has all the
deficiencies I mentioned. 

It is also not what PEP 293 proposes. The whole idea is that the
handler is invoked *before* the encoding call has failed: the codec
calls it at the offending character and carries on from there.
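
For illustration, here is a minimal sketch of such a handler, written
against the codecs.register_error interface that PEP 293 defines. The
handler name "example.charref" and the function name charref_handler
are made up for this example; the behaviour is intended to match the
"xmlcharrefreplace" handler the PEP specifies.

```python
import codecs

def charref_handler(exc):
    # Hypothetical handler: the codec calls it at the point where it
    # meets characters it cannot encode, passing the position in exc.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("&#%d;" % ord(ch) for ch in bad)
    # Return the replacement text and the position to resume from.
    return replacement, exc.end

codecs.register_error("example.charref", charref_handler)

encoded = "caf\u00e9 \u20ac".encode("ascii", "example.charref")
print(encoded)
```

The codec does a single pass over the string and consults the handler
only for the unencodable characters, so nothing is encoded twice and
no exception has to propagate to the caller.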

> Instead of treating it as a problem ("the string cannot be encoded") and 
> getting trapped in the mindset of error handling I suggest approaching it 
> from a positive point of view: "how can I make the encoding work the
> way I want it to work?".  Let's leave the error handling for real errors.

Sounds good, but how does this help in finding a solution?

> Treating this as an error-handling issue was so counter-intuitive to me 
> that until recently I never bothered to read PEP 293. The title made me 
> think that it's completely irrelevant to my needs. After all, what I 
> wanted was to translate HTML to/from Unicode, not find a better way to 
> handle errors.

If you think this is a documentation issue, I'm fine with documenting
the feature differently.

Regards,
Martin