[Python-Dev] PEP 383 update: utf8b is now the error handler
MRAB
google at mrabarnett.plus.com
Wed May 6 12:08:45 CEST 2009
M.-A. Lemburg wrote:
> Martin v. Löwis wrote:
>>> The name "utf8b" suggested in the PEP is not in line with the codec
>>> design
>> Where is that design documented, and how exactly does the name
>> violate the design (chapter and verse, please).
>
> Martin, I designed the whole Python codec machinery, so even if
> this is not explicitly written down somewhere, you can take my
> word for it.
>
> I don't want users to be confused by such an error handler
> name, so please change it!
>
> Here's a list of the currently available error handlers (taken from
> codecs.py):
>
> The .encode()/.decode() methods may use different error
> handling schemes by providing the errors argument. These
> string values are predefined:
>
>  'strict' - raise a ValueError error (or a subclass)
>  'ignore' - ignore the character and continue with the next
>  'replace' - replace with a suitable replacement character;
>              Python will use the official U+FFFD REPLACEMENT
>              CHARACTER for the builtin Unicode codecs on
>              decoding and '?' on encoding.
>  'xmlcharrefreplace' - Replace with the appropriate XML
>                        character reference (only for encoding).
>  'backslashreplace'  - Replace with backslashed escape sequences
>                        (only for encoding).
>
> The set of allowed values can be extended via register_error.
>
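For readers who have not used register_error, here is a minimal sketch of
extending that set. The handler name 'hexescape' and the function
hex_escape are made up for illustration and are not part of the PEP or of
codecs.py; only codecs.register_error and the (replacement, resume
position) return convention are the real API.

```python
import codecs

def hex_escape(exc):
    # Hypothetical handler: turn each undecodable byte into "<XX>".
    if isinstance(exc, UnicodeDecodeError):
        bad = exc.object[exc.start:exc.end]
        replacement = "".join("<%02X>" % b for b in bad)
        # Return the replacement text and the position to resume at.
        return replacement, exc.end
    # Anything else (e.g. encoding errors) is not handled by this sketch.
    raise exc

# The name is registered in the error handler registry, not as a codec.
codecs.register_error("hexescape", hex_escape)

decoded = b"abc\xffdef".decode("ascii", errors="hexescape")
print(decoded)
```

The name passed to register_error is the string a user later writes in
the errors argument, which is why its wording matters in this thread.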
>>> Error handlers and codecs are two different things, so the namespaces
>>> need to be clearly separate.
>> They *are* separate namespaces; that's guaranteed by the implementation.
>
> In the implementation, yes, but not in the head of a typical user:
> the 'utf8b' looks more like a codec name than an error handler
> name.
>
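The separation in the implementation can be checked with the two lookup
functions in the codecs module. This is only a sketch; the helpers
is_codec and is_error_handler are hypothetical names introduced here, and
the check says nothing about how a user perceives the names.

```python
import codecs

def is_codec(name):
    # codecs.lookup searches the codec registry only.
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False

def is_error_handler(name):
    # codecs.lookup_error searches the error handler registry only.
    try:
        codecs.lookup_error(name)
        return True
    except LookupError:
        return False

for name in ("utf-8", "strict"):
    print(name, is_codec(name), is_error_handler(name))
```

A codec name is not found among the error handlers and vice versa, so a
clash is impossible technically; the remaining question is purely one of
readability.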
Judging by the existing names, I think that 'surrogate' would be
reasonable. It already contains the meaning of substitute, it's not too
long, and the codes which act as replacements are already called
surrogates.
> I want to avoid any such confusion with Python codecs and don't
> understand why you are making a problem out of this.
>