[Python-Dev] PEP 383 update: utf8b is now the error handler
Terry Reedy
tjreedy at udel.edu
Thu May 7 00:03:57 CEST 2009
Martin v. Löwis wrote:
> Because utf8b (or, perhaps "UTF-8b") is the official name for this
> algorithm:
> http://hyperreal.org/~est/utf-8b/
Thank you for the link. It starts:
"This directory contains a C implementation of a UTF-8b codec.
A Python codec based on it is provided as well."
'UTF-8b' consists, obviously, of 'UTF-8' plus 'b', with the 'b'
signifying a variation of or addition to UTF-8. The 'b', and only the
'b', refers to the innovative error-handler that was added to the
existing 'UTF-8' codec/algorithm. The name of the combined whole is not
the name of the part.
If you were incorporating the Python-wrapped utf-8b *codec* as a codec,
which is what I once thought *because you used that name*, then calling
it 'utf-8b' would be fine. But you apparently instead proposed and
implemented an *error-handler*, which seems to me to be something else,
and which will not be specific to utf-8 but usable with any codec.
Hence some of us think it should have a different name.
I gather that you lifted the error-handler part of the algorithm and
propose to use it with *any* ascii-respecting codec. I could claim that
the 'official name' of that part is 'b', but I think we can find a
better name.
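To make the codec/error-handler distinction concrete, here is a minimal
sketch. It assumes the handler is registered under the name
'surrogateescape', which is the name it carries in Python 3.1 and later;
that name is an assumption relative to this thread, where the naming was
still open. The helper roundtrip() is hypothetical, for illustration only.
The same handler is passed to two different codecs.

```python
# Assumption: the PEP 383 handler is available as 'surrogateescape'
# (true for Python 3.1+); it is an argument to the codec, not a codec.
HANDLER = "surrogateescape"


def roundtrip(raw: bytes, codec: str) -> str:
    """Decode raw with the given codec, escaping undecodable bytes,
    and check that re-encoding with the same handler restores raw."""
    text = raw.decode(codec, errors=HANDLER)
    assert text.encode(codec, errors=HANDLER) == raw
    return text


raw = b"abc\xff"  # 0xff is invalid in both UTF-8 and ASCII

# One handler, two codecs: the undecodable byte becomes a lone surrogate.
utf8_text = roundtrip(raw, "utf-8")
ascii_text = roundtrip(raw, "ascii")
```

In both calls the codec name and the handler name are separate
arguments, which is the distinction argued for above.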
Terry Jan Reedy