[Python-Dev] XML codec?

Walter Dörwald walter at livinglogic.de
Fri Nov 9 14:41:37 CET 2007


Walter Dörwald wrote:
> Martin v. Löwis wrote:
>>>> Yes, an XML parser should be able to use UTF-8, UTF-16, UTF-32, etc
>>>> codecs to do the encoding.  There's no need to create a magical
>>>> mystery codec to pick out which though.
>>> So the code is good, if it is inside an XML parser, and it's bad if it
>>> is inside a codec?
>> Exactly so. This functionality just *isn't* a codec - there is no
>> encoding. Instead, it is an algorithm for *detecting* an encoding.
> 
> And what do you do once you've detected the encoding? You decode the
> input, so why not combine both into an XML decoder?

In fact, we already have such a codec. The utf-16 decoder looks at the
first two bytes (the byte order mark) and then forwards the rest to
either a utf-16-be or a utf-16-le decoder.
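
As a rough illustration of that dispatch, here is a minimal sketch. The
helper name decode_utf16_by_bom is hypothetical, and the fallback to the
native byte order when no byte order mark is present is an assumption
made for this sketch, not a statement about how the real codec is
implemented:

```python
import codecs
import sys


def decode_utf16_by_bom(data: bytes) -> str:
    """Hypothetical helper: detect the byte order, then delegate decoding."""
    # Little-endian BOM (FF FE): strip it and hand the rest to utf-16-le.
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    # Big-endian BOM (FE FF): strip it and hand the rest to utf-16-be.
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    # Assumption for this sketch: without a BOM, use the native byte order.
    native = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return data.decode(native)


# The same text, encoded in both byte orders with a leading BOM.
le_bytes = codecs.BOM_UTF16_LE + "<?xml".encode("utf-16-le")
be_bytes = codecs.BOM_UTF16_BE + "<?xml".encode("utf-16-be")

print(decode_utf16_by_bom(le_bytes))
print(decode_utf16_by_bom(be_bytes))
# The built-in "utf-16" codec accepts either input unchanged.
print(le_bytes.decode("utf-16"), be_bytes.decode("utf-16"))
```

The detection step and the decoding step sit in one function here, which
is the same combination an XML decoder would perform, only with a longer
detection step.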

Servus,
   Walter

