[Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15
[draft]
Walter Dörwald
walter at livinglogic.de
Mon Apr 18 23:33:58 CEST 2005
Tim Lesher wrote:
> Here's the first draft of the python-dev summary for the first half of April. Please send any corrections or suggestions to
> the summarizers.
> [...]
> ----------------------------------------
> Unicode byte order mark decoding
> ----------------------------------------
>
> Evan Jones saw that the UTF-16 decoder discards the byte-order mark (BOM) from Unicode files, while the UTF-8 decoder
> doesn't. Although the BOM isn't really required in UTF-8 files, many Unicode-generating applications, especially on Microsoft
> platforms, add it.
>
> Walter Dörwald created a patch_ to add a UTF-8-Sig codec that generates a BOM on writing and skips it on reading, but after a
> long discussion on the history of Unicode and Microsoft's influence over its
> evolution, the consensus was that BOM and signature handling belong at a higher level (for example, a stream API) than the
> codec.
All codecs provide a stream API, so there is no higher level.
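
To illustrate the point, here is a minimal sketch of the behaviour discussed above. It assumes a modern Python 3 interpreter and the codec name "utf-8-sig" under which the codec was later shipped in the standard library; the sample string and variable names are made up for the example.

```python
import codecs
import io

# Sample input: a UTF-8 file that starts with a BOM, as many Windows tools write it.
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# The plain UTF-8 decoder keeps the BOM as a leading U+FEFF character.
plain = data.decode("utf-8")

# The UTF-16 decoder consumes the BOM and does not return it.
utf16 = "hello".encode("utf-16").decode("utf-16")

# The signature-aware codec skips a leading BOM when decoding.
sig = data.decode("utf-8-sig")

# Every codec also has a stream interface, reachable through
# codecs.getreader() and codecs.getwriter().
reader = codecs.getreader("utf-8-sig")(io.BytesIO(data))
streamed = reader.read()

# On writing, the stream writer emits the BOM once, before the first chunk.
out = io.BytesIO()
writer = codecs.getwriter("utf-8-sig")(out)
writer.write("hel")
writer.write("lo")
written = out.getvalue()
```

The stream reader and writer here come from the codec itself, which is why handling the signature in the codec already covers the stream case.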
Bye,
Walter Dörwald