[Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft]

Walter Dörwald walter at livinglogic.de
Mon Apr 18 23:33:58 CEST 2005


Tim Lesher wrote:

> Here's the first draft of the python-dev summary for the first half of April.  Please send any corrections or suggestions to
> the summarizers.
> [...]
> ----------------------------------------
> Unicode byte order mark decoding
> ----------------------------------------
>
> Evan Jones saw that the UTF-16 decoder discards the byte-order mark (BOM) from Unicode files, while the UTF-8 decoder
> doesn't. Although the BOM isn't really required in UTF-8 files, many Unicode-generating applications, especially on Microsoft
> platforms, add it.
>
> Walter Dörwald created a patch_ to add a UTF-8-Sig codec that generates a BOM on writing and skips it on reading, but after a
> long discussion of the history of Unicode and Microsoft's influence over its
> evolution, the consensus was that BOM and signature handling belong at a higher level (for example, a stream API) than the
> codec.

All codecs provide a stream API, so there is no higher level.
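
A minimal sketch of the behaviour under discussion, assuming a Python version in which the utf-8-sig codec from the patch is available (it ships with current Python 3). The sample bytes and variable names are illustrative only and do not come from the patch itself:

```python
import codecs
import io

# Illustrative data: a UTF-8 signature (BOM) followed by ASCII text.
raw = codecs.BOM_UTF8 + b"hello"

# The plain UTF-8 decoder keeps the BOM as a leading U+FEFF character.
plain = raw.decode("utf-8")

# The UTF-16 decoder consumes the BOM to determine the byte order.
utf16 = (codecs.BOM_UTF16_LE + "hello".encode("utf-16-le")).decode("utf-16")

# Each codec is reachable through the stream API via the codec registry:
# getreader/getwriter return its StreamReader/StreamWriter classes.
reader = codecs.getreader("utf-8-sig")(io.BytesIO(raw))
from_stream = reader.read()      # signature skipped on reading

sink = io.BytesIO()
writer = codecs.getwriter("utf-8-sig")(sink)
writer.write("hello")            # signature emitted on the first write
written = sink.getvalue()
```

Here the signature handling is done by the codec's own stream reader and writer, which is the layer the summary calls "higher level".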

Bye,
   Walter Dörwald





