[Python-Dev] Improve open() to support reading file starting with an unicode BOM

Stephen J. Turnbull stephen at xemacs.org
Fri Jan 8 07:06:16 CET 2010


Guido van Rossum writes:

 > I'm a little hesitant about this. First of all, UTF-8 + BOM is crazy
 > talk.

That doesn't stop many applications from doing it.  Python should
perhaps<wink,nudge> not produce UTF-8 + BOM without a disclaimer of
indemnification against all resulting damage, signed in blood, from
the user for each instance.

But it should do something sane when reading such files.  I can't
really see any harm in throwing the BOM away, especially since use of
ZERO WIDTH NO-BREAK SPACE (U+FEFF) as a joining character has been
deprecated IIRC.
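
For concreteness, here is a minimal sketch of "throw it away on read",
assuming Python 3 and the standard library's 'utf-8-sig' codec.  The
helper name read_text_dropping_bom is made up for illustration; this is
not a proposal for what open() itself should do.

```python
import codecs
import os
import tempfile


def read_text_dropping_bom(path):
    """Return the text of a UTF-8 file, minus one leading BOM if present."""
    # 'utf-8-sig' decodes exactly like UTF-8, except that it skips a
    # single U+FEFF at the very start of the stream.
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


# Demo: write a file the way BOM-happy applications do, then read it back.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(codecs.BOM_UTF8 + "spam\n".encode("utf-8"))

with open(demo_path, encoding="utf-8") as f:
    print(repr(f.read()))                       # plain UTF-8 keeps the BOM: '\ufeffspam\n'
print(repr(read_text_dropping_bom(demo_path)))  # 'spam\n'

os.remove(demo_path)
```

A file with no BOM reads back unchanged, and a U+FEFF anywhere other
than the very start of the file is left alone, so nothing is lost for
files that never had the signature.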






