[Python-Dev] PEP 263 considered faulty (for some Japanese)

Tom Emerson tree@basistech.com
Wed, 13 Mar 2002 01:43:06 -0500


Stephen J. Turnbull writes:
>     SUZUKI> Just one worry: [UTF-8 BOM] may be incompatible with
>     SUZUKI> '#!/usr/bin/env' used in Unix.
> 
> It probably is, but it's out of Python's control: the editor will add
> it.  And this can (and will) be handled by changing the shells.

The UTF-8 BOM is an aBOMination that should not be allowed to
live. The only editor that I know of that inserts the sequence is
Microsoft's WordPad (or TextPad, I don't use either). I hope XEmacs
isn't going to do this.

The point of the UniBOM is to determine the byte-order used in a
UTF-(16|32) or UCS-[24] encoded file: one can of course use it as an
indicator for Unicode, but it should not be used as an encoding
identifier. It is merely a hint.
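
To make "merely a hint" concrete, here is a minimal sketch in Python. The
names sniff_bom and _BOMS are hypothetical, invented for this illustration;
only the BOM constants come from the standard codecs module. It reports what
a leading BOM suggests and leaves the actual encoding decision to the caller.

```python
import codecs

# Check the UTF-32 BOMs first: the UTF-32 LE BOM (FF FE 00 00) begins
# with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return (hint, bom_length) for a bytes object.

    hint is a codec name suggested by a leading BOM, or None when the
    data does not start with one. It is a hint only: the caller should
    still prefer an explicit encoding declaration when one exists.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

Note that a file starting with the UTF-8 BOM no longer starts with "#!",
which is exactly the '#!/usr/bin/env' worry quoted above.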

    -tree

-- 
Tom Emerson                                          Basis Technology Corp.
Sr. Computational Linguist                         http://www.basistech.com
  "Beware the lollipop of mediocrity: lick it once and you suck forever"