[Python-3000] Help on text editors

Marcin 'Qrczak' Kowalczyk qrczak at knm.org.pl
Sat Sep 9 17:43:13 CEST 2006


David Hopwood <david.nospam.hopwood at blueyonder.co.uk> writes:

>> Right, except BOMs break tons of Unix applications (and even
>> occasional Windows ones) which do not expect them.
>
> This problem is overstated. A BOM anywhere in a text causes no
> problem with display, and *should* be treated as an ignorable
> character for searching, etc.

It is not ignorable in most file formats, and the reading functions
of most programming languages do not ignore it automatically.
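
A minimal sketch of the second point, assuming present-day Python 3
and its standard codecs (the sample shebang line is made up for
illustration): the plain "utf-8" codec keeps the BOM as U+FEFF, and
it is stripped only when the caller explicitly asks for "utf-8-sig".

```python
import codecs

# Bytes of a hypothetical UTF-8 file that starts with a BOM (EF BB BF).
data = codecs.BOM_UTF8 + "#!/bin/sh\n".encode("utf-8")

# The plain "utf-8" codec keeps the BOM as U+FEFF in the decoded text.
plain = data.decode("utf-8")

# The "utf-8-sig" codec drops one leading BOM, but only on request.
stripped = data.decode("utf-8-sig")

# A consumer that expects the text to begin with "#!" sees the
# difference immediately.
plain_has_shebang = plain.startswith("#!")
stripped_has_shebang = stripped.startswith("#!")

print(repr(plain[0]), plain_has_shebang, stripped_has_shebang)
```

Any program that compares the first bytes or characters of its input
against a fixed marker is affected in the same way unless it handles
the BOM itself.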

> Note that there are plenty of other characters that should be
> treated as ignorable, so the applications that are broken for BOMs
> are broken more generally.

I disagree. A UTF-8 BOM should not be used on Unix. It is not a
reliable method of encoding detection in general (it works only for
Unicode encodings), and it breaks the simplicity of text streams.
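
A rough sketch of both problems, assuming present-day Python 3 (the
helper with_bom and the sample strings are hypothetical): joining two
BOM-prefixed streams, as "cat a b" would, leaves a BOM in the middle
of the result, and a legacy-encoded file carries no BOM at all, so
the absence of one says nothing about the encoding.

```python
import codecs

def with_bom(text):
    """Encode text as UTF-8 with a leading BOM (hypothetical helper)."""
    return codecs.BOM_UTF8 + text.encode("utf-8")

part_a = with_bom("first\n")
part_b = with_bom("second\n")

# Equivalent to `cat a b`: the byte streams are simply joined.
joined = part_a + part_b

# Even a BOM-aware decoder strips only the leading BOM; the second
# one survives as U+FEFF at the start of the second line.
decoded = joined.decode("utf-8-sig")
lines = decoded.split("\n")

# Detection only covers Unicode encodings: neither a Latin-1 file nor
# a BOM-less UTF-8 file starts with the BOM bytes.
latin1_bytes = "caf\u00e9\n".encode("latin-1")
utf8_bytes = "caf\u00e9\n".encode("utf-8")
latin1_has_bom = latin1_bytes.startswith(codecs.BOM_UTF8)
utf8_has_bom = utf8_bytes.startswith(codecs.BOM_UTF8)

print(lines, latin1_has_bom, utf8_has_bom)
```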

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

