[Python-3000] Help on text editors

Michael Urman murman at gmail.com
Sat Sep 9 06:32:10 CEST 2006


On 9/7/06, David Hopwood <david.nospam.hopwood at blueyonder.co.uk> wrote:
> Yes. However, this is not a good idea for precisely the reason described
> on that page (false detection of Unicode), and so any Unicode detection
> algorithm in Python should only be based on detecting a BOM, IMHO.

Right, except that BOMs break tons of Unix applications (and even the
occasional Windows one) that do not expect them. That leaves Python
nearly unable to detect Unicode on Unix, which is quite unfortunate
for those of us rooting for UTF-8. Perhaps there are better
heuristics worth considering; perhaps not. Either way, heuristic
detection certainly shouldn't be the default behaviour of a TextFile
constructor.
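
To make the trade-off concrete, here is a rough sketch of the two
approaches, written against the standard codecs module. The names
sniff_bom and looks_like_utf8 are made up for illustration and are not
a proposed API. The second function is only one possible heuristic
(strict UTF-8 validity plus at least one non-ASCII byte); it can still
guess wrong on other encodings and on samples cut off mid-character.

```python
import codecs

# Check the 4-byte BOMs first: the UTF-32-LE BOM starts with the same
# two bytes as the UTF-16-LE BOM, so order matters here.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return (encoding, bom_length), or (None, 0) if data has no BOM.

    Hypothetical helper: detection is based on the BOM alone.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


def looks_like_utf8(data):
    """Heuristic guess, not proof: valid strict UTF-8 with non-ASCII bytes.

    Pure ASCII returns False because it tells us nothing either way.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return any(byte >= 0x80 for byte in data)
```

A caller that wants the conservative behaviour would use only
sniff_bom and fall back to an explicitly supplied encoding; the
heuristic would have to be something the caller opts into.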

Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog

