Try this

Carl Banks pavlovevidence at gmail.com
Mon Sep 17 04:27:51 EDT 2007


On Sun, 16 Sep 2007 17:58:09 -0700, mensanator at aol.com wrote:
> The very presence of an algorithm to detect encoding is a bug. Files
> with the .txt extension should always be treated as ANSI even if they
> contain binary data. Notepad should never be allowed to try to decide
> what the encoding is if the open dialog has the encoding set to
> ANSI.


I'm sure, then, you'll be happy to know that Python 3 will use similar 
(or maybe not similar) heuristics to determine the encoding of text 
files.  At least that was the case last time I checked.


FWIW, I'm not a big fan of heuristics, either (and this behavior would
seriously irk me on Linux, where, unlike in Windows, there is an
occasional need to create files containing only a small ASCII string).
But sometimes heuristics are necessary.  There are too many non-ASCII text
files floating around out there for Microsoft to do nothing; they have
clients in many countries they have to please.
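
To make "heuristics" concrete, here is a minimal sketch of the general
idea: look for a byte-order mark, then test whether the bytes are valid
UTF-8, and otherwise fall back to a legacy code page.  This is only an
illustration; it is not the algorithm Notepad or Python uses, and the
name guess_encoding and the cp1252 fallback are my own assumptions.

```python
import codecs

# Byte-order marks, longest first, so that a UTF-32 LE BOM (FF FE 00 00)
# is not mistaken for a UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, fallback="cp1252"):
    """Guess a codec name for raw bytes (hypothetical helper).

    The fallback code page is an assumption; a real tool would more
    likely take it from the user's locale.
    """
    # 1. An explicit BOM is the strongest hint available.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. No BOM: accept UTF-8 if the bytes decode cleanly.
    #    Pure ASCII always passes this test.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # 3. Otherwise assume the legacy 8-bit code page.
        return fallback
    return "utf-8"
```

Even this small version can guess wrong: a legacy-encoded file whose
bytes happen to form valid UTF-8 will be reported as UTF-8, and the
shorter the file, the less evidence there is to go on.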



Carl Banks


