[Python-3000] Help on text editors

Michael Urman murman at gmail.com
Fri Sep 8 03:05:09 CEST 2006

On 9/7/06, Paul Prescod <paul at prescod.net> wrote:
> 1. On US English Windows, Notepad defaults to an encoding called "ANSI".
> What does "ANSI" map to in European and Asian versions of Windows?

On most Western European configurations, the ANSI Code Page (ACP) has
historically been 1252 (CP1252, or WINDOWS-1252 according to iconv). It
may be something different now to support the euro symbol. Japanese
machines tend to use CP932 (or MS932), also known as Shift-JIS (or
close enough). I don't know off the top of my head exactly which ACPs
match other languages.
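
To make the practical consequence concrete, here is a small sketch using
Python's built-in "cp1252" and "cp932" codecs as stand-ins for the two
ACPs named above. The sample strings and variable names are mine, chosen
only for illustration:

```python
# Stand-ins for the two ANSI code pages mentioned above (illustrative names).
WESTERN_ACP = "cp1252"
JAPANESE_ACP = "cp932"

# The same text saved as "ANSI" yields different bytes per code page.
western = "café €".encode(WESTERN_ACP)
japanese = "日本語".encode(JAPANESE_ACP)

# Reading Japanese "ANSI" bytes on a Western machine gives mojibake
# instead of an error, because every one of these bytes is also a valid
# CP1252 character.
mojibake = japanese.decode(WESTERN_ACP, errors="replace")

print(western)
print(japanese)
print(mojibake)
```

So a file that Notepad labels "ANSI" carries no indication of which code
page wrote it; the reader has to know or guess.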

I expect Notepad will default to the ACP encoding whenever it detects
a file as being in that encoding, or when a new file contains only
characters representable in that code page. Otherwise I expect it will
default to "Unicode" (UTF-16 / UCS-2). When editing an existing file,
it will default to the detected encoding, unless "Unicode" is required
to save the changes. It uses BOMs to mark all Unicode encodings, but
doesn't require them to be present in order to detect "Unicode".
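
The behaviour I expect above can be sketched roughly as follows. This is
not Notepad's actual algorithm, only a guess at its shape: the function
names and the "cp1252" fallback are my assumptions, and the BOM-less
detection of "Unicode" is left out entirely.

```python
import codecs

# BOMs checked in order. UTF-32 is ignored in this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, acp="cp1252"):
    """Return a codec name for raw file bytes: BOM first, else the ACP."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return acp


def pick_save_encoding(text, acp="cp1252"):
    """Keep the ACP if it can represent the text, else fall back to UTF-16."""
    try:
        text.encode(acp)
    except UnicodeEncodeError:
        return "utf-16"
    return acp


sample = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
print(guess_encoding(sample))        # BOM present
print(guess_encoding(b"plain"))      # no BOM, assume the ACP
print(pick_save_encoding("日本語"))   # not representable in cp1252
```

The "utf-16" codec consumes the BOM itself when decoding, so the same
name works for both byte orders.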

> 3. In general, how do modern versions of Linux and other Unix handle this
> issue?

I use en_US.UTF-8, after many years of C or en_US.ISO-8859-1. Because
my install is old, this was not the default, but I now use it as
pervasively as possible. These days I set it via GDM; originally I set
it in my shell rc file.
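
For reference, a program can work out which character encoding such a
setting implies by reading the usual POSIX locale variables in
precedence order (LC_ALL, then LC_CTYPE, then LANG). A minimal sketch,
with effective_codeset being a hypothetical helper of mine rather than
anything from the standard library:

```python
import os


def effective_codeset(env):
    """Return the codeset part of the effective locale name, or None.

    Names look like language_TERRITORY.codeset[@modifier]; plain "C" or
    "POSIX" carry no codeset.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:  # unset and empty are both skipped
            break
    else:
        return None
    name = value.split("@", 1)[0]  # drop a modifier such as "@euro"
    if "." not in name:
        return None
    return name.split(".", 1)[1]


print(effective_codeset(os.environ))
```

In real code, locale.getpreferredencoding() is the more robust way to
ask; the sketch only shows where the answer comes from.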

Michael Urman  http://www.tortall.net/mu/blog
