Is there really a default source encoding?
"Martin v. Löwis"
martin at v.loewis.de
Thu Jan 23 09:15:55 CET 2003
Magnus Lie Hetland wrote:
> The 2.3a1 "what's new" document, section 3 says that iso8859-1 is the
> default source encoding, and that the encoding only affects Unicode
> string literals... But if I put non-ASCII iso8859-1 letters in my
> string literals without declaring an encoding, I get a deprecation
Right. The default encoding is Latin-1, for compatibility with previous
releases. However, reliance on the default encoding is deprecated, and
the default encoding will change to ASCII in 2.4. According to the
guidelines for language evolution (PEP 5), this means a warning must be
emitted in 2.3.
> What's the correct behaviour, and where is it documented?
Whose correct behaviour? I believe Python 2.3a1 behaves correctly:
Latin-1 is the deprecated default encoding, see
Notice the "favours Latin-1". Strictly speaking, the default encoding
is relevant mostly to Unicode literals. With the default behaviour,
byte strings appear at runtime as they appear on disk in the source
file, so they could be in nearly any encoding. So you can also use
koi8-r (say) in your source code, and, unless you use Unicode literals,
nothing would break.
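
To illustrate the difference between the two kinds of literal, here is
a minimal sketch. It is written for a current Python 3 interpreter, not
for 2.3, and only simulates what the compiler does; the sample text and
all variable names are made up for the example.

```python
# Hypothetical sample text; any string representable in KOI8-R would do.
text = "привет"

# The bytes as they would sit on disk in a KOI8-R encoded source file.
on_disk = text.encode("koi8-r")

# Byte-string view: nothing is decoded, so the runtime bytes are the file bytes.
byte_string = on_disk

# Unicode-literal view: the bytes must be decoded with *some* encoding.
decoded_as_koi8r = on_disk.decode("koi8-r")    # the encoding the file really uses
decoded_as_latin1 = on_disk.decode("latin-1")  # the deprecated default

print(byte_string == on_disk)     # bytes pass through untouched
print(decoded_as_koi8r == text)   # right encoding recovers the text
print(decoded_as_latin1 == text)  # Latin-1 decodes without error, but to other characters
```

Latin-1 maps every byte to some character, so the wrong guess does not
raise an error; it silently yields different text.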
Your correct behaviour is to properly declare the encoding of your
source files; this is documented at the same place.
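
As a rough sketch of what such a declaration looks like and how it is
picked up: the example below builds a small source file in memory and
asks the standard tokenize module which encoding it declares. It
assumes a current Python 3 interpreter (where the undeclared default is
UTF-8, not Latin-1 or ASCII); the file contents and names are invented
for the example.

```python
import io
import tokenize

# Hypothetical source file: an ASCII declaration line, then a KOI8-R literal.
source_bytes = (
    b"# -*- coding: koi8-r -*-\n"
    + 's = "привет"\n'.encode("koi8-r")
)

# detect_encoding() looks at the first two lines at most for the declaration.
encoding, _lines_read = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)

# With the declared encoding, the whole file decodes to the intended text.
source_text = source_bytes.decode(encoding)

print(encoding)
print(source_text)
```

The declaration has to appear on the first or second line of the file.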