PEP: Defining Python Source Code Encodings

M.-A. Lemburg mal at
Thu Jul 19 05:14:35 EDT 2001

Roman Suzi wrote:
> On Wed, 18 Jul 2001, M.-A. Lemburg wrote:
> >    This PEP proposes to introduce a syntax to declare the encoding of
> >    a Python source file. The encoding information is then used by the
> >    Python parser to interpret the file using the given encoding. Most
> >    notably this enhances the interpretation of Unicode literals in
> >    the source code and makes it possible to write Unicode literals
> >    using e.g. UTF-8 directly in a Unicode-aware editor.
> I have not understood: will Unicode encoding be directly allowed or only
> ASCII-compatible encodings (like UTF-8)?

Only ASCII supersets -- detecting UTF-16 or UTF-32 would be hard since
Python's source files do not provide a usable file magic (at least
not to my knowledge) and prepending BOMs or similar identifiers
will likely break executability of Python scripts on Unix (due to the
#! logic).
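
To illustrate the #! concern, here is a minimal sketch. It assumes the
usual Unix behaviour of looking for the two bytes "#!" at the very start
of an executable file; kernel_sees_shebang() is a hypothetical helper
written for this example and is not part of any library.

```python
import codecs

SHEBANG = b"#!"


def kernel_sees_shebang(data: bytes) -> bool:
    """Return True if the raw file content starts with the bytes '#!'."""
    return data.startswith(SHEBANG)


script = "#!/usr/bin/env python\nprint('hello')\n"

# The same script stored three ways.
plain_ascii = script.encode("ascii")
utf16_with_bom = codecs.BOM_UTF16_LE + script.encode("utf-16-le")
utf8_with_bom = codecs.BOM_UTF8 + script.encode("utf-8")

print(kernel_sees_shebang(plain_ascii))     # True
print(kernel_sees_shebang(utf16_with_bom))  # False
print(kernel_sees_shebang(utf8_with_bom))   # False
```

Any identifier placed in front of the #! line hides it, whichever
encoding the identifier belongs to.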
> >Problem
> >
> >    In Python 2.1, Unicode literals can only be written using the
> >    Latin-1 based encoding "unicode-escape". This makes the
> >    programming environment rather unfriendly to Python users who live
> >    and work in non-Latin-1 locales such as many of the Asian
> >    countries. Programmers can write their 8-bit strings using their
> >    favourite encoding, but are bound to the "unicode-escape" encoding
> >    for Unicode literals.
> Isn't it time for better gettext support in Python? Then for i18n-enabled
> programs, encodings will belong to .mo, .po or whatever called files...

We already have good gettext support in Python (see the Tools/i18n
directory). AFAIK, gettext uses UTF-8 as the only encoding for its
l10n files.
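
Coming back to the Problem section quoted above, a short sketch of what
"Latin-1 based" means for the "unicode-escape" codec. It is written
against a current Python 3 interpreter rather than Python 2.1, so take it
as an illustration of the codec's behaviour only; the variable names are
made up for the example.

```python
# The character U+00E9 as one Latin-1 byte and as two UTF-8 bytes.
latin1_bytes = "\u00e9".encode("latin-1")  # b'\xe9'
utf8_bytes = "\u00e9".encode("utf-8")      # b'\xc3\xa9'

# "unicode-escape" maps every non-escape byte to the code point with the
# same number, which is exactly what Latin-1 does.
from_latin1 = latin1_bytes.decode("unicode-escape")
from_utf8 = utf8_bytes.decode("unicode-escape")

# Anything outside Latin-1 has to be spelled as an escape sequence.
euro = b"\\u20ac".decode("unicode-escape")

print(from_latin1 == "\u00e9")     # True: Latin-1 input round-trips
print(from_utf8 == "\u00c3\u00a9") # True: UTF-8 input turns into two characters
print(euro == "\u20ac")            # True
```

So text typed as UTF-8 in an editor does not survive as a Unicode literal
unless every non-Latin-1 character is rewritten as an escape.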
> >Scope
> >
> >    This PEP only affects Python source code which makes use of the
> >    proposed magic comment. Without the magic comment in the proposed
> >    position, Python will treat the source file as it does currently
> >    to maintain backwards compatibility.
> This is what I like most. Will optimization be mentioned here?

If the performance hit is noticeable, we could look into optimizing
the setup for files which don't use the magic comment. I'd rather
leave this until after the implementation.
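
A rough sketch of the scoping rule, to make the two code paths concrete.
This message does not spell out the syntax of the magic comment, so the
sketch assumes a "# -*- coding: <name> -*-" comment on the first or
second line. The names MAGIC, declared_encoding() and decode_source() are
hypothetical, and "latin-1" merely stands in for "whatever Python does
currently".

```python
import re

# Assumed comment form: "# -*- coding: <name> -*-" on line 1 or 2.
MAGIC = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def declared_encoding(source: bytes):
    """Return the declared encoding name, or None without a magic comment."""
    for line in source.splitlines()[:2]:
        match = MAGIC.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


def decode_source(source: bytes, legacy_encoding: str = "latin-1") -> str:
    """Decode source text, changing behaviour only for files that opt in."""
    encoding = declared_encoding(source)
    if encoding is None:
        # No magic comment: stay on the old path (stand-in: Latin-1).
        return source.decode(legacy_encoding)
    return source.decode(encoding)


with_comment = b"#!/usr/bin/env python\n# -*- coding: utf-8 -*-\ns = '\xc3\xa9'\n"
without_comment = b"s = '\xe9'\n"

print(declared_encoding(with_comment))     # utf-8
print(declared_encoding(without_comment))  # None
```

Only the first two lines are inspected, so files without the comment pay
for little more than that check before taking the old path.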

Marc-Andre Lemburg
CEO Software GmbH
Consulting & Company:                 
Python Software:              
