PEP 3131: Supporting Non-ASCII Identifiers

Thu May 17 09:17:51 EDT 2007

> However, what I want to see is how people deal with such issues when
> sharing their code: what are their experiences and what measures do
> they mandate to make it all work properly? You can see some
> discussions about various IDEs mandating UTF-8 as the default
> encoding, along with UTF-8 being the required encoding for various
> kinds of special Java configuration files. 

I believe the problem is solved when everybody uses Eclipse.
You can set a default encoding for all Java source files in a project,
and you check the project file into your source repository.
Eclipse both provides the editor and drives the compiler, and
does so in a consistent way.

> Yes, it should reduce confusion at a technical level. But what about
> the tools, the editors, and so on? If every computing environment had
> decent UTF-8 support, wouldn't it be easier to say that everything has
> to be in UTF-8? 

For both Python and Java, there is too much historical baggage already.
When source encodings were introduced to Python, allowing only UTF-8
was already proposed. People rejected it at the time, because
a) they had source files that weren't encoded in UTF-8, and
   were afraid of breaking them, and
b) their editors did not support UTF-8.

So even with Python 3, UTF-8 is *just* the default encoding.
I would hope that all Python IDEs, over time, learn about this
default; until then, users may have to manually configure their
IDEs and editors. With a default of UTF-8, it's still simpler than
with PEP 263: you can say that .py files are UTF-8, and your
editor will guess incorrectly only if the file carries an encoding
declaration naming something other than UTF-8.
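
To illustrate the guessing rule, here is a minimal sketch of what an
editor could do, assuming it has access to the standard library's
tokenize.detect_encoding (available in Python 3). The helper name
guess_source_encoding is made up for this example; it is not part of
any PEP or IDE.

```python
import io
import tokenize


def guess_source_encoding(source_bytes: bytes) -> str:
    """Return the encoding a Python 3 source file should be read with.

    Hypothetical helper for illustration: it delegates to
    tokenize.detect_encoding, which looks for a BOM or a PEP 263
    declaration in the first two lines and otherwise falls back
    to UTF-8.
    """
    readline = io.BytesIO(source_bytes).readline
    encoding, _first_lines = tokenize.detect_encoding(readline)
    return encoding


# No declaration: the UTF-8 default applies.
plain = b"x = 1\n"
# Explicit declaration: the editor must honour it instead of the default.
declared = b"# -*- coding: latin-1 -*-\nx = 1\n"

print(guess_source_encoding(plain))
print(guess_source_encoding(declared))
```

An editor that simply assumes UTF-8 gets the first file right and the
second one wrong, which is the only failure case described above.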

Regards,
Martin