[Python-Dev] RE: Defining Unicode Literal Encodings

M.-A. Lemburg mal@lemburg.com
Sat, 14 Jul 2001 13:45:10 +0200

Paul Prescod wrote:
> "M.-A. Lemburg" wrote:
> >
> > ....
> >
> > Please don't mix 8-bit strings with Unicode literals: 8-bit
> > strings don't carry any encoding information, so any encoding
> > information provided for them cannot be stored anywhere.
> First, we could store the information if we want.
> Second, whether we choose to store the information or not, the point is
> that the source file should not mix encodings.

I have added a new paragraph to the PEP (see my rev. 1.1 posting)
pointing out that it is the programmer's responsibility to choose
reasonable encodings; in particular, the encodings used should be
compatible so that a text editor can display the data correctly.

> > Comments, OTOH, are part of the program text, so they have to be ASCII
> > just like the Python source itself.
> The Python interpreter allows non-ASCII characters in comments.
> > Hmm, good point, but hard to implement. We'd probably need a two
> > phase decoding for this to work:
> >
> > 1. decode the given Unicode literal encoding
> > 2. decode any Unicode escapes in the Unicode string
> That doesn't sound so hard. :)

True. The issue here is very similar to standard literals
vs. raw ones. Perhaps step 2 should only be imposed on standard
literals while raw ones stop after step 1.
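
To make the two phases concrete, here is a minimal sketch in Python.
It is only an illustration, not what the PEP specifies: the function
name decode_unicode_literal, its parameters and the regular expression
are all hypothetical, and step 2 handles only \uXXXX escapes (it does
not deal with escaped backslashes or the other escape forms).

```python
import re

# Hypothetical simplification: recognise \uXXXX escapes only.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_unicode_literal(body, source_encoding, raw=False):
    """Decode the bytes found between the quotes of a Unicode literal.

    body            -- the literal's content as bytes, as read from the file
    source_encoding -- the encoding declared for Unicode literals
    raw             -- True for a raw literal, which stops after step 1
    """
    # Step 1: decode the bytes using the declared literal encoding.
    text = body.decode(source_encoding)
    if raw:
        return text
    # Step 2: replace each \uXXXX escape with the code point it names.
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The same Latin-1 bytes, treated as a standard and as a raw literal.
body = b"gr\xfc\\u00df"
standard = decode_unicode_literal(body, "latin-1")
raw_form = decode_unicode_literal(body, "latin-1", raw=True)
```

With these inputs the standard literal ends up with both non-ASCII
characters decoded, while the raw one keeps the backslash sequence
as written in the file.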

> > I think that allowing one directive per file is the way to go,
> > but I'm not sure about the exact position. Basically, I think it
> > should go "near" the top, but not necessarily before any doc-string
> > in the file.
> If Guido is violently opposed to having it before the docstring then we
> could allow it either before or after the docstring to give tools time
> to catch up.
> I'm not sure what tools in particular have the problem, though. Any tool
> that uses introspection or inspect.py will be fine.

See my other posting for ways to work around this problem.

Marc-Andre Lemburg
CEO eGenix.com Software GmbH
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/