[I18n-sig] Strawman Proposal (2): Encoding attributes

Paul Prescod paulp@ActiveState.com
Fri, 09 Feb 2001 12:04:23 -0800


"M.-A. Lemburg" wrote:
> 
> ...
> 
> The parser has no idea of what to do with Unicode input...
> this would mean that we would have to make it Unicode
> aware and this opens a new can of worms; not only in the case
> where this encoding specifier is used.

Obviously the parser cannot be made Unicode-aware for Python 2.1, but why
not for Python 2.2? What's so difficult about it? It isn't rocket
science.

Also, if we wanted a quick hack, couldn't we implement it at first by
"decoding" to UTF-8? Then the parser could look for UTF-8 in Unicode
string literals and translate those into real Unicode.
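
To make the quick hack concrete, here is a rough sketch of the two steps,
written in present-day Python purely for illustration. The function names,
the Latin-1 sample line, and the quote-scanning "parser" are all my
assumptions; this is not how the real tokenizer works, only the shape of
the idea.

```python
def transcode_source_to_utf8(raw: bytes, declared_encoding: str) -> bytes:
    """Step 1 (hypothetical helper): re-encode the source as UTF-8
    before a byte-oriented parser ever sees it."""
    return raw.decode(declared_encoding).encode("utf-8")


def unicode_literal_value(literal_bytes: bytes) -> str:
    """Step 2 (hypothetical helper): the parser treats the bytes inside
    a Unicode string literal as UTF-8 and turns them into real Unicode."""
    return literal_bytes.decode("utf-8")


# A source line stored as Latin-1 that contains a Unicode literal.
source_latin1 = 'name = u"caf\u00e9"'.encode("latin-1")
source_utf8 = transcode_source_to_utf8(source_latin1, "latin-1")

# Stand-in for the byte-oriented parser: grab the bytes between the quotes.
start = source_utf8.index(b'"') + 1
end = source_utf8.rindex(b'"')
literal = unicode_literal_value(source_utf8[start:end])
```

The parser itself never handles anything but bytes here; only the code
that builds Unicode literals needs to know the bytes are UTF-8.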

> Also, string literals ("text") would have to translate the
> Unicode input passed to the parser back to ASCII (or whatever
> the default encoding is) and this would break code which currently
> uses strings for data or some specific text encoding.

It would only break code that adds the encoding declaration. If you
don't add the declaration you don't break any code!

Plus, we all agree that passing binary data in literal strings should
eventually be deprecated. That's why we're inventing binary
strings.

> ...
> Sorry, Paul, but this will never happen. Python is an ASCII
> programming language and does good at it.

I am amazed to hear you say that. Why SHOULDN'T we allow Chinese
variable names some day? This is the 21st century. If we don't go after
Asian markets someone else will! I've gotta admit that that kind of
Euro-centric attitude sort of annoys me...

 Paul Prescod