Python's 8-bit cleanness deprecated?

John Roth johnroth at ameritech.net
Sat Feb 8 20:28:01 EST 2003


"Kirill Simonov" <kirill_simonov at mail.ru> wrote in message
news:mailman.1044655519.19088.python-list at python.org...
* John Roth <johnroth at ameritech.net>:
>
> After thinking about this for a few days, it suddenly occurred to me
> that there may be a very obscure method in this madness. That is, by
> restricting python source to 7-bit ascii unless otherwise declared,
> it opens the way to migrate to UTF-8 input. This, in turn, would
> solve most of the character set problems in one fell swoop.
>

Why do you think that UTF-8 is a panacea?

For example, my little script

    print "Привет!"

will become

    print u"Привет!".encode('koi8-r')

if I am forced to use UTF-8 for my source code. I don't see any
advantage here.
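
To make the byte-level difference concrete, here is a minimal sketch in present-day Python 3 (not the Python 2.x syntax used in this thread). It assumes the string literal is the Russian greeting "Привет!" and only compares the bytes that the two encodings produce; the variable names are illustrative.

```python
# Python 3 sketch; the thread itself is about Python 2.x.
# Assumption: the literal in the example is the Russian word for "Hello!".
text = "Привет!"

koi8_bytes = text.encode("koi8-r")  # one byte per Cyrillic letter
utf8_bytes = text.encode("utf-8")   # two bytes per Cyrillic letter

# The same text yields different byte sequences, so a program that writes
# to a KOI8-R terminal has to pick the output encoding somewhere.
print(koi8_bytes.hex())
print(utf8_bytes.hex())
print(len(koi8_bytes), len(utf8_bytes))
```

A KOI8-R terminal expects the first byte sequence, which is why the explicit encode step appears in the second version of the script.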

[REPLY Starts here]

I thought I already replied to this, but the reply doesn't seem
to have appeared on my news server!

It won't make any difference to your example whatsoever!
UTF-8 is capable of expressing any character in the Unicode
set, so your example would work as coded just fine without
any additional declarations.
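
As a rough illustration of the "any character" claim, the following Python 3 sketch (again, not the 2.x of this thread) round-trips a few sample strings through UTF-8 and checks which of them fit in KOI8-R. The sample strings and the helper name `fits_koi8r` are made up for this example.

```python
# Python 3 sketch; sample strings and helper name are illustrative only.
samples = ["Привет!", "héllo", "日本語"]

# Each sample survives an encode/decode round trip through UTF-8.
for s in samples:
    assert s.encode("utf-8").decode("utf-8") == s

def fits_koi8r(s):
    """Return True if every character of s has a KOI8-R encoding."""
    try:
        s.encode("koi8-r")
        return True
    except UnicodeEncodeError:
        return False

# Only the Cyrillic sample is representable in KOI8-R.
results = [fits_koi8r(s) for s in samples]
print(results)
```

This only shows what each encoding can represent; it says nothing about which encoding a given terminal expects on output.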

What it would do is let us remove the entire Unicode
infrastructure added in 2.0, because that infrastructure
would become unnecessary.

John Roth





