Python's 8-bit cleanness deprecated?

Roman Suzi rnd at
Tue Feb 4 21:07:06 CET 2003

On Tue, 4 Feb 2003, Just wrote:

>In article <mailman.1044380830.22886.python-list at>,
> Jeff Epler <jepler at> wrote:
>> On Tue, Feb 04, 2003 at 01:36:04PM +0100, Just wrote:
>> > Here's a possible compromise (which I'm not sure is implementable at 
>> > all): Python could only issue warnings if 8-bit chars are used in string 
>> > literals, and not if they only occur in comments.
>> What makes you believe that Python can tell what is a comment and what
>> is a string without knowing the encoding?
>This is not about knowing the encoding but about warning when an 
>encoding _should_ have been specified. Since whatever the encoding is, 
>it must be a superset of ASCII I don't see why my suggestion wouldn't 
>work (bar implementation limitations). That's not to say I'm completely 
>convinced of the idea myself.
>I don't see your point: my suggestion is about reducing the warning 
>irritation for people using 8-bit encodings in comments of code that 
>works *now* (in Python <= 2.2), not about bizarre things you _could_ do 
>with perverse encoding directives in 2.3.
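
To make the quoted suggestion concrete, here is a minimal sketch of how such
a check might look. It is not how Python 2.3 does it. It assumes the source
is in a single-byte superset of ASCII (as KOI8-R and CP866 are), so the bytes
can be decoded as Latin-1 just to find token boundaries. The names
non_ascii_tokens and should_warn are hypothetical. A multibyte encoding whose
trail bytes fall into the ASCII range (Shift-JIS, for example) would break
this assumption.

```python
import io
import tokenize


def non_ascii_tokens(source_bytes):
    """Return (line, kind) pairs for comments and string literals
    that contain bytes >= 0x80."""
    # Assumption: single-byte ASCII superset. Latin-1 maps every byte to
    # one character, so decoding never fails and quotes, '#' and newlines
    # stay where they are.
    text = source_bytes.decode("latin-1")
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(text).readline):
        if tok.type not in (tokenize.COMMENT, tokenize.STRING):
            continue
        if any(ord(ch) > 127 for ch in tok.string):
            kind = "comment" if tok.type == tokenize.COMMENT else "string"
            found.append((tok.start[0], kind))
    return found


def should_warn(source_bytes):
    """Hypothetical policy: warn only when a string literal is affected."""
    return any(kind == "string" for _, kind in non_ascii_tokens(source_bytes))
```

With this policy, a file whose only 8-bit characters sit in comments would
stay silent, while a file with an 8-bit character inside a string literal
would be flagged.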

Well, an obligatory -*- line will cause OS vendors who serve non-ASCII users
(at least Linux distros) to disable warnings in their packages. And I am
afraid that it will be done for all warnings, not just encoding ones!
Packagers understand that a well-cyrillized (for example) Linux should not
warn about encodings at every turn. They already tweak everything from LaTeX
to Emacs to make it usable by people who write in Cyrillic. So the Python
developers' decision to irritate users with encoding warnings will be
answered with packagers' tweaks. Nobody wants to be blamed for ever-growing
error logs on a web server, or for exposing users to extra warnings from some
package which is old but still usable.

Having said this, I agree that only by forcing -*- can we achieve the
discipline of declaring the encoding in every program (I already do that in
my scripts, because I use two Cyrillic encodings out of five ;-) and it's
convenient to have Emacs automagically understand me).
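
For readers who have not seen the -*- line, here is a minimal sketch of what
is being discussed. The declaration is a comment such as
"# -*- coding: koi8-r -*-" on the first or second line of the file. The
function name declared_encoding is hypothetical, and the check is simplified:
it uses the pattern given in PEP 263 but does not reproduce every rule the
interpreter applies.

```python
import re

# Pattern given in PEP 263 for a declaration on line 1 or 2 of the file.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_bytes):
    """Return the declared encoding name, or None if nothing is declared."""
    # Only the first two lines are inspected.
    for line in source_bytes.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


script = b"#!/usr/bin/python\n# -*- coding: koi8-r -*-\nprint 1\n"
print(declared_encoding(script))        # koi8-r
print(declared_encoding(b"print 1\n"))  # None
```

The same comment is what Emacs reads to choose the buffer's coding system,
which is why one line can serve both tools.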

The problem we are discussing is not a technical one; it's about sociology
and people's perception.

Another trouble I have with this new feature is that I can never tell whether
my program will run or not. Even when working with the 'recode' program I
need the -f option from time to time to make it recode in spite of some stray
char which does not (in recode's opinion) belong to a certain encoding.

Now I will have the same doubts with every Python program. Will it run if I
insert this backtick? What about the pseudographics I use in KOI8-R when they
really come from CP866? What if the standard for an encoding changes and
there is a new "Asio" currency instead of the "Euro"? Will my program still
run? Etc.
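
The worry can be illustrated with a small sketch. It uses the codecs shipped
with current Python and is not a statement about what 2.3 will do with source
files. The helper decode_strict is hypothetical. The point is that the same
byte is a different character under each encoding, and that a strict decoder
refuses bytes it does not recognise.

```python
def decode_strict(data, encoding):
    """Decode data, or return None when a byte is not valid in encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


raw = b"\xc1"  # one byte with the high bit set

# A Cyrillic letter in KOI8-R, a box-drawing character in CP866.
koi8r_char = decode_strict(raw, "koi8-r")
cp866_char = decode_strict(raw, "cp866")
print(koi8r_char != cp866_char)  # True

# Strict ASCII rejects the byte outright.
ascii_char = decode_strict(raw, "ascii")
print(ascii_char is None)  # True

# A lenient mode substitutes a replacement character instead of failing.
lenient = raw.decode("ascii", errors="replace")
```

Whether a program "runs" then depends on which of these modes the reader of
the file applies, and that is exactly the uncertainty described above.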

That is why I am asking for unconditional raw 8-bit cleanness of Python 
without any -*- things...

Sincerely yours, Roman Suzi
rnd at =\= My AI powered by Linux RedHat 7.3
