Python's 8-bit cleanness deprecated?
just at xs4all.nl
Tue Feb 4 19:22:17 CET 2003
In article <mailman.1044380830.22886.python-list at python.org>,
Jeff Epler <jepler at unpythonic.net> wrote:
> On Tue, Feb 04, 2003 at 01:36:04PM +0100, Just wrote:
> > Here's a possible compromise (which I'm not sure is implementable at
> > all): Python could only issue warnings if 8-bit chars are used in string
> > literals, and not if they only occur in comments.
> What makes you believe that Python can tell what is a comment and what
> is a string without knowing the encoding?
This is not about knowing the encoding but about warning when an
encoding _should_ have been specified. Since whatever the encoding is,
it must be a superset of ASCII, I don't see why my suggestion wouldn't
work (bar implementation limitations). That's not to say I'm completely
convinced of the idea myself.
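
A minimal sketch of the idea, assuming a modern Python 3 and its standard
tokenize module rather than the 2.3 tokenizer under discussion. The function
name non_ascii_string_literals is hypothetical. Because the encoding is
assumed to be an ASCII superset, the bytes are decoded as Latin-1 only to
keep quotes, '#' and newlines where they are; the real encoding is never
needed.

```python
import io
import tokenize


def non_ascii_string_literals(source_bytes):
    """Return (line, literal) pairs for string literals holding 8-bit bytes.

    Assumption: the unknown source encoding is an ASCII superset, so
    decoding as Latin-1 (one byte -> one code point) leaves every ASCII
    byte, including quotes, '#' and newlines, in its original position.
    """
    text = source_bytes.decode("latin-1")
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(text).readline):
        # Comments are skipped on purpose: only string literals count.
        if tok.type == tokenize.STRING and any(ord(c) > 127 for c in tok.string):
            hits.append((tok.start[0], tok.string))
    return hits


# One 8-bit byte in a comment (ignored) and one in a string literal (reported).
sample = b"# caf\xe9 in a comment\nname = 'caf\xe9'\nplain = 'ascii'\n"
print(non_ascii_string_literals(sample))
```

This only covers encodings where an ASCII byte always means the ASCII
character and high bytes never stand in for quotes or newlines; it does
not attempt to handle an encoding that remaps high bytes to such
characters.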
> I think the only limitation of the source file encoding is that it must
> be an ASCII superset. So for instance I could have a perverse encoding
> where 0x81 decodes to u'\n', and 0x83 is another valid character in the
> 's'. Then this byte string
> actually decodes to
> which means the file contains a string with high-bit-set chars used in
> a string literal.
I don't see your point: my suggestion is about reducing the warning
irritation for people using 8-bit encodings in comments of code that
works *now* (in Python <= 2.2), not about bizarre things you _could_ do
with perverse encoding directives in 2.3.