[Python-Dev] Re: Regression in unicodestr.encode()?

François Pinard pinard@iro.umontreal.ca
10 Apr 2002 10:04:11 -0400


[jepler@unpythonic.dhs.org]

> Why Python refuses to do it this way:
>     for security reasons, the UTF-8 codec gives you an "illegal encoding"
>     error in this case.

> [...] I'm terribly glad that Python has gotten this detail right.

I'm also glad that Python did it right, not because of security reasons
(these are debatable -- the trend these days is to see security holes
everywhere), but for better conformance with the Unicode specifications.

Python being 8-bit clean, this is less of a problem for it than for languages
that rely heavily on NUL-terminated C strings.  I hope that Python will stick
to its current UTF-8 behaviour, even if C extension writers apply
some pressure for a change.
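
To make the point concrete, here is a minimal sketch.  It assumes that the
case under discussion is the overlong two-byte form 0xC0 0x80 of NUL, and it
is written for a current Python 3 interpreter rather than the one this thread
was about, so the exact error text may differ.  The helper decodes_as_utf8
is a hypothetical name used only for illustration.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the strict UTF-8 codec accepts `data`."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# U+0000 encodes to a single zero byte, not to a two-byte form.
nul_encoded = "\x00".encode("utf-8")

# Assumed case: the overlong two-byte spelling of NUL, which the
# strict codec refuses to decode.
overlong_ok = decodes_as_utf8(b"\xc0\x80")

# Being 8-bit clean, a Python string is not truncated by an embedded NUL.
embedded_len = len("a\x00b")

print(nul_encoded, overlong_ok, embedded_len)
```

A C extension that wants a NUL-free byte string would have to produce that
variant itself, instead of asking the standard codec to bend.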

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard