[Python-Dev] Re: Regression in unicodestr.encode()?

Tim Peters tim.one@comcast.net
Tue, 09 Apr 2002 20:45:02 -0400


> Hm, but isn't there a way to encode a NUL that doesn't produce a NUL?
> In some variant?

UTF-8 has a "no \u0000 in, no NUL out" property by design (it's what makes
UTF-8 uniquely well-suited to processing by crufty old 8-bit C string
library routines, and that was a goal of the encoding scheme).
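
A quick way to see that property for yourself, as a sketch in present-day Python 3 rather than the 2002 CVS build under discussion. The helper name utf8_has_nul and the sample strings are made up for illustration; they are not from the thread:

```python
# Sketch only: utf8_has_nul and the samples are hypothetical, and this
# runs on modern Python 3, not the interpreter the thread is about.

def utf8_has_nul(text: str) -> bool:
    """Return True if the UTF-8 encoding of text contains a 0x00 byte."""
    return b"\x00" in text.encode("utf-8")

samples = [
    "hello",                     # plain ASCII
    "\u00e9\u4e2d\U0001f600",    # 2-, 3- and 4-byte UTF-8 sequences
    "a\u0000b",                  # contains U+0000 on the way in
]

for s in samples:
    # Only the sample that contains U+0000 yields a NUL byte.
    print(ascii(s), utf8_has_nul(s))

# For contrast, UTF-16 puts 0x00 bytes into ordinary ASCII text, which
# is why it does not survive C string routines.
print(b"\x00" in "hello".encode("utf-16-le"))
```

Every byte of a multi-byte UTF-8 sequence has its high bit set, so 0x00 can only ever come from U+0000 itself.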

If people are really <wink> wondering whether Barry has discovered an actual
bug, don't:  take his example and decode it back to Unicode.  You won't get
what you started with in current CVS (or at least Barry didn't when I
watched him do it).  That's an easier proof than indirectly wondering about
UTF-8 properties.
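
The round-trip check described above can be sketched like this. Barry's actual example isn't quoted in this message, so the sample string and the helper name round_trips are stand-ins, and the code assumes a modern Python 3 (where the check passes) rather than the CVS build that failed it:

```python
# Sketch only: round_trips and the sample string are hypothetical
# stand-ins; Barry's original example is not reproduced in this message.

def round_trips(text: str, encoding: str = "utf-8") -> bool:
    """Encode text, decode the bytes back, and compare with the input."""
    return text.encode(encoding).decode(encoding) == text

sample = "\u1234\u0000\u5678"    # includes U+0000 on purpose

# A correct codec gives back exactly what went in; a False here would
# point at an encoder or decoder bug, with no reasoning about UTF-8
# properties required.
print(round_trips(sample))
```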