[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Stephen J. Turnbull
stephen at xemacs.org
Mon Apr 27 19:45:15 CEST 2009
Paul Moore writes:
> 2009/4/27 Stephen J. Turnbull <stephen at xemacs.org>:
> > I believe there are solutions that don't have that problem.
> > Specifically, if the return values were bytes, or (better for 2.x,
> > where bytes are strings as far as most programmers are concerned) as a
> > new data type, to indicate that they're not text until the client
> > acknowledges them as such. EIBTI.
> I think you're ignoring the fact that under Windows, it's the *bytes*
> APIs that are lossy.
The *Windows* bytes APIs may be lossy. Python's bytes, on the other
hand, can represent anything that UTF-16 can; the data is just encoded
as UTF-8. The point is that in Python 3 "bytes" means it's *your*
responsibility, not Python's, to decode that data. The advantage of a
new data type is that Python can provide ways to do it and hide the
internal representation (in theory, it could even be different for the
> Can I at least assume that you aren't recommending that only the bytes
> API exists on Unix, and only the Unicode API on Windows?
I'm agnostic about the underlying APIs used to talk to the OS; people
who actually use that OS should decide that. I'm just recommending
that the return values of the getters not be of a "character string"
type until converted explicitly by the application.
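To make that recommendation concrete, here is a minimal sketch of what
such a return type might look like. The names OSName and listdir_names
are hypothetical, invented for illustration; they are not part of
Python or of PEP 383. The only assumption about the OS layer is that
os.listdir returns bytes when given a bytes path.

```python
import os


class OSName:
    """Hypothetical wrapper: a name from the OS that is not yet text."""

    def __init__(self, raw):
        if not isinstance(raw, bytes):
            raise TypeError("OSName wraps bytes, got %s" % type(raw).__name__)
        self._raw = raw

    def decode(self, encoding="utf-8", errors="strict"):
        # The application acknowledges the data as text here, and chooses
        # both the encoding and the error policy explicitly.
        return self._raw.decode(encoding, errors)

    def __bytes__(self):
        # The raw form stays available for handing back to the OS unchanged.
        return self._raw

    def __eq__(self, other):
        return isinstance(other, OSName) and self._raw == other._raw

    def __hash__(self):
        return hash(self._raw)

    def __repr__(self):
        return "OSName(%r)" % (self._raw,)


def listdir_names(path):
    """Hypothetical getter: list a directory without returning str."""
    # os.listdir returns bytes entries when called with a bytes path.
    return [OSName(entry) for entry in os.listdir(os.fsencode(path))]
```

Because OSName defines no string operations, mixing it with str raises
TypeError until the application calls decode(), and the undecodable
case shows up at that call rather than somewhere downstream.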
> The *only* "robust" solution is to completely separate the 2
I'm not so pessimistic, unless you're referring to Microsoft's
penchant for forking any solution they don't own.
> People *want* a solution that doesn't require every application
> developer to sweat blood to write working code, simply to cover
> corner cases that they don't believe will happen. The rest of us
> don't want to be made to care.
Well, yes, I wrote pretty much the same thing in the post you're
replying to. But do you really think PEP 383 as written is the unique
solution to those requirements?