[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Fri Apr 24 19:25:03 CEST 2009

Paul Moore writes:

 > The pros for Martin's proposal are a uniform cross-platform interface,
 > and a user-friendly API for the common case.

A more accurate phrasing would be "... a user-friendly API for those
who feel very lucky today."  Which is the common case, of course, but
spins a little differently.

 > [1] Actually, all the PEP says is "With this PEP, a uniform
 > treatment of these data as characters becomes possible." An
 > argument as to why this is a good thing would be a useful addition
 > to the PEP. At the moment it's more or less treated as self-evident
 > - which I agree with, but which clearly the Unix people here are
 > not as certain of.

Well, the problem is that both parts are false.  If you didn't start
with a valid string in a known encoding, you shouldn't treat it as
characters because it's not.  Hand it to a careful API, and you'll get
an Exception raised in your face.  And that's precisely why it's not
obviously a good thing.  Careful clients will have to treat it as
"transcoded bytes", and so the people who develop those clients get no
benefit.  OTOH, at least some of those who feel lucky and use it
naively are going to turn out to be wrong.
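
To make the "careful API" point concrete, here is a minimal sketch. It assumes the error handler name "surrogateescape", which is how the PEP's scheme is spelled in Python 3.1 and later, and it uses a made-up Latin-1 file name as the undecodable input. It is an illustration only, not code from the PEP.

```python
# Hypothetical file name as raw bytes: Latin-1 encoded, so not valid UTF-8.
raw = b"caf\xe9.txt"

# Decode in the PEP 383 style: each undecodable byte becomes a lone
# surrogate code point (0xE9 -> U+DCE9) instead of raising an error.
name = raw.decode("utf-8", errors="surrogateescape")

# A careful (strict) consumer refuses the result, because a lone
# surrogate is not a character that UTF-8 can encode.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# A client that knows it is holding "transcoded bytes" can get the
# original bytes back by using the same error handler on the way out.
roundtrip = name.encode("utf-8", errors="surrogateescape")
```

Here `strict_ok` ends up False, while `roundtrip` equals `raw`: the string survives a round trip through the same handler but is rejected as soon as something treats it as ordinary text.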

That said, I'm +0 on the PEP as is.  It's a little bit better than the
current situation in that developers who would otherwise just punt on
dealing with the other world (i.e., Windows for Unix hackers, and Unix
for Windows coders) will have a unified interface, so it'll maybe work
automagically (when you're lucky :-) in that other world, too.  And if
somebody comes up with an idea of true genius for handling the
underlying problem, or even just a slight practical improvement, then
everybody who uses this API can benefit simply by upgrading Python.