[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Antoine Pitrou solipsis at pitrou.net
Mon Apr 27 18:09:07 CEST 2009


Stephen J. Turnbull <stephen <at> xemacs.org> writes:
> 
> I hate to break it to you, but most stages of mail processing have
> very little to do with SMTP.  In particular, processing MIME
> attachments often requires dealing with file names.

AFAIK, the file name is only there as an indication for the user when he wants
to save the file. If it's garbled a bit, no big deal.

> The point is that Martin's proposal is not just a solution
> to the problem he posed.

But you haven't concretely demonstrated it with actual use cases. The problems
that the PEP tries to solve, conversely, /have/ been experienced.

> And the APIs won't be killable until
> Python 4000.

Which APIs? The PEP doesn't propose any new API, it just enhances the
implementation of current APIs so that they work out of the box in all cases.
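
To make "work out of the box" concrete, here is a minimal sketch of the round trip the PEP aims for. It assumes the error handler that Python 3 ships under the name "surrogateescape"; the helper names decode_name and encode_name are hypothetical, used only for illustration, and are not part of any API.

```python
def decode_name(raw: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical helper: decode a file name, mapping each undecodable
    byte to a lone surrogate in the range U+DC80..U+DCFF."""
    return raw.decode(encoding, errors="surrogateescape")


def encode_name(name: str, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: reverse decode_name, turning the lone
    surrogates back into the original bytes."""
    return name.encode(encoding, errors="surrogateescape")


# A Latin-1 encoded name; the 0xE9 byte is not valid UTF-8 here.
raw = b"caf\xe9.txt"
name = decode_name(raw)

# ascii() is used because a lone surrogate cannot be printed directly.
print(ascii(name))
print(encode_name(name) == raw)
```

The existing call signatures stay the same; only the way undecodable bytes are represented inside the returned str changes, so the original bytes can be recovered.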

> Specifically, if the return values were bytes,

... it would make Windows support worse.

> or (better for 2.x,
> where bytes are strings as far as most programmers are concerned) as a
> new data type,

I'm -1 on any new string-like type (for file paths or whatever else) with custom
encoding/decoding semantics. It's the best way to ruin the clean str/bytes
separation that 3.x introduced.
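
As a small illustration of that separation (a sketch only; concat_or_error is a hypothetical helper), 3.x refuses to mix the two types implicitly and requires an explicit decode:

```python
def concat_or_error(text, data):
    """Hypothetical helper: try to concatenate str and bytes and
    return the name of the exception raised, if any."""
    try:
        return text + data
    except TypeError as exc:
        return type(exc).__name__


text = "r\u00e9sum\u00e9"
data = text.encode("utf-8")

# Implicit mixing is rejected in 3.x.
print(concat_or_error(text, data))

# An explicit decode makes the intent and the encoding visible.
print(ascii(text + data.decode("utf-8")))
```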

Besides, the goal is also to make things easier for the programmer. Otherwise,
we'll have the same situation as in 2.x where many English-centric programmers
produced code that was incapable of dealing with non-ASCII input, because they
didn't care about the distinction between str and unicode.

Regards

Antoine.



