On Fri, 10 Jan 2014 16:23:53 -0800 Ethan Furman <ethan@stoneleaf.us> wrote:
On 01/08/2014 02:42 PM, Antoine Pitrou wrote:
With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation.
From the PEP:
=============
Python 3 generally mandates that text be stored and manipulated as unicode (i.e. str objects, not bytes). In some cases, though, it makes sense to manipulate bytes objects directly. Typical usage is binary network protocols, where you may want to interpolate and assemble several bytes objects (some of them literals, some of them computed) to produce complete protocol messages. For example, protocols such as HTTP or SIP have headers with ASCII names and opaque "textual" values using a varying and/or sometimes ill-defined encoding. Moreover, those headers can be followed by a binary body... which can be chunked and decorated with ASCII headers and trailers!
As it stands now, the PEP talks about ASCII, about how it makes sense sometimes to work directly with bytes objects, and then refuses to allow % to embed ASCII text in the byte stream.
Indeed, I refuse for %-formatting to allow the mixing of bytes and str objects, in the same way that it is forbidden to concatenate "a" and b"b" together, or to write b"".join(["abc"]). Python 3 was made *precisely* because the implicit conversion between ASCII unicode and bytes is deemed harmful. It's completely counter-productive and misleading for our users to start muddying the message by introducing exceptions to that rule.

Regards

Antoine.
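
For readers following along, here is a minimal sketch of the existing Python 3 behaviour referred to above: both mixed operations raise TypeError, while an explicit encode lets a computed str take part in bytes assembly. The helper name raises_type_error and the header used are illustrative assumptions, not anything taken from the PEP.

```python
def raises_type_error(operation):
    """Return True if calling the zero-argument callable raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False


# Mixing str and bytes is rejected in both of the cases cited above.
concat_rejected = raises_type_error(lambda: "a" + b"b")
join_rejected = raises_type_error(lambda: b"".join(["abc"]))

# The explicit route: encode the str first, then assemble bytes with bytes.
# The header name and value are hypothetical, chosen only for illustration.
header_value = "text/plain".encode("ascii")
message = b"".join([b"Content-Type: ", header_value, b"\r\n"])

print(concat_rejected, join_rejected, message)
```

The point of the sketch is only that the conversion has to be spelled out by the caller; it says nothing about what %-formatting on bytes should eventually accept.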