[Python-Dev] PEP 460: allowing %d and %f and mojibake

Ethan Furman ethan at stoneleaf.us
Mon Jan 13 00:46:39 CET 2014

On 01/12/2014 02:57 PM, Stephen J. Turnbull wrote:
> Ethan Furman writes:
>> Nothing else is ideal.  I'll go that route if I have to.  I
>> understand that in the real world you go with what works, but in
>> the development stage you fight for the ideal.  :)
> You're going to lose, because Python 3 chose a different ideal that
> conflicts with yours.

Entirely possible.  I didn't set out to waste anyone's time, but I wasn't around for the initial discussions, so I don't
know the reasons behind the result, only that the result is not an appropriate boundary type despite being what is
handed around at the boundaries.

>>> My reading of Nick's refusal is that %d takes a value which is
>>> semantically a number, converts it into a base-10 representation
>>> (which is semantically a *string*, not a sequence of bytes[1]) and
>>> then *encodes* that string into a series of bytes using the ASCII
>>> encoding.
>> That could be.  And yet the bytes type already has several
>> concessions to ASCII encoding.
> No, Nick's point is that there's no encoding needed there at all,
> just a bunch of methods that handle numbers in the range 0-255.  You
> can rationalize the particular choice of numbers by referring to the
> ASCII coded character set, and that's very useful to users.  But
> knowledge of ASCII isn't necessary to specify these methods; they can
> be defined in an encoding/decoding-free way.

How can you say that with a straight face? [1]  Do you really think that .title, .isalnum, and .center (to name only a
few) would work the same if the assumed encoding were EBCDIC?  Do you think they would do the proper transformations, or
return the proper result, if the bytes they were used on were encoded Japanese?
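
To make the question concrete, here is a minimal sketch.  The code pages are illustrative choices, not anything from
the thread: cp037 stands in for EBCDIC and Shift JIS for "encoded Japanese".  It only shows that the bytes methods
apply ASCII rules to whatever bytes they are handed.

```python
# Sketch only: bytes methods apply ASCII rules regardless of the real encoding.
word = "hello"
ascii_bytes = word.encode("ascii")
ebcdic_bytes = word.encode("cp037")    # cp037: one EBCDIC code page (illustrative)

ascii_title = ascii_bytes.title()      # ASCII letters are recognised and title-cased
ebcdic_title = ebcdic_bytes.title()    # EBCDIC letters are >= 0x80, so nothing changes
print(ascii_title)                     # b'Hello'
print(ebcdic_title == ebcdic_bytes)    # True
print(ascii_bytes.isalnum(), ebcdic_bytes.isalnum())   # True False

# The same text gives different answers as str and as Shift JIS bytes.
japanese = "日本語"
sjis_bytes = japanese.encode("shift_jis")
print(japanese.isalnum(), sjis_bytes.isalnum())        # True False

# A case "transformation" can silently turn one character into another:
# the second byte of this katakana is 0x41, which ASCII treats as 'A'.
katakana_a = "ア".encode("shift_jis")
lowered = katakana_a.lower()
print(lowered.decode("shift_jis") == "ア")             # False
```

With ASCII-compatible data the methods behave as advertised; with anything else they either do nothing or do the
wrong thing, which is the point being argued.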

>> But bytes already acknowledges an ASCII bias.
> True, but that bias is implemented without use of encoding or
> decoding.   b'%d' % (123,) -> b'123' does require encoding, at the
> very least in the sense of type change and serialization.

You mean like changing a number into text does?  Really, this is no different.
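
As a rough illustration of the two views, assuming Python 3.5 or later (where PEP 461 eventually added %-formatting
to bytes; it did not exist when this was written), the sketch below compares formatting straight into bytes with
formatting to text and then encoding as ASCII.  The variable names are made up for the example.

```python
# Sketch only; requires Python 3.5+ for %-formatting on bytes (PEP 461).
number = 123

as_text = "%d" % (number,)             # number -> base-10 text
via_encode = as_text.encode("ascii")   # text -> ASCII bytes
direct = b"%d" % (number,)             # number -> bytes in one step

print(direct == via_encode)            # True
```

Whether that single step counts as an implicit encode or as plain serialization is exactly the disagreement here.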


[1] I'm sorry to be offensive, but I have no idea how to respond to that in a way that acknowledges my complete
astonishment that you would say such a thing.
