[Python-Dev] PEP 460: allowing %d and %f and mojibake

Stephen J. Turnbull stephen at xemacs.org
Sun Jan 12 23:57:34 CET 2014


Ethan Furman writes:

 > Nothing else is ideal.  I'll go that route if I have to.  I
 > understand that in the real world you go with what works, but in
 > the development stage you fight for the ideal.  :)

You're going to lose, because Python 3 chose a different ideal that
conflicts with yours.

 > > My reading of Nick's refusal is that %d takes a value which is
 > > semantically a number, converts it into a base-10 representation
 > > (which is semantically a *string*, not a sequence of bytes[1]) and
 > > then *encodes* that string into a series of bytes using the ASCII
 > > encoding.
 > 
 > That could be.  And yet the bytes type already has several
 > concessions to ASCII encoding.

No, Nick's point is that there's no encoding needed there at all,
just a bunch of methods that handle numbers in the range 0-255.  You
can rationalize the particular choice of numbers by referring to the
ASCII coded character set, and that's very useful to users.  But
knowledge of ASCII isn't necessary to specify these methods; they can
be defined in an encoding/decoding-free way.
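
As a rough illustration of what "encoding/decoding-free" could mean here,
this sketch defines an upper()-like operation purely as arithmetic on
integers in the range 0-255.  The name upper_bytes is hypothetical and
this is not how CPython implements bytes.upper(); it only shows that the
behaviour can be specified without mentioning any codec.

```python
def upper_bytes(data: bytes) -> bytes:
    """Hypothetical helper: upper-case bytes using only integer ranges."""
    # Values 97..122 are shifted down by 32 to land in 65..90.
    # Every other value, including 128..255, passes through unchanged.
    # No str object and no codec is involved at any point.
    return bytes(b - 32 if 97 <= b <= 122 else b for b in data)


sample = b"abc\xff123"
print(upper_bytes(sample))
print(upper_bytes(sample) == sample.upper())
```

That the numbers 97..122 happen to be the ASCII lower-case letters is the
rationalization; the definition itself never needs to say so.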

 > But bytes already acknowledges an ASCII bias.

True, but that bias is implemented without use of encoding or
decoding.  By contrast, b'%d' % (123,) -> b'123' does require encoding,
at the very least in the sense of type change and serialization.
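
One way to make the steps hidden in that operation explicit is the sketch
below.  It is only an illustration of the argument, under the assumption
that the conversion goes through a str intermediate; the function name
format_int_as_bytes is hypothetical and says nothing about how an actual
implementation of %d for bytes would work internally.

```python
def format_int_as_bytes(n: int) -> bytes:
    """Hypothetical helper spelling out what b'%d' % (n,) would imply."""
    # Step 1: type change. A number becomes its base-10 text form (a str).
    text = str(n)
    # Step 2: serialization. The text is encoded to bytes using ASCII.
    return text.encode("ascii")


print(format_int_as_bytes(123))
```

The second step is the one that has no counterpart in the existing bytes
methods: it names a codec, where the methods only need integer ranges.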

