[Python-Dev] PEP 460: allowing %d and %f and mojibake

Nick Coghlan ncoghlan at gmail.com
Sun Jan 12 17:09:27 CET 2014


On 13 Jan 2014 01:22, "Kristján Valur Jónsson" <kristjan at ccpgames.com>
wrote:
>
>
> Well, my suggestion would be that we _should_ make it work, by having the
> %s format specifier on bytes objects mean:
> str(arg).encode('ascii', 'strict')
> It would be an explicit encoding operator with a known, fixed, and
> well-specified encoder.
> This would cover most of the use cases seen in this threadnought. Others
> could be handled with explicit str formatting and encoding.
>
> Imho, this is not equivalent to re-introducing automatic type conversion
> between binary/unicode; it is adding a specific convenience function for
> explicitly asking for ASCII encoding.

It is not explicit, it is implicit: whether the result assumes ASCII
compatibility depends on whether you pass a binary value (no assumption)
or a string value (assumes ASCII compatibility). This kind of data-driven
change in assumptions about correctness is utterly unacceptable in the
core text and binary types in Python 3.
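
As a rough illustration of the data-driven behaviour described above, here
is a minimal sketch. The helper names proposed_percent_s and header are
hypothetical, and the pass-through of binary arguments is an assumption
about how the proposal would treat them; none of this is a real API.

```python
def proposed_percent_s(arg):
    """Hypothetical model of the proposed %s handling; not a real API."""
    if isinstance(arg, (bytes, bytearray)):
        # Binary input: interpolated as-is, no encoding assumption is made.
        return bytes(arg)
    # Anything else: implicitly encoded, which assumes ASCII compatibility.
    return str(arg).encode('ascii', 'strict')


def header(value):
    # The call site looks identical whatever kind of value is passed in.
    return b"X-Name: " + proposed_percent_s(value) + b"\r\n"


binary_result = header(b"caf\xc3\xa9")  # arbitrary bytes pass through
text_result = header("cafe")            # ASCII-only text is encoded

try:
    header("caf\u00e9")                 # non-ASCII text hits the strict codec
    failed = False
except UnicodeEncodeError:
    failed = True
```

The same line of code either makes no assumption, silently encodes, or
raises, depending only on the data that reaches it at runtime.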

It's also completely unnecessary - asciistr will be a third party extension
type that allows those users pining for the halcyon days of the Python 2
str type to stop harassing the core devs with requests to compromise the
core Python 3 text model with implicit encoding operations. I'll ensure any
interoperability bugs between asciistr and the core types that can't be
worked around get fixed.

A separate type is genuinely explicit (since the ASCII assumption is no
longer hidden from the type system), and allows much simpler
interoperability for code that wants it (indexing asciistr will eventually
produce length 1 asciistr instances instead of str instances, it will avoid
the bytes(intval) discrepancy, it will avoid the str(bytesval) problem,
etc).
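
For readers who have not run into them, the discrepancies mentioned above
can be seen with the core Python 3 bytes type. This is only a sketch of
current bytes behaviour under default interpreter settings (the variable
names are illustrative); it does not show asciistr itself.

```python
data = b"abc"

# Indexing bytes yields an int, not a length 1 bytes object.
first = data[0]

# bytes(intval) creates that many NUL bytes rather than the digits.
zeros = bytes(3)

# str(bytesval) returns the repr of the bytes object, not decoded text
# (with the -bb interpreter option this call raises BytesWarning instead).
shown = str(data)

print(first, zeros, shown)
```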

I've been suggesting for years that Python 3 might need a third type (not
required to be a builtin, since it's so specialised), but folks migrating
from Python 2 have been so focused on making the core binary type a hybrid
type again that the notion of taking advantage of PEP 393 to create a
dedicated extension type specifically for working with ASCII compatible
binary protocols has failed to compute.

I'm hoping a test suite and preliminary implementation will help more
people to finally get the point.

Regards,
Nick.

>
> K
> ________________________________________
> From: Python-Dev [python-dev-bounces+kristjan=ccpgames.com at python.org]
> on behalf of Georg Brandl [g.brandl at gmx.net]
> Sent: Sunday, January 12, 2014 09:23
> To: python-dev at python.org
> Subject: Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake
>
> On 12.01.2014 09:57, Paul Moore wrote:
> > On 12 January 2014 01:01, Victor Stinner <victor.stinner at gmail.com>
> > wrote:
> >> Supporting formatting integers would allow writing b"Content-Length:
> >> %s\r\n" % 123, which would work on Python 2 and Python 3.
> >
> > I'm surprised that no-one is mentioning b"Content-Length: %s\r\n" %
> > str(123) which works on Python 2 and 3, is explicit, and needs no
> > special-casing of int in the format code.
>
> Certainly doesn't work on Python 3 right now, and never should :)
>
> Georg