[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Sun Jan 12 18:57:14 CET 2014

Wait a second, this is how I understood it but what Nick said made me think
otherwise...

On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> On Sun, Jan 12, 2014 at 12:52:18PM +0100, Juraj Sukop wrote:
> > On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> >
> > Just to check I understood what you are saying. Instead of writing:
> >
> >     content = b'\n'.join([
> >         b'header',
> >         b'part 2 %.3f' % number,
> >         binary_image_data,
> >         utf16_string.encode('utf-16be'),
> >         b'trailer'])
>
> Which doesn't work, since bytes don't support %f in Python 3.
>

I know; this was an example of what would be (for me, anyway) the ideal way
of formatting bytes.
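
For comparison, a rough sketch of how the same content can be built on a
Python 3 where bytes have no %-formatting: format the number as str and
encode only that one piece. The values of number, binary_image_data and
text below are made up for illustration (text stands in for the poorly
named utf16_string).

```python
# Illustrative values only; none of these come from the thread.
number = 3.14159
binary_image_data = bytes(range(256))
text = 'caf\u00e9 \u0141'  # Unicode string with a code point above U+00FF

content = b'\n'.join([
    b'header',
    ('part 2 %.3f' % number).encode('ascii'),  # format as str, then encode
    binary_image_data,                         # passed through untouched
    text.encode('utf-16be'),
    b'trailer'])

print(content[:20])
```

It works, but the formatting step has to round-trip through str, which is
exactly the noise that formatting bytes directly would remove.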

> First, "utf16_string" confuses me. What is it? If it is a Unicode
> string, i.e.:
>

It is a Unicode string that happens to contain code points above U+00FF
(as with the TTF example above), so it triggers the (at least) 2-byte
memory representation in CPython 3.3+. I agree that I chose the variable
name poorly, my bad.

>
>     content = '\n'.join([
>         'header',
>         'part 2 %.3f' % number,
>         binary_image_data.decode('latin-1'),
>         utf16_string,  # Misleading name, actually Unicode string
>         'trailer'])
>

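Which, because of that horribly named variable, prevents the use of a simple
memcpy and makes the image data occupy far more memory than it did as plain
bytes: once the joined string contains a code point above U+00FF, every
character in it, including the decoded image data, takes at least 2 bytes.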
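
A small sketch of that effect, assuming CPython 3.3+ (the flexible string
representation is an implementation detail, so other implementations may
behave differently). The 1 MiB payload and the U+0141 character are made up
for illustration.

```python
import sys

# 1 MiB of made-up "image" bytes covering every byte value.
binary_image_data = bytes(range(256)) * 4096

# Decoded as latin-1, each byte becomes one code point <= U+00FF,
# which CPython 3.3+ can still store in 1 byte per character.
as_latin1 = binary_image_data.decode('latin-1')

# Joining with any code point above U+00FF widens the whole result
# to at least 2 bytes per character.
joined = '\n'.join([as_latin1, '\u0141'])

print(sys.getsizeof(binary_image_data))
print(sys.getsizeof(as_latin1))
print(sys.getsizeof(joined))
```

On CPython the last figure should come out at roughly twice the first two,
and building it cannot be a plain copy of the original bytes.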

> Both examples assume that you intend to do further processing of content
> before sending it, and will encode just before sending:
>

Not really. I wanted to compare it with bytes formatting, which is why my
example included the "encode()" call as well.