<div dir="ltr">bytes.format() below. I'll leave it to you to decide if they warrant using, leaving as an open question, or rejecting.<br><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jan 14, 2014 at 2:56 PM, Ethan Furman <span dir="ltr"><<a href="mailto:ethan@stoneleaf.us" target="_blank">ethan@stoneleaf.us</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Duh.  Here's the text, as well.  ;)<br>

<br>

<br>

PEP: 461<br>

Title: Adding % and {} formatting to bytes<br>

Version: $Revision$<br>

Last-Modified: $Date$<br>

Author: Ethan Furman <<a href="mailto:ethan@stoneleaf.us" target="_blank">ethan@stoneleaf.us</a>><br>

Status: Draft<br>

Type: Standards Track<br>

Content-Type: text/x-rst<br>

Created: 2014-01-13<br>

Python-Version: 3.5<br>

Post-History: 2014-01-13<br>

Resolution:<br>

<br>

<br>

Abstract<br>

========<br>

<br>

This PEP proposes adding the % and {} formatting operations from str to bytes.<br>

<br>

<br>

Proposed semantics for bytes formatting<br>

=======================================<br>

<br>

%-interpolation<br>

---------------<br>

<br>

All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)<br>

will be supported, and will work as they do for str, including the<br>

padding, justification and other related modifiers.<br>

<br>

Example::<br>

<br>

   >>> b'%4x' % 10<br>

   b'   a'<br>

<br>

%c will insert a single byte, either from an int in range(256), or from<br>

a bytes argument of length 1.<br>

<br>

Example::<br>

<br>

    >>> b'%c' % 48<br>

    b'0'<br>

<br>

    >>> b'%c' % b'a'<br>

    b'a'<br>
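The %c rule can be modelled in a few lines. This is only a sketch of the rule as worded above, not the proposed implementation: the helper name `percent_c` is hypothetical, and raising ValueError for out-of-range input is an assumption, since the draft does not name an exception here.

```python
def percent_c(value):
    """Hypothetical model of the draft's %c rule (not the real implementation)."""
    if isinstance(value, int):
        # An int must fit in a single byte.
        if value not in range(256):
            raise ValueError("%c requires an int in range(256)")  # assumed exception type
        return bytes([value])
    if isinstance(value, bytes):
        # A bytes argument must be exactly one byte long.
        if len(value) != 1:
            raise ValueError("%c requires a bytes object of length 1")  # assumed exception type
        return value
    raise TypeError("%c requires an int or a bytes object of length 1")
```

With this helper, `percent_c(48)` and `percent_c(b'a')` mirror the two examples above, while `percent_c(256)` and `percent_c(b'ab')` are refused.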

<br>

%s, because it is the most general, has the most convoluted resolution:<br>

<br>

  - input type is bytes?<br>

    pass it straight through<br>

<br>

  - input type is numeric?<br>

    use its __xxx__ [1] [2] method and ascii-encode it (strictly)<br>

<br>

  - input type is something else?<br>

    use its __bytes__ method; if there isn't one, raise an exception [3]<br>

<br>

Examples::<br>

<br>

    >>> b'%s' % b'abc'<br>

    b'abc'<br>

<br>

    >>> b'%s' % 3.14<br>

    b'3.14'<br>

<br>

    >>> b'%s' % 'hello world!'<br>

    Traceback (most recent call last):<br>

    ...<br>

    TypeError: 'hello world!' has no __bytes__ method, perhaps you need to encode it?<br>

<br>

.. note::<br>

<br>

   Because the str type does not have a __bytes__ method, attempts to<br>

   directly use 'a string' as a bytes interpolation value will raise an<br>

   exception.  To use 'string' values, they must be encoded or otherwise<br>

   transformed into a bytes sequence::<br>

<br>

      'a string'.encode('latin-1')<br>
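The %s resolution order described above can be sketched as a plain function. This models the draft's wording only: the name `resolve_percent_s` is hypothetical, using `str()` for numeric values is an assumption (footnote [1] leaves the choice between __str__ and __repr__ open), and TypeError is just one of the candidates listed in footnote [3].

```python
import numbers


def resolve_percent_s(value):
    """Hypothetical model of the draft's %s resolution order."""
    # 1. bytes pass straight through.
    if isinstance(value, bytes):
        return value
    # 2. Numeric types: take the text form and ASCII-encode it strictly.
    #    str() is an assumption; the draft has not chosen between str and repr.
    if isinstance(value, numbers.Number):
        return str(value).encode('ascii', 'strict')
    # 3. Anything else must provide __bytes__ (looked up on the type).
    to_bytes = getattr(type(value), '__bytes__', None)
    if to_bytes is None:
        raise TypeError(
            "%r has no __bytes__ method, perhaps you need to encode it?" % (value,)
        )
    return to_bytes(value)
```

Under these assumptions `resolve_percent_s(3.14)` gives `b'3.14'`, and passing a str raises the TypeError because str defines no __bytes__ method.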

<br>

<br>

format<br>

------<br>

<br>

The format mini language will be used as-is, with the behaviors as listed<br>

for %-interpolation.<br></blockquote><div><br></div><div>That's too vague; % interpolation does not support format codes in the same way that str.format() does. % interpolation has specific code to support %d, etc. But str.format() supports {:d} not through special code but because e.g. int.__format__('d') works. So you can't say "bytes.format() supports {:d} just like %d works with string interpolation" since the mechanisms are fundamentally different.</div>
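The difference in mechanism is visible with str today. In this small illustration (the variable names are only for the example), the % operator handles the 'd' code itself, whereas str.format() hands the spec to the argument's own __format__ method, so the outcome depends on the argument's type.

```python
# %-interpolation: the operator itself implements 'd' and converts the
# float to an integer before formatting it.
percent_result = '%d' % 3.7

# str.format(): the 'd' spec is passed to int.__format__, which accepts it.
int_result = '{:d}'.format(3)

# The same spec is passed to float.__format__, which rejects it.
try:
    '{:d}'.format(3.7)
except ValueError as exc:
    float_error = exc
else:
    float_error = None
```

Here `percent_result` and `int_result` are both `'3'`, while the float case ends up in the `except` branch.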


<div><br></div><div>This is why I have argued that if you specify it as "if there is a format spec specified, then the return value from calling __format__() will have str.encode('ascii', 'strict') called on it" you get the support for the various number-specific format specs for free. It also means that if you pass in a string and just want its strict ASCII bytes, you can get them with {:s}.</div>


<div><br></div><div>I also think that a 'b' conversion should be added to bytes.format(). This doesn't have the same issue as %b if you make {} implicitly mean {!b} in Python 3.5, as {} will then mean whatever is most accurate for bytes.format() in either version. It also allows for explicit support where you know you only want bytes, and allows {!s} to mean you only want a string (and thus throw an error otherwise).</div>


<div><br></div><div>And all of this means that much like %s only taking bytes, the only way for bytes.format() to accept a non-byte argument is for some format spec to be specified to trigger the .encode('ascii', 'strict') call.</div>
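A rough sketch of the per-field rule I am describing, for a single replacement field. The function name `format_bytes_field` is made up for illustration, and the TypeError for a non-bytes argument without a spec is an assumption; nothing here is an existing API.

```python
def format_bytes_field(value, spec=''):
    """Hypothetical model of one bytes.format() replacement field."""
    if spec:
        # A format spec was given: defer to the argument's __format__ (via
        # the format() builtin) and strictly ASCII-encode the resulting str.
        return format(value, spec).encode('ascii', 'strict')
    # No spec: only bytes are accepted, much like %s.
    if not isinstance(value, bytes):
        raise TypeError(
            "%s needs a format spec to be used in bytes formatting"
            % type(value).__name__
        )
    return value
```

So `format_bytes_field(255, '04x')` yields `b'00ff'` and `format_bytes_field('abc', 's')` yields `b'abc'`, while `format_bytes_field('abc')` is rejected and a non-ASCII string with 's' fails in the strict encode.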


<div><br></div><div>-Brett</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

<br>

Open Questions<br>

==============<br>

<br>

For %s there has been some discussion of trying to use the buffer protocol<br>

(Py_buffer) before trying __bytes__.  This question should be answered before<br>

the PEP is implemented.<br>

<br>

<br>

Proposed variations<br>

===================<br>

<br>

It has been suggested to use %b for bytes instead of %s.<br>

<br>

  - Rejected as %b does not exist in Python 2.x %-interpolation, which is<br>

    why we are using %s.<br>

<br>

It has been proposed to automatically use .encode('ascii','strict') for str<br>

arguments to %s.<br>

<br>

  - Rejected as this would lead to intermittent failures.  Better to have the<br>

    operation always fail so the trouble-spot can be correctly fixed.<br>
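To illustrate the "intermittent failures" concern: with implicit strict ASCII encoding, the same call succeeds or fails depending only on the data it happens to receive. The helper `auto_encode` below is hypothetical and simply stands in for what %s would do to str arguments under the rejected variation.

```python
def auto_encode(value):
    """Hypothetical stand-in for the rejected implicit encoding of str arguments."""
    return value.encode('ascii', 'strict')


# Works for as long as the input happens to be pure ASCII...
ascii_result = auto_encode('Ethan')

# ...and fails as soon as a non-ASCII character shows up in the data.
try:
    auto_encode('Zo\u00eb')
except UnicodeEncodeError as exc:
    non_ascii_error = exc
else:
    non_ascii_error = None
```

Code that passed every test with ASCII names would break only in production, which is why always failing for str is easier to diagnose.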

<br>

It has been proposed to have %s return the ascii-encoded repr when the value<br>

is a str  (b'%s' % 'abc'  --> b"'abc'").<br>

<br>

  - Rejected as this would lead to hard to debug failures far from the problem<br>

    site.  Better to have the operation always fail so the trouble-spot can be<br>

    easily fixed.<br>

<br>

<br>

Footnotes<br>

==========<br>

<br>

.. [1] Not sure if this should be the numeric __str__ or the numeric __repr__,<br>

       or if there's any difference<br>

.. [2] Any proper numeric class would then have to provide an ascii<br>

       representation of its value, either via __repr__ or __str__ (whichever<br>

       we choose in [1]).<br>

.. [3] TypeError, ValueError, or UnicodeEncodeError?<br>

<br>

<br>

Copyright<br>

=========<br>

<br>

This document has been placed in the public domain.<br>

<br>

<br>

..<br>

   Local Variables:<br>

   mode: indented-text<br>

   indent-tabs-mode: nil<br>

   sentence-end-double-space: t<br>

   fill-column: 70<br>

   coding: utf-8<br>

   End:<div class="HOEnZb"><div class="h5"><br>

<br>


</div></div></blockquote></div><br></div></div>