[Python-Dev] PEP 461 - Adding % and {} formatting to bytes

Neil Schemenauer nas at arctrix.com
Wed Jan 15 17:27:32 CET 2014


Neil Schemenauer <nas at arctrix.com> wrote:
> We should use duck-typing and that means a special method, I
> think.  We could introduce a new one but __bytes__ looks like it
> can work.  Otherwise, maybe __ascii__ is a good name.

I poked around the Python 3 source.  Using __bytes__ has some
downsides, e.g. if int implemented it to return its ASCII decimal
digits, the following would happen:

    >>> bytes(12)
    b'12'

Perhaps that's a little too ASCII-centric.  OTOH, UTF-8 seems to be
winning the encoding war and so the above could be argued as
reasonable behavior.  I think forcing people to explicitly choose an
encoding for str objects will be sufficient to avoid the bytes/str
mess we have in Python 2.

Unfortunately, that change conflicts with the current behavior, where
an integer argument gives a zero-filled bytes object of that length:

    >>> bytes(12)
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
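
For reference, here is a small sketch contrasting the current zero-fill
behavior with one explicit way to get the ASCII digits today.  It does
not rely on the proposed change, and the variable names are only
illustrative:

```python
n = 12

# Current behavior: an int argument is treated as a length,
# so this yields n zero bytes.
zero_filled = bytes(n)

# Getting the ASCII decimal digits needs an explicit encoding step.
ascii_digits = str(n).encode('ascii')

print(zero_filled)
print(ascii_digits)
```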

Would it be too disruptive to change that?  The zero-filled form
doesn't appear to be too useful, and we could still provide it through
a keyword argument, e.g.:

    bytes(size=12)

I notice something else surprising to me:

    >>> class Test(object):
    ...     def __bytes__(self):
    ...         return b'test'
    ...
    >>> with open('test', 'wb') as fp:
    ...     fp.write(Test())
    ...
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    TypeError: 'Test' does not support the buffer interface

I'd expect that to write b'test' to the file, not raise an error.
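
A minimal sketch of the mismatch, and of the explicit conversion that
works today.  It uses an in-memory io.BytesIO instead of a real file
(an assumption made only to keep the example self-contained).  The
exact TypeError message varies between Python 3 versions, so only the
exception type is checked:

```python
import io


class Test(object):
    def __bytes__(self):
        return b'test'


fp = io.BytesIO()

# write() wants an object supporting the buffer protocol;
# it does not call __bytes__ on its argument.
try:
    fp.write(Test())
    direct_write_ok = True
except TypeError:
    direct_write_ok = False

# Converting explicitly with bytes() does call __bytes__.
fp.write(bytes(Test()))
written = fp.getvalue()

print(direct_write_ok)
print(written)
```

So the special method is honored by the bytes() constructor, but not by
APIs that take a buffer directly.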

Regards,

  Neil


