[Python-ideas] Fixing the Python 3 bytes constructor

Nick Coghlan ncoghlan at gmail.com
Fri Mar 28 11:27:33 CET 2014


One of the current annoyances with the bytes type in Python 3 is the
way the constructor handles integers:

>>> bytes(3)
b'\x00\x00\x00'

It would be far more consistent with the behaviour of other bytes
interfaces if the result of that call were instead b'\x03'. To get
that behaviour, you currently have to wrap the integer in a list or
other iterable:

>>> bytes([3])
b'\x03'
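
For comparison, here is a small sketch of what "other bytes interfaces" do with an integer. The variable names are only for illustration; the calls are all existing bytes and bytearray behaviour in current Python 3.

```python
# Elsewhere in the bytes API, an int stands for a single byte value.
data = b"\x03abc"

values = list(data)        # iteration yields ints: 3, 97, 98, 99
has_three = 3 in data      # membership tests accept an int byte value
position = data.find(97)   # find() accepts an int byte value (Python 3.3+)

buf = bytearray()
buf.append(3)              # append() takes an int byte value

# The constructor is the odd one out: a bare int is a length, not a value.
zero_filled = bytes(3)     # three NUL bytes
wrapped = bytes([3])       # one byte with value 3
```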

The other consequence of this is that there's currently no neat way to
convert the integers produced by various bytes APIs back to a
length-one bytes object - we have no binary equivalent of "chr" to
convert an integer in the range 0-255 inclusive to its bytes
counterpart. The acceptance of PEP 461 means we'll get another option
(b"%c".__mod__) but that's hardly what anyone would call obvious.
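
To illustrate the round trip, the sketch below takes an integer obtained by indexing and turns it back into a length-one bytes object in the ways that exist today. It assumes Python 3.5 or later for the PEP 461 "%c" form; the variable names are illustrative only.

```python
data = b"abc"
first = data[0]                          # indexing yields the int 97, not b"a"

via_list = bytes([first])                # wrap the int in an iterable
via_format = b"%c" % first               # PEP 461 formatting (Python 3.5+)
via_to_bytes = first.to_bytes(1, "big")  # explicit width and byte order
via_slice = data[0:1]                    # sidestep the int by slicing instead

# The tempting spelling silently does something else entirely:
mistake = bytes(first)                   # 97 NUL bytes, not b"a"
```

None of these spellings is wrong, but none of them is as discoverable as "chr" is for str.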

However, during a conversation today, a possible solution occurred to
me: a "bytes.chr" class method, that served as an alternate
constructor. That idea results in the following 3 part proposal:

1. Add "bytes.chr" such that "bytes.chr(x)" is equivalent to the PEP
461 defined "b'%c' % x"

2. Add "bytearray.allnull" and "bytes.allnull" such that
"bytearray.allnull(x)" is equivalent to the current "bytearray(x)" int
handling

3. Deprecate the current "bytes(x)" and "bytearray(x)" int handling as
not only ambiguous, but actually a genuine bug magnet (it's way too
easy to accidentally pass a large integer and try to allocate a
ridiculously large bytes object)
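
The first two points can be approximated in pure Python as a rough sketch. "bytes_chr" and "allnull" below are hypothetical module-level stand-ins for the proposed class methods (class methods can't be added to builtin types from Python code), and the range checks and exception types are assumptions of this sketch rather than part of the proposal.

```python
import operator


def bytes_chr(value):
    """Hypothetical stand-in for the proposed bytes.chr(x)."""
    number = operator.index(value)   # accept anything usable as an int
    if not 0 <= number <= 255:
        # Assumed error type; the proposal does not specify one.
        raise ValueError("bytes_chr() arg not in range(256)")
    return bytes([number])


def allnull(length, kind=bytes):
    """Hypothetical stand-in for bytes.allnull / bytearray.allnull.

    Returns `length` NUL bytes as `kind` (bytes or bytearray), without
    relying on the int handling that point 3 would deprecate.
    """
    count = operator.index(length)
    if count < 0:
        raise ValueError("negative count")
    return kind(b"\x00" * count)


single = bytes_chr(3)                # one byte with value 3
padding = allnull(3)                 # three NUL bytes
buffer = allnull(2, bytearray)       # mutable, zero-filled
```

With the two operations spelled differently, passing a large integer where a byte value was intended fails the range check in "bytes_chr" instead of quietly allocating that many bytes.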

For point 2, I also considered the following alternative names before
settling on "allnull":

- bytes.null sounds too much like an alias for b"\x00"
- bytes.nulls just sounded too awkward to say (too many sibilants)
- bytes.zeros I can never remember how to spell (bytes.zeroes?)
- bytearray.cleared sort of worked, but bytes.cleared?
- ditto for bytearray.prealloc and bytes.prealloc (latter makes no sense)

That last is also a very C-ish name (although it is a rather C-ish operation).

Anyway, what do people think? Does anyone actually *like* the way the
bytes constructor in Python 3 currently handles integers and want to
keep it forever? Does the above proposal sound like a reasonable
suggestion for improvement in 3.5? Does this hit PEP territory, since
it's changing the signature and API of a builtin?

Cheers,
Nick.




-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

