[Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

Donald Stufft donald at stufft.io
Sun Aug 17 23:55:45 CEST 2014

> On Aug 17, 2014, at 5:19 PM, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:
> On Aug 17, 2014, at 11:33 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
>> I've had many of the problems Nick states and I'm also +1.
> There are two code snippets below which were taken from the standard library.
> Are you saying that:
> 1) you don't understand the code (as the pep suggests)
> 2) you are willing to break that code and everything like it
> 3) and it would be more elegantly expressed as:  
>         charmap = bytearray.zeros(256)
>     and
>         mapping = bytearray.zeros(256)
> At work, I have network engineers creating IPv4 headers and other structures
> with bytearrays initialized to zeros.  Do you really want to break all their code?
> Nowhere else in Python do we create buffers that way.  Code like
> "msg, who = s.recvfrom(256)" is the norm.
> Also, it is unclear if you're saying that you have an actual use case for this
> part of the proposal?
>    ba = bytearray.byte(65)
> And that the code would be better, clearer, and faster than the currently working form?
>    ba = bytearray([65])
> Does there really need to be a special case for constructing a single byte?
> To me, that is akin to proposing "list.from_int(65)" as an important special
> case to replace "[65]".
> If you must muck with the ever changing bytes() API, then please 
> leave the bytearray() API alone.  I think we should show some respect
> for code that is currently working and is cleanly expressible in both
> Python 2 and Python 3.  We aren't winning users with API churn.
> FWIW, I'm guessing that the differing viewpoints in the thread stem
> mainly from the proponents' experiences with bytes() rather than
> from experience with bytearray() which doesn't seem to have any
> usage problems in the wild.  I've never seen a developer say they
> didn't understand what "buf = bytearray(1024)" means.   That is
> not an actual problem that needs solving (or breaking).
> What may be an actual problem is code like "char = bytes(1024)"
> though I'm unclear what a user might have actually been trying
> to do with code like that.

I think this is probably correct. I generally don’t think that bytes(1024)
makes much sense at all, especially not as a default constructor. Most likely
it exists to be similar to bytearray().

I don't have a specific problem with bytearray(1024), though I do think it's
more elegantly and clearly described as bytearray.zeros(1024), but not by much.
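
For illustration, here is a minimal sketch of the two spellings being compared.
bytearray.zeros() is only a proposal in this thread, so the zeros() helper
below is a hypothetical stand-in written for this example, not an existing API.

```python
def zeros(count, factory=bytearray):
    """Hypothetical stand-in for the proposed .zeros() alternative constructor."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Today the integer form of the constructor already does this.
    return factory(count)


# Existing spelling: an int argument means "this many zero bytes".
buf = bytearray(1024)

# Proposed spelling, emulated with the hypothetical helper above.
named = zeros(1024)
frozen = zeros(4, bytes)
```

Both spellings produce the same zero-filled buffer; the only difference is
that the name states the intent.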

I find bytes.byte()/bytearray.byte() to be needed as long as there isn't a
simple way to iterate over a bytes or bytearray in a way that yields bytes or
bytearrays instead of integers. To be honest I can't think of a time when I'd
actually *want* to iterate over a bytes/bytearray as integers, although I
realize there is unlikely to be a reasonable way to change that now. If
iterbytes is added I'm not sure where I'd personally use either bytes.byte()
or bytearray.byte().
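
As a rough sketch of the kind of iteration meant here: the iterbytes() function
below is a hypothetical helper written for this example (the name is borrowed
from the discussion, the implementation is an assumption). It relies only on
the fact that slicing a bytes or bytearray returns an object of the same type.

```python
def iterbytes(data):
    """Yield each element of a bytes or bytearray as a length-1 object of the same type."""
    for i in range(len(data)):
        # Slicing keeps the container type, unlike indexing, which returns an int.
        yield data[i:i + 1]


data = b"abc"

as_ints = list(data)               # plain iteration yields integers
as_bytes = list(iterbytes(data))   # the helper yields length-1 bytes objects
as_arrays = list(iterbytes(bytearray(b"ab")))
```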

In general though I think that overloading a single constructor to do
something conceptually different based on the type of the parameter leads to
these kinds of confusing scenarios, and that having differently named
constructors for the different concepts is far clearer.
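
To make the overloading concrete, here is a small Python 3 sketch of how the
same constructor behaves for different argument types, including the mistake
that follows from indexing or iterating and then calling bytes() on the result.

```python
data = b"abc"

from_int = bytes(3)               # int: that many zero bytes
from_list = bytes([3])            # iterable of ints: those byte values
from_bytes = bytes(b"3")          # bytes-like object: a copy
from_text = bytes("3", "ascii")   # str plus encoding: the encoded text

first = data[0]                   # indexing yields the integer 97, not b"a"
mistake = bytes(first)            # 97 zero bytes, probably not what was wanted
intended = bytes([first])         # the single byte b"a"
```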

So given all that, I am:

* +10000 for some method of iterating over both types as bytes instead of
  integers.
* +1 on adding .zeros to both types as an alternative and preferred method of
  creating a zero filled instance and deprecating the original method[1].
* -0 on adding .byte to both types as an alternative method of creating a
  single byte instance.
* -1 on changing the meaning of bytearray(1024).
* +/-0 on changing the meaning of bytes(1024). I think that bytes(1024) is
  likely to *not* be what someone wants and that what they really want is
  bytes([N]). I also think that the number one reason for someone to be doing
  bytes(N) is that they were attempting to iterate over a bytes or bytearray
  object and got an integer. I also think that it's bad that this changed
  from 2.x to 3.x and I wish it hadn't. However, I can't decide whether it's
  worth reverting this at this time or not.

[1] By deprecating I mean raising a deprecation warning or something similar;
    my thoughts on actually removing the other methods are listed explicitly.

Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
