[Python-Dev] bytes & bytearray

Tue Jan 20 17:51:31 CET 2015

On Tue, Jan 20, 2015 at 11:48:10AM +0200, Paul Sokolovsky wrote:
> Hello,
> 
> On Tue, 20 Jan 2015 18:15:02 +1300
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> > Guido van Rossum wrote:
> > > On Mon, Jan 19, 2015 at 11:43 AM, Paul Sokolovsky
> > > <pmiscml at gmail.com <mailto:pmiscml at gmail.com>> wrote:
> > > 
> > >     b.lower_inplace()
> > >     b.lower_i()
> > > 
> > > Please don't go there. The use cases are too rare.
> > 
> > And if you have such a use case, it's not too
> > hard to do
> > 
> >    b[:] = b.lower()
> 
> The point of inplace operations (memoryview's, other stuff already in
> Python) is to avoid unneeded memory allocation and copying. For 1Tb
> bytearray with 1Tb of RAM, it will be very hard to do. (Ditto for 100K
> bytearray with 150K RAM.)

You can just loop through the bytearray and assign elements.  I use
something along these lines in PyParallel, where I'm operating on
bytearrays that are backed by underlying socket buffers and I don't
want any memory allocations or reallocations:

    def toupper_bytes(data):
        # Uppercase ASCII letters in place; no new buffer is allocated.
        assert isinstance(data, bytearray)
        a = ord('a')
        z = ord('z')
        for i in range(len(data)):
            c = data[i]
            if a <= c <= z:
                # ASCII upper and lower case differ by 32.
                data[i] = c - 32

Low overhead, mostly stays within the same ceval frame.  Should be a
walk in the park for PyPy, Cython or Numba to optimize, too.
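
If small, bounded temporaries are acceptable, a chunked variant is one
possible middle ground between the loop above and the whole-buffer
b[:] = b.lower() suggested earlier in the thread.  This is only a
sketch: the name upper_inplace_chunked and the default chunk size are
made up for illustration and are not taken from PyParallel.  It does
allocate (roughly two chunks' worth at a time), so it doesn't fit the
strict no-allocation case described above:

```python
def upper_inplace_chunked(data, chunk_size=64 * 1024):
    """Uppercase ASCII letters in `data` in place, one chunk at a time.

    Hypothetical helper for illustration only.
    """
    if not isinstance(data, bytearray):
        raise TypeError("expected a bytearray")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        end = start + chunk_size
        # The slice copy and its .upper() result are the only temporaries,
        # so extra memory is bounded by about two chunks.  The replacement
        # has the same length as the slice, so the bytearray is not resized.
        data[start:end] = data[start:end].upper()


buf = bytearray(b"hello, world")
upper_inplace_chunked(buf, chunk_size=5)
print(buf)  # bytearray(b'HELLO, WORLD')
```

The trade-off is that the per-byte work moves into C at the cost of a
temporary whose size you choose, rather than one the size of the whole
buffer.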

    Trent.