[Python-Dev] new buffer in python2.7

Antoine Pitrou solipsis at pitrou.net
Wed Oct 27 12:36:22 CEST 2010

On Wed, 27 Oct 2010 10:13:12 +0800
Kristján Valur Jónsson <kristjan at ccpgames.com> wrote:
> Although 2.7 has the new buffer interface and memoryview
> objects, these are not widely accepted by the built-in modules.

That's true, and slightly unfortunate. It could be a reason for
switching to 3.1/3.2 :-)

> IMHO this is unfortunate.  For example, when doing network I/O, you would want code like this:
> Buffer = bytearray(10)
> Socket.recv_into(Buffer)
> Header = struct.unpack("i", memoryview(Buffer)[:4])[0]

This can be a useless micro-optimization.

People are often misled by the implicit analogy with C. In Python,
a "lazy slice" still allocates memory for a whole new PyObject (for
example a memoryview). So lazy slices are only a win if they are
actually big (because a raw memcpy() is fast).

Actually, lazy slices can be *slower* if they instantiate an object
whose allocation is less optimized than the built-in bytes object's.
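
For the header-parsing case quoted above, one possible way to avoid
creating any intermediate slice object is struct.unpack_from(), which
reads at an offset inside an existing buffer. This is only a sketch of an
alternative technique, not what either poster proposed. It assumes
Python 3, where unpack_from() accepts any object supporting the buffer
protocol; the socketpair() and the read_header() helper are hypothetical
and exist only to make the example self-contained.

```python
import socket
import struct

def read_header(sock):
    """Receive into a preallocated buffer and decode a native int header."""
    buf = bytearray(10)
    nbytes = sock.recv_into(buf)          # fills buf in place, no new bytes object
    if nbytes < struct.calcsize("i"):
        raise ValueError("short read: got %d bytes" % nbytes)
    # unpack_from reads straight from the bytearray at offset 0,
    # so neither a bytes copy nor a memoryview slice is created.
    (header,) = struct.unpack_from("i", buf, 0)
    return header, nbytes

# Hypothetical in-process peer standing in for a real network connection.
sender, receiver = socket.socketpair()
sender.sendall(struct.pack("i", 1234) + b"abcdef")
header, nbytes = read_header(receiver)
sender.close()
receiver.close()
```

Whether this is measurably faster than slicing depends on the sizes
involved.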

Here are micro-benchmarks under 3.2:

$ ./python -m timeit -s "x = b'x'*10000" "x[:100]"
10000000 loops, best of 3: 0.134 usec per loop
$ ./python -m timeit -s "x = memoryview(b'x'*10000)" "x[:100]"
10000000 loops, best of 3: 0.151 usec per loop

$ ./python -m timeit -s "x = b'x'*10000" "x[:1000]"
1000000 loops, best of 3: 0.228 usec per loop
$ ./python -m timeit -s "x = memoryview(b'x'*10000)" "x[:1000]"
10000000 loops, best of 3: 0.151 usec per loop

So, as you see, creating a 100-byte slice from a 10 KB bytestring is
faster when using normal (eager) slices.
The eager slice becomes slower than the memoryview slice when creating
a 1 KB slice, but is still very fast (under one microsecond).
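
The same comparison can be scripted with the timeit module, which makes
it easy to try other slice sizes. This is a rough sketch, assuming
Python 3.5 or later (for the globals argument); compare_slices() is a
hypothetical helper, and the absolute numbers will differ from the ones
above depending on machine and interpreter version.

```python
import timeit

def compare_slices(total=10000, sizes=(100, 1000), number=100000):
    """Time eager (bytes) and lazy (memoryview) slicing for each size."""
    data = b"x" * total
    view = memoryview(data)
    results = {}
    for n in sizes:
        eager = timeit.timeit("x[:n]", globals={"x": data, "n": n}, number=number)
        lazy = timeit.timeit("x[:n]", globals={"x": view, "n": n}, number=number)
        # Convert total seconds into microseconds per loop.
        results[n] = (eager / number * 1e6, lazy / number * 1e6)
    return results

results = compare_slices()
for n, (eager_us, lazy_us) in sorted(results.items()):
    print("%5d bytes: bytes %.3f usec, memoryview %.3f usec" % (n, eager_us, lazy_us))
```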

> Not forgetting the StringI object in cStringIO.
> IMHO, not accepting buffers by these objects can be considered a bug
> that needs fixing.

It is often tempting to say that a "necessary" feature is a bug, but
it's a slippery slope. I would say it's only a bug when it's been
documented to work. I don't think StringIO objects have ever supported
the buffer protocol. In 3.2, though, you can use the
BytesIO.getbuffer() method, which returns a memoryview over the
object's contents without copying them.

(another reason to switch perhaps :-))
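
A minimal sketch of how BytesIO.getbuffer() behaves, assuming Python 3.2
or later. The variable names are illustrative only.

```python
import io

bio = io.BytesIO(b"abcdef")
view = bio.getbuffer()        # memoryview over the internal buffer, no copy
view[2:4] = b"56"             # writes through to the BytesIO contents
prefix = bytes(view[:3])      # explicit copy of the first three bytes
result = bio.getvalue()
view.release()                # the BytesIO cannot be resized while a view exists
```

Releasing the view matters in practice: as long as a view is exported,
writes that would resize the BytesIO raise BufferError.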

In any case, I think it should be the release manager's decision here.


