[portland] Raw buffers in python

Michael Schurter michael at susens-schurter.com
Tue Nov 1 00:44:50 CET 2011


On Mon, Oct 31, 2011 at 4:30 PM, Benjamin van der Veen
<b at bvanderveen.com> wrote:
> Hey all,
>
> What's the best way to make a raw buffer out of a string in Python? The
> key requirement is that inspecting a range of characters in the resulting
> object does not cause a copy/allocation. Correct me if I'm wrong, but it
> seems like sequence types (produced by memoryview, bytearray, etc) all
> return a character or integer object when you access items by index.
>
> An example use case is a parser: you have a string buffer the user gave you,
> and you're iterating over each character. It would be nice not to allocate
> a heap object for every iteration.

If I remember correctly, CPython preallocates the small ints, which
includes every single-byte value (0 <= x < 256), so iterating over a
bytearray should reuse those cached objects instead of allocating a
new one per item.
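
A quick way to check this, as a rough sketch only: it assumes CPython 3,
and the small-int cache is an implementation detail rather than a
language guarantee. The sample request line is made up for illustration.
It also shows that slicing a memoryview refers to the original buffer
rather than copying the bytes (the slice itself is still a small new
memoryview object).

```python
# Assumption: CPython 3. The small-int cache is an implementation detail.
data = bytearray(b"GET /index.html HTTP/1.1\r\n")

# Indexing a bytearray yields an int in range(256).
first = data[0]
again = data[0]

# On CPython both names refer to the same preallocated int object.
same_object = first is again

# A memoryview slice shares the underlying buffer; no bytes are copied.
view = memoryview(data)[4:15]

# Mutate the original buffer in place (same length, so the view stays valid).
data[5] = ord("I")

# Copy out only to inspect: the view reflects the change made above.
path = bytes(view)
print(same_object, path)
```

If the view had copied the bytes, path would still contain the
lowercase "i" from the original request line.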

> Hm, although, now that I'm typing this up, I'm realizing it might be a bit
> ridiculous to be asking for a raw char or int in Python. :S

Yeah, everything is a heap-allocated PyObject in CPython. Check out
Cython, which compiles Python-like code to C and lets you use raw C
types (chars, ints) without much effort. Depending on your use case
and dependencies, PyPy might also be worth checking out. Tight loops
over homogeneous data types are where JITs shine.
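
For illustration, this is the sort of loop I mean. It is a minimal
sketch: count_fields is a hypothetical helper, not something from a
library, and the same source should run unchanged on CPython 3 and
PyPy. How much faster a JIT makes it depends on the workload, so
measure before relying on it.

```python
def count_fields(buf, sep=ord(",")):
    """Count separator-delimited fields in a byte buffer.

    Hypothetical helper for illustration; buf is any object that
    yields ints in range(256) when iterated, such as a bytearray.
    """
    fields = 1
    for byte in buf:
        # Compare ints directly; no one-character strings are created.
        if byte == sep:
            fields += 1
    return fields


line = bytearray(b"a,b,,d")
n = count_fields(line)
print(n)
```

Every item in the loop is the same type and the body is a single
comparison, which is the homogeneous case a tracing JIT handles well.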

