[Python-Dev] Cython for cPickle?

Stefan Behnel stefan_ml at behnel.de
Thu Apr 19 10:55:24 CEST 2012


Hi,

I noticed that there is a PEP (3154) and a GSoC proposal about improving
Pickle. Given the recent discussion on this list about using Cython for the
import module, I wonder if it wouldn't make even more sense to switch from
a C (accelerator) implementation to Cython for _pickle.

The rationale is that C code that deals a lot with object operations tends
to be rather verbose, and _pickle specifically looks very verbose in many
places. Some of this is optimised I/O, ok, but most of it seems to take its
complexity from code specialisations for builtin types and a lot of error
handling code. A Cython reimplementation would take a lot of weight out of
this.
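
To illustrate where that verbosity comes from, here is a minimal sketch, in
plain Python rather than Cython, of the per-type specialisation pattern. The
names (save_int, DISPATCH, dumps) and the toy byte format are made up for
this example; they are not the real pickle protocol or code from _pickle.
In C, every call below would additionally need an explicit error check and
reference-count handling, and a Cython version would likely read much closer
to this than to the C code.

```python
# Hypothetical sketch only: a toy serialiser, not the pickle protocol.
import struct


def save_int(obj, out):
    # Specialised path for int: fixed-width 64-bit encoding.
    # struct.pack raises struct.error if the value does not fit.
    out.append(b"I" + struct.pack("<q", obj))


def save_str(obj, out):
    # Specialised path for str: length-prefixed UTF-8 bytes.
    data = obj.encode("utf-8")
    out.append(b"S" + struct.pack("<I", len(data)) + data)


def save_list(obj, out):
    # Specialised path for list: item count, then each item in turn.
    out.append(b"L" + struct.pack("<I", len(obj)))
    for item in obj:
        save(item, out)


# Dispatch on the exact type, one handler per supported builtin.
DISPATCH = {int: save_int, str: save_str, list: save_list}


def save(obj, out):
    handler = DISPATCH.get(type(obj))
    if handler is None:
        # The only explicit error handling needed at this level;
        # exceptions from the handlers propagate automatically.
        raise TypeError("cannot serialise " + type(obj).__name__)
    handler(obj, out)


def dumps(obj):
    out = []
    save(obj, out)
    return b"".join(out)
```

For example, dumps([1, "ab"]) walks the list handler, which in turn calls
the int and str handlers, while dumps(1.5) raises TypeError because float
has no handler in this toy table.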

Note that the approach won't be as simple as compiling pickle.py. _pickle
uses a lot of optimisations that only work efficiently at the C level. So
the idea would be to rewrite _pickle in Cython instead.
It's currently about 6500 lines of C. Even if we divide that by a rather
conservative factor of 3, we'd end up with some 2000 lines of Cython code,
all extracted straight from the existing C code. That sounds like less than
two weeks of work, possibly even with the marshal module included. In less
than a month of GSoC time, this could easily reach a point where it's
"close to the speed of what we have" and "fast enough", while being a lot
more accessible and maintainable, which would also make it easier to add
the extensions described in the PEP.

What do you think?

Stefan


