[Python-Dev] reference counting in Py3K
jcarlson at uci.edu
Wed Sep 7 09:57:36 CEST 2005
Guido van Rossum <guido at python.org> wrote:
> On 9/6/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > A better plan would be to build something akin to
> > Pyrex into the scheme of things, so that all the
> > refcount/GC issues are taken care of automatically.
> That sounds exciting. I have to admit that despite hearing many
> enthusiastic reviews, I've never used it myself -- in fact I've
> written very little C code in the last few years, and zero new
> extension modules. (Lots of Java, but that's another story. :-)
Here's a perspective "from the trenches" as it were.
I've been writing quite a bit of code, initially all in Python (27k
lines in the last year or so). It worked reasonably well and reasonably
fast, but not fast enough. I needed a 25x increase in performance, which would
have been easily attainable if I were to rewrite everything in C, but
writing a module in pure C is a bit of a pain (as others can attest), so
I gave Pyrex a shot (after scipy.weave.inline, ick).
Initial versions ran around 2-3x as fast as pure Python. With various
tricks, we are now running 75-100x faster in the pure Pyrex portions,
with another 2-3x improvement possible (even using the VC6 compiler on
Windows and old versions of gcc on Linux, talk about multi-platform
With experience comes wisdom. I write new functionality that needs to
be fast in pure C, wrapping it with Pyrex as necessary (which is quite
simple), and make it all work with Python.
> I expect that many standard extensions could benefit from a rewrite in
> Pyrex, although this might take a lot of work and in some cases not
> necessarily result in better code (_tkinter comes to mind -- though I
> don't really know why this would be). So this shouldn't be the goal
> (yet). Instead, we should encourage folks to write *new* extensions
> using Pyrex.
I'm not sure this is necessarily desirable. In my limited experience,
one starts doing a line-by-line translation, getting Python objects as
variables, etc. Then one starts predefining C variables and working
with them, increasing speed by some measurable amount. Then one starts
thinking about the data structures that are being passed (lists of lists,
dictionary of lists, lists of dictionaries, ...), at which point one
starts digging into PyList_GetItem, etc., manual incref/decref calls, ..., and
one's code starts taking on the ugliness of C modules, without the braces and
Offering it up as a standard library module: cool, +1. Give people one
of the best tools for wrapping C code and writing high-performance
Encouraging its use for the writing of new extension modules: ick, -1.
Writing pretty yet high-performing Pyrex is an art that I'm not sure
anyone can master.
Perhaps a bit into the future, extending import semantics to notice .pyx
files, compare their checksum against a stored md5 in the compiled
.pyd/.so, and automatically recompiling them if they (or their includes)
have changed: +10 (I end up doing this kind of thing by hand with
phantom auto-build modules).
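A minimal sketch of the checksum comparison described above, under some
assumptions of mine: the digest is kept in a small sidecar file rather than
inside the compiled .pyd/.so, the helper names (source_digest, needs_rebuild,
record_build) are hypothetical, and the actual Pyrex/C compile step is left
out entirely. It only shows the "has the .pyx or any of its includes
changed?" decision, not an import hook.

```python
import hashlib
from pathlib import Path


def source_digest(paths):
    """Return one MD5 hex digest covering the names and contents of all files."""
    md5 = hashlib.md5()
    # Sort so the digest does not depend on the order the caller lists files in.
    for path in sorted(Path(p) for p in paths):
        md5.update(path.name.encode("utf-8"))
        md5.update(path.read_bytes())
    return md5.hexdigest()


def needs_rebuild(sources, stamp_path):
    """True if no digest was recorded yet or the sources no longer match it."""
    stamp = Path(stamp_path)
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != source_digest(sources)


def record_build(sources, stamp_path):
    """Store the current digest; call this after a successful compile."""
    Path(stamp_path).write_text(source_digest(sources))
```

A hypothetical auto-build module would call needs_rebuild() with the .pyx
file plus its includes before importing the extension, run the compiler if
it returns True, and then call record_build().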