[Python-Dev] Cython for cPickle?

Stefan Behnel stefan_ml at behnel.de
Thu Apr 19 23:08:20 CEST 2012


Matt Joiner, 19.04.2012 16:13:
> Personally I find the unholy product of C and Python that is Cython to be
> more complex than the sum of the complexities of its parts. Is it really
> wise to be learning Cython without already knowing C, Python, and the
> CPython object model?

The main obstacle that I regularly see for users of the C-API is actually
reference counting and an understanding of what borrowed references and
owned references imply in a given code context. In fact, I can't remember
seeing any C extension code getting posted on Python mailing lists (core
developers excluded) that has no ref-counting bugs or at least a severe
lack of error handling. Usually, such code is also accompanied by a comment
that the author is not sure if everything is correct and asks for advice,
and that's rather independent of the functional complexity of the code
snippet. OTOH, I've also seen a couple of really dangerous code snippets
that posters apparently meant to show off, so not everyone is aware of
these obstacles.
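
To make the bookkeeping concrete, here is a minimal sketch, assuming CPython
(sys.getrefcount() is an implementation detail and the absolute numbers vary
between versions, so only the differences are meaningful). It cannot show a
borrowed reference, which is purely a C-level convention, but it does show
the counter that C-API code has to keep correct by hand with
Py_INCREF/Py_DECREF. The variable names are made up for illustration.

```python
import sys

# CPython-specific: sys.getrefcount() reports the reference count that
# C extension code has to maintain manually.
payload = object()
baseline = sys.getrefcount(payload)

# Storing the object in a container makes the container an owner:
# it holds one additional reference to the object.
holder = [payload]
with_owner = sys.getrefcount(payload)

# Dropping the container releases that reference again.
del holder
after_release = sys.getrefcount(payload)

print(with_owner - baseline, after_release - baseline)
```

In Python (and Cython) code this accounting happens automatically; in
hand-written C, forgetting the increment risks a crash and forgetting the
decrement leaks memory.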

Also, C code written by inexperienced programmers tends to be fairly
inefficient because they simply do not know what impact some convenience
functions have. So they tend to optimise prematurely in places where they
feel more comfortable, but that can never make up for the overhead that
simple and very convenient-looking C-API functions introduce in other
places. Value packing comes to mind.

So, from my experience, there is a serious learning curve beyond knowing C,
right from the start when trying to work on C extensions, including
CPython's own code, because the C-API is far from trivial.

And that's the kind of learning curve that Cython tries to lower. It makes
it substantially easier to write correct code, simply by letting you write
Python code instead of C plus C-API code. And once it works, you can start
making it explicitly faster by applying "I know what I'm doing" schemes to
proven hot spots or by partially rewriting it. And if you do not know yet
what you're doing, then *that's* where the learning curve begins. But by
then, your code is basically written, works more or less and can be
benchmarked.
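
As a rough illustration of that workflow, here is a sketch in plain Python.
The function names and the workload are hypothetical, and the rewritten
version merely stands in for whatever targeted optimisation (static typing
in Cython, for example) you would apply to a proven hot spot. The point is
the order of steps: working baseline first, then a checked and timed
rewrite.

```python
import timeit


def sum_of_squares(n):
    """Straightforward first version, written for correctness."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_fast(n):
    """Rewritten hot spot: closed form for 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


# Check the rewrite against the known-good version before timing it.
for n in (0, 1, 2, 10, 1000):
    assert sum_of_squares(n) == sum_of_squares_fast(n)

# Time both versions on the same input.
slow = timeit.timeit(lambda: sum_of_squares(10_000), number=100)
fast = timeit.timeit(lambda: sum_of_squares_fast(10_000), number=100)
print(f"baseline: {slow:.4f}s  rewritten: {fast:.4f}s")
```

Keeping the simple version around as a reference makes it cheap to verify
each later optimisation step.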


> While code generation alleviates the burden of tedious languages, it's also
> infinitely more complex, makes debugging very difficult and adds to
> prerequisite knowledge, among other drawbacks.

You can use gdb for source level debugging of Cython code and cProfile to
profile it. Try that with C-API code.
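
For reference, a minimal cProfile session looks roughly like the sketch
below. It profiles plain Python functions (the workload and names are
hypothetical). For a compiled Cython module the driver code would be the
same, assuming the module was built with the "profile=True" compiler
directive so that its functions show up in the report.

```python
import cProfile
import io
import pstats


def parse_records(lines):
    """Hypothetical workload: split and convert comma-separated integers."""
    return [[int(field) for field in line.split(",")] for line in lines]


def total(lines):
    """Sum every field of every record."""
    return sum(sum(row) for row in parse_records(lines))


lines = ["1,2,3,4"] * 20_000

# Collect profiling data only around the call we care about.
profiler = cProfile.Profile()
profiler.enable()
result = total(lines)
profiler.disable()

# Render the ten most expensive entries, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```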

Stefan


