[pypy-dev] cpyext performance

Stefan Behnel stefan_ml at behnel.de
Tue Jul 3 19:56:31 CEST 2012


Amaury Forgeot d'Arc, 03.07.2012 18:26:
> 2012/7/3 Stefan Behnel
>> BTW, are PyObject structures currently cached in a free-list somewhere?
>> That would be really helpful for the iteration performance.
> 
> No optimization of any kind has been done in cpyext (it's difficult enough
> to get it right...)

I'm sure it is.


> A freelist would be a nice thing, but there would still be the cost of
> attaching the PyObject to the pypy w_object.

Any reduction in the cost of passing and cleaning up objects would
dramatically improve the overall performance of the interface.
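
To make the free-list idea concrete, here is a minimal sketch in plain
Python. It is not cpyext code: PyObjStruct, FreeList, acquire() and
release() are hypothetical names made up for illustration. It only shows
that recycling structures avoids repeated allocation, while the cost of
attaching each structure to its w_object is still paid on every acquire.

```python
class PyObjStruct:
    """Hypothetical stand-in for a C-level PyObject structure."""
    __slots__ = ("refcnt", "w_obj")

    def __init__(self):
        self.refcnt = 0
        self.w_obj = None


class FreeList:
    """Recycles released structures instead of allocating new ones."""

    def __init__(self, max_size=8):
        self._free = []
        self._max_size = max_size
        self.allocations = 0  # counts real allocations, for illustration

    def acquire(self, w_obj):
        if self._free:
            struct = self._free.pop()      # reuse a cached structure
        else:
            struct = PyObjStruct()         # fall back to a fresh allocation
            self.allocations += 1
        struct.refcnt = 1
        struct.w_obj = w_obj               # attaching still happens each time
        return struct

    def release(self, struct):
        struct.refcnt = 0
        struct.w_obj = None                # detach so the object can be collected
        if len(self._free) < self._max_size:
            self._free.append(struct)      # keep it around for the next acquire


# Iterating over three items needs only one real allocation.
pool = FreeList()
for item in ["a", "b", "c"]:
    struct = pool.acquire(item)
    pool.release(struct)
```

In this toy loop the structure allocated for the first item is reused for
the other two, which is the pattern that matters for iteration.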


> Maybe we should use a weak dictionary to cache the PyObject structure.
> This already exists for objects defined and created from C...

That would be really helpful. In particular, it would solve one of the most
annoying problems that extensions currently have: even if you keep a Python
reference to an object, e.g. in a list, its PyObject structure will die
once the last C reference to it is gone. That is really hard to work around
in some cases. It's very common to keep e.g. a Python list (or set) of byte
strings and pass their char* buffer pointers into a C library. That doesn't
currently work with cpyext.


>>>> OverflowError: array too large
>>>
>>> Looks like a ctypes bug to me. Which OS, Python, etc. are you using?
>>
>> Ah - totally, sure. I accidentally ran the system Py2.5 on 64bit Linux.
>> Running it with Py2.7 fixes this specific problem, thanks for the hint!
>> Although it now names the extension module "nbody.so" instead of
>> "nbody.pypy-19.so". Go figure ...
>>
>> After figuring out that I was supposed to enable cpyext manually and
>> running strace to see what extension module name it is actually looking
>> for, I failed to make it load the module it just built regardless of how I
>> named it, so I tried building it within the same run as follows:
>>
>>   pypy/bin/py.py --withmod-cpyext -c 'import setup; import nbody; \
>>                                       nbody.test_nbody(1)'  build_ext -i
> 
> Ah, but this won't work!
> py.py runs on top of CPython, so the PyString_AsString symbol is already
> defined by your CPython interpreter!

Right. I keep forgetting that. This inherent indirection in PyPy makes
things seriously complicated. (And no, that's not a good thing.)

It would be helpful if it printed an error message giving a hint of why it
failed, instead of just stating that it failed to load the extension (I can
see that it failed, dammit!).


> There is a workaround though: compile your extension module with
>    python2.7 pypy/module/cpyext/presetup.py setup.py build_ext -i
> 
> presetup.py will patch distutils, and create a module "nbody.pypy-19i.so"
> (note the i) which works on top of an *interpreted* pypy.

Is there a reason why an interpreted PyPy cannot always do this? I mean, it
can't work without this, can it?


> Among the hacks, all symbols are renamed: #define PyString_AsString
> PyPyString_AsString.
> 
> Then this should work:
>    pypy/bin/py.py --withmod-cpyext -c "import nbody"

Ok, that did the trick.

This should be in a tutorial somewhere. Or maybe I should just dump it into
a blog post.


> *very* slowly of course, but I was able to debug pygames this way!

The problem is not so much that it's generally slow but that the
performance characteristics of the Python code are likely way different
than those of the translated C code. That's certainly the case for Cython
code: running cProfile over the Python code, running it over the compiled
module, and running callgrind over it often yield totally different
results. That's why I would prefer running this through callgrind instead
of Python+profile (I noticed that cProfile doesn't work either).

Is there a way to get readable debugging symbols in a translated PyPy that
would tell me what is being executed?

Stefan


