[pypy-dev] PyPy as part of a larger, bundled project?

Stefan Behnel stefan_ml at behnel.de
Wed Apr 11 09:24:52 CEST 2012


Leo Trottier, 11.04.2012 02:23:
> A number of Python applications (e.g. http://calibre-ebook.com/,
> http://www.psychopy.org/ ...
> http://en.wikipedia.org/wiki/List_of_Python_software#Applications) are
> deployed together with the libraries and interpreter that they will use.
> 
> Often, these applications are larger, and can end up performing operations
> that are computationally intensive. In the case of Calibre, e.g., large
> batch conversions from one book format to another can take more than an
> hour (for sufficiently large batches).

Are you sure the bottleneck is in Python code here? PyPy won't magically
speed up image conversions for you, for example. You can expect it to be
faster for HTML processing with its bundled html5lib, though, and maybe
also for PDF generation, for which it seems to use pyPDF. However, for XML
processing, which I would expect to be a substantial part of the work when
converting between e-book formats, it appears to be using lxml, which is
built on the C libraries libxml2 and libxslt. You can't beat that with PyPy.

Calibre likely won't run in PyPy directly, as the GUI uses PyQt4 and it also
uses extension modules for plugins. So I'm rather confident that it will
not be easy to make it work at all with PyPy, or even to make any of the
more interesting conversion pipelines work entirely in PyPy.

You can still give it a try, though. Maybe you can manage to get at least
an HTML-to-PDF pipeline working by forking off an external PyPy process and
porting the libraries.
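
As a rough sketch of the external-process idea, assuming the work can be
handed over as JSON on stdin and read back from stdout: the worker source,
the function name and the "pypy" binary name below are all hypothetical
and not taken from Calibre. The default interpreter is the current one, so
the sketch runs even where PyPy is not installed.

```python
import json
import subprocess
import sys

# Hypothetical worker: reads a JSON list of strings from stdin and writes
# the upper-cased strings back as JSON. A real worker would run the
# pure-Python conversion step (e.g. the HTML processing) instead.
WORKER_SOURCE = (
    "import json, sys\n"
    "items = json.load(sys.stdin)\n"
    "json.dump([s.upper() for s in items], sys.stdout)\n"
)


def run_in_external_interpreter(items, interpreter=sys.executable):
    """Send items to a child interpreter and return its JSON reply.

    Pass interpreter="pypy" (assuming such a binary is on PATH) to run
    the worker under PyPy instead of the current interpreter.
    """
    proc = subprocess.run(
        [interpreter, "-c", WORKER_SOURCE],
        input=json.dumps(items),   # sent to the child's stdin
        capture_output=True,       # collect stdout and stderr
        text=True,                 # exchange str, not bytes
        check=True,                # raise if the child exits non-zero
    )
    return json.loads(proc.stdout)
```

Only the libraries imported by the worker have to run under PyPy this way;
the GUI and the extension modules stay in the CPython parent process.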

However, you seem to be more interested in making it run fast than in
making it run in PyPy. Your time may be better invested in pushing more
parallel processing into the right places. You mentioned batch processing;
that sounds like the bulk of the workload is trivially parallelisable. And
maybe a bit of profiling against your specific processing needs would hint
at a specific bottleneck that's easy to fix?
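
To illustrate both suggestions, here is a minimal sketch using only the
standard library. convert_one() is a hypothetical stand-in for converting a
single book, not Calibre's API; convert_batch() spreads a batch over worker
processes, and profile_call() runs one conversion under cProfile to show
where the time goes.

```python
import cProfile
import io
import pstats
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def convert_one(text):
    # Hypothetical stand-in for converting a single book.
    return text.upper()


def convert_batch(items, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Convert all items in parallel and return results in input order.

    ProcessPoolExecutor sidesteps the GIL for CPU-bound work; passing
    ThreadPoolExecutor is only useful for quick checks or I/O-bound steps.
    """
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(convert_one, items))


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

When using the process pool from a script, call convert_batch() from under
an `if __name__ == "__main__":` guard, since some platforms start workers
by re-importing the main module. Profile a single representative
conversion first; if most of the time is spent inside C libraries, neither
PyPy nor more Python-level tuning will help much.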

Stefan


