
Actually, my motivation was not to get Calibre to be faster -- I use it only occasionally. All I knew was that Calibre was an application that (1) is built on Python, (2) ships with its Python interpreter baked into the distribution, and (3) seems to perform a number of operations somewhat slowly.

It seems that whenever (1) and (2) hold, there is a potential opportunity for the wide-scale deployment of PyPy, taking it from being used on a handful of servers and enthusiasts' computers to being deployed in thousands or tens of thousands of end-user applications. PyPy *might* not immediately lead to an increase in performance (though one suspects that in general it would), but the mere fact that it's available to the application developer could inspire new development approaches that take advantage of PyPy's features. It could also serve as a practical test-bed for deploying PyPy and for evaluating tweaks to it.

Leo

On Wed, Apr 11, 2012 at 12:24 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Leo Trottier, 11.04.2012 02:23:
A number of Python applications (e.g. http://calibre-ebook.com/, http://www.psychopy.org/ ... http://en.wikipedia.org/wiki/List_of_Python_software#Applications) are deployed together with the libraries and interpreter that they will use.
Often, these applications are larger, and can end up performing operations that are computationally intensive. In the case of Calibre, e.g., large batch conversions from one book format to another can take more than an hour (for sufficiently large batches).
Are you sure the bottleneck is in Python code here? PyPy won't magically speed up image conversions for you, for example. You can expect it to be faster for HTML processing with its bundled html5lib, though, and maybe also PDF generation, which it seems to be using pyPDF for. However, for XML processing, which I would expect to be a substantial part of the work when converting between e-book formats, it appears to be using lxml - you can't beat that with PyPy.
Calibre likely won't run in PyPy directly, as the GUI uses PyQt4 and it also uses extension modules for plugins. So I'm rather confident that it will not be easy to make it work at all with PyPy, or even to make any of the more interesting conversion pipelines work entirely in PyPy.
You can still give it a try, though. Maybe you can manage to get at least an HTML-to-PDF pipeline working by forking off an external PyPy process and porting the libraries.
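As a rough sketch of what "forking off an external PyPy process" could look like: the helper below runs a piece of code under a separate interpreter and exchanges data with it as JSON over stdin/stdout. The function name, the `WORKER` snippet and the JSON protocol are all made up for illustration and are not anything Calibre provides. It also assumes a modern Python 3 on the calling side and a `pypy` executable on PATH.

```python
import json
import subprocess


def run_in_external_interpreter(interpreter, code, payload):
    """Run `code` under another interpreter, exchanging JSON over stdin/stdout.

    `interpreter` is the executable to launch (e.g. "pypy", assumed to be on PATH).
    """
    completed = subprocess.run(
        [interpreter, "-c", code],
        input=json.dumps(payload),   # sent to the child's stdin
        capture_output=True,
        text=True,
        check=True,                  # raise if the child exits with an error
    )
    return json.loads(completed.stdout)


# Hypothetical worker: stands in for a pure-Python conversion step
# (e.g. HTML processing) that would have been ported to run under PyPy.
WORKER = (
    "import json, sys\n"
    "doc = json.load(sys.stdin)\n"
    "json.dump({'length': len(doc['html'])}, sys.stdout)\n"
)
```

A call would then look like `run_in_external_interpreter("pypy", WORKER, {"html": "<p>hi</p>"})`. The main process keeps the GUI and the extension modules in CPython, and only the pure-Python step crosses the process boundary.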
However, you seem to be more interested in making it run fast than in making it run in PyPy. Your time may be better invested in pushing more parallel processing into the right places. You mentioned batch processing, which suggests that the bulk of the workload is trivially parallelisable. And maybe a bit of profiling against your specific processing needs would hint at a specific bottleneck that's easy to fix?
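To illustrate the "trivially parallelisable" point, here is a minimal sketch that runs one external conversion process per book, several at a time. It assumes Calibre's `ebook-convert` command-line tool is on PATH. The helper names `ebook_convert_command` and `convert_batch` are invented for this example, and the default of four workers is arbitrary.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ebook_convert_command(src, dst):
    # Assumption: Calibre's `ebook-convert` command-line tool is on PATH.
    return ["ebook-convert", src, dst]


def convert_batch(jobs, build_command=ebook_convert_command, max_workers=4):
    """Run one external process per (src, dst) pair, several at a time.

    Returns a dict mapping each source path to the exit code of its process.
    """
    def run_one(job):
        src, dst = job
        completed = subprocess.run(build_command(src, dst))
        return src, completed.returncode

    # Threads are enough here: each one only waits on a child process,
    # so the actual conversions run in parallel outside this interpreter.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, jobs))
```

Something like `convert_batch([("a.epub", "a.mobi"), ("b.epub", "b.mobi")])` would then convert both books concurrently. A non-zero exit code in the result marks a book whose conversion failed.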
Stefan
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev