[pypy-dev] performance problems with Krakatau

Robert Grosse n210241048576 at gmail.com
Sun Apr 13 05:10:45 CEST 2014


Hi again,

I recently updated PyPy to a newer build
(pypy-c-jit-70483-2d8eaa5f5079-win32), and PyPy's performance is much
better now. I also addressed the previously mentioned issues in Krakatau,
so it is faster on both CPython and PyPy. However, I have noticed that
there are still some cases in which CPython outperforms PyPy.

I created a benchmark using the one class where I noticed the biggest
discrepancy:

https://github.com/Storyyeller/Krakatau.git commit
88a5a24deb3a8e6d0d92aca2052ea1db6a7274a0

You can run it via
python Krakatau\benchmark.py -path whatever\rt.jar
where you pass the path to your JRE's rt.jar as appropriate.

This benchmark is based on decompiling a single class,
sun/text/normalizer/Utility from the JRE. The benchmark decompiles the
class 40 times beforehand to warm up the JIT and then measures the time
taken to decompile it 200 times using time.time(). I recorded memory usage
manually via the Windows Task Manager, using Peak Working Set. I used the
Java 7u51 JRE, but I expect any version to give the same results, as I
doubt the class has changed much.
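
For readers who want the shape of that measurement without cloning the repository, here is a minimal sketch of the warm-up-then-time pattern described above. It is not the actual Krakatau\benchmark.py; run_benchmark and dummy_task are hypothetical names, and dummy_task only stands in for the real decompilation step.

```python
import time

def run_benchmark(task, warmup=40, iterations=200):
    """Call task() `warmup` times untimed, then time `iterations` calls."""
    for _ in range(warmup):
        task()  # untimed runs, giving the JIT a chance to compile hot loops
    start = time.time()
    for _ in range(iterations):
        task()
    return time.time() - start  # elapsed wall-clock seconds

def dummy_task():
    # Hypothetical stand-in for decompiling sun/text/normalizer/Utility.
    return sum(i * i for i in range(1000))

elapsed = run_benchmark(dummy_task)
print("%.3f seconds" % elapsed)
```

The same script can be run unchanged under both CPython and PyPy to compare the reported times.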

CPython: 202.8 seconds, 47.5 MB
PyPy: 284.3 seconds, 229.2 MB

The memory usage isn't too concerning to me, since I imagine that a JIT has
higher fixed overhead, but I find it strange that CPython also executes
faster for this class, since it is all pure-Python, CPU-bound computation.




On Thu, Jan 16, 2014 at 5:51 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:

> Hi Robert.
>
> This is going to be a long mail, so bear with me :)
>
> The first takeaway is that pypy warmup is atrocious (that's
> unimpressive, but you might be delighted to hear I'm working on it
> right now, or would be if I weren't writing this mail). It's quite a
> bit of work, so it might or might not make it to the next pypy release.
> We also don't know how well it'll work.
>
> The runs that I have now, when running 3 times in the same process
> look like this (this includes other improvements mentioned later):
>
> 46s 32s 29s (cpython always takes 29s)
>
> Now, this is far from ideal and we're working on making it better (in
> fact it's a very useful benchmark), but I can pinpoint some stuff that
> we will fix and some stuff we won't fix in the near future. One thing
> that I've already fixed today is loops over tuple when doing x in
> tuple (so tuple.__contains__).
>
> One of the problems with this code is that I don't think it's very
> efficient. While that's not a good reason to be slower than cpython,
> it gives you an upper bound on what can be optimized away. Example
> (from java/structuring.py):
>
> new = new if old is None else tuple(x for x in old if x in new)
>
> now note that this has a complexity of O(n^2), because you iterate
> over every element of one tuple and, for each of them, scan all of the
> elements of the other tuple.
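
One common way to avoid the quadratic scan is to build a set from the second tuple once, so that each membership test takes constant expected time. The sketch below assumes the elements are hashable; keep_common is a hypothetical helper name, not a function from Krakatau.

```python
def keep_common(old, new):
    """Return the elements of `old` that also occur in `new`, in `old`'s order.

    Mirrors: new if old is None else tuple(x for x in old if x in new)
    """
    if old is None:
        return new
    new_set = set(new)  # assumption: elements are hashable
    return tuple(x for x in old if x in new_set)

print(keep_common((1, 2, 3, 4), (4, 2, 9)))
```

With the set, the work is roughly proportional to len(old) + len(new) instead of len(old) * len(new).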
>
> Another example:
>
> return [x for x in zip(*map(self._doms.get, nodes)) if
> len(set(x))==1][-1][0]
>
> this creates quite a few lists, while all it wants to do is to grab
> the last one.
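
A sketch of the same computation without the intermediate filtered list: walk the zipped columns once and remember only the most recent column whose entries all agree. last_common is a hypothetical name, and the example data is made up; in the original expression the sequences come from map(self._doms.get, nodes).

```python
def last_common(sequences):
    """Return the value of the last position at which all sequences agree.

    Like the original expression, raises IndexError if no position matches.
    """
    found = False
    result = None
    for column in zip(*sequences):
        first = column[0]
        if all(item == first for item in column):
            result = first  # keep only the latest match, not a whole list
            found = True
    if not found:
        raise IndexError("no position where all sequences agree")
    return result

# Hypothetical data: two chains that share the prefix (0, 1).
doms = {"a": (0, 1, 2, 5), "b": (0, 1, 3, 6)}
print(last_common(map(doms.get, ["a", "b"])))
```

If the sequences are known to agree only on a common prefix, the loop could stop at the first mismatch, but that is an assumption about the data that the thread does not state.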
>
> Those tiny loops are found a bit everywhere. I think more consistent
> data structures will make it a lot faster on both CPython and PyPy.
>
> From our side, we'll improve generator iterators today and warmup some
> time in the not-so-near future.
>
> Speaking of which - memory consumption is absolutely atrocious. It's
> a combination of JIT using too much memory, generator iterators not
> being cleaned correctly *and* some bug that prevents JIT loops from
> being freed. We'll deal with all of it, give us some time (that said,
> the memory consumption *will* be bigger than cpython, but hopefully
> not by that much).
>
> I'm sorry I can't help you as much as I wanted.
>
> Cheers,
> fijal
>
>
> On Thu, Jan 16, 2014 at 10:50 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:
> > On Wed, Jan 15, 2014 at 7:20 PM, Robert Grosse <n210241048576 at gmail.com> wrote:
> >> Oh sorry, I forgot about that.
> >>
> >> You need to find the rt.jar from your Java installation and pass the path on
> >> the command line. For example, if it's located in C:\Program
> >> Files\Java\jre7\lib, you could do
> >> python -i Krakatau\decompile.py -out temp asm-debug-all-4.1.jar -path
> >> "C:\Program Files\Java\jre7\lib\rt.jar"
> >> Obviously on Linux it will be somewhere else. It shouldn't really matter
> >> which version of Java you have since the standard library is pretty stable.
> >
> > Thanks, I'm looking into it. Would you mind if we add Krakatau as a
> > benchmark for our nightlies?
>