[pypy-dev] go benchmark

Mon Aug 15 20:07:02 CEST 2011

Hi,
I've been looking a bit at the go benchmark, which is sometimes
significantly slower on jit-short_from_state.

What's happening with the default trace_limit=12000 on trunk is that
the loop in random_playout is compiled into a loop containing 209
guards and the loop in computer_move is compiled into a loop with 390
guards. After that a lot of bridges are created. So I'm guessing we
have some large loops here with many different paths, most of which
are taken often.

I patched go.py to print the number of loops, bridges and guards that
were created in each iteration, using pypyjit.set_compile_hook. It also
prints a message whenever a loop with more than 200 guards is produced:

    http://paste.pocoo.org/show/458824/
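
The counting described above could look roughly like the sketch below.
It is not the actual patch: TraceStats, record() and summary() are
hypothetical names, and the sketch assumes the hook has already reduced
each compiled trace to a kind ("loop" or "bridge") and a list of
operation names. Wiring it up to pypyjit.set_compile_hook is left out
on purpose, since the arguments passed to the hook differ between PyPy
versions.

```python
class TraceStats(object):
    """Hypothetical helper: counts compiled loops, bridges and guards."""

    def __init__(self, guard_threshold=200):
        self.guard_threshold = guard_threshold
        self.loops = 0
        self.bridges = 0
        self.guards = 0
        self.big_loops = []  # guard counts of loops above the threshold

    def record(self, kind, op_names):
        # Assumption: guard operations have names starting with "guard".
        n_guards = sum(1 for name in op_names if name.startswith("guard"))
        if kind == "loop":
            self.loops += 1
            if n_guards > self.guard_threshold:
                self.big_loops.append(n_guards)
        elif kind == "bridge":
            self.bridges += 1
        else:
            raise ValueError("unknown trace kind: %r" % (kind,))
        self.guards += n_guards
        return n_guards

    def summary(self):
        return "loops=%d bridges=%d guards=%d" % (
            self.loops, self.bridges, self.guards)


if __name__ == "__main__":
    stats = TraceStats()
    # Made-up traces, only to exercise the counters.
    stats.record("loop", ["int_add", "guard_true", "guard_class"])
    stats.record("bridge", ["guard_false", "int_sub"])
    print(stats.summary())
```

Calling summary() once per benchmark iteration and resetting the
counters would give per-iteration numbers.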

Could this be fixed by...
... no longer producing bridges once a fixed number of them have been
produced for the same root loop?
... never producing a bridge if its parent trace has a large number of
guards that have failed more than trace_eagerness/2 times?
... falling back on using a method jit in the above cases ;)
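
To make the first two ideas concrete, here is a minimal sketch of such
a policy in plain Python. It is only a model of the bookkeeping, not
code from the PyPy JIT: BridgePolicy and all of its methods are
hypothetical names, and the default values for trace_eagerness,
max_bridges_per_loop and max_hot_guards are placeholders that would
have to be tuned.

```python
class BridgePolicy(object):
    """Hypothetical model of the two bridge-limiting heuristics."""

    def __init__(self, trace_eagerness=200, max_bridges_per_loop=50,
                 max_hot_guards=20):
        self.trace_eagerness = trace_eagerness
        self.max_bridges_per_loop = max_bridges_per_loop
        self.max_hot_guards = max_hot_guards
        self.bridges_per_loop = {}  # root loop id -> bridges compiled
        self.guard_failures = {}    # trace id -> {guard id: failures}

    def guard_failed(self, trace_id, guard_id):
        counts = self.guard_failures.setdefault(trace_id, {})
        counts[guard_id] = counts.get(guard_id, 0) + 1

    def hot_guards(self, trace_id):
        # Guards that have failed more than trace_eagerness/2 times.
        limit = self.trace_eagerness / 2.0
        counts = self.guard_failures.get(trace_id, {})
        return sum(1 for n in counts.values() if n > limit)

    def should_compile_bridge(self, root_loop, parent_trace):
        # Idea 1: fixed budget of bridges per root loop.
        if self.bridges_per_loop.get(root_loop, 0) >= self.max_bridges_per_loop:
            return False
        # Idea 2: give up when the parent trace has too many hot guards.
        if self.hot_guards(parent_trace) > self.max_hot_guards:
            return False
        return True

    def bridge_compiled(self, root_loop):
        n = self.bridges_per_loop.get(root_loop, 0)
        self.bridges_per_loop[root_loop] = n + 1
```

What happens when should_compile_bridge() says no (keep running in the
interpreter, or the method-jit fallback from the third idea) is left
open here.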

Is this something I should investigate on jit-short_from_state, or
could we consider it a separate problem and merge jit-short_from_state
anyway? The only connection to jit-short_from_state I can think of is
that bridges have probably become somewhat more expensive to produce.

-- 
Håkan Ardö