[pypy-dev] towards more parallelization in the tracer/optimizer
arigo at tunes.org
Fri Mar 23 08:15:49 CET 2012
On Thu, Mar 15, 2012 at 22:35, Timo Paulssen <timo at wakelift.de> wrote:
> If the raw trace - or a minimally optimized version of it - can be run
> directly, why not run it once or thrice while waiting for the optimizer to
> finish optimizing the trace fully?
Yes, it's something we already thought of, but there are a lot of
issues along the way. One is that tracing itself takes time too (not
just optimizing) and that cannot be done in parallel. Also, there are
harder issues about tweaking the GC: right now it "parallelizes"
trivially because it assumes the GIL, but that's no longer true if we
want to run the optimizer-and-backend part truly in parallel.
Note that the speed differences are larger than you seem to assume: in
very rough orders of magnitude, if the interpreter takes 1 unit of time
to run one iteration of a loop and the JITted trace takes 0.1, then I
think that tracing takes 100 or 1000, and so does optimizing.
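To make these rough numbers concrete, here is a back-of-the-envelope
sketch in Python. The constants are only the guesses given above (taking
100 rather than 1000), and the helper name is made up for illustration;
none of this is measured or taken from PyPy itself.

```python
# Rough cost model, in units of "one interpreted loop iteration".
# All values are very rough orders of magnitude, not measurements.
INTERP = 1.0      # interpreter, per iteration
JITTED = 0.1      # optimized trace, per iteration
TRACE = 100.0     # one-off cost of tracing (could just as well be 1000)
OPTIMIZE = 100.0  # one-off cost of optimizing (same caveat)


def break_even_iterations(one_off_cost, slow=INTERP, fast=JITTED):
    """Iterations needed before a one-off cost is paid back (hypothetical helper)."""
    return one_off_cost / (slow - fast)


if __name__ == "__main__":
    # Iterations of the compiled loop needed to recover tracing + optimizing.
    print(round(break_even_iterations(TRACE + OPTIMIZE), 1))
```

Under these assumptions the compiled loop has to run a couple of hundred
iterations before tracing and optimizing pay for themselves, and about
ten times as many if each step costs 1000 instead of 100.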
But maybe more importantly, there is the following issue. Remember
that typically, in real-life cases, we need several tracings to make
one loop performant, each one tracing a different path, until all
common paths are covered. As long as not all common paths are traced,
running the partial trace is really slow: every time it hits a
not-yet-compiled branch, we need to fall back to the interpreter. So
it's unclear that we would get an overall speed-up over today's