[pypy-dev] Threaded interpretation (was: Re: compiler optimizations: collecting ideas)

Paolo Giarrusso p.giarrusso at gmail.com
Sat Jan 3 05:59:57 CET 2009


On Fri, Jan 2, 2009 at 14:23, Armin Rigo <arigo at tunes.org> wrote:
> Hi Paolo,

> On Thu, Dec 25, 2008 at 12:42:18AM +0100, Paolo Giarrusso wrote:
>> If I want to try something without refcounting, I guess I'd turn
>> to PyPy, but don't hold your breath for that. The fact that indirect
>> threading didn't work, that you're 1.5-2x slower than CPython, and
>> that you store locals in frame objects all show that the
>> abstraction overhead of the interpreter is too high.

> True, but a 1.5x slowdown is not a big deal in many applications; the
> blocker is mostly elsewhere.

Let's say 1.5x * 2x = 3x, since CPython itself is not as fast as it
could be, because of refcounting for instance. The 2x is taken from PyVM
performance reports (see http://www.python.org/dev/implementations/).
And for PyPy, a slower interpreter means slower JIT output, doesn't it?
See below.

Also, according to the CS literature, interpreter performance makes
more of a difference than a JIT. According to the paper about efficient
interpreters that I keep mentioning, an inefficient interpreter is
1000x slower than C, while an efficient one is only 10x slower.
Python is not that slow, but what you wrote about Psyco seems to imply
that there is still lots of room for improvement:

"You might get 10x to 100x speed-ups. It is theoretically possible to
actually speed up this kind of code up to the performance of C
itself."

At some point, I should repeat this comparison with the OCaml interpreter:
  http://mail.python.org/pipermail/python-list/2000-December/063212.html
to see how much faster it got :-).

> And on the other hand, we've been working
> on the JIT generator -- for a while now, so I cannot make any promises
> -- and the goal is to turn this slowish interpreter into a good JITting
> one "for free".  As far as I know, this cannot be done so easily if you
> start from a lower-level interpreter like CPython.  This is our "real"
> goal somehow, or one of our real goals: a JITting virtual machine for
> any language that we care to write an interpreter for.

That's amazing - I knew that, but your explanation is much nicer,
since it puts the emphasis on the right thing.
From a native interpreter you can only do code copying, which removes
dispatch costs but is still suboptimal.
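
To make "dispatch costs" concrete, here is a minimal sketch in Python.
It is only an analogy: the three-opcode stack machine and all names in
it are hypothetical, and Python cannot express the computed gotos a
real threaded interpreter uses. The first runner decodes every opcode
through a central if/elif chain; the second translates the program once
into a list of handlers, in the spirit of direct threading, so that no
decoding is left at run time. Code copying would go one step further
and concatenate the handler bodies themselves.

```python
# Hypothetical three-opcode stack machine, for illustration only.
PUSH, ADD, MUL = 0, 1, 2


def run_switch(code):
    """Central dispatch: every opcode is decoded by an if/elif chain."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        else:
            raise ValueError("unknown opcode %r" % (op,))
    return stack.pop()


def op_add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)


def op_mul(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)


def thread(code):
    """Translate once: replace each opcode by its handler function."""
    handlers, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            # Bind the operand now so the handler needs no decoding later.
            handlers.append(lambda stack, value=code[pc + 1]: stack.append(value))
            pc += 2
        elif op == ADD:
            handlers.append(op_add)
            pc += 1
        elif op == MUL:
            handlers.append(op_mul)
            pc += 1
        else:
            raise ValueError("unknown opcode %r" % (op,))
    return handlers


def run_threaded(handlers):
    """No opcode decoding left: just call one handler after another."""
    stack = []
    for handler in handlers:
        handler(stack)
    return stack.pop()


program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL]  # computes (2 + 3) * 4
```

Both run_switch(program) and run_threaded(thread(program)) evaluate the
same program; only the way the next operation is found differs.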

And well, the current overhead can be removed with further
improvements to the interpreter and/or the RPython translator. And it
is important, for two reasons:
1) an interpreter is still needed to profile the code; experience with
Java showed that having only a JIT gives too much latency, and people
expect even less latency from Python.
2) I guess partially different translators are used, but since the
compiler and the interpreter are built from the same source, part of
the 3x slowdown would also affect the JIT-ed code.
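
Point 1 can be sketched as follows. This is a toy model under stated
assumptions, not how PyPy or any Java VM does it: TieredFunction,
HOT_THRESHOLD and the two callables are hypothetical names. Calls start
in the interpreter immediately, and the expensive compilation step is
paid only once a function has proven to be hot.

```python
HOT_THRESHOLD = 3  # hypothetical: interpreted calls before compiling


class TieredFunction(object):
    """Run interpreted at first; switch to compiled code once hot."""

    def __init__(self, interpret, compile_):
        self.interpret = interpret  # slow path, available immediately
        self.compile_ = compile_    # returns a faster callable, costs time
        self.calls = 0
        self.compiled = None

    def __call__(self, *args):
        if self.compiled is not None:
            return self.compiled(*args)
        self.calls += 1             # the interpreter doubles as profiler
        if self.calls >= HOT_THRESHOLD:
            self.compiled = self.compile_()
        return self.interpret(*args)


events = []


def slow_increment(x):
    events.append("interp")
    return x + 1


def make_fast_increment():
    def fast_increment(x):
        events.append("jit")
        return x + 1
    return fast_increment


increment = TieredFunction(slow_increment, make_fast_increment)
results = [increment(1) for _ in range(5)]
```

After the five calls above, events records which tier served each call.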

In any case, even if it may be unclear, I do like your plans.
Believe me, lack of time is the only thing that has prevented me so
far from actually joining you and coding.

>> And the original idea was to show that real multithreading (without a
>> global interpreter lock) cannot be done in Python just because of the
>> big design mistakes of CPython.

> I disagree on this point.  Jython shows that without redesign of the
> language, they can implement a real multithreading model without a GIL.

Ok, first I have to rephrase what I said. Python in itself is a Good
Thing. BUT, there are some big design mistakes about some semantic
details of __del__, which make a difference for code using files
without calling close() for instance. To be polite, those semantics
are totally brain-dead, unless one worships reference counting and
considers GC as the devil.
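
A minimal sketch of the file case (the helper names and the scratch
file are hypothetical). The first function never calls close(), so the
file is closed only when the file object is finalized: immediately with
reference counting, at some unspecified later time with a tracing GC.
The second does not depend on finalization at all.

```python
import os
import tempfile


def read_relying_on_del(path):
    # No explicit close(): the file is closed only when the file object
    # is finalized, which is prompt only under reference counting.
    return open(path).read()


def read_with_explicit_close(path):
    # The with block closes the file on any implementation.
    with open(path) as f:
        data = f.read()
    return data, f.closed


# Scratch file created only for this example.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as out:
    out.write("hello")

first = read_relying_on_del(path)
second, was_closed = read_with_explicit_close(path)
os.remove(path)
```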

Having clarified that, I can state that indeed, Jython may only show
the opposite of what you say; indeed I like to say that they didn't
bother to be bug-to-bug compatible with Python _specification bugs_,
and that's why it seems they proved your point.

Need I point out that reference cycles can't be reclaimed in Python if
any involved object implements __del__?
I know the Python gc module lets you work around that, but only in
really clumsy ways. I was going to bash the Python specs a lot more, but
I'll spare you for now :-).
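
A minimal sketch of the cycle problem, using a hypothetical Node class.
Reference counting alone never frees the two objects. What happens when
the cycle collector runs depends on the interpreter version: with the
behaviour described above the objects are parked in gc.garbage and
their __del__ never runs, while newer CPython releases finalize them.
The code records both outcomes rather than assuming one.

```python
import gc

finalized = []


class Node(object):
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


def make_cycle():
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a
    # a and b go out of scope here but still reference each other.


gc.disable()                         # keep the collector out of the way
make_cycle()
after_refcounting = list(finalized)  # refcounts never dropped to zero

gc.collect()                         # now run the cycle collector by hand
after_collect = sorted(finalized)
# Objects the collector found but refused to free end up in gc.garbage.
stuck = sorted(o.name for o in gc.garbage if isinstance(o, Node))
gc.enable()
```

The clumsy fix mentioned above amounts to walking gc.garbage and
breaking the cycles by hand.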

Regards
-- 
Paolo Giarrusso


