Newbie: "compiling" scripts (?)

Alan Kennedy alanmk at
Wed Jun 25 18:20:40 CEST 2003

Gerrit Holl wrote:

> I don't understand how pypy can improve performance. Am I correct that pypy
> is an effort to implement Python as much as possible in Python? 

Yes, that's a part of what the PyPy people are doing.

> And am I
> correct that implementing a module in C will probably speed it up? 

In some scenarios, implementing a module in C will speed things up, but it is
not guaranteed. It depends on the data being transferred back and forth between
the C extension and the user's python script. This transfer comes at a cost. If
the cost of the transfer outweighs the time saved by running the code in C, then
the extension will be slower overall. That doesn't happen very often, mind.

> Combining
> these two, wouldn't implementing Python in Python slow things down? 

Not necessarily. That's true only if you assume that the python virtual machine
interpreting the python byte codes doesn't run any faster than the current
Python VM.

> If so,
> what is the advantage of pypy?

From my readings, I think the PyPy project is very ambitious (and I wish them
the very best of luck getting funding for it). 

Since the current python implementation has gone through many evolutions, it has
become tough to manage (at a code level) in certain areas. This is in no small
part due to the number of platforms and C compilers that the current python
works with, and the number of utilities used in producing the compiler, for
lexing, parsing, etc. Because there is so much "inertia" in the codebase, a lot
of time and effort go into making small changes.

Reimplementing python, from the grammar up, in python would free up a lot of
development resources, allowing the python developers to focus more on things
that are important, e.g. improving the language, adding new libraries, etc.
Also, PyPy can "leverage" (I hate that word) the fantastic portability of the
current python implementation to be running everywhere quickly: i.e. the current
Python VM would be the "bootstrap" VM.

Also, a clean redesign of the compiler and interpreter would allow for hooks to
be inserted for such products as psyco, which change the operation of the main
interpreter loop (usually optimising it for speed). This is comparable to JIT
compilation in the jython world. Your jython script *may* run faster on one Java
VM compared to another, because the JVM interpreting the JVM bytecodes may be
using fancy assembler-level tricks to optimise for speed/memory/etc, e.g. the
"hotspot" VM. You get the benefit of increased execution speed, without a
single code change.
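
To make the "hook" idea a bit more concrete, here is a minimal sketch of a toy
stack machine written in python, with an optional hook that may rewrite the
program before the main loop runs it. Everything in it (the opcode names, run,
peephole) is made up for illustration; it is not how PyPy or psyco actually
work, just the general shape of the idea.

```python
# Toy stack machine with a pluggable optimisation hook.
# All names here are hypothetical; this is not PyPy's or psyco's design.

def run(program, optimise=None):
    """Execute a list of (opcode, argument) pairs; return the top of the stack."""
    if optimise is not None:
        # The hook gets a chance to rewrite the program before it is executed.
        program = optimise(program)
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return stack[-1]


def peephole(program):
    """Replace each 'PUSH a, PUSH b, ADD' sequence with a single 'PUSH a+b'."""
    out = []
    for instr in program:
        out.append(instr)
        if (len(out) >= 3
                and out[-1][0] == "ADD"
                and out[-2][0] == "PUSH"
                and out[-3][0] == "PUSH"):
            total = out[-3][1] + out[-2][1]
            del out[-3:]
            out.append(("PUSH", total))
    return out


# (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
```

Both run(program) and run(program, optimise=peephole) give 20; the second one
simply executes three instructions instead of five. The script itself is
untouched, which is the point: the speed-up lives in the machine, not in your
code.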

Another advantage is that because your VM is simpler to understand, because it's
written in python, it is easier to maintain and optimise. There is a data
structure called an "Abstract Syntax Tree", which contains a representation of
the structure of your python scripts. Sometimes, by recording statistics and
doing pattern matching on the bytecodes that are running through the machine, it
is possible to reduce the number of operations needed to execute a given script:
this is called optimisation. Native code compilers do it all the time, but
obviously only once, at compilation time. Interpreters can optimise in exactly
the same way, but at run time. The above mentioned hotspot java VM is an
instance of one of these.
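
As a small illustration of reducing the number of operations, here is a sketch
that uses the standard library ast module to fold constant arithmetic in a
syntax tree before it is compiled. The names ConstantFolder, fold and
count_binops are my own, and a real optimiser would do far more than this (and,
as described above, might do it at run time based on recorded statistics rather
than up front).

```python
import ast


class ConstantFolder(ast.NodeTransformer):
    """Fold '+' and '*' applied to two numeric literals into one literal."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        left, right = node.left, node.right
        both_numbers = (
            isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        )
        if not both_numbers:
            return node
        if isinstance(node.op, ast.Add):
            value = left.value + right.value
        elif isinstance(node.op, ast.Mult):
            value = left.value * right.value
        else:
            return node  # leave other operators alone in this sketch
        return ast.copy_location(ast.Constant(value=value), node)


def fold(source):
    """Parse source text and return its syntax tree with constants folded."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return tree


def count_binops(tree):
    """Count the binary operations left in a syntax tree."""
    return sum(isinstance(n, ast.BinOp) for n in ast.walk(tree))
```

For the source "x = 2 * 3 + 4" the original tree holds two binary operations
and the folded tree holds none, yet compiling and executing the folded tree
still binds x to 10. An expression like "a + 2 * 3" keeps its addition, since
the value of a is not known until the script runs.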

There are many pieces to what the PyPy people are trying to do, and they have a
lot of work to do building the separate pieces, with nothing to show until it's
all ready to fit nicely together. But I think their overall model looks
promising. So hopefully, with the description above, you'll be in a better
position to understand the content of these links.

For a look at the kinds of tricks that optimising interpreters can use, in the
java world, take a look over this link

Looking at the PyPy object model, I can't help but wonder how they're going to
deal with continuations? (If full integration of stackless is the intention).

alan kennedy
