[pypy-dev] Re: psyco in Python was: Minimal Python project

Armin Rigo arigo at tunes.org
Sat Jan 18 20:00:54 CET 2003


Hello Edward,

All your comments about Psyco are well founded, but you are focusing too much on
the "back-end" part --- which I understand, given your impressive compiler
technology background!

On Fri, Jan 17, 2003 at 10:34:40AM -0600, Edward K. Ream wrote:
> 1. psyco has, in fact, no better information than does a C compiler.

People answered that Psyco can have a bit more information, because it could
do more constant folding or optimize multiplications by constants.  Rarely a
huge win.  But that's not what I had in mind when saying that "Psyco has more
information".  It has a much higher-level view of the program.  If you have a
specific algorithm which you completely coded in a couple of C functions, then
after compiling it with a good C compiler the algorithm will run at a speed
that cannot be beaten.  But C is not well suited for larger applications
consisting mainly of management stuff --- precisely what Python is much better
for (but you know this well, I'm sure).  These are the cases that interest me.

Python gives a higher-level view of the application. Psyco can, for example,
measure that some data structure (say a list) is mostly used in this or that
way, and choose a suitable implementation.  For example, a list in the middle of
which numerous inserts and deletes are done could be implemented as a
red-black tree.  Of course, in a pure C implementation of the application we
can also use red-black trees, but who does?  Not many C applications actually
use the correct implementation of their data structures :-(  More importantly,
which implementation is the best one must be hard-wired in advance in C and
makes it difficult to switch later.  This is where I expect interesting gains
from a sufficiently advanced Psyco.
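
To make the idea concrete, here is a minimal hand-written sketch of a container
that watches how it is used and switches its own representation.  It is not
what Psyco does internally; the class name, the method names and the threshold
are all hypothetical, and it swaps in a collections.deque for front inserts
instead of the red-black tree for middle inserts mentioned above, simply
because that fits in a few lines.

```python
from collections import deque


class AdaptiveSequence:
    """Hypothetical sequence that changes representation based on observed use."""

    SWITCH_THRESHOLD = 32  # arbitrary cutoff, chosen only for illustration

    def __init__(self):
        self._items = []         # start out as a plain list
        self._front_inserts = 0  # how many times the front has been inserted into

    def append(self, value):
        self._items.append(value)

    def insert_front(self, value):
        self._front_inserts += 1
        # Once front inserts are frequent, move to a deque (O(1) at both ends).
        if isinstance(self._items, list) and self._front_inserts >= self.SWITCH_THRESHOLD:
            self._items = deque(self._items)
        if isinstance(self._items, deque):
            self._items.appendleft(value)
        else:
            self._items.insert(0, value)  # O(n) on a plain list

    def representation(self):
        # Name of the concrete type currently backing the sequence.
        return type(self._items).__name__

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)
```

The calling code never names the representation, so the choice can change
while the program runs.  In C the equivalent decision is normally fixed when
the code is written.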

There is already one example of this.  In Python, if you build a large string
by successively concatenating a lot of small strings, you get bad results.  
You have to rewrite your algorithm in a less straightforward style, e.g.
accumulating the strings in a list and only at the end using "''.join(list)".  
You have the same problem in C if you repeatedly use a simple two-strings
concatenation function, but the C compiler cannot do anything to help here.  
Psyco (upcoming version 1.0) can already help: it compiles the Python function
by choosing to implement the string as a Python list of strings.  The "join()"
is only done when the resulting string is needed outside the function.  It is
a case where higher-level programming languages let compiling tools select
algorithms that actually decrease the complexity --- meaning that the result
can run faster than the C version by more than a constant factor.
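
For reference, the two styles look roughly like this when written by hand.
This is only a sketch of the source-level rewrite, with hypothetical function
names; it says nothing about how Psyco represents the string internally.

```python
def concat_naive(pieces):
    # Straightforward style: each += may copy the whole string built so far,
    # which can make the loop quadratic in the total length.
    result = ""
    for piece in pieces:
        result += piece
    return result


def concat_deferred(pieces):
    # Less straightforward style: collect the pieces in a list and build the
    # final string once, when it is actually needed.
    parts = []
    for piece in pieces:
        parts.append(piece)
    return "".join(parts)


pieces = [str(i) for i in range(1000)]
assert concat_naive(pieces) == concat_deferred(pieces)
```

Both functions return the same string; only the amount of copying differs.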

Of course, you could have made the C version wiser in this case.  But again
you cannot do this when your application becomes very large and where you just
don't know what implementation is better without doing sophisticated profiling
on hopefully representative sample data.


> 1. There is no need to burden this project with unrealistic expectations.
> Python doesn't have to beat C for Python to rule the world! :-)

Yes, exactly.  I hope I have made it clear that Psyco is not the
all-or-nothing way for this project to succeed.  In my opinion it is essential
to write the Python-in-Python interpreter with important restrictions, so that
static tools can do a lot from this code.  Compile-time optimizations of
Python, if you like, although I prefer to see it as translation from a
well-defined Pythonic frame to any of several possible projects that could use
the source (including a CPython-like interpreter, and including a Psyco-like
project).  The current goal should clearly be to have a Python interpreter in
Python, written with restrictions that are well understood.


A bientot,

Armin.


