[pypy-dev] Questions for Armin

Edward K. Ream edream at tds.net
Sun Jan 19 05:40:33 CET 2003


Hello Armin,

> On Fri, Jan 17, 2003 at 01:08:28PM -0600, Edward K. Ream wrote:
> > 1.  How often and under what circumstances does psyco_compatible get
> > called?
> >
> > My _guess_ is that it gets called once per every invocation of every
> > "psycotic" function (function optimized by psyco).  Is this correct?
>
> No: psyco_compatible() is only called at compile-time.
[snip]
> Only when a new, not-already-seen type appears does it follow the
> "uncommon_case" branch.  This triggers more compilation, i.e. emission of
> more machine code...
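The compile-once-per-new-type behaviour described above can be sketched in
plain Python. This is a toy stand-in only: real Psyco emits machine code,
and the decorator and counter here are invented for illustration.

```python
# Toy sketch of per-type specialization: "compilation" happens only the
# first time a not-already-seen tuple of argument types appears (the
# "uncommon_case" branch); every later call with those types reuses the
# cached specialized version.

compile_count = 0  # counts how often the (pretend) compiler runs

def specialize(func):
    cache = {}  # maps a tuple of argument types -> specialized callable
    def wrapper(*args):
        global compile_count
        key = tuple(type(a) for a in args)
        if key not in cache:        # the uncommon case: new types seen
            compile_count += 1      # triggers (pretend) compilation
            cache[key] = lambda *a: func(*a)  # stand-in for machine code
        return cache[key](*args)
    return wrapper

@specialize
def add(x, y):
    return x + y

add(1, 2)       # first int/int call: compiles
add(3, 4)       # same types again: no compilation
add(1.0, 2.0)   # new float/float types: compiles a second version
```

After these three calls the pretend compiler has run exactly twice, once
per distinct type combination, no matter how many calls follow.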

Many thanks for this most interesting and informative reply.  It clears up a
lot of my questions.  I feel much more free to focus on the big picture.

> All your comments about Psyco are founded, but you are focusing too much
> on the "back-end" part...

Yes.  I have been focusing on an "accounting" question: how often does the
compiler run?  If the compiler starts from scratch every time a program is
run, then I gather from your example that the compiler will be called once
for every type of every argument for every executed function _every time the
program runs_.  Perhaps you are assuming that the gains from compiling will
be so large that it doesn't matter how often the compiler runs.

Yesterday I realized that it _doesn't matter_ whether this assumption is
true or not.  Indeed, suppose that we expand the notion of what "byte code"
is to include information generated by the compiler: compiled machine code,
statistics, requests for further optimizations, whatever.  The compiler
could rewrite the byte code in order to avoid work the next time the program
runs.  Now the compiler runs less often: using exactly the same scheme as
before, the compiler will run once for every type of every argument for every
executed function _every time the source code changes_.  This means that no
matter how slowly the compiler runs the _amortized_ runtime cost of the
compiler can be made to be asymptotically zero!
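The "expanded byte code" idea above could be sketched as a compilation
cache that survives between runs of the program. Everything here is
hypothetical: the file name, format, and helpers are invented, and Psyco
did not actually persist its work this way.

```python
# Sketch of persisting the per-type compilation cache to disk, so a later
# run of the same program finds the work already done and the compiler's
# cost is amortized across runs.
import os
import pickle
import tempfile

CACHE_FILE = os.path.join(tempfile.gettempdir(), "psyco_cache_demo.pkl")

def load_cache():
    """Load the cache left by a previous run, or start empty."""
    try:
        with open(CACHE_FILE, "rb") as f:
            return pickle.load(f)
    except (OSError, pickle.PickleError):
        return {}

def save_cache(cache):
    """Write the cache back so the next run can reuse it."""
    with open(CACHE_FILE, "wb") as f:
        pickle.dump(cache, f)

def compile_for_types(func_name, types, cache):
    """Pretend compilation: only runs when (func, types) is unseen."""
    key = (func_name, types)
    if key not in cache:
        cache[key] = "machine code for %s%s" % (func_name, types)  # stand-in
        return True   # the compiler ran
    return False      # result was cached, possibly from an earlier run

cache = load_cache()
ran_first = compile_for_types("add", ("int", "int"), cache)
ran_again = compile_for_types("add", ("int", "int"), cache)
save_cache(cache)
```

On a second run of this program, even the first call finds the cache
already populated; the compiler only runs again when the source changes
and the cache is invalidated.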

This is an important theoretical result: the project can never fail due to
the cost of compilation.  This result might also allow us to expand our
notion of what is possible in the implementation.  You are free to consider
any kind of algorithm at all, no matter how expensive.  For instance, there
was some discussion on another thread of a minimal VM for portability.
Maybe that VM could be the intermediate code list of gcc?  If compilation
speed isn't important, the "compiler" would simply be the front end for gcc.
We would only need to modify the actual emitters in gcc to output to the
"byte code".  We get all the good work of the gcc code generators for free.
Retargeting psyco would be trivial.

These are not proposals for implementation, and certainly not requests that
you modify what you are planning to do in any way.  Rather, they are "safety
proofs" that we need not be concerned about compilation speed _at all_,
provided that you (or rather Guido) are willing to expand the notion of the
"byte code".  This could be done whenever convenient, or never.  The point
is that my worries about the cost of compilation were unfounded.
Compilation cost can never be a "gotcha"; a pressure-relief valve is always
available.  Perhaps this has always been obvious to you; it wasn't at all
clear to me until yesterday.

Edward