[ann] Minimal Python project

Mon Jan 13 01:16:24 EST 2003

Christian Tismer <tismer at tismer.com> writes:
> > I remember seeing a paper once about OpenGenera which was a
> > proprietary (not "open" despite its name) program to run Symbolics
> > Genera programs on the DEC Alpha by simulating the Lisp machine
> > instruction set.  It basically said they found that by making the
> > simulation's interpreter loop and register set small enough to fit
> > in the Alpha's primary caches it ran as fast as they'd expect
> > microcode to run.
> 
> I agree. With a real engine like an Alpha.
> 
> My target physical engine is clearly the whole set of X86's which
> are dominating the world (still).  I've written several small
> interpreters in C, also tried to make some fast Forth interpreter,
> and always found that the X86 doesn't have enough registers to make
> a fast, interpreted stack engine.  Whenever you try to get the C
> compiler to keep your vital variables in registers, then you have
> nearly nothing left to work with.  Maybe I could have optimized
> towards the primary cache, I never did this.  By creating a register
> machine, I might have the chance to get the register file into that
> cache, and to have the interpreter loop variables in registers, still.
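
To make the stack-versus-register distinction concrete, here is a rough
sketch of the same expression, d = (a + b) * c, run on two toy
interpreters.  The instruction names, tuple encodings, and the functions
run_stack and run_register are all made up for illustration and do not
come from any real VM.  Python cannot show anything about hardware
registers or cache behaviour, so the only thing this counts is how many
instructions each loop has to dispatch.

```python
# Hypothetical toy instruction sets, invented for this illustration only.

def run_stack(code, env):
    """Stack machine: operands are implicit, so values move via push/pop."""
    stack = []
    dispatched = 0
    for op, arg in code:
        dispatched += 1
        if op == "load":
            stack.append(env[arg])
        elif op == "store":
            env[arg] = stack.pop()
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError("unknown stack op: %r" % (op,))
    return dispatched


def run_register(code, regs):
    """Register machine: each instruction names its destination and sources."""
    dispatched = 0
    for op, dst, a, b in code:
        dispatched += 1
        if op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError("unknown register op: %r" % (op,))
    return dispatched


# d = (a + b) * c on the stack machine: six dispatches.
STACK_CODE = [
    ("load", "a"), ("load", "b"), ("add", None),
    ("load", "c"), ("mul", None), ("store", "d"),
]

# Same computation with a, b, c already in r0, r1, r2 and d in r3:
# two dispatches.
REG_CODE = [
    ("add", 3, 0, 1),
    ("mul", 3, 3, 2),
]
```

The register version dispatches fewer, wider instructions, but its
register file is still just an array in memory.  Whether that array
stays in the primary cache on a given x86 is something you would have to
measure.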

I think for the x86 you should try to generate native code rather than
an efficient interpreter.  In general though, if the interpreter does
a reasonable job, IMO it's not worth trying to optimize it too much.
I'd instead concentrate more on making the compiler easy to retarget.

Btw, recent x86's have very fast L1 caches.  With just a little bit of
attention to pipeline scheduling you should be able to avoid almost
all stalls.