infos from the berlin-sprint (was Re: [pypy-dev] Sprint results?)
holger krekel
hpk at trillke.net
Sun Oct 5 18:36:32 CEST 2003
Hi Florian,
[Florian Schulze Sat, Oct 04, 2003 at 10:34:25PM +0200]
> Hi!
>
> How well did the sprint work out?
>
> I have seen that there is some pyrex code generation now and there are
> tests, but what were the results in this area during the sprint?
>
> Just a very short mail with some information would be greatly appreciated.
Here is my take. Other mileages may vary so excuse me if i miss anything.
On Monday morning we made a few design decisions which led to the
implementation of the following abstractions in the next two days:
- a new FlowObjSpace which does abstract interpretation
plus some very nice tricks (which we came up with during a
long-winded discussion in a restaurant :-) to construct
a FunctionGraph. This functiongraph (fully) represents the abstract
or symbolic execution of a function. e.g. for this function:
    def while_func(i):
        total = 0
        while i > 0:
            total = total + i
            i = i - 1
        return total
the following graph is generated (shown here in a slightly
optimized version):
http://codespeak.net/~hpk/while_func.ps
- 'make_dot.py' takes this flowmodel and generates a
nice graphical graph from it (see the URL above)
- the pyrex-translator also takes this objectmodel (in flowmodel.py) and
generates Pyrex-Code from it. The generated code looks pretty low-level
but this is expected as we eventually want to generate C or assembly
directly. For the above function the following pyrex-source code is
generated (again with some easy optimizations applied):
    def while_func(object v413):
        v419, v420 = v413, 0
        cinline "Label1:"
        v422 = v419 > 0
        if v422:
            v424 = v420 + v419
            v425 = v419 - 1
            v419, v420 = v425, v424
            cinline "goto Label1;"
        else:
            return v420
Btw, the 'cinline' statement is a hack added to Pyrex that allows inserting
arbitrary C-code. An objectspace cannot really identify loops,
so we need "goto". We consider gotos to be useful unless you have
to type and understand them manually :-)
- translator/annotation.py also takes the flowmodel and applies a
new technique for type inference: it uses space-operations to
note 'assertions' about variables and relaxes those assertions
during analysis of the flowgraph. IOW we didn't come up with
yet another type-system (which is the classical approach) but
reuse the notion of "space-operations" which we had from the beginning
of the project. Btw, Armin thinks that this type-inference algorithm
is worth a scientific paper, but more about this later and/or
from him.
- we adapted Jonathan David Riehl's Python-Parser (written completely
in python using its own "rex"-approach) so that
it will be a drop-in replacement for CPython's current parser
(living the boring life of a C-extension). Actually Jonathan's
larger 'basil' project is now in the codespeak-repository and
we can easily link it into PyPy or branch off it if necessary.
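To make the FlowObjSpace item from the list above a bit more concrete, here is a minimal sketch (not the actual PyPy code) of a space that records operations instead of performing them. The names Variable, RecordingSpace and loop_body are made up for illustration, and the sketch only covers straight-line code: the real flow space works on bytecode and also has to deal with branches and loops, which is where the tricks mentioned above come in.

```python
import itertools


class Variable:
    """Placeholder for a value that is unknown at analysis time (hypothetical)."""
    _counter = itertools.count()

    def __init__(self):
        self.name = "v%d" % next(Variable._counter)

    def __repr__(self):
        return self.name


class RecordingSpace:
    """Hypothetical object space: every operation is recorded, not computed."""

    def __init__(self):
        self.operations = []   # list of (opname, args, result)

    def _record(self, opname, *args):
        result = Variable()
        self.operations.append((opname, args, result))
        return result

    def add(self, a, b):
        return self._record("add", a, b)

    def sub(self, a, b):
        return self._record("sub", a, b)

    def gt(self, a, b):
        return self._record("gt", a, b)


def loop_body(space, total, i):
    """One pass through the body of while_func's loop, written against a space."""
    total = space.add(total, i)
    i = space.sub(i, 1)
    return total, i


space = RecordingSpace()
i0, total0 = Variable(), Variable()
total1, i1 = loop_body(space, total0, i0)
for opname, args, result in space.operations:
    print(result, "=", opname, args)
```

Running the same loop_body against a space with real integers would compute values; running it against the recording space yields a list of operations on variables, which is the raw material for a graph.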
So altogether the Flowgraph/Functiongraph/flowmodel (there is no
completely fixed terminology yet) is the central point for several
independent algorithms that - if combined - eventually produce typed C-code.
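As an illustration of one such algorithm working on the graph, here is a rough sketch of type inference that starts with specific assertions and relaxes them when conflicting information flows in. It is a generic fixpoint loop with invented names (GRAPH, RESULT_TYPE, relax, annotate) and a hand-written graph for while_func; it is not the code in translator/annotation.py and says nothing about how that code is actually structured.

```python
# Each block: (input variable names, operations, exits).
# An operation is (result, opname, arg1, arg2); ints stand for constants.
# An exit is (target block name, arguments passed to the target's inputs).
GRAPH = {
    "start":  (["i"], [], [("header", ["i", 0])]),
    "header": (["i1", "total1"], [("cond", "gt", "i1", 0)],
               [("body", ["i1", "total1"]), ("done", ["total1"])]),
    "body":   (["i2", "total2"],
               [("total3", "add", "total2", "i2"), ("i3", "sub", "i2", 1)],
               [("header", ["i3", "total3"])]),
    "done":   (["result"], [], []),
}

# Deliberately tiny table: (opname, type, type) -> result type.
# Any combination not listed here is treated as plain 'object'.
RESULT_TYPE = {
    ("add", int, int): int,
    ("sub", int, int): int,
    ("gt", int, int): bool,
}


def type_of(arg, ann):
    """Type of a constant, or the current assertion about a variable."""
    return ann.get(arg) if isinstance(arg, str) else type(arg)


def relax(old, new):
    """Combine two assertions; disagreeing ones are relaxed to 'object'."""
    if old is None:
        return new
    return old if old is new else object


def annotate(graph, entry, input_types):
    ann = {}
    pending = [(entry, input_types)]
    while pending:
        name, incoming = pending.pop()
        inputs, ops, exits = graph[name]
        changed = False
        for var, t in zip(inputs, incoming):
            merged = relax(ann.get(var), t)
            if merged is not ann.get(var):
                ann[var] = merged
                changed = True
        if not changed:
            continue   # nothing new flowed into this block
        for result, opname, a, b in ops:
            key = (opname, type_of(a, ann), type_of(b, ann))
            ann[result] = RESULT_TYPE.get(key, object)
        for target, args in exits:
            pending.append((target, [type_of(a, ann) for a in args]))
    return ann


print(annotate(GRAPH, "start", [int]))
```

With an int argument every variable stays int (and the loop condition bool). With a type the tiny table does not know about, the assertions on the loop variables get relaxed to object on the second pass through the header block.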
To sum it up there are the following abstractions:

    interpreter   interpreting bytecode, dispatching operations on objects to
    objectspace   implementing operations on boxed objects
    stdobjspace   a concrete space implementing python's standard type system
    flowobjspace  a concrete space performing abstract/symbolic interpretation and
                  producing a (bytecode-independent) flowmodel of execution
    annotator     analysing the flowmodel to infer types.
    genpyrex      taking the (annotated) flowmodel to generate pyrex-code
    pyrex         translating into a C-extension
As a consequence the former Ann(otation)Space has been ripped apart
(partly into flowobjspace) and is gone now. Long live the flowspace.
A really nice property of the above abstractions is that they allow
development and testing *independently* from one another which was
of invaluable help. Thanks here go to Greg Ewing for Pyrex and sorry
for the evil cinline-hack :-)
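For readers wondering why a goto is needed at all when generating code from a graph of blocks, here is a small sketch under stated assumptions. Plain Python has no goto, so instead of emitting cinline labels the sketch emits a dispatch loop over block names, which plays the same role. The BLOCKS table is a hand-written stand-in for the flowmodel of while_func (reusing the variable names from the generated Pyrex code above), and emit_python is a made-up helper, not genpyrex.

```python
# block name -> (statements, ending); statements are given as source text.
# An ending is ("goto", target), ("if", condition, then_block, else_block)
# or ("return", expression).
BLOCKS = {
    "entry":  (["v419, v420 = v413, 0"], ("goto", "Label1")),
    "Label1": (["v422 = v419 > 0"], ("if", "v422", "body", "exit")),
    "body":   (["v424 = v420 + v419",
                "v425 = v419 - 1",
                "v419, v420 = v425, v424"], ("goto", "Label1")),
    "exit":   ([], ("return", "v420")),
}


def emit_python(name, arg, blocks, entry):
    """Emit Python source in which each block runs when 'block' names it."""
    out = ["def %s(%s):" % (name, arg),
           "    block = %r" % entry,
           "    while True:"]
    for label, (statements, ending) in blocks.items():
        out.append("        if block == %r:" % label)
        for stmt in statements:
            out.append("            " + stmt)
        kind = ending[0]
        if kind == "return":
            out.append("            return " + ending[1])
            continue
        if kind == "goto":
            out.append("            block = %r" % ending[1])
        elif kind == "if":
            out.append("            block = %r if %s else %r"
                       % (ending[2], ending[1], ending[3]))
        out.append("            continue")   # the emulated jump
    return "\n".join(out)


source = emit_python("while_func", "v413", BLOCKS, "entry")
print(source)

namespace = {}
exec(source, namespace)          # compile the generated source
generated = namespace["while_func"]
print(generated(4))
```

The generator never has to recognise that "Label1" and "body" form a while loop; it only follows the links between blocks, which is the point made above about an objectspace not being able to identify loops.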
Anybody interested in helping with the next steps might look into
the TODO file in the pypy-root directory. Yesterday evening we also
discussed a refactored flowmodel which we want to employ
soon.
Big thanks go to Tomek Meka and Christian Tismer for organizing the
sprint and Stephan Diehl and Dinu Gherman for their help in various
organizational areas. And especially to Jonathan David Riehl who
made it from Chicago. We hope he can stay with us more often. And
here is a (hopefully complete) list of people who attended and made
all of the above possible:
Armin Rigo
Christian Tismer
Dinu Gherman
Guenter Jantzen
Jonathan David Riehl
Samuele Pedroni
Stephan Diehl
Tomek Meka
and shame on me if i forgot anyone (i am tired ...)
And of course many many thanks to Laura Creighton (AB Strakt),
Nicolas Chauvat (Logilab) and Alistair Burt (DFKI) who tried hard to
work with us on EU-funding-issues. Actually we came up with a nice technical
2-year plan but a lot of business issues still need to be resolved
and fixed. Let's hope that the EU-funding effort is as successful as
our coding sprints this year have been. Ah yes, the next sprint we hope
to do mid-December, probably in Amsterdam. If all goes well (some more
people helping between the sprints that is :-) we might even do a first
public release with PyPy prototypically running as a C-extension to CPython.
That's it for now from me. (sprinters: Please correct/fix any issues i
misrepresented)
cheers,
holger