[pypy-dev] More on optimization

Samuele Pedroni pedronis at bluewin.ch
Sun Oct 31 16:42:33 CET 2004

Armin Rigo wrote:
> Hi!
> Sorry for focusing the next sprint so much on translation.  This might have
> put some people off.  Well, lesson learned.
> It doesn't mean we should stop talking about translation :-)  Pushing the
> previously discussed ideas to their conclusion, we get an interesting point of
> view...
> First, for the context, the current situation in the repository:  RPython code
> can be turned into a flow graph.  Then the annotation pass on the flow graph
> infers types, with the side effect that it infers which functions call which
> functions, so it helps to build the complete list of functions that should be
> translated.

yes, given that RPython is still OO, call-graph construction and type 
inference are related.

>  Finally, genc.py produces naive C code using PyObject* only.  
> The inferred types are not really used;

I still hope that for less C-like target languages, the annotated 
graph remains directly useful.

> the annotation phase is currently
> useful only to build the complete call graph.  (Let's ignore the Lisp and
> Pyrex generators for the moment.)
> Here is an example with the flow graph's operations on the left and the
> corresponding C code (after macro expansion) on the right:
>   v3 = add(v1, v2)            v3 = PyNumber_Add(v1, v2);
> Quite obvious, and quite slow.  Imagine instead that there is a C-ish
> low-level language, whose input "syntax" is of course flow graphs, with only
> operations like the following ones:
> * llcall(func, arg1, arg2...)   # calls another function
> * lladd_int(x, y)               # adds two ints and returns an int
> * lleq_ptr(x, y)                # compares two pointers for equality
> * llstruct('name')              # creates an empty structure in the heap
> * llget_int(x, 'key')           # reads the field 'key' of struct object x
> * llset_int(x, 'key', value)    # writes the field 'key' of struct object x
> * llget_ptr(x, 'key')           # \
> * llset_ptr(x, 'key', value)    # / the same with fields containing a pointer
> The only data types would be "pointer to structure" and the atomic types like
> "int" and "char".  A structure would be essentially like a dictionary, with
> either strings or integers as keys (a dict with int keys is much like a list,
> without the ability to insert or remove elements easily).
> This very limited set of operations allows interesting global analysis and
> optimizations, like removing the allocation of structures in the heap
> altogether, or not writing some fields of some structures when they have a
> known constant value, or (most commonly) not using a hash table but just a 'C
> struct' with fields when all the keys are known.
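as an illustration of that last point (my sketch, not code from the repository): a structure whose keys are all known at analysis time needs no hash table at runtime and can be lowered to a fixed-layout record; `IntStruct` below is a hypothetical name, with `__slots__` standing in as the Python analogue of a C struct:

```python
# Before lowering: the structure is a dict, every field access hashes a key.
obj = {'ob_type': int, 'ob_ival': 42}

# After lowering: all keys are known ('ob_type', 'ob_ival'), so the
# optimizer can emit a fixed-layout record -- in C, a plain struct;
# sketched here with __slots__ (fixed offsets, no per-instance dict).
class IntStruct:
    __slots__ = ('ob_type', 'ob_ival')

rec = IntStruct()
rec.ob_type = int
rec.ob_ival = 42
assert rec.ob_ival == obj['ob_ival']
```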
> It is easy to modify our regular flow graphs until they only use the above
> operations: we replace high-level operations like 'add' with calls to (or
> inlined copies of) support functions like:
> def rpython_add(v, w):
>     if type(v) is int and type(w) is int:
>         a = llget_int(v, 'ob_ival')
>         b = llget_int(w, 'ob_ival')
>         x = llstruct()
>         llset_ptr(x, 'ob_type', int)
>         llset_int(x, 'ob_ival', lladd_int(a, b))
>         return x
>     else:
>         return llcall('PyNumber_Add', v, w)
> Only such support functions can use the ll*() functions, which stand directly
> for the corresponding ll* operation.  The ll*() functions themselves are not
> executable in the above Python source (actually they don't have to be callable
> objects at all).  They are only meant as placeholders.  If you wish, it's just
> an easy way to write down a complicated flow graph using ll* operations:
> instead of inventing a syntax and writing a parser for it, we use the Python
> syntax and the flowobjspace to build our flow graphs.  Note that the
> type() used above is a function defined as:
> def type(v):
>     return llget_ptr(v, 'ob_type')

it seems that along this path we drop the assumption that ints are 
primitive in RPython, and recover that information through analysis. 
Just to point this out.

> (Actually, we might want to directly execute such code, for testing, with
> Python dictionaries in most variables and (say) llget_ptr() defined as
> dict.__getitem__(), but that's not the main goal here.)
> Let's come back to PyPy.  We can now replace each high-level operation with a
> call to a support function like the above one.  Initially these support
> functions could contain only the llcall() to the CPython API; then we can
> progressively add more cases to cover all the RPython behavior.  The goal
> would be that eventually the support functions and the optimizer become good
> enough to remove all calls to the CPython API.  The optimizer can do that
> using global analysis: e.g. in the above rpython_add(), if both 'v' and 'w'
> come from a previous llstruct() where 'ob_type' was set to 'int', then we know
> that the first branch is taken.  This is a kind of generalized type inference
> using global constant propagation; it would supersede half of the annotation
> pass' job.
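to make the propagation idea concrete, here is my minimal sketch (hypothetical names throughout): the analysis records, per variable, the constant stored into its 'ob_type' field at the llstruct() that created it, and folds the branch in rpython_add when both operands are certainly ints:

```python
# facts: variable name -> constant 'ob_type' written at its llstruct(),
# or absent when the analysis could not prove a constant.
def known_ob_type(var, facts):
    """Return the constant 'ob_type' of var, or None if unknown."""
    return facts.get(var)

def can_fold_int_branch(v, w, facts):
    # The fast path of rpython_add is certainly taken only when both
    # operands are proven ints; otherwise the llcall must be kept.
    return known_ob_type(v, facts) is int and known_ob_type(w, facts) is int

# Suppose earlier analysis saw, for v3 and v4:
#   v3 = llstruct(); llset_ptr(v3, 'ob_type', int)
facts = {'v3': int, 'v4': int}
assert can_fold_int_branch('v3', 'v4', facts)      # branch folded
assert not can_fold_int_branch('v3', 'v5', facts)  # v5 unknown: keep llcall
```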
> Also, the incremental nature of this approach is nice (and very Psyco-like,
> btw).  It is good for testing along the way, and also means that when we reach
> the goal of supporting enough for RPython, the result is still nicely useful
> for non-RPython code; in this case, when types cannot be inferred, there are
> remaining calls to the CPython API.  I think that this would be a widely
> useful tool!
> I also like a lot the idea of a low-level language and optimizer which is
> entirely independent from Python and PyPy.  We could even (still) devise an
> assembly-like syntax and a parser for it, and make it a separate release.  

this seems similar in goal to some aspects of the LLVM project itself.

> Such an optimizer is obviously not easy to write but I think it's quite worth
> the effort.

I'm still thinking (no answer yet) whether the major problem you called 
structure-inlining can be more easily resolved in some RPython-specific 
way, or whether it is best to attack it in this general way.
The relevant question seems to be: to which program memory locations 
((other) struct fields, function locals, inter-function parameters) 
does a value read from a particular struct field propagate?

>  Most of it would be independent on the target language:

yes, depending a bit on whether it will introduce/need function pointers, 
which are not natural for some possible target languages.

> it would
> e.g. replace operations with still more precise operations describing the
> usage of the structs, like
>    llheapalloc('C struct name')
>    llheapget_int(x, 'field', 'C struct name')
> Also, we should look closely for related work.  It looks like it *has to* have
> been done somewhere before...  But the kind of global optimizations that we
> need might be different from the ones any reasonable language would need,
> because no one would reasonably write code like the rpython_add() above when
> he actually means to do a simple integer addition :-)  What's essential here,
> after constant propagation, is to do liveness and alias analysis for the heap
> structures, so that they can be not allocated in the heap at all as often as
> possible.  Also, if we want to do explicit memory management (i.e. we would
> have to use a 'llfree' operation) with reference counters, then the above
> example of rpython_add() becomes even longer -- but of course it should still
> be optimized away.

this LLVM paper (I just skimmed it) may contain some relevant info, at 
the very least the bibliography.
