[pypy-dev] More on optimization
pedronis at bluewin.ch
Sun Oct 31 16:42:33 CET 2004
Armin Rigo wrote:
> Sorry for focusing the next sprint so much on translation. This might have
> put some people off. Well, lesson learned.
> It doesn't mean we should stop talking about translation :-) Pushing the
> previously discussed ideas to their conclusion, we get an interesting point of view.
> First, for the context, the current situation in the repository: RPython code
> can be turned into a flow graph. Then the annotation pass on the flow graph
> infers types, with the side effect that it infers which functions call which
> functions, so it helps to build the complete list of functions that should be
> translated.
yes, given that RPython is still OO, call-graph construction and type
inference are related.
> Finally, genc.py produces naive C code using PyObject* only.
> The inferred types are not really used;
I still hope that for less C-like target languages, the annotated graph
remains directly useful.
> the annotation phase is currently
> useful only to build the complete call graph. (Let's ignore the Lisp and
> Pyrex generators for the moment.)
> Here is an example with the flow graph's operations on the left and the
> corresponding C code (after macro expansion) on the right:
> v3 = add(v1, v2) v3 = PyNumber_Add(v1, v2);
> Quite obvious, and quite slow. Imagine instead that there is a C-ish
> low-level language, whose input "syntax" is of course flow graphs, with only
> operations like the following ones:
> * llcall(func, arg1, arg2...) # calls another function
> * lladd_int(x, y) # adds two ints and returns an int
> * lleq_ptr(x, y) # compares two pointers for equality
> * llstruct('name') # creates an empty structure in the heap
> * llget_int(x, 'key') # reads the field 'key' of struct object x
> * llset_int(x, 'key', value) # writes the field 'key' of struct object x
> * llget_ptr(x, 'key') # \
> * llset_ptr(x, 'key', value) # / the same with fields containing a pointer
> The only data types would be "pointer to structure" and the atomic types like
> "int" and "char". A structure would be essentially like a dictionary, with
> either strings or integers as keys (a dict with int keys is much like a list,
> without the ability to insert or remove elements easily).
> This very limited set of operations allows interesting global analysis and
> optimizations, like removing the allocation of structures in the heap
> altogether, or not writing some fields of some structures when they have a
> known constant value, or (most commonly) not using a hash table but just a 'C
> struct' with fields when all the keys are known.
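As an illustration of that last optimization (not from the original mail): a toy pass that collects the set of field names ever used on an llstruct() allocation; when that set is closed and known, the "dictionary" can be compiled to a fixed C struct with one slot per key. The tuple encoding of operations and the function name are my own invention, and the sketch ignores aliasing and control flow entirely.

```python
# Toy whole-program field collection, assuming (result, opname, args...)
# tuples as a flow-graph encoding -- this encoding is hypothetical.
# If every key used on a struct is known statically, the struct needs no
# hash table: it can become a fixed 'C struct' with one slot per key.

ops = [
    ('x',  'llstruct'),
    (None, 'llset_ptr', 'x', 'ob_type', 'int'),
    (None, 'llset_int', 'x', 'ob_ival', 42),
    ('a',  'llget_int', 'x', 'ob_ival'),
]

FIELD_OPS = ('llset_int', 'llset_ptr', 'llget_int', 'llget_ptr')

def fields_per_alloc(ops):
    """Map each llstruct() result variable to the set of keys used on it."""
    fields = {res: set() for res, op, *_ in ops if op == 'llstruct'}
    for res, op, *args in ops:
        if op in FIELD_OPS and args[0] in fields:
            fields[args[0]].add(args[1])
    return fields

# fields_per_alloc(ops) == {'x': {'ob_type', 'ob_ival'}}
```

A real pass would additionally have to prove that 'x' never aliases a struct whose key set is open; this sketch only shows the bookkeeping.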
> It is easy to modify our regular flow graphs until they only use the above
> operations: we replace high-level operations like 'add' with calls to (or
> inlined copies of) support functions like:
> def rpython_add(v, w):
> if type(v) is int and type(w) is int:
> a = llget_int(v, 'ob_ival')
> b = llget_int(w, 'ob_ival')
> x = llstruct()
> llset_ptr(x, 'ob_type', int)
> llset_int(x, 'ob_ival', lladd_int(a, b))
> return x
> return llcall('PyNumber_Add', v, w)
> Only such support functions can use the ll*() functions, which stand directly
> for the corresponding ll* operation. The ll*() functions themselves are not
> executable in the above Python source (actually they don't have to be callable
> objects at all). They are only meant as placeholders. If you wish, it's just
> an easy way to write down a complicated flow graph using ll* operations:
> instead of inventing a syntax and writing a parser for it, we use the Python
> syntax and the flowobjspace to build our flow graphs. Note that the
> type() used above is a function defined as:
> def type(v):
> return llget_ptr(v, 'ob_type')
it seems that along this path we drop the assumption that ints are
primitive in RPython, and recover that information through analysis. Just
to point this out.
> (Actually, we might want to directly execute such code, for testing, with
> Python dictionaries in most variables and (say) llget_ptr() defined as
> dict.__getitem__(), but that's not the main goal here.)
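The parenthetical above can be made concrete. A minimal runnable sketch, with the ll*() placeholders defined over plain Python dicts as suggested; make_int() is a helper of my own, and the text's type() is renamed lltype() here only to avoid shadowing the builtin:

```python
# Executable stand-ins for the ll* placeholders: structures are dicts,
# llget_ptr is essentially dict.__getitem__, as suggested in the text.

def llstruct():
    return {}                       # a fresh empty heap structure

def llget_int(x, key):  return x[key]
def llset_int(x, key, value):  x[key] = value
def llget_ptr(x, key):  return x[key]
def llset_ptr(x, key, value):  x[key] = value
def lladd_int(x, y):  return x + y
def lleq_ptr(x, y):  return x is y

def lltype(v):                      # the text's type(), renamed
    return llget_ptr(v, 'ob_type')

def rpython_add(v, w):
    if lltype(v) is int and lltype(w) is int:
        a = llget_int(v, 'ob_ival')
        b = llget_int(w, 'ob_ival')
        x = llstruct()
        llset_ptr(x, 'ob_type', int)
        llset_int(x, 'ob_ival', lladd_int(a, b))
        return x
    raise NotImplementedError("here the real thing would llcall PyNumber_Add")

def make_int(n):                    # helper, not from the text
    x = llstruct()
    llset_ptr(x, 'ob_type', int)
    llset_int(x, 'ob_ival', n)
    return x
```

rpython_add(make_int(2), make_int(3)) then yields a struct whose 'ob_ival' field is 5, which is enough to test support functions before any optimizer exists.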
> Let's come back to PyPy. We can now replace each high-level operation with a
> call to a support function like the above one. Initially these support
> functions could contain only the llcall() to the CPython API; then we can
> progressively add more cases to cover all the RPython behavior. The goal
> would be that eventually the support functions and the optimizer become good
> enough to remove all calls to the CPython API. The optimizer can do that
> using global analysis: e.g. in the above rpython_add(), if both 'v' and 'w'
> come from a previous llstruct() where 'ob_type' was set to 'int', then we know
> that the first branch is taken. This is a kind of generalized type inference
> using global constant propagation; it would supersede half of the annotation
> pass' job.
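A toy sketch of that folding step (my own encoding, not the actual annotator): record which variables had a known constant stored into their 'ob_type' field, then decide type checks at translation time where possible.

```python
# Toy constant propagation for the 'ob_type' field: if both operands of
# rpython_add() come from llstruct()s whose 'ob_type' was set to int,
# the guard folds to True and only the fast branch survives.

known_ob_type = {}   # variable name -> constant stored in its 'ob_type'

def record(res, op, *args):
    """Observe one ll operation (the tuple encoding is hypothetical)."""
    if op == 'llset_ptr' and args[1] == 'ob_type':
        known_ob_type[args[0]] = args[2]

def fold_type_check(var, expected):
    """True/False if decidable at translation time, None otherwise."""
    if var in known_ob_type:
        return known_ob_type[var] is expected
    return None

record(None, 'llset_ptr', 'v', 'ob_type', int)
record(None, 'llset_ptr', 'w', 'ob_type', int)

fold_type_check('v', int)   # True: the fast integer branch is taken
fold_type_check('q', int)   # None: nothing known, keep the runtime check
```

A real optimizer would of course do this per program point, merging information at control-flow joins; the dict here stands in for that per-variable lattice.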
> Also, the incremental nature of this approach is nice (and very Psyco-like,
> btw). It is good for testing along the way, and also means that when we reach
> the goal of supporting enough for RPython, the result is still nicely useful
> for non-RPython code; in this case, when types cannot be inferred, there are
> remaining calls to the CPython API. I think that this would be a widely
> useful tool!
> I also like a lot the idea of a low-level language and optimizer which is
> entirely independent of Python and PyPy. We could even (still) devise an
> assembly-like syntax and a parser for it, and make it a separate release.
this seems similar in goal to some aspects of the LLVM project itself.
> Such an optimizer is obviously not easy to write
> but I think it's quite worth
> the effort.
I'm still thinking (no answer yet) about whether the major problem you
called structure-inlining can be more easily resolved in some
RPython-specific way, or whether it is best attacked in this general way.
The relevant question seems to be: to which program memory locations
(other struct fields, function locals, inter-function parameters) does a
value read from this particular struct field propagate?
> Most of it would be independent on the target language:
yes, depending a bit on whether it will introduce or need function
pointers, which are not natural in some possible target languages.
> it would
> e.g. replace operations with still more precise operations describing the
> usage of the structs, like
> llheapalloc('C struct name')
> llheapget_int(x, 'field', 'C struct name')
> Also, we should look closely for related work. It looks like it *has to* have
> been done somewhere before... But the kind of global optimizations that we
> need might be different from the ones any reasonable language would need,
> because no one would reasonably write code like the rpython_add() above when
> he actually means to do a simple integer addition :-) What's essential here,
> after constant propagation, is to do liveness and alias analysis for the heap
> structures, so that they can be not allocated in the heap at all as often as
> possible. Also, if we want to do explicit memory management (i.e. we would
> have to use a 'llfree' operation) with reference counters, then the above
> example of rpython_add() becomes even longer -- but of course it should still
> be optimized away.
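For concreteness, a sketch of how the refcounted variant might look, again over plain dicts; the 'ob_refcnt' field name and the shape of the fast path are my guesses, and the llcall fallback is omitted:

```python
# Hypothetical refcounting variant of rpython_add: every result struct
# carries an explicit 'ob_refcnt' field that a real translation would
# pair with lldecref/llfree operations at the end of the value's life.

def rpython_add_rc(v, w):
    assert v['ob_type'] is int and w['ob_type'] is int  # fast path only
    x = {}                      # llstruct()
    x['ob_refcnt'] = 1          # explicit reference count
    x['ob_type'] = int
    x['ob_ival'] = v['ob_ival'] + w['ob_ival']
    return x

# When liveness/alias analysis proves x never escapes, the allocation,
# the refcount bookkeeping, and the eventual llfree all disappear, and
# the call compiles down to a plain integer addition.
```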
this LLVM paper (I just skimmed it) may contain some relevant info, at
the very least in its bibliography.