
Hi!

Sorry for focusing the next sprint so much on translation. This might have put some people off. Well, lesson learned. It doesn't mean we should stop talking about translation :-) Pushing the previously discussed ideas to their conclusion, we get an interesting point of view...

First, for context, the current situation in the repository: RPython code can be turned into a flow graph. Then the annotation pass on the flow graph infers types, with the side effect that it infers which functions call which functions, so it helps to build the complete list of functions that should be translated. Finally, genc.py produces naive C code using PyObject* only. The inferred types are not really used; the annotation phase is currently useful only to build the complete call graph. (Let's ignore the Lisp and Pyrex generators for the moment.)

Here is an example, with the flow graph's operation on the left and the corresponding C code (after macro expansion) on the right:

    v3 = add(v1, v2)                v3 = PyNumber_Add(v1, v2);

Quite obvious, and quite slow.

Imagine instead that there is a C-ish low-level language, whose input "syntax" is of course flow graphs, with only operations like the following ones:

    * llcall(func, arg1, arg2...)   # calls another function
    * lladd_int(x, y)               # adds two ints and returns an int
    * lleq_ptr(x, y)                # compares two pointers for equality
    * llstruct('name')              # creates an empty structure in the heap
    * llget_int(x, 'key')           # reads the field 'key' of struct object x
    * llset_int(x, 'key', value)    # writes the field 'key' of struct object x
    * llget_ptr(x, 'key')           # \
    * llset_ptr(x, 'key', value)    # /  the same for fields containing a pointer

The only data types would be "pointer to structure" and the atomic types like "int" and "char". A structure would be essentially like a dictionary, with either strings or integers as keys (a dict with int keys is much like a list, without the ability to insert or remove elements easily).

This very limited set of operations allows interesting global analysis and optimizations, like removing the allocation of structures in the heap altogether, or not writing some fields of some structures when they have a known constant value, or (most commonly) not using a hash table but just a 'C struct' with fields when all the keys are known.

It is easy to modify our regular flow graphs until they only use the above operations: we replace high-level operations like 'add' with calls to (or inlined copies of) support functions like:

    def rpython_add(v, w):
        if type(v) is int and type(w) is int:
            a = llget_int(v, 'ob_ival')
            b = llget_int(w, 'ob_ival')
            x = llstruct()
            llset(x, 'ob_type', int)
            llset(x, 'ob_ival', lladd_int(a, b))
            return x
        else:
            return llcall('PyNumber_Add', v, w)

Only such support functions can use the ll*() functions, which stand directly for the corresponding ll* operation. The ll*() functions themselves are not executable in the above Python source (actually they don't have to be callable objects at all). They are only meant as placeholders. If you wish, it's just an easy way to write down a complicated flow graph using ll* operations: instead of inventing a syntax and writing a parser for it, we use the Python syntax and the flowobjspace to build our flow graphs. Note that the type() used above is a function defined as:

    def type(v):
        return llget_ptr(v, 'ob_type')

(Actually, we might want to directly execute such code, for testing, with Python dictionaries in most variables and (say) llget_ptr() defined as dict.__getitem__(), but that's not the main goal here.)
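To make that parenthetical concrete, here is a minimal, test-only sketch (written for this illustration, not code from the repository) that gives the ll*() placeholders dictionary-based definitions so that a support function like rpython_add() can actually be executed; the type() helper is renamed rpython_type() here only to avoid shadowing the builtin that the 'is int' check relies on:

    # Hypothetical test-only definitions of the ll* placeholders, with
    # structures represented as plain Python dictionaries.
    def llstruct(name=None):
        return {}

    def llget_int(x, key):
        return x[key]

    def llset_int(x, key, value):
        x[key] = value

    llget_ptr = llget_int          # same representation in this toy model
    llset_ptr = llset_int
    llset = llset_int              # generic variant used in rpython_add()

    def lladd_int(a, b):
        return a + b

    def llcall(funcname, *args):
        raise NotImplementedError("would call the CPython API: %s" % funcname)

    def rpython_type(v):
        # plays the role of the type() helper defined above
        return llget_ptr(v, 'ob_type')

    def rpython_add(v, w):
        if rpython_type(v) is int and rpython_type(w) is int:
            a = llget_int(v, 'ob_ival')
            b = llget_int(w, 'ob_ival')
            x = llstruct()
            llset(x, 'ob_type', int)
            llset(x, 'ob_ival', lladd_int(a, b))
            return x
        else:
            return llcall('PyNumber_Add', v, w)

    v = {'ob_type': int, 'ob_ival': 2}
    w = {'ob_type': int, 'ob_ival': 3}
    assert rpython_add(v, w)['ob_ival'] == 5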
Let's come back to PyPy. We can now replace each high-level operation with a call to a support function like the above one. Initially these support functions could contain only the llcall() to the CPython API; then we can progressively add more cases to cover all of the RPython behavior. The goal would be that eventually the support functions and the optimizer become good enough to remove all calls to the CPython API.

The optimizer can do that using global analysis: e.g. in the above rpython_add(), if both 'v' and 'w' come from a previous llstruct() where 'ob_type' was set to 'int', then we know that the first branch is taken. This is a kind of generalized type inference using global constant propagation; it would supersede half of the annotation pass' job.

Also, the incremental nature of this approach is nice (and very Psyco-like, btw). It is good for testing along the way, and it also means that when we reach the goal of supporting enough for RPython, the result is still nicely useful for non-RPython code; in that case, when types cannot be inferred, there are remaining calls to the CPython API. I think that this would be a widely useful tool! I also like a lot the idea of a low-level language and optimizer which is entirely independent of Python and PyPy. We could even (still) devise an assembly-like syntax and a parser for it, and make it a separate release.

Such an optimizer is obviously not easy to write, but I think it's quite worth the effort. Most of it would be independent of the target language: it would e.g. replace operations with still more precise operations describing the usage of the structs, like:

    llheapalloc('C struct name')
    llheapget_int(x, 'field', 'C struct name')

Also, we should look closely for related work. It looks like it *has to* have been done somewhere before... But the kind of global optimizations that we need might be different from the ones any reasonable language would need, because no one would reasonably write code like the rpython_add() above when they actually mean to do a simple integer addition :-)

What's essential here, after constant propagation, is to do liveness and alias analysis for the heap structures, so that they need not be allocated in the heap at all as often as possible. Also, if we want to do explicit memory management (i.e. we would have to use an 'llfree' operation) with reference counters, then the above example of rpython_add() becomes even longer -- but of course it should still be optimized away.

A bientôt,
Armin.
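To illustrate the kind of simplification meant here, this is a hand-worked sketch of what rpython_add() could reduce to once global constant propagation has (hypothetically) proved that both arguments carry 'ob_type' == int; it reuses the toy ll*() definitions sketched earlier and is not actual optimizer output:

    # hypothetical residual code once the 'ob_type is int' guards fold to True
    def rpython_add_int_int(v, w):
        a = llget_int(v, 'ob_ival')
        b = llget_int(w, 'ob_ival')
        x = llstruct()
        llset(x, 'ob_type', int)             # a constant store; a later pass could drop it
        llset(x, 'ob_ival', lladd_int(a, b))
        return x                             # structure inlining could avoid the heap allocation

A further pass of the liveness/alias kind described above could then also remove the llstruct() allocation whenever all references to x can be tracked.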

Armin Rigo wrote:
yes, given that RPython is still OO, call-graph construction and type inference are related.

> Finally, genc.py produces naive C code using PyObject* only. The inferred types are not really used;

I still hope that for less C-like target languages, the annotated graph is still directly useful.

It seems that along this path we forget the assumption that ints are primitive in RPython, only to get that information back through analysis. Just to point this out.

It seems a goal similar to part/some aspect of the LLVM project itself.

> Such an optimizer is obviously not easy to write

yes

> but I think it's quite worth the effort.

I'm still thinking (no answer yet) whether the major problem you called structure inlining can be more easily resolved in some RPython-specific way, or whether it is best to attack it in this general way. The relevant question seems to be: to which program memory locations ((other) struct fields, function locals, inter-function parameters) does a value from this particular struct field propagate?

> Most of it would be independent of the target language:

yes, depending a bit on whether it will introduce/need function pointers, which are not natural for some possible target languages.

This LLVM paper (I just skimmed it) http://llvm.cs.uiuc.edu/pubs/2003-04-29-DataStructureAnalysisTR.html may contain some relevant info, at the very least the bibliography.

Hi Samuele,

So LLVM comes again under the focus; we should really give it a serious try at some point.

There are clearly two points of view that we can take on RPython. The first is that RPython is a kind of nice syntax over a C- or Java-like language (in this view integers are primitive, and type inference is used to fill in the missing type declarations). The alternative is to see RPython as semantically close to Python, where all the variables contain objects, but we can obtain better performance via optimization. The current annotation-and-Pyrex-or-Lisp approach follows the 1st point of view, but I feel like the 2nd one can give better results. Well, I guess you guessed :-) Let me try to explore a few consequences of such a choice.

A general critique I have regarding a lot of research (including the paper you cited) is that the techniques are flow-insensitive. I'm not sure if it is a problem for the 1st approach, but it is for the 2nd one: the low-level flow graph of, say, "a+b" is:

    if type(a) is int and type(b) is int:
        x = make_int_object(add_long(a.ob_ival, b.ob_ival))
    else:
        ...

It is essential to follow the exact control path in this function, which tells us that a and b only have a field 'ob_ival' to read when they are actually of type int.

There are still good ideas to consider from the paper you cited (which references did you think would be most relevant?). One is that this algorithm is nicely context-sensitive (i.e. it is almost as if all the functions were systematically inlined, which is great for optimization -- but which is impossible in practice, so this algorithm gives an efficient way to compute what *would* occur if all functions were inlined). This looks like an alternative to the traditional way of making functions polymorphic (where we try to guess, when a function gets called with various kinds of arguments, whether we should analyse the function several times with each specialized set of arguments or whether we should generalize and analyse the function only once).

I now tend to think that the "structure-inlining" problem is best attacked at the more general level, independently of RPython. It's quite possible that it could be done with an adaptation of the techniques that we already have in the annotation subdirectory. It would be simpler because we have fewer, more elementary types. It would be harder because we'd have to focus on the basic problem of tracking references more precisely. Trying to sketch the basics by putting together some equivalents of the SomeXxx classes and factories, it looks like it works... (but some more thinking is needed).

Re function pointers: in Java we'd probably have to group functions into families (two functions are in the same family as soon as there is a single variable that could point to either one), and then replace function pointers by integers and use switches... Lots of one-method classes with a common interface look like a waste... But the integer trick seems independent of the optimization techniques. Did you have better tricks in mind that would require more care?

Armin
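As an aside, here is a tiny sketch of the "group functions into families, replace function pointers by integers and use switches" idea; it is written in Python purely for illustration (the actual concern above is Java output) and all names are invented:

    # two functions that could both be targets of the same "function pointer"
    def add_impl(x, y):
        return x + y

    def sub_impl(x, y):
        return x - y

    # instead of a function pointer, the code carries a small integer tag
    ADD_TAG, SUB_TAG = 0, 1

    def call_arith_family(tag, x, y):
        # the "switch" that replaces an indirect call within this family
        if tag == ADD_TAG:
            return add_impl(x, y)
        elif tag == SUB_TAG:
            return sub_impl(x, y)
        raise ValueError("unknown tag %d" % tag)

    assert call_arith_family(ADD_TAG, 5, 3) == 8
    assert call_arith_family(SUB_TAG, 5, 3) == 2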

Armin Rigo wrote:
I see, I just think that maybe we should try to get as far as possible with what we constructed so far, for example even ignoring structure inlining for the first approximation. In Java, Lisp etc. it is not that relevant a problem, because for example Java arrays carry a length anyway and are heap-allocated. One thing with approach 2 is that we need to rewrite a fair chunk of Python's semantics in the compiler that way. I'm thinking that as long as we depend on/use CPython ref counting, ownership issues will crop up; they will likely not mix easily with trying to be clever with structures. In general I think that likely both 1 and 2 are worth pursuing, because of differences in target languages and characteristics of the produced code. OTOH I'm not sure whether going full force with 2 is the best thing to get the first prototype interpreter running as an extension in CPython, especially thinking about ref counting. But you worked more on that, so you can best judge what is necessary.

yes, that is what I was thinking about: you need to track "types" for things like an object/structure coming from a specific creation point or from reading a specific field, and track their propagation.

Jython right now uses the switch thing in some cases, but I suspect it is not JIT-inlining friendly, so a lot of inner classes are preferable.

Hi Samuele, On Tue, Nov 02, 2004 at 06:09:57PM +0100, Samuele Pedroni wrote:
> In general I think that likely both 1 and 2 are worth pursuing, because of differences in target languages and characteristics of the produced code.
Makes sense. We must set priorities, and trying to go as far as possible with the existing approach certainly looks like a good idea. This means that we shouldn't put too much optimization effort into the current C backend. I suggest that the next goal would be to get something that works, even slowly, and try to keep the Lisp backend up-to-date. I think that the type annotations are not used much in the Lisp backend so far, but it should be easy to generate type declarations. It might be a good way to get a "fast enough" translation.
Indeed, it looks like structure inlining ("option 2") is not really useful for all target languages.
> One thing with approach 2 is that we need to rewrite a fair chunk of Python's semantics in the compiler that way.
Yes, though these semantics have to be somewhere anyway; currently they are distributed among the type inference, the backend itself, and manually written support code for the target language (i.e. the C macro file genc.h, or the snippets of Lisp code included in gencl.py). Approach 2 would put them in a central place, making the meta-programming Psyco-ish approach more explicit.
I don't think that reference counting is a major problem. The way I see approach 2 so far is that we'd have basically two kinds of annotations for the low-level graphs, SomeInlinedStructure and SomeHeapStructure. We'd use the former as long as possible, i.e. as long as we can track all references to the structure; and when we no longer can, we mark the structure's creation point as "too complex" and reflow from there using SomeHeapStructure instead. Only the latter would have reference counters and a CPython-compatible layout in memory.
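A rough sketch of what these two kinds of annotations could look like; the class and function names below are only illustrative, not actual annotation code:

    class SomeInlinedStructure:
        """A structure for which every reference can still be tracked; its
        fields may end up as plain locals, with no heap allocation at all."""
        def __init__(self, creation_point, fields=None):
            self.creation_point = creation_point
            self.fields = dict(fields or {})   # field name -> annotation

    class SomeHeapStructure:
        """Fallback: a real heap object with a reference counter and a
        CPython-compatible layout in memory."""
        def __init__(self, creation_point):
            self.creation_point = creation_point

    def demote_to_heap(ann, reflow):
        """Called when some reference can no longer be tracked: mark the
        creation point as 'too complex' and re-annotate from there using
        the heap fallback."""
        heap_ann = SomeHeapStructure(ann.creation_point)
        reflow(ann.creation_point, heap_ann)
        return heap_ann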
I agree with you now for the reasons above, even if I think that the refcounts are not a major issue (but maybe I'm overlooking something).
> Jython right now uses the switch thing in some cases, but I suspect it is not JIT-inlining friendly, so a lot of inner classes are preferable.
Good to know. Well, I guess we'll have to try both and compare, anyway. A bientôt, Armin

On Tue, Nov 02, 2004 at 06:09:57PM +0100, Samuele Pedroni wrote:
Oops, by the way: we can't use a Java array to implement an RPython list, because you can't resize Java arrays. So there is an extra indirection even in Java, and there is a need to remove it with enough analysis and optimization... Or am I missing something? In Lisp you can declare resizable arrays, though that may mean an extra indirection too. Armin

Samuele Pedroni wrote:
oops, sorry, now I see: you want RPython lists to be lists, which is probably different from what we wanted originally. listobject is still written assuming that ob_item is a chunk of memory (i), but in other parts we are assuming that RPython lists can be extended and appended to (ii). (Btw, the question is whether the resizing policy should live in listobject or in the RPython list semantics.) With approach 2 we would go from all the fields necessary for a general list to just len+items in the first case, because some things are constants. It's still true that both an ad-hoc approach, given knowledge about the specific types that exist in RPython, and a general one are possible. In Java, as a first approximation, you would use an array for (i) and an ArrayList for (ii). I still think that a baseline interpreter with straightforward optimizations is a good first goal, also because we can use it as a reference to track our progress and see what gives the biggest payoffs.

Hi Samuele, On Thu, Nov 04, 2004 at 03:51:50PM +0100, Samuele Pedroni wrote:
No, I'm thinking about the following problem: suppose that an RPython list is meant to represent an array that can be resized, but only at some expense. The problem with implementing that using a bare Java array is that when you reallocate and copy, all the references to the array would have to be updated to point to the new array. This is impossible unless you either track all references using complex analysis, or encapsulate the array in another Java object. It's like, say, in Python where you cannot use bare strings to implement mutable strings: although you can use slices and concatenation to build a modified string, you cannot update all the places that pointed to the old string to now point to the new string. So you have to encapsulate the string in an instance. Armin
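A tiny Python illustration of the encapsulation point (an invented example, not PyPy code): a bare string cannot be updated in place, but a wrapper object holding the current string can be, and every holder of a reference to the wrapper sees the update:

    class MutableString:
        """Encapsulates an immutable string so that 'mutation' is visible
        through every reference to the wrapper object."""
        def __init__(self, value):
            self.value = value

        def replace_slice(self, start, stop, new_part):
            # build a new string, but store it back into the same wrapper
            self.value = self.value[:start] + new_part + self.value[stop:]

    s = MutableString("hello world")
    alias = s                               # a second reference to the same wrapper
    s.replace_slice(0, 5, "goodbye")
    assert alias.value == "goodbye world"   # the alias sees the change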

Armin Rigo wrote:
that's what I was implicitly thinking of with "be lists". My point anyway was simply that for Java the first relevant thing is to distinguish constant-size lists (the code in listobject uses such things) vs. variable-size lists, and both then have natural representations. In a more subtle approach, lists that always have just _one_ reference to them in the whole heap (and perhaps are not passed as parameters) can likely also be implemented as a direct array reference. It is true that the generalization and understanding of these kinds of criteria is what we need for general structure inlining.

Hi Samuele, On Sat, Nov 06, 2004 at 07:47:58PM +0100, Samuele Pedroni wrote:
Ah: I didn't realize that listobject's "ob_item" was never resized. I thought that _list_resize() would extend the existing list object, but instead it creates a new one and assigns it to "w_self.ob_item". I was insisting on this point based on this misunderstanding, sorry. Indeed, it looks much easier to just detect that "ob_item" never changes its size, and use a plain Java array because of that. Armin
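A simplified sketch of the pattern being described (not the actual listobject code): the chunk referenced by 'ob_item' is never resized in place; growing the list allocates a fresh chunk and rebinds the field, which is why a backend could map the chunk to a plain fixed-size array:

    class W_ListSketch:
        def __init__(self):
            self.ob_size = 0
            self.ob_item = [None] * 4           # fixed-size chunk, never resized in place

    def _list_resize_sketch(w_self, newsize):
        if newsize > len(w_self.ob_item):
            newchunk = [None] * (2 * newsize)   # allocate a brand-new chunk
            for i in range(w_self.ob_size):
                newchunk[i] = w_self.ob_item[i]
            w_self.ob_item = newchunk           # rebind the field; the old chunk is dropped
        w_self.ob_size = newsize

    w = W_ListSketch()
    _list_resize_sketch(w, 10)
    assert len(w.ob_item) >= 10 and w.ob_size == 10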

Hi Armin, hi Samuele, [Armin Rigo Tue, Nov 02, 2004 at 04:19:05PM +0000]
> So LLVM comes again under the focus; we should really give it a serious try at some point.
indeed. We may also give joeq (http://joeq.sourceforge.net) a look although development seems to be stalled.
To me the pyrex approach was just a short-cut to produce what your new C-backend now produces directly.
the choice of the backend should not influence the flow/annotation analysis, should it? I guess i don't see the connection you seem to imply between these two parts of the translation process.
Does this relate to a previous idea of generating two flow branches for the above if/else that would not get merged at the next point after the if/else? IIRC, we currently merge the "framestates" right after the if/else. cheers, holger

Hi Holger, On Tue, Nov 02, 2004 at 07:36:53PM +0100, holger krekel wrote:
> To me the pyrex approach was just a short-cut to produce what your new C-backend now produces directly.
Sure, but the issue under discussion is the next step: how the type information should be used. The C backend doesn't use it at the moment.
There are two options. What I called "option 1" is to use the existing annotation analysis, and then make use of this information in the backends. This is a good approach for generating Java code, but maybe not so much for C, where "option 2" would give better results. This "option 2" would not use the existing annotation analysis at all. Instead, it would replace the (unannotated) high-level operations with low-level operations and then do a slightly different kind of analysis on these low-level operations, and finally the back-end would turn that into C. As far as I can tell, only "option 2" can give the best results to translate, say, our W_ListObject class, with its 'ob_item' field containing a list that is "owned" by the W_ListObject instance and never visible to other parts of the code.
No, it's not about merging or not the two flow branches (or the results of the analysis) after the if/else: in both cases, flow-sensitive or flow-insensitive, they are merged. The difference between flow-sensitive and flow-insensitive is that in the former case, if we discover that we know the types of a and b statically, then we only explore the first branch (so there is no merging to be done, because the 2nd branch is never analysed). This is what we do currently in our annotations. By contrast, flow-insensitive algorithms like the one in the paper Samuele pointed out (or the one done by Starkiller) don't work like that: they just look at all instructions in the function and accumulate their possible effects, without worrying about the control flow. A bientôt, Armin

Hi Armin, [Armin Rigo Wed, Nov 03, 2004 at 02:10:21PM +0000]
So we do already keep possibly different function versions for differently typed sets of input parameters to this function? I guess i am mixing up flow analysis and annotation somewhat here. Or even have a deeper misunderstanding :-)
right. They have to do that because they look at the static source code and not at a representation obtained from abstract execution like we do in objspace/flow. cheers, holger

Hi Holger, On Wed, Nov 03, 2004 at 03:28:42PM +0100, holger krekel wrote:
> So we do already keep possibly different function versions for differently typed sets of input parameters to this function?
Er, no...
> I guess i am mixing up flow analysis and annotation somewhat here. Or even have a deeper misunderstanding :-)
Hum. We have the flow analysis which, using merge point detection, produces a flow graph that follows the control flow of the input functions (one graph per function). This works fine and we don't plan to change anything there. Then we have the annotation phase, which follows the existing flow graphs by abstract interpretation, and which merges the discovered information exactly when two branches meet in the flow graph. This is a "flow-sensitive" kind of analysis. It is good because in this kind of code:

    if x:
        y = 1
    else:
        y = 2

when the annotation follows the flow graph of this piece of code, if 'x' is not a constant, then both paths are followed, and when the two paths merge, the two incompatible results ("y is the constant 1" and "y is the constant 2") are merged into less precise information ("y is an integer"). But if 'x' is a constant, then only one path will be followed, and we keep the precise knowledge of the value of 'y' even after the if/else statement.

Flow-insensitive algorithms just look at the whole function and see two statements: "y=1" and "y=2". From that they conclude that 'y' must be an integer no matter what. They can't tell you the value of 'y' even if they knew the value of 'x'. This difference is in the algorithm used; it's not related to:
No, that's not related. It's just that there are big advantages if you don't care about the precise flow for some kinds of algorithms. We could also collect all the operations that appear in a flow graph and deduce things in a flow-insensitive manner if we wanted to; and conversely Starkiller could also look at the static source and still infer things in a more flow-sensitive way. A bientôt, Armin.

Hi again, [Armin Rigo Wed, Nov 03, 2004 at 02:50:46PM +0000]
me hum, too :-)  This means that http://codespeak.net/pypy/index.cgi?doc/architecture.html is not right when it says:

    The translation process is done in three steps:

    - producing a flow graph representation of the standard interpreter.
      A combination of a plain interpreter and a flow object space performs
      "abstract interpretation" to record the flow of objects and execution
      throughout a python program into such a flow graph.

    - ...

because your current prescribed notion of abstract interpretation starts with already existing flow graphs. Hum. I am ready to accept your current use of the terms, but it is confusing that we keep changing notions and terms quite a bit ... I guess that when i get confused about this, other people could get confused about this as well.
But if one part of the program uses the one branch and another part uses the other branch, then you will generalize the function to annotate 'y' as being an integer. This was what i was referring to when i asked, surprised: "So we do already keep possibly different function versions for differently typed sets of input parameters to this function?", to which the answer apparently is "no". Only when just one branch is used will the flow-sensitive annotation analysis regard 'y' as being a constant.
But in order to do that, Starkiller would probably need to use a representation that reflects the actual possible flows of execution. I didn't intend to claim that looking at source code couldn't be done in a flow-sensitive manner. Such a claim wouldn't make much sense, as source code and its execution flow are mostly equivalent. cheers, holger

Hi Holger, On Wed, Nov 03, 2004 at 04:40:01PM +0100, holger krekel wrote:
Abstract interpretation is the name of a very general technique: we use it twice. We use it to produce the flow graph (i.e. the architecture document is correct), and then we also use it to produce annotations on the graph (the architecture document doesn't say how we do that, so it's not wrong for not saying that we do it with another kind of abstract interpretation too). Now I see where the confusion comes from. The e-mails in this thread have nothing to do with the flowobjspace or the flow graph construction itself. What you said about it is correct but off topic: the topic is annotations and backends. All the "abstract interpretation" I talked about is the one we do to infer the annotations.
Right. Now I understand your question:
> So we do already keep possibly different function versions for differently typed sets of input parameters to this function?
The original code example with "if type(a) is int and ..." was supposed to be inlined in a lot of places, i.e. to replace verbatim all the high-level "add" operations. You're right, I was quite vague. If that code example were put in a separate function, then probably all the branches would have to be analysed anyway, unless we kept different versions of the function for differently typed sets of inputs. Instead, to avoid the problem, it's better to create an inlined copy of the code everywhere and then solve the code bloat by optimizing away all branches but one. A bientôt, Armin

Hi Armin, [Armin Rigo Wed, Nov 03, 2004 at 04:58:31PM +0000]
Thanks for the clarification.
Right. However, in the past we have sometimes been mixing the flow analysis and annotation parts, both in talking and coding. I do recall a discussion regarding the determination of "merge points" in the flow analysis that we _could_ connect to type/annotation analysis (during a screen session where we hacked together on the flow/* code), but let's postpone this as i agree it doesn't necessarily relate to the current topic. Btw, it will be interesting to come up with a formal definition of abstract interpretation at some point, along with somewhat more formal definitions of the other terms we often use (not necessarily completely mathematical definitions, mind you). Kind of a brief "glossary" thingie to allow us and others to understand what we are talking about. Maybe you or Samuele could even start a brief but current "glossary.txt" before the Vilnius sprint ... ? cheers, holger
