[pypy-svn] rev 2166 - pypy/trunk/doc/translation

arigo at codespeak.net
Tue Nov 4 20:42:47 CET 2003


Author: arigo
Date: Tue Nov  4 20:42:46 2003
New Revision: 2166

Modified:
   pypy/trunk/doc/translation/annotation.txt
Log:
Separated again the control flow generation from the annotation pass.
Added Holger's suggestion about polymorphic code.


Modified: pypy/trunk/doc/translation/annotation.txt
==============================================================================
--- pypy/trunk/doc/translation/annotation.txt	(original)
+++ pypy/trunk/doc/translation/annotation.txt	Tue Nov  4 20:42:46 2003
@@ -1,5 +1,10 @@
-Mixed control flow / annotation pass
-====================================
+The annotation pass
+===================
+
+Let's assume that the control flow graph building pass can be
+done entirely before the annotation pass.  (See notes at the
+end for why we'd like to mix them.)
+
 
 Factorial
 ---------
@@ -13,15 +18,30 @@
     else:
       return 1
 
-Suppose that we know that ``f`` takes an ``int`` argument.
-We start with the flow objspace applied to the entry point::
+The flow objspace gives us the following graph (after the
+simplification that makes ``simple_call``)::
 
   StartBlock(v1):
     v2 = ge(v1, 2)
     exitswitch(v2)
+      link "True" to Block2(v1)
+      link "False" to Block3
 
-We suspend the flow objspace here, and we immediately do the
-type inference on this block::
+  Block2(v3):
+    v4 = sub(v3, 1)
+    v7 = simple_call(f, v4)
+    v8 = mul(v3, v7)
+    jump to ReturnBlock(v8)
+
+  Block3:
+    v9 = 1
+    jump to ReturnBlock(v9)
+
+  ReturnBlock(retval):
+    (empty, just returns retval)
+
+Suppose that we know that ``f`` takes an ``int`` argument.
+We start type inference on the first block::
 
   Analyse(StartBlock):
     v1 ----> X1   type(X1)=int
@@ -34,34 +54,8 @@
 about the unknown heap objects.  The arrows represent binding from
 variables to objects.
 
-We perform the type inference early because it may give interesting
-information about ``v2``, the variable whose truth-value determines in
-which block we must continue.  In this case we don't know if ``v2`` will
-be True or False, but in some cases type inference can help (for
-example, in code like ``a, b = c, d`` the type inference can tell that
-the right-hand tuple is of length two and so no ValueError will be
-thrown for unpacking a tuple of the wrong length).
-
-Let's come back to StartBlock.  We add an exit corresponding to the case
-``v2==True``, jumping to Block2 which we flow-analyse now::
-
-  Block2(v3):
-    v4 = sub(v3, 1)
-    v5 = newtuple(v4)
-    v6 = newdict()
-    v7 = call(f, v5, v6)
-    v8 = mul(v3, v7)
-    jump to ReturnBlock(v8)
-
-This is simplified into::
-
-  Block2(v3):
-    v4 = sub(v3, 1)
-    v7 = simple_call(f, v4)
-    v8 = mul(v3, v7)
-    jump to ReturnBlock(v8)
-
-Type inference::
+After StartBlock, we proceed to the type inference of its exits;
+first Block2::
 
   Analyse(Block2):
     v3 ------------> X1   # copied from StartBlock
@@ -70,22 +64,14 @@
 
 It fails at the simple_call to f, because we don't know yet anything
 about the return value of f.  We suspend the analysis of Block2 and
-resume at some other non-blocked point -- for example, we can now
-consider adding an exit to StartBlock for the case ``v2==False``,
-jumping to Block3::
-
-  Block3:
-    v9 = 1
-    jump to ReturnBlock(v9)
+resume at some other non-blocked point -- in this case, the other exit
+from StartBlock, which jumps to Block3::
 
   Analyse(Block3):
     v9 --------> 1    # and we have type(1)=int automatically
 
 Then we proceed to ReturnBlock::
 
-  ReturnBlock(retval):
-    (empty, just returns retval)
-
   Analyse(ReturnBlock):
     retval --------> 1
 
@@ -154,17 +140,17 @@
 
 A program of more than one function is analysed in exactly the same way,
 starting from an entry point and following calls.  We have a cache of all
-the blocks that the flow objspace produced, and a list of pending blocks
+the flowgraphs that the flow objspace produced, and a list of pending blocks
 that we have to type-analyse.  When the analysis of a block temporarily
 fails (as above for the first recursive call to ``f``) we move the block
-at the end of the pending list.  There is only one heap of annotations
-for the whole program, so that we can track where the objects come from
-and go through the whole program.  (This makes separate compilation very
-difficult, I guess.)
+back into the pending list.  There is only one heap of annotations for the
+whole program, so that we can track where the objects come from and go
+through the whole program.  (This makes separate compilation very difficult,
+I guess.)
 
 
-Empty lists
------------
+Empty lists and mutable objects
+-------------------------------
 
 Nothing special is required for empty lists.  Let's try::
 
@@ -184,10 +170,10 @@
     v3 = simple_call(g, v1, 6)
 
   Analyse(F_StartBlock):
-    v1 -------> X1   type(X1)=list  len(X1)=0  getitem(X1,?)=?
+    v1 -------> X1   type(X1)=list  len(X1)=0  getitem(X1,*)=?
     v2 -------> crash
 
-The ``?`` is a special value meaning ``no analysis information``.  The type analysis fails because of the calls to ``g``, but it triggers the analysis of ``g`` with the input arguments' annotations::
+The ``?`` is a special value meaning ``no analysis information``, and ``*`` is a special catch-all value.  The type analysis fails because of the calls to ``g``, but it triggers the analysis of ``g`` with the input arguments' annotations::
 
   G_StartBlock(v4, v5):
     v6 = getattr(v4, 'append')
@@ -220,7 +206,7 @@
 
 And so this time the list ``X1`` is updated with::
 
-    getitem(X1,?)=X5
+    getitem(X1,*)=X5
 
 and now we know that we have a list of integers.
 
@@ -236,3 +222,33 @@
 themselves use a representation that depends on all the lists that could
 come at this point.  All these places and lists will use a common,
 most general representation.
+
+
+Polymorphism and mixed flowing/inference
+----------------------------------------
+
+We might eventually mix type inference and control flow generation a bit
+more than described above.  The annotations could influence the generation
+of the graph.
+
+The most interesting influence would be to occasionally prevent two
+FrameStates from being merged.  This would result in a bigger control flow
+graph in which several basic blocks can contain the operations for the
+same bytecode positions, with different annotations.  In particular, a
+quite interesting idea is to forbid merging two states if the
+resulting intersection of annotations is too poor -- say if it would make
+genpyrex.py use the fall-back generic object type, which is not available
+to other genxxx.py.
+
+The result is that we will automatically generate several specialized
+versions of the RPython code when it is meant to be polymorphic.  For
+example, in a function such as::
+
+    def push(stack, item):
+        stack.append(item)
+
+the different entry points, specifying quite different type annotations
+for ``item``, are all unmergeable, as merging them would leave too
+few annotations.  By contrast, in the factorial
+example above, all merges are fine because they conserve at least the
+``type(X)=int`` annotation.
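
The merge rule from the "Polymorphism and mixed flowing/inference" section can be sketched in Python as follows. This is only an illustration, not PyPy code: the representation of annotations as plain dictionaries, the names ``merge_annotations`` and ``can_merge``, and the ``min_annotations`` threshold are all assumptions made for the example.

```python
def merge_annotations(a, b):
    """Intersect two annotation sets: keep only what both states agree on."""
    return {key: value for key, value in a.items()
            if key in b and b[key] == value}


def can_merge(a, b, min_annotations=1):
    """Hypothetical rule: refuse a merge that would leave too few annotations."""
    return len(merge_annotations(a, b)) >= min_annotations


# Factorial-like case: both states still agree on type=int after merging.
int_state = {"type": int, "value": 1}
other_int_state = {"type": int}

# push()-like case: two entry points with unrelated types for ``item``.
item_is_int = {"type": int}
item_is_str = {"type": str}

print(can_merge(int_state, other_int_state))
print(can_merge(item_is_int, item_is_str))
```

With this rule the two ``push`` entry points stay separate, which is what would lead to one specialized copy of the function per argument type.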

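The pending-block scheduling described in the modified text (a block whose analysis temporarily fails goes back into the pending list) can be sketched roughly as below, using the factorial graph as input. Everything here is hypothetical: the ``Blocked`` exception, the ``annotate`` function, the per-block functions and the choice to re-queue at the end of the list are assumptions for illustration. Re-analysing a block when its inputs become more general is left out.

```python
from collections import deque


class Blocked(Exception):
    """Raised when a block needs an annotation that is not known yet."""


def annotate(blocks, entry):
    """Analyse blocks starting from ``entry``; blocked blocks are re-queued.

    ``blocks`` maps a block name to a function that takes the shared
    ``bindings`` dict and returns the names of its successor blocks.
    """
    bindings = {}              # the single heap shared by the whole program
    pending = deque([entry])
    done = []                  # blocks analysed successfully, in order
    stalled = 0                # consecutive failures, to detect a deadlock
    while pending:
        name = pending.popleft()
        try:
            successors = blocks[name](bindings)
        except Blocked:
            pending.append(name)           # assumption: re-queue at the end
            stalled += 1
            if stalled > len(pending):
                raise RuntimeError("every pending block is blocked")
            continue
        stalled = 0
        done.append(name)
        for successor in successors:
            if successor not in done and successor not in pending:
                pending.append(successor)
    return bindings, done


def start_block(bindings):
    bindings["v1"] = int                   # f is known to take an int
    return ["Block2", "Block3"]


def block2(bindings):
    if "f_result" not in bindings:
        raise Blocked("return value of f is unknown")
    bindings["v7"] = bindings["f_result"]
    bindings["v8"] = int                   # mul of two ints, assumed to be int
    return ["ReturnBlock"]


def block3(bindings):
    bindings["retval"] = int               # v9 = 1
    return ["ReturnBlock"]


def return_block(bindings):
    bindings["f_result"] = bindings["retval"]
    return []


FACTORIAL = {
    "StartBlock": start_block,
    "Block2": block2,
    "Block3": block3,
    "ReturnBlock": return_block,
}

bindings, order = annotate(FACTORIAL, "StartBlock")
print(order)
```

Block2 is attempted right after StartBlock, fails at the call to ``f``, and only succeeds once Block3 and ReturnBlock have supplied the return annotation.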
