[pypy-svn] r19526 - pypy/dist/pypy/doc

arigo at codespeak.net
Fri Nov 4 16:53:14 CET 2005


Author: arigo
Date: Fri Nov  4 16:53:12 2005
New Revision: 19526

Modified:
   pypy/dist/pypy/doc/draft-dynamic-language-translation.txt
Log:
small changes and typos and (argh) reindented large sections
around the proofs


Modified: pypy/dist/pypy/doc/draft-dynamic-language-translation.txt
==============================================================================
--- pypy/dist/pypy/doc/draft-dynamic-language-translation.txt	(original)
+++ pypy/dist/pypy/doc/draft-dynamic-language-translation.txt	Fri Nov  4 16:53:12 2005
@@ -280,7 +280,7 @@
 
 * the `Annotator`_ is performing type inference.  This part is best
   implemented separately from other parts because it is based on a
-  fixpoint research algorithm.  It is mostly this part that defines and
+  fixpoint search algorithm.  It is mostly this part that defines and
   restricts what RPython exactly is.  After annotation, the control flow
   graphs still contain all the original relatively-high-level RPython
   operations; the inferred information is only used in the next step.
@@ -643,7 +643,7 @@
 
 The goal of the annotator is to find the most precise annotation that
 can be given to each variable of all control flow graphs while
-respecting constrains imposed by the operations in which these variables
+respecting constraints imposed by the operations in which these variables
 are involved.
 
 More precisely, it is usually possible to deduce information about the
@@ -663,7 +663,7 @@
 is a more naive approach than the one taken by type systems specifically
 designed to enable more advanced inference algorithms.  For example,
 `Hindley-Milner`_ type inference works in an inside-out direction, by
-starting from individual operations and propagating type constrains
+starting from individual operations and propagating type constraints
 outwards.
 
 Naturally, simply propagating annotations forward requires the use of a
@@ -1366,129 +1366,125 @@
 
 1. Input variables of blocks
 
-       The annotation of these variables are only modified by the
-       ``phi`` rule, which is based on ``merge``.  The ``merge``
-       operation trivially guarantees the property of generalisation
-       because it is based on the union operator ``\/`` of the lattice.
+   The annotations of these variables are only modified by the ``phi``
+   rule, which is based on ``merge``.  The ``merge`` operation trivially
+   guarantees the property of generalisation because it is based on the
+   union operator ``\/`` of the lattice.
 
 2. Auxiliary variables of operations
 
-       The binding of an auxiliary variable *z'* of an operation is
-       never directly modified: it is only ever identified with other
-       variables via *E*.  So ``b(z')`` is only updated by the rule
-       ``(z'~v) in E``, which is based on the ``merge`` operator as
-       well.
+   The binding of an auxiliary variable *z'* of an operation is never
+   directly modified: it is only ever identified with other variables
+   via *E*.  So ``b(z')`` is only updated by the rule ``(z'~v) in E``,
+   which is based on the ``merge`` operator as well.
 
 3. Variables corresponding to attributes of classes
 
-       The annotation of such variables can only be modified by the
-       ``setattr`` rule (with a ``merge``) or as a result of having been
-       identified with other variables via *E*.  We conclude as above.
+   The annotation of such variables can only be modified by the
+   ``setattr`` rule (with a ``merge``) or as a result of having been
+   identified with other variables via *E*.  We conclude as above.
 
 4. Input and result variables of operations
 
-       By construction of the flow graphs, the input variables of any
-       given operation must also appear before in the block, either as
-       the result variable of a previous operation, or as an input
-       variable of the block itself.  So assume for now that the input
-       variables of this operation can only get generalised; we claim
-       that in this case the same holds for its result variable.  If
-       this holds, then we conclude by induction on the number of
-       operations in the block: indeed, starting from point 1 above for
-       the input variables of the block, it shows that each result
-       variable -- so also all input arguments of the next operation --
-       can only get generalised.
-
-       To prove our claim, first note that none of these input and
-       result variables is ever identified with any other variable via
-       *E*: indeed, the rules described above only identify auxiliary
-       variables and attribute-of-class variables with each other.  It
-       means that the only way the result variable *z* of an operation
-       can be modified is directly by the rule or rules specific to that
-       operation.  This allows us to check the property of
-       generalisation on a case-by-case basis.
-
-       Most cases are easy to check.  Cases like ``b' = b with
-       (z->b(z'))`` are based on point 2 above.  The only non-trivial
-       case is in the rule for ``getattr``::
-
-           b' = b with (z->lookup_filter(b(z'), C))
-
-       For this case, we need the following lemma:
-
-           Let ``v_C.attr`` be an attribute-of-class variable.  Let
-           ``(b,E)`` be any state seen during the annotation process.
-           Assume that ``b(v_C.attr) = Pbc(set)`` where *set* is a set
-           containing potential bound method objects.  Call *m* the
-           family of potential bound method objects appearing in *set*.
-           Then *m* has the following shape: it is "regular" in the
-           sense that it contains only potential bound method objects
-           ``D.f`` such that ``f`` is exactly the function found under
-           the name ``attr`` in some class ``D``; moreover, it is
-           "downwards-closed" in the sense that if it contains a
-           ``D.f``, then it also contains all ``E.g`` for all subclasses
-           ``E`` of ``D`` that override the method (i.e. have a function
-           ``g`` found under the same name ``attr``).
-
-       Proof:
-
-           As we have seen in `Classes and instances`_ above, the
-           initial binding of ``v_C.attr`` is regular and
-           downwards-closed by construction.  Moreover, the ``setattr``
-           rule explicitly checks that it is never adding any potential
-           bound method object to ``b(v_C.attr)``, so that the only way
-           such objects can be added to ``b(v_C.attr)`` is via the
-           identification of ``v_C.attr`` with other ``v_B.attr``
-           variables, for the same name ``attr`` -- which implies that
-           the set *m* will always be regular.  Moreover, the union of
-           two downwards-closed sets of potential bound method objects
-           is still downwards-closed.  This concludes the proof.
-
-       Let us consider the rule for ``getattr`` again::
-
-           b' = b with (z->lookup_filter(b(z'), C))
-
-       The only interesting case is when the binding ``b(z')`` is a
-       ``Pbc(set)`` -- more precisely, we are interested in the part *m*
-       of *set* that is the subset of all potential bound method
-       objects; the function ``lookup_filter`` is the identity on the
-       rest.  Given that ``b(z')`` comes from ``z'`` being identified
-       with various ``v_C.attr`` for a fixed name ``attr``, the set *m*
-       is regular and downwards-closed, as we can deduce from the lemma.
-
-       The class ``C`` in the rule for ``getattr`` comes from the
-       annotation ``Inst(C)`` of the first input variable of the
-       operation.  So what we need to prove is the following: if the
-       binding of this input variable is generalised, and/or if the
-       binding of ``z'`` is generalised, then the annotation computed by
-       ``lookup_filter`` is also generalised (if modified at all)::
-
-           if          b(x)  = Inst(C)   <=   b'(x)  = Inst(C')
-           and         b(z') = Pbc(set)  <=   b'(z') = Pbc(set')
-
-           then  lookup_filter(b(z'), C) <= lookup_filter(b'(z'), C')
-
-       Proof:
-
-           Call respectively *m* and *m'* the subsets of potential bound
-           method objects of *set* and *set'*, as above.  Call *l* the
-           subset of *m* as computed by ``lookup_filter``, i.e.: *l*
-           contains all objects ``D.f`` of *m* for strict subclasses
-           ``D`` of ``C``, plus the single ``B.g`` coming for the most
-           derived non-strict superclass ``B>=C`` which appears in *m*.
-           (Note that as *m* is regular, it cannot actually contain
-           several potential bound method objects with the same class.)
-           Similarly for *l'* computed from *m'* and ``C'``.
-
-           By hypothesis, *m* is contained in *m'*, but we need to check
-           that *l* is contained in *l'*.  This is where we will use the
-           fact that *m* is downwards-closed.  Let ``D.f`` be an element
-           of *l*.
-
-           Case 1: if ``D`` is a strict subclass of ``C``, then it is
-           also a strict subclass of ``C'``.  In this case, ``D.f``,
-           which also belong to *m* and thus to *m'*, also belongs to
-           *l'*.  Graphically::
+   By construction of the flow graphs, the input variables of any given
+   operation must also appear before in the block, either as the result
+   variable of a previous operation, or as an input variable of the
+   block itself.  So assume for now that the input variables of this
+   operation can only get generalised; we claim that in this case the
+   same holds for its result variable.  If this holds, then we conclude
+   by induction on the number of operations in the block: indeed,
+   starting from point 1 above for the input variables of the block, it
+   shows that each result variable -- so also all input arguments of the
+   next operation -- can only get generalised.
+
+   To prove our claim, first note that none of these input and result
+   variables is ever identified with any other variable via *E*: indeed,
+   the rules described above only identify auxiliary variables and
+   attribute-of-class variables with each other.  It means that the only
+   way the result variable *z* of an operation can be modified is
+   directly by the rule or rules specific to that operation.  This
+   allows us to check the property of generalisation on a case-by-case
+   basis.
+
+   Most cases are easy to check.  Cases like ``b' = b with (z->b(z'))``
+   are based on point 2 above.  The only non-trivial case is in the rule
+   for ``getattr``::
+
+       b' = b with (z->lookup_filter(b(z'), C))
+
+   For this case, we need the following lemma:
+
+       Let ``v_C.attr`` be an attribute-of-class variable.  Let
+       ``(b,E)`` be any state seen during the annotation process.
+       Assume that ``b(v_C.attr) = Pbc(set)`` where *set* is a set
+       containing potential bound method objects.  Call *m* the family
+       of potential bound method objects appearing in *set*.  Then *m*
+       has the following shape: it is "regular" in the sense that it
+       contains only potential bound method objects ``D.f`` such that
+       ``f`` is exactly the function found under the name ``attr`` in
+       some class ``D``; moreover, it is "downwards-closed" in the sense
+       that if it contains a ``D.f``, then it also contains all ``E.g``
+       for all subclasses ``E`` of ``D`` that override the method
+       (i.e. have a function ``g`` found under the same name ``attr``).
+
+   Proof:
+
+       As we have seen in `Classes and instances`_ above, the initial
+       binding of ``v_C.attr`` is regular and downwards-closed by
+       construction.  Moreover, the ``setattr`` rule explicitly checks
+       that it is never adding any potential bound method object to
+       ``b(v_C.attr)``, so that the only way such objects can be added
+       to ``b(v_C.attr)`` is via the identification of ``v_C.attr`` with
+       other ``v_B.attr`` variables, for the same name ``attr`` -- which
+       implies that the set *m* will always be regular.  Moreover, the
+       union of two downwards-closed sets of potential bound method
+       objects is still downwards-closed.  This concludes the proof.
+
+   Let us consider the rule for ``getattr`` again::
+
+       b' = b with (z->lookup_filter(b(z'), C))
+
+   The only interesting case is when the binding ``b(z')`` is a
+   ``Pbc(set)`` -- more precisely, we are interested in the part *m* of
+   *set* that is the subset of all potential bound method objects; the
+   function ``lookup_filter`` is the identity on the rest.  Given that
+   ``b(z')`` comes from ``z'`` being identified with various
+   ``v_C.attr`` for a fixed name ``attr``, the set *m* is regular and
+   downwards-closed, as we can deduce from the lemma.
+
+   The class ``C`` in the rule for ``getattr`` comes from the annotation
+   ``Inst(C)`` of the first input variable of the operation.  So what we
+   need to prove is the following: if the binding of this input variable
+   is generalised, and/or if the binding of ``z'`` is generalised, then
+   the annotation computed by ``lookup_filter`` is also generalised (if
+   modified at all)::
+
+       if          b(x)  = Inst(C)   <=   b'(x)  = Inst(C')
+       and         b(z') = Pbc(set)  <=   b'(z') = Pbc(set')
+
+       then  lookup_filter(b(z'), C) <= lookup_filter(b'(z'), C')
+
+   Proof:
+
+       Call respectively *m* and *m'* the subsets of potential bound
+       method objects of *set* and *set'*, as above.  Call *l* the
+       subset of *m* as computed by ``lookup_filter``, i.e.: *l*
+       contains all objects ``D.f`` of *m* for strict subclasses ``D``
+       of ``C``, plus the single ``B.g`` coming from the most derived
+       non-strict superclass ``B>=C`` which appears in *m*.  (Note that
+       as *m* is regular, it cannot actually contain several potential
+       bound method objects with the same class.)  Similarly for *l'*
+       computed from *m'* and ``C'``.
+
+       By hypothesis, *m* is contained in *m'*, but we need to check
+       that *l* is contained in *l'*.  This is where we will use the
+       fact that *m* is downwards-closed.  Let ``D.f`` be an element of
+       *l*.
+
+       Case 1: if ``D`` is a strict subclass of ``C``, then it is also a
+       strict subclass of ``C'``.  In this case, ``D.f``, which also
+   belongs to *m* and thus to *m'*, also belongs to *l'*.
+       Graphically::
 
                                C'
                              / | \
@@ -1499,23 +1495,23 @@
                        /  |  \
                       D1  D2  D3
 
-           (In this diagram, as far as they correspond to real methods,
-           potential bound method objects ``D1.f1``, ``D2.f2`` and
-           ``D3.f3`` all belong to both *l* and *l'*.  The family *l'*
-           additionally contains ``C.g``, ``E2.h2`` and ``E3.h3``.  Both
-           families *l* and *l'* also contain one extra item coming from
-           the second part of their construction rule.)
-
-           Case 2: ``D`` is instead to most derived non-strict
-           superclass of ``C`` which appears in *m*; assume that ``D``
-           is still a strict subclass of ``C'``.  In this case, ``D.f``
-           belongs to *l'* as previously.  (For example, if there is a
-           ``C.g`` in *m*, then ``D = C`` and as seen in the above
-           diagram ``C.g`` is indeed in *l'*.)
-
-           Case 3: ``D`` is still the most derived non-strict superclass
-           of ``C`` which appears in *m*, but this time ``D`` is not a
-           strict subclass of ``C'``.  The situation is as follows::
+       (In this diagram, as far as they correspond to real methods,
+       potential bound method objects ``D1.f1``, ``D2.f2`` and ``D3.f3``
+       all belong to both *l* and *l'*.  The family *l'* additionally
+       contains ``C.g``, ``E2.h2`` and ``E3.h3``.  Both families *l* and
+       *l'* also contain one extra item coming from the second part of
+       their construction rule.)
+
+       Case 2: ``D`` is instead the most derived non-strict superclass of
+       ``C`` which appears in *m*; assume that ``D`` is still a strict
+       subclass of ``C'``.  In this case, ``D.f`` belongs to *l'* as
+       previously.  (For example, if there is a ``C.g`` in *m*, then ``D
+       = C`` and as seen in the above diagram ``C.g`` is indeed in
+       *l'*.)
+
+       Case 3: ``D`` is still the most derived non-strict superclass of
+       ``C`` which appears in *m*, but this time ``D`` is not a strict
+       subclass of ``C'``.  The situation is as follows::
 
                                D
                                |
@@ -1523,17 +1519,17 @@
                                |
                                C
 
-           where extra intermediate classes could be present too.  To be
-           able to conclude that ``D.f`` is in *l'*, we now have to
-           prove that ``D`` is also the most derived non-strict
-           superclass of ``C'`` that appears in *m'*.  Ab absurdo,
-           assume that this is not the case.  Then there is a class
-           ``B``, strict superclass of ``C'`` but strict subclass of
-           ``D``, such that *m'* contains an element ``B.g``.  But *m*
-           contains ``D.f`` and it is downwards-closed, so it must also
-           contain ``B.g``.  This contradicts the original hypothesis on
-           ``D``: this ``B`` would be another more derived superclass of
-           ``C`` that appears in *m*.  QED.
+       where extra intermediate classes could be present too.  To be
+       able to conclude that ``D.f`` is in *l'*, we now have to prove
+       that ``D`` is also the most derived non-strict superclass of
+       ``C'`` that appears in *m'*.  Ab absurdo, assume that this is not
+       the case.  Then there is a class ``B``, strict superclass of
+       ``C'`` but strict subclass of ``D``, such that *m'* contains an
+       element ``B.g``.  But *m* contains ``D.f`` and it is
+       downwards-closed, so it must also contain ``B.g``.  This
+       contradicts the original hypothesis on ``D``: this ``B`` would be
+       another more derived superclass of ``C`` that appears in *m*.
+       QED.
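
As an aside, the filtering step this proof reasons about can be sketched in a
few lines of Python.  The toy class hierarchy, the ``MRO`` table and this
``lookup_filter`` are illustrative models only, not the annotator's actual
code; they merely exhibit the containment ``l <= l'`` when ``C`` is
generalised to a superclass ``C'``:

```python
# Toy model of lookup_filter from the rule for getattr.  Classes form a
# tiny hierarchy (MRO maps each class to its non-strict superclasses,
# most derived first); potential bound method objects are (class, name)
# pairs.  Illustrative only, not the real annotator.

MRO = {'A': ['A'], 'B': ['B', 'A'], 'C': ['C', 'B', 'A']}

def lookup_filter(m, C):
    """Keep the methods of strict subclasses of C, plus the single
    method of the most derived non-strict superclass of C found in m."""
    below = {(D, f) for (D, f) in m if D != C and C in MRO[D]}
    above = sorted(((D, f) for (D, f) in m if D in MRO[C]),
                   key=lambda df: MRO[C].index(df[0]))
    return below | set(above[:1])

# m is regular and downwards-closed: B overrides the method defined in A.
m = {('A', 'f'), ('B', 'g')}
l_C = lookup_filter(m, 'C')      # most derived superclass of C in m is B
l_A = lookup_filter(m, 'A')      # generalising C to its superclass A
assert l_C == {('B', 'g')}
assert l_C <= l_A                # generalisation: l is contained in l'
```
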
 
 
 Termination
@@ -1610,128 +1606,125 @@
 
 1. The final state *(b_n, E_n)* is sound.
 
-       The proof is based on the fact that the "effect" of any rule only
-       depends on the annotation of its input and auxiliary variables.
-       This "effect" is to merge some bindings and/or add some
-       identifications; it can formalised by saying that ``r( (b,E) ) =
-       (b,E) union (bf,Ef)`` for a certain *(bf,Ef)* that contains only
-       the new bindings and identifications.
-
-       More precisely, let *r* be a rule.  Let *V_r* be the set of input
-       and auxiliary variables of *r*, i.e.::
-
-           V_r = { v | r in Rules_v }
-
-       Let *(b,E)* be a state.  Then there exists a state *(bf,Ef)*
-       representing the "effect" of the rule on *(b,E)* as follows:
-
-           r( (b,E) )   = (b,E)   union (bf,Ef)
-
-       and the same *(bf,Ef)* works for any *(b',E')* which is equal to
-       *(b,E)* on *V_r*::
-
-           r( (b',E') ) = (b',E') union (bf,Ef)
-
-       This is easily verified on a case-by-case basis for each kind of
-       rule presented above.  The details are left to the reader.
-
-       To show the proposition, we proceed by induction on *i* to show
-       that each of the meta-states *(S_i, b_i, E_i)* has the following
-       property: for each rule *r* which is not in *S_i* we have ``r(
-       (b_i, E_i) ) = (b_i, E_i)``.  The result will follow from this
-       claim because the final *S_n* is empty.
-
-       The claim is trivially true for ``i=0``.  Let us assume that it
-       holds for some ``i<n`` and prove it for ``i+1``: let *r* be a
-       rule not in *S_i+1*.  By definition of *S_i+1*, the
-       input/auxiliary variables of *r* have the same bindings at the
-       steps ``i`` and ``i+1``, i.e. *(b_i, E_i)* is equal to *(b_i+1,
-       E_i+1)* on *V_r*.  Let *(bf,Ef)* be the effect of *r* on *(b_i,
-       E_i)* as above.  We have::
-
-           r( (b_i,   E_i) )   = (b_i,   E_i)   union (bf,Ef)
-           r( (b_i+1, E_i+1) ) = (b_i+1, E_i+1) union (bf,Ef)
-
-       Case 1: *r* is in *S_i*.  As it is not in *S_i+1* it must be
-       precisely *r_i*.  In this case ``r( (b_i, E_i) ) = (b_i+1,
-       E_i+1)``, so that::
-
-           r( (b_i+1, E_i+1) ) = (b_i+1, E_i+1) union (bf,Ef)
-                               = r( (b_i, E_i) ) union (bf,Ef)
-                               = (b_i, E_i) union (bf,Ef) union (bf,Ef)
-                               = (b_i, E_i) union (bf,Ef)
-                               = r( (b_i, E_i) )
-                               = (b_i+1, E_i+1).
-
-       which concludes the induction step.
-
-       Case 2: *r* is not in *S_i*.  By induction hypothesis ``(b_i,
-       E_i) = r( (b_i, E_i) )``.
-
-           (b_i, E_i) = (b_i, E_i) union (bf,Ef)
-           (b_i, E_i) >= (bf,Ef)
-           (b_i+1, E_i+1) >= (b_i, E_i) >= (bf,Ef)
-           (b_i+1, E_i+1) = (b_i+1, E_i+1) union (bf,Ef)
-           (b_i+1, E_i+1) = r( (b_i+1, E_i+1) ).
+   The proof is based on the fact that the "effect" of any rule only
+   depends on the annotation of its input and auxiliary variables.  This
+   "effect" is to merge some bindings and/or add some identifications;
+   it can be formalised by saying that ``r( (b,E) ) = (b,E) union (bf,Ef)``
+   for a certain *(bf,Ef)* that contains only the new bindings and
+   identifications.
+
+   More precisely, let *r* be a rule.  Let *V_r* be the set of input and
+   auxiliary variables of *r*, i.e.::
+
+       V_r = { v | r in Rules_v }
+
+   Let *(b,E)* be a state.  Then there exists a state *(bf,Ef)*
+   representing the "effect" of the rule on *(b,E)* as follows::
+
+       r( (b,E) )   = (b,E)   union (bf,Ef)
+
+   and the same *(bf,Ef)* works for any *(b',E')* which is equal to
+   *(b,E)* on *V_r*::
+
+       r( (b',E') ) = (b',E') union (bf,Ef)
+
+   This is easily verified on a case-by-case basis for each kind of rule
+   presented above.  The details are left to the reader.
+
+   To show the proposition, we proceed by induction on *i* to show that
+   each of the meta-states *(S_i, b_i, E_i)* has the following property:
+   for each rule *r* which is not in *S_i* we have ``r( (b_i, E_i) ) =
+   (b_i, E_i)``.  The result will follow from this claim because the
+   final *S_n* is empty.
+
+   The claim is trivially true for ``i=0``.  Let us assume that it holds
+   for some ``i<n`` and prove it for ``i+1``: let *r* be a rule not in
+   *S_i+1*.  By definition of *S_i+1*, the input/auxiliary variables of
+   *r* have the same bindings at the steps ``i`` and ``i+1``,
+   i.e. *(b_i, E_i)* is equal to *(b_i+1, E_i+1)* on *V_r*.  Let
+   *(bf,Ef)* be the effect of *r* on *(b_i, E_i)* as above.  We have::
+
+       r( (b_i,   E_i) )   = (b_i,   E_i)   union (bf,Ef)
+       r( (b_i+1, E_i+1) ) = (b_i+1, E_i+1) union (bf,Ef)
+
+   Case 1: *r* is in *S_i*.  As it is not in *S_i+1* it must be
+   precisely *r_i*.  In this case ``r( (b_i, E_i) ) = (b_i+1, E_i+1)``,
+   so that::
+
+       r( (b_i+1, E_i+1) ) = (b_i+1, E_i+1) union (bf,Ef)
+                           = r( (b_i, E_i) ) union (bf,Ef)
+                           = (b_i, E_i) union (bf,Ef) union (bf,Ef)
+                           = (b_i, E_i) union (bf,Ef)
+                           = r( (b_i, E_i) )
+                           = (b_i+1, E_i+1).
+
+   which concludes the induction step.
+
+   Case 2: *r* is not in *S_i*.  By induction hypothesis ``(b_i, E_i) =
+   r( (b_i, E_i) )``, from which it follows that::
+
+       (b_i, E_i) = (b_i, E_i) union (bf,Ef)
+       (b_i, E_i) >= (bf,Ef)
+       (b_i+1, E_i+1) >= (b_i, E_i) >= (bf,Ef)
+       (b_i+1, E_i+1) = (b_i+1, E_i+1) union (bf,Ef)
+       (b_i+1, E_i+1) = r( (b_i+1, E_i+1) ).
 
-       This concludes the proof.
+   This concludes the proof.
 
 2. *(b_n, E_n)* is the minimum of all sound non-degenerated states.
 
-       Let *(bs,Es)* be any sound non-degenerated state such that
-       ``(b_0, E_0) <= (bs,Es)``.  We will show by induction that for
-       each ``i<=n`` we have ``(b_i, E_i) <= (bs,Es)``.  The conclusion
-       follows from the case ``i=n``.
-
-       The claim holds for ``i=0`` by hypothesis.  Let us assume that it
-       is true for some ``i<n`` and prove it for ``i+1``.  We need to
-       consider separate cases for each of the kind of rules that *r_i*
-       can be.  We only show a few representative examples and leave the
-       complete proof to the reader.  These example show why it is a key
-       point that *(bs,Es)* is not degenerated: most rules no longer
-       apply if an annotation degenerates to ``Top``, but continue to
-       apply if it is generalised to anything below ``Top``.  The
-       general idea is to turn each rule into a step of the complete
-       proof, showing that if a sound state is at least as general as
-       ``(b_i, E_i)`` then it must be at least as general as ``(b_i+1,
-       E_i+1)``.
-
-       Example 1.  The rule *r_i* is::
-
-              z=add(x,y), b(x)=Int, Bool<=b(y)<=Int
-           ------------------------------------------------------
-                    b' = b with (z->Int)
-
-       In this example, assuming that the guards are satisfied, *b_i+1*
-       is *b_i* with ``z->Int``.  We must show that ``bs(z) >= Int``.
-       We know from ``(b_i, E_i) <= (bs,Es)`` that ``bs(x) >= Int`` and
-       ``bs(y) >= Bool``.  As *bs* is not degenerated, we have more
-       precisely ``bs(x) = Int, Bool <= bs(y) <= Int``.  Moreover, by
-       definition ``r( (bs,Es) ) = (bs,Es)``.  We conclude that ``bs(z)
-       = Int``.
-
-       Example 2.  The rule *r_i* is::
-
-              y = phi(x)
-           ----------------------------------------
-                    merge b(x) => y
-
-       We must show that ``bs(y) >= b_i+1(y)``.  We need to subdivide
-       this example in two cases: either ``b(x)`` and ``b(y)`` are both
-       ``List`` annotations or not.
-
-       If they are not, we know that ``b_i+1(y) = b_i(y) \/ b_i(x)`` and
-       ``bs(y) >= b_i(y)``, so that we must show that ``bs(y) >=
-       b_i(x)``.  We know that ``r( (bs,Es) ) = (bs,Es)`` so that
-       ``bs(y) = bs(x) \/ bs(y)``, i.e. ``bs(y) >= bs(x)``.  We conclude
-       by noting that ``bs(x) >= b_i(x)``.
-
-       On the other hand, if ``b(x) = List(v)`` and ``b(y) = List(w)``,
-       then *b_i+1* is *b_i* but *E_i+1* is *E_i* with ``v~w``.  We must
-       show that ``(v~w) in Es``.  As *(bs,Es)* is at least as general
-       as *(b_i, E_i)* but not degenerated we know that ``bs(x) =
-       List(v)`` and ``bs(y) = List(w)`` as well.  Again, because ``r(
-       (bs,Es) ) = (bs,Es)`` we conclude that ``(v~w) in Es``.
+   Let *(bs,Es)* be any sound non-degenerated state such that ``(b_0,
+   E_0) <= (bs,Es)``.  We will show by induction that for each ``i<=n``
+   we have ``(b_i, E_i) <= (bs,Es)``.  The conclusion follows from the
+   case ``i=n``.
+
+   The claim holds for ``i=0`` by hypothesis.  Let us assume that it is
+   true for some ``i<n`` and prove it for ``i+1``.  We need to consider
+   separate cases for each of the kind of rules that *r_i* can be.  We
+   only show a few representative examples and leave the complete proof
+   to the reader.  These example show why it is a key point that
+   *(bs,Es)* is not degenerated: most rules no longer apply if an
+   annotation degenerates to ``Top``, but continue to apply if it is
+   generalised to anything below ``Top``.  The general idea is to turn
+   each rule into a step of the complete proof, showing that if a sound
+   state is at least as general as ``(b_i, E_i)`` then it must be at
+   least as general as ``(b_i+1, E_i+1)``.
+
+   Example 1.  The rule *r_i* is::
+
+          z=add(x,y), b(x)=Int, Bool<=b(y)<=Int
+       ------------------------------------------------------
+                b' = b with (z->Int)
+
+   In this example, assuming that the guards are satisfied, *b_i+1* is
+   *b_i* with ``z->Int``.  We must show that ``bs(z) >= Int``.  We know
+   from ``(b_i, E_i) <= (bs,Es)`` that ``bs(x) >= Int`` and ``bs(y) >=
+   Bool``.  As *bs* is not degenerated, we have more precisely ``bs(x) =
+   Int, Bool <= bs(y) <= Int``.  Moreover, by definition ``r( (bs,Es) )
+   = (bs,Es)``.  We conclude that ``bs(z) = Int``.
+
+   Example 2.  The rule *r_i* is::
+
+          y = phi(x)
+       ----------------------------------------
+                merge b(x) => y
+
+   We must show that ``bs(y) >= b_i+1(y)``.  We need to subdivide this
+   example into two cases: either ``b(x)`` and ``b(y)`` are both ``List``
+   annotations or not.
+
+   If they are not, we know that ``b_i+1(y) = b_i(y) \/ b_i(x)`` and
+   ``bs(y) >= b_i(y)``, so that we must show that ``bs(y) >= b_i(x)``.
+   We know that ``r( (bs,Es) ) = (bs,Es)`` so that ``bs(y) = bs(x) \/
+   bs(y)``, i.e. ``bs(y) >= bs(x)``.  We conclude by noting that ``bs(x)
+   >= b_i(x)``.
+
+   On the other hand, if ``b(x) = List(v)`` and ``b(y) = List(w)``, then
+   *b_i+1* is *b_i* but *E_i+1* is *E_i* with ``v~w``.  We must show
+   that ``(v~w) in Es``.  As *(bs,Es)* is at least as general as *(b_i,
+   E_i)* but not degenerated we know that ``bs(x) = List(v)`` and
+   ``bs(y) = List(w)`` as well.  Again, because ``r( (bs,Es) ) =
+   (bs,Es)`` we conclude that ``(v~w) in Es``.
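
The overall fixpoint search that both proofs reason about can be sketched as
a generic worklist loop.  ``apply_rule`` and ``depends_on`` are illustrative
stand-ins for the real annotator's machinery; the state here is simply a
dict binding variables to set-valued annotations, with the lattice union
``\/`` modelled as set union:

```python
# Toy sketch of the annotator's fixpoint search: keep applying scheduled
# rules until the scheduled set is empty; the final state is then a
# fixpoint, i.e. r(state) == state for every rule r.  Illustrative only.

def fixpoint(state, rules, apply_rule, depends_on):
    """Run rules until no scheduled rule changes the state."""
    scheduled = set(rules)
    while scheduled:
        r = scheduled.pop()
        new_state = apply_rule(r, state)
        changed = {v for v in new_state if new_state[v] != state[v]}
        # reschedule every rule whose variables V_r changed
        scheduled |= {s for s in rules if depends_on(s) & changed}
        state = new_state
    return state

# A rule (x, y) merges (set union, the lattice \/) x's binding into y's.
apply_rule = lambda r, b: {**b, r[1]: b[r[1]] | b[r[0]]}
depends_on = lambda r: set(r)            # V_r: both variables of the rule

b0 = {'x': {'Int'}, 'y': {'Bool'}, 'z': set()}
bn = fixpoint(b0, [('x', 'y'), ('y', 'z')], apply_rule, depends_on)
assert bn['z'] == {'Int', 'Bool'}        # z generalised to Int \/ Bool
```

Because each rule only generalises bindings (monotonicity) and the lattice
has no infinite ascending chains, the loop terminates; rerunning a rule
whose inputs did not change leaves the state untouched, which is exactly
the fixpoint property used in point 1 above.
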
 
 
 Complexity
@@ -1924,9 +1917,9 @@
 will only skim it and refer to the reference documentation when
 appropriate.
 
-The main difficulty with turning annotated flow graphs into C code is
-that the RPython definition is still quite large, in the sense that an
-important fraction of the built-in data structures of Python, and the
+The main difficulty with turning annotated flow graphs into, say, C code
+is that the RPython definition is still quite large, in the sense that
+an important fraction of the built-in data structures of Python, and the
 methods on them, are supported -- sometimes requiring highly non-trivial
 implementations, in a polymorphic way.  Various approaches have been
 tried out, including writing a lot of template C code that gets filled
@@ -1934,12 +1927,18 @@
 
 The approach eventually selected is different.  We proceed in two steps:
 
-* the annotated graphs are rewritten so that each RPython-level
+* the RTyper_ rewrites the annotated graphs so that each RPython-level
   operation is replaced by one or a few low-level operations (or a call
   to a helper for more complex operations);
 
-* the low-level flow graphs thus obtained are easy to handle in a
-  back-end -- we can currently turn them into either C or LLVM_ code.
+* a back-end generates code for the target language and environment
+  based on the low-level flow graphs thus obtained.
+
+We can currently generate C-like low-level flow graphs and turn them
+into either C or LLVM_ code; or experimentally, PowerPC machine code or
+JavaScript_.  If we go through slightly less low-level flow graphs
+instead, we can also interface with an experimental back-end generating
+Squeak_ and in the future Java and/or .NET.
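
The two-step structure can be sketched abstractly as a lowering table plus a
rendering pass.  The operation names and the ``LOWERING`` table below are
invented for this example and do not correspond to the RTyper's real
low-level operation set:

```python
# Illustrative sketch of the two-step translation: a typing pass maps
# each annotated RPython-level operation to low-level operations, and a
# separate back-end renders those as C-like source text.  All names are
# invented for illustration.

LOWERING = {
    ('add', 'Int'): ['int_add'],
    ('add', 'Str'): ['ll_strconcat'],   # complex case: a helper call
}

def rtype(graph, annotations):
    """Step 1: rewrite the graph in terms of low-level operations."""
    return [ll for op, v in graph
               for ll in LOWERING[(op, annotations[v])]]

def genc(llgraph):
    """Step 2: a back-end turns low-level operations into target code."""
    return '\n'.join('%s();' % ll for ll in llgraph)

ll = rtype([('add', 'v0'), ('add', 'v1')], {'v0': 'Int', 'v1': 'Str'})
assert ll == ['int_add', 'll_strconcat']
assert genc(ll) == 'int_add();\nll_strconcat();'
```

The point of the split is that the same low-level graphs can feed several
back-ends (here, only a trivial C-like printer).
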
 
 
 RTyper
@@ -1952,12 +1951,28 @@
 annotation state is sound.  It is described in more details in the
 `RPython typer`_ reference.
 
+
+Low-level flow graphs
+*********************
+
+The purpose of the RTyper is to produce control flow graphs that contain
+a different set of variables and operations.  At this level, the
+variables are typed, and the operations between them are constrained to
+be well-typed.  The exact set of types and operations depends on the
+target environment's language; currently, we have defined two such sets:
+
+* lltype XXX
+
+* ootype XXX
+
+
+
 A noteworthy point of the RTyper is that for each operation that has no
 obvious C-level equivalent, we write a helper function in Python; each
 usage of the operation in the source (high-level) annotated flow graph
 is replaced by a call to this function.  The function in question is
 implemented in terms of "simpler" operations.  The function is then fed
-back into the flow object space and the annotator and the RTyper itself
+back into the flow object space and the annotator and the RTyper itself,
 so that it gets turned into another low-level control flow graph.  At
 this point, the annotator runs with a different set of default
 specialisations: it allows several copies of the helper functions to be
@@ -1965,8 +1980,8 @@
 do this by default at this level because of the intended purpose of
 these helpers: they are usually methods of a polymorphic container.
 
-This approach shows that our annotator is versatile enough to accommodate
-different kinds of sub-languages at different levels: it is
+This approach shows that our annotator is versatile enough to
+accommodate different kinds of sub-languages at different levels: it is
 straightforward to adapt it for the so-called "low-level Python"
 language in which we constrain ourselves to write the low-level
 operation helpers.  Automatic specialisation was a key point here; the
@@ -1974,6 +1989,8 @@
 declarations.
 
 
+
+
 Generating C code
 ~~~~~~~~~~~~~~~~~
 
@@ -1983,8 +2000,8 @@
 like C source code.
 
 This step is theoretically straightforward, if messy in practice for
-various reasons including the limitations, constrains and irregularities
-of the C language.
+various reasons including the limitations, constraints and
+irregularities of the C language.
 
 The `GenC back-end`_ works again in two steps:
 
@@ -2037,5 +2054,7 @@
 .. _LLVM: http://llvm.cs.uiuc.edu/
 .. _`RPython typer`: translation.html#rpython-typer
 .. _`GenC back-end`: translation.html#genc
+.. _JavaScript: http://www.ecma-international.org/publications/standards/Ecma-262.htm
+.. _Squeak: http://www.squeak.org/
 
 .. include:: _ref.txt


