[pypy-svn] r12794 - pypy/dist/pypy/documentation

arigo at codespeak.net
Wed May 25 13:53:11 CEST 2005


Author: arigo
Date: Wed May 25 13:53:11 2005
New Revision: 12794

Modified:
   pypy/dist/pypy/documentation/_ref.txt
   pypy/dist/pypy/documentation/translation.txt
Log:
- Updated translation.txt to the changes in lltype.py.
- Give a guided tour of lltype.



Modified: pypy/dist/pypy/documentation/_ref.txt
==============================================================================
--- pypy/dist/pypy/documentation/_ref.txt	(original)
+++ pypy/dist/pypy/documentation/_ref.txt	Wed May 25 13:53:11 2005
@@ -1,8 +1,8 @@
-.. _`issue40`: http://codespeak.net/issue/pypy-dev/issue40
 .. _`demo/`: http://codespeak.net/svn/pypy/dist/demo
 .. _`lib-python/`: http://codespeak.net/svn/pypy/dist/lib-python
 .. _`pypy/annotation`:
 .. _`annotation/`: http://codespeak.net/svn/pypy/dist/pypy/annotation
+.. _`annotation/binaryop.py`: http://codespeak.net/svn/pypy/dist/pypy/annotation/binaryop.py
 .. _`documentation/`: http://codespeak.net/svn/pypy/dist/pypy/documentation
 .. _`documentation/revreport/`: http://codespeak.net/svn/pypy/dist/pypy/documentation/revreport
 .. _`documentation/website/`: http://codespeak.net/svn/pypy/dist/pypy/documentation/website
@@ -26,9 +26,11 @@
 .. _`pypy/objspace/std`:
 .. _`objspace/std/`: http://codespeak.net/svn/pypy/dist/pypy/objspace/std
 .. _`objspace/thunk.py`: http://codespeak.net/svn/pypy/dist/pypy/objspace/thunk.py
-.. _`objspace/trace.py`:
-.. _`pypy/objspace/trace.py`: http://codespeak.net/svn/pypy/dist/pypy/objspace/trace.py
+.. _`pypy/objspace/trace.py`:
+.. _`objspace/trace.py`: http://codespeak.net/svn/pypy/dist/pypy/objspace/trace.py
 .. _`rpython/`: http://codespeak.net/svn/pypy/dist/pypy/rpython
+.. _`rpython/lltype.py`: http://codespeak.net/svn/pypy/dist/pypy/rpython/lltype.py
+.. _`rpython/rlist.py`: http://codespeak.net/svn/pypy/dist/pypy/rpython/rlist.py
 .. _`pypy/test_all.py`: http://codespeak.net/svn/pypy/dist/pypy/test_all.py
 .. _`tool/`: http://codespeak.net/svn/pypy/dist/pypy/tool
 .. _`tool/pytest/`: http://codespeak.net/svn/pypy/dist/pypy/tool/pytest

Modified: pypy/dist/pypy/documentation/translation.txt
==============================================================================
--- pypy/dist/pypy/documentation/translation.txt	(original)
+++ pypy/dist/pypy/documentation/translation.txt	Wed May 25 13:53:11 2005
@@ -465,35 +465,85 @@
 
     v3 = int_add(v1, v2)
 
-where -- in C notation -- all three variables v1, v2 and v3 are typed ``int``.  This is done by attaching an attribute ``concretetype`` to v1, v2 and v3 (which might be instances of Variable or possibly Constant).  In our model, this ``concretetype`` is ``pypy.rpython.lltypes.Signed``.  Of course, the purpose of replacing the operation called ``add`` with ``int_add`` is that code generators no longer have to worry about what kind of addition (or concatenation maybe?) it means.
+where -- in C notation -- all three variables v1, v2 and v3 are typed ``int``.  This is done by attaching an attribute ``concretetype`` to v1, v2 and v3 (which might be instances of Variable or possibly Constant).  In our model, this ``concretetype`` is ``pypy.rpython.lltype.Signed``.  Of course, the purpose of replacing the operation called ``add`` with ``int_add`` is that code generators no longer have to worry about what kind of addition (or concatenation maybe?) it means.
 
 
 The process in more detail
 ---------------------------
 
-The RPython Typer does the following transformations for each block of all the annotated flow graphs (each block is processed independently, by assuming that the already-computed annotations are globally correct):
+The RPython Typer has a structure similar to that of the Annotator_: both consider each block of the flow graphs in turn, and perform some analysis on each operation.  In both cases the analysis of an operation depends on the annotations of its input arguments.  This is reflected in the usage of the same ``__extend__`` syntax in the source files (compare e.g. `annotation/binaryop.py`_ and `rpython/rlist.py`_).
 
-* We first replace all Variables that have constant annotations with real Constants in the flow graph.
+The analogy stops here, though: while it runs, the Annotator is in the middle of computing the annotations, so it might need to reflow and generalize until a fixpoint is reached.  The Typer, by contrast, works on the final annotations that the Annotator computed, without changing them, assuming that they are globally consistent.  There is no need to reflow: the Typer considers each block only once.  And unlike the Annotator, the Typer completely modifies the flow graph, by replacing each operation with some low-level operations.
 
-* For the time being, we assume that each SomeXxx annotation has a canonical low-level representation.  For example, all variables annotated with SomeInteger() will correspond to the ``Signed`` low-level type.  Each input argument of the block are tagged with the canonical low-level representation (this is done by attaching an attribute ``concretetype`` to each Variable).
-
-* Each operation, with its argument's annotation, is looked up in a table which specifies with which low-level operation(s) it should be substituted.  If needed, the arguments are first converted (with extra operations) from their current ``concretetype`` to the required low-level types.  For constant arguments, we just attach the ``concretetype`` to the Constant instance; as for Variables, this tells the code generator of which type the constant really is.  Finally, the substitution rules specify the ``concretetype`` of the result.  It is attached to the result Variable, and will be used further down the block to detect when conversions are needed.
-
-* When a block has been transformed in this way, all the links are considered; if the concrete types of the Variables that exit do not match the canonical low-level types expected by the target block, conversions are inserted -- they are put in a new block inserted along the link, as they are of no concern to the other exit links.
-
-This may look like flowing, similar to what the annotator does, but it is limited to a single block; for global coherency it trusts the more involved fixpoint-based algorithm run by the annotator.
+The main assumption of the RTyper, for the time being, is that each SomeXxx annotation has a canonical low-level representation.  For example, all variables annotated with SomeInteger() will correspond to the ``Signed`` low-level type.  The RTyper computes the canonical low-level type for each Variable based on its annotation, and stores it in the attribute ``concretetype``.  It also computes a ``concretetype`` for Constants, to match the way they are used in the low-level operations (for example, ``int_add(x, 1)`` requires a ``Constant(1)`` with ``concretetype=Signed``, but an untyped ``add(x, 1)`` works with a ``Constant(1)`` that must actually be a PyObject at run-time).
 
 
 Low-Level Types
 ---------------
 
-For now, the RPython Typer uses a standard low-level model which we believe can correspond rather directly to various target languages from C to LLVM to Java.  This model is implemented in the first part of `lltypes.py`_.
+The RPython Typer uses a standard low-level model which we believe can correspond rather directly to various target languages from C to LLVM to Java.  This model is implemented in the first part of `rpython/lltype.py`_.
 
-The second part of `lltypes.py`_ is a runnable implementation of these types, for testing purposes.  It allows us to write and test plain Python code using a malloc() function to obtain and manipulate structures and arrays.  This is useful for example to implement RPython types like 'list' with its operations and methods.
+The second part of `rpython/lltype.py`_ is a runnable implementation of these types, for testing purposes.  It allows us to write and test plain Python code using a malloc() function to obtain and manipulate structures and arrays.  This is useful for example to implement RPython types like 'list' with its operations and methods.
 
 The basic assumption is that Variables (i.e. local variables and function arguments and return value) all contain "simple" values: basically, just integers or pointers.  All the "container" data structures (struct and array) are allocated in the heap, and they are always manipulated via pointers.  (There is no equivalent to the C notion of local variable of a ``struct`` type.)
 
-.. _`lltypes.py`: http://codespeak.net/svn/pypy/dist/pypy/rpython/lltypes.py
+Here is a quick tour::
+
+    >>> from pypy.rpython.lltype import *
+
+(The module is called ``lltypes`` in PyPy release 0.6.)
+
+Here are a few primitive low-level types, and the typeOf() function to figure them out::
+
+    >>> Signed
+    <Signed>
+    >>> typeOf(5)
+    <Signed>
+    >>> typeOf(r_uint(12))
+    <Unsigned>
+    >>> typeOf('x')
+    <Char>
+
+Let's say that we want to build a type "point", which is a structure with two integer fields "x" and "y"::
+
+    >>> POINT = GcStruct('point', ('x', Signed), ('y', Signed))
+    >>> POINT
+    <GcStruct point { x: Signed, y: Signed }>
+
+The structure is a ``GcStruct``, which means a structure that can be allocated in the heap and eventually freed by some garbage collector.  (For platforms where we use reference counting, think about ``GcStruct`` as a struct with an additional reference counter field.) (NB. in PyPy release 0.6, GcStruct and GcArray don't exist; you must use Struct and Array instead.)
+
+Giving a name ('point') to the GcStruct is only for clarity: it is used in the representation.  Structures are allocated with ``malloc()``::
+
+    >>> p = malloc(POINT)
+    >>> p
+    <_ptrtype to struct point { x=0, y=0 }>
+    >>> p.x = 5
+    >>> p.x
+    5
+    >>> p
+    <_ptrtype to struct point { x=5, y=0 }>
+
+``malloc()`` allocates a structure from the heap, initializes it to 0 (currently), and returns a pointer to it.  The point of all this is to work with a very limited, easily controllable set of types, and define implementations of types like list in this elementary world.  The ``malloc()`` function is a kind of placeholder, which must eventually be provided by the code generator for the target platform; but as we have just seen, its Python implementation in `rpython/lltype.py`_ works too, which is primarily useful for testing, interactive exploration, etc.
+
+The argument to ``malloc()`` is the structure type directly, but it returns a pointer to the structure, as ``typeOf()`` tells you::
+
+    >>> typeOf(p)
+    <ptr(gc) to GcStruct point { x: Signed, y: Signed }>
+
+For the purpose of creating structures with pointers to other structures, we can declare pointer types explicitly::
+
+    >>> typeOf(p) == GcPtr(POINT)
+    True
+    >>> BIZARRE = GcStruct('bizarre', ('p1', GcPtr(POINT)), ('p2', GcPtr(POINT)))
+    >>> b = malloc(BIZARRE)
+    >>> b.p1
+    <_ptrtype to None>
+    >>> b.p1 = b.p2 = p
+    >>> b.p1.y = 42
+    >>> b.p2.y
+    42
+
+The world of low-level types is more complicated than integers and GcStructs, though.  The next pages are a reference guide.
 
 
 Primitive Types
@@ -519,9 +569,10 @@
 Structure Types
 +++++++++++++++
 
-Structure types are built as instances of ``pypy.rpython.lltypes.Struct``::
+Structure types are built as instances of ``pypy.rpython.lltype.Struct``::
 
     MyStructType = Struct('somename',  ('field1', Type1), ('field2', Type2)...)
+    MyStructType = GcStruct('somename',  ('field1', Type1), ('field2', Type2)...)
 
 This declares a structure (or a Pascal ``record``) containing the specified named fields with the given types.  The field names cannot start with an underscore.  As noted above, you cannot directly manipulate structure objects, but only pointers to structures living in the heap.
 
@@ -529,18 +580,23 @@
 
 A structure can also contain an inlined array (see below), but only as its last field: in this case it is a "variable-sized" structure, whose memory layout starts with the non-variable fields and ends with a variable number of array items.  This number is determined when a structure is allocated in the heap.  Variable-sized structures cannot be inlined in other structures.
 
+GcStructs have a platform-specific GC header (e.g. a reference counter); only these can be malloc()ed.  Structs have no header, and are suitable for being embedded ("inlined") inside other structures.  As an exception, a GcStruct can be embedded as the first field of a GcStruct: the parent structure uses the same GC header as the substructure.
+
 
 Array Types
 +++++++++++
 
-An array type is built as an instance of ``pypy.rpython.lltypes.Array``::
+An array type is built as an instance of ``pypy.rpython.lltype.Array``::
 
     MyArrayType = Array(('field1', Type1), ('field2', Type2)...)
+    MyArrayType = GcArray(('field1', Type1), ('field2', Type2)...)
 
 The items of an array are always structures; the arguments to Array() give the fields of these structures (it can of course be a single field).  The allowed field types follow the same rules as for Struct(), but this particular structure cannot be variable-sized.
 
 For now, each array stores its length explicitly in a header.  An array can never be resized: it occupies a fixed number of bytes determined when it is allocated.
 
+GcArrays can be malloc()ed (the length must be specified when malloc() is called, and arrays cannot be resized).  Plain Arrays cannot be malloc()ed but can be used as the last field of a structure, to make a variable-sized structure.  The whole structure can then be malloc()ed, and the length of the array is specified at this time.
+
 
 Pointer Types
 +++++++++++++
@@ -550,7 +606,7 @@
    GcPtr(T, **flags)
    NonGcPtr(T, **flags)
 
-The so-called GC pointers are the ones that hold a reference to the object they point to.  Typically, the malloc() operation allocates and returns a GcPtr to a new structure or array.  In a refcounting implementation, malloc() would allocate enough space for a reference counter before the actual structure, and initialize it to 1.  Actually, GC pointers can only point to a malloc()ed structure or array.  Non-GC pointers are used when you know that a pointer doesn't hold a (counted) reference to an object, usually because the object has no reference counter at all: for example, functions don't have one; more importantly, inlined substructures don't have one either.  For them, care must be taken to ensure that the bigger structure of which they are part of isn't freed while the NonGcPtr to the substructure is still in use.
+The so-called GC pointers are the ones that hold a reference to the object they point to.  Only GcStruct, GcArray and PyObject can have GcPtrs to them.  Typically, the malloc() operation allocates and returns a GcPtr to a new structure or array.  In a refcounting implementation, malloc() would allocate enough space for a reference counter before the actual structure, and initialize it to 1.  Actually, GC pointers can only point to a malloc()ed structure or array.  Non-GC pointers are used when you know that a pointer doesn't hold a (counted) reference to an object, usually because the object has no reference counter at all: for example, functions don't have one; more importantly, inlined substructures don't have one either.  For them, care must be taken to ensure that the bigger structure of which they are part isn't freed while the NonGcPtr to the substructure is still in use.
 
 All pointer types can also have additional flags, whose meaning is unspecified at this level (apart from the flag ``gc=True`` which GcPtrs have and NonGcPtrs miss).  Types with different flags are incompatible, but the cast_flags() operation is provided to perform explicit casts.  The intention is for example to represent the high-level object "the method append() of this list" as the type ``GcPtr(ListType, method='append')`` -- i.e. a pointer to the list in question with an additional flag specifying that the pointer represents the method append() of that list, as opposed to the list itself.
 
@@ -578,7 +634,7 @@
 
 This list concatenation flow graph is then annotated as usual, with one difference: the annotator has to be taught about malloc() and the way the pointer thus obtained can be manipulated.  This generates a flow graph which is hopefully completely annotated with the SomePtr annotation.  Introduced just for this case, SomePtr maps directly to a low-level pointer type.  This is the only change needed to the Annotator to allow it to perform type inference of our very-low-level snippets of code.
 
-See for example http://codespeak.net/svn/pypy/dist/pypy/rpython/rlist.py.
+See for example `rpython/rlist.py`_.
 
 
 
@@ -943,3 +999,6 @@
 
 XXX to be added later
 
+
+
+.. include:: _ref.txt
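
The guided tour added to translation.txt above relies on ``pypy.rpython.lltype``, which needs a PyPy checkout of that era to run. For readers who only want to see the semantics the tour describes (zero-initialized ``malloc()``, structures reachable only through pointers, two pointers aliasing one structure), here is a minimal stand-alone sketch in plain Python. It is an assumption-laden toy, not the code of lltype.py: the names ``Primitive``, ``_Container`` and ``_Pointer`` are hypothetical, the reprs only loosely imitate the ones shown in the tour, and no type checking is done on field assignment.

```python
# Toy re-creation of the behaviour shown in the lltype tour.
# NOT the real pypy/rpython/lltype.py; all classes here are hypothetical.

class Primitive:
    """A primitive low-level type with a zero-like default value."""
    def __init__(self, name, default):
        self.name = name
        self.default = default

    def __repr__(self):
        return "<%s>" % self.name


Signed = Primitive("Signed", 0)


class GcStruct:
    """A named structure type: an ordered mapping of field name -> type."""
    def __init__(self, name, *fields):
        self.name = name
        self.fields = dict(fields)

    def __repr__(self):
        body = ", ".join("%s: %s" % (n, t.name) for n, t in self.fields.items())
        return "<GcStruct %s { %s }>" % (self.name, body)


class GcPtr:
    """A pointer type; two pointer types are equal if they share a target."""
    def __init__(self, target):
        self.target = target
        self.name = "GcPtr(%s)" % target.name
        self.default = None  # a freshly allocated pointer field is null

    def __eq__(self, other):
        return isinstance(other, GcPtr) and self.target is other.target

    def __hash__(self):
        return hash(id(self.target))


class _Container:
    """The heap object itself; user code never touches it directly."""
    def __init__(self, struct_type):
        self.type = struct_type
        self.values = {n: t.default for n, t in struct_type.fields.items()}


class _Pointer:
    """The only handle user code gets: field access goes through it."""
    def __init__(self, container):
        object.__setattr__(self, "_obj", container)

    def __getattr__(self, name):
        try:
            return self._obj.values[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name not in self._obj.values:
            raise AttributeError(name)
        self._obj.values[name] = value


def malloc(struct_type):
    """Allocate a zero-initialized structure and return a pointer to it."""
    return _Pointer(_Container(struct_type))


def typeOf(value):
    """Return the low-level type of a value (only ints and pointers here)."""
    if isinstance(value, _Pointer):
        return GcPtr(value._obj.type)
    if isinstance(value, int):
        return Signed
    raise TypeError("no low-level type for %r" % (value,))


POINT = GcStruct("point", ("x", Signed), ("y", Signed))
p = malloc(POINT)
p.x = 5

BIZARRE = GcStruct("bizarre", ("p1", GcPtr(POINT)), ("p2", GcPtr(POINT)))
b = malloc(BIZARRE)
b.p1 = b.p2 = p      # both fields now point to the same structure
b.p1.y = 42          # so a write through p1 is visible through p2
print(p.x, b.p2.y)
```

Keeping the container and the pointer as separate objects mirrors the rule stated in the documentation that Variables only hold simple values (integers or pointers) while structures live in the heap; it is a design choice of this sketch, not a claim about how lltype.py is organized internally.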


