[pypy-svn] r25236 - pypy/dist/pypy/doc

gromit at codespeak.net gromit at codespeak.net
Sun Apr 2 20:51:20 CEST 2006


Author: gromit
Date: Sun Apr  2 20:51:18 2006
New Revision: 25236

Modified:
   pypy/dist/pypy/doc/ctypes-integration.txt
Log:
CHG: Discussed the alternatives. Added a complete description of the GC related keep alive management.

Modified: pypy/dist/pypy/doc/ctypes-integration.txt
==============================================================================
--- pypy/dist/pypy/doc/ctypes-integration.txt	(original)
+++ pypy/dist/pypy/doc/ctypes-integration.txt	Sun Apr  2 20:51:18 2006
@@ -45,6 +45,8 @@
 
     -   All types are defined at module load time and
         thus need not be rpython.
+        This means ctypes' functions like `POINTER` and
+        `pointer` must not be annotated.
 
     -   Free unions are not supported, because it is unclear
         whether they can be properly annotated.
@@ -77,8 +79,56 @@
 
 .. [#] This restriction will be lifted in future ctypes versions. 
 
-Memory-Layout
--------------
+
+Memory-Layout as Proposed By Gerald Klix
+----------------------------------------
+
+Primitive Types
+~~~~~~~~~~~~~~~
+
+Ctypes' primitive types are mapped directly to the corresponding
+PyPy type.
+
+Structures
+~~~~~~~~~~
+Structures will have the following memory layout if they were allocated by ctypes::
+
+    Ptr( GcStruct( "CtypesGcStructure_<ClassName>"
+            ( "c_data"
+                    (Struct "C-Data_<ClassName>"
+                            *<Fielddefinitions>) ) ) )
+
+We will try hard not to expose the `c_data` member of the structure
+at the RPython level.
+
+Structures that result from dereferencing a pointer will have the following
+layout::
+
+    Ptr( GcStruct( "CtypesStructure_<ClassName>"
+        ( "c_data"
+                Ptr( Struct( "C-Data_<ClassName>"
+                             *<Fielddefinitions>) ) ) ) )
+
+Pointers
+~~~~~~~~
+Pointers pointing to structures allocated by ctypes will have the following memory layout::
+
+    Ptr( GcStruct( "CtypesGCPointer_<ClassName>"
+        "contents" Ptr( GcStruct( "CtypesGcStructure_<Name>" ... ) ) ) )
+
+
+Pointers returned from external functions have the following layout if they
+point to a structure::
+
+    Ptr( GcStruct( "CtypesPointer_<ClassName>"
+        "contents" Ptr( Struct( "CtypesStructure_<Name>" ... ) ) ) )
+
+It is not yet decided whether assigning a pointer's `contents` attribute from
+a GC pointer should be allowed. The other case will only become valid if we implement
+structures with mixed memory state.
+
+Memory-Layout as Proposed by Armin Rigo
+---------------------------------------
 
 In Ctypes, all instances are mutable boxes containing either some raw
 memory with a layout compatible to that of the equivalent C type, or a
@@ -193,3 +243,241 @@
 ~~~~~~
 Arrays behave like structures, but use an Array instead of a Struct in
 the "c_data" or "c_data_ref" declaration.
+
+
+Use Cases
+=========
+
+This section discusses various use cases, especially the use of pointers
+in the context of primitive types and structures.
+
+Pointers to Primitive Types
+---------------------------
+This section discusses the memory layout resulting from allocating
+pointers to primitives.
+
+The original direct mapping of ctypes primitive types to PyPy 
+primitive types does not work with PyPy's notion of pointers. In
+PyPy it is not possible to create a pointer to a primitive.
+Therefore Armin Rigo's proposal is a better alternative.
+
+If a pointer is passed to an external function, the
+garbage collection header must be stripped. This is easy if
+the pointer is passed directly to an external function.
+
+If the pointer is contained in another structure, a similar
+procedure applies. The `c_data`-part of the structure
+must be assigned to the `c_data_ref`-part of the pointer
+and the pointer to the garbage collection header must be
+assigned to the `keepalive`-field.
+
+Pointers to Structures
+----------------------
+This section discusses the memory layout resulting from allocating
+pointers to structures.
+
+The original structure layout, at least, will not work
+if pointers to structures are embedded in other structures.
+
+The following example illustrates the resulting structure according
+to the original layout, if we embed a pointer to a structure instance
+of the same type::
+ 
+    Ptr( GcStruct( "CtypesGcStructure_<ClassName>"
+            ( "c_data"
+                    (Struct "C-Data_<ClassName>"
+                            Ptr( GcStruct ( "CtypesGcStructure_<ClassName>"
+                                    "c_data" ... ) ) ) ) ) )
+
+It is clear that the `c_data` member does not point to a "C-compatible"
+structure, but to the garbage collection header. In the case of a linked list
+the whole list has to be traversed and copied. This operation is O(n),
+where n is the number of pointers to structures.
+
+The same structure according to Armin Rigo's proposal will look like this -
+at least if I understood it right::
+
+    Ptr( GcStruct( "CtypesBox_<TypeName>"
+            ( "c_data_ref"
+                    (Ptr (Struct "C_Data_<TypeName>"
+                            ( "value", Ptr(...) ) ) ) ),
+            ( "keepalive"
+                    (Ptr (GcStruct (Struct "C_Data_<TypeName>"
+                                    ( "value", Ptr(...) ) ) ) ) ) ) )
+                                
+In this case every pointer's `c_data_ref` is accompanied by
+a corresponding `keepalive` field.
+
+Structures containing Structures
+----------------------------------
+
+Memory Layout
+~~~~~~~~~~~~~
+The original layout would embed the inner GcStruct in the outer
+structure. Again, passing structures with this layout to
+external functions is not feasible without copying.
+
+Armin's approach will lead to the right memory layout if one
+assumes that the outer structure contains the `c_data` part
+of the inner structure.
+
+Dereferencing The Inner Structure Member
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The original solution has no problems in this case;
+it is enough to return the result of the getfield operation.
+
+Armin's layout must implement the same behaviour as CPython's
+ctypes. A new structure instance needs to be allocated, because
+the field of the outer structure containing the inner
+structure lacks a garbage collection header.
+
+Solution
+========
+The final solution is close to Armin's approach, but describes the
+management of `keepalive` pointers in great detail.
+
+Guiding Principle
+-----------------
+The guiding principle behind building "C-compatible" external structures
+is to separate the garbage collection object graph from the actual "C-compatible"
+layout. This has the big advantage that the "C-compatible" graph can
+be passed to external functions without change.
+
+Layouts by Type
+---------------
+The following sections describe the layout for various types.
+
+Primitive Types
+~~~~~~~~~~~~~~~
+The layout of atomic primitive types like integers or characters is as
+proposed by Armin Rigo::
+
+    Ptr( GcStruct( "CtypesBox_<TypeName>"
+            ( "c_data"
+                    (Struct "C_Data_<TypeName>"
+                            ( "value", Signed/Float/etc. ) ) ) ) )
+
+
+`c_char_p` will have the following layout::
+
+    Ptr( GcStruct( "CtypesBox_Char_p"
+            ( "length" Int ),
+            ( "c_data"
+                    (Struct "C_Data_Char_p"
+                            ( "value", Array( Char ) ) ) ) ) )
+
+`c_wchar_p` will have the following layout::
+
+    Ptr( GcStruct( "CtypesBox_Char_p"
+            ( "length" Int ),
+            ( "c_data"
+                    (Struct "C_Data_Char_p"
+                            ( "value", Array( Int ) ) ) ) ) )
+
+The use of `Int` here implies that it is 2 bytes long.
+ 
+Structures Without Pointers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Structures that do not contain pointers will have the layout
+proposed by Armin Rigo::
+
+    Ptr( GcStruct( "CtypesBox_<StructName>"
+            ( "keepalive" Ptr( ... ) ),
+            ( "c_data"
+                    Struct( "C_Data_<StructName>"
+                            *<Fielddefinitions>) ) ) )
+
+The `keepalive`-field is necessary in the case of structures
+resulting from dereferencing a structure contained in another
+structure, as explained below.
+
+Pointers
+~~~~~~~~
+Pointers come in two flavors.
+
+The simple case is a pointer returned by an external function,
+because no care must be taken to keep the object pointed to alive.
+We can safely assume that the structure was allocated by the
+external function. [#]_ Therefore we can use a layout similar to
+Armin's proposal::
+
+    Ptr( GcStruct( "CtypesBox_<TypeName>"
+            ( "c_data"
+                    (Struct "C_Data_<TypeName>"
+                            ( "contents", Ptr(...) ) ) ) ) )
+
+Of course this layout can be simplified to::
+
+    Ptr( GcStruct( "CtypesBox_<TypeName>"
+            ( "contents", Ptr(...) ) ) )
+
+but the layout with the `c_data`-field is similar to the ordinary case
+when the object pointed to was allocated by PyPy::
+
+    Ptr( GcStruct( "CtypesBox_<TypeName>"
+            ( "keepalive"
+                    GcStruct( "CtypesBox_<TypeName>"
+                            ( "c_data"
+                                    Struct( "C_Data_<TypeName>"
+                                            ( "value", Signed/Float/etc. ) ) ) ) ),
+            ( "c_data"
+                    Struct( "C_Data_<TypeName>"
+                            ( "contents", Ptr(
+                                    Struct( "C_Data_<TypeName>"
+                                            ( "value", Signed/Float/etc. ) ) ) ) ) ) ) )
+
+Of course the pointer's `c_data`-field and the `keepalive`-field's
+`GcStruct` point to the same object.
+
+.. [#] It may be necessary to deallocate the structure returned.
+   In this case it should be simple to call an external deallocation
+   function, such as `free`.
+
+Structures Containing Pointers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Structures containing pointers are a bit different, because all `keepalive`-fields
+need to be moved to one end of the structure [#]_ ::
+
+    Ptr( GcStruct( "CtypesBox_<StructName>"
+            ( "container_keepalive" Ptr( ... ) ),
+            ( "pointer_keepalive"
+                    GcStruct( "C_Keepalive_<StructName>"
+                            *<Pointer Fielddefinitions>) ),
+            ( "c_data"
+                    Struct( "C_Data_<StructName>"
+                            *<Fielddefinitions>) ) ) )
+
+Structures containing pointers, and structures that in turn contain
+structures with pointers, need the keepalive pointers to be placed at the
+beginning of the structure.
+
+.. [#] Obviously the start of the structure is a better choice, because it leaves
+   the structure's end for variable length arrays.
+
+Structures Created by Dereferencing an Inner Structure
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In order to be ctypes compatible, structures that are created by dereferencing
+a structure inside another structure need a `container_keepalive`-field. Otherwise
+a dangling pointer will result when the outer structure becomes unreachable and
+is freed by the garbage collector.
+
+Arrays
+~~~~~~
+Arrays of primitive objects don't need keepalive-fields. Arrays of pointers
+should consist of a "C-compatible" part and a `pointer_keepalive`-part unless they are part
+of another structure or array. In the latter case the array's `pointer_keepalive`-part must
+be contained in the `pointer_keepalive` part of the outer structure or array.
+
+Management of Container-Keepalive fields
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+For every outermost structure or array the `container_keepalive`-field points
+to the `GcStruct` containing the structure or array. Every time an array or
+structure object is created by dereferencing an inner array or structure,
+its `container_keepalive`-field is set to the value of the outermost array's
+or structure's `container_keepalive`-field.
+
+The same is true when creating a pointer that points to an inner structure or array.
+The pointer's `keepalive`-field is set from the `container_keepalive` field of
+the outermost structure or array.
+When such a pointer is dereferenced, the `container_keepalive` field of the newly
+created `GcStruct` must be set from the pointer's `keepalive`-field.
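
The propagation rules for `container_keepalive` described in the last section of the diff can be modelled in a few lines of plain Python. This is only a sketch under stated assumptions: the class and method names (`StructBox`, `PointerBox`, `deref_inner`, `pointer_to_inner`, `deref`) are hypothetical and do not appear in the committed text, and ordinary Python references stand in for the `Ptr( GcStruct( ... ) )` fields of the proposed layout.

```python
class StructBox:
    """Hypothetical stand-in for the GcStruct box of a structure or array."""

    def __init__(self, container_keepalive=None):
        # An outermost box refers to itself; an inner view refers to the
        # outermost box that owns the underlying "C-compatible" memory.
        if container_keepalive is None:
            container_keepalive = self
        self.container_keepalive = container_keepalive

    def deref_inner(self):
        # Dereferencing an inner structure or array creates a new box that
        # inherits the outermost container_keepalive.
        return StructBox(self.container_keepalive)

    def pointer_to_inner(self):
        # A pointer to an inner structure or array takes its keepalive from
        # the outermost container_keepalive.
        return PointerBox(self.container_keepalive)


class PointerBox:
    """Hypothetical stand-in for the GcStruct box of a pointer."""

    def __init__(self, keepalive):
        self.keepalive = keepalive

    def deref(self):
        # Dereferencing the pointer hands the keepalive on to the new box.
        return StructBox(self.keepalive)


def demo():
    outer = StructBox()
    inner = outer.deref_inner()
    innermost = inner.deref_inner()
    target = innermost.pointer_to_inner().deref()
    # Every derived box keeps the outermost box reachable.
    return (
        inner.container_keepalive is outer,
        innermost.container_keepalive is outer,
        target.container_keepalive is outer,
    )
```

Because every derived box holds a reference to the outermost one, the garbage collector cannot free the memory that an inner view or pointer still refers to, which is the dangling-pointer case the text sets out to avoid.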


