[Python-checkins] python/dist/src/Python compile.txt,,

bcannon at users.sourceforge.net
Wed Mar 16 21:20:21 CET 2005

Update of /cvsroot/python/python/dist/src/Python
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv14411/Python

Modified Files:
      Tag: ast-branch
Log Message:
Add a ton of XXX comments on where information is lacking or where work needs
to be done on the doc.

Added an empty "Known Bugs/Issues" section to act as a place to note where
current things are broken.

Added a "ToDo" section on known things that need to be worked on that are not
explicit bugs.

Index: compile.txt
RCS file: /cvsroot/python/python/dist/src/Python/Attic/compile.txt,v
retrieving revision
retrieving revision
diff -u -d -r1.1.2.8 -r1.1.2.9
--- compile.txt	15 Oct 2003 06:26:13 -0000
+++ compile.txt	16 Mar 2005 20:20:18 -0000
@@ -18,10 +18,13 @@
 The definition describes the structure of statements, expressions, and
 several specialized types, like list comprehensions and exception
 handlers.  Most definitions in the AST correspond to a particular
-source construct, like an if statement or an attribute lookup.  The
+source construct, like an 'if' statement or an attribute lookup.  The
 definition is independent of its realization in any particular
 programming language.
+XXX Is byte stream format what marshal_* fxns create?
+XXX no AST->python yet, right?
 The AST has concrete representations in Python and C.  There is also a
 representation as a byte stream, so that AST objects can be passed
 between Python and C.  (ASDL calls this format the pickle format, but
@@ -32,6 +35,7 @@
 The following fragment of the Python abstract syntax demonstrates the
+XXX update example once decorators in
   module Python
@@ -76,7 +80,7 @@
 It also generates a series of constructor functions that generate a
-``stmt_ty`` with the appropriate initialization.  The ``kind`` field
+``stmt_ty`` struct with the appropriate initialization.  The ``kind`` field
 specifies which component of the union is initialized.  The
 ``FunctionDef`` C function sets ``kind`` to ``FunctionDef_kind`` and
 initializes the ``name``, ``args``, and ``body`` fields.
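
A minimal sketch of the ``kind`` plus fields idea described above.  It does
not use the generated C constructors; it assumes a later CPython release that
exposes the AST to Python code through the ``ast`` module, where the node
class name plays the role of ``kind``.  The helper name
``describe_function_def`` is hypothetical and exists only for illustration.

```python
import ast


def describe_function_def(source):
    """Parse source and summarize its first statement, assumed to be a def.

    Returns (kind, name, number of positional args, number of body stmts).
    """
    module = ast.parse(source)
    node = module.body[0]
    # The class name stands in for the C-level ``kind`` tag.
    kind = type(node).__name__
    return kind, node.name, len(node.args.args), len(node.body)


if __name__ == "__main__":
    print(describe_function_def("def f(a, b):\n    return a\n"))
```

The ``name``, ``args``, and ``body`` attributes read here correspond to the
fields that the ``FunctionDef`` constructor is said to initialize.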
@@ -84,8 +88,10 @@
+XXX Make sure basic flow of execution is covered
 The parser generates a concrete syntax tree represented by a ``node
-*`` defined in ``Include/node.h``.  Node indexing starts at 0.  Every
+*`` as defined in ``Include/node.h``.  Node indexing starts at 0.  Every
 token that can have whitespace surrounding it is its own token.  This
 means that something like "else:" is actually two tokens: 'else' and
@@ -96,7 +102,7 @@
     mod_ty PyAST_FromNode(const node *n);
 It does this by calling various functions in the file that all have the
-name ast_for_xxx where xxx is what the rule of the grammar (as defined
+name ast_for_xx where xx is the rule of the grammar (as defined
 in ``Grammar/Grammar``) that the function handles (alias_for_import_name
 is the exception to this).  These in turn call the constructor functions
 as defined by the ASDL grammar to create the nodes of the AST.
@@ -124,12 +130,17 @@
 Code Generation and Basic Blocks
+XXX Reformat: general discussion of basic blocks and compiler ideas (namespace
+generation, etc.), then discuss code structure and helper functions
 XXX Describe the structure of the code generator, the types involved,
 and the helper functions and macros.
+XXX Make sure flow of execution (namespace resolution, etc.) is covered after
+explanation of macros/functions
 - for each ast type (mod, stmt, expr, ...), define a function with a
   switch statement.  inline code generation for simple things,
-  call function compiler_xxx where xxx is kind name for others.
+  call the function compiler_xx where xx is the kind of type in question for
+  others.
 The macros used to emit specific opcodes and to generate code for
 generic node types use string concatenation to produce calls to the
@@ -153,10 +164,18 @@
 Code is generated using a simple, basic block interface.
   - each block has a single entry point
+      * means code in basic block always starts executing at a single place
+      * does not exclude multiple blocks pointing to the same entry point
   - possibly multiple exit points
   - when generating jumps, always jump to a block
  - for a code unit, blocks are identified by their int id
+Basic blocks are thus used to model the flow of control through a program.
+The resulting structure is often called a CFG (control flow graph).  It is
+a directed graph and can contain cycles, since modeling loops requires them.
+Below are the macros and functions used for managing basic blocks:
 - NEW_BLOCK() -- create block and set it as current
 - NEXT_BLOCK() -- NEW_BLOCK() plus jump from current block
 - compiler_new_block() -- create a block but don't use it
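
The basic block interface listed above can be modeled in a few lines.  This
is a rough Python sketch, not the C code in newcompile.c: the ``Block`` and
``Unit`` classes, ``use_block``, ``add_jump``, ``build_while_loop`` and
``has_cycle`` are all hypothetical names, and the opcode strings are only
labels.  It shows blocks identified by int id, jumps that always target a
block, and the cycle that a loop introduces into the CFG.

```python
class Block:
    def __init__(self, block_id):
        self.id = block_id
        self.instructions = []  # (opcode, target block id) pairs
        self.successors = []    # ids of blocks control may flow to


class Unit:
    """One code unit; blocks are identified by their index in self.blocks."""

    def __init__(self):
        self.blocks = []
        self.current = None

    def compiler_new_block(self):
        # Create a block but do not make it current.
        block = Block(len(self.blocks))
        self.blocks.append(block)
        return block.id

    def use_block(self, block_id, fall_through=False):
        if fall_through and self.current is not None:
            self.current.successors.append(block_id)
        self.current = self.blocks[block_id]

    def new_block(self):
        # Models NEW_BLOCK(): create a block and set it as current.
        block_id = self.compiler_new_block()
        self.use_block(block_id)
        return block_id

    def next_block(self):
        # Models NEXT_BLOCK(): NEW_BLOCK() plus an edge from the old block.
        previous = self.current
        block_id = self.new_block()
        if previous is not None:
            previous.successors.append(block_id)
        return block_id

    def add_jump(self, opcode, target_id):
        # Jumps always name a block, never a raw offset.
        self.current.instructions.append((opcode, target_id))
        self.current.successors.append(target_id)


def build_while_loop():
    """Lay out entry -> test -> body -> test, with test -> exit."""
    unit = Unit()
    unit.new_block()                      # entry block, id 0
    test = unit.next_block()              # loop test, id 1
    body = unit.compiler_new_block()      # id 2
    exit_ = unit.compiler_new_block()     # id 3
    unit.add_jump("JUMP_IF_FALSE", exit_)
    unit.use_block(body, fall_through=True)
    unit.add_jump("JUMP_ABSOLUTE", test)  # back edge creates the cycle
    unit.use_block(exit_)
    return unit


def has_cycle(unit):
    """Depth-first search for a back edge in the block graph."""
    white, grey, black = 0, 1, 2
    color = [white] * len(unit.blocks)

    def visit(i):
        color[i] = grey
        for s in unit.blocks[i].successors:
            if color[s] == grey:
                return True
            if color[s] == white and visit(s):
                return True
        color[i] = black
        return False

    return any(color[i] == white and visit(i)
               for i in range(len(unit.blocks)))
```

Keeping jump targets as block ids until the end is what allows the final
offsets to be computed in one pass after all blocks are known.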
@@ -171,24 +190,26 @@
   ADDOP_O(c, opcode, oparg, namespace) -- oparg is a PyObject * ,
      namespace is the name of a code object member that contains
      the set of objects.  For example,
-         ADDOP_O(c, LOAD_CONST, obj, consts)
+         ``ADDOP_O(c, LOAD_CONST, obj, consts)``
      will make sure that obj is in co_consts and that the opcode
      argument will be an index into co_consts.  The valid names
      are consts, names, varnames, ... 
+  ADDOP_NAME(c, op, o, type) -- XXX
-     - Explain what each namespace is for.
+     XXX Explain what each namespace is for.
-  ADDOP_JABS() -- oparg is an absolute jump to block id
-  ADDOP_JREL() -- oparg is a relative jump to block id 
+  ADDOP_JABS(XXX) -- oparg is an absolute jump to block id
+  ADDOP_JREL(XXX) -- oparg is a relative jump to block id 
   XXX no need for JABS() and JREL(), always computed at the
   end from block id
 - symbol table pass and compiler_nameop()
 Code Objects
-XXX Describe Python code objects.
+XXX Describe Python code objects: fields, etc.
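
The ADDOP_O behaviour described above (the object ends up in the named
table and the opcode argument is its index) can be sketched as follows.
This is an illustrative Python model under stated assumptions, not the C
macro: ``CodeUnit``, ``add_object`` and ``addop_o`` are hypothetical names,
and comparing by type as well as value is an assumption made here so that
``1`` and ``1.0`` get separate entries.

```python
class CodeUnit:
    """Holds the per-code-object tables and the emitted instructions."""

    def __init__(self):
        self.consts = []
        self.names = []
        self.varnames = []
        self.instructions = []  # (opcode, index) pairs


def add_object(table, obj):
    """Return the index of obj in table, appending it if it is absent."""
    for index, existing in enumerate(table):
        # Assumption: match on type and value, so 1 and 1.0 stay distinct.
        if type(existing) is type(obj) and existing == obj:
            return index
    table.append(obj)
    return len(table) - 1


def addop_o(unit, opcode, obj, namespace):
    """Emit opcode whose argument is the index of obj in unit.<namespace>."""
    table = getattr(unit, namespace)
    index = add_object(table, obj)
    unit.instructions.append((opcode, index))
    return index


if __name__ == "__main__":
    unit = CodeUnit()
    addop_o(unit, "LOAD_CONST", 42, "consts")
    addop_o(unit, "LOAD_CONST", 42, "consts")  # reuses the existing slot
    print(unit.consts, unit.instructions)
```

Passing the table name as the last argument mirrors the
``ADDOP_O(c, LOAD_CONST, obj, consts)`` call shape quoted in the diff.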
@@ -211,13 +232,15 @@
 + Python/
     - Python-ast.c
-        Creates C structs corresponding to the ASDL types.
+        Creates C structs corresponding to the ASDL types.  Also contains code
+        for marshaling AST nodes (core ASDL types have marshaling code in
+        asdl.c).
         "File automatically generated by ../Parser/asdl_c.py".
     - asdl.c
         Contains code to handle the ASDL sequence type.  Also has code
         to handle marshalling the core ASDL types, such as number and
-        identifier.
+        identifier.  Used by Python-ast.c for marshaling AST nodes.
     - ast.c
         Converts Python's concrete syntax tree into the abstract syntax tree.
@@ -241,6 +264,37 @@
     - ast.h
         Declares PyAST_FromNode() external (from ../Python/ast.c).
+Known Bugs/Issues
++ Grammar support (Parser/Python.asdl, Parser/asdl_c.py)
+    - decorators
+    - empty base class list (``class Class(): pass``)
+    - AST->Python object support
++ CST->AST support (Python/ast.c)
+    - decorators
+    - generator expressions
++ AST->bytecode support (Python/newcompile.c)
+    - decorators
+    - generator expressions
++ Stdlib support
+    - AST->Python access?
+    - rewrite compiler package to mirror AST structure?
++ Documentation
+    - flesh out this doc
+        * compiler concepts covered
+        * structure and flow of all steps clearly explained
+        * break up into more sections/subsections
++ Universal
+    - make sure entire test suite passes
+    - fix memory leaks
+    - make sure return types are properly checked for errors
