[Types-sig] "Mobius2" -- high performance Python-extensible Python parser

Jeff Epler jepler@inetnebr.com
Tue, 17 Apr 2001 07:22:31 -0500


On Mon, Apr 16, 2001 at 07:58:39PM -0700, Michel Pelletier wrote:
> mycodegen generates the actual bytecode.  I'm not sure how to get this
> thing to say 'instantiate one of these things defined in a python module'.

Michel,

The demos are somewhat roughly organized, as you can see.  The basic
idea is to associate one "my*" file with each step of the compilation
process.

The process looks like this:

source code -> (tokenization, parsing) -> AST
	This step is modified by using a different Grammar file
AST -> Nodes
	This step is modified by adding new nodes to 'mynodes.py'
	*and* visiting new AST subtrees in 'myxform.py'
Nodes -> Bytecode
	This step is modified by handling new nodes in 'mycodegen.py'
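
The three stages above can be sketched with the standard-library 'ast'
module of current Python instead of Tools/compiler.  This is only a rough
analogue: ast.parse covers tokenizing, parsing, and building the node
tree in one call, and the DoubleConstants class and compile_source
function are hypothetical names invented for this illustration, not part
of the demos.

```python
import ast

SOURCE = "answer = 6 * 7\n"


class DoubleConstants(ast.NodeTransformer):
    """Hypothetical tree rewrite standing in for the 'myxform.py' step."""

    def visit_Constant(self, node):
        # Double every integer literal; leave all other constants alone.
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node


def compile_source(source, filename="<demo>"):
    # Stage 1: source code -> tree (tokenization and parsing).
    tree = ast.parse(source, filename=filename)
    # Stage 2: rewrite the tree (the step a custom transformer hooks into).
    tree = DoubleConstants().visit(tree)
    ast.fix_missing_locations(tree)
    # Stage 3: tree -> bytecode (a code object).
    return compile(tree, filename, "exec")


namespace = {}
exec(compile_source(SOURCE), namespace)
```

After this runs, namespace["answer"] holds the value computed from the
rewritten literals (12 * 14), which shows that the change was made
between parsing and code generation.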

The AST -> Nodes -> Bytecode steps are preexisting in Python 2.0's
Tools/compiler, though I may have chosen a clumsy way to extend them.
(In particular, things are fragile, because you must magically make your
new Grammar have the same symbol numbers as the builtin grammar for all
nonterminals which exist in both, and then get the 'addsym' statements in
'mynodes' in the right order.  I'm trying to cook up a good way to address
this in the future.)
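
Until then, one way to catch the symbol-number problem early is a sanity
check that compares the two symbol tables.  The sketch below assumes each
grammar's symbols are available as a plain name -> number dictionary; the
function name and all of the numbers are made up for illustration.

```python
def find_symbol_mismatches(builtin_symbols, custom_symbols):
    """Return {name: (builtin_number, custom_number)} for every
    nonterminal that exists in both grammars with different numbers."""
    mismatches = {}
    for name, number in builtin_symbols.items():
        if name in custom_symbols and custom_symbols[name] != number:
            mismatches[name] = (number, custom_symbols[name])
    return mismatches


# Illustrative tables only; these are not real symbol numbers.
builtin = {"file_input": 257, "funcdef": 259, "stmt": 264}
custom = {"file_input": 257, "funcdef": 260, "stmt": 264, "template_def": 320}

mismatches = find_symbol_mismatches(builtin, custom)
```

Here only 'funcdef' is reported, because it is the one shared nonterminal
whose number differs; 'template_def' exists only in the new grammar and
is ignored.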

Anyhow, as to how to add code to some special spots in your new grammar ...

Take a look at demo/html2/compile_template.py (originally snatched from
Quixote).  At each file_input (the top node in a module), it adds the
equivalent of 'from IO_MODULE import IO_CLASS', and at the top of each
function it instantiates an IO_CLASS.

html2 adds some nodes for the desired code within the transformer, and then
lets the code generation module generate the associated code.

Jeff
jepler@inetnebr.com

    def file_input(self, nodelist):
        # Add a "from IO_MODULE import IO_CLASS" statement to the
        # beginning of the module.
        doc = self.get_docstring(nodelist, symbol.file_input)
        imp = Node('from', IO_MODULE, [(IO_CLASS, None)])

        # Add an IO_INSTANCE binding for module level expressions (like
        # doc strings).  This instance will not be returned.
        klass = Node('name', IO_CLASS)
        instance = Node('call_func', klass, [])
        assign_name = Node('ass_name', IO_INSTANCE, OP_ASSIGN)
        assign = Node('assign', [assign_name], instance)

        stmts = [ imp, assign ]

        for node in nodelist:
            if node[0] != token.ENDMARKER and node[0] != token.NEWLINE:
                self.com_append_stmt(stmts, node)

        return Node('module', doc, Node('stmt', stmts))

    def funcdef(self, nodelist):
        # ...
        # create an instance, assign to IO_INSTANCE
        klass = Node('name', IO_CLASS)
        instance = Node('call_func', klass, [])
        assign_name = Node('ass_name', IO_INSTANCE, OP_ASSIGN)
        assign = Node('assign', [assign_name], instance)
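
For comparison, the same transformation can be sketched with the
standard-library 'ast' module of current Python.  This is not the code
from compile_template.py: the class AddIOInstance and the helper
functions are hypothetical, and the values chosen for IO_MODULE,
IO_CLASS, and IO_INSTANCE are stand-ins so that the example runs on its
own.  Unlike the excerpt above, the sketch keeps a leading docstring in
place and puts the new statements after it.

```python
import ast

# Stand-in values (assumptions); the real demo defines its own.
IO_MODULE = "io"
IO_CLASS = "StringIO"
IO_INSTANCE = "_out"


def _make_instance_assign():
    # Builds the statement: IO_INSTANCE = IO_CLASS()
    return ast.Assign(
        targets=[ast.Name(id=IO_INSTANCE, ctx=ast.Store())],
        value=ast.Call(
            func=ast.Name(id=IO_CLASS, ctx=ast.Load()), args=[], keywords=[]
        ),
    )


def _split_docstring(body):
    # Separate a leading docstring statement (if any) from the rest.
    if (body and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)):
        return body[:1], body[1:]
    return [], body


class AddIOInstance(ast.NodeTransformer):
    def visit_Module(self, node):
        self.generic_visit(node)  # rewrite nested functions first
        doc, rest = _split_docstring(node.body)
        imp = ast.ImportFrom(
            module=IO_MODULE,
            names=[ast.alias(name=IO_CLASS, asname=None)],
            level=0,
        )
        node.body = doc + [imp, _make_instance_assign()] + rest
        return node

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        doc, rest = _split_docstring(node.body)
        node.body = doc + [_make_instance_assign()] + rest
        return node


SOURCE = '''
def greet(name):
    """Write a greeting."""
    _out.write("hello " + name)
    return _out.getvalue()
'''

tree = AddIOInstance().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, "<demo>", "exec"), namespace)
result = namespace["greet"]("world")
```

The source never defines '_out' itself; the transformer supplies it at
the top of the module and again at the top of each function, so every
call to greet() writes into a fresh instance.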