[Types-sig] Re: PEP 269

Sat Sep 15 03:37:16 EDT 2001

[Jonathan Riehl, on
 http://python.sourceforge.net/peps/pep-0269.html]

Just noting that Donald Beaudry wrapped a module around pgen in the
mid-90's.  If you can find a mirror that still has the old Contrib directory
from python.org, the last version was pgenmodule-0.2.2.tar.gz.  I'll attach
its README file.

if-it-was-never-used-then-i'm-not-sure-what's-changed-ly y'rs  - tim

This is the "pgen" module.  It is a set of bindings to pgen, the
parser generator that is used to build Python's compiler.  Included is
a sample application, python.py, which constructs relatively compact
ASTs from Python source code.

The module contains two functions:
    parser_from_file -- which creates a grammar object from a grammar
        description file.

    parser_from_string -- which creates a grammar object from a grammar
        description contained in a string.

The grammar object has two methods:
    parse_file -- which creates a parse tree from a file.

    parse_string -- which creates a parse tree from a string.

(Both parse_file and parser_from_file require a file name to be given.
 This could easily (and should) be changed to work with anything that
 has a fileno method.)

The grammar object supports a "reduction function" dictionary.  This
dictionary is accessed through the "reductions" attribute.  After the
actual parsing is completed, a bottom-up reduction pass is made over
the parse tree (someday I would like to eliminate the initial
construction of the parse tree).

This means that for each terminal and non-terminal in the parse tree,
the reduction function dictionary is checked for an entry whose name
matches that of the terminal or non-terminal.  If one is found, the
function is called.  The result of the call is used to replace the
current node in the parse tree.  The reduction process continues until
all nodes in the tree have been reduced.

This mechanism is intended to be a convenient way to remove
unnecessary nodes from the parse tree.  Consider this example grammar:

        list: '[' value (',' value)* [','] ']'
        value: NAME | STRING

Each punctuation mark '[', ',', and ']' will show up in the resulting
parse tree as a separate node.  It is very unlikely that this will be
useful to the application.  The following reduction function will
clean things up.

        def list(g, tok, where, children):
            l = []
            for child in children:
                if child[0] == g.NT.value:
                    l.append(child[0])
            return (tok, l)

It replaces the current "list" node with a tuple containing only the
token g.NT.list and the list of values represented as a regular old
Python list.  Take a look at example.py for a larger example.
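
The bottom-up pass described above can be imitated in plain Python
without the pgen module at all.  What follows is only a sketch under
stated assumptions, not the module's actual implementation: the node
layout (a (name, payload) tuple), the leaf names LSQB, COMMA and RSQB,
the helper reduce_tree, and the simplified two-argument reduction
signature are all hypothetical stand-ins for the real
(g, tok, where, children) interface.

```python
# Assumed node layout (not documented in the README):
#   leaf     -> (name, text)
#   interior -> (name, [child, child, ...])

def reduce_tree(node, reductions):
    """Bottom-up pass: reduce the children first, then the node itself."""
    name, payload = node
    if isinstance(payload, list):
        payload = [reduce_tree(child, reductions) for child in payload]
    func = reductions.get(name)
    if func is None:
        # No reduction registered under this name: keep the node.
        return (name, payload)
    # The function's result replaces the current node.
    return func(name, payload)


def reduce_value(name, children):
    # A "value" node wraps a single NAME or STRING leaf; keep only its text.
    return (name, children[0][1])


def reduce_list(name, children):
    # Drop the punctuation leaves; keep the text of each reduced value.
    return (name, [child[1] for child in children if child[0] == "value"])


# Reduction functions are looked up by terminal or non-terminal name.
reductions = {"value": reduce_value, "list": reduce_list}

# Hand-built parse tree for the input: [a, "b"]
tree = ("list", [
    ("LSQB", "["),
    ("value", [("NAME", "a")]),
    ("COMMA", ","),
    ("value", [("STRING", '"b"')]),
    ("RSQB", "]"),
])

result = reduce_tree(tree, reductions)
print(result)
```

Because the children are reduced before their parent, reduce_list only
ever sees "value" nodes that have already been collapsed to their text,
so the punctuation leaves can be dropped in a single filter.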

--
Donald Beaudry                                         Silicon Graphics
Compilers/MTI                                          1 Cabot Road
donb at sgi.com                                           Hudson, MA 01749
                  ...So much code, so little time...