parsing python code

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Wed Dec 5 20:06:52 EST 2007


On Wed, 05 Dec 2007 14:02:35 -0300, Ryan Krauss <ryanlists at gmail.com>
wrote:

> I need to parse a Python file by breaking it into blocks matching
> indentation levels so that function definitions, for loops, and
> classes are kept together as blocks.  [...]
>
> I think the parser module should enable me to do this, but I can't
> seem to figure it out.  Specifically, I think I need to use
> parser.sequence2ast, but it doesn't work the way I think it should and
> I can't find more documentation on it or an example that uses it.

I think the tokenize module is better suited:
<http://docs.python.org/lib/module-tokenize.html>

Accumulate tokens up to each NEWLINE token to rebuild complete logical
source lines.
Watch for INDENT and DEDENT tokens to delimit blocks, keeping track of the
indentation level (you appear to be concerned with outer blocks only).
This way you can easily get all indented blocks. Catching one-liners takes
some extra parsing, at least enough to tell whether you are inside one of
the compound statements if/while/for/try/with/def/class: if the header is
not followed by a NEWLINE and then an INDENT, it is a single-line statement.

-- 
Gabriel Genellina



