[Python-Dev] Memory management in the AST parser & compiler

Jeremy Hylton jeremy at alum.mit.edu
Mon Nov 28 22:23:00 CET 2005


On 11/28/05, Guido van Rossum <guido at python.org> wrote:
> > The code becomes quite cluttered when
> > it uses reference counting.  Right now, the AST is created with
> > malloc/free, but that makes it hard to free the ast at the right time.
>
> Would fixing the code to add free() calls in all the error exits make
> it more or less cluttered than using reference counting?

If we had an arena API, we'd only need to call free on the arena at
top-level entry points.  If an error occurs deep inside the compiler,
the arena will still get cleaned up by calling free at the top.
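
A minimal sketch of what such an arena could look like.  The names
(arena_new, arena_malloc, arena_free, build_nodes, compile_top) are
made up for illustration and are not an existing API; it assumes a C11
compiler for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation.  The max_align_t member
   keeps the payload that follows suitably aligned for any object type. */
typedef union arena_block {
    union arena_block *next;
    max_align_t align;
} arena_block;

/* Hypothetical arena: it simply remembers every block it handed out. */
typedef struct {
    arena_block *blocks;   /* most recent allocation first */
    size_t count;          /* number of blocks currently owned */
} arena;

static arena *arena_new(void)
{
    arena *a = malloc(sizeof(*a));
    if (a != NULL) {
        a->blocks = NULL;
        a->count = 0;
    }
    return a;
}

/* Allocate size bytes owned by the arena.  Returns NULL on failure. */
static void *arena_malloc(arena *a, size_t size)
{
    assert(a != NULL);
    arena_block *b = malloc(sizeof(*b) + size);
    if (b == NULL)
        return NULL;
    b->next = a->blocks;
    a->blocks = b;
    a->count++;
    return b + 1;          /* payload starts right after the header */
}

/* Release every block and the arena itself; returns the number of
   blocks that were released. */
static size_t arena_free(arena *a)
{
    size_t freed = 0;
    arena_block *b = a->blocks;
    while (b != NULL) {
        arena_block *next = b->next;
        free(b);
        b = next;
        freed++;
    }
    free(a);
    return freed;
}

/* Stand-in for a deeply nested compiler step: it allocates and may
   fail, but never frees anything itself. */
static int build_nodes(arena *a, int fail)
{
    if (arena_malloc(a, 32) == NULL)
        return -1;
    if (arena_malloc(a, 64) == NULL)
        return -1;
    return fail ? -1 : 0;
}

/* Top-level entry point: the only place that frees. */
static int compile_top(int fail)
{
    arena *a = arena_new();
    if (a == NULL)
        return -1;
    int rc = build_nodes(a, fail);
    arena_free(a);         /* runs on both the success and error path */
    return rc;
}
```

The error path in build_nodes has no cleanup code at all; compile_top
releases everything whether or not the inner call succeeded.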

> >  It would be fairly complex to convert the ast nodes to pyobjects.
> > They're just simple discriminated unions right now.
>
> Are they all the same size?

No.  Each type is a different size and there are actually a lot of
types -- statements, expressions, arguments, slices, &c.  All the
objects of one type are the same size.
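
For readers who haven't looked at the generated code, here is a rough
sketch of the shape of these nodes.  The type and field names are
illustrative only and are not copied from the real headers.

```c
#include <assert.h>
#include <stddef.h>

/* Each node type carries a tag saying which union member is valid. */
typedef enum { Num_kind, BinOp_kind } expr_kind;

typedef struct expr {
    expr_kind kind;
    union {
        struct { long n; } Num;
        struct { struct expr *left; char op; struct expr *right; } BinOp;
    } v;
} expr;

typedef enum { Expr_kind, Assign_kind } stmt_kind;

/* A different node type with different members, so sizeof(stmt) need
   not match sizeof(expr), while every stmt has the same size as every
   other stmt. */
typedef struct stmt {
    stmt_kind kind;
    int lineno;
    union {
        struct { expr *value; } Expr;
        struct { const char *target; expr *value; } Assign;
    } v;
} stmt;

/* Consumers switch on the tag before touching the union. */
static long expr_eval(const expr *e)
{
    assert(e != NULL);
    switch (e->kind) {
    case Num_kind:
        return e->v.Num.n;
    case BinOp_kind: {
        long l = expr_eval(e->v.BinOp.left);
        long r = expr_eval(e->v.BinOp.right);
        return e->v.BinOp.op == '*' ? l * r : l + r;
    }
    }
    return 0;
}
```

Because every node of a given type has one fixed size, a per-type
constructor can ask an allocator for exactly sizeof(expr) or
sizeof(stmt) bytes.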

> > If they were
> > allocated from an arena, the entire arena could be freed when the
> > compilation pass ends.
>
> Then I don't understand why there was discussion of alloca() earlier
> on -- surely the lifetime of a node should not be limited by the stack
> frame that allocated it?

Actually this is a pretty good limit, because all these data
structures are temporaries used by the compiler.  Once compilation has
finished, there's no need for the AST or the compiler state.

> I'm not in principle against having an arena for this purpose, but I
> worry that this will make it really hard to provide a Python API for
> the AST, which has already been requested and whose feasibility
> (unless I'm mistaken) also was touted as an argument for switching to
> the AST compiler in the first place. I hope we'll never have to deal
> with an API like the parser module provides...

My preference would be to have the AST shared by value.  We would
generate code to serialize it to and from a byte stream and share that
between Python and C.  It is less efficient, but it is also very simple.
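
A hand-written sketch of the kind of code that might be generated, for
a toy expression type flattened into one struct for brevity.  The byte
format (a tag byte, then either a 4-byte big-endian number or an
operator followed by two children) and all names are assumptions made
up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { Num_kind, BinOp_kind } expr_kind;

typedef struct expr {
    expr_kind kind;
    uint32_t n;                  /* valid for Num */
    char op;                     /* valid for BinOp */
    struct expr *left, *right;   /* valid for BinOp */
} expr;

/* Write e in prefix order; returns bytes written, or 0 if buf is full. */
static size_t expr_write(const expr *e, unsigned char *buf, size_t cap)
{
    assert(e != NULL);
    if (e->kind == Num_kind) {
        if (cap < 5)
            return 0;
        buf[0] = 'N';
        buf[1] = (unsigned char)(e->n >> 24);
        buf[2] = (unsigned char)(e->n >> 16);
        buf[3] = (unsigned char)(e->n >> 8);
        buf[4] = (unsigned char)e->n;
        return 5;
    }
    if (cap < 2)
        return 0;
    buf[0] = 'B';
    buf[1] = (unsigned char)e->op;
    size_t l = expr_write(e->left, buf + 2, cap - 2);
    if (l == 0)
        return 0;
    size_t r = expr_write(e->right, buf + 2 + l, cap - 2 - l);
    return r == 0 ? 0 : 2 + l + r;
}

static void expr_free(expr *e)
{
    if (e == NULL)
        return;
    expr_free(e->left);
    expr_free(e->right);
    free(e);
}

/* Read one node starting at *pos and advance *pos past it.
   Returns NULL on malformed input or allocation failure. */
static expr *expr_read(const unsigned char **pos, const unsigned char *end)
{
    if (*pos >= end)
        return NULL;
    unsigned char tag = *(*pos)++;
    expr *e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;
    e->n = 0;
    e->op = 0;
    e->left = e->right = NULL;
    if (tag == 'N' && end - *pos >= 4) {
        const unsigned char *p = *pos;
        e->kind = Num_kind;
        e->n = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
        *pos += 4;
        return e;
    }
    if (tag == 'B' && end - *pos >= 1) {
        e->kind = BinOp_kind;
        e->op = (char)*(*pos)++;
        e->left = expr_read(pos, end);
        e->right = e->left != NULL ? expr_read(pos, end) : NULL;
        if (e->right != NULL)
            return e;
    }
    expr_free(e);                /* also frees a half-built subtree */
    return NULL;
}

/* Structural comparison, used to check a round trip. */
static int expr_equal(const expr *a, const expr *b)
{
    if (a->kind != b->kind)
        return 0;
    if (a->kind == Num_kind)
        return a->n == b->n;
    return a->op == b->op && expr_equal(a->left, b->left)
        && expr_equal(a->right, b->right);
}
```

The reader builds a private copy, so neither side holds pointers into
the other's memory; the cost is a full copy of the tree on every
crossing.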

Jeremy

