[Python-Dev] Memory management in the AST parser & compiler

Neal Norwitz nnorwitz at gmail.com
Mon Nov 28 23:58:24 CET 2005

On 11/28/05, Guido van Rossum <guido at python.org> wrote:
> I guess I don't understand the AST compiler code enough to participate
> in this discussion.

I hope everyone will chime in here.  It is important that we improve
this and learn from others.

Let me try to describe the current situation with a small amount of
code.  Hopefully it will give some idea of the larger problems.

This is an entire function from Python/ast.c.  It demonstrates the
issues fairly clearly.  It contains at least one memory leak.  It uses
asdl_seq, which is barely more than a somewhat dynamic array.
Sequences do not know what type they hold, so there need to be
different dealloc functions to free them properly (asdl_*_seq_free()).
The ast_for_*() functions allocate memory, so in case of an error, that
memory needs to be freed.  Most of this memory is internal to the AST
code.  However, there are some identifiers (PyString's) that must be
DECREF'ed.  See below for the memory leak.
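
Before the real function, a toy sketch of why an untyped sequence needs
one free function per element type.  This is not the actual asdl code:
every toy_* name and the live_allocs counter are made up for
illustration.  A shallow free is only correct when the elements own
nothing; used on elements that own a string, it leaks the string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_allocs = 0;  /* outstanding allocations, demo bookkeeping only */

static void *toy_alloc(size_t n)
{
    void *p = calloc(1, n);
    if (p)
        live_allocs++;
    return p;
}

static void toy_release(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical stand-in for asdl_seq: knows its size, not its element type. */
typedef struct {
    size_t size;
    void **elements;
} toy_seq;

typedef struct { char *text; } toy_stmt;  /* an element that owns a heap string */

/* Shallow free: correct only when the elements own nothing themselves. */
static void toy_plain_seq_free(toy_seq *s)
{
    size_t i;
    if (!s)
        return;
    for (i = 0; s->elements && i < s->size; i++)
        toy_release(s->elements[i]);
    toy_release(s->elements);
    toy_release(s);
}

/* Deep free: knows each element is a toy_stmt and frees its string too. */
static void toy_stmt_seq_free(toy_seq *s)
{
    size_t i;
    if (!s)
        return;
    for (i = 0; s->elements && i < s->size; i++) {
        toy_stmt *st = s->elements[i];
        if (st) {
            toy_release(st->text);
            toy_release(st);
        }
    }
    toy_release(s->elements);
    toy_release(s);
}

/* Build a one-element sequence holding a toy_stmt that owns a copy of text. */
static toy_seq *toy_make_stmt_seq(const char *text)
{
    toy_seq *s = toy_alloc(sizeof *s);
    toy_stmt *st = toy_alloc(sizeof *st);
    if (!s || !st)
        goto error;
    s->size = 1;
    s->elements = toy_alloc(sizeof(void *));
    st->text = toy_alloc(strlen(text) + 1);
    if (!s->elements || !st->text)
        goto error;
    strcpy(st->text, text);
    s->elements[0] = st;
    return s;
error:
    if (st)
        toy_release(st->text);
    toy_release(st);
    toy_plain_seq_free(s);
    return NULL;
}

/* Free a statement sequence with free_fn; report how many allocations leaked. */
static int toy_leftover_after(void (*free_fn)(toy_seq *))
{
    int before = live_allocs;
    int leftover;
    char *text;
    toy_seq *s = toy_make_stmt_seq("pass");
    if (!s)
        return -1;
    text = ((toy_stmt *)s->elements[0])->text;
    free_fn(s);
    leftover = live_allocs - before;
    if (leftover > 0)
        toy_release(text);  /* tidy up what the wrong function missed */
    return leftover;
}
```

Calling toy_leftover_after(toy_stmt_seq_free) reports nothing left
over, while toy_leftover_after(toy_plain_seq_free) reports one leaked
allocation (the string).  Nothing in toy_seq itself tells the caller
which of the two functions is the right one.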

static stmt_ty
ast_for_funcdef(struct compiling *c, const node *n)
{
    /* funcdef: 'def' [decorators] NAME parameters ':' suite */
    identifier name = NULL;
    arguments_ty args = NULL;
    asdl_seq *body = NULL;
    asdl_seq *decorator_seq = NULL;
    int name_i;

    REQ(n, funcdef);

    if (NCH(n) == 6) { /* decorators are present */
	decorator_seq = ast_for_decorators(c, CHILD(n, 0));
	if (!decorator_seq)
	    goto error;
	name_i = 2;
    }
    else {
	name_i = 1;
    }

    name = NEW_IDENTIFIER(CHILD(n, name_i));
    if (!name)
	goto error;
    else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
	ast_error(CHILD(n, name_i), "assignment to None");
	goto error;
    }
    args = ast_for_arguments(c, CHILD(n, name_i + 1));
    if (!args)
	goto error;
    body = ast_for_suite(c, CHILD(n, name_i + 3));
    if (!body)
	goto error;

    return FunctionDef(name, args, body, decorator_seq, LINENO(n));

error:
    /* Error path: whatever was allocated above (decorator_seq, name,
       args, body) has to be released before returning. */
    return NULL;
}

The memory leak occurs when FunctionDef fails.  name, args, body, and
decorator_seq are all local and would not be freed.  The simple
variables can be freed in each "constructor" like FunctionDef(), but
the sequences cannot unless they keep the info about which type they
hold.  That would help quite a bit, but I'm not sure it's the
right/best solution.
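
A rough sketch of what "sequences keep the info about which type they
hold" could look like: the sequence stores a pointer to its element
destructor, so a constructor that takes ownership can free it on
failure without knowing the element type.  This is only one possible
shape, not the actual asdl code; typed_seq, toy_FunctionDef,
fail_next_alloc and the other toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs = 0;      /* outstanding allocations, demo bookkeeping */
static int fail_next_alloc = 0;  /* test hook: make the next allocation fail */

static void *toy_alloc(size_t n)
{
    void *p;
    if (fail_next_alloc) {
        fail_next_alloc = 0;
        return NULL;
    }
    p = calloc(1, n);
    if (p)
        live_allocs++;
    return p;
}

static void toy_release(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical typed sequence: it remembers how to free its elements. */
typedef struct {
    size_t size;
    void (*free_elem)(void *);
    void **elements;
} typed_seq;

static void typed_seq_free(typed_seq *s)
{
    size_t i;
    if (!s)
        return;
    for (i = 0; i < s->size; i++)
        if (s->elements[i])
            s->free_elem(s->elements[i]);
    toy_release(s->elements);
    toy_release(s);
}

static typed_seq *typed_seq_new(size_t size, void (*free_elem)(void *))
{
    typed_seq *s = toy_alloc(sizeof *s);
    if (!s)
        return NULL;
    s->elements = toy_alloc((size ? size : 1) * sizeof(void *));
    if (!s->elements) {
        toy_release(s);
        return NULL;
    }
    s->size = size;
    s->free_elem = free_elem;
    return s;
}

typedef struct { typed_seq *body; } toy_funcdef;

/* Constructor that takes ownership of body: if it fails, it frees body
   itself, so the caller has nothing left to clean up. */
static toy_funcdef *toy_FunctionDef(typed_seq *body)
{
    toy_funcdef *f = toy_alloc(sizeof *f);
    if (!f) {
        typed_seq_free(body);
        return NULL;
    }
    f->body = body;
    return f;
}

static void toy_funcdef_free(toy_funcdef *f)
{
    if (!f)
        return;
    typed_seq_free(f->body);
    toy_release(f);
}

/* Build a one-element body and pass it to the constructor, optionally
   forcing the constructor to fail.  Returns the allocations left over. */
static int toy_build(int force_failure)
{
    int before = live_allocs;
    toy_funcdef *f;
    typed_seq *body = typed_seq_new(1, toy_release);
    if (!body)
        return -1;
    body->elements[0] = toy_alloc(sizeof(int));
    fail_next_alloc = force_failure;
    f = toy_FunctionDef(body);
    fail_next_alloc = 0;
    if (force_failure && f != NULL)
        return -2;
    toy_funcdef_free(f);
    return live_allocs - before;
}
```

Both toy_build(0) and toy_build(1) end with nothing left allocated.
The cost is an extra pointer per sequence and a rule that constructors
always own their arguments, which is a design decision in its own
right.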

Hope this helps explain a bit.  Please speak up with how this can be
improved.  Gotta run.

