[Python-Dev] Memory management in the AST parser & compiler
Nick Coghlan
ncoghlan at iinet.net.au
Tue Nov 15 10:31:03 CET 2005
Transferring part of the discussion of Thomas Lee's PEP 341 patch to
python-dev. . .
Neal Norwitz wrote in the SF patch tracker:
> Thomas, I hope you will write up this experience in coding
> this patch. IMO it clearly demonstrates a problem with the
> new AST code that needs to be addressed. ie, Memory
> management is not possible to get right. I've got a 700+
> line patch to ast.c to correct many more memory issues
> (hopefully that won't cause conflicts with this patch). I
> would like to hear ideas of how the AST code can be improved
> to make it much easier to not leak memory and be safe at the
> same time.
As Neal pointed out, it's tricky to write code for the AST parser and compiler
without accidentally leaking memory when the parser or compiler runs into
a problem and has to bail out of whatever it was doing. Thomas's patch got to
v5 (based on Neal's review comments) with memory leaks still in it. My review
got rid of some of them, and we think Neal's last review of v6 of the patch
got rid of the rest.
I am particularly concerned about the returns hidden inside macros in the AST
compiler's symbol table generation and bytecode generation steps. At the
moment, every function in compile.c which allocates code blocks (or anything
else for that matter) and then calls one of the VISIT_* macros is a memory
leak waiting to happen.
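To make the problem concrete, here is a minimal sketch of how a return hidden
in a macro skips cleanup. Every name in it (VISIT_OR_RETURN, visit_node,
block_new, block_free, live_blocks, compile_stmt) is made up for illustration
and is not taken from compile.c; a counter over a static pool stands in for
real allocation so the leak can be observed.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from compile.c. */
static int live_blocks = 0;     /* blocks handed out and not yet released */
static int pool[8];

static int *block_new(void)
{
    return live_blocks < 8 ? &pool[live_blocks++] : NULL;
}

static void block_free(int *block)
{
    if (block != NULL)
        live_blocks--;
}

/* A visitor that fails for negative input. */
static int visit_node(int node)
{
    return node >= 0;
}

/* The return is hidden inside the macro, so callers cannot see it. */
#define VISIT_OR_RETURN(node) \
    do { if (!visit_node(node)) return 0; } while (0)

static int compile_stmt(int node)
{
    int *block = block_new();
    if (block == NULL)
        return 0;
    VISIT_OR_RETURN(node);      /* on failure, returns without freeing block */
    block_free(block);
    return 1;
}
```

After a failed call, live_blocks stays at 1: the block was never released,
and nothing in the body of compile_stmt shows why.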
Something I've seen used successfully (and used myself) to deal with similar
resource-management problems in C code is to use a switch statement, rather
than getting goto-happy.
Specifically, the body of the entire function is written inside a switch
statement, with 'break' then used as the equivalent of "raise Exception". For
example:
    PyObject* switchAsTry()
    {
        switch(0) {
          default:
            /* Real function body goes here */
            return result;
        }
        /* Error cleanup code goes here */
        return NULL;
    }
It avoids the potential for labelling problems that arises when gotos are
used for resource cleanup. It's a far cry from real exception handling, but
it's the best solution I've seen within the limits of C.
A particular benefit comes when macros which may abort function execution are
used inside the function - if those macros are rewritten to use break instead
of return, then the function gets a chance to clean up after an error.
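A rough sketch of what a break-based macro could look like inside such a
switch follows. The names (VISIT_OR_BREAK, visit_node, block_new, block_free,
live_blocks, compile_stmt) are invented for illustration and are not the
actual macros or functions in compile.c. A counter over a static pool stands
in for real allocation so the cleanup can be checked.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from compile.c. */
static int live_blocks = 0;     /* blocks handed out and not yet released */
static int pool[8];

static int *block_new(void)
{
    return live_blocks < 8 ? &pool[live_blocks++] : NULL;
}

static void block_free(int *block)
{
    if (block != NULL)
        live_blocks--;
}

/* A visitor that fails for negative input. */
static int visit_node(int node)
{
    return node >= 0;
}

/* No do { } while (0) wrapper: a break inside one would only leave the
   wrapper, not the enclosing switch. */
#define VISIT_OR_BREAK(node) \
    if (visit_node(node)) { } else break

static int compile_stmt(int node)
{
    int *block = NULL;

    switch (0) {
    default:
        block = block_new();
        if (block == NULL)
            break;
        VISIT_OR_BREAK(node);   /* on failure, falls out to the cleanup code */
        block_free(block);
        return 1;
    }
    /* Error cleanup code */
    block_free(block);
    return 0;
}
```

One caveat with this sketch: if the macro is used inside a loop or a nested
switch in the function body, the break only leaves that inner construct, so
the check would have to be repeated after it.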
Cheers,
Nick.
P.S. Getting rid of the flow control macros entirely is another option, of
course, but it would make compile.c and symtable.c a LOT harder to follow.
Raymond Chen's articles notwithstanding, a preprocessor-based mini-language
does make sense in some situations, and I think this is one of them.
Particularly since the flow control macros are private to the relevant
implementation files.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com