[Python-Dev] Memory management in the AST parser & compiler
Thomas Lee
krumms at gmail.com
Wed Nov 16 10:49:50 CET 2005
As the writer of the crappy code that sparked this conversation, I feel
I should say something :)
Brett Cannon wrote:
>On 11/15/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
>
>
>>On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
>>
>>
>>>Thanks for the message. I was going to suggest the same thing. I
>>>think it's primarily a question of how to add an arena layer. The AST
>>>phase has a mixture of malloc/free and Python object allocation. It
>>>should be straightforward to change the malloc/free code to use an
>>>arena API. We'd probably need a separate mechanism to associate a set
>>>of PyObject* with the arena and have those DECREFed.
>>>
>>>
>>Well good. It seems we all agree there is a problem and on the
>>general solution. I haven't thought about Brett's idea to see if it
>>could work or not. It would be great if we had someone start working
>>to improve the situation. It could well be that we live with the
>>current code for 2.5, but it would be great to use arenas for 2.6 at
>>least.
>>
>>
>>
>
> I have been thinking about this some more to put off doing homework
>and I have some random ideas I just wanted to toss out there to make
>sure I am not thinking about arena memory management incorrectly
>(never actually encountered it directly before).
>
>I think an arena API is going to be the best solution. Pulling
>trickery with redefining Py_INCREF and such like I suggested seems
>like a pain and possibly error-prone. With the compiler being a
>specific corner of the core having a special API for handling the
>memory for PyObject* stuff seems reasonable.
>
>
>
I agree. And it raises the learning curve for poor saps like myself. :)
>We might need PyArena_Malloc() and PyArena_New() to handle malloc()
>and PyObject* creation. We could then have a struct that just stored
>pointers to the allocated memory (either a linked list with one node per
>pointer, which has high memory overhead, or a linked list of arrays, which
>should lower the overhead but makes holes left by already-freed items a
>pain to handle). We would then have PyArena_FreeAll() that
>would be strategically placed in the code for when bad things happen
>that would just traverse the lists and free everything. I assume
>having a way to free individual items might be useful. Could have the
>PyArena_New() and _Malloc() return structs with the needed info for a
>PyArena_Free(location_struct) to be able to free the specific item
>without triggering a complete freeing of all memory. But this usage
>should be discouraged and only used when proper memory management is
>guaranteed.
>
>
>
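The bookkeeping Brett describes can be sketched in a few lines of plain C. This is only a rough illustration of the "one linked-list node per pointer" variant, assuming raw malloc() memory only (no PyObject* handling). The names arena, arena_block, arena_init, arena_malloc and arena_free_all are hypothetical stand-ins for the proposed PyArena_* calls, not an existing API.

```c
#include <stdlib.h>

/* Hypothetical names throughout; this is a sketch, not CPython's API. */
typedef struct arena_block {
    struct arena_block *next;
    void *mem;               /* memory handed out to the caller */
} arena_block;

typedef struct {
    arena_block *head;       /* most recent allocation first */
    size_t count;            /* number of tracked allocations */
} arena;

static void arena_init(arena *a)
{
    a->head = NULL;
    a->count = 0;
}

/* Allocate size bytes and record the pointer in the arena.
   Returns NULL on failure and leaves the arena unchanged. */
static void *arena_malloc(arena *a, size_t size)
{
    arena_block *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->mem = malloc(size);
    if (b->mem == NULL) {
        free(b);
        return NULL;
    }
    b->next = a->head;
    a->head = b;
    a->count++;
    return b->mem;
}

/* Traverse the list and free every tracked allocation. */
static void arena_free_all(arena *a)
{
    arena_block *b = a->head;
    while (b != NULL) {
        arena_block *next = b->next;  /* save before freeing b */
        free(b->mem);
        free(b);
        b = next;
    }
    a->head = NULL;
    a->count = 0;
}
```

With something like this, an error path only needs a single arena_free_all() call instead of a matching free() for every allocation made so far.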
An arena/pool (as I understood it from my quick skim) for the AST would
probably best be implemented (IMHO) as an ADT based on a linked-list:

    typedef struct _ast_pool_node {
        struct _ast_pool_node *next;
        PyObject *object; /* == NULL when data != NULL */
        void *data;       /* == NULL when object != NULL */
    } ast_pool_node;

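Deallocating a node could then be as simple as:

    /* ast_pool_node *n */
    ast_pool_node *next = n->next; /* save n->next before freeing n */
    Py_XDECREF(n->object); /* drop our reference; a no-op when NULL */
    free(n->data);         /* free(NULL) is a no-op */
    free(n);
    n = next;              /* then go on to free the rest of the list */
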
I haven't really thought all that deeply about this, so somebody shoot
me down if I'm completely off-base (Neal? :D). Every allocation of a
seq/stmt within ast.c would have its memory saved to the pool within the
function it's allocated in. Then before we return, we can just
deallocate the pool/arena/whatever you want to call it.
The problem with this is that should we get to the end of the function
and everything actually went okay (i.e. we return non-NULL), we then
have to run through and deallocate all the nodes anyway (without
deallocating n->object or n->data). Bah. Maybe we *would* be better off
with a monolithic cleanup. I don't know.
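
One way to picture the success/failure split described above is a pool with two teardown paths: a "rollback" that releases everything it tracks, and a "commit" that frees only the bookkeeping nodes and leaves ownership of the tracked items with the caller. The sketch below is a guess at what that could look like, not code from ast.c. To keep it compilable without Python.h, fake_object and fake_decref are hypothetical stand-ins for PyObject and Py_XDECREF, and pool, pool_track, pool_commit and pool_rollback are made-up names.

```c
#include <stdlib.h>

/* Stand-in for PyObject so the sketch builds without Python.h. */
typedef struct { int refcnt; } fake_object;

/* Stand-in for Py_XDECREF: tolerates NULL, only decrements a counter. */
static void fake_decref(fake_object *o)
{
    if (o != NULL)
        o->refcnt--;
}

typedef struct pool_node {
    struct pool_node *next;
    fake_object *object;   /* NULL when data != NULL */
    void *data;            /* NULL when object != NULL */
} pool_node;

typedef struct { pool_node *head; } pool;

/* Record either an object or a raw block. Returns 0 on success, -1 on
   allocation failure (the caller still owns the item in that case). */
static int pool_track(pool *p, fake_object *object, void *data)
{
    pool_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->object = object;
    n->data = data;
    n->next = p->head;
    p->head = n;
    return 0;
}

/* Error path: release every tracked item, then the nodes themselves. */
static void pool_rollback(pool *p)
{
    pool_node *n = p->head;
    while (n != NULL) {
        pool_node *next = n->next;
        fake_decref(n->object);
        free(n->data);
        free(n);
        n = next;
    }
    p->head = NULL;
}

/* Success path: free only the nodes; tracked items stay alive. */
static void pool_commit(pool *p)
{
    pool_node *n = p->head;
    while (n != NULL) {
        pool_node *next = n->next;
        free(n);
        n = next;
    }
    p->head = NULL;
}
```

The commit walk is still linear in the number of tracked items, which is the cost complained about above; a single long-lived arena freed once at the end would avoid paying it in every function.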
>Boy am I wanting RAII from C++ for automatic freeing when scope is
>left. Maybe we need to come up with a similar thing, like all memory
>that should be freed once a scope is left must use some special struct
>that stores references to all created memory locally and then a free
>call must be made at all exit points in the function using the special
>struct. Otherwise the pointer is stored in the arena and handled
>en masse later.
>
>
>
Which is basically what I just rambled on about up above, I think :)
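
For comparison, the usual way to approximate "free on scope exit" in plain C is to funnel every exit through one cleanup label. This is the common goto-cleanup idiom rather than anything proposed in this thread, and dup_twice with its fail_midway flag is a hypothetical example function invented purely to show the shape.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s repeated
   twice, or NULL on failure. fail_midway simulates an error that
   occurs after the scratch buffer has been allocated. */
static char *dup_twice(const char *s, int fail_midway)
{
    char *result = NULL;
    char *scratch = NULL;
    size_t len = strlen(s);

    scratch = malloc(len + 1);
    if (scratch == NULL)
        goto done;
    memcpy(scratch, s, len + 1);

    if (fail_midway)
        goto done;               /* error exit: result stays NULL */

    result = malloc(2 * len + 1);
    if (result == NULL)
        goto done;
    memcpy(result, scratch, len);
    memcpy(result + len, scratch, len + 1);  /* copies the terminator */

done:
    free(scratch);               /* free(NULL) is a no-op */
    return result;
}
```

Scope-local temporaries are released at the single label, while anything that outlives the function (here, result) is handed to the caller, which is roughly the local-versus-arena split suggested in the quoted text.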
>Hopefully this all made some sense. =) Is this the basic strategy
>that an arena setup would need? If not, can someone enlighten me?
>
>
>-Brett