[Python-Dev] Discussion related to memory leaks requested

Matthew Paulson paulson at busiq.com
Thu Jan 14 14:25:19 EST 2016


Hi All:

I've created a simple program to make sure I wasn't lying to you all ;->

Here it is:

/* 'p' holds the script source (elided here); 'code' is a PyObject *. */
for (ii = 0; ii < 100; ii++)
{
    Py_Initialize();

    if ((code = Py_CompileString(p, "foo", Py_file_input)) == NULL)
        printf("Py_CompileString() failed\n");
    else
    {
        if (PyRun_SimpleString(p) == -1)
            printf("PyRun_SimpleString() failed\n");

        Py_CLEAR(code);
    }

    Py_Finalize();
}

This sequence causes about 10k of growth per iteration, and after many 
cycles there's no indication that any pooling logic is helping. Our 
"useful" example is slightly more complex, which may explain why I was 
seeing about 16k per iteration there.
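For anyone who wants to reproduce the numbers: this is not from the original post, but a minimal Linux-only sketch of how the per-iteration growth can be measured, by sampling the resident set size from /proc/self/statm before and after each Py_Initialize()/Py_Finalize() cycle.

```c
#include <stdio.h>
#include <unistd.h>

/* Return the current resident set size in bytes, or -1 on failure.
 * Reads /proc/self/statm, whose second field is resident pages
 * (Linux-specific; an assumption, not part of the original test). */
static long rss_bytes(void)
{
    long pages = -1;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f)
        return -1;
    if (fscanf(f, "%*ld %ld", &pages) != 1)   /* skip size, read resident */
        pages = -1;
    fclose(f);
    return pages < 0 ? -1 : pages * sysconf(_SC_PAGESIZE);
}
```

Calling rss_bytes() before and after each loop iteration and printing the delta is how a "10k per iteration" figure like the one above can be observed.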

Unless I've done something obviously wrong, I tend to believe Benjamin's 
claim that this issue is well known.

Suggestion: I have had great success with similar problems in the past 
by using a pools implementation sitting on top of what I call a "block 
memory allocator".  The bottom (block) allocator grabs large blocks 
from the heap and doles them out to the pools layer, which in turn 
doles them out to the requester.  When client memory is freed, it is 
NOT returned to the heap; rather, it's added to the pool containing 
like-sized blocks -- call it an "organized free list".  This is a very, 
very fast way to handle high-frequency allocation patterns.  Finally, 
during shutdown, the pools simply vaporize and the block allocator 
returns the (fewer) large blocks to the heap.  This avoids thrashing 
the heap and forcing it to coalesce inefficiently, and it also avoids 
heap fragmentation, which can cause unwanted growth as well...
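To make the two-layer scheme concrete, here's a minimal sketch of it in plain C. All of the names (block_t, pool_alloc, BLOCK_SIZE, etc.) are illustrative, not from any existing library: a bottom "block" allocator grabs large chunks from the heap, and a pool layer carves them into fixed-size objects, keeping freed objects on a free list rather than returning them to the heap.

```c
#include <stdlib.h>

#define BLOCK_SIZE (64 * 1024)   /* size of each chunk taken from the heap */

typedef struct block {
    struct block *next;          /* chain of blocks, walked at shutdown */
    size_t used;                 /* bytes already handed out from this block */
    unsigned char data[];        /* the actual storage */
} block_t;

typedef struct pool {
    size_t obj_size;             /* fixed object size served by this pool */
    block_t *blocks;             /* all blocks ever acquired */
    void *free_list;             /* the "organized free list" of like-sized objects */
} pool_t;

static void pool_init(pool_t *p, size_t obj_size)
{
    /* Objects must be large enough to hold the free-list link. */
    p->obj_size = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    p->blocks = NULL;
    p->free_list = NULL;
}

static void *pool_alloc(pool_t *p)
{
    /* Fast path: reuse a previously freed object of this size. */
    if (p->free_list) {
        void *obj = p->free_list;
        p->free_list = *(void **)obj;
        return obj;
    }
    /* Otherwise carve from the current block, grabbing a new one if needed. */
    if (!p->blocks || p->blocks->used + p->obj_size > BLOCK_SIZE) {
        block_t *b = malloc(sizeof(block_t) + BLOCK_SIZE);
        if (!b)
            return NULL;
        b->used = 0;
        b->next = p->blocks;
        p->blocks = b;
    }
    void *obj = p->blocks->data + p->blocks->used;
    p->blocks->used += p->obj_size;
    return obj;
}

static void pool_free(pool_t *p, void *obj)
{
    /* NOT returned to the heap: pushed onto the pool's free list instead. */
    *(void **)obj = p->free_list;
    p->free_list = obj;
}

static void pool_destroy(pool_t *p)
{
    /* The shutdown "vaporize": release the (few) large blocks in one pass. */
    block_t *b = p->blocks;
    while (b) {
        block_t *next = b->next;
        free(b);
        b = next;
    }
    p->blocks = NULL;
    p->free_list = NULL;
}
```

Allocation and free are both a couple of pointer operations in the steady state, and pool_destroy() returns memory to the heap in a handful of free() calls rather than one per object, which is what avoids the coalescing thrash described above.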

Note that this would be a "hard reset" of all allocated memory, and any 
global state in the data segment would also have to be cleared, but it 
would provide a fast, clean way to ensure that each invocation was 100% 
clean.

I don't claim to understand all the intricacies of the many ways Python 
can be embedded, but as I said, this strategy has worked very well for 
me in the past when building servers written in C that have to stay up 
for months at a time.

Happy to discuss further, if anyone has any interest.

Best,

Matt




On 1/14/2016 4:45 AM, Nick Coghlan wrote:
> On 14 January 2016 at 15:42, Benjamin Peterson <benjamin at python.org> wrote:
>> This is a "well-known" issue. Parts of the interpreter (and especially,
>> extension modules) cheerfully stash objects in global variables with no
>> way to clean them up. Fixing this is a large project, which probably
>> involves implementing PEP 489.
> The actual multi-phase extension module import system from 489 was
> implemented for 3.5, but indeed, the modules with stashed global state
> haven't been converted yet.
>
> I didn't think we loaded any of those by default, though...
>
> Cheers,
> Nick.
>
