[Python-Dev] A "new" kind of leak

Guido van Rossum guido@python.org
Fri, 12 Apr 2002 20:21:35 -0400


>     http://www.python.org/sf/543148
> 
> reports a prodigious leak under 2.1 via
> 
> import inspect
> 
> def leak():
>     frame = inspect.currentframe()
> 
> while 1:
>     leak()

You can get the same effect with sys._getframe() instead of
inspect.currentframe().
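
For illustration, a minimal sketch of the sys._getframe() variant. The
loop is bounded and leak() returns a value only so the cycle can be
observed; both are assumptions added for the example, not part of the
bug report.

```python
import sys

def leak():
    # Storing the running frame in one of its own locals makes the
    # frame reachable from itself: frame -> locals -> frame.
    frame = sys._getframe()
    # Return value added for illustration: confirm the local really
    # points back at the frame that owns it.
    return frame.f_locals["frame"] is frame

# Bounded stand-in for the original "while 1" loop.
cycle_seen = all(leak() for _ in range(1000))
```

Each call leaves behind one self-referential frame, which refcounting
alone can never free.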

> This isn't surprising, since leak() assigns the current frame to a
> local in the same frame, creating a reference cycle, and frame
> objects weren't added to cyclic gc until 2.2.  Question #1: Is
> adding frame objects to cyclic gc a candidate for 2.1 backporting?
> Question #1a: Who's gonna pay Neil to spend his free time doing that
> <0.5 wink>?

If Neil won't, someone else can do it.  Adding GC to an object isn't
hard, and the tp_traverse/tp_clear implementations probably port right
over.
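
The C change itself isn't shown here, but its effect can be sketched
from Python on an interpreter whose frames already participate in
cyclic gc (2.2 and later). Marker and make_cycle are hypothetical names
used only for this example.

```python
import gc
import sys
import weakref

class Marker:
    """Hypothetical helper: lets us watch when the frame's locals die."""

def make_cycle():
    marker = Marker()
    # Same cycle as in the bug report: the frame holds itself via a local,
    # so 'marker' stays alive for as long as the frame does.
    frame = sys._getframe()
    return weakref.ref(marker)

ref = make_cycle()
# A full collection traverses the frame, finds the cycle, and clears it.
gc.collect()
collected = ref() is None
```

Once the collector has run, the weak reference is dead, meaning the
frame and everything it held were reclaimed.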

[Cute explanation snipped]

> Question #2:  Is this OK?  I think not, as this is a leak in reality.

Agreed.

> Question #3: Can we use pymalloc to manage frame objects instead?
> Alas, no: frame objects are too big (on my box, over 350 bytes for a
> minimal frame object, and pymalloc's limit is 256).

How would it help?

> Question #4: Can we just get rid of the free list?  I don't think so
> -- a frame object is needed on every function invocation, and saving
> malloc/free overhead is a real win.

Yup.  Try it though.

> An effective solution would be to bound the size of the frameobject
> free list: pick some maximum N.  When frame_dealloc sees that there
> are already N frames on the free list, it frees the frame memory
> instead of adding it to the free list.

Good idea.
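
A toy model of the bounded free list described above, in Python rather
than the C of frameobject.c. BoundedFreeList, max_free and the 350-byte
bytearray are all assumptions made for illustration; this is a sketch
of the idea, not the eventual patch.

```python
class BoundedFreeList:
    """Toy free list that caches at most max_free released objects."""

    def __init__(self, max_free):
        self.max_free = max_free
        self._free = []
        self.released = 0  # objects dropped instead of being cached

    def alloc(self):
        # Reuse a cached object when one is available.
        if self._free:
            return self._free.pop()
        return bytearray(350)  # stand-in for a frame-sized block

    def dealloc(self, obj):
        if len(self._free) < self.max_free:
            self._free.append(obj)   # cheap reuse on the next alloc
        else:
            self.released += 1       # list is full: let the memory go

    def __len__(self):
        return len(self._free)


pool = BoundedFreeList(max_free=50)
live = [pool.alloc() for _ in range(200)]  # 200 simultaneously live objects
for obj in live:
    pool.dealloc(obj)
```

After the loop the list holds 50 objects and the other 150 were
released, so a burst of live frames can no longer pin memory forever.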

> Question #5: What's a good value for N?  In the absence of
> pathologies due to cycles, and in the absence of generators, the
> number of live frame objects is equal to the current call-stack
> depth.  Since we're just trying to bound worst-case quadratic
> pathologies in cycle cases, the bound can be high -- say, 1000.  A
> non-pathological program creating thousands of simultaneously active
> short-lived generators would suffer, but I'm not prepared to go to
> my grave claiming that's not also pathological <wink>.

I bet even lowering it to 50 would work just as well in practice.  I'm
somewhat wary of solutions that use a really high bound for cases like
this; the code that deals with exceeding the bound would almost never
get executed, so bugs there would linger forever.  I think we had this
problem with the code for comparing self-referential containers;
originally the limit was 1000, now it's something like 20.

> Question #6: If we agree it's a good idea to put on a bound in 2.3,
> is it a good idea to backport the bound?  I think yes to 2.2, but no
> to 2.1.  In 2.1 the frame-cycle cases leak anyway.

But it was reported for 2.1, right?  I could imagine making this a
showcase of the new religion: more support for older versions.  2.1 is
the limit though.

--Guido van Rossum (home page: http://www.python.org/~guido/)