[IronPython] HostCodeHeap leakage?

Ronnie Maor ronnie.maor at gmail.com
Mon Oct 18 22:33:43 CEST 2010


thanks - wanted to make sure you didn't miss it.
we'll try to reduce the number of places where we're exposed to it in the
meantime.
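
A minimal sketch of what reducing exposure might look like, based only on the observations quoted below: the problem shows up when a freshly redefined function is called with keyword arguments, and positional calls are reported as fine. The function names here are hypothetical, and the idea that hoisting the definition out of the loop avoids the growth is an assumption that has not been confirmed on IPy 2.6.1. `range` is used instead of `xrange` only so the snippet runs on any Python.

```python
def process_hoisted(items):
    # Define the helper once, outside the loop, so the same function
    # object is reused for every keyword call (assumed to avoid the issue).
    def func(param):
        return param

    return [func(param=item) for item in items]


def process_positional(items):
    results = []
    for item in items:
        # The helper is still redefined on each iteration, but it is
        # called positionally rather than with a keyword argument.
        def func(param):
            return param

        results.append(func(item))
    return results


hoisted = process_hoisted(range(5))
positional = process_positional(range(5))
```

Both variants return the same values; which one is practical depends on whether the closure really needs to be rebuilt per iteration.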

On Mon, Oct 18, 2010 at 10:28 PM, Dino Viehland <dinov at microsoft.com> wrote:

>  Yep, I’ve seen it…  still thinking about what should be done here.  I
> certainly understand the problem and what will cause it but the fix is
> probably non trivial (but also probably well contained to one or two
> classes).  It may take me a day or two to respond w/ something substantial.
>  We probably need to change our kw-calling to use a pre-compiled rule.
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Ronnie Maor
> *Sent:* Monday, October 18, 2010 1:25 PM
>
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> Can someone from IPy team ack that you saw this?
>
> The issue is causing us a lot of trouble, so we'd really appreciate it if
> you could tell us how to fix - we've already built from source to fix a
> previous leak, so no problem building with another patch.
>
>
>
> BTW, the default value in the function definition is not needed, it's
> calling with named arguments that causes the issue. so this is a slightly
> simpler repro:
>
>
>
> def test_method():
>     for i in xrange(1000):
>         def func(param): pass
>         func(param = None)
>
>
>
> thanks!
>
> Ronnie
>
>
>
> On Mon, Oct 18, 2010 at 11:35 AM, Idan Zaltzberg <idan at cloudshare.com>
> wrote:
>
> FYI, there is a bit simpler reproduction:
>
> def test_method():
>     for i in xrange(1000):
>         def func(param = None): pass
>         func(param = None)
>
> test_method()
>
>
>
> So, actually, any keyword-argument call to a closure that is redefined
> causes the problem.
>
>
>
> *From:* Idan Zaltzberg [mailto:idan at cloudshare.com]
> *Sent:* Monday, October 18, 2010 11:10 AM
> *To:* 'Discussion of IronPython'
> *Subject:* RE: [IronPython] HostCodeHeap leakage?
>
>
>
> Ok, I finally succeeded in creating a simple reproduction for this problem.
>
> The following code generates 1000 methods (according to the ".NET CLR
> JIT" performance counter) on IPy 2.6.1, .NET 2.0:
>
>
>
> def test_method():
>     for i in xrange(1000):
>         def func(*a,**kw): pass
>         func(some_parm = None)
>
> test_method()
>
>
>
> This does not happen if you call func without keyword params (using the *a
> params is OK).
>
> If this is indeed a bug, we would like to know how to fix it in the code
> locally, if that is possible.
>
> Also, I am interested in which IPy flow creates these methods, since I wasn’t
> able to find the function in the code that does this generation.
>
>
>
> Thanks
>
>
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Dino Viehland
> *Sent:* Thursday, October 14, 2010 8:31 PM
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> So from an IronPython/DLR perspective the process should stabilize over
> time.  That could take a while – as various Python functions get used
> repeatedly we’ll switch from interpreting them to compiling them.  We’ll
> also potentially produce new call site rules which are compiled.  That could
> account for the increase of 16k->18k dynamic methods.  That’s a 12% increase
> and could be a reasonable amount if it remained steady from then on.
> Likewise the # of function codes seems stable but the _codeCount is rising.
> That might mean that you’re defining new functions (via
> exec/execfile/compile) and they’re getting collected - but we still have
> their (dead) weak references in the code list.  Eventually that list should
> get cleaned out when we hit context._nextCodeCleanup (which should be
> greater than context._codeCount).
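
To make the bookkeeping described above concrete, here is a toy model of a weak-reference list that is only pruned when a counter reaches a threshold. It is a sketch under stated assumptions, not the IronPython implementation: the class names, the starting threshold and the doubling policy are all hypothetical, and `code_count` / `next_cleanup` merely play the roles of `_codeCount` / `_nextCodeCleanup`.

```python
import gc
import weakref


class CodeRegistry(object):
    """Toy model of a weak-reference list pruned at a threshold."""

    def __init__(self, first_cleanup=4):
        self.refs = []                      # weak references, live or dead
        self.code_count = 0                 # plays the role of _codeCount
        self.next_cleanup = first_cleanup   # plays the role of _nextCodeCleanup

    def register(self, obj):
        self.refs.append(weakref.ref(obj))
        self.code_count += 1
        if self.code_count >= self.next_cleanup:
            self.cleanup()

    def cleanup(self):
        # Drop entries whose target has already been collected.
        self.refs = [r for r in self.refs if r() is not None]
        self.code_count = len(self.refs)
        # Assumed policy: prune again once the list has doubled.
        self.next_cleanup = max(4, 2 * self.code_count)


class FakeCode(object):
    """Hypothetical stand-in for a function code object."""


registry = CodeRegistry(first_cleanup=4)
keep = FakeCode()                       # stays alive for the whole run
registry.register(keep)
for _ in range(2):
    registry.register(FakeCode())       # temporaries are dropped right away
gc.collect()
count_before = registry.code_count      # dead entries are still counted
registry.register(FakeCode())           # reaching the threshold triggers cleanup
gc.collect()
count_after = registry.code_count
live_after = sum(1 for r in registry.refs if r() is not None)
```

In this model the counter keeps rising between cleanups even though few objects are alive, which is the pattern of a stable FunctionCode count next to a growing `_codeCount`.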
>
>
>
> I would expect if you were to walk the entire PythonContext._allCodes
> linked list that you’d see about half of the lists having a dead weak
> reference.  If your windbg-foo is up for writing the script to do this
> that’d be great but mine is rusty enough I’d need to look it up in the
> documentation.
>
>
>
> As for jitted code – all dynamic methods are collectible and any normal
> RefEmit code is not collectible (there’s an option to make assemblies
> collectible in .NET 4.0, but we don’t use it as we generally don’t generate
> that many types).  Oh, we do also generate new types for subclasses but you
> should see that NewTypeMaker._newTypes has a stable count over time because
> we share these between types w/ common bases.
>
>
>
> Closures, callbacks, generators should all be fine.
>
>
>
> If you do .symfix, then .reload, in Windbg does “dt HostCodeHeap” show you
> the fields of the HostCodeHeap structure?  Does !eeheap –loader give you the
> address of the HostCodeHeap?  It might be useful to know what
> m_allocationCount is on the heap over time, as if that’s relatively stable the
> heap could be getting fragmented.  Looking at the CLR code I’m becoming more
> and more convinced HostCodeHeap is only used for dynamic methods so if
> that’s growing then I’m thinking we would appear to be leaking dynamic
> methods.
>
>
>
> The only way I can think of figuring out where the allocations are coming
> from is to put a breakpoint on mscorwks!HostCodeHeap::AllocMemory_NoThrow
> (hopefully this will show up in the public symbols).  That may be difficult
> if you can’t attach the debugger to the server but if the issue is blocking
> the server you could have a breakpoint here which does a stack trace and
> continues execution so you could inspect the stack trace later.  I’d include
> both “kb” and “!ClrStack” from sos in the stack trace.
>
>
>
> 2.7 shouldn’t really change this – whether .NET 4.0 would or not would
> depend on if the CLR changed anything here.  But I’m not sure – I would
> assume it wouldn’t.
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Idan Zaltzberg
> *Sent:* Thursday, October 14, 2010 6:44 AM
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> Hi,
>
>
>
> I've been looking at this for some time, and I still don’t understand
> some things. Maybe I should begin by explaining our usage of IPy (2.6.1, .NET
> 2.0) a bit better.
>
>
>
> We have a long running (stateful) application that we cannot simulate or
> run with a debugger open (so no breakpoints).
>
> However, since the application runs on a VM, we can take snapshots of it
> and then open a WinDbg instance and break in the middle of the application.
>
> We do this a few hours after the application restarted and again after two
> days so we can see the difference.
>
>
>
> This way we saw that Jit Code Heap is increasing by a few hundred MB per
> day, and the number of HostCodeHeap objects is increasing.
>
> We also compared the performance counters for the two snapshots, and saw
> the Jitted Code Bytes increased from 100MB to 862MB and the number of
> methods jitted increased from 700K to 6.3M.
>
> In WinDbg we saw that the Jitted Code Heap size increased from 126MB to
> 424MB.
>
> On the other hand, the object types you mentioned stay relatively the same:
>
> ·         DynamicMethods count went from 16K to 18K
>
> ·         FunctionCode count went from 4013 to 4025
>
> ·         The _codeCount field in the PythonContext went from 4447 to
> 7800
>
> Here is what we don't understand:
>
> 1.       Is it normal for the application to keep jitting code and methods
> forever? Should it stabilize?
>
> 2.       From the numbers I guess that some of the jitted code IS
> collected. Which types are collectable and which are not? How can I tell
> which ones I am using?
>
> 3.       Are there any specific patterns I should avoid to decrease
> uncollectable code (or jitting in general)? I am using a lot of closures,
> callbacks and generators.
>
> 4.       What datatypes that are visible in WinDbg can I use to understand
> if and why IronPython is generating uncollectable code?
>
> 5.       Is there a way to trace back the HostCodeHeap objects to my code
> (or IronPython specific features)?
>
> 6.       Can I expect an improvement in these issues by moving to Ipy 2.7
> and/or .Net 4.0?
>
>
>
> Thanks
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Dino Viehland
> *Sent:* Wednesday, October 13, 2010 10:01 PM
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> If you build from source you could set some breakpoints in AssemblyGen.cs
> in the DefineType method.  You can also set one in DelegateUtils.cs and
> DelegateHelpers.cs in DefineDelegateType.  I think those are all the places
> where we are creating uncollectible types.  If we’re continuously hitting
> those breakpoints after you believe your app has reached steady state then
> something is going wrong.
>
>
>
> This code
> http://www.koders.com/cpp/fid5CC8EACFCC85496B49B8CF83BD05AB36DE691E90.aspx
> leads me to believe the HostCodeHeap might also be used for DynamicMethods.
> If that is the case then the other place to look would be if FunctionCode
> objects are being re-created repeatedly.  That will happen if there’s
> exec/eval/compile calls which are happening and if those objects are being
> kept alive then we could be growing the heap over time.
>
>
>
> There’s also some complicated code which deals with keeping a list of all
> code that is alive.  We do cleanup this list, and the list is a list of weak
> references so it shouldn’t actually keep the code alive, but you could put
> some breakpoints at FunctionCode.RegisterFunctionCode and
> FunctionCode.CodeCleanup to see if that list is growing boundlessly (which
> it would be if something was keeping code objects alive after an
> exec/eval/compile).
>
>
>
> Another place where code generation could be occurring would be w/
> regexes.  If you are dynamically generating reg-exes, or executing a huge
> different variety of them over time, and they’re compiled, then the compiled
> regexes could be staying in memory.  There is a regex cache and you can
> clear it by calling re.purge().  But it should only cache up to 100 regexes.
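
As an illustration of the regex point above, this is a minimal sketch of clearing the module-level cache after working through a batch of dynamically built patterns. The `match_many` helper and the pattern list are hypothetical; `re.search` and `re.purge` are the standard `re` module calls. Whether purging actually releases jitted code in a given IronPython build is an assumption that would need to be checked.

```python
import re


def match_many(patterns, text):
    """Return the patterns that match text, then clear the regex cache."""
    hits = []
    for pattern in patterns:
        # re.search compiles the pattern and keeps it in the module cache.
        if re.search(pattern, text):
            hits.append(pattern)
    # Drop all cached compiled patterns once the batch is done.
    re.purge()
    return hits


# Many distinct, dynamically generated patterns (hypothetical workload).
patterns = ["item%d" % i for i in range(200)]
hits = match_many(patterns, "item42 and item7")
```

Patterns that are reused heavily are better compiled once with `re.compile` and kept by the caller, so purging the cache does not force them to be recompiled.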
>
>
>
> A final possible thing to investigate might be what happens if you throw
> away the entire ScriptEngine instance.  Here you could try re-cycling the
> ScriptEngine say every 6 hours and see if the problem goes away.  If that
> fixes the problem then it’s likely that it is one of the things I mentioned
> (or some other cache that’s per-runtime).  At least that would start to
> narrow it down vs. some potentially global state (like the subtype list
> which is shared across ScriptEngines).
>
>
>
> That’s a bunch of different things to look at – hopefully it’ll give some
> insight into what’s going on and help track down the issue.
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Idan Zaltzberg
> *Sent:* Wednesday, October 13, 2010 12:12 AM
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> I tried what you suggested (changed setup.DebugMode = false;)
>
> But still I get the same behavior:
>
> The "Jit Code Heap" increases from about 17MB to 230MB in 2 days.
>
> Is there a way to verify from the IronPython code that DebugMode is off?
>
> Is there anything else I can do (other startup settings?) to
> decrease/understand the increase in HostCodeHeap objects?
>
> Thanks.
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Dino Viehland
> *Sent:* Tuesday, October 05, 2010 6:59 PM
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> Yep, DebugMode is the same as –X:Debug.  In general I’d suggest making this
> configurable somehow and only turn it on if you’re actually debugging.  It’s
> unfortunate that we can’t offer both debugging & collectability but right
> now that’s simply a limitation of the CLR and/or our lack of a separate VS
> debug engine which can debug Python code.
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Idan Zaltzberg
> *Sent:* Tuesday, October 05, 2010 9:09 AM
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> I'm running the engine from a hosting app.
>
> We have these lines in the startup:
>
> ScriptRuntimeSetup setup = new ScriptRuntimeSetup();
> setup.DebugMode = true;
>
> ScriptRuntime runtime = Python.CreateRuntime(setup.Options);
>
> engine = runtime.GetEngine("py");
>
>
>
> Is this the same as –X:Debug?
>
> Do you reckon this could be the cause?
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Dino Viehland
> *Sent:* Tuesday, October 05, 2010 5:53 PM
> *To:* Discussion of IronPython
> *Subject:* Re: [IronPython] HostCodeHeap leakage?
>
>
>
> My guess is that’s code in the JIT heap that’s building up but I’m not 100%
> certain.  How is your code being executed?  Do you have the debug option (-D
> or –X:Debug) enabled?  To support debug mode we need to produce
> uncollectible code which could be building up.
>
>
>
> *From:* users-bounces at lists.ironpython.com [mailto:
> users-bounces at lists.ironpython.com] *On Behalf Of *Idan Zaltzberg
> *Sent:* Tuesday, October 05, 2010 2:26 AM
> *To:* Discussion of IronPython
> *Subject:* [IronPython] HostCodeHeap leakage?
>
>
>
> I am trying to find a memory/"performance" leak in an Ipy application.
>
> Using WinDbg (!eeheap -loader), we noticed that the LoaderHeap is
> getting bigger (150MB increase per day). From the !eeheap output it seems
> that the increase is due to HostCodeHeap (objects?).
>
> As I understand these objects might be created by Ipy infra, is that right?
>
> Is there any way I can get more info on their content, or prevent them from
> growing?
>
> Thanks
>
>
>
>
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>
>
>
>
>