[IronPython] Ironclad problem, with which someone here may be able to help

William Reade william at resolversystems.com
Wed Nov 5 19:00:39 CET 2008


The log starts in the middle (after many lock/unlocks, some from each 
thread); I'm running on x86; and I have no additional AppDomains.

I don't think it would be safe for me to entirely avoid locking during 
finalization, but I could probably cut it down to a quick lock, on a 
separate object, to enqueue pointers for cleanup and deallocation on the 
main thread. However, I'm reluctant to do that until I'm sure that the 
problem is specifically related to GC, otherwise it'll just come back as 
soon as anyone tries any serious multithreading :).

Curt Hagenlocher wrote:
> Locking during finalization is often considered to be a bad idea.  In 
> particular, locking without a timeout introduces the possibility that 
> you will hang the finalization thread, preventing further objects from 
> being finalized.  But clearly, that's not what's happening here.
>
> Other questions that probably don't matter but might be interesting to 
> know:
>
> Can we assume that the finalization thread isn't the first place where 
> this lock is required?  That your log starts somewhere in the middle?
>
> Is this under x86 or x64 or both?
>
> Are you creating any additional AppDomains in the process?
>
>
> On Wed, Nov 5, 2008 at 10:15 AM, William Reade 
> <william at resolversystems.com <mailto:william at resolversystems.com>> wrote:
>
>     Hi Curt
>
>     I am indeed; that's how I know thread 2 is the GC thread. Is
>     locking during GC forbidden?
>
>     William
>
>     Curt Hagenlocher wrote:
>
>         ...or, for that matter, any __del__ methods from within Python
>         -- which ultimately are handled by finalization.
>
>         On Wed, Nov 5, 2008 at 9:37 AM, Curt Hagenlocher
>         <curt at hagenlocher.org <mailto:curt at hagenlocher.org>
>         <mailto:curt at hagenlocher.org <mailto:curt at hagenlocher.org>>>
>         wrote:
>
>            So, the obvious question for me is whether or not you're
>         using any
>            finalizers.
>
>
>            On Wed, Nov 5, 2008 at 5:57 AM, William Reade
>            <william at resolversystems.com
>         <mailto:william at resolversystems.com>
>         <mailto:william at resolversystems.com
>         <mailto:william at resolversystems.com>>>
>
>            wrote:
>
>                Hi all
>
>                While running the numpy tests, I've come across a situation
>                which, to the best of my knowledge, is simply
>         impossible. I'm
>                hoping that one of the local .NET gurus will be able to
>         tell
>                me what I'm missing, or point me somewhere I can get
>         more insight.
>
>                The 4 methods involved are as follows:
>                -----------------------
>                      public int GetThreadId()
>                      {
>                          return Thread.CurrentThread.ManagedThreadId;
>                      }
>
>                      public void WriteFlush(string info)
>                      {
>                          Console.WriteLine(info);
>                          Console.Out.Flush();
>                      }
>
>                      public void EnsureGIL()
>                      {
>                          Monitor.Enter(this.dispatcherLock);
>                          this.WriteFlush(String.Format(
>                              "EnsureGIL ({1}) {0}", this.GetThreadId(),
>                Builtin.id(this.dispatcherLock)));
>                      }
>
>                      public void ReleaseGIL()
>                      {
>                          this.WriteFlush(String.Format(
>                              "ReleaseGIL ({1}) {0}\n", this.GetThreadId(),
>                Builtin.id(this.dispatcherLock)));
>                          Monitor.Exit(this.dispatcherLock);
>                      }
>                -----------------------
>                ...and they can, and do, occasionally produce output as
>         follows:
>                -----------------------
>                EnsureGIL (443) 2
>                EnsureGIL (443) 1      <- omg, wtf, bbq, etc.
>                ReleaseGIL (443) 2
>
>                EnsureGIL (443) 2
>                ReleaseGIL (443) 1
>
>                ReleaseGIL (443) 2
>                -----------------------
>                When this happens, the process continues happily for a
>         short
>                time and then falls over in a later call to ReleaseGIL
>         (after
>                successfully calling it several times). The error is "
>         Object
>                synchronization method was called from an
>         unsynchronized block
>                of code", which I understand to mean "you can't release
>         this
>                lock because you don't hold it".
>
>                It doesn't happen very often, but I can usually
>         reproduce it
>                by running
>         test_multiarray.TestFromToFile.test_malformed a few
>                hundred times. It may be relevant to note that thread 2
>         is the
>                GC thread, and thread 1 is the main thread. I have
>         considered
>                the following possibilities:
>
>                (1) That I'm locking on the wrong object. I believe
>         that isn't
>                the case, because it's constructed only once, as a "new
>                Object()" (ie, a reference type), and is only subsequently
>                used for locking; and, because it keeps the same ipy id
>                throughout.
>
>                (2) That Monitor.Enter occasionally allows two different
>                threads to acquire the same lock. I consider this extremely
>                unlikely, because... well, how many multithreaded .NET apps
>                already exist? If Monitor really were broken, I think we'd
>                probably know about it by now.
>
>                (3) That calling Flush() on a SyncTextWriter (the type of
>                Console.Out) doesn't actually do anything, and the
>         output is
>                somehow wrongly ordered (although I can't imagine how this
>                could actually be: if the locking is really working,
>         then my
>                console writes are strictly sequential). I don't have
>         access
>                to the code, so I have no idea how it's implemented,
>         but even
>                if this is the case it doesn't help much with the
>         fundamental
>                problem (the synchronisation error which follows).
>
>                Apart from the above, I'm out of ideas. Can anyone suggest
>                what I've missed?
>
>                William
>                _______________________________________________
>                Users mailing list
>                Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>
>         <mailto:Users at lists.ironpython.com
>         <mailto:Users at lists.ironpython.com>>
>
>              
>          http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>
>
>
>         ------------------------------------------------------------------------
>
>
>
>         _______________________________________________
>         Users mailing list
>         Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>         http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>          
>
>
>     _______________________________________________
>     Users mailing list
>     Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>     http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>   




More information about the Ironpython-users mailing list