[IronPython] Ironclad problem, with which someone here may be able to help

Wed Nov 5 15:38:59 CET 2008

Incidentally, logging a Stopwatch timestamp in WriteFlush reveals that, 
yes, the calls really are happening in the order they appear to be. So, 
option (3) appears to be a red herring, and options (1) and (2) remain 
unchanged.

William Reade wrote:
> Hi all
>
> While running the numpy tests, I've come across a situation which, to 
> the best of my knowledge, is simply impossible. I'm hoping that one of 
> the local .NET gurus will be able to tell me what I'm missing, or 
> point me somewhere I can get more insight.
>
> The 4 methods involved are as follows:
> -----------------------
>        public int GetThreadId()
>        {
>            return Thread.CurrentThread.ManagedThreadId;
>        }
>
>        public void WriteFlush(string info)
>        {
>            Console.WriteLine(info);
>            Console.Out.Flush();
>        }
>
>        public void EnsureGIL()
>        {
>            Monitor.Enter(this.dispatcherLock);
>            this.WriteFlush(String.Format(
>                "EnsureGIL ({1}) {0}", this.GetThreadId(), 
> Builtin.id(this.dispatcherLock)));
>        }
>
>        public void ReleaseGIL()
>        {
>            this.WriteFlush(String.Format(
>                "ReleaseGIL ({1}) {0}\n", this.GetThreadId(), 
> Builtin.id(this.dispatcherLock)));
>            Monitor.Exit(this.dispatcherLock);
>        }
> -----------------------
> ...and they can, and do, occasionally produce output as follows:
> -----------------------
> EnsureGIL (443) 2
> EnsureGIL (443) 1      <- omg, wtf, bbq, etc.
> ReleaseGIL (443) 2
>
> EnsureGIL (443) 2
> ReleaseGIL (443) 1
>
> ReleaseGIL (443) 2
> -----------------------
> When this happens, the process continues happily for a short time and 
> then falls over in a later call to ReleaseGIL (after successfully 
> calling it several times). The error is " Object synchronization 
> method was called from an unsynchronized block of code", which I 
> understand to mean "you can't release this lock because you don't hold 
> it".
>
> It doesn't happen very often, but I can usually reproduce it by 
> running test_multiarray.TestFromToFile.test_malformed a few hundred 
> times. It may be relevant to note that thread 2 is the GC thread, and 
> thread 1 is the main thread. I have considered the following 
> possibilities:
>
> (1) That I'm locking on the wrong object. I believe that isn't the 
> case, because it's constructed only once, as a "new Object()" (ie, a 
> reference type), and is only subsequently used for locking; and, 
> because it keeps the same ipy id throughout.
>
> (2) That Monitor.Enter occasionally allows two different threads to 
> acquire the same lock. I consider this extremely unlikely, because... 
> well, how many multithreaded .NET apps already exist? If Monitor 
> really were broken, I think we'd probably know about it by now.
>
> (3) That calling Flush() on a SyncTextWriter (the type of Console.Out) 
> doesn't actually do anything, and the output is somehow wrongly 
> ordered (although I can't imagine how this could actually be: if the 
> locking is really working, then my console writes are strictly 
> sequential). I don't have access to the code, so I have no idea how 
> it's implemented, but even if this is the case it doesn't help much 
> with the fundamental problem (the synchronisation error which follows).
>
> Apart from the above, I'm out of ideas. Can anyone suggest what I've 
> missed?
>
> William
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>