[IronPython] Ironclad problem, with which someone here may be able to help

William Reade william at resolversystems.com
Fri Nov 7 11:22:25 CET 2008


Thanks Dino -- you've given me plenty to be going on with until you hear 
back from the CLR dev.

Dino Viehland wrote:
> In MTA mode we'll create a new thread which is thread 3 - thread 1 will just be hanging around waiting for it to exit.  Unfortunately w/o a separate EXE we can't avoid the separate thread.
>
> I've pinged the CLR dev who owns interop these days to see if he's ever seen anything like this.  Unfortunately he's OOF until the 10th.  My best suggestion for the time being is to avoid using Monitor for this lock so that you don't pump messages.  That's probably going to mean writing your own lock and then P/Invoking to WaitForSingleObject when you experience contention and need to block the thread.  Using an AutoResetEvent or ManualResetEvent directly won't work because we'll pump messages there too!
>
> I would also suggest looking at what CPython does here.  Dumpbin /imports python.exe shows no imports from ole32 so I'm guessing they don't do anything to spin up COM - which makes sense.  I would therefore assume when acquiring the GIL CPython will never pump messages and therefore programs don't expect possible re-entrancy - it'd probably be a good idea to mimic that behavior :).  Unfortunately there are other places where we use Monitor in IronPython - list.append comes to mind.  It'll be interesting to see if that turns out to be a problem eventually as well.
>
> Anyway, I'll let you know when I hear back.
>
>
> -----Original Message-----
> From: users-bounces at lists.ironpython.com [mailto:users-bounces at lists.ironpython.com] On Behalf Of William Reade
> Sent: Thursday, November 06, 2008 7:29 AM
> To: Discussion of IronPython
> Subject: Re: [IronPython] Ironclad problem, with which someone here may be able to help
>
> Finalization still happens on thread 2, and I see no evidence of thread
> 1 any more; my main thread now appears to have id 3.
>
> Curt Hagenlocher wrote:
>   
>> When you run with -X:MTA, is the object now being finalized on a
>> thread other than thread #2?
>>
>> On Thu, Nov 6, 2008 at 7:01 AM, William Reade
>> <william at resolversystems.com <mailto:william at resolversystems.com>> wrote:
>>
>>     Thanks Dino -- it seems that if I use ipy with -X:MTA, I can no
>>     longer reproduce the problem. However, this bothers me a bit: I'm
>>     not competent to follow all the consequences here, but this
>>     situation seems to imply that Ironclad won't be usable safely from
>>     any STA thread. Is this an intended restriction or a bug?
>>
>>     Incidentally, tracking down the call stacks proved to be hard
>>     work: the timing changed enough that I only got a single failure
>>     over dozens of runs, and it turned out I'd got the logging wrong,
>>     so I didn't find out anything useful :(.
>>
>>
>>     Dino Viehland wrote:
>>
>>         I would suggest getting a snap shot of the call stacks when
>>         this is happening if that's possible.  I can't pin anything
>>         down but I wonder if you could have an STA object or something
>>         that otherwise requires message pumping.  That message pumping
>>         could happen while you're doing a Monitor.Enter call.  If that
>>         was happening maybe there is some subtle CLR bug or a surprise
>>         feature?  It is surprising that Monitor.Enter can go
>>         re-entrant but it can...
>>
>>         So it'd be interesting to get thread snapshots and see if
>>
>>
>>
>>                           EnsureGIL (443) 2
>>                           EnsureGIL (443) 1      <- omg, wtf, bbq, etc.
>>
>>
>>
>>         Could be happening because Thread 1 experiences contention,
>>         blocks and pumps messages, causing the finalizer thread (2) to
>>         run, the lock is acquired and ... ?
>>
>>         Only other thing I could think of is does it repro on other
>>         machines?  Maybe it's a hardware bug as unlikely as that seems?
>>
>>         -----Original Message-----
>>         From: users-bounces at lists.ironpython.com
>>         <mailto:users-bounces at lists.ironpython.com>
>>         [mailto:users-bounces at lists.ironpython.com
>>         <mailto:users-bounces at lists.ironpython.com>] On Behalf Of
>>         William Reade
>>         Sent: Wednesday, November 05, 2008 10:01 AM
>>         To: Discussion of IronPython
>>         Subject: Re: [IronPython] Ironclad problem, with which someone
>>         here may be able to help
>>
>>         The log starts in the middle (after many lock/unlocks, some
>>         from each
>>         thread); I'm running on x86; and I have no additional AppDomains.
>>
>>         I don't think it would be safe for me to entirely avoid
>>         locking during
>>         finalization, but I could probably cut it down to a quick
>>         lock, on a
>>         separate object, to enqueue pointers for cleanup and
>>         deallocation on the
>>         main thread. However, I'm reluctant to do that until I'm sure
>>         that the
>>         problem is specifically related to GC, otherwise it'll just
>>         come back as
>>         soon as anyone tries any serious multithreading :).
>>
>>         Curt Hagenlocher wrote:
>>
>>
>>             Locking during finalization is often considered to be a
>>             bad idea.  In
>>             particular, locking without a timeout introduces the
>>             possibility that
>>             you will hang the finalization thread, preventing further
>>             objects from
>>             being finalized.  But clearly, that's not what's happening
>>             here.
>>
>>             Other questions that probably don't matter but might be
>>             interesting to
>>             know:
>>
>>             Can we assume that the finalization thread isn't the first
>>             place where
>>             this lock is required?  That your log starts somewhere in
>>             the middle?
>>
>>             Is this under x86 or x64 or both?
>>
>>             Are you creating any additional AppDomains in the process?
>>
>>
>>             On Wed, Nov 5, 2008 at 10:15 AM, William Reade
>>             <william at resolversystems.com
>>             <mailto:william at resolversystems.com>
>>             <mailto:william at resolversystems.com
>>             <mailto:william at resolversystems.com>>> wrote:
>>
>>                Hi Curt
>>
>>                I am indeed; that's how I know thread 2 is the GC
>>             thread. Is
>>                locking during GC forbidden?
>>
>>                William
>>
>>                Curt Hagenlocher wrote:
>>
>>                    ...or, for that matter, any __del__ methods from
>>             within Python
>>                    -- which ultimately are handled by finalization.
>>
>>                    On Wed, Nov 5, 2008 at 9:37 AM, Curt Hagenlocher
>>                    <curt at hagenlocher.org <mailto:curt at hagenlocher.org>
>>             <mailto:curt at hagenlocher.org <mailto:curt at hagenlocher.org>>
>>                    <mailto:curt at hagenlocher.org
>>             <mailto:curt at hagenlocher.org> <mailto:curt at hagenlocher.org
>>             <mailto:curt at hagenlocher.org>>>>
>>                    wrote:
>>
>>                       So, the obvious question for me is whether or
>>             not you're
>>                    using any
>>                       finalizers.
>>
>>
>>                       On Wed, Nov 5, 2008 at 5:57 AM, William Reade
>>                       <william at resolversystems.com
>>             <mailto:william at resolversystems.com>
>>                    <mailto:william at resolversystems.com
>>             <mailto:william at resolversystems.com>>
>>                    <mailto:william at resolversystems.com
>>             <mailto:william at resolversystems.com>
>>                    <mailto:william at resolversystems.com
>>             <mailto:william at resolversystems.com>>>>
>>
>>                       wrote:
>>
>>                           Hi all
>>
>>                           While running the numpy tests, I've come
>>             across a situation
>>                           which, to the best of my knowledge, is simply
>>                    impossible. I'm
>>                           hoping that one of the local .NET gurus will
>>             be able to
>>                    tell
>>                           me what I'm missing, or point me somewhere I
>>             can get
>>                    more insight.
>>
>>                           The 4 methods involved are as follows:
>>                           -----------------------
>>                                 public int GetThreadId()
>>                                 {
>>                                     return
>>             Thread.CurrentThread.ManagedThreadId;
>>                                 }
>>
>>                                 public void WriteFlush(string info)
>>                                 {
>>                                     Console.WriteLine(info);
>>                                     Console.Out.Flush();
>>                                 }
>>
>>                                 public void EnsureGIL()
>>                                 {
>>                                     Monitor.Enter(this.dispatcherLock);
>>                                     this.WriteFlush(String.Format(
>>                                         "EnsureGIL ({1}) {0}",
>>             this.GetThreadId(),
>>                           Builtin.id(this.dispatcherLock)));
>>                                 }
>>
>>                                 public void ReleaseGIL()
>>                                 {
>>                                     this.WriteFlush(String.Format(
>>                                         "ReleaseGIL ({1}) {0}\n",
>>             this.GetThreadId(),
>>                           Builtin.id(this.dispatcherLock)));
>>                                     Monitor.Exit(this.dispatcherLock);
>>                                 }
>>                           -----------------------
>>                           ...and they can, and do, occasionally
>>             produce output as
>>                    follows:
>>                           -----------------------
>>                           EnsureGIL (443) 2
>>                           EnsureGIL (443) 1      <- omg, wtf, bbq, etc.
>>                           ReleaseGIL (443) 2
>>
>>                           EnsureGIL (443) 2
>>                           ReleaseGIL (443) 1
>>
>>                           ReleaseGIL (443) 2
>>                           -----------------------
>>                           When this happens, the process continues
>>             happily for a
>>                    short
>>                           time and then falls over in a later call to
>>             ReleaseGIL
>>                    (after
>>                           successfully calling it several times). The
>>             error is "
>>                    Object
>>                           synchronization method was called from an
>>                    unsynchronized block
>>                           of code", which I understand to mean "you
>>             can't release
>>                    this
>>                           lock because you don't hold it".
>>
>>                           It doesn't happen very often, but I can usually
>>                    reproduce it
>>                           by running
>>                    test_multiarray.TestFromToFile.test_malformed a few
>>                           hundred times. It may be relevant to note
>>             that thread 2
>>                    is the
>>                           GC thread, and thread 1 is the main thread.
>>             I have
>>                    considered
>>                           the following possibilities:
>>
>>                           (1) That I'm locking on the wrong object. I
>>             believe
>>                    that isn't
>>                           the case, because it's constructed only
>>             once, as a "new
>>                           Object()" (ie, a reference type), and is
>>             only subsequently
>>                           used for locking; and, because it keeps the
>>             same ipy id
>>                           throughout.
>>
>>                           (2) That Monitor.Enter occasionally allows
>>             two different
>>                           threads to acquire the same lock. I consider
>>             this extremely
>>                           unlikely, because... well, how many
>>             multithreaded .NET apps
>>                           already exist? If Monitor really were
>>             broken, I think we'd
>>                           probably know about it by now.
>>
>>                           (3) That calling Flush() on a SyncTextWriter
>>             (the type of
>>                           Console.Out) doesn't actually do anything,
>>             and the
>>                    output is
>>                           somehow wrongly ordered (although I can't
>>             imagine how this
>>                           could actually be: if the locking is really
>>             working,
>>                    then my
>>                           console writes are strictly sequential). I
>>             don't have
>>                    access
>>                           to the code, so I have no idea how it's
>>             implemented,
>>                    but even
>>                           if this is the case it doesn't help much
>>             with the
>>                    fundamental
>>                           problem (the synchronisation error which
>>             follows).
>>
>>                           Apart from the above, I'm out of ideas. Can
>>             anyone suggest
>>                           what I've missed?
>>
>>                           William
>>                           _______________________________________________
>>                           Users mailing list
>>                           Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>
>>                    <mailto:Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>>
>>                    <mailto:Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>
>>                    <mailto:Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>>>
>>
>>
>>
>>             http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>>
>>
>>              ------------------------------------------------------------------------
>>
>>
>>
>>                    _______________________________________________
>>                    Users mailing list
>>                    Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>
>>             <mailto:Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>>
>>
>>              http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>>
>>                _______________________________________________
>>                Users mailing list
>>                Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>
>>             <mailto:Users at lists.ironpython.com
>>             <mailto:Users at lists.ironpython.com>>
>>
>>              http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>>             ------------------------------------------------------------------------
>>
>>             _______________________________________________
>>             Users mailing list
>>             Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>>             http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>>
>>
>>         _______________________________________________
>>         Users mailing list
>>         Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>>         http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>         _______________________________________________
>>         Users mailing list
>>         Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>>         http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>>
>>
>>     _______________________________________________
>>     Users mailing list
>>     Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>>     http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.ironpython.com
>> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>     
>
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>
>   




More information about the Ironpython-users mailing list