[IronPython] Ironclad problem, with which someone here may be able to help
Dino Viehland
dinov at microsoft.com
Fri Nov 7 03:56:15 CET 2008
Do you know if numpy is using COM anywhere? Or does Ironclad use COM for any of its interop?
-----Original Message-----
From: users-bounces at lists.ironpython.com [mailto:users-bounces at lists.ironpython.com] On Behalf Of William Reade
Sent: Thursday, November 06, 2008 6:01 AM
To: Discussion of IronPython
Subject: Re: [IronPython] Ironclad problem, with which someone here may be able to help
Thanks Dino -- it seems that if I use ipy with -X:MTA, I can no longer
reproduce the problem. However, this bothers me a bit: I'm not competent
to follow all the consequences here, but this situation seems to imply
that Ironclad won't be usable safely from any STA thread. Is this an
intended restriction or a bug?
Incidentally, tracking down the call stacks proved to be hard work: the
timing changed enough that I only got a single failure over dozens of
runs, and it turned out I'd got the logging wrong, so I didn't find out
anything useful :(.
Dino Viehland wrote:
> I would suggest getting a snap shot of the call stacks when this is happening if that's possible. I can't pin anything down but I wonder if you could have an STA object or something that otherwise requires message pumping. That message pumping could happen while you're doing a Monitor.Enter call. If that was happening maybe there is some subtle CLR bug or a surprise feature? It is surprising that Monitor.Enter can go re-entrant but it can...
>
> So it'd be interesting to get thread snapshots and see if
>
>
>> EnsureGIL (443) 2
>> EnsureGIL (443) 1 <- omg, wtf, bbq, etc.
>>
>
> Could be happening because Thread 1 experiences contention, blocks and pumps messages, causing the finalizer thread (2) to run, the lock is acquired and ... ?
>
> Only other thing I could think of is does it repro on other machines? Maybe it's a hardware bug as unlikely as that seems?
>
> -----Original Message-----
> From: users-bounces at lists.ironpython.com [mailto:users-bounces at lists.ironpython.com] On Behalf Of William Reade
> Sent: Wednesday, November 05, 2008 10:01 AM
> To: Discussion of IronPython
> Subject: Re: [IronPython] Ironclad problem, with which someone here may be able to help
>
> The log starts in the middle (after many lock/unlocks, some from each
> thread); I'm running on x86; and I have no additional AppDomains.
>
> I don't think it would be safe for me to entirely avoid locking during
> finalization, but I could probably cut it down to a quick lock, on a
> separate object, to enqueue pointers for cleanup and deallocation on the
> main thread. However, I'm reluctant to do that until I'm sure that the
> problem is specifically related to GC, otherwise it'll just come back as
> soon as anyone tries any serious multithreading :).
>
> Curt Hagenlocher wrote:
>
>> Locking during finalization is often considered to be a bad idea. In
>> particular, locking without a timeout introduces the possibility that
>> you will hang the finalization thread, preventing further objects from
>> being finalized. But clearly, that's not what's happening here.
>>
>> Other questions that probably don't matter but might be interesting to
>> know:
>>
>> Can we assume that the finalization thread isn't the first place where
>> this lock is required? That your log starts somewhere in the middle?
>>
>> Is this under x86 or x64 or both?
>>
>> Are you creating any additional AppDomains in the process?
>>
>>
>> On Wed, Nov 5, 2008 at 10:15 AM, William Reade
>> <william at resolversystems.com <mailto:william at resolversystems.com>> wrote:
>>
>> Hi Curt
>>
>> I am indeed; that's how I know thread 2 is the GC thread. Is
>> locking during GC forbidden?
>>
>> William
>>
>> Curt Hagenlocher wrote:
>>
>> ...or, for that matter, any __del__ methods from within Python
>> -- which ultimately are handled by finalization.
>>
>> On Wed, Nov 5, 2008 at 9:37 AM, Curt Hagenlocher
>> <curt at hagenlocher.org <mailto:curt at hagenlocher.org>
>> <mailto:curt at hagenlocher.org <mailto:curt at hagenlocher.org>>>
>> wrote:
>>
>> So, the obvious question for me is whether or not you're
>> using any
>> finalizers.
>>
>>
>> On Wed, Nov 5, 2008 at 5:57 AM, William Reade
>> <william at resolversystems.com
>> <mailto:william at resolversystems.com>
>> <mailto:william at resolversystems.com
>> <mailto:william at resolversystems.com>>>
>>
>> wrote:
>>
>> Hi all
>>
>> While running the numpy tests, I've come across a situation
>> which, to the best of my knowledge, is simply
>> impossible. I'm
>> hoping that one of the local .NET gurus will be able to
>> tell
>> me what I'm missing, or point me somewhere I can get
>> more insight.
>>
>> The 4 methods involved are as follows:
>> -----------------------
>> public int GetThreadId()
>> {
>> return Thread.CurrentThread.ManagedThreadId;
>> }
>>
>> public void WriteFlush(string info)
>> {
>> Console.WriteLine(info);
>> Console.Out.Flush();
>> }
>>
>> public void EnsureGIL()
>> {
>> Monitor.Enter(this.dispatcherLock);
>> this.WriteFlush(String.Format(
>> "EnsureGIL ({1}) {0}", this.GetThreadId(),
>> Builtin.id(this.dispatcherLock)));
>> }
>>
>> public void ReleaseGIL()
>> {
>> this.WriteFlush(String.Format(
>> "ReleaseGIL ({1}) {0}\n", this.GetThreadId(),
>> Builtin.id(this.dispatcherLock)));
>> Monitor.Exit(this.dispatcherLock);
>> }
>> -----------------------
>> ...and they can, and do, occasionally produce output as
>> follows:
>> -----------------------
>> EnsureGIL (443) 2
>> EnsureGIL (443) 1 <- omg, wtf, bbq, etc.
>> ReleaseGIL (443) 2
>>
>> EnsureGIL (443) 2
>> ReleaseGIL (443) 1
>>
>> ReleaseGIL (443) 2
>> -----------------------
>> When this happens, the process continues happily for a
>> short
>> time and then falls over in a later call to ReleaseGIL
>> (after
>> successfully calling it several times). The error is "
>> Object
>> synchronization method was called from an
>> unsynchronized block
>> of code", which I understand to mean "you can't release
>> this
>> lock because you don't hold it".
>>
>> It doesn't happen very often, but I can usually
>> reproduce it
>> by running
>> test_multiarray.TestFromToFile.test_malformed a few
>> hundred times. It may be relevant to note that thread 2
>> is the
>> GC thread, and thread 1 is the main thread. I have
>> considered
>> the following possibilities:
>>
>> (1) That I'm locking on the wrong object. I believe
>> that isn't
>> the case, because it's constructed only once, as a "new
>> Object()" (ie, a reference type), and is only subsequently
>> used for locking; and, because it keeps the same ipy id
>> throughout.
>>
>> (2) That Monitor.Enter occasionally allows two different
>> threads to acquire the same lock. I consider this extremely
>> unlikely, because... well, how many multithreaded .NET apps
>> already exist? If Monitor really were broken, I think we'd
>> probably know about it by now.
>>
>> (3) That calling Flush() on a SyncTextWriter (the type of
>> Console.Out) doesn't actually do anything, and the
>> output is
>> somehow wrongly ordered (although I can't imagine how this
>> could actually be: if the locking is really working,
>> then my
>> console writes are strictly sequential). I don't have
>> access
>> to the code, so I have no idea how it's implemented,
>> but even
>> if this is the case it doesn't help much with the
>> fundamental
>> problem (the synchronisation error which follows).
>>
>> Apart from the above, I'm out of ideas. Can anyone suggest
>> what I've missed?
>>
>> William
>> _______________________________________________
>> Users mailing list
>> Users at lists.ironpython.com
>> <mailto:Users at lists.ironpython.com>
>> <mailto:Users at lists.ironpython.com
>> <mailto:Users at lists.ironpython.com>>
>>
>>
>> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.ironpython.com <mailto:Users at lists.ironpython.com>
>> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.ironpython.com
>> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>>
>>
>
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>
> _______________________________________________
> Users mailing list
> Users at lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
>
>
_______________________________________________
Users mailing list
Users at lists.ironpython.com
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
More information about the Ironpython-users
mailing list