[Patches] [ python-Patches-960406 ] unblock signals in threads

SourceForge.net noreply at sourceforge.net
Wed Jul 7 12:37:02 CEST 2004


Patches item #960406, was opened at 2004-05-25 22:00
Message generated for change (Comment added) made by mwh
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=960406&group_id=5470

Category: Core (C code)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 6
Submitted By: Andrew Langmead (langmead)
Assigned to: Michael Hudson (mwh)
Summary: unblock signals in threads

Initial Comment:
This patch corrects the problems some people have with Python's 
handling of signals in threads. It allows any thread to initially catch 
the signal and mark it as triggered, allowing the main thread to process 
it later. (This actually just restores access to functionality that was 
in Python 2.1.) The special SIGINT handling for the Python readline 
module has been changed so that it can now see an EINTR error code, 
rather than needing a longjmp out of the readline library itself. If the 
readline library Python is linked against doesn't have the necessary 
callback features, it will fall back to its old behavior.
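
A minimal sketch of the behaviour described above, written against a
modern Python 3 on a POSIX system rather than the 2.4 patch itself. The
names on_usr1, handled_in and demo are made up for illustration. The
signal is delivered to a worker thread with signal.pthread_kill, yet the
Python-level handler still runs in the main thread:

```python
import signal
import threading
import time

handled_in = []  # names of the threads in which the Python handler ran


def on_usr1(signum, frame):
    # Record which thread executes the Python-level handler.
    handled_in.append(threading.current_thread().name)


def demo(timeout=5.0):
    """Deliver SIGUSR1 to a worker thread; report where the handler ran."""
    del handled_in[:]
    old = signal.signal(signal.SIGUSR1, on_usr1)
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, name="worker")
    worker.start()
    try:
        # The OS-level signal is aimed at the worker thread, not at us.
        signal.pthread_kill(worker.ident, signal.SIGUSR1)
        # The main thread has to get back to running Python code before
        # the handler can be called, so poll briefly.
        deadline = time.monotonic() + timeout
        while not handled_in and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        stop.set()
        worker.join()
        signal.signal(signal.SIGUSR1, old)
    return list(handled_in)
```

Calling demo() from the main thread should return a list naming only
the main thread, whichever thread the signal was aimed at.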

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2004-07-07 11:37

Message:
Logged In: YES 
user_id=6656

alpha1 is approaching...

I'm not sure what to do here.  I think I know how to deal with the 
first complaint below (basically, do different things if you re-enter 
PyOS_Readline from a different thread than when you re-enter it 
from the same thread).
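
To make the distinction concrete, here is a hypothetical sketch in
Python (the real check would live in C around PyOS_Readline; the class
name ReadlineGuard and its return values are assumptions, not part of
any patch). It records which thread is currently inside the call and
reports whether a second entry comes from that thread or another one:

```python
import threading


class ReadlineGuard:
    """Track which thread is inside a non-reentrant call (illustrative)."""

    def __init__(self):
        # An RLock, so that a signal handler running in the owning thread
        # does not deadlock if it arrives while the lock is held.
        self._mutex = threading.RLock()
        self._owner = None

    def try_enter(self):
        """Return 'entered', 'same-thread' or 'other-thread'."""
        me = threading.get_ident()
        with self._mutex:
            if self._owner is None:
                self._owner = me
                return "entered"
            if self._owner == me:
                return "same-thread"
            return "other-thread"

    def leave(self):
        with self._mutex:
            self._owner = None
```

A caller could then refuse a 'same-thread' re-entry outright (it can
only come from a signal handler) and make an 'other-thread' caller wait
or fail, which is one possible reading of "do different things".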

The other issue does seem to be a readline problem.  I sent a 
flam^Wreport to the readline bugs list about a week ago but have 
had no response yet.

What do people think?  Any fix for this problem must be in an 
early alpha to get the x-platform testing it sorely needs.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-30 12:29

Message:
Logged In: YES 
user_id=6656

Dammit all: pressing ^C when in "interactive search mode" also 
appears to fail to do the Right Thing.  Is this a readline bug?

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-30 12:17

Message:
Logged In: YES 
user_id=6656

Ah hell, my current patch makes insane things happen when you 
do something like:

>>> thread.start_new_thread(raw_input, ('a',)); time.sleep(1)

Gah.  Maybe we should just try to ban calling into readline from a 
non-main thread; that seems a bit draconian, though.

----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2004-06-23 06:00

Message:
Logged In: YES 
user_id=29957

At this point, worry about getting it working at all for
2.4, _then_ we can worry about trying to backport it to 2.3.
If it turns out that we can't fix it for 2.3.5, so be it...
I'd much rather see this fixed correctly in 2.4 and not at
all in 2.3.5 than seeing a broken hacky fix in both 2.3.5
and 2.4. This code is already unpleasant enough.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2004-06-23 05:41

Message:
Logged In: YES 
user_id=6380

If there's no frame when PyOS_Readline() handles the signal
immediately, why would there be a frame when the user hits
return? IOW I don't think it would be a big deal to change
that behavior.

Semantics that are a trifle (or even completely) accidental
are nevertheless worth preserving in a bugfix release,
otherwise compatibility could be at risk.


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-22 10:02

Message:
Logged In: YES 
user_id=6656

> What else did you want from me?

Not a lot more than that :-)  The only other point on which you might 
have an opinion (aka a bit of current behaviour that I don't understand 
;-) is that in current Python, a signal delivered while sitting in a 
call to PyOS_Readline() is not handled (at the Python level) until 
the user presses return (or ^C? hmm, not sure about that) 
whereas with this patch, it is handled more-or-less immediately.

This means that the second argument to the Python signal handler 
will be None, rather than a frame object: there's no Python 
execution happening at this point, after all.
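
As an illustration only (the names describe_frame, on_signal and log
are invented here, and this is modern Python 3 syntax), a handler that
copes with either case might look like this:

```python
import signal

log = []


def describe_frame(frame):
    # A handler may be called with no Python frame at all.
    if frame is None:
        return "<no Python frame>"
    return "%s:%d" % (frame.f_code.co_filename, frame.f_lineno)


def on_signal(signum, frame):
    # Never touch frame attributes without checking for None first.
    log.append((signal.Signals(signum).name, describe_frame(frame)))
```

Handlers that unconditionally read frame.f_lineno or similar would
break under the new behaviour, which is the compatibility question
being asked.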

Does this sound reasonable to you?

> For 2.3, keeping whatever semantics ^C from readline
> has at the moment should be preserved

Certainly, in principle at least!  However "whatever semantics ^C 
from readline has at the moment" are a trifle accidental... I need 
to think about this.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2004-06-22 03:59

Message:
Logged In: YES 
user_id=6380

Ideally, ^C should always cause the signal handler for
SIGINT to be called, and the KeyboardInterrupt should be
generated by the default SIGINT handler.
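
For reference, the default handler is exposed as
signal.default_int_handler, and raising KeyboardInterrupt is all it
does. A small sketch (the helper name default_handler_raises is made
up):

```python
import signal


def default_handler_raises():
    """Call the default SIGINT handler directly and report the outcome."""
    try:
        signal.default_int_handler(signal.SIGINT, None)
    except KeyboardInterrupt:
        return True
    return False
```

Under the ideal described above, a program that installs its own SIGINT
handler would see that handler called on ^C, and no KeyboardInterrupt
unless the handler raises one itself.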

For 2.3, keeping whatever semantics ^C from readline has at
the moment should be preserved -- we only want bugfixes, not
new features...

What else did you want from me? (I'm also lacking focus, or
at least time to think about this stuff in detail.)

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-19 11:14

Message:
Logged In: YES 
user_id=6656

Yes, I think you're right.  I guess I'm suffering a lack of focus, 
finding it hard to resist the impulse to fix what look like ancient 
bogosities in the area while I'm there... (also see the way a NULL 
return from PyOS_Readline is assumed to be a keyboard 
interrupt).

One could argue that ^C should always interrupt an interactive 
session, but one could also argue that users shouldn't be so daft 
as to install handlers for SIGINT if they want that to be true (after 
all, they can make life hard for themselves if they want with 
stty(1)).

A downside to all this footling is that it makes a backport to 2.3 
harder to justify.  Hmm.  I wonder what Guido thinks (he's 
allegedly "now maintaining" Modules/readline.c :-).

----------------------------------------------------------------------

Comment By: Andrew Langmead (langmead)
Date: 2004-06-19 04:04

Message:
Logged In: YES 
user_id=119306

I'm not sure if the current behavior should be maintained or not, but it 
looks to me like the readline module has always generated a 
KeyboardInterrupt, regardless of whether SIGINT has been overridden. 
This is a bit odd though. It causes the SIGINT handling to change 
depending on whether or not you are at the top level interpreter's 
prompt.

wantarray% cat /tmp/foo.py 
import signal

def foo(sig, frame):
  print "caught foo"

signal.signal(signal.SIGINT, foo)


wantarray% python -i /tmp/foo.py
>>> foo
<function foo at 0x61430>
>>> ^C
KeyboardInterrupt
>>> while 1:
...   pass
... 
^Ccaught foo
^Ccaught foo
^Ccaught foo
^Ccaught foo
^\zsh: quit       python -i /tmp/foo.py


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-18 13:54

Message:
Logged In: YES 
user_id=6656

The problem with that approach is: what if you want a
handler for SIGINT that doesn't raise KeyboardInterrupt? 
Other than that, it sounds like your plan should work.

I've attached a slightly cleaned up version of my patch
which makes signal handling in the "without readline" case
more like yesterday's patch made the "with readline" case.

----------------------------------------------------------------------

Comment By: Andrew Langmead (langmead)
Date: 2004-06-18 03:35

Message:
Logged In: YES 
user_id=119306

Here is another possible approach to solving the problem of readline 
exiting for signals other than SIGINT. I'm not sure if it is better or worse 
than the scarypatch.

As you said, the call to readline is performed without the GIL. So is the 
actual C-level signal handler from the signal module (the Python code 
associated with the signal is deferred until later). At the time 
we see the EINTR, there is a flag in the signal module's Handler array to 
say whether the signal that we received was a SIGINT. If we added some 
sort of interface within the signal module to find out what signals are 
pending to be run on the next call to PyErr_CheckSignals, then we could 
find out if the EINTR was caused by an INT (at which point we should 
exit) or by another signal (at which point we could just retry the select).
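
The control flow being proposed can be sketched in Python as a
simulation. Everything here is hypothetical: pending_signals stands in
for the proposed "what is pending" interface, and wait_once stands in
for the select call, raising InterruptedError where the C code would see
EINTR.

```python
import signal


def read_line_or_interrupt(wait_once, pending_signals):
    """Retry wait_once() on EINTR unless a SIGINT is pending.

    wait_once       -- callable returning a line, or raising
                       InterruptedError when interrupted by a signal
    pending_signals -- set of signal numbers not yet processed
    """
    while True:
        try:
            return wait_once()
        except InterruptedError:
            if signal.SIGINT in pending_signals:
                # The interruption was a ^C: stop reading.
                pending_signals.discard(signal.SIGINT)
                raise KeyboardInterrupt
            # Some other signal: leave it pending for later processing
            # and go back to waiting for input.
            continue
```

Note that this sketch has the same limitation raised elsewhere in this
thread: it treats SIGINT as meaning KeyboardInterrupt even if the user
has installed a different handler.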

Is there any potential to this approach? 

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-17 18:25

Message:
Logged In: YES 
user_id=6656

BTW, I'd really really like someone to review this :-)

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-17 18:24

Message:
Logged In: YES 
user_id=6656

How about the attached?  It's a bit ... scary.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-17 17:08

Message:
Logged In: YES 
user_id=6656

A potential problem with this patch is that it causes input
to be interrupted (with a KeyboardInterrupt exception) when
any handled signal is delivered.  This seems suboptimal.

It's appealing to try to run the (Python) signal handlers in
the errno == EINTR case of
readline_line_until_enter_or_signal, but that has problems
in that PyOS_ReadlineFunctionPointer is called without the
GIL being held and once that is dealt with, an installed
Python signal handler attempting to call readline at this
point can reasonably be expected to result in all hell
breaking loose.

I don't know what the correct solution is here.  Add our own
reentrancy checks and learn how to work the Python
threadstate API properly?

Thoughts, anyone? Or have I scared everyone away now?

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-12 11:17

Message:
Logged In: YES 
user_id=6656

Now a rewrite of the test that actually works!

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-12 11:08

Message:
Logged In: YES 
user_id=6656

Here's a version of the patch that includes the new unit test 
(oops!) which I've rewritten slightly.

----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2004-06-11 17:02

Message:
Logged In: YES 
user_id=29957

No - wait. Ignore that test_timeout error, it exists with a
clean checkout. 
The inability to interrupt make testall, however, is new with
this patch.
Linux Fedora Core 2.


----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2004-06-11 16:58

Message:
Logged In: YES 
user_id=29957

With this patch:

bonanza% ./python Lib/test/test_timeout.py  
testBlockingThenTimeout (__main__.CreationTestCase) ... ok
testFloatReturnValue (__main__.CreationTestCase) ... ok
testObjectCreation (__main__.CreationTestCase) ... ok
testRangeCheck (__main__.CreationTestCase) ... ok
testReturnType (__main__.CreationTestCase) ... ok
testTimeoutThenBlocking (__main__.CreationTestCase) ... ok
testTypeCheck (__main__.CreationTestCase) ... ok
testAcceptTimeout (__main__.TimeoutTestCase) ... ok
testConnectTimeout (__main__.TimeoutTestCase) ... FAIL
testRecvTimeout (__main__.TimeoutTestCase) ... ok
testRecvfromTimeout (__main__.TimeoutTestCase) ... ok
testSend (__main__.TimeoutTestCase) ... ok
testSendall (__main__.TimeoutTestCase) ... ok
testSendto (__main__.TimeoutTestCase) ... ok

======================================================================
FAIL: testConnectTimeout (__main__.TimeoutTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_timeout.py", line 121, in testConnectTimeout
    "timeout (%g) is more than %g seconds more than expected (%g)"
AssertionError: timeout (4.48679) is more than 2 seconds more than expected (0.001)

----------------------------------------------------------------------
Ran 14 tests in 17.445s

FAILED (failures=1)
Traceback (most recent call last):
  File "Lib/test/test_timeout.py", line 192, in ?
    test_main()
  File "Lib/test/test_timeout.py", line 189, in test_main
    test_support.run_unittest(CreationTestCase, TimeoutTestCase)
  File "/home/anthony/src/py/pyhead/dist/src/Lib/test/test_support.py", line 290, in run_unittest
    run_suite(suite, testclass)
  File "/home/anthony/src/py/pyhead/dist/src/Lib/test/test_support.py", line 275, in run_suite
    raise TestFailed(err)
test.test_support.TestFailed: Traceback (most recent call last):
  File "Lib/test/test_timeout.py", line 121, in testConnectTimeout
    "timeout (%g) is more than %g seconds more than expected (%g)"
AssertionError: timeout (4.48679) is more than 2 seconds more than expected (0.001)

Also, with this patch applied, I can no longer kill a 'make
testall' with a ^C


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-06-11 15:18

Message:
Logged In: YES 
user_id=6656

The patch didn't apply, so I've updated it (attached).

I see test_asynchat fail occasionally now, but don't know if that's 
because of this patch :-(

Once I've sorted that out in my head, I think I'm going to check 
this in.

----------------------------------------------------------------------

Comment By: Andrew Langmead (langmead)
Date: 2004-05-29 06:49

Message:
Logged In: YES 
user_id=119306

Here is a reformatted version of the patch.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-05-28 15:25

Message:
Logged In: YES 
user_id=31435

I agree that "busy" always should have been volatile -- once 
again, good eye!

Python C style is basically K&R Classic, hard tab for 
indentation, open curly at the end of the line opening a block 
except for first line of function definition.  Just make it look 
like the other C code, but be careful to pick one of the .c 
files Guido approves of <wink>.

----------------------------------------------------------------------

Comment By: Andrew Langmead (langmead)
Date: 2004-05-28 13:37

Message:
Logged In: YES 
user_id=119306

Thank you for pointing me to PEP 7. I'll take a look at where I am amiss 
and fix it up. For the change in ceval.c, I took a look at gcc's x86 
assembly output of the file, and noticed that the optimizer was altering 
the order of the busy flag test. Since busy is set from other concurrent 
execution (other signal handlers), changing the variable to volatile told 
gcc not to optimize accesses to the variable. 

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-05-28 09:54

Message:
Logged In: YES 
user_id=6656

I haven't been able to test on MacOS X further, unfortunately.

The patch works on linux/x86 though (after fixing the
TabError :-) but this is with an NPTL kernel, so I didn't
have a problem anyway.

The C doesn't all conform to the Python style -- see PEP 7.
Can you fix that?

Why the change to Python/ceval.c?

After all that -- thanks a lot!  I really want to get this
checked in ASAP so we can find out which platforms it breaks
at the earliest point in the 2.4 cycle.

----------------------------------------------------------------------

Comment By: Andrew Langmead (langmead)
Date: 2004-05-27 07:04

Message:
Logged In: YES 
user_id=119306

It seems that, at least on OS X, sending the kill to the process only 
schedules the receiving process to run the signal handler at some later 
time. (OS X seems to be the only platform that frequently runs the signal 
handlers in the opposite order from the one in which the signals were 
sent.) This revised version of the test seems to work better on OS X.
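
One way to write a test that tolerates this, sketched in modern
Python 3 for a POSIX system. This is not the test from the patch, and
the helper name send_and_collect is made up. The idea is to wait until
every handler has run and then compare what was received without
assuming any order:

```python
import os
import signal
import time


def send_and_collect(signums, timeout=5.0):
    """Send each signal to this process; return the numbers as handled."""
    seen = []

    def handler(signum, frame):
        seen.append(signum)

    old = {s: signal.signal(s, handler) for s in signums}
    try:
        for s in signums:
            os.kill(os.getpid(), s)
        # Delivery may be delayed or reordered, so poll until all of the
        # handlers have run instead of checking straight after kill().
        deadline = time.monotonic() + timeout
        while len(seen) < len(signums) and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        for s, h in old.items():
            signal.signal(s, h)
    return seen
```

A caller would then compare sorted(seen) against the sorted list of
signals sent, so that either delivery order passes.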

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-05-26 19:41

Message:
Logged In: YES 
user_id=6656

test_threadsignals hangs for me on OS X.  Haven't done anything 
more thorough than that yet...

----------------------------------------------------------------------

Comment By: Andrew Langmead (langmead)
Date: 2004-05-26 18:48

Message:
Logged In: YES 
user_id=119306

I apologize for the missing patch.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-05-26 18:22

Message:
Logged In: YES 
user_id=6656

There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file. In addition, even if you
*did* check this checkbox, a bug in SourceForge
prevents attaching a file when *creating* an issue.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )

----------------------------------------------------------------------
