[Python-bugs-list] [ python-Bugs-433625 ] bug in PyThread_release_lock()
noreply@sourceforge.net
noreply@sourceforge.net
Thu, 18 Oct 2001 09:18:29 -0700
Bugs item #433625, was opened at 2001-06-15 19:17
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=433625&group_id=5470
Category: Threads
Group: Platform-specific
>Status: Deleted
Resolution: Invalid
Priority: 5
Submitted By: Shih-Hao Liu (shihao)
Assigned to: Tim Peters (tim_one)
Summary: bug in PyThread_release_lock()
Initial Comment:
Mutex should be hold when calling
pthread_cond_signal(). This function should look like:
PyThread_release_lock(PyThread_type_lock lock)
{
pthread_lock *thelock = (pthread_lock *)lock;
int status, error = 0;
dprintf(("PyThread_release_lock(%p) called\n",
lock));
status = pthread_mutex_lock( &thelock->mut );
CHECK_STATUS("pthread_mutex_lock[3]");
thelock->locked = 0;
/* ***** call pthread_cond_signal before unlock
mutex */
status = pthread_cond_signal(
&thelock->lock_released );
CHECK_STATUS("pthread_cond_signal");
status = pthread_mutex_unlock( &thelock->mut );
CHECK_STATUS("pthread_mutex_unlock[3]");
/* wake up someone (anyone, if any) waiting on
the lock */
}
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one)
Date: 2001-10-18 09:18
Message:
Logged In: YES
user_id=31435
Deleted this bug report, at Shih-Hao Liu's request.
----------------------------------------------------------------------
Comment By: Shih-Hao Liu (shihao)
Date: 2001-06-20 00:17
Message:
Logged In: YES
user_id=246388
> I also browse through
> LinuxThread implementation and can't find
> pthread_cond_xxxx is implemented automically.
I found they do call __pthread_lock when updating the
priority queue after take a closer look.
I guess I have to look at elsewhere to figure out why my
Python script is spinning.
----------------------------------------------------------------------
Comment By: Shih-Hao Liu (shihao)
Date: 2001-06-19 22:14
Message:
Logged In: YES
user_id=246388
> I understand both of those. Are you assuming that, e.g.,
> thread 3's PyThread_acquire_lock completes in whole
during
> this gap? I don't know what else you could mean, so
let's
> assume that.
Yes, I assume thread 3 completed in this gap and it is
possible.
"Collide" means while thread 2's pthread_cond_wait was
modifiying the internal data structure, thread 1 calls
pthread_cond_signal.
The point here is not missing signals, the problem is the
possiblity that pthread_cond_signal preempt the execution
of pthread_cond_wait. I can't find any document says
pthread_cond_xxx functions must be automic operations.
In pthread_cond_xxx man page, it only mentioned:
pthread_cond_wait atomically unlocks the mutex (as per
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pthread_unlock_mutex) and waits for the condition variable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cond to be signaled. The thread execution is suspended and
^^^^^^^^^^^^^^^^^^^
does not consume any CPU time until the condition variable
is signaled. The mutex must be locked by the calling
thread on entrance to pthread_cond_wait. Before returning
to the calling thread, pthread_cond_wait re-acquires mutex
(as per pthread_lock_mutex).
Unlocking the mutex and suspending on the condition vari
able is done atomically. Thus, if all threads always
acquire the mutex before signaling the condition, this
guarantees that the condition cannot be signaled (and thus
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ignored) between the time a thread locks the mutex and the
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
time it waits on the condition variable.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I take it that between the time the mutex is locked by
pthread_mutex_lock and unlocked by pthread_cond_wait, you
can't call pthread_cond_signal. I also browse through
LinuxThread implementation and can't find pthread_cond_xxxx
is implemented automically.
I found there is another way to fix this without having to
call the pthread_cond_signal while holding the mutex. If
we do:
if (!thelock->locked)
status = pthread_cond_signal( &thelock->lock_released );
we guarantee that when pthread_cond_signal is called, the
acquire_lock code will not be in the middle of
pthread_cond_signal.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-06-19 15:06
Message:
Logged In: YES
user_id=31435
> The problem will be arised if there is a thread 3
> called PyThread_acquire_lock
You never mentioned thread 3 again, so I don't know what it
has to do with this.
> after thread 1 set thelock->locked to 0 and before
> thread 2 calling pthread_mutex_lock.
I understand both of those. Are you assuming that, e.g.,
thread 3's PyThread_acquire_lock completes in whole during
this gap? I don't know what else you could mean, so let's
assume that.
> "while (thelock->locked)" will success for thread 2
Sure.
> and it will call pthread_cond_wait
Yup.
> and might collide with thread 1's pthread_cond_signal.
What does "collide" mean to you? All the pthread_cond_xxx
functions must be implemented as if atomic, so there's no
meaningful sense (to me) in which they can collide --
unless they're implemented incorrectly.
Assuming they are implemented correctly, it again doesn't
matter that thread 2 misses thread 1's signal, because
thread *3* exploited the information thread 1 was going to
signal, by acquiring the lock. It's actually good that
thread 2 isn't bothered with it: there's no real info in
the signal anymore (at best, if thread 2 got it, it would
wake up and go "oops! it's still locked; I'll wait again").
All that matters now is whether thread 2 gets a chance to
see thread *3*'s signal, at the time thread 3 releases the
lock. And it will, because thread 3 can't release the lock
without acquiring the mutex first, and thread 2 holds the
mutex at all times except when in its pthread_cond_wait
call (so thread 3 can't release the lock except when thread
2 is in pthread_cond_wait).
Note that I'm not at all concerned about "fairness" here,
only about races.
----------------------------------------------------------------------
Comment By: Shih-Hao Liu (shihao)
Date: 2001-06-19 14:41
Message:
Logged In: YES
user_id=246388
Oops.
The problem will be arised if there is a thread 3 called
PyThread_acquire_lock after thread 1 set thelock->locked to
0 and before thread 2 calling pthread_mutex_lock. "while
(thelock->locked)" will success for thread 2 and it will
call pthread_cond_wait and might collide with thread 1's
pthread_cond_signal.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-06-19 13:52
Message:
Logged In: YES
user_id=31435
It appears you're concerned that the signal will "get lost"
in this scenario. I agree that it may, but it doesn't
matter: thread 2's "while (thelock->locked)" test fails
because thread 1 already set thelock->locked to 0, so
thread 2 doesn't execute its pthread_cond_wait, so it
doesn't matter that nobody is *waiting* to see thread 1's
pthread_cond_signal. IOW, the condition thread 1 will try
to signal has *already* been detected by thread 2, and the
signal is useless (because redundant) information. Indeed,
it's a beauty of the condition protocol that signals can be
sloppy.
The linuxthread man page is worded strangely here, but note
that it does not say the mutex *must* be locked. They
can't, either, because POSIX doesn't require it; while
POSIX stds aren't available freely online, some derived
specs are, and are much clearer about this; e.g., see the
Single Unix Specification, here:
<http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_con
d_signal.html>
I'll tell you why I don't *want* to change this: in
Python's use of the global interpreter lock, it's almost
always the case that someone is waiting on the lock. By
releasing the mutex before signaling, this gives a waiter a
chance to run immediately upon calling
pthread_cond_signal. Else, because pthread_cond_wait
(which the waiters are executing) has to lock the mutex, if
the signaler is holding the mutex during the signal,
pthread_cond_signal can't finish the job -- it was to go
back to the signaler and let it unlock the mutex first
before a waiter can proceed. This makes the region of
exclusion longer than it needs to be.
So if there's not an actual race problem here (& I still
don't see one), I don't want to change this.
----------------------------------------------------------------------
Comment By: Shih-Hao Liu (shihao)
Date: 2001-06-17 23:01
Message:
Logged In: YES
user_id=246388
I closed it because I thought the thelock->locked variable
will ensure that the PyThread_release_lock will help to
protect the condition variable and I was wrong. The
linuxthread man page on pthread_cond_signal:
A condition variable must always be associated with a
mutex, to avoid the race condition where a thread prepares
to wait on a condition variable and another thread signals
the condition just before the first thread actually waits
on it.
which means you can't call pthread_cond_signal &
pthread_cond_wait on the same condition variable at the
same time. And using a mutex to protect them is a good
idea. Here is how thing might go wrong with current
implementation:
thread 1 thread 2
|int PyThread_acquire_lock _
|/** assume lock was acquired
| by thread 1, hence locked=0
| & success would be 0 **/
|{
| ...
| status = pthread_mutex_lo
| CHECK_STATUS("pthread_mut
| success = thelock->locked
| if (success) thelock->loc
| status = pthread_mutex_un
| /** thread 2 suspended **/
void PyThread_release_lock _|
{ |
... |
status = pthread_mutex_loc|
CHECK_STATUS("pthread_mute|
|
thelock->locked = 0; |
|
status = pthread_mutex_unl|
/** thread 1 suspend **/ |
| CHECK_STATUS("pthread_mut
|
| if ( !success && waitflag
| /* continue trying unti
|
| /* mut must be locked b
| * protocol */
| status = pthread_mutex_
| CHECK_STATUS("pthread_m
| while ( thelock->locked
| status = pthread_cond
|/** thread 2 suspended while
| updating shared data **
CHECK_STATUS("pthread_mute|
|
/* wake up someone (anyone|
status = pthread_cond_sign|
/** thread 1 update shared |
data and corrupt it. **/ |
Not sure what the effect would be. It's wouldn't be nice
anyway.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-06-17 16:02
Message:
Logged In: YES
user_id=31435
Closing this again, as it appears the original submitter
deleted it. shihao, if you want to pursure this, open it
again.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-06-17 15:01
Message:
Logged In: YES
user_id=31435
Ack, did I delete this?! I sure didn't intend to -- didn't
even intend to close it. Reopened pending more info.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-17 14:51
Message:
Logged In: YES
user_id=6380
Set status to closed -- no need to delete it.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-06-17 14:16
Message:
Logged In: YES
user_id=31435
Why? It's allowed to signal the condition whether or not
the mutex is held. Since changing this can have visible
effects on thread scheduling, I'm reluctant to change it
without a good reason.
----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=433625&group_id=5470