[Tutor] threads

Roeland Rengelink r.b.rigilink@chello.nl
Thu, 05 Jul 2001 15:08:32 +0200


Hi Brendon,

I'd been working on a short introduction to threads, but I never
finished it. Let me use this post to at least release this much to the
public. It will probably never be finished, and half an answer may be
better than none.


Brendon wrote:
> 
> there seems to remarkably little documentation on this subject, so i'll ask
> here. how do you thread a program? i.e., to keep it very basic a program.
> 
> def thread_one():
>   r1 = 4+5
>   #pass r1 to thread two
> 
> def thread_two():
>   r2 = 3+4
>   r3 = r1 + r2
> 
> #call both threads simultaneously
> 

Threads are pieces of a program that run asynchronously.

There are several reasons you might want to use threads in your program.
Sometimes you have to deal with asynchronous events; sometimes you just
wish to simulate them.

Simulating asynchronous execution doesn't need threads. Just do something
like this:

import random

while 1:
    if random.randrange(2) == 0:
        do_something(data)
    else:
        do_another_thing(data)

This is almost equivalent to:

from threading import Thread
class T1(Thread):
    def __init__(self):
        Thread.__init__(self)
    def run(self):
        while 1:
            do_something(data)

class T2(Thread):
    def __init__(self):
        Thread.__init__(self)
    def run(self):
        while 1:
            do_another_thing(data)
t1 = T1()
t2 = T2()
t1.start()
t2.start()
t1.join()
t2.join()

Notice two things:
1 - The example without threads is much shorter
2 - I said 'almost equivalent'

They are not really equivalent because in the non-threaded example
do_something(data) and do_another_thing(data) are never executed at the
same time, while in the threaded example they appear to be. Depending on
the circumstances, this makes the threaded solution either a better one
or an absolute nightmare.

I say 'appear' because in reality at each moment only one of the
functions is being executed. Executing a function means executing a
series of (bytecode) operations. The OS and Python together may decide
to switch from one series to the other between any two operations. This
is OK if you don't mind them switching at random. Unfortunately, you
usually do.
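
To see the switching for yourself, here is a finite version of the
threaded example that actually runs. It is only a sketch: the shared
list 'log' and the loop count of 1000 are my own invention for
illustration, and I've used Thread(target=...) instead of subclassing
just to keep it short.

```python
import threading

log = []  # shared list (hypothetical, for illustration only)

def do_something(n):
    for i in range(n):
        log.append(("something", i))

def do_another_thing(n):
    for i in range(n):
        log.append(("another", i))

t1 = threading.Thread(target=do_something, args=(1000,))
t2 = threading.Thread(target=do_another_thing, args=(1000,))
t1.start()
t2.start()
t1.join()   # wait for both threads to finish
t2.join()

print(len(log))  # 2000 entries in total
```

The total is always the same, but the order of the entries in log may
differ from run to run, because you don't control when the switch
between the two threads happens.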

Problems start when you want to share resources between threads (in this
case 'data' is a shared resource), and/or when you want to communicate
between threads, i.e. when do_something and do_another_thing are going
to depend on the state of T2 and T1 respectively.

Sharing resources is a problem because you want to make sure that T1's
operation on the resource doesn't interfere with T2's operation on the
resource. This is what people refer to when they ask worried questions
about the thread-safety of an operation.
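
To get a feeling for why interference is possible, you can ask the dis
module which bytecode operations a simple statement consists of. The
sketch below assumes a global 'counter' and a function 'increment',
both made up for this illustration.

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1

# Collect the names of the bytecode operations that make up increment().
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)  # the exact list depends on the Python version
```

The point is that 'counter += 1' is not one operation: the value is
loaded, incremented and stored back in separate steps. If another
thread changes counter between the load and the store, one of the two
updates is lost.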

You can solve this problem with locks. A lock is the moral equivalent
of:

class Lock:
    def __init__(self):
        self.__locked = 0
    def acquire(self):
        # NB: a real lock does this test-and-set as one uninterruptible step
        while self.__locked:
            pass           # wait till self.__locked == 0
        self.__locked = 1  # then lock
    def release(self):
        self.__locked = 0


You would use this in your do_* functions like this:

l = Lock()

def do_something(data):
    <do some stuff>
    l.acquire()
    operate(data)  # do some stuff that shouldn't be interfered with
    l.release()
    <do some more stuff>

def do_another_thing(data):
    <do some other stuff>
    l.acquire()
    <do some stuff that could interfere with operate in do_something()>
    l.release()
    <do some more other stuff>

do_something can now only execute operate(data) if it has first acquired
the lock. This lock will remain in effect until that operation is
finished. Only when do_something has released the lock can
do_another_thing acquire it to do stuff that might otherwise have
interfered with operate(data).
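
Here is a small runnable sketch of the same pattern using the real
threading.Lock. The counter, the function name 'add_many' and the
numbers are assumptions of mine, chosen only to make the effect easy to
check.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:          # same as lock.acquire() ... lock.release()
            counter += 1    # only one thread at a time runs this line

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

The 'with lock:' form releases the lock even if the protected code
raises an exception, which is easy to forget when you call acquire()
and release() by hand.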

Sometimes we want one thread to notify the other thread that an event
has occurred (for example, when a variable has been set). The simplest
mechanism for that is the Event object. An event object is something
like:

import time

class Event:
    def __init__(self):
        self.__flag = 0
    def set(self):
        self.__flag = 1
    def isSet(self):
        return self.__flag
    def clear(self):
        self.__flag = 0
    def wait(self, timeout=None):
        if timeout is None:
            # wait indefinitely
            while not self.__flag:
                time.sleep(0.01)
            return
        else:
            # wait at most timeout seconds
            start = time.time()
            while not self.__flag and (time.time()-start) < timeout:
                time.sleep(0.01)
            return

Which you would use somewhat like this:

do_it = Event()

class T1(Thread):
    def run(self):
        <do stuff>
        do_it.set()  # We're notifying T2
        <more stuff>

class T2(Thread):
    def run(self):
        <do stuff>
        do_it.wait()  # We're waiting for notification from T1
        do_it.clear()
        <more stuff>
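
Applied to the example in the original question, this could look
roughly like the sketch below. The dictionary 'results' and the event
name 'r1_ready' are my own additions; they are just one possible way to
hand r1 from one thread to the other.

```python
import threading

results = {}                    # shared storage for r1 and r3 (hypothetical)
r1_ready = threading.Event()

def thread_one():
    results["r1"] = 4 + 5
    r1_ready.set()              # tell thread_two that r1 is available

def thread_two():
    r2 = 3 + 4
    r1_ready.wait()             # block until thread_one has stored r1
    results["r3"] = results["r1"] + r2

t1 = threading.Thread(target=thread_one)
t2 = threading.Thread(target=thread_two)
t2.start()   # starting the waiter first is fine; it simply blocks in wait()
t1.start()
t1.join()
t2.join()

print(results["r3"])  # 16
```

Because thread_two waits for the event before it reads r1, the result
does not depend on which thread happens to run first.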

The threading library module provides Lock and Event objects, as well as
some other useful stuff. I hope this intro will have made the
library documentation somewhat more useful.
     
Roeland

PS On re-reading I'm left with a nagging doubt that I may use the term
asynchronous incorrectly.

I mean it here in the sense that a thread normally doesn't influence the
moment at which an operation in another thread is executed.

Can somebody confirm this is correct usage?
-- 
r.b.rigilink@chello.nl

"Half of what I say is nonsense. Unfortunately I don't know which half"