Multiple scripts versus single multi-threaded script
Roy Smith
roy at panix.com
Thu Oct 3 14:28:32 EDT 2013
In article <mailman.684.1380819470.18130.python-list at python.org>,
Chris Angelico <rosuav at gmail.com> wrote:
> On Fri, Oct 4, 2013 at 2:41 AM, Roy Smith <roy at panix.com> wrote:
> > The downside to threads is that all of this sharing makes them much
> > more complicated to use properly. You have to be aware of how all the
> > threads are interacting, and mediate access to shared resources. If you
> > do that wrong, you get memory corruption, deadlocks, and all sorts of
> > (extremely) difficult to debug problems. A lot of the really hairy
> > problems (i.e. things like one thread continuing to use memory which
> > another thread has freed) are solved by using a high-level language like
> > Python which handles all the memory allocation for you, but you can
> > still get deadlocks and data corruption.
>
> With CPython, you don't have any headaches like that; you have one
> very simple protection, a Global Interpreter Lock (GIL), which
> guarantees that no two threads will execute Python code
> simultaneously. No corruption, no deadlocks, no hairy problems.
>
> ChrisA
Well, the GIL certainly eliminates a whole range of problems, but it's
still possible to write code that deadlocks. All that's really needed
is for two threads to try to acquire the same two resources, in
different orders. I'm running the following code right now. It appears
to be doing a pretty good imitation of a deadlock. Any similarity to
current political events is purely intentional.

import threading
import time

lock1 = threading.Lock()
lock2 = threading.Lock()

class House(threading.Thread):
    def run(self):
        print("House starting...")
        lock1.acquire()
        time.sleep(1)    # give Senate time to grab lock2
        lock2.acquire()  # blocks forever: Senate holds lock2
        print("House running")
        lock2.release()
        lock1.release()

class Senate(threading.Thread):
    def run(self):
        print("Senate starting...")
        lock2.acquire()
        time.sleep(1)    # give House time to grab lock1
        lock1.acquire()  # blocks forever: House holds lock1
        print("Senate running")
        lock1.release()
        lock2.release()

h = House()
s = Senate()
h.start()
s.start()

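
One common way to avoid this kind of deadlock is to have every thread acquire the locks in the same order. The following is only a minimal sketch of that idea, not a general cure; the worker function and the finished list are hypothetical names made up for the illustration, and the sleep is shortened so it finishes quickly.

```python
import threading
import time

lock1 = threading.Lock()
lock2 = threading.Lock()
finished = []  # hypothetical: records which workers got through

def worker(name):
    # Every thread takes lock1 first and lock2 second, so no thread
    # can hold lock2 while waiting for lock1.
    with lock1:
        time.sleep(0.1)
        with lock2:
            finished.append(name)

threads = [threading.Thread(target=worker, args=(name,))
           for name in ("House", "Senate")]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)

print(sorted(finished))
```

With a single agreed-upon order, one thread simply waits for the other to finish, and both eventually run.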
Similarly, I can have data corruption. I can't get memory corruption in
the way you can get in a C/C++ program, but I can certainly have one
thread produce data for another thread to consume, and then
(incorrectly) continue to mutate that data after it relinquishes
ownership.
Let's say I have a Queue. A producer thread pushes work units onto the
Queue and a consumer thread pulls them off the other end. If my
producer thread does something like:

work = {'id': 1, 'data': "The Larch"}
my_queue.put(work)
work['id'] = 3

I've got a race condition where the consumer thread may get an id of
either 1 or 3, depending on exactly when it reads the data from its end
of the queue (more precisely, exactly when it uses that data).
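
One possible way to close that race is to hand the consumer its own copy, so later changes by the producer cannot leak through. This is a sketch under a few assumptions: it uses the Python 3 module name queue (it is Queue in Python 2), it reads the item back in the same thread just to show the effect, and it uses copy.deepcopy, which may be more than a real work unit needs.

```python
import copy
import queue

my_queue = queue.Queue()

work = {'id': 1, 'data': "The Larch"}
my_queue.put(copy.deepcopy(work))  # the queue gets its own snapshot
work['id'] = 3                     # only the producer's copy changes

received = my_queue.get()
print(received['id'])
```

The simpler discipline is to stop touching an object once it has been put on the queue; the copy just makes that rule harder to break by accident.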
Here's a somewhat different example of data corruption between threads:

import threading
import random
import sys

sketch = "The Dead Parrot"

class T1(threading.Thread):
    def run(self):
        # Remember the value seen at startup, then spin until it changes.
        current_sketch = str(sketch)
        while True:
            if sketch != current_sketch:
                print("Blimey, it's changed!")
                return

class T2(threading.Thread):
    def run(self):
        global sketch
        sketches = ["Piranha Brothers",
                    "Spanish Inquisition",
                    "Lumberjack"]
        # Keep overwriting the shared global behind T1's back.
        while True:
            sketch = random.choice(sketches)

t1 = T1()
t2 = T2()
t2.daemon = True  # let the program exit while T2 is still looping
t1.start()
t2.start()
t1.join()
sys.exit()
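
If a thread needs the shared value to hold still while it works with it, one option is to guard every read and write with the same lock. The sketch below assumes that approach; sketch_lock, changes_seen, and the two functions are hypothetical names, and the loops are bounded so the program terminates on its own.

```python
import random
import threading

sketch = "The Dead Parrot"
sketch_lock = threading.Lock()  # hypothetical: guards the global sketch
changes_seen = []               # hypothetical: changes observed mid-read

def reader():
    for _ in range(1000):
        with sketch_lock:
            before = sketch
            after = sketch  # the writer is locked out between these reads
            if before != after:
                changes_seen.append((before, after))

def writer():
    global sketch
    sketches = ["Piranha Brothers", "Spanish Inquisition", "Lumberjack"]
    for _ in range(1000):
        with sketch_lock:
            sketch = random.choice(sketches)

t1 = threading.Thread(target=reader)
t2 = threading.Thread(target=writer)
t1.start()
t2.start()
t1.join()
t2.join()

print(len(changes_seen))  # prints 0
```

The value can still change from one locked section to the next; the lock only guarantees that it does not change while a thread is inside one.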