[Python-Dev] collections module

Josiah Carlson jcarlson at uci.edu
Fri Jan 9 16:42:59 EST 2004


> Maybe; it probably doesn't matter; serious threaded apps pass maxsize to
> Queue, generally equal to the # of worker threads (or a small constant
> multiple thereof), and the bounded queue does double-duty then as a
> flow-control throttle too; list.pop(0) isn't really slow when len(list) is
> that small.

It really all depends on how fast the producers and consumers are.  In
the cases where I've used Queue.Queue, I didn't have small Queues.
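
For reference, the bounded-queue throttle described above looks roughly
like this (a minimal sketch of mine, not from the original mail; the
worker count and the handle() function are made up for illustration):

import threading
import Queue

NUM_WORKERS = 4
#maxsize doubles as a throttle: put() blocks once this many items are
#pending, so a fast producer can't run away from the workers
work = Queue.Queue(NUM_WORKERS)

def worker():
    while 1:
        job = work.get()        #blocks until an item is available
        if job is None:         #sentinel tells the worker to exit
            break
        handle(job)             #hypothetical per-item work

for i in range(NUM_WORKERS):
    threading.Thread(target=worker).start()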

> > and a single conditional,
>
> Sorry, I don't know what that might mean.

Oops, I misspelled it: threading.Condition()

A lock object with three additional methods: wait(), notify(), and
notifyAll()

When you hold the lock, you can wait() for a notification, notify() one
arbitrary waiter, or notifyAll() every waiter at once.

Terribly nice.
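
A rough sketch of the pattern (mine, not from the original mail; the
shared list is just for illustration):

from threading import Condition

cv = Condition()
items = []

def producer(item):
    cv.acquire()
    try:
        items.append(item)
        cv.notify()          #wake one thread blocked in wait()
    finally:
        cv.release()

def consumer():
    cv.acquire()
    try:
        while not items:     #re-check the condition after every wakeup
            cv.wait()        #releases the lock, sleeps, re-acquires it
        return items.pop(0)
    finally:
        cv.release()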

> > the multiple mutexes in the current Queue module are ugly as
> > hell, and very difficult to understand.
> 
> self.mutex is dead easy to understand.  esema and fsema would be clearer if
> renamed to not_empty and not_full (respectively), and clearer still (but
> slower) if they were changed to threading.Conditions (which didn't exist at
> the time Queue was written).  Regardless, there are 3 of them for good
> reasons, although that may be beyond the understanding you've reached so
> far:

I figured it out when I decided it wasn't fast enough for my use.  The
code below is quite fast for most reasonable numbers of threads - it
does wake up every waiting thread on each put and get, but I find it
considerably clearer, and it can be done with a single synchronization
primitive.

from threading import Condition
from Queue import Full, Empty
#assuming fifo exists and is fast, with put() and get() methods

class CQueue:
    def __init__(self, maxsize):
        if maxsize > 0:
            self._m = maxsize
        else:
            #big enough in practice, though a real maximum would be better
            self._m = 2**64
        self._q = fifo()
        self._l = Condition()
    def qsize(self):
        self._l.acquire()
        l = len(self._q)
        self._l.release()
        return l
    def empty(self):
        return self.qsize() == 0
    def full(self):
        return self.qsize() == self._m
    def put(self, item, block=1):
        self._l.acquire()
        try:
            #wait() releases the lock while sleeping and re-acquires it
            #before returning, so the fullness test is re-checked each pass
            while block and len(self._q) == self._m:
                self._l.wait()
            if len(self._q) < self._m:
                self._q.put(item)
                #wakes every waiter; coarse, but keeps it to one primitive
                self._l.notifyAll()
            else:
                raise Full
        finally:
            self._l.release()
    def put_nowait(self, item):
        self.put(item, 0)
    def get(self, block=1):
        self._l.acquire()
        try:
            while block and len(self._q) == 0:
                self._l.wait()
            if len(self._q) > 0:
                a = self._q.get()
                self._l.notifyAll()
                return a
            raise Empty
        finally:
            self._l.release()
    def get_nowait(self):
        return self.get(0)
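
It drops in where Queue.Queue would go; a quick usage sketch (the
sentinel shutdown is just one way to stop the worker):

import threading

q = CQueue(4)                   #bounded to 4 items, like Queue.Queue(4)

def worker():
    while 1:
        item = q.get()          #blocks until an item is available
        if item is None:        #sentinel: stop the worker
            break
        #... process item ...

t = threading.Thread(target=worker)
t.start()
for i in range(10):
    q.put(i)                    #blocks whenever 4 items are already pending
q.put(None)
t.join()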


> > ...
> > It all really depends on the overhead of the underlying mutexes or
> > conditionals, which can be tested.
> 
> I still don't know what "conditionals" means to you (you're counting "if"
> statements, perhaps?), but mucking with mutexes isn't cheap.  You don't want
> to bother with locks unless thread-safety is an important issue.

I agree completely, most locking mechanisms are slow as hell.  There are
some tricks that can be done with critical sections (or their equivalents
on other platforms); an article on multithreaded locking performance was
published by IBM at least a year ago.  I wonder whether any of those
performance-tuning techniques can be used with the underlying locks
in Python.
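
For a rough sense of the baseline cost, the uncontended acquire/release
overhead of a threading.Lock can be timed directly (a quick sketch;
absolute numbers vary widely by platform and build):

import threading
import time

lock = threading.Lock()
N = 1000000

start = time.time()
for i in xrange(N):
    lock.acquire()
    lock.release()
print "%d uncontended acquire/release pairs: %.3f seconds" % (
    N, time.time() - start)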

 - Josiah



