[Python-3000] Useless methods in Queue module

"Martin v. Löwis" martin at v.loewis.de
Tue Jan 15 06:54:20 CET 2008


>> This is a misleading name.  The number returned from qsize() [not size
>> ()] is perfectly reliable, in the sense that it is the exact size of
>> the queue at some instant in time.
> 
> No, you're wrong.

As Mike explains: he is right, you are wrong.

> In CPython's implementation, it's likely that qsize
> is implemented by locking the structure and reading a size field

    def qsize(self):
        """Return the approximate size of the queue (not reliable!)."""
        self.mutex.acquire()
        n = self._qsize()
        self.mutex.release()
        return n
    def _qsize(self):
        return len(self.queue)


> but there are other, more highly concurrent implementations, like Java's
> ConcurrentLinkedQueue
> (http://java.sun.com/javase/6/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html)
> [1], that have to traverse the structure to read its size.

Still, with the Queue.Queue API, such an implementation would hold the
queue lock while computing the size.

For the base class, Mike's statement would be true even without explicit
acquisition of the queue lock: the len() operation is atomic in the
sense that it yields some historical value of the deque object that is
being used:

static Py_ssize_t
deque_len(dequeobject *deque)
{
	return deque->len;
}

Most processors guarantee that a read operation of a Py_ssize_t
field is atomic, plus that read operation is protected by the GIL.

> Furthermore, it's misleading to think of even a locked
> implementation's result as reliable. Ignoring problems with
> simultaneity in the presence of threads, the "true" value could easily
> have changed by the time qsize returns. And if it returns a value
> that's not accurate, that can hardly be described as reliable. It
> really is returning an estimate of the size, suitable only for use in
> performance optimizations where you're prepared to handle it sometimes
> being wildly wrong.

Nonsense. In typical Queue applications, you either have multiple
readers and one writer, or multiple writers and one reader (*). For
the end of the queue that doesn't have contention, empty() resp.
full() are completely reliable (i.e. if you are the only reader,
and you see that empty() is False, you know for sure that it won't
suddenly become True).

(*) in many cases, you have exactly one reader and exactly one writer.

> Nevertheless, it would be a bad idea to assume such
> data structures will never be designed.

It also is a bad idea to make project assumptions into interfaces
which aren't actually part of the interface. That often leads to
bad code which elaborately deals with cases that can never occur.

Regards,
Martin



More information about the Python-3000 mailing list