Queue qsize = unreliable?
dave at pythonapocrypha.com
Mon Aug 9 19:56:20 CEST 2004
Ames Andreas (MPA/DF) wrote:
> Dave Brueck wrote:
>>But is it blocking in any way that you (the caller) will likely
>>notice? (probably not - the mutex is held for a relatively short time).
> How do I notice a blocking function?
> 1) As a programmer: I have to read the docs and unfortunately all too
> often the code.
What I meant was that the fact that a lock is being acquired in the
module implementation doesn't really affect your code (or its
performance in most cases) in practice.
> 2) As a "caller": I'm wasting time when blocking without a reason.
Not really because (1) the code would block only when more than one
thread is accessing the lock at the same time, (2) there is a fairly
small chance that more than one thread will be accessing that particular
lock at the same time (especially in the scenarios you described
earlier), and (3) even when there is contention for the lock it will be
over a very brief period of time.
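The claim that an uncontended acquire/release is cheap can be sanity-checked with a rough micro-benchmark (a sketch only - the loop and iteration count are illustrative, and absolute numbers depend on the machine and interpreter):

```python
import threading
import time

# Time the acquire/release pair of an *uncontended* lock - the common
# case inside Queue, since contention windows are tiny.
lock = threading.Lock()
n = 100_000

start = time.perf_counter()
for _ in range(n):
    lock.acquire()
    lock.release()
elapsed = time.perf_counter() - start

# Typically well under a microsecond per pair on modern hardware.
print("per acquire/release: %.0f ns" % (elapsed / n * 1e9))
```

On any reasonable machine the per-pair cost is far smaller than the cost of the surrounding Python bytecode, which is the point being made above.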
>>>With CPython's assertions about atomicity of certain list
>>>operations it's possible to implement a lock free queue that is
>>>even usable for the single producer or single consumer case.
>>If you have the single producer and/or single consumer case, you
>>don't need a Queue at all - just use a built-in list (or in 2.4 a
>>collections.deque).
> I think I didn't make my point clear enough. The most important word
> in the above paragraph was "usable", which means more to me than just
> thread-safe. I'd be interested in a usable (i.e. non-polling)
> implementation of a queue without any locks.
What problem has the current implementation caused in your application?
Or are you more concerned about what problem it may cause?
In any case, why don't you just do something like:

    return len(self.queue) == 0
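That one-liner can be packaged as a small subclass (a sketch against Python 3's `queue` module - the module was named `Queue` in the Python of this thread - and `FastEmptyQueue` is a name invented here, not a standard class):

```python
import queue  # spelled "Queue" in Python 2.x, the version in this thread


class FastEmptyQueue(queue.Queue):
    """Queue whose empty() skips the mutex entirely.

    The underlying container (self.queue) is a deque, and len() on it is
    atomic under CPython's GIL, so no lock is needed - but the answer is
    only a snapshot and may be stale by the time the caller acts on it.
    """

    def empty(self):
        return len(self.queue) == 0


q = FastEmptyQueue()
print(q.empty())   # True
q.put(1)
print(q.empty())   # False
```

The staleness caveat is the same one qsize() already carries, which is why the docs call these methods unreliable in the first place.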
> I just included the "else" branch to make clear that only one of
> empty's two possible outcomes is reliable. It makes much more sense
> and it is much closer to my specific application to write:
> while not q.empty():
> When leaving this loop I'll go back to sleep on select.
As I said earlier: if this is your use case you don't need a queue
object - just use a built-in list:
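A minimal sketch of that single-producer/single-consumer pattern, relying on the CPython guarantee that list.append and list.pop(0) are each atomic under the GIL (the names and the Event-based shutdown signal here are illustrative; a real select()-driven loop would drain the list once per wakeup instead of spinning):

```python
import threading

# Plain list used as a lock-free SPSC queue: one thread appends,
# the other pops, and neither operation needs an explicit lock.
work = []
done = threading.Event()


def producer():
    for i in range(100):
        work.append(i)   # atomic under CPython's GIL
    done.set()


threading.Thread(target=producer).start()

results = []
while not done.is_set() or work:
    while work:                      # drain everything currently visible
        results.append(work.pop(0))  # also atomic; only this thread pops

print(len(results))  # 100
```

Note the consumer above busy-checks, which is exactly the polling you want to avoid; the point is only that the list itself needs no lock when there is one producer and one consumer.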
> As to the "extremely short amount of time (relatively speaking)": I'm
> still very impressed by Dan Kegel's c10k paper.
Yes, it is a good paper. BTW - are you really using select() as you
mentioned above, or is it some special implementation of select? (Because
if it's just an ordinary select then you're not going to be able to
support that many connections anyway, and even if you could, using
something else instead of select would give you much greater performance
improvements than eliminating the Queue lock usage.)
> And following that
> one I'm not so much concerned about the amount of time that a producer
> may hold the lock (that time may very well be ignorable), when the
> single consumer tries to acquire it, but much more about the fact that
> it potentially needs at least *two* context switches per iteration
> until the one thread that feeds all others and thus that all others
> depend upon gets back to work.
So is this actually causing you a problem in practice, or is this closer
to some premature optimization? No offense intended - I ask only because
it seems very unlikely that this is going to be one of your major
bottlenecks.
Yes, you could end up with multiple context switches *in the worst
case*, but you have to also consider (1) how likely the worst case
scenario is to come up and (2) how that stacks up against other
bottlenecks in terms of performance.
If you're using the queues to pass around socket-level work (reading and
writing bytes) then I can tell you right now that you're not going to
get anywhere near the 10k connection limit with that approach on
standard CPython - the overhead of the Queue itself is too high - so
it'd be in your best interest to profile and work on the actual
bottlenecks first. If you're instead using the queues to pass around
application-level work ("run this DB query", "send this response to the
client", etc), then it's highly unlikely that the mutex acquisition in
Queue.get(False) will have any measurable impact on overall application
performance - so again the best bet is to profile to find the current,
real hotspots in the code.
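Profiling that kind of question takes only a few lines with the standard library profiler (a sketch - `hot_path` is a stand-in for whatever code you suspect, and the Queue round-trip here is just an example workload):

```python
import cProfile
import io
import pstats
import queue


def hot_path():
    # Stand-in workload: 10k put/get round-trips through a Queue,
    # including all of its internal lock traffic.
    q = queue.Queue()
    for i in range(10_000):
        q.put(i)
        q.get()


# Profile the suspected hotspot *before* optimizing the lock away.
pr = cProfile.Profile()
pr.enable()
hot_path()
pr.disable()

s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(5)
print(s.getvalue())
```

If the lock acquisition doesn't show up near the top of the cumulative listing, it isn't the bottleneck.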
FWIW, Python _is_ a pretty good platform for developing relatively
high-performance servers. I've found that, with not too much work, it's
easy to have a clean implementation and still be on par with (or even
beat) something like Apache in terms of total throughput,
connection response times, etc. BUT, if you want to approach very large
numbers of connections you may want to base your implementation on
Stackless Python and avoid normal Python threads (and Queues, and...)