A matter of queues, tasks and multiprocessing
manu3d at gmail.com
Wed Nov 10 18:25:45 CET 2010
I've tried to come up with this message for a couple of weeks now and
it doesn't look like I'm getting any clearer in my thoughts so I
decided that it's probably best to take the plunge and ask you guys to
kindly throw me a rope...
What I'm trying to come up with is some kind of simple, dynamically
scalable, domain-agnostic, multi-processor capable, tasking system.
Maybe I'm already giving myself too many constraints here, i.e. it
might all be possible except... the "simple" part.
Clearly a good (the?) starting point is to think in terms of tasks,
queues and worker threads. Once synchronization issues are properly
attended this methodology feels, at least in theory, neat and
intuitive. The problem is of course to put in practice.
For example in my context I can think of tasks with very heterogeneous
computing requirements, i.e. quick tasks such as UI interactions, slow
tasks such as I/O with large files, and pretty much everything in-
between. In this context, and given that I cannot know a priory a
task's execution time, how can I create a system that is responsive on
things like UI, takes its time on things such as I/O but gives the
application the opportunity to further customize things? I.e. a user
might want to limit the number of processors/core used to a subset of
those available. Alternatively an application might want to create
more threads than the optimal number simply because interleaving the
executions of a few tasks is preferable to simply queuing them. And
all this might have to happen at runtime.
Currently on one end of the spectrum I'm considering a purely
functional approach, with one queue for UI-related tasks, one for I/O-
related tasks and one for everything else. On a single processor/core
machine they'd simply be serviced by one thread each, all sharing one
processor. On a two processors/cores machine UI and I/O queues might
be serviced by two separate threads on the same processor while
everything else is serviced by a worker thread on the other processor.
As the number of cores/processors increases more threads/processor
would be attached to each queue, but how to establish what should get
more resources first?
Alternatively I've read about queues with threads that "steal" from
other queues. But that seems to assumes more or less homogeneous task
execution or applications where responsiveness might drop to zero,
i.e. because slow I/O tasks are being attended by all available
threads/processors. In these cases, should the UI-related thread/queue
never ever steal from the other queues, to maintain responsiveness at
all times, effectively creating an hybrid approach with one purely
functional thread/queue and (potentially) many other self-balancing
Thoughts, hints, tips, tricks, directions, reading material of any
kind will all be appreciated! =)
More information about the Python-list