[Python-Dev] Pythonic concurrency
jonathan-lists at cleverdevil.org
Thu Sep 29 22:06:34 CEST 2005
> I'd like to restart this discussion; I didn't mean to put forth active
> objects as "the" solution, only that it seems to be one of the better,
> more OO solutions that I've seen so far.
Thanks for doing this. I think this is an issue that is going to be
more and more important as Python continues to gain momentum, and I
would love to see a PEP come out of this discussion.
> What I'd really like to figure out is the "pythonic" solution for
> concurrency. Guido and I got as far as agreeing that it wasn't
Evaluating the situation is often confusing for Python beginners:
1. You won't often find threads being the recommended solution
for concurrency in Python. There is some disagreement on
this point, as the recent thread on the GIL reveals, but I
think the point stands.
2. Multi-processing is often brought up as a good alternative to
threads.
3. There are a decent number of built-in high level abstractions
for threaded programming in Python (Queues, Thread objects,
Lock objects, etc.) and plenty of documentation too. These
abstractions also make it relatively straightforward to make
your code work on multiple platforms.
4. The only support for multi-processing is quite low-level, and
can be single platform. Forking isn't really an option on
Windows, and neither are named pipes. Shared memory? Forget
it! Sockets could be used, but that's a bit low-level.
It's really a shame. There seems to be some consensus about
multi-processing, but not a whole lot of interest in making it easier
out of the box. When it comes to multi-processing, batteries really
_aren't_ included. Sure, you have lead dioxide and some sulphuric
acid, but you have to put them together to make your battery. This
isn't the end of the world, but I find it tedious, and I am sure it
confuses and frustrates people new to Python.
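
For contrast, here is roughly what the threading "batteries" from point 3 look like in use. This is only a sketch: the worker function and the squaring job are made up for illustration, and it uses the Python 3 module name `queue` (the module was spelled `Queue` in the Python of this message's era).

```python
import queue
import threading


def worker(inbox, outbox):
    """Hypothetical worker: square each number until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)


def square_all(numbers):
    """Feed numbers to a single worker thread and collect the results."""
    inbox = queue.Queue()
    outbox = queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)  # sentinel: tell the worker to stop
    thread.join()
    # One worker means results come back in submission order.
    return [outbox.get() for _ in numbers]


if __name__ == "__main__":
    print(square_all([1, 2, 3]))
```

The two queues are the only shared state, which is what keeps this small example safe. Nothing comparable exists out of the box for processes.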
> Here are my own criteria for what such a solution would look like:
> 1) It works by default, so that novices can use it without falling
> into the deep well of threading. That is, a program that you write
> using threading is broken by default, and the tool you have to fix it
> is "inspection." I want something that allows me to say "this is a
> task. Go." and have it work without the python programmer having to
> study and understand several tomes on the subject.
> 2) Tasks can be automatically distributed among processors, so it
> solves the problems of (a) making python run faster (b) how to utilize
> multiprocessor systems.
> 3) Tasks are cheap enough that I can make thousands of them, to solve
> modeling problems (in which I also lump games). This is really a
> solution to a certain type of program complexity -- if I can just
> assign a task to each logical modeling unit, it makes such a system
> much easier to program.
> 4) Tasks are "self-guarding," so they prevent other tasks from
> interfering with them. The only way tasks can communicate with each
> other is through some kind of formal mechanism (something queue-ish,
> I'd imagine).
> 5) Deadlock is prevented by default. I suspect livelock could still
> happen; I don't know if it's possible to eliminate that.
> 6) It's natural to make an object that is actor-ish. That is, this
> concurrency approach works intuitively with objects.
> 7) Complexity should be eliminated as much as possible. If it requires
> greater limitations on what you can do in exchange for a clear,
> simple, and safe programming model, that sounds pythonic to me. The
> way I see it, if we can't easily use tasks without getting into
> trouble, people won't use them. But if we have a model that allows
> people to (for example) make easy use of multiple processors, they
> will use that approach and the (possible) extra overhead that you pay
> for the simplicity will be absorbed by the extra CPUs.
> 8) It should not exclude the possibility of mobile tasks/active
> objects, ideally with something relatively straightforward such as
> Linda-style tuple spaces.
This all sounds pretty brilliant to me, although even a small subset of
what you define above would be totally adequate I think. To me it
breaks down simply into:
1. The ability to spawn/create "tasks" (which are really processes)
easily, where tasks are isolated from each other.
2. The ability to send a message into a task. A formal queueing
mechanism would be nice, but the simple ability to send messages
is quite enough to roll your own queueing.
Preventing deadlock is hard. Mobile tasks / active objects? Getting
way ahead of ourselves, I think. Really, I just want those two things
above, and I want it as a standard part of Python.
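
To make the two items concrete, here is a rough sketch of what they can look like when hand-rolled from the `subprocess` module and a pair of pipes. The function names (`spawn_task`, `send_message`, `stop_task`) and the line-per-message protocol are assumptions made up for this illustration, not an existing or proposed API.

```python
import subprocess
import sys

# Hypothetical task body: read one message per line, reply in upper case.
CHILD_SOURCE = r"""
import sys
for line in sys.stdin:
    sys.stdout.write(line.upper())
    sys.stdout.flush()
"""


def spawn_task():
    """Item 1: start an isolated child interpreter connected by pipes."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def send_message(task, message):
    """Item 2: send one line into the task and return its one-line reply."""
    task.stdin.write(message + "\n")
    task.stdin.flush()
    return task.stdout.readline().rstrip("\n")


def stop_task(task):
    """Close the pipe so the child's read loop ends, then reap the child."""
    task.stdin.close()
    code = task.wait()
    task.stdout.close()
    return code


if __name__ == "__main__":
    task = spawn_task()
    print(send_message(task, "hello"))
    stop_task(task)
```

It works, but everything about the message format and the shutdown handshake had to be invented on the spot, which is the tedium described above.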
> One thing that occurs to me is that a number of items on this wish
> list may conflict with each other, which may require a different way
> of thinking about the problem. For example, it may require two
> approaches: for "ordinary" non-OO tasks, a functional programming
> approach à la Erlang, in combination with an actor approach for
I am not sure this is the case. I don't really think of concurrency in
terms of objects or good "object-oriented" design all that much. I
think of concurrency in terms of processes. Is this bad? I don't
think so, really. The OS knows nothing of objects, only of threads and
processes (or only of processes). If we had just the two items I
mentioned above, it would be enough for people to build their own
higher level abstractions on top of the solid built-in support.
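
As a rough illustration of layering an abstraction on those two primitives, the sketch below wraps a child process in an object whose state lives only in the child. The class name `CounterTask`, its methods, and the JSON-per-line protocol are all invented for this example.

```python
import json
import subprocess
import sys

# Hypothetical child loop: one JSON request per line in, one JSON reply out.
_CHILD_SOURCE = r"""
import json, sys
total = 0
for line in sys.stdin:
    request = json.loads(line)
    if request["op"] == "add":
        total += request["value"]
    sys.stdout.write(json.dumps({"total": total}) + "\n")
    sys.stdout.flush()
"""


class CounterTask:
    """Actor-ish wrapper: the running total is held by the child process."""

    def __enter__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", _CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        return self

    def __exit__(self, *exc_info):
        self._proc.stdin.close()  # EOF ends the child's loop
        self._proc.wait()
        self._proc.stdout.close()
        return False

    def _call(self, request):
        # Every method call is just a message sent into the task.
        self._proc.stdin.write(json.dumps(request) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def add(self, value):
        return self._call({"op": "add", "value": value})["total"]

    def total(self):
        return self._call({"op": "get"})["total"]


if __name__ == "__main__":
    with CounterTask() as counter:
        counter.add(2)
        print(counter.add(40))
```

The caller never touches the counter directly, so there is nothing to lock; the only way in is a message.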
I haven't really done the research here, and the TSM paper that was
sent earlier makes my brain hurt. It seems like it's way more than is
necessary to solve the problem. If we just accomplished the two things
I stated above, I for one would be happy.
Am I alone here?