[Python-Dev] Pythonic concurrency

Thu Sep 29 22:06:34 CEST 2005

> I'd like to restart this discussion; I didn't mean to put forth active
> objects as "the" solution, only that it seems to be one of the better,
> more OO solutions that I've seen so far.

Thanks for doing this.  I think this is an issue that is going to be 
more and more important as Python continues to gain momentum, and I 
would love to see a PEP come out of this discussion.

> What I'd really like to figure out is the "pythonic" solution for
> concurrency. Guido and I got as far as agreeing that it wasn't
> threads.

Evaluating the situation is often confusing for Python beginners 
because:

    1. You won't often find threads being the recommended solution
       for concurrency in Python.  There is some disagreement on
       this point, as the recent thread on the GIL reveals, but I
       think the point stands.

    2. Multi-processing is often brought up as a good alternative to
       threading.

    3. There are a decent number of built-in high-level abstractions
       for threaded programming in Python (Queues, Thread objects,
       Lock objects, etc.) and plenty of documentation too.  These
       abstractions also make it relatively straightforward to make
       your code work on multiple platforms.

    4. The only support for multi-processing is quite low-level, and
       can be single platform.  Forking isn't really an option on
       Windows, and neither are POSIX-style named pipes.  Shared
       memory?  Forget it!  Sockets could be used, but that's a bit
       low-level.
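
To make point 3 concrete, here is a minimal sketch of the threaded
style using only Thread and Queue objects.  It assumes the Python 3
module name "queue" (the same module is spelled "Queue" in Python 2),
and worker() and run_squares() are names made up for this example, not
anything from the standard library.

```python
import queue
import threading


def worker(inbox, outbox):
    """Square each number taken from inbox until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: no more work is coming
            break
        outbox.put(item * item)


def run_squares(numbers):
    """Hypothetical helper: feed numbers to one worker thread, collect results."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)               # tell the worker to stop
    thread.join()
    # A single worker handles items in order, so results come back in order.
    return [outbox.get() for _ in numbers]
```

The same few lines run unchanged on every platform Python supports,
which is exactly what is missing on the multi-processing side.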

It's really a shame.  There seems to be some consensus about 
multi-processing, but not a whole lot of interest in making it easier 
out of the box.  When it comes to multi-processing, batteries really 
_aren't_ included.  Sure, you have lead dioxide and some sulphuric 
acid, but you have to put them together to make your battery.  This 
isn't the end of the world, but I find it tedious, and I am sure it 
confuses and frustrates people new to Python.

> Here are my own criteria for what such a solution would look like:
>
> 1) It works by default, so that novices can use it without falling
> into the deep well of threading. That is, a program that you write
> using threading is broken by default, and the tool you have to fix it
> is "inspection." I want something that allows me to say "this is a
> task. Go." and have it work without the python programmer having to
> study and understand several tomes on the subject.
>
> 2) Tasks can be automatically distributed among processors, so it
> solves the problems of (a) making python run faster (b) how to utilize
> multiprocessor systems.
>
> 3) Tasks are cheap enough that I can make thousands of them, to solve
> modeling problems (in which I also lump games). This is really a
> solution to a certain type of program complexity -- if I can just
> assign a task to each logical modeling unit, it makes such a system
> much easier to program.
>
> 4) Tasks are "self-guarding," so they prevent other tasks from
> interfering with them. The only way tasks can communicate with each
> other is through some kind of formal mechanism (something queue-ish,
> I'd imagine).
>
> 5) Deadlock is prevented by default. I suspect livelock could still
> happen; I don't know if it's possible to eliminate that.
>
> 6) It's natural to make an object that is actor-ish. That is, this
> concurrency approach works intuitively with objects.
>
> 7) Complexity should be eliminated as much as possible. If it requires
> greater limitations on what you can do in exchange for a clear,
> simple, and safe programming model, that sounds pythonic to me. The
> way I see it, if we can't easily use tasks without getting into
> trouble, people won't use them. But if we have a model that allows
> people to (for example) make easy use of multiple processors, they
> will use that approach and the (possible) extra overhead that you pay
> for the simplicity will be absorbed by the extra CPUs.
>
> 8) It should not exclude the possibility of mobile tasks/active
> objects, ideally with something relatively straightforward such as
> Linda-style tuple spaces.

This all sounds pretty brilliant to me, although even a small subset of 
what you define above would be totally adequate, I think.  To me it 
breaks down simply into:

     1. The ability to spawn/create "tasks" (which are really processes)
        easily, where tasks are isolated from each other.

     2. The ability to send a message into a task.  A formal queueing
        mechanism would be nice, but the simple ability to send messages
        is enough to roll your own queueing.

Preventing deadlock is hard.  Mobile tasks / active objects?  Getting 
way ahead of ourselves, I think.  Really, I just want those two things 
above, and I want it as a standard part of Python.
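
A rough sketch of what those two items could look like, built only on
the subprocess module.  Everything here is an assumption made for
illustration: the Task class and its send/receive/close methods are
hypothetical names, and the wire format (one JSON document per line
over the child's stdin/stdout) is just one simple choice, not a
proposal for the actual API.

```python
import json
import subprocess
import sys

# Source for the child "task": read one JSON message per line on stdin
# and answer with one JSON message per line on stdout.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    reply = {"echo": msg, "length": len(msg)}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class Task:
    """Hypothetical wrapper: one isolated OS process plus a message pipe."""

    def __init__(self):
        # Item 1: spawn an isolated task (a separate interpreter process).
        self.proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )

    def send(self, message):
        # Item 2: send a message into the task.
        self.proc.stdin.write(json.dumps(message) + "\n")
        self.proc.stdin.flush()

    def receive(self):
        # Block until the task writes one reply line.
        return json.loads(self.proc.stdout.readline())

    def close(self):
        self.proc.stdin.close()       # EOF ends the child's read loop
        code = self.proc.wait()
        self.proc.stdout.close()
        return code
```

Usage would be along the lines of task = Task(); task.send("hello");
task.receive().  Pipes to a child process work on both Unix and
Windows, but this is still the "assemble your own battery" level:
there is no queueing, no error handling, and the child can only see
what gets serialized into a message.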

> One thing that occurs to me is that a number of items on this wish
> list may conflict with each other, which may require a different way
> of thinking about the problem. For example, it may require two
> approaches: for "ordinary" non-OO tasks, a functional programming
> approach a la Erlang, in combination with an actor approach for
> objects.

I am not sure this is the case.  I don't really think of concurrency in 
terms of objects or good "object-oriented" design all that much.  I 
think of concurrency in terms of processes.  Is this bad?  I don't 
think so, really.  The OS knows nothing of objects, only of threads and 
processes (or only of processes).  If we had just the two items I 
mentioned above, it would be enough for people to build their own 
higher-level abstractions on top of the solid built-in support.

I haven't really done the research here, and the TSM paper that was 
sent earlier makes my brain hurt.  It seems like it's way more than is 
necessary to solve the problem.  If we just accomplished the two things 
I stated above, I for one would be happy.

Am I alone here?

  -- Jonathan

--
http://cleverdevil.org