Adding a Par construct to Python?
jeremy at martinfamily.freeserve.co.uk
Tue May 19 06:57:43 EDT 2009
On 19 May, 00:32, Steven D'Aprano <st... at REMOVE-THIS-
> On Mon, 18 May 2009 02:27:06 -0700, jeremy wrote:
> > However I *do* actually want to add syntax to the language. I think that
> > 'par' makes sense as an official Python construct - we already have had
> > this in the Occam programming language for twenty-five years. The reason
> > for this is ease of use. I would like to make it easy for amateur
> > programmers to exploit natural parallelism in their algorithms. For
> > instance somebody who wishes to calculate a property of each member from
> > a list of chemical structures using the Python Daylight interface: with
> > my suggestion they could potentially get a massive speed up just by
> > changing 'for' to 'par' or 'map' to 'pmap'. (Or map with a parallel
> > keyword argument set as suggested). At present they would have to
> > manually chop up their work and run it as multiple processes in order to
> > achieve the same - fine for expert programmers but not reasonable for
> > people working in other domains who wish to use Python as a utility
> > because of its fantastic productivity and ease of use.
> There seems to be some discrepancy between this, and what you wrote in
> your first post:
> "There would be no locking and it would be the programmer's
> responsibility to ensure that the loop was truly parallel and correct."
> So on the one hand, you want par to be utterly simple-minded and to do no
> locking. On the other hand you want it so simple to use that amateurs can
> mechanically replace 'for' with 'par' in their code and everything will
> Just Work, no effort or thought required. But those two desires are
> inconsistent.
> Concurrency is an inherently complicated problem: deadlocks and race
> conditions abound, and are notoriously hard to reproduce, let alone
> debug. If par is simple, and does no locking, then the programmer needs
> to worry about those complications. If you want programmers to ignore
> those complications, either (1) par needs to be very complicated and
> smart, to do the Right Thing in every case, or (2) you're satisfied if
> par produces buggy code when used in the fashion you recommend.
> The third option is, make par really simple and put responsibility on the
> user to write code which is concurrent. I think that's the right
> solution, but it means a simplistic "replace `for` with `par` and your
> code will run faster" will not work. It might run faster three times out
> of five, but the other two times it will hang in a deadlock, or produce
> incorrect results, or both.
> you want it so simple to use that amateurs can mechanically replace 'for' with 'par' in their
> code and everything will Just Work, no effort or thought required.
Yes I do want the par construction to be simple, but of course you
can't just replace a for loop with a par loop in the general case.
This issue arises when people use OpenMP: you can take a correct piece
of code, add a comment to indicate that a loop is 'parallel', and if
you get it wrong the code will no longer work correctly. With my 'par'
construct the programmer's intention is made explicit in the code,
rather than by a compiler directive, and so I think that is clearer.
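
To illustrate the difference between a loop that can safely become parallel and one that cannot, here is a minimal sketch in present-day Python. The name `pmap` is hypothetical (it is not an existing builtin or the proposed syntax), and it is built on a thread pool purely for illustration; in CPython a CPU-bound function would need processes rather than threads to see a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def pmap(func, items, workers=4):
    """Hypothetical 'pmap': apply func to every item using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(func, items))


def square(x):
    return x * x


data = list(range(10))

# Independent iterations: no iteration reads anything another one writes,
# so the parallel version gives the same answer as the serial one.
serial = [square(x) for x in data]
parallel = pmap(square, data)


def running_totals(items):
    """Loop-carried dependency: each step needs the previous total, so the
    iterations cannot simply be handed out to separate workers."""
    totals, acc = [], 0
    for x in items:
        acc += x
        totals.append(acc)
    return totals
```

The first loop is the kind where swapping 'for' for 'par' would be safe; the second is the kind where the programmer has to notice the dependency and restructure the algorithm first.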
As I wrote before, concurrency is one of the hardest things for
professional programmers to grasp. For 'amateur' programmers we need
to make it as simple as possible, and I think that a parallel loop
construction and the dangers that lurk within would be reasonably
straightforward to explain: there are no locks to worry about, no
message passing. The only advanced concept is the 'sync' keyword,
which would be used to rendezvous all the threads. That would only be
used to speed up certain codes in order to avoid having to repeatedly
shut down and start up gangs of threads.
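
The rendezvous idea can be approximated today with `threading.Barrier` from the standard library. The sketch below is only an analogy for what 'sync' might do, not a definition of it; the worker function, the phase count, and the data layout are all assumptions made up for the example. The same gang of threads is kept alive across several phases instead of being shut down and restarted for each one.

```python
import threading

N_WORKERS = 4
N_PHASES = 3

barrier = threading.Barrier(N_WORKERS)
partial = [0] * N_WORKERS  # each worker writes only its own slot
totals = []                # appended to by exactly one worker per phase


def worker(idx):
    for phase in range(N_PHASES):
        # Independent work for this phase: no shared writes, so no locks.
        partial[idx] = (phase + 1) * (idx + 1)

        # Rendezvous: nobody continues until every worker has finished
        # the phase. wait() hands each thread a distinct index, so the
        # thread that receives 0 combines the results on behalf of all.
        if barrier.wait() == 0:
            totals.append(sum(partial))

        # Second rendezvous: keep the others from overwriting 'partial'
        # with next-phase values before the sum has been taken.
        barrier.wait()


threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Forgetting the second rendezvous is exactly the sort of danger that would have to be explained to newcomers: the code would still run, but the totals could occasionally mix values from two phases.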