PEP for nondeterminism.

Adam DePrince adam.deprince at gmail.com
Sat Mar 25 06:06:08 CET 2006


I frequently find myself in a situation where I wish I could say "the
execution order of these two lines just doesn't matter."  

Proposed here are two freshly drafted PEP's advocating for the addition
of nondeterminism as a spring board for future automatic parallelism.  

The first calls for core language changes, the second calls for a minor
(and likely symbolic) change to list comprehension's semantics to better
support future parallelism.



PEP: XXX
Title: Nondeterminism
Version: $Revision$
Last-Modified: $Date$
Author: Adam DePrince <adam.deprince at gmail.com>
Status: Draft
Python-Version: 3.0
Type: Standards
Content-Type: text/plain
Created: 24-Mar-2006
Post-History:


Abstract

    Proposed herein is are two new control flow structures that allow
    the explicit specification of parallelism by defining the order of
    expression evaluation as non-deterministic, thus removing the
    requirement of sequential execution.
    

Background

    A number of mechanisms have been proposed in Python that have the
    side effect of permitting the automatic discovery of parallelism.
    PEP 288 presents the notion of a generator, an object that
    inherently lends itself to concurrent operation and possibility
    automatic self organization of a pipeline.

    Stackless Python in PEP 219/220 presents a mechanism by which a
    large number micro threads could be efficiently supported, thus
    allowing for a realistic manifestation of a deep generator based
    software pipeline.

    Additionally, this PEP's sibling, XXX, proposes that the
    execution order of list comprehensions be made non-deterministic
    to permit future parallel implementations.



DISCUSSION

    Two new control flow structures are proposed.  While as of yet no
    name is proposed, for the sake of brevity we will refer to them as
    any and first.

    The any structure would call for the execution of all expressions
    within the any block in an non-determinate order.  The user would
    warrant that no side effects exist between the expressions, and
    assigned to the variable the any block would be a list
    representing the return values of each of the expressions in the
    order they appear in the code (but not necessarily in the order of
    execution.)

    The first structure would execute all enclosed expressions
    concurrently, again, the order of execution being
    non-deterministic, other than the promise that all would be
    started.  The return value of the first expression to finish would
    be assigned to the variable associated with the first block and
    the execution of the other operations terminated.

    

SPECIFICATION

    The use of the word 'any' and 'first' is only a convenience.  I do
    not yet propose a keyword for these structures.

    ...
    any retval:
        1
        2+3
        my_function_that_returns_17()
    ...

    retval would be assigned [1, 5, 17] Only the order of evaluation
    is not deterministic.

    first retval:
         urllib.open( 'mirror1' ).read()
         urllib.open( 'mirror2' ).read()
         urllib.open( 'mirror3' ).read()
    ....

    retval would contain the data from the fastest mirror, the other
    threads would be disposed of.



IMPLEMENTATION
    TBD - Incomplete


REFERENCES
    TBD
        

COPYRIGHT

    This document has been placed in the public domain.






PEP: XXX
Title: Nondeterministic List comprehensions
Version: $Revision$
Last-Modified: $Date$
Author: Adam DePrince <adam.deprince at gmail.com>
Status: Draft
Python-Version: 3.0
Type: Standards
Content-Type: text/plain
Created: 24-Mar-2006
Post-History:


Abstract

    The semantics of the list comprehension offer the future benefit
    of multi-threading and parallel execution, assuming that the
    programmer treat the order of evaluation as non-deterministic and
    avoids the introduction of dependencies and side-effects.

Background

    A number of mechanisms have been proposed in Python that have the
    side effect of permitting the automatic discovery of parallelism.
    PEP 288 presents the notion of a generator, an object that
    inherently lends itself to concurrent operation and possibility
    of automatic self organization of pipelined, and thus parallel,
    operation.  

    Stackless Python in PEP 219/220 presents a mechanism by which a
    large number micro threads could be efficiently supported, thus
    allowing for a realistic manifestation of a deep generator based
    software pipeline.


DISCUSSION

    Generators and generator comprehension are often regarded as
    superior to list comprehension due to the reduced footprint of
    in-flight data and the future potential of vertical parallelism via
    pipelining.

    List comprehensions, while having the drawback of requiring the
    presence of the entire working set in-flight, places no inherent
    restriction on the order of execution.  This permits us to
    distribute the execution effort horizontally across multiple
    micro-threads, with a far higher potential for parallelism than
    generators at the expense of requiring the user warrant the
    absence of side effects of dependencies between each passing of
    the list comprehension loop.

    This minor semantic requirement that list comprehension possess
    no interdependencies or side effects permits us to operate on the
    list in an arbitrary order, including dispatching the execution
    across multiple threads.

SPECIFICATION

    We change the semantics of a the list comprehension to
    explicitly state that the order of evaluation is
    non-deterministic, and that there should exist no dependencies or
    side effects.  This leaves open the potential for future
    parallelization.


PROBLEMS

    There are a currently a number of problems.  Generators provide an
    intuitive and natural partitioning of the problem.  The future
    partitioning of a list comprehension is a more capricious;
    generally, we choose a fixed number of threads to dovetail across,
    but little guidance can be gleaned from the use.

    The second drawback is the current state of python.  With the
    global interpreter lock firmly in place, we have no compelling
    argument by which to parallelize our list comprehension, even if
    they were partitioned.  Currently the creation of said
    micro-threads and the dovetailing of the workload would actually
    increase the execution time in the current 2.4 implementation of
    Python.  Generators, even with the current lack of real
    parallelism to the resulting pipeline, benefit from the drastic
    reduction in the amount of in-flight data during their execution.

    The third problem is in the near term, this PEP calls for no
    implementation change, its only a request that users regard the
    execution order as non-deterministic.  There is a reasonable change
    that this would be ignored.

    Lastly, there has been discussion of recasting list comprehension
    as generator comprehensions as an easy of normalizing the
    semantics of the iterator variable with respect to the callee's
    name-space.  Such an effort would assign an explicit order of
    execution to the list comprehension.


REFERENCES
    TBD
        

COPYRIGHT

    This document has been placed in the public domain.







More information about the Python-list mailing list