pre-PEP: Standard Microthreading Pattern

Paul McGuire ptmcg at austin.rr.com
Tue May 1 17:31:31 EDT 2007


On May 1, 3:58 pm, dus... at v.igoro.us wrote:
> I've been hacking away on this PEP for a while, and there has been some
> related discussion on python-dev that went into the PEP:
>
>  http://mail.python.org/pipermail/python-dev/2007-February/070921.html
>  http://mail.python.org/pipermail/python-dev/2007-February/071155.html
>  http://mail.python.org/pipermail/python-dev/2007-February/071181.html
>
> I'd love to have feedback on this PEP:
>  - from a user's perspective (would you want to write microthreaded code?)
>  - from an implementer's perspective (and note that I have a reference
>    implementation that's just getting packaged up right now).
>  - from a technical perspective (efficiency, compatibility, etc.)
>  - regarding the two unresolved issues in the text
>
> And now, without further ado:
>
> PEP: XXX
> Title: Standard Microthreading Pattern
> Version: $Revision$
> Last-Modified: $Date$
> Author: Dustin J. Mitchell <dus... at cs.uchicago.edu>
> Status: Draft
> Type: Informational
> Content-Type: text/x-rst
> Created: 17-Feb-2007
> Post-History:
>
> Abstract
> ========
>
> The extended support for generators introduced in PEP 342 [2]_
> facilitated a natural way of writing generators that cooperatively
> yield control to other functions, allowing multitasking within
> a single OS thread.  Such multitasking requires a supporting
> environment (usually in the form of a scheduler), and several such
> implementations are in active use.  Each has slightly different
> structural requirements for the task-implementing code.
>
> This PEP proposes a single, consistent pattern for such code.
> This consistency will enable microthreaded code to run in a variety
> of environments, just as threaded code can currently run atop a
> variety of OS-level threading libraries.  Compliant libraries,
> e.g., a microthreaded urllib, could then be developed, providing
> batteries-included functionality for microthreaded software.
>
> Definitions
> ===========
>
> Within this PEP, "microthreading" is defined as interleaving
> the execution of multiple independent codepaths by cooperatively
> yielding control periodically in each codepath; a "microthread",
> then, is one such codepath.  This technique and variations on it
> have also been referred to as "weightless threads", "tasklets", and
> "greenlets".  It is a specilization of the more general event-based
> multitasking model.
>
> Microthreads are conceptually distinct from coroutines.  A coroutine
> can schedule a specific coroutine to be executed when it yields
> control, even passing a value to that coroutine.  Microthreads,
> however, simply yield to facilitate multitasking, with no knowledge
> or control of the next codepath to be executed.
>
> This PEP addresses two audiences: authors of microthreaded code
> (users) and designers of microthreaded environments (implementations).
>
> Motivation
> ==========
>
> Application developers usually adopt an event-based architecture in
> order to exploit asynchronous IO facilities.  Asynchronous IO is
> ideal when the overhead of an OS-level thread for each concurrent
> IO operation is too high.
>
> An event-based architecture is also useful when an application must
> perform lightweight multitasking -- for example, an email client
> must both wait for user input and monitor mailboxes for new mail.
> Implementing this sort of multitasking using threads requires careful
> use of synchronization primitives throughout the application to
> ensure correctness.  Using common event-based techniques requires
> storing state on the heap and implementing callbacks for all state
> transitions, which can quickly become complex.
>
> A microthreaded architecture brings the best of both worlds: it
> allows users to store state in local varables, just as they would
> in a threaded implementation, while avoiding the need for complex
> synchronization primitives.  An email client, for example, can
> implement the complex state of an IMAP client in a natural fashion,
> and reflect that state using natural Python data structures with no
> need for locking.
>
> Several well-developed implementations of cooperative multitasking
> are currently available, some of which implement microthreading.
> However, each implementation approaches the subject differently,
> and it is virtually impossible to share code between implementations.
>
> This PEP is intended to bring some measure of consistency to
> microthreaded code, to facilitate development of libraries and
> implementation-agnostic code repositories to support microthreaded
> development.
>
> Rationale
> =========
>
> Since the introduction of generators in PEP 255 [3]_ and their
> extension to support coroutines in PEP 342 [2]_, generators have been
> used to implement various forms of microthreading [1]_:
>
> - Example 3 in PEP 342 implements a simple trampoline scheduler.
>
> - `Kiwi tasklets`_ (formerly GTasklets) use pre-PEP 342 generators
>   along with an explicit post-yield function call to retrieve any
>   incoming value or exception.
>
> - `Twisted Python`_ includes an ``inlineCallbacks`` decorator
>   [#inlineCallbacks]_ which allows a generator to yield ``Deferred``
>   objects; callback results are then re-injected with ``send()``,
>   while errbacks result in a ``throw()``.
>
> - `Kamaelia`_ includes Axon, a library supporting microprocesses which
>   multitask by yielding frequently, and interact through a
>   well-developed IPC mechanism.
>
> - David Mertz's "Charming Python: Implementing "weightless threads"
>   with Python generators" [#Mertz]_ includes a basic microthreading
>   scheduler and suggests directions for improvement.
>
> - An ASPN recipe by Maciej Obarski [#Obarski]_ gives a simple
>   scheduler for task objects represnted by generators.
>
> Each of these implementations have specific requirements of the
> microthreaded code.  The requirements differ enough that code written
> for one environment will generally not function properly in another.
>
> A common pattern for all microthreaded code will allow users to write
> microthreaded code that will be portable to multiple microthreading
> implementations.  This will include libraries, which can then be
> shared more broadly.
>
> The specification in this PEP specifically avoids reference
> to any symbols not included in the built-in namespace, so
> environment-agnostic microthreaded code need not import any
> environment-specific modules.
>
> Implementation Specification
> ============================
>
> An implementation is responsible for executing a body of code written
> by its users according to the microthreading pattern.
>
> Like a standard python thread, execution of a microthread begins with
> a call to a single function, and ends when that function completes.
> The function may call other functions, and so on; the entire control
> flow constitutes a microthread.  Completely unrelated control flows
> may be occurring simultaneously in other microthreads.
>
> Within a microthreading implementation, all microthreads operate
> within a single OS-level thread, so at most one microthread is
> executing at any point.  The implementation must not switch between
> microthreads except at times specified by the user.
>
> A microthreaded function is written as a generator function, with
> the points at which other microthreads may be executed indicated
> by ``yield``.  Normal (non-generator) functions thus cannot be
> interrupted.
>
> While a microthreaded function is executing, its execution is
> represented by a generator.  The implementation is responsible for
> ensuring that this generator has its ``next``, ``send``, and ``throw``
> methods called appropriately.  Specifically, it must call ``next`` to
> initiate processing, and ``step`` to execute each subsequent segment.
> At the completion of each segment, if the return value of ``next``,
> ``send``, or ``throw`` is not one of the special values described
> below, then that value must be supplied to the generator via ``send``
> when it is next scheduled.
>
> When the generator is complete, it will raise a ``StopIteration``
> exception, which the implementation must catch and handle by either
> resuming execution of the calling function (see `Nested Function
> Calls`, below) or terminating the microthread.  Other exceptions
> must be handled as described in the `Exceptions` section, below.
>
> Nested Function Calls
> ---------------------
>
> When a microthreaded function yields a generator, then it is invoking
> another microthreaded function, the execution of which is represented
> by the yielded generator.  Execution of the current function must be
> suspended until the new function has been executed to completion.
> Any return value from the new function must be supplied to the
> calling function via its generator's ``send`` method.
>
> Although the underlying implementation may vary, the effect is of
> a stack of generators within each microthread, with the topmost
> generator representing the state of the currently executing
> microthreaded function, and the remaining generators suspended
> awaiting its completion.
>
> Return Values
> -------------
>
> Microthreaded functions return values to their callers by raising
> a ``StopIteration`` exception containing the return value in
> ``args[0]``.  Implementations must catch this exception and handle it
> appropriately by supplying the return value to the next generator on
> the stack via ``send``, or if there is no next generator, terminating
> the microthread.
>
> Exceptions
> ----------
>
> When a generator raises an exception, that exception must
> be propagated to the next generator on the stack via ``throw``.
> If there is no next generator, then the implementation may display
> a traceback to the user or take other appropriate action.
>
> Implementations may adjust tracebacks to remove implementation-related
> frames, but must not alter the exception itself.
>
> Special Values
> --------------
>
> Implementations may attach special significance to other yielded
> values or raised exceptions, but these values and exceptions must
> be unique objects which could not be produced by code written
> with no knowledge of the implementation.  Specifically, no special
> significance may be attached to any of the built-in Python types or
> types used in the standard library.  Rather, an implementation should
> attach meaning to classes and objects local to the implementation's
> namespace.
>
> Thread Manipulation
> -------------------
>
> Implementations are responsible for providing any appropriate
> functionality for manipulating microthreads.  This PEP does not
> place any restrictions on such functionality.
>
> Pattern Specification
> =====================
>
> This section specifies the microthreading pattern from the perspective
> of a user.  In general, microthreaded code should closely resemble
> ordinary threaded code, with the addition of ``yield`` keywords at
> strategic locations.
>
>  - Microthreaded functions are written as generator functions.  Their
>    execution will not be interrupted except at statements including
>    the ``yield`` keyword.  The value of a ``yield`` expression is
>    the value of its argument, unless the argument is one of the
>    special values discussed in this section.
>
>  - A microthreaded function can call another microthreaded function by
>    yielding the generator resulting from calling the microthreaded
>    function.  The value of the ``yield`` expression is the return
>    value of the called function.
>
>  - A microthreaded function can "return" a value by raising
>    ``StopIteration`` with that value as an argument::
>
>      raise StopIteration(42)
>
>  - An exception raised in a microthreaded function will propagate
>    to its callers just as in a standard function.  Uncaught exceptions
>    will be handled in an implementation-dependent fashion.
>
>  - Functions for thread manipulation are not specified in this PEP,
>    and are implementation-dependent.
>
> Example
> -------
>
> This trivial example highlights each of the aspects of the
> multithreaded pattern::
>
>     def fibonacci(n):
>         latest, i = (1, 1), 2
>         if n < 1:
>             raise ValueError                    # raise exception
>         while i < n:
>             latest = (latest[1], latest[0] + latest[1])
>             yield                               # cooperative yield
>         raise StopIteration(latest[1])          # return value
>
>     def fibsquared(n):
>         try:
>             fibn = (yield fibonacci(n)) ** 2    # function call
>         except ValueError:                      # catch exception
>             print "Sorry, cannot calculate fibsquared of", n
>         else:
>             print "fibsquared of", n, "is", fibn
>
> Backwards Compatibility
> =======================
>
> As this is an informational PEP, no backward compatibility problems
> will arise from its adoption.  In most cases, this PEP specifies
> a superset of the functionality of existing microthreading
> enviroments, so existing microthreaded code will continue to run
> without modification.
>
> One exception to this rule is Kamaelia's Axon, which specifies
> that generators which yield a false value will be terminated.
> That situation is not compatible with this PEP, which directs that
> such a value should be returned from the yield at the next execution
> of the microthread.
>
> The specification of Kiwi tasklets requires that generators
> implementing tasklets call ``tasklet.get_event()`` after most yields.
> This is not necessary under the specification in this PEP, and the
> ``get_event()`` function can simply be stubbed out to ensure backward
> compatibility.
>
> Unresolved Issues
> =================
>
> Return Values
> -------------
>
> Currently, a ``return`` statement with a value in a generator is
> a syntax error.  The Python community should consider supporting
> the behavior described above in the Python compiler.  Specifically,
> the compiler would treat::
>
>     return foo
>
> in a generator as equivalent to::
>
>     raise StopIteration(foo)
>
> This change raises no backward-compatibility issues (as the former
> expression is not currently legal), but may introduce some surprising
> behavior as a function could then continue executing after a return
> statement::
>
>     try:
>         return 10
>     except StopIteration:
>         print "I'm still here!"
>
> Non-Microthreaded Generators
> ----------------------------
>
> Functions which return "normal" generators may confuse a
> microthreading implementation if those generators are accidentally
> yielded.  Consider::
>
>     def get_constants(dict):
>         return ( k for k in dict.iterkeys() if k[0] in string.uppercase )
>
>     def examine_module(mod):
>         d = mod.__dict__
>         constants = yield get_constants(d)
>
> this example will fail, because the implementation will treat the
> generator expresion in ``get_constants`` as a microthreaded function,
> calling its ``send`` method until it is exhaustd, then assigning
> ``None`` to ``constants`` in ``examine_module``.  
>
> The problem only occurs when such a generator is yielded: the
> snippet above will operate correctly if ``yield`` is removed from the
> last line.  Thus users can avoid the problem by exercising caution
> in selecting the values that are yielded.
>
> References
> ==========
>
> .. [1] Stackless is not included in this list because, while it does
>    implement a microthreaded environment, it does so without the use
>    of generators and (as of this writing) requires a modified Python
>    binary.
>
> .. [2] PEP 342, "Coroutines via Enhanced Generators", van Rossum, Eby
>    (http://www.python.org/peps/pep-0342)
>
> .. [3] PEP 355, "Simple Generators", Schemenauer, Peters, Hetland
>    (http://www.python.org/peps/pep-0342)
>
> .. _Kiwi tasklets:
>    http://www.async.com.br/projects/kiwi/api/kiwi.tasklet.html
>
> .. _Twisted Python:http://twistedmatrix.com/
>
> .. [#inlineCallbacks]
>    http://twistedmatrix.com/documents/current/api/twisted.internet.defer...
>
> .. _Kamaelia:http://kamaelia.sourceforge.net/
>
> .. [#Mertz] "Charming Python: Implementing 'weightless threads' with
>    Python generators," Mertz,
>    (http://www-128.ibm.com/developerworks/library/l-pythrd.html)
>
> .. [#Obarski] "simple, cooperative multitasking using generators,"
>    Obarski,
>    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466008
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
> ..
>     Local Variables:
>     mode: indented-text
>     indent-tabs-mode: nil
>     sentence-end-double-space: t
>     fill-column: 70
>     coding: utf-8
>     End:

Simpy also seems to be part of this landscape, albeit in the guise of
a discrete event simulator.  The Simpy engine manages Processes that
share Resources.  Processes synchronize among each other by yielding
one of various values to the scheduler, which it uses to determine
when to call back to the Process.  Forgive me, Simpy-ers if I have
butchered this description, my experience with Simpy is a few years
old now.

The interesting thing here is that Simpy processes do not simply yield
in order to pass control on "to whoever's next" - the value returned
with the yield dictates how the Process gets rescheduled at some time
in the future ("future" being either realtime or simulated time - a
discrete event simulator can simply advance from scheduled event to
scheduled event, without having to actually wait for the intervening
time to elapse).

-- Paul




More information about the Python-list mailing list