[Python-ideas] micro-threading PEP proposal (long) -- take 2!

Tue Aug 26 14:50:00 CEST 2008

Hello,

I haven't read everything in detail, but here are a few comments.

> The advantages to the Twisted approach over Posix threads are:
> 
> #. much less memory is required per thread

Yes, which also means better CPU cache utilization and thus potentially better
scalability.

> #. faster thread creation

Certainly.

> #. faster context switching (I'm guessing on this one, is this really true?)

Depends. The CPU overhead of context switching is probably lower (although
that's not certain in the case of Twisted, since the reactor is written in pure
Python). However, in a cooperative threading model (rather than the traditional
preemptive model), latencies tend to go up unless you have a lot of possible
switching points.
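
To illustrate the latency point, here is a rough sketch using plain generators
as stand-ins for micro-threads. The scheduler and the task names are made up
for the example and are not part of the proposal. The "greedy" task has a
single switching point, so nothing else runs until it gets there:

```python
from collections import deque

def run_round_robin(tasks):
    """Run (name, generator) pairs cooperatively; return the steps in order."""
    trace = []
    ready = deque(tasks)
    while ready:
        name, gen = ready.popleft()
        try:
            step = next(gen)          # runs until the task's next yield
        except StopIteration:
            continue                  # task finished, drop it
        trace.append((name, step))
        ready.append((name, gen))
    return trace

def chatty(n):
    # Yields after every unit of work: many switching points.
    for i in range(n):
        yield i

def greedy(n):
    # Does all of its work before yielding once: a single switching point.
    total = sum(range(n))
    yield total

trace = run_round_robin([("greedy", greedy(1000)), ("chatty", chatty(3))])
```

However long the greedy task computes, the chatty one cannot start before it
yields, which is where the extra latency comes from.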

> #. synchronization between threads is easier because there is no preemption,
>    making it much easier to write critical sections of code.

This is definitely the primary advantage of the Twisted approach.
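
A small sketch of why this is so, again with generators standing in for
micro-threads (the scheduler and names are hypothetical). The read-modify-write
below needs no lock, because a switch can only happen at the explicit yield:

```python
from collections import deque

balance = {"value": 0}

def deposit(amount, times):
    for _ in range(times):
        # No other task can run between this read and the write below.
        current = balance["value"]
        balance["value"] = current + amount
        yield  # the only point where another task may be scheduled

def run_all(tasks):
    """Minimal round-robin scheduler for generator-based tasks."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue
        ready.append(task)

run_all([deposit(1, 100), deposit(2, 100)])
```

With preemptive threads, the same read-modify-write would have to be protected
by a lock to be safe.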

> Finally, once Python has this deferred mechanism in place at the C level,
> many things become quite easy at the Python level.  This includes full
> micro-threading, micro-pipes between micro-threads, new-style generators
> that can delegate responsibility for generating values to called functions
> without having to intervene between their caller and the called function,
> parallel execution constructs (``parallel_map``).

Probably. However, since the core of your proposal is itself far from trivial, I
suggest you concentrate on it in this PEP; the higher-level constructs can be
deferred ( :-)) to another PEP.

> #. An addition of non_blocking modes of accessing files, sockets, time.sleep
>    and other functions that may block.  It is not clear yet exactly what
>    these will look like.  The possibilities are:
> 
>    - Add an argument to the object creation functions to specify blocking or
>      non-blocking.
>    - Add an operation to change the blocking mode after the object has been
>      created.
>    - Add new non-blocking versions of the methods on the objects that may
>      block (e.g., read_d/write_d/send_d/recv_d/sleep_d).
>    - Some combination of these.

Sounds ok. FWIW, the py3k IO stack is supposed to be ready for non-blocking IO,
but this possibility is almost completely untested as of yet.

>    It may also be useful to add a locking capability to files and sockets so
>    that code (like traceback.print_exception) that outputs several lines can
>    prevent other output from being intermingled with it.

I don't think it's critical.

> #. Micro_thread objects.  Each of these will have a re-usable C deferred
>    object attached to it, since each micro_thread can only be suspended
>    waiting for one thing at a time.  The current micro_thread would be
>    stored within a C global variable, much like ``_PyThreadState_Current``.

By "global", you mean "thread-local", no? That is, there is (at most) one
currently running micro-thread per OS-level thread.
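
At the Python level, what I have in mind would look roughly like the following
(the helper names are made up, and the real thing would of course live in C).
Each OS-level thread sees its own "current micro-thread" slot:

```python
import threading

_state = threading.local()   # one independent namespace per OS-level thread

def set_current(micro_thread):
    _state.current = micro_thread

def get_current():
    # None means "no micro-thread is running in this OS-level thread".
    return getattr(_state, "current", None)

results = {}

def worker(label):
    set_current(label)
    results[label] = get_current()

threads = [threading.Thread(target=worker, args=("mt-%d" % i,))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The main thread never called set_current(), so get_current() still returns
None there, whatever the worker threads did.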

>    There are three usage scenarios, aided by three different functions to
>    create micro-threads:

I suggest you fold those usage scenarios into one simple primitive that launches
a single micro-thread and provides a way to wait for its result (using a
CDeferred I suppose?). Higher-level stuff ("start_in_parallel") does not seem
critical for the usefulness of the PEP.

>       This final scenario uses *micro_pipes* to allow threads to
>       cooperatively solve problems (much like unix pipes)::

What is the added value of "micro pipes" compared to, e.g., a standard Python
list or deque? Are they non-blocking?
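
For reference, a plain deque never blocks: reading from an empty one fails
immediately instead of suspending the caller, so a consumer has nothing to
wait on. A short demonstration:

```python
from collections import deque

q = deque()
q.append("line 1")
first = q.popleft()          # succeeds, the deque is now empty

try:
    q.popleft()              # nothing to read: raises instead of waiting
    empty_raised = False
except IndexError:
    empty_raised = True
```

If the point of a micro_pipe is that the reader gets suspended until data
arrives, it would help to say so explicitly.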

>    - ``close()`` to cause a ``StopIteration`` on the ``__next__`` call.
>      A ``put`` done after a ``close`` silently terminates the micro_thread
>      doing the ``put`` (in case the receiving side closes the micro_pipe).

Silencing this sounds like a bad idea.
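
What I would expect instead is an explicit error. A toy sketch, where ToyPipe
and PipeClosed are names I am inventing for the example (this toy cannot
suspend a reader, so reading from an open, empty pipe is simply an error here):

```python
from collections import deque

class PipeClosed(Exception):
    """Hypothetical exception raised by put() on a closed pipe."""

class ToyPipe:
    def __init__(self):
        self._items = deque()
        self._closed = False

    def put(self, item):
        if self._closed:
            # Tell the writer loudly rather than terminating it silently.
            raise PipeClosed("put() on a closed pipe")
        self._items.append(item)

    def close(self):
        self._closed = True

    def __iter__(self):
        return self

    def __next__(self):
        if self._items:
            return self._items.popleft()
        if self._closed:
            raise StopIteration
        raise RuntimeError("toy pipe cannot suspend the reader")

pipe = ToyPipe()
pipe.put("a")
pipe.put("b")
pipe.close()
received = list(pipe)

try:
    pipe.put("c")
    put_failed = False
except PipeClosed:
    put_failed = True
```

The writer can still choose to catch the exception and exit quietly, but that
becomes its own decision.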

>    So each micro_thread may have a *stdout* micro_pipe assigned to them and
>    may also be assigned a *stdin* micro_pipe (some other micro_thread's
>    stdout micro_pipe).

Hmm, is it really necessary? Shouldn't micro-threads just create their own pipes
when they need them? The stdin/stdout analogy is only meaningful in certain
types of workloads.

> ``PyDeferred_CDeferred`` is written as a new exception type for use by the
> C code to defer execution.  This is a subclass of ``NotImplementedError``.
> Instances are not raised as a normal exception (e.g., with
> ``PyErr_SetObject``), but by calling ``PyNotifier_Defer`` (described in the
> Notifier_ section, below).  This registers the ``PyDeferred_CDeferred``
> associated with the currently running micro_thread as the current error 
> object,

I'm not sure I understand this correctly. Does this mean there is a single,
pre-constructed CDeferred object for each micro-thread? If yes, then this
deviates slightly from the Twisted model where many deferreds can be created
dynamically, chained together etc.
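
To show what I mean by "created dynamically", here is a toy imitation of the
idea. ToyDeferred is not Twisted's class and leaves out errbacks and nested
deferreds; fetch() and the pending dict are also made up for the example:

```python
class ToyDeferred:
    """Minimal stand-in for a deferred: a result plus a chain of callbacks."""
    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def add_callback(self, fn):
        if self._fired:
            self._result = fn(self._result)   # already fired: run at once
        else:
            self._callbacks.append(fn)
        return self                           # allows chaining

    def callback(self, result):
        self._fired = True
        self._result = result
        while self._callbacks:
            # Each callback receives the previous callback's return value.
            self._result = self._callbacks.pop(0)(self._result)

log = []
pending = {}

def fetch(key):
    # A fresh deferred is created for every outstanding request.
    d = ToyDeferred()
    pending[key] = d
    return d

fetch("a").add_callback(str.upper).add_callback(log.append)
fetch("b").add_callback(len).add_callback(log.append)

# Results may arrive in any order; each deferred fires independently.
pending["b"].callback("hello")
pending["a"].callback("x")
```

With one pre-built deferred per micro-thread, the two requests above could not
be outstanding at the same time from a single micro-thread.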

> One peculiar thing about the stored callbacks, is that they're not really a
> queue.  When the C deferred is first used and has no saved callbacks,
> the callbacks are saved in straight FIFO manner.  Let's say that four
> callbacks are saved in this order: ``D'``, ``C'``, ``B'``, ``A'`` (meaning
> that ``A`` called ``B``, called ``C``, called ``D`` which deferred):

In this example, can you give the C pseudo-code and the equivalent Twisted
Python (pseudo-)code?

(I haven't read the Reactor part so I won't comment on it)

> #. How is process termination handled?

Raising SystemExit (or another BaseException-derived exception, e.g. ThreadExit)
in all micro-threads sounds reasonable.
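
With generators standing in for micro-threads, the idea can be sketched as
follows (worker and shutdown are hypothetical names). Throwing the exception
in at the suspension point lets each micro-thread run its cleanup code:

```python
def worker(name, log):
    try:
        while True:
            yield                      # suspended here, waiting for something
    finally:
        log.append("%s cleaned up" % name)

def shutdown(micro_threads):
    """Raise SystemExit inside every suspended micro-thread."""
    for gen in micro_threads:
        try:
            gen.throw(SystemExit)
        except (SystemExit, StopIteration):
            # Either the exception propagated out or the micro-thread
            # caught it and returned; in both cases it is finished.
            pass

log = []
micro_threads = [worker("a", log), worker("b", log)]
for gen in micro_threads:
    next(gen)                          # run each one up to its first yield

shutdown(micro_threads)
```

A micro-thread that catches the exception and keeps yielding is not handled
here; the PEP would have to say what happens in that case.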

> #. How does this impact the debugger/profiler/sys.settrace?

:-)

Last point: you should try to get some Twisted guys involved in the writing of
the PEP if you want it to succeed.

Regards

Antoine.