When giving some examples in a previous post, I gave this
    class Tree:

        def __init__(self, label, children=()):
            self.label = label
            self.children = children

        def __iter__(self):
            skip = yield self.label
            if skip != 'SKIP':
                for child in self.children:
                    yield from child
Here is a tree:
tree = Tree('A', [Tree('B'), Tree('C')])
Here is an example of how to traverse it, avoiding the children of
nodes called 'B'. I can't use a for-loop as I need to send the
skip-value to the tree iterator and this makes the loop look quite
complicated. Here's one way to do it:
    i = iter(tree)
    skip = None
    while True:
        try:
            a = i.send(skip)
        except StopIteration:
            break
        print(a)
        skip = 'SKIP' if a == 'B' else None
Now imagine that the 'continue' statement within a for-loop has an
optional argument that is sent to the generator being looped over at
the next iteration step. I would then be able to write the loop above
much more simply:
    for a in tree:
        print(a)
        if a == 'B':
            continue 'SKIP'
I see there was discussion about module __call__ about three years ago:
Is there an existing pronouncement on this subject?
__call__ would help avoid strange things like from StringIO import StringIO,
or having to come up with silly names like run, driver, manager, etc.
Ideally, __call__ could be either a function or a class.
I imagine nothing special would happen, except that calling a module object
would look up __call__ instead of producing a TypeError.
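For what it's worth, something close to this can already be faked today by replacing the module's entry in sys.modules with a callable instance. A minimal sketch, using a hypothetical module name 'fakemod' and an illustrative CallableModule class:

```python
import sys
import types

# Install a callable module-like object under a hypothetical name.
# A real module-level __call__ would make this trick unnecessary.
class CallableModule(types.ModuleType):
    def __call__(self, x):
        return x * 2

sys.modules['fakemod'] = CallableModule('fakemod')

import fakemod  # found in sys.modules, so no file is needed
print(fakemod(21))  # 42
```

Subclassing types.ModuleType (rather than using a plain class) keeps the object behaving like a module for introspection purposes.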
I would like to use Python, but I really hate the way you guys change versions.
If it were possible, could there be a separate version of Python, say "IcePython",
that would be an executable with a bz2 file containing all the .py files for the modules?
Then when I run a script, I would run this single executable and it would dig into its own version of the .py files hidden in the bz2.
This would make life much easier, since I could ignore the rapid changes in the API of Python until I'm ready to move code to a new version.
I think there's a reason corporations want long five-year versions of Linux.
This would help when you have many different boxes running many different OSes and many different Python versions.
It would be more of a write-once setup, with this executable and its bz2 file tagging along.
Then no matter what box I want to run it on, I just drop three files and it will run there.
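As an aside, Python's existing zipimport machinery already gets close to part of this: since 2.6, the interpreter can execute a zip archive (not bz2) directly if it contains a __main__.py. A small sketch, with an illustrative archive name:

```python
import subprocess
import sys
import zipfile

# Bundle a trivial "application" into a single zip archive with a
# __main__.py entry point; the interpreter executes the archive
# directly via zipimport, so only the archive needs to be shipped.
with zipfile.ZipFile('app.zip', 'w') as zf:
    zf.writestr('__main__.py', "print('hello from the bundled app')\n")

out = subprocess.check_output([sys.executable, 'app.zip'])
print(out.decode().strip())  # hello from the bundled app
```

This doesn't freeze the interpreter itself, of course, which is the harder half of the request.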
On Thu, Feb 19, 2009 at 5:15 PM, Greg Ewing <greg.ewing(a)canterbury.ac.nz> wrote:
> I want something that means "Do what that stuff
> would have done if I'd written it all out right
> here." Anyone got a really good thesaurus handy?
I wonder if this isn't going in the wrong direction. This syntax change is
being considered because there's no way to change the control flow inside of
a function (so that, for example, we can yield multiple items) besides using
one of the existing statements or executing some other function. If there
were, one could just write:
from itertools import yield_from
With functions the lack of external regulation on control flow doesn't seem
to be a big deal, but apparently for generators it is… So maybe we need to
think more clearly about what kinds of control flow changes are appropriate
for generators… Is yield from really going to solve all our problems? Or
will we be back for a new keyword in 6 months?
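The reason no library function can do this is worth spelling out. A sketch of the naive attempt, using a hypothetical helper named yield_from:

```python
# A helper that loops over the subgenerator must itself be a
# generator, so its yields go to the helper's caller rather than
# out of the enclosing generator.
def yield_from(it):  # hypothetical helper
    for v in it:
        yield v

def sub():
    yield 1
    yield 2

def outer():
    yield_from(sub())  # creates a generator object and discards it
    yield 3

print(list(outer()))  # [3]
```

Calling yield_from inside outer has no effect at all: control flow in the enclosing generator cannot be altered from a called function, which is exactly the limitation being discussed.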
Whether you're using yield-from or not, it doesn't
seem to be possible to have a for-loop iterating
over something that is also suspendable in a
generator-based thread setting.
The basic problem is that we have one channel and
two different things we want to use it for. The
obvious answer is that we need to multiplex. Since
we're already using the entire bandwidth of the
channel (anything could be a valid yielded value)
we need to introduce some out-of-band data somehow.
Suppose we have a new expression

    suspend [<value>] [with <tag>]
This is a lot like a yield, except that it sends
a tuple (value, tag).
The existing yield expression becomes equivalent to

    suspend <value> with 'yield'
There is a new generator method resume(value, tag) to go along with
this. If the generator is suspended at a suspend expression,
the value of the suspend expression becomes (value, tag).
If it is suspended at a yield, and the tag is 'yield',
then the value becomes the value of the yield expression.
(Not sure what to do in other cases; maybe raise an exception.)
The existing send() method is mapped to resume() as

    def send(self, value):
        value2, tag2 = self.resume(value, 'yield')
        if tag2 == 'yield':
            return value2
        # What to do here? Ignore? Raise an exception?
So we've generalised the yield channel into a suspend
channel, which can have any number of sub-channels. We have
reserved one of these sub-channels, tagged with 'yield', for
carrying yielded values. The rest of the channels are free
for use by other things such as thread-scheduling libraries.
To complete this, we also need a variant of the for-loop
that is willing to pass values from the other channels on to
the caller. Picking a random syntax for illustration,
    for y from g(x):
        <body>

would be roughly equivalent to

    it = g(x)
    value = None
    tag = 'yield'
    try:
        while 1:
            value, tag = it.resume(value, tag)
            if tag == 'yield':
                y = value
                <body>
                value = None
            else:
                value, tag = suspend value with tag
    except StopIteration:
        pass
plus suitable handling of 'throw' and 'close'.
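The multiplexing idea can be simulated today by yielding explicit (value, tag) tuples, with 'yield' as the reserved sub-channel and any other tag carrying out-of-band data for, say, a scheduler. A rough sketch with illustrative tags:

```python
# A generator that interleaves ordinary yielded values with an
# out-of-band message on a separate sub-channel.
def worker():
    yield ('step 1', 'yield')      # ordinary yielded value
    yield (None, 'reschedule')     # out-of-band scheduler message
    yield ('step 2', 'yield')

values = []
messages = []
for value, tag in worker():
    if tag == 'yield':
        values.append(value)       # the reserved 'yield' sub-channel
    else:
        messages.append(tag)       # everything else is out-of-band
print(values, messages)  # ['step 1', 'step 2'] ['reschedule']
```

The proposal amounts to pushing this tuple convention down into the language so that plain yields and for-loops keep their usual one-value appearance.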
It seems that the python-ideas mail server really
doesn't like my attachment, so it's time for Plan C.
You can download the code from here:
Is the mail server configured not to accept attachments
or something? The error message I'm getting is
Your message cannot be delivered to the following recipients:
Recipient address: python-ideas(a)python.org
Reason: SMTP transmission failure has occurred
Diagnostic code: smtp;554 permanent error
Remote system: dns;mail.python.org (TCP|126.96.36.199|1204|188.8.131.52|25)
(bag.python.org ESMTP )
Fourth draft of the PEP. Corrected an error in the
expansion and added a bit more to the Rationale.
Title: Syntax for Delegating to a Subgenerator
Author: Gregory Ewing <greg.ewing(a)canterbury.ac.nz>
Type: Standards Track
A syntax is proposed to allow a generator to easily delegate part of
its operations to another generator, the subgenerator interacting
directly with the main generator's caller for as long as it runs.
Additionally, the subgenerator is allowed to return with a value,
and the value is made available to the delegating generator.
The new syntax also opens up some opportunities for optimisation when
one generator re-yields values produced by another.
The following new expression syntax will be allowed in the body of a
generator:

    yield from <expr>
where <expr> is an expression evaluating to an iterable, from which an
iterator is extracted. The effect is to run the iterator to exhaustion,
during which time it behaves as though it were communicating directly
with the caller of the generator containing the ``yield from`` expression
(the "delegating generator").
* Any values that the iterator yields are passed directly to the
caller.
* Any values sent to the delegating generator using ``send()``
are sent directly to the iterator. (If the iterator does not
have a ``send()`` method, values sent in are ignored.)
* Calls to the ``throw()`` method of the delegating generator are
forwarded to the iterator. (If the iterator does not have a
``throw()`` method, the thrown-in exception is raised in the
delegating generator.)
* If the delegating generator's ``close()`` method is called, the
iterator is finalised before finalising the delegating generator.
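The forwarding of ``send()`` described above can be sketched with the proposed syntax (available in Python 3.3+); the function names here are illustrative:

```python
# Values sent to the delegating generator pass straight through the
# ``yield from`` to the iterator it is delegating to.
def inner():
    received = []
    while True:
        v = yield
        if v is None:
            break
        received.append(v)
    return received

def outer():
    result = yield from inner()  # forwards send() to inner()
    yield result

g = outer()
next(g)              # prime; now suspended inside inner()
g.send('a')
g.send('b')
forwarded = g.send(None)
print(forwarded)     # ['a', 'b']
```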
The value of the ``yield from`` expression is the first argument to the
``StopIteration`` exception raised by the iterator when it terminates.
Additionally, generators will be allowed to execute a ``return``
statement with a value, and that value will be passed as an argument
to the ``StopIteration`` exception.
    result = yield from expr

is semantically equivalent to

    _i = iter(expr)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                if hasattr(_i, 'throw'):
                    _u = _i.throw(_e)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()
A Python generator is a form of coroutine, but has the limitation that
it can only yield to its immediate caller. This means that a piece of
code containing a ``yield`` cannot be factored out and put into a
separate function in the same way as other code. Performing such a
factoring causes the called function to itself become a generator, and
it is necessary to explicitly iterate over this second generator and
re-yield any values that it produces.
If yielding of values is the only concern, this is not very arduous
and can be performed with a loop such as

    for v in g:
        yield v
However, if the subgenerator is to interact properly with the caller
in the case of calls to ``send()``, ``throw()`` and ``close()``, things
become considerably more complicated. As the formal expansion presented
above illustrates, the necessary code is very longwinded, and it is tricky
to handle all the corner cases correctly. In this situation, the advantages
of a specialised syntax should be clear.
Generators as Threads
A motivating use case for generators being able to return values
concerns the use of generators to implement lightweight threads. When
using generators in that way, it is reasonable to want to spread the
computation performed by the lightweight thread over many functions.
One would like to be able to call a subgenerator as though it were
an ordinary function, passing it parameters and receiving a returned
value.
Using the proposed syntax, a statement such as

    y = f(x)

where f is an ordinary function, can be transformed into a delegation
call

    y = yield from g(x)
where g is a generator. One can reason about the behaviour of the
resulting code by thinking of g as an ordinary function that can be
suspended using a ``yield`` statement.
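A small sketch of this way of thinking, using the proposed syntax (available in Python 3.3+) with illustrative function names:

```python
# A "suspendable function": consumes values sent in and returns a
# result to its delegating generator like an ordinary function would.
def average():
    total = 0.0
    count = 0
    while True:
        value = yield
        if value is None:
            return total / count  # ordinary return, value delivered
        total += value            # to the yield from expression
        count += 1

def wrapper(results):
    while True:
        results.append((yield from average()))

results = []
w = wrapper(results)
next(w)  # prime the generator
for v in (1.0, 2.0, 3.0, None):
    w.send(v)
print(results)  # [2.0]
```

The caller of wrapper never sees average directly; sent values and the returned result are routed automatically, which is the point of the reasoning above.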
When using generators as threads in this way, typically one is not
interested in the values being passed in or out of the yields.
However, there are use cases for this as well, where the thread is
seen as a producer or consumer of items. The ``yield from``
expression allows the logic of the thread to be spread over as
many functions as desired, with the production or consumption of
items occurring in any subfunction, and the items are automatically
routed to or from their ultimate source or destination.
Concerning ``throw()`` and ``close()``, it is reasonable to expect
that if an exception is thrown into the thread from outside, it should
first be raised in the innermost generator where the thread is suspended,
and propagate outwards from there; and that if the thread is terminated
from outside by calling ``close()``, the chain of active generators
should be finalised from the innermost outwards.
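The expected finalisation order can be sketched as follows (illustrative names, using the proposed syntax as it exists in Python 3.3+):

```python
# close() on the outer generator should finalise the chain from the
# innermost generator outwards.
log = []

def inner():
    try:
        yield 1
    finally:
        log.append('inner closed')

def outer():
    try:
        yield from inner()
    finally:
        log.append('outer closed')

g = outer()
next(g)     # suspended inside inner()
g.close()   # GeneratorExit is raised at the innermost suspension point
print(log)  # ['inner closed', 'outer closed']
```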
The particular syntax proposed has been chosen as suggestive of its
meaning, while not introducing any new keywords and clearly standing
out as being different from a plain ``yield``.
Using a specialised syntax opens up possibilities for optimisation
when there is a long chain of generators. Such chains can arise, for
instance, when recursively traversing a tree structure. The overhead
of passing ``next()`` calls and yielded values down and up the chain
can cause what ought to be an O(n) operation to become O(n\*\*2).
A possible strategy is to add a slot to generator objects to hold a
generator being delegated to. When a ``next()`` or ``send()`` call is
made on the generator, this slot is checked first, and if it is
nonempty, the generator that it references is resumed instead. If it
raises StopIteration, the slot is cleared and the main generator is
resumed.
This would reduce the delegation overhead to a chain of C function
calls involving no Python code execution. A possible enhancement would
be to traverse the whole chain of generators in a loop and directly
resume the one at the end, although the handling of StopIteration is
more complicated then.
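For concreteness, here is the kind of recursive traversal that produces such a chain (a sketch with an illustrative tuple-based tree): each level of recursion adds one delegating generator, so every next() call must be passed down the whole chain.

```python
# Recursive in-order traversal of a (left, label, right) tuple tree;
# the depth of the yield from chain equals the depth of the tree.
def walk(node):
    if node is None:
        return
    left, label, right = node
    yield from walk(left)
    yield label
    yield from walk(right)

root = ((None, 1, None), 2, (None, 3, None))
print(list(walk(root)))  # [1, 2, 3]
```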
Use of StopIteration to return values
There are a variety of ways that the return value from the generator
could be passed back. Some alternatives include storing it as an
attribute of the generator-iterator object, or returning it as the
value of the ``close()`` call to the subgenerator. However, the proposed
mechanism is attractive for a couple of reasons:
* Using the StopIteration exception makes it easy for other kinds
of iterators to participate in the protocol without having to
grow a close() method.
* It simplifies the implementation, because the point at which the
return value from the subgenerator becomes available is the same
point at which StopIteration is raised. Delaying until any later
time would require storing the return value somewhere.
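The mechanism described by these points can be observed directly through the ordinary iterator protocol; a sketch (using the modern form of the syntax, where a generator ``return`` populates the StopIteration):

```python
# The return value rides on the StopIteration exception itself, so
# any caller driving the iterator protocol can retrieve it.
def g():
    yield 1
    return 'done'

it = g()
next(it)  # consumes the yielded 1
try:
    next(it)
    result = None
except StopIteration as e:
    result = e.args[0] if e.args else None
print(result)  # done
```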
Under this proposal, the value of a ``yield from`` expression would
be derived in a very different way from that of an ordinary ``yield``
expression. This suggests that some other syntax not containing the
word ``yield`` might be more appropriate, but no alternative has so
far been proposed, other than ``call``, which has already been
rejected by the BDFL.
It has been suggested that some mechanism other than ``return`` in
the subgenerator should be used to establish the value returned by
the ``yield from`` expression. However, this would interfere with
the goal of being able to think of the subgenerator as a suspendable
function, since it would not be able to return values in the same way
as other functions.
The use of an argument to StopIteration to pass the return value
has been criticised as an "abuse of exceptions", without any
concrete justification of this claim. In any case, this is only
one suggested implementation; another mechanism could be used
without losing any essential features of the proposal.
Proposals along similar lines have been made before, some using the
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is
more concise, it could be argued that it looks too similar to an
ordinary ``yield`` and the difference might be overlooked when reading
code.
To the author's knowledge, previous proposals have focused only
on yielding values, and thereby suffered from the criticism that
the two-line for-loop they replace is not sufficiently tiresome
to write to justify a new syntax. By dealing with sent values
as well as yielded ones, this proposal provides considerably more
benefit.
This document has been placed in the public domain.
About a year ago, I posted a scheme to comp.lang.python describing how to
use isolated interpreters to circumvent the GIL on SMPs:
In the following, an "appdomain" will be defined as a thread associated
with a unique embedded Python interpreter. One interpreter per thread is
how Tcl works. Erlang also uses isolated threads that only communicate
through messages (as opposed to shared objects). Appdomains are also
available in the .NET framework, and in Java as "Java isolates". They are
potentially very useful as multicore CPUs become abundant. They allow one
process to run one independent Python interpreter on each available CPU
core.
In Python, "appdomains" can be created by embedding the Python interpreter
multiple times in a process. For this to work, we have to make multiple
copies of the Python DLL and rename them (e.g. Python25-0.dll,
Python25-1.dll, Python25-2.dll, etc.) Otherwise the dynamic loader will
just return a handle to the already imported DLL. As DLLs can be accessed
with ctypes, we don't even have to program a line of C to do this. We can
start up a Python interpreter and use ctypes to embed more interpreters
into it, associating each interpreter with its own thread. ctypes takes
care of releasing the GIL in the parent interpreter, so calls to these
sub-interpreters become asynchronous. I had a mock-up of this scheme
working. Martin Löwis replied he doubted this would work, and pointed out
that Python extension libraries (.pyd files) are DLLs as well. They would
only be imported once, and their global states would thus clash, causing
the scheme to break down.
He was right, of course, but also wrong. In fact I had already proven him
wrong by importing a DLL multiple times. If it can be done for
Python25.dll, it can be done for any other DLL as well - including .pyd
files - in exactly the same way. Thus what remains is to change Python's
dynamic loader to use the same "copy and import" scheme. This can either
be done by changing Python's C code, or (at least on Windows) to redirect
the LoadLibrary API call from kernel32.dll to a custom DLL. Both are quite
easy and require minimal C coding.
Thus it is quite easy to make multiple, independent Python interpreters
live isolated lives in the same process. As opposed to multiple processes,
they can communicate without involving any IPC. It would also be possible
to design proxy objects allowing one interpreter access to an object in
another. Immutable objects such as strings would be particularly easy to
share.
This very simple scheme should allow parallel processing with Python
similar to how it's done in Erlang, without the GIL getting in our way. At
least on Windows this can be done without touching the CPython source at
all. I am not sure about Linux though. It may be necessary to patch the
CPython source to make it work there.