I think it would be a good idea if Python tracebacks could be translated
into languages other than English - and it would set a good example.
For example, with French as my default locale language, instead of
>>> 1/0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero
I might get something like
>>> 1/0
Suivi d'erreur (appel le plus récent en dernier) :
Fichier "<stdin>", à la ligne 1, dans <module>
ZeroDivisionError: division entière ou modulo par zéro
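(A rough sketch of one way to experiment with this today, using a custom
excepthook; the 'python-tb' translation domain is hypothetical, and a real
implementation would translate message templates rather than whole rendered
lines:)

    import sys, traceback, gettext

    # Falls back to the identity translation if no catalog is installed.
    _ = gettext.translation('python-tb', fallback=True).gettext

    def translating_excepthook(exc_type, exc, tb):
        # Run each rendered traceback line through the catalog; lines
        # without an entry pass through unchanged.
        for line in traceback.format_exception(exc_type, exc, tb):
            sys.stderr.write(_(line))

    sys.excepthook = translating_excepthook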
André
Here's an updated version of the PEP reflecting my
recent suggestions on how to eliminate 'codef'.
PEP: XXX
Title: Cofunctions
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing(a)canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 3.x
Post-History:
Abstract
========
A syntax is proposed for defining and calling a special type of generator
called a 'cofunction'. It is designed to provide a streamlined way of
writing generator-based coroutines, and allow the early detection of
certain kinds of error that are easily made when writing such code, which
otherwise tend to cause hard-to-diagnose symptoms.
This proposal builds on the 'yield from' mechanism described in PEP 380,
and describes some of the semantics of cofunctions in terms of it. However,
it would be possible to define and implement cofunctions independently of
PEP 380 if so desired.
Specification
=============
Cofunction definitions
----------------------
A cofunction is a special kind of generator, distinguished by the presence
of the keyword ``cocall`` (defined below) at least once in its body. It may
also contain ``yield`` and/or ``yield from`` expressions, which behave as
they do in other generators.
From the outside, the distinguishing feature of a cofunction is that it cannot
be called the same way as an ordinary function. An exception is raised if an
ordinary call to a cofunction is attempted.
Cocalls
-------
Calls from one cofunction to another are made by marking the call with
a new keyword ``cocall``. The expression
::
    cocall f(*args, **kwds)
is evaluated by first checking whether the object ``f`` implements
a ``__cocall__`` method. If it does, the cocall expression is
equivalent to
::
    yield from f.__cocall__(*args, **kwds)
except that the object returned by ``__cocall__`` is expected to be an
iterator, so the step of calling ``iter()`` on it is skipped.
If ``f`` does not have a ``__cocall__`` method, or the ``__cocall__``
method returns ``NotImplemented``, then the cocall expression is
treated as an ordinary call, and the ``__call__`` method of ``f``
is invoked.
Objects which implement ``__cocall__`` are expected to return an object
obeying the iterator protocol. Cofunctions respond to ``__cocall__`` the
same way as ordinary generator functions respond to ``__call__``, i.e. by
returning a generator-iterator.

Certain objects that wrap other callable objects, notably bound methods,
will be given ``__cocall__`` implementations that delegate to the underlying
object.
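For illustration only (this is a sketch, not part of the specification),
a wrapper object might take part in cocalls by delegating to a wrapped
generator function like this:

::

    class CoWrapper:

        def __init__(self, genfunc):
            self.genfunc = genfunc

        def __cocall__(self, *args, **kwds):
            # Must return an object obeying the iterator protocol;
            # the generator-iterator produced by genfunc satisfies this.
            return self.genfunc(*args, **kwds)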
Grammar
-------
The full syntax of a cocall expression is described by the following
grammar lines:
::
    atom: cocall | <existing alternatives for atom>
    cocall: 'cocall' atom cotrailer* '(' [arglist] ')'
    cotrailer: '[' subscriptlist ']' | '.' NAME
Note that this syntax allows cocalls to methods and elements of sequences
or mappings to be expressed naturally. For example, the following are valid:
::
    y = cocall self.foo(x)
    y = cocall funcdict[key](x)
    y = cocall a.b.c[i].d(x)
Also note that the final calling parentheses are mandatory, so that for example
the following is invalid syntax:
::
    y = cocall f    # INVALID
New builtins, attributes and C API functions
--------------------------------------------
To facilitate interfacing cofunctions with non-coroutine code, there will
be a built-in function ``costart`` whose definition is equivalent to
::
    def costart(obj, *args, **kwds):
        try:
            m = obj.__cocall__
        except AttributeError:
            result = NotImplemented
        else:
            result = m(*args, **kwds)
        if result is NotImplemented:
            raise TypeError("Object does not support cocall")
        return result
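As an illustrative sketch (not part of the specification; ``f``, ``x``
and ``y`` are hypothetical), non-coroutine code could use ``costart`` to
drive a cofunction to completion, relying on the PEP 380 convention that
the return value is carried by StopIteration:

::

    it = costart(f, x, y)
    try:
        while True:
            next(it)
    except StopIteration as e:
        result = e.value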
There will also be a corresponding C API function
::
    PyObject *PyObject_CoCall(PyObject *obj, PyObject *args, PyObject *kwds)
It is left unspecified for now whether a cofunction is a distinct type
of object or, like a generator function, is simply a specially-marked
function instance. If the latter, a read-only boolean attribute
``__iscofunction__`` should be provided to allow testing whether a given
function object is a cofunction.
Motivation and Rationale
========================
The ``yield from`` syntax is reasonably self-explanatory when used for the
purpose of delegating part of the work of a generator to another function. It
can also be used to good effect in the implementation of generator-based
coroutines, but it reads somewhat awkwardly when used for that purpose, and
tends to obscure the true intent of the code.
Furthermore, using generators as coroutines is somewhat error-prone. If one
forgets to use ``yield from`` when it should have been used, or uses it when it
shouldn't have, the symptoms that result can be extremely obscure and confusing.
Finally, sometimes there is a need for a function to be a coroutine even though
it does not yield anything, and in these cases it is necessary to resort to
kludges such as ``if 0: yield`` to force it to be a generator.
The ``cocall`` construct addresses the first issue by making the syntax directly
reflect the intent, that is, that the function being called forms part of a
coroutine.
The second issue is addressed by making it impossible to mix coroutine and
non-coroutine code in ways that don't make sense. If the rules are violated, an
exception is raised that points out exactly what and where the problem is.
Lastly, the need for dummy yields is eliminated by making it possible for a
cofunction to call both cofunctions and ordinary functions with the same syntax,
so that an ordinary function can be used in place of a cofunction that yields
zero times.
Record of Discussion
====================
An earlier version of this proposal required a special keyword ``codef`` to be
used in place of ``def`` when defining a cofunction, and disallowed calling an
ordinary function using ``cocall``. However, it became evident that these
features were not necessary, and the ``codef`` keyword was dropped in the
interests of minimising the number of new keywords required.
The use of a decorator instead of ``codef`` was also suggested, but the current
proposal makes this unnecessary as well.
It has been questioned whether some combination of decorators and functions
could be used instead of a dedicated ``cocall`` syntax. While this might be
possible, to achieve equivalent error-detecting power it would be necessary
to write cofunction calls as something like
::
    yield from cocall(f)(args)
making them even more verbose and inelegant than an unadorned ``yield from``.
It is also not clear whether it is possible to achieve all of the benefits of
the cocall syntax using this kind of approach.
Prototype Implementation
========================
An implementation of an earlier version of this proposal in the form of patches
to Python 3.1.2 can be found here:
http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cofunctions.h…
If this version of the proposal is received favourably, the implementation will
be updated to match.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
[Changed subject]
> On 2010-10-25 04:37, Guido van Rossum wrote:
>> This should not require threads.
>>
>> Here's a bare-bones sketch using generators:
[...]
On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm <jh(a)improva.dk> wrote:
> If you don't care about allowing the funcs to raise StopIteration, this
> can actually be simplified to:
[...]
Indeed, I realized this after posting. :-) I had several other ideas
for improvements, e.g. being able to pass an initial value to the
reduce-like function or even being able to supply a reduce-like
function of one's own.
> More interesting (to me at least) is that this is an excellent example
> of why I would like to see a version of PEP380 where "close" on a
> generator can return a value (AFAICT the version of PEP380 on
> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not
> mention this possibility, or even link to the heated discussion we had
> on python-ideas around march/april 2009).
Can you dig up the link here?
I recall that discussion but I don't recall a clear conclusion coming
from it -- just heated debate.
Based on my example I have to agree that returning a value from
close() would be nice. There is a small detail of how multiple
arguments to StopIteration should be interpreted, but that's not so
important if it's being raised by a return statement.
> Assuming that "close" on a reduce_collector generator instance returns
> the value of the StopIteration raised by the "return" statements, we can
> simplify the code even further:
>
>
> def reduce_collector(func):
>     try:
>         outcome = yield
>     except GeneratorExit:
>         return None
>     while True:
>         try:
>             val = yield
>         except GeneratorExit:
>             return outcome
>         outcome = func(outcome, val)
>
> def parallel_reduce(iterable, funcs):
>     collectors = [reduce_collector(func) for func in funcs]
>     for coll in collectors:
>         next(coll)
>     for val in iterable:
>         for coll in collectors:
>             coll.send(val)
>     return [coll.close() for coll in collectors]
>
>
> Yes, this is only saving a few lines, but I find it *much* more readable...
I totally agree that not having to call throw() and catch whatever it
bounces back is much nicer. (Now I wish there was a way to avoid the
"try..except GeneratorExit" construct in the generator, but I think I
should stop while I'm ahead. :-)
The interesting thing is that I've been dealing with generators used
as coroutines or tasks intensely on and off since July, and I haven't
had a single need for any of the three patterns that this example
happened to demonstrate:
- the need to "prime" the generator in a separate step
- throwing and catching GeneratorExit
- getting a value from close()
(I did have a lot of use for send(), throw(), and extracting a value
from StopIteration.)
In my context, generators are used to emulate concurrently running
tasks, and "yield" is always used to mean "block until this piece of
async I/O is complete, and wake me up with the result". This is
similar to the "classic" trampoline code found in PEP 342.
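To make that concrete, here's a bare-bones sketch of the style I mean
(illustrative only, not the actual code I've been working with): each
task yields zero-argument callables standing in for blocking I/O
operations, and the trampoline performs each one and wakes the task up
with the result, using the PEP 380 convention of returning a value via
StopIteration:

    def run(task):
        # Minimal trampoline: drive a single task to completion.
        try:
            op = next(task)               # prime the generator
            while True:
                op = task.send(op())      # do the "I/O", send back the result
        except StopIteration as e:
            return e.value                # PEP 380-style return value

    def reader(f):
        # "Block" until a line of input is available, then return it.
        data = yield (lambda: f.readline())
        return data.upper()

    import io
    print(run(reader(io.StringIO("hello\n"))))   # prints HELLO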
In fact, when I wrote the example for this thread, I fumbled a bit
because the use of generators there is different from how I had been using
them (though it was no doubt thanks to having worked with them
intensely that I came up with the example quickly).
So, it is clear that generators are extremely versatile, and PEP 380
deserves several good use cases to explain all the API subtleties.
BTW, while I have you, what do you think of Greg's "cofunctions" proposal?
--
--Guido van Rossum (python.org/~guido)
Seconding Raymond's 'drat'. Resending to python-ideas.
On Sat, 16 Oct 2010 07:00:16 am Raymond Hettinger wrote:
> Hello guys. If you don't mind, I would like to hijack your thread
> :-)
Please do :)
> A few years ago, Guido and other python devvers supported a
> proposal I made to create a stats module, but I didn't have time
> to develop it.
[...]
> I think the creativity and energy of this group is much better
> directed at building a quality stats module (perhaps with some R-like
> capabilities).
+1
Are you still interested in working on it, or is this a subtle hint that
somebody else should do so?
--
Steven D'Aprano
I've been pondering the whole close()-returning-a-value
thing, and I've convinced myself once again that it's a bad
idea.
Essentially the problem is that we're trying to make
the close() method, and consequently GeneratorExit,
serve two different and incompatible roles.
One role (the one it currently serves) is as an
emergency bail-out mechanism. In that role, when we
have a stack of generators delegating via yield-from,
we want things to behave as though the GeneratorExit
originates in the innermost one and propagates back
out of the entire stack. We don't want any of the
intermediate generators to catch it and turn it
into a StopIteration, because that would give the
next outer one the misleading impression that it's
business as usual, but it's not.
This is why PEP 380 currently specifies that, after
calling the close() method of the subgenerator,
GeneratorExit is unconditionally re-raised in the
delegating generator.
The proponents of close()-returning-a-value, however,
want GeneratorExit to serve another role: as a way
of signalling to a consuming generator (i.e. one that
is having values passed into it using send()) that
there are no more values left to pass in.
It seems to me that this is analogous to a function
reading values from a file, or getting them from an
iterator. The behaviour that's usually required in
the presence of delegation is quite different in those
cases.
Consider a function f1, that calls another function
f2, which loops reading from a file. When f2 reaches
the end of the file, this is a signal that it should
finish what it's doing and return a value to f1, which
then continues in its usual way.
Similarly, if f2 uses a for-loop to iterate over
something, when the iterator is exhausted, f2 continues
and returns normally.
I don't see how GeneratorExit can be made to fulfil
this role, i.e. as a "producer exhausted" signal,
without compromising its existing one. And if that
idea is dropped, the idea of close() returning a value
no longer has much motivation that I can see.
So how should "producer exhausted" be signalled, and
how should the result of a consumer generator be returned?
As for returning the result, I think it should be done
using the existing PEP 380 mechanism, i.e. the generator
executes a "return", consequently raising StopIteration
with the value. A delegating generator will then see
this as the result of a yield-from and continue normally.
As for the signalling mechanism, I think that's entirely
a matter for the producer and consumer to decide between
themselves. One way would be to send() in a sentinel value,
if there is a suitable out-of-band value available.
Another would be to throw() in some pre-arranged exception,
perhaps EOFError as a suggested convention.
If we look at files as an analogy, we see a similar range
of conventions. Most file reading operations return an empty
string or bytes object on EOF. Some, such as the built-in
input(), raise an exception (EOFError), because the empty
element of the relevant type is also a valid return value.
As an example, a consumer generator using None as a
sentinel value might look like this:
def summer():
    tot = 0
    while 1:
        x = yield
        if x is None:
            break
        tot += x
    return tot
and a producer using it:
s = summer()
next(s)
for x in values:
    s.send(x)
try:
    s.send(None)
except StopIteration as e:
    result = e.value
Having to catch StopIteration is a little tedious, but it
could easily be encapsulated in a helper function:
def close_consumer(g, sentinel):
    try:
        g.send(sentinel)
    except StopIteration as e:
        return e.value
The helper function could also take care of another issue
that arises. What happens if a delegating consumer carries
on after a subconsumer has finished and yields again?
The analogous situation with files is trying to read from
a file that has already signalled EOF before. In that case,
the file simply signals EOF again. Similarly, calling
next() on an exhausted iterator raises StopIteration again.
So, if a "finished" consumer yields again, and we are using
a sentinel value, the yield should return the sentinel again.
We can get this behaviour by writing our helper function like
this:
def close_consumer(g, sentinel):
    while 1:
        try:
            g.send(sentinel)
        except StopIteration as e:
            return e.value
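For completeness, here is the throw()-based variant of the same
convention, using EOFError as the pre-arranged signal (a sketch along
the same lines as the code above, again relying on the PEP 380
return-value mechanism):

def summer2():
    tot = 0
    while 1:
        try:
            x = yield
        except EOFError:
            break
        tot += x
    return tot

s = summer2()
next(s)
for x in values:
    s.send(x)
try:
    s.throw(EOFError)
except StopIteration as e:
    result = e.value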
So in summary, I think PEP 380 and current generator
semantics are fine as they stand with regard to the
behaviour of close(). Signalling the end of a stream of
values to a consumer generator can and should be handled
by convention, using existing facilities.
--
Greg
In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?)
is a valid character for identifiers. I find that using it well can improve
readability of programs written in those languages.
Python 3 now allow all kinds of unicode characters in source code for
identifiers. This is fantastic when one wants to teach programming to
non-English speakers and have them use meaningful identifiers.
While Python 3 does not allow ?, it does allow characters like ʔ (
http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29) which can be used
to good effect in writing valid identifiers such as functions that return
either True or False, etc., thus improving (imo) readability.
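For instance, this is already valid Python 3 today (a toy example of my
own):

    def emptyʔ(seq):
        # reads as "empty?" -- True if seq has no elements
        return len(seq) == 0

    print(emptyʔ([]))   # True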
Given that one can legally mimic ? in Python identifiers, and given that the
? symbol is not used for anything in Python, would it be possible to
consider allowing the use of ? as a valid character in an identifier?
André
I'd like to resurrect a discussion that went on a little over a year
ago [1] started by Michael Foord suggesting that it'd be nice if
keyword arguments' storage was implemented as an ordered dict as
opposed to the current unordered form.
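(To make the motivation concrete, here is a toy example of mine, not
from the original thread: rendering HTML-ish attributes, where one would
like the output order to match the call site.)

    def make_element(tag, **attrs):
        # With ordered keyword arguments, the attributes would be
        # rendered in the order they were written at the call site.
        parts = ['%s="%s"' % kv for kv in attrs.items()]
        return "<%s %s>" % (tag, " ".join(parts))

    # Desired output: <img src="x.png" alt="logo">
    # With a plain dict, the attribute order is arbitrary.
    print(make_element("img", src="x.png", alt="logo"))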
I'm interested in picking this up for implementation, which presumably
will require moving the implementation of the existing ordereddict
class into the C library.
Are there any issues that this might cause in implementation on the
py3k development line?
[1] http://mail.python.org/pipermail/python-ideas/2009-April/004163.html
--
Chris R.
Not to be taken literally, internally, or seriously.
Guido van Rossum wrote:
> I'd also like to convince you to change g.close() so that it captures
> and returns the return value from StopIteration if it has one.
Looking at this again, I find that I'm not really sure how
this impacts PEP 380. The current expansion specifies that
when a delegating generator is closed, the subgenerator's
close() method is called, any value it returns is ignored,
and GeneratorExit is re-raised.
If that close() call were to return a value, what do you
think should be done with it?
--
Greg
On 10/28/10, Antoine Pitrou <solipsis(a)pitrou.net> wrote:
> On Thu, 28 Oct 2010 19:58:59 +0200
> spir <denis.spir(a)gmail.com> wrote:
>> What does the current implementation use as buckets?
> It uses an open addressing strategy. Each dict entry holds three
> pointer-sized fields: key object, value object, and cached hash value
> of the key.
> (set entries have only two fields, since they don't hold a value object)
Has anyone benchmarked not storing the hash value here?
For a string dict, that hash should already be available on the string
object itself, so it is redundant. Keeping it obviously improves
cache locality, but ... it also makes the dict objects 50% larger, and
there is a chance that the strings themselves would already be in
cache anyhow. And if strings were reliably interned, the comparison
check should normally just be a pointer compare -- possibly fast
enough that the "different hash" shortcut doesn't buy anything.
[caveats about still needing to go to the slower dict implementation
for string subclasses]
-jJ
I find myself sometimes, when writing I/O code in C, wanting to pass memory that I have allocated internally and filled with data to Python, without copying it into a string object.
To this end, I have locally created a (2.x) method called PyBuffer_FromMemoryAndDestructor(), which is the same as PyBuffer_FromMemory() except that it will call a provided destructor function with an optional arg, to release the memory address given, when no longer in use.
First of all, I'd futilely like to suggest this change for 2.x. The existing PyBuffer_FromMemory() provides no lifetime management.
Second, the PyBuffer object doesn't support the new Py_buffer interface, so you can't really use it in that way, e.g. by putting a memoryview around it. On the other hand, this is a fixable bug.
Thirdly, in py3k I think the situation is different. There you would (probably, correct me if I'm wrong) emulate the old PyBuffer_FromMemory() with a combination of the new PyBuffer_FromContiguous() and a PyMemoryView_FromBuffer(). But this also does not allow any lifetime management of the external memory. So, for py3k, I'd actually like to extend the memoryview object, and provide something like PyMemoryView_FromExternal() that takes an optional pointer to a destructor function, "void destructor(void *arg, void *ptr)", and a "void *arg", to be called when the buffer is released.
K