Here's an updated version of the PEP reflecting my
recent suggestions on how to eliminate 'codef'.
PEP: XXX
Title: Cofunctions
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing(a)canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 3.x
Post-History:
Abstract
========
A syntax is proposed for defining and calling a special type of generator
called a 'cofunction'. It is designed to provide a streamlined way of
writing generator-based coroutines, and allow the early detection of
certain kinds of error that are easily made when writing such code, which
otherwise tend to cause hard-to-diagnose symptoms.
This proposal builds on the 'yield from' mechanism described in PEP 380,
and describes some of the semantics of cofunctions in terms of it. However,
it would be possible to define and implement cofunctions independently of
PEP 380 if so desired.
Specification
=============
Cofunction definitions
----------------------
A cofunction is a special kind of generator, distinguished by the presence
of the keyword ``cocall`` (defined below) at least once in its body. It may
also contain ``yield`` and/or ``yield from`` expressions, which behave as
they do in other generators.
From the outside, the distinguishing feature of a cofunction is that it cannot
be called the same way as an ordinary function. An exception is raised if an
ordinary call to a cofunction is attempted.
Cocalls
-------
Calls from one cofunction to another are made by marking the call with
a new keyword ``cocall``. The expression
::
    cocall f(*args, **kwds)
is evaluated by first checking whether the object ``f`` implements
a ``__cocall__`` method. If it does, the cocall expression is
equivalent to
::
    yield from f.__cocall__(*args, **kwds)
except that the object returned by ``__cocall__`` is expected to be an
iterator, so the step of calling ``iter()`` on it is skipped.
If ``f`` does not have a ``__cocall__`` method, or the ``__cocall__``
method returns ``NotImplemented``, then the cocall expression is
treated as an ordinary call, and the ``__call__`` method of ``f``
is invoked.
Objects which implement ``__cocall__`` are expected to return an object
obeying the iterator protocol. Cofunctions respond to ``__cocall__`` the
same way as ordinary generator functions respond to ``__call__``, i.e. by
returning a generator-iterator.

Certain objects that wrap other callable objects, notably bound methods,
will be given ``__cocall__`` implementations that delegate to the
underlying object.
Grammar
-------
The full syntax of a cocall expression is described by the following
grammar lines:
::
    atom: cocall | <existing alternatives for atom>
    cocall: 'cocall' atom cotrailer* '(' [arglist] ')'
    cotrailer: '[' subscriptlist ']' | '.' NAME
Note that this syntax allows cocalls to methods and elements of sequences
or mappings to be expressed naturally. For example, the following are valid:
::
    y = cocall self.foo(x)
    y = cocall funcdict[key](x)
    y = cocall a.b.c[i].d(x)
Also note that the final calling parentheses are mandatory, so that for example
the following is invalid syntax:
::
    y = cocall f    # INVALID
New builtins, attributes and C API functions
--------------------------------------------
To facilitate interfacing cofunctions with non-coroutine code, there will
be a built-in function ``costart`` whose definition is equivalent to
::
    def costart(obj, *args, **kwds):
        try:
            m = obj.__cocall__
        except AttributeError:
            result = NotImplemented
        else:
            result = m(*args, **kwds)
        if result is NotImplemented:
            raise TypeError("Object does not support cocall")
        return result
There will also be a corresponding C API function
::
    PyObject *PyObject_CoCall(PyObject *obj, PyObject *args, PyObject *kwds)
It is left unspecified for now whether a cofunction is a distinct type
of object or, like a generator function, is simply a specially-marked
function instance. If the latter, a read-only boolean attribute
``__iscofunction__`` should be provided to allow testing whether a given
function object is a cofunction.
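
For illustration, the following sketch emulates the proposed semantics
in today's Python. The ``cofunction`` wrapper class is a hypothetical
stand-in for the compiler's marking of cofunctions; only ``costart``
itself is part of this proposal.

::

    class cofunction:
        # Hypothetical stand-in: real cofunctions would be marked
        # by the compiler, not wrapped by a class.
        def __init__(self, func):
            self.func = func
        def __call__(self, *args, **kwds):
            raise TypeError("cofunctions cannot be called directly")
        def __cocall__(self, *args, **kwds):
            return self.func(*args, **kwds)   # a generator-iterator

    @cofunction
    def add_one(x):
        yield                       # a suspension point
        return x + 1

    it = costart(add_one, 41)       # costart as defined above
    try:
        while True:
            next(it)
    except StopIteration as e:
        print(e.value)              # -> 42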
Motivation and Rationale
========================
The ``yield from`` syntax is reasonably self-explanatory when used for the
purpose of delegating part of the work of a generator to another function. It
can also be used to good effect in the implementation of generator-based
coroutines, but it reads somewhat awkwardly when used for that purpose, and
tends to obscure the true intent of the code.
Furthermore, using generators as coroutines is somewhat error-prone. If one
forgets to use ``yield from`` when it should have been used, or uses it when it
shouldn't have, the symptoms that result can be extremely obscure and confusing.
Finally, sometimes there is a need for a function to be a coroutine even though
it does not yield anything, and in these cases it is necessary to resort to
kludges such as ``if 0: yield`` to force it to be a generator.
The ``cocall`` construct addresses the first issue by making the syntax directly
reflect the intent, that is, that the function being called forms part of a
coroutine.
The second issue is addressed by making it impossible to mix coroutine and
non-coroutine code in ways that don't make sense. If the rules are violated, an
exception is raised that points out exactly what and where the problem is.
Lastly, the need for dummy yields is eliminated by making it possible for a
cofunction to call both cofunctions and ordinary functions with the same syntax,
so that an ordinary function can be used in place of a cofunction that yields
zero times.
Record of Discussion
====================
An earlier version of this proposal required a special keyword ``codef`` to be
used in place of ``def`` when defining a cofunction, and disallowed calling an
ordinary function using ``cocall``. However, it became evident that these
features were not necessary, and the ``codef`` keyword was dropped in the
interests of minimising the number of new keywords required.
The use of a decorator instead of ``codef`` was also suggested, but the current
proposal makes this unnecessary as well.
It has been questioned whether some combination of decorators and functions
could be used instead of a dedicated ``cocall`` syntax. While this might be
possible, to achieve equivalent error-detecting power it would be necessary
to write cofunction calls as something like
::
    yield from cocall(f)(args)
making them even more verbose and inelegant than an unadorned ``yield from``.
It is also not clear whether it is possible to achieve all of the benefits of
the cocall syntax using this kind of approach.
Prototype Implementation
========================
An implementation of an earlier version of this proposal in the form of patches
to Python 3.1.2 can be found here:
http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cofunctions.h…
If this version of the proposal is received favourably, the implementation will
be updated to match.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
On 12/10/12 11:04, Mark Adam wrote:
> On Thu, Oct 11, 2012 at 6:35 AM, Steven D'Aprano<steve(a)pearwood.info> wrote:
>> On 11/10/12 16:45, Greg Ewing wrote:
>>> Are you sure there would be any point in this? People who
>>> specifically *want* base-2 floats are probably quite happy
>>> with the current float type, and wouldn't appreciate having
>>> it slowed down, even by a small amount.
>>
>> I would gladly give up a small amount of speed for better control
>> over floats, such as whether 1/0.0 raised an exception or
>> returned infinity.
>
> Umm, you would be giving up a *lot* of speed. Native floating point
> happens right in the processor, so if you want special behavior, you'd
> have to take the floating point out of hardware and into "user space".
Any half-decent processor supports the IEEE-754 standard. If it doesn't,
it's broken by design.
Even in user-space, you're not giving up that much speed in practical
terms, at least not for my needs. The new decimal module in Python 3.3 is
less than a factor of ten slower than Python's floats, which makes it
pretty much instantaneous to my mind :)
numpy supports configurable numeric contexts, and I don't hear that many
complaints that numpy is slower than standard Python.
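
For a concrete illustration, the decimal module (its real API, nothing
hypothetical here) already gives exactly that kind of per-block choice:

    from decimal import Decimal, DivisionByZero, localcontext

    with localcontext() as ctx:
        ctx.traps[DivisionByZero] = False   # return infinity instead of raising
        print(Decimal(1) / Decimal(0))      # -> Infinity

    # Decimal(1) / Decimal(0)               # default context: raises DivisionByZero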
--
Steven
It seems that built-in classes do not short-circuit the `__eq__` method
when the objects are identical, at least in CPython:
f = frozenset(range(200000000))
f1 = f
f1 == f # this operation will take about 1 sec on my machine
Is there any disadvantage to checking whether equality was called with
the same object, and if it was, returning `True` right away? I noticed
this when trying to memoize a function that has large frozenset
arguments. While hashing of a large argument is very fast after it's
done once (the hash value is presumably cached), the equality comparison
is always slow, even against itself. So when the same large argument is
provided over and over, memoization is slow.
Of course, there's a workaround: subclass frozenset, and redefine
__eq__ to check id() first. And arguably, for this particular use case,
I should redefine both __hash__ and __eq__ to look exclusively at id(),
since it's not worth wasting memoizer time trying to compare two
non-identical large arguments that are highly unlikely to compare equal
anyway. So if there's any reason for the current implementation, I don't
have a strong argument against it.
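
For reference, a minimal sketch of that workaround (the subclass name
is mine, purely illustrative):

    class IdFrozenset(frozenset):
        def __eq__(self, other):
            if self is other:              # identity short-circuit
                return True
            return frozenset.__eq__(self, other)
        __hash__ = frozenset.__hash__      # keep hashability after overriding __eq__

    f = IdFrozenset(range(1000))
    assert f == f                          # returns immediately via the identity check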
On Fri, Oct 12, 2012 at 10:34 PM, Mike Graham <mikegraham(a)gmail.com> wrote:
> On Fri, Oct 12, 2012 at 4:27 PM, Ram Rachum <ram.rachum(a)gmail.com> wrote:
> > Hi everybody,
> >
> > Today a funny thought occurred to me. Ever since I've learned to program
> > when I was a child, I've taken for granted that when programming, the
> > sign used for multiplication is *. But now that I think about it, why?
> > Now that we have Unicode, why not use · ?
> >
> > Do you think that we can make Python support · in addition to *?
> >
> > I can think of a couple of problems, but none of them seem like
> > deal-breakers:
> >
> > - Backward compatibility: Python already uses *, but I don't see a
> >   backward compatibility problem with supporting · additionally. Let
> >   people use whichever they want, like spaces and tabs.
> > - Input methods: I personally use an IDE that could be easily set to
> >   automatically convert * to · where appropriate and to allow manual
> >   input of ·. People on Linux can type Alt-. . Anyone else can set up
> >   a script that'll let them type · using whichever keyboard combination
> >   they want. I admit this is pretty annoying, but since you can always
> >   use * if you want to, I figure that anyone who cares enough about
> >   using · instead of * (I bet that people in scientific computing would
> >   like that) would be willing to take the time to set it up.
> >
> > What do you think?
> >
> > Ram
>
> Python should not expect characters that are hard for most people to
> type.
No one will be forced to type it. If you can't type it, use *.
> Python should not expect characters that are still hard to
> display on many common platforms.
>
We allow people to have unicode variable names, if they wish, don't we?
So why not allow them to use a unicode operator, if they wish, as a
completely optional thing?
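
For what it's worth, unicode identifiers are already legal in Python 3;
only the operator itself would be new syntax:

    π = 3.14159
    r = 2.0
    c = 2 * π * r    # works today; 2 · π · r would require new syntax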
>
> I think you'll find strong opposition to adding any non-ASCII
> characters or characters that don't occur on almost all keyboards as
> part of the language.
>
> Mike
>
Hi python-ideas,
I'm jumping in to this thread on behalf of Tornado. I think there are
actually two separate issues here and it's important to keep them
distinct: at a low level, there is a need for a standardized event
loop, while at a higher level there is a question of what asynchronous
code should look like.
This thread so far has been more about the latter, but the need for
standardization is more acute for the core event loop. I've written a
bridge between Tornado and Twisted so libraries written for both event
loops can coexist, but obviously that wouldn't scale if there were a
proliferation of event loop implementations out there. I'd be in
favor of a simple event loop interface in the standard library, with
reference implementation(s) (select, epoll, kqueue, iocp) and some
means of configuring the global (or thread-local) singleton. My
preference is to keep the interface fairly low-level and close to the
underlying mechanisms (i.e. like IReactorFDSet instead of
IReactor{TCP,UDP,SSL,etc}), so that different interfaces like
Tornado's IOStream or Twisted's protocols can be built on top of it.
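
To make the level I have in mind concrete, here's a rough sketch (the
names are illustrative, not Tornado's or Twisted's actual API):

    import select

    class EventLoop:
        """Watches file descriptors and dispatches ready-to-read events."""
        def __init__(self):
            self.handlers = {}                # fd -> callback

        def add_handler(self, fd, callback):
            self.handlers[fd] = callback

        def remove_handler(self, fd):
            del self.handlers[fd]

        def run_once(self, timeout=None):
            ready, _, _ = select.select(list(self.handlers), [], [], timeout)
            for fd in ready:
                self.handlers[fd](fd)

    _singleton = None

    def current():
        """The global singleton mentioned above."""
        global _singleton
        if _singleton is None:
            _singleton = EventLoop()
        return _singleton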
As for the higher-level question of what asynchronous code should look
like, there's a lot more room for spirited debate, and I don't think
there's enough consensus to declare a One True Way. Personally, I'm
-1 on greenlets as a general solution (what if you have to call
MySQLdb or getaddrinfo?), although they can be useful in particular
cases to convert well-behaved synchronous code into async (as in
Motor: http://emptysquare.net/blog/introducing-motor-an-asynchronous-mongodb-drive…).
I like Futures, though, and I find that they work well in
asynchronous code. The use of the result() method to encapsulate both
successful responses and exceptions is especially nice with generator
coroutines.
FWIW, here's the interface I'm moving towards for async code. From
the caller's perspective, asynchronous functions return a Future (the
future has to be constructed by hand since there is no Executor
involved), and also take an optional callback argument (mainly for
consistency with currently-prevailing patterns for async code; if the
callback is given it is simply added to the Future with
add_done_callback). In Tornado the Future is created by a decorator
and hidden from the asynchronous function (it just sees the callback),
although this relies on some Tornado-specific magic for exception
handling. In a coroutine, the decorator recognizes Futures and
resumes execution when the future is done. With these decorators
asynchronous code looks almost like synchronous code, except for the
"yield" keyword before each asynchronous call.
-Ben
On 9 October 2012 02:07, Guido van Rossum <guido(a)python.org> wrote:
> On Mon, Oct 8, 2012 at 5:32 PM, Oscar Benjamin
> <oscar.j.benjamin(a)gmail.com> wrote:
>> On 9 October 2012 01:11, Guido van Rossum <guido(a)python.org> wrote:
>>> On Mon, Oct 8, 2012 at 5:02 PM, Greg Ewing <greg.ewing(a)canterbury.ac.nz> wrote:
>>>>
>>>> So the question that really needs to be answered, I think, is
>>>> not "Why is NaN == NaN false?", but "Why doesn't NaN == anything
>>>> raise an exception, when it would make so much more sense to
>>>> do so?"
>>>
>>> Because == raising an exception is really unpleasant. We had this in
>>> Python 2 for unicode/str comparisons and it was very awkward.
>>>
>>> Nobody arguing against the status quo seems to care at all about
>>> numerical algorithms though. I propose that you go find some numerical
>>> mathematicians and ask them.
>>
>> The main purpose of quiet NaNs is to propagate through computation
>> ruining everything they touch. In a programming language like C that
>> lacks exceptions this is important as it allows you to avoid checking
>> all the time for invalid values, whilst still being able to know if
>> the end result of your computation was ever affected by an invalid
>> numerical operation. The reasons for NaNs to compare unequal are no
>> doubt related to this purpose.
>>
>> It is of course arguable whether the same reasoning applies to a
>> language like Python that has a very good system of exceptions but I
>> agree with Guido that raising an exception on == would be unfortunate.
>> How many people would forget that they needed to catch those
>> exceptions? How awkward could your code be if you did remember to
>> catch all those exceptions? In an exception handling language it's
>> important to know that there are some operations that you can trust.
>
> If we want to do *anything* I think we should first introduce a
> floating point context similar to the Decimal context. Then we can
> talk.
The other thread has gone on for ages now and isn't going anywhere.
Guido's suggestion here is much more interesting (to me) so I want to
start a new thread on this subject. Python's default handling of
floating point operations is IEEE-754 compliant which in my opinion is
the obvious and right thing to do.
However, Python is a much more versatile language than some of the
other languages for which IEEE-754 was designed. Python offers the
possibility of a very rich approach to the control and verification of
the accuracy of numeric operations on both a function-by-function and
code-block-by-code-block basis. This kind of functionality is already
implemented in the decimal module [1] as well as numpy [2], gmpy [3],
sympy [4] and no doubt other numerical modules that I'm not aware of.
It would be a real blessing to numerical Python programmers if
either/both of the following were to occur:
1) Support for calculation contexts with floats
2) A generic kind of calculation context manager that was recognised
widely by the builtin/stdlib types and also by third party numerical
packages.
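
As a sketch of the existing models (the decimal and numpy APIs below
are real; a float analogue is the missing piece):

    from decimal import Decimal, Inexact, localcontext

    with localcontext() as ctx:
        ctx.traps[Inexact] = True       # raise if any result is inexact
        Decimal(1) / Decimal(4)         # exact: passes
        # Decimal(1) / Decimal(3)       # would raise Inexact

    import numpy as np

    with np.errstate(divide='raise'):   # per-block control for numpy floats
        try:
            np.float64(1.0) / np.float64(0.0)
        except FloatingPointError:
            print("trapped division by zero")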
Oscar
References:
[1] http://docs.python.org/library/decimal.html#context-objects
[2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.seterr.html#numpy…
[3] https://gmpy2.readthedocs.org/en/latest/mpfr.html
[4] http://docs.sympy.org/dev/modules/mpmath/contexts.html
The literal"\c" should be an error but in practice means "\\c". It's
probably too late to make this invalid syntax as it out to be, but I
wonder if a warning isn't in order, especially with the theoretical
potential of adding new string escapes in the future.
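
A quick demonstration of the current behaviour:

    >>> "\c" == "\\c"    # unrecognized escape: the backslash is kept literally
    True
    >>> len("\c")
    2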
Since StopIteration now has a value, this value is lost when using
functions that work with iterators/generators (map, filter, itertools).
Therefore wrapping an iterator, which preserved its semantics in
versions before 3.3, no longer preserves it:
map(lambda x: x, iterator)
filter(lambda x: True, iterator)
itertools.accumulate(iterator, lambda x, y: y)
itertools.chain(iterator)
itertools.compress(iterator, itertools.cycle([True]))
itertools.dropwhile(lambda x: False, iterator)
itertools.filterfalse(lambda x: False, iterator)
next(itertools.groupby(iterator, lambda x: None))[1]
itertools.takewhile(lambda x: True, iterator)
itertools.tee(iterator, 1)[0]
Perhaps it would be worth propagating the original exception (or at
least its value) in the functions for which it makes sense.
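
A minimal demonstration of the loss:

    def gen():
        yield 1
        return 'result'        # raises StopIteration('result') in 3.3

    def drain(it):
        try:
            while True:
                next(it)
        except StopIteration as e:
            return e.value

    print(drain(gen()))                       # -> 'result'
    print(drain(map(lambda x: x, gen())))     # -> None: the value is lost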
I regularly see learners using "is" to check for string equality and
sometimes other equality. Due to optimizations, they often come away
thinking it worked for them.
There are no cases where
if x is "foo":
or
if x is 4:
is actually the code someone intended to write.
Although this has no benefit to anyone but new learners, it also
doesn't really do any harm.
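
A quick demonstration of why it only *seems* to work in CPython:

    x = "foo"
    print(x is "foo")                  # often True: string literals are interned

    y = "".join(["f", "o", "o"])
    print(y == "foo")                  # True
    print(y is "foo")                  # typically False: a distinct object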
Mike
A couple of weeks ago I posted a question on superuser.com about whether
there is a way to get the same *very* convenient
stepping-through-command-history behaviour in an interactive Python
interpreter session as is possible in (at least) the bash shell with the
Ctrl-o keybinding:
http://superuser.com/questions/477997/key-binding-to-interactively-execute-…
I was spurred to ask this question by a painful development experience
full of Up Up Up Up Up Enter Up Up Up Up Up Enter ... keypresses to
repeat a previous set of Python commands/statements that weren't worth
putting in a script file, or which I wanted to make very minor changes
to on each iteration.
As you might have noticed, I didn't get any answers, which either means
that I'm the only person in the world to think this is an issue worth
getting bothered about, or that there is no such behaviour available.
Perhaps both -- but my feeling is that if this behaviour were available
and well-known, it would become heavily used and very popular. As many
other readline behaviours *do* work, this one would be really nice to
have -- any chance that it could be added to a future release? (if it's
not already there via some secret binding)
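
For what it's worth, if the readline library Python links against
implemented bash's operate-and-get-next function, something like this
in a PYTHONSTARTUP file might be all it takes (untested; my
understanding is that stock GNU readline doesn't provide that function,
which may be exactly why the binding is missing):

    import readline
    readline.parse_and_bind(r'"\C-o": operate-and-get-next')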
Thanks!
Andy