From Steve.Dower at microsoft.com  Sat Dec  1 00:32:04 2012
From: Steve.Dower at microsoft.com (Steve Dower)
Date: Fri, 30 Nov 2012 23:32:04 +0000
Subject: [Python-ideas] An async facade? (was Re: [Python-Dev] Socket
 timeout and completion based sockets)
In-Reply-To: <CAP7+vJL=7jNuTJ8VuT1LDw9dKo8TdxPTH+6cs7-gJA6GmnGROQ@mail.gmail.com>
References: <EFE3877620384242A686D52278B7CCD329DEB553@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJWDEwAvKB1VVrz-RX6b9kO3TpxnwUGnoq57746MV1WFg@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DEBF4B@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJKpeEDV9ZpvJqN2_joOt4cKj9B1BJ5gYeZusnwcnu_VwQ@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DED074@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJKJrYbXFF8EjBSYdicVMKZv1J4A_rc4rdw6VkMkEg6Fg@mail.gmail.com>
	<1D9BE0CD-5BF4-480D-8D40-5A409E40760D@twistedmatrix.com>
	<20121130161422.GB536@snakebite.org>
	<A7269F03D11BC245BD52843B195AC4F0019E46B6@TK5EX14MBXC293.redmond.corp.microsoft.com>
	<CAP7+vJ+np39bRs-F3YCcsyvZVSzDRdC7A2XvMHTDqHd3emW8mw@mail.gmail.com>
	<50B93536.30104@canterbury.ac.nz>
	<CAP7+vJL=7jNuTJ8VuT1LDw9dKo8TdxPTH+6cs7-gJA6GmnGROQ@mail.gmail.com>
Message-ID: <A7269F03D11BC245BD52843B195AC4F0019E48EB@TK5EX14MBXC293.redmond.corp.microsoft.com>

Guido van Rossum wrote:
> Greg Ewing wrote:
>> Guido van Rossum wrote:
>>>
>>> Futures or callbacks, that's the question...
>>>
>>> Richard and I have even been considering APIs like this:
>>>
>>> res = obj.some_call(<args>)
>>> if isinstance(res, Future):
>>>     res = yield res
>>
>>
>> I thought you had decided against the idea of yielding futures?
>
> As a user-facing API style, yes. But this is meant for an internal API
> -- the equivalent of your bare 'yield'. If you want to, I can consider another style as well
>
>
> res = obj.some_call(<args>)
> if isinstance(res, Future):
>     res.<magic_call>()
>     yield
>
> But I don't see a fundamental advantage to this.

I do: it completely avoids ever using yield from to pass values around when generators are used as coroutines.

If values are either always yielded or never yielded, then it is easy (or at least easier) to detect errors such as:

def func():
    data = yield from get_data_async()
    for x in data:
        yield x

When values are sometimes yielded and sometimes not, it's much harder to reliably throw an error when a value is yielded unexpectedly. Always using bare yields lets the code calling __next__() (I forget whether we're calling this the "scheduler"...) raise an error whenever the yielded value is not None.
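For concreteness, a minimal sketch (not from the thread, names illustrative) of a driver enforcing that bare-yield rule: it steps the coroutine with next() and treats any non-None yielded value as a programming error.

```python
# A minimal sketch of the bare-yield rule: the driver steps a coroutine
# with next() and rejects any non-None yielded value. The names here are
# illustrative, not a proposed API.
def run(gen):
    """Drive a coroutine that may only suspend with a bare 'yield'."""
    while True:
        try:
            value = next(gen)
        except StopIteration as exc:
            return exc.value          # the coroutine's return value (3.3+)
        if value is not None:
            raise RuntimeError('coroutine yielded a value: %r' % (value,))

def get_data():
    yield                  # bare yield: suspend, the driver resumes us
    return [1, 2, 3]       # result travels via StopIteration.value

def leaky():
    yield 42               # accidentally leaks a value to the driver

print(run(get_data()))     # [1, 2, 3]
```

With this rule, the buggy func() above (which yields data items from a coroutine) fails loudly on the first non-None yield instead of silently feeding values to the scheduler.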


Cheers,
Steve


From guido at python.org  Sat Dec  1 00:48:23 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 30 Nov 2012 15:48:23 -0800
Subject: [Python-ideas] An async facade? (was Re: [Python-Dev] Socket
 timeout and completion based sockets)
In-Reply-To: <A7269F03D11BC245BD52843B195AC4F0019E48EB@TK5EX14MBXC293.redmond.corp.microsoft.com>
References: <EFE3877620384242A686D52278B7CCD329DEB553@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJWDEwAvKB1VVrz-RX6b9kO3TpxnwUGnoq57746MV1WFg@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DEBF4B@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJKpeEDV9ZpvJqN2_joOt4cKj9B1BJ5gYeZusnwcnu_VwQ@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DED074@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJKJrYbXFF8EjBSYdicVMKZv1J4A_rc4rdw6VkMkEg6Fg@mail.gmail.com>
	<1D9BE0CD-5BF4-480D-8D40-5A409E40760D@twistedmatrix.com>
	<20121130161422.GB536@snakebite.org>
	<A7269F03D11BC245BD52843B195AC4F0019E46B6@TK5EX14MBXC293.redmond.corp.microsoft.com>
	<CAP7+vJ+np39bRs-F3YCcsyvZVSzDRdC7A2XvMHTDqHd3emW8mw@mail.gmail.com>
	<50B93536.30104@canterbury.ac.nz>
	<CAP7+vJL=7jNuTJ8VuT1LDw9dKo8TdxPTH+6cs7-gJA6GmnGROQ@mail.gmail.com>
	<A7269F03D11BC245BD52843B195AC4F0019E48EB@TK5EX14MBXC293.redmond.corp.microsoft.com>
Message-ID: <CAP7+vJ+x_3k8cSQ+M7MPn8mHt3f8Rc0-VvafU_7y9Pu+wx95Zw@mail.gmail.com>

On Fri, Nov 30, 2012 at 3:32 PM, Steve Dower <Steve.Dower at microsoft.com> wrote:
> Guido van Rossum wrote:
>> Greg Ewing wrote:
>>> Guido van Rossum wrote:
>>>>
>>>> Futures or callbacks, that's the question...
>>>>
>>>> Richard and I have even been considering APIs like this:
>>>>
>>>> res = obj.some_call(<args>)
>>>> if isinstance(res, Future):
>>>>     res = yield res
>>>
>>>
>>> I thought you had decided against the idea of yielding futures?
>>
>> As a user-facing API style, yes. But this is meant for an internal API
>> -- the equivalent of your bare 'yield'. If you want to, I can consider another style as well
>>
>>
>> res = obj.some_call(<args>)
>> if isinstance(res, Future):
>>     res.<magic_call>()
>>     yield
>>
>> But I don't see a fundamental advantage to this.
>
> I do, it completely avoids ever using yield from to pass values around when used for coroutines.
>
> If values are always yielded or never yielded then it is easy (or easier) to detect errors such as:
>
> def func():
>     data = yield from get_data_async()
>     for x in data:
>         yield x
>
> When values are sometimes yielded and sometimes not, it's much harder to reliably throw an error when a value was yielded. Always using bare yields lets the code calling __next__() (I forget whether we're calling this "scheduler"...) raise an error if the value is not None.

Good point. I'll keep this in mind.

-- 
--Guido van Rossum (python.org/~guido)


From rene at stranden.com  Sat Dec  1 00:57:08 2012
From: rene at stranden.com (Rene Nejsum)
Date: Sat, 1 Dec 2012 00:57:08 +0100
Subject: [Python-ideas] An async facade? (was Re: [Python-Dev] Socket
	timeout and completion based sockets)
In-Reply-To: <CAP7+vJ+np39bRs-F3YCcsyvZVSzDRdC7A2XvMHTDqHd3emW8mw@mail.gmail.com>
References: <EFE3877620384242A686D52278B7CCD329DEB553@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJWDEwAvKB1VVrz-RX6b9kO3TpxnwUGnoq57746MV1WFg@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DEBF4B@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJKpeEDV9ZpvJqN2_joOt4cKj9B1BJ5gYeZusnwcnu_VwQ@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DED074@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJKJrYbXFF8EjBSYdicVMKZv1J4A_rc4rdw6VkMkEg6Fg@mail.gmail.com>
	<1D9BE0CD-5BF4-480D-8D40-5A409E40760D@twistedmatrix.com>
	<20121130161422.GB536@snakebite.org>
	<A7269F03D11BC245BD52843B195AC4F0019E46B6@TK5EX14MBXC293.redmond.corp.microsoft.com>
	<CAP7+vJ+np39bRs-F3YCcsyvZVSzDRdC7A2XvMHTDqHd3emW8mw@mail.gmail.com>
Message-ID: <367DB117-A21A-4A9E-A401-3FCF4C6FE6FD@stranden.com>


On Nov 30, 2012, at 8:04 PM, Guido van Rossum <guido at python.org> wrote:

> Futures or callbacks, that's the question?

I would strongly recommend Futures, most importantly because they seem to handle threads more elegantly, since it is easier to move between threads.

> 
> Richard and I have even been considering APIs like this:
> 
> res = obj.some_call(<args>)
> if isinstance(res, Future):
>    res = yield res
> 
> or
> 
> res = obj.some_call(<args>)
> if res is None:
>    res = yield <magic>
> 
> where <magic> is some call on the scheduler/eventloop/proactor that
> pulls the future out of a hat.
> 
> The idea of the first version is simply to avoid the Future when the
> result happens to be immediately ready (e.g. when calling readline()
> on some buffering stream, most of the time the next line is already in
> the buffer); the point of the second version is that "res is None" is
> way faster than "isinstance(res, Future)" -- however the magic is a
> little awkward.
> 
> The debate is still open.
Great :-)

I understand that there are several layers involved (1) old-style function calls, 2) yield/coroutines, and 3) threads), but I believe a model that handles all levels alike would be preferable. As a third API, consider:

res = obj.some_call(<args>)
self.other_call()
print res

some_call() is *always* async and res is *always* a Future:

1) if executed in the same thread, it can be optimised out into a normal function call
2) if in a coroutine, it's a perfect time for a t.switch()
3) if in threads, other_call() continues and res blocks if not ready
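A rough sketch of that uniform model using concurrent.futures (the helper names here are hypothetical, not a proposed API): every call returns a Future, with immediately available results wrapped in an already-completed Future and slow work pushed to a worker thread, so res.result() blocks only when the value is not yet ready.

```python
from concurrent.futures import Future, ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)

def completed(value):
    """Wrap an immediately available value in an already-finished Future."""
    f = Future()
    f.set_result(value)
    return f

def some_call(x, _cache={0: 0}):
    if x in _cache:
        # Case 1: same-thread fast path, effectively a normal function call.
        return completed(_cache[x])
    # Case 3: run elsewhere; the caller's res.result() blocks only if needed.
    return _pool.submit(lambda: x * 2)

res = some_call(21)    # always a Future, whichever path was taken
print(res.result())    # 42
```

The caller's code is identical in both cases, which is exactly the uniformity being argued for; the cost is that even instant results go through a Future object.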

Or maybe the notion of all objects running in separate coroutines/threads, all methods being async, and all return values being Futures is something for Python 4? :-) (or PyLang, an Erlang lookalike)
 
br
/rene

> 
> --Guido
> 
> On Fri, Nov 30, 2012 at 9:57 AM, Steve Dower <Steve.Dower at microsoft.com> wrote:
>> Trent Nelson wrote:
>>>   TL;DR version:
>>> 
>>>       Provide an async interface that is implicitly asynchronous;
>>>       all calls return immediately, callbacks are used to handle
>>>       success/error/timeout.
>> 
>> This is the central idea of what I've been advocating - the use of Future. Rather than adding an extra parameter to the initial call, asynchronous methods return an object that can have callbacks added.
>> 
>>>   The biggest benefit is that no assumption is made as to how the
>>>   asynchronicity is achieved.  Note that I didn't mention IOCP or
>>>   kqueue or epoll once.  Those are all implementation details that
>>>   the writer of an asynchronous Python app doesn't need to care about.
>> 
>> I think this is why I've been largely ignored (except by Guido) - I don't even mention sockets, let alone the implementation details :). There are all sorts of operations that can be run asynchronously that do not involve sockets, though it seems that the driving force behind most of the effort is just to make really fast web servers.
>> 
>> My code contribution is at http://bitbucket.org/stevedower/wattle, though I have not updated it in a while and there are certainly aspects that I would change. You may find it interesting if you haven't seen it yet.
>> 
>> Cheers,
>> Steve
>> 
>> -----Original Message-----
>> From: Python-ideas [mailto:python-ideas-bounces+steve.dower=microsoft.com at python.org] On Behalf Of Trent Nelson
>> Sent: Friday, November 30, 2012 0814
>> To: Guido van Rossum
>> Cc: Glyph; python-ideas at python.org
>> Subject: [Python-ideas] An async facade? (was Re: [Python-Dev] Socket timeout and completion based sockets)
>> 
>>    [ It's tough coming up with unique subjects for these async
>>      discussions.  I've dropped python-dev and cc'd python-ideas
>>      instead as the stuff below follows on from the recent msgs. ]
>> 
>>    TL;DR version:
>> 
>>        Provide an async interface that is implicitly asynchronous;
>>        all calls return immediately, callbacks are used to handle
>>        success/error/timeout.
>> 
>>            class async:
>>                def accept():
>>                def read():
>>                def write():
>>                def getaddrinfo():
>>                def submit_work():
>> 
>>        How the asynchronicity (not a word, I know) is achieved is
>>        an implementation detail, and will differ for each platform.
>> 
>>        (Windows will be able to leverage all its async APIs to full
>>         extent, Linux et al can keep mimicking asynchronicity via
>>         the usual non-blocking + multiplexing (poll/kqueue etc),
>>         thread pools, etc.)
>> 
>> 
>> On Wed, Nov 28, 2012 at 11:15:07AM -0800, Glyph wrote:
>>>   On Nov 28, 2012, at 12:04 PM, Guido van Rossum <guido at python.org> wrote:
>>>   I would also like to bring up <https://github.com/lvh/async-pep> again.
>> 
>>    So, I spent yesterday working on the IOCP/async stuff.  The saw this
>>    PEP and the sample async/abstract.py.  That got me thinking: why don't
>>    we have a low-level async facade/API?  Something where all calls are
>>    implicitly asynchronous.
>> 
>>    On systems with extensive support for asynchronous 'stuff', primarily
>>    Windows and AIX/Solaris to a lesser extent, we'd be able to leverage
>>    the platform-provided async facilities to full effect.
>> 
>>    On other platforms, we'd fake it, just like we do now, with select,
>>    poll/epoll, kqueue and non-blocking sockets.
>> 
>>    Consider the following:
>> 
>>        class Callback:
>>            __slots__ = [
>>                'success',
>>                'failure',
>>                'timeout',
>>                'cancel',
>>            ]
>> 
>>        class AsyncEngine:
>>            def getaddrinfo(host, port, ..., cb):
>>                ...
>> 
>>            def getaddrinfo_then_connect(.., callbacks=(cb1, cb2))
>>                ...
>> 
>>            def accept(sock, cb):
>>                ...
>> 
>>            def accept_then_write(sock, buf, (cb1, cb2)):
>>                ...
>> 
>>            def accept_then_expect_line(sock, line, (cb1, cb2)):
>>                ...
>> 
>>            def accept_then_expect_multiline_regex(sock, regex, cb):
>>                ...
>> 
>>            def read_until(fd_or_sock, bytes, cb):
>>                ...
>> 
>>            def read_all(fd_or_sock, cb):
>>                return self.read_until(fd_or_sock, EOF, cb)
>> 
>>            def read_until_lineglob(fd_or_sock, cb):
>>                ...
>> 
>>            def read_until_regex(fd_or_sock, cb):
>>                ...
>> 
>>            def read_chunk(fd_or_sock, chunk_size, cb):
>>                ...
>> 
>>            def write(fd_or_sock, buf, cb):
>>                ...
>> 
>>            def write_then_expect_line(fd_or_sock, buf, (cb1, cb2)):
>>                ...
>> 
>>            def connect_then_expect_line(..):
>>                ...
>> 
>>            def connect_then_write_line(..):
>>                ...
>> 
>>            def submit_work(callable, cb):
>>                ...
>> 
>>            def run_once(..):
>>                """Run the event loop once."""
>> 
>>            def run(..):
>>                """Keep running the event loop until exit."""
>> 
>>    All methods always take at least one callback.  Chained methods can
>>    take multiple callbacks (i.e. accept_then_expect_line()).  You fill
>>    in the success, failure (both callables) and timeout (an int) slots.
>>    The engine will populate cb.cancel with a callable that you can call
>>    at any time to (try and) cancel the IO operation.  (How quickly that
>>    works depends on the underlying implementation.)
>> 
>>    I like this approach for two reasons: a) it allows platforms with
>>    great async support to work at their full potential, and b) it
>>    doesn't leak implementation details like non-blocking sockets, fds,
>>    multiplexing (poll/kqueue/select, IOCP, etc).  Those are all details
>>    that are taken care of by the underlying implementation.
>> 
>>    getaddrinfo is a good example here.  Guido, in tulip, you have this
>>    implemented as:
>> 
>>        def getaddrinfo(host, port, af=0, socktype=0, proto=0):
>>            infos = yield from scheduling.call_in_thread(
>>                socket.getaddrinfo,
>>                host, port, af,
>>                socktype, proto
>>            )
>> 
>>    That's very implementation specific.  It assumes the only way to
>>    perform an async getaddrinfo is by calling it from a separate
>>    thread.  On Windows, there's native support for async getaddrinfo(),
>>    which we wouldn't be able to leverage here.
>> 
>>    The biggest benefit is that no assumption is made as to how the
>>    asynchronicity is achieved.  Note that I didn't mention IOCP or
>>    kqueue or epoll once.  Those are all implementation details that
>>    the writer of an asynchronous Python app doesn't need to care about.
>> 
>>    Thoughts?
>> 
>>        Trent.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



From thomas at kluyver.me.uk  Sat Dec  1 13:28:50 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Sat, 1 Dec 2012 12:28:50 +0000
Subject: [Python-ideas] Conventions for function annotations
Message-ID: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>

Function annotations (PEP 3107) are a very interesting new feature, but so
far have gone largely unused. The only project I've seen using them is
plac, a command-line option parser. One reason for this is that, because
function annotations can be used to mean anything, we're wary of doing
anything in case we interfere with some other use case. A recent thread on
ipython-dev touched on this [1], and we'd like to suggest some conventions
to make annotations useful for everyone.

1. Code inspecting annotations should be prepared to ignore annotations it
can't understand.

2. Code creating annotations should use wrapper classes to indicate what
the annotation means. For instance, we are contemplating a way to specify
options for a parameter, to be used in tab completion, so we would do
something like this:

from IPython.core.completer import options
def my_io(filename, mode: options('read','write') ='read'):
    ...

3. There are a couple of important exceptions to 2:
- Annotations that are simply a string can be used like a docstring, to be
displayed to the user. Inspecting code should not expect to be able to
parse any machine-readable information out of these strings.
- Annotations that are a built-in type (int, str, etc.) indicate that the
value should always be an instance of that type. Inspecting code may use
these for type checking, introspection, optimisation, or other such
purposes. Note that for now, I have limited this to built-in types, so
other types can be used for other purposes, but this could be extended. For
instance, the ABCs from collections (collections.Mapping et al.) could well
be added to this category.

4. There should be a convention for attaching multiple annotations to one
value. I propose that all code using annotations expects to handle
tuples/lists of annotations. (We also considered dictionaries, but the
result is long and ugly). So in this definition:

def my_io(filename, mode: (options('read','write'), str, 'The mode in which
to open the file') ='read'):
    ...

the mode parameter has a set of options (ignored by frameworks that don't
recognise it), should always be a string, and has a description.

Any thoughts and suggestions are welcome.

As an aside, we may also create a couple of decorators to fill in
__annotations__ on Python 2, something like:

@return_annotation('A file object')
@annotations(mode=(options('read','write'), str, 'The mode in which to open
the file'))
def my_io(filename, mode='read'):
    ...
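Those decorators could plausibly be implemented along these lines (a sketch, not shipped code): they simply fill in __annotations__ by hand, which also works on functions defined without annotation syntax.

```python
def return_annotation(ann):
    """Attach an annotation for the return value."""
    def deco(func):
        anns = getattr(func, '__annotations__', None) or {}
        anns['return'] = ann
        func.__annotations__ = anns
        return func
    return deco

def annotations(**anns):
    """Attach annotations for named parameters."""
    def deco(func):
        merged = getattr(func, '__annotations__', None) or {}
        merged.update(anns)
        func.__annotations__ = merged
        return func
    return deco

@return_annotation('A file object')
@annotations(mode=(str, 'The mode in which to open the file'))
def my_io(filename, mode='read'):
    pass

print(my_io.__annotations__['return'])   # A file object
```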

[1] http://mail.scipy.org/pipermail/ipython-dev/2012-November/010697.html


Thanks,
Thomas

From andrew.svetlov at gmail.com  Sat Dec  1 15:59:59 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sat, 1 Dec 2012 16:59:59 +0200
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
Message-ID: <CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>

I think code related to annotations is tightly coupled with the annotated
function's usage context (decorator, metaclass, function caller).
So an annotation really can mean anything, and what it means depends on
that context. I don't see a use case where the context needs to ignore an
unexpected annotation. In my practice an annotation is always expected if
specified, and the absence of an annotation for a parameter is a mark to
do nothing with it (which can be allowed or disallowed depending on the
context's requirements).
The same goes for multiple annotations: if your context allows them,
that's up to you. The exact kind of composition to use depends on the
context; it can be a tuple, a dict, or a user-defined composition object.

My point is: we don't need to restrict annotations in any way. If some
libraries want to share annotations, that means they are tightly enough
coupled to make rules for themselves. All other code can go wild.
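To illustrate the context-specific view: a hypothetical @validate decorator that, within its own context, treats every annotation as a type to check and leaves unannotated parameters alone (names here are made up for illustration).

```python
import functools
import inspect

def validate(func):
    """Check annotated arguments against their annotations as types."""
    sig = inspect.signature(func)
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = sig.parameters[name].annotation
            if ann is inspect.Parameter.empty:
                continue                     # no annotation: do nothing
            if not isinstance(value, ann):
                raise TypeError('%s must be %s, got %r'
                                % (name, ann.__name__, value))
        return func(*args, **kwargs)
    return wrapper

@validate
def greet(name: str, times: int = 1):
    return ', '.join(['Hello ' + name] * times)

print(greet('world', 2))   # Hello world, Hello world
```

Here the decorator defines what the annotations mean; a different decorator on a different function could interpret the same values entirely differently.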

On Sat, Dec 1, 2012 at 2:28 PM, Thomas Kluyver <thomas at kluyver.me.uk> wrote:
> Function annotations (PEP 3107) are a very interesting new feature, but so
> far have gone largely unused. The only project I've seen using them is plac,
> a command-line option parser. One reason for this is that because function
> annotations can be used to mean anything, we're wary of doing anything in
> case we interfere with some other use case. A recent thread on ipython-dev
> touched on this [1], and we'd like to suggest some conventions to make
> annotations useful for everyone.
>
> 1. Code inspecting annotations should be prepared to ignore annotations it
> can't understand.
>
> 2. Code creating annotations should use wrapper classes to indicate what the
> annotation means. For instance, we are contemplating a way to specify
> options for a parameter, to be used in tab completion, so we would do
> something like this:
>
> from IPython.core.completer import options
> def my_io(filename, mode: options('read','write') ='read'):
>     ...
>
> 3. There are a couple of important exceptions to 2:
> - Annotations that are simply a string can be used like a docstring, to be
> displayed to the user. Inspecting code should not expect to be able to parse
> any machine-readable information out of these strings.
> - Annotations that are a built-in type (int, str, etc.) indicate that the
> value should always be an instance of that type. Inspecting code may use
> these for type checking, introspection, optimisation, or other such
> purposes. Note that for now, I have limited this to built-in types, so other
> types can be used for other purposes, but this could be extended. For
> instance, the ABCs from collections (collections.Mapping et al.) could well
> be added to this category.
>
> 4. There should be a convention for attaching multiple annotations to one
> value. I propose that all code using annotations expects to handle
> tuples/lists of annotations. (We also considered dictionaries, but the
> result is long and ugly). So in this definition:
>
> def my_io(filename, mode: (options('read','write'), str, 'The mode in which
> to open the file') ='read'):
>     ...
>
> the mode parameter has a set of options (ignored by frameworks that don't
> recognise it), should always be a string, and has a description.
>
> Any thoughts and suggestions are welcome.
>
> As an aside, we may also create a couple of decorators to fill in
> __annotations__ on Python 2, something like:
>
> @return_annotation('A file object')
> @annotations(mode=(options('read','write'), str, 'The mode in which to open
> the file'))
> def my_io(filename, mode='read'):
>     ...
>
> [1] http://mail.scipy.org/pipermail/ipython-dev/2012-November/010697.html
>
>
> Thanks,
> Thomas
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
Thanks,
Andrew Svetlov


From tismer at stackless.com  Sat Dec  1 16:51:01 2012
From: tismer at stackless.com (Christian Tismer)
Date: Sat, 01 Dec 2012 16:51:01 +0100
Subject: [Python-ideas] An async facade? (was Re: [Python-Dev] Socket
 timeout and completion based sockets)
In-Reply-To: <CAP7+vJ+qr414pz_FqEh2DPKncdvuT9XvdgpXA7kkYhMrkiCiAQ@mail.gmail.com>
References: <EFE3877620384242A686D52278B7CCD329DEB553@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJWDEwAvKB1VVrz-RX6b9kO3TpxnwUGnoq57746MV1WFg@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DEBF4B@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJKpeEDV9ZpvJqN2_joOt4cKj9B1BJ5gYeZusnwcnu_VwQ@mail.gmail.com>
	<EFE3877620384242A686D52278B7CCD329DED074@RKV-IT-EXCH103.ccp.ad.local>
	<CAP7+vJJKJrYbXFF8EjBSYdicVMKZv1J4A_rc4rdw6VkMkEg6Fg@mail.gmail.com>
	<1D9BE0CD-5BF4-480D-8D40-5A409E40760D@twistedmatrix.com>
	<20121130161422.GB536@snakebite.org>
	<A7269F03D11BC245BD52843B195AC4F0019E46B6@TK5EX14MBXC293.redmond.corp.microsoft.com>
	<CAP7+vJ+np39bRs-F3YCcsyvZVSzDRdC7A2XvMHTDqHd3emW8mw@mail.gmail.com>
	<A7269F03D11BC245BD52843B195AC4F0019E4758@TK5EX14MBXC293.redmond.corp.microsoft.com>
	<CAP7+vJ+qr414pz_FqEh2DPKncdvuT9XvdgpXA7kkYhMrkiCiAQ@mail.gmail.com>
Message-ID: <50BA2765.3010405@stackless.com>

On 30.11.12 20:29, Guido van Rossum wrote:
> On Fri, Nov 30, 2012 at 11:18 AM, Steve Dower <Steve.Dower at microsoft.com> wrote:
>> Guido van Rossum wrote:
>>> Futures or callbacks, that's the question...
>> I know the C++ standards committee is looking at the same thing right now, and they're probably going to provide both: futures for those who prefer them (which is basically how the code looks) and callbacks for when every cycle is critical or if the developer prefers them. C++ has the advantage that futures can often be optimized out, so implementing a Future-based wrapper around a callback-based function is very cheap, but the two-level API will probably happen.
> Well, for Python 3 we will definitely have two layers already:
> callbacks and yield-from-based-coroutines. The question is whether
> there's room for Futures in between (I like layers of abstraction, but
> I don't like having too many layers).

So far I agree very much.
> ...

> The debate is still open.
>> How about:
>>
>> value, future = obj.some_call(...)
>> if value is None:
>>      value = yield future
> Also considered; I don't really like having to allocate a tuple here
> (which is impossible to optimize out completely, even though its
> allocation may use a fast free list).

A little remark:

I do respect personal taste very much, and if a tuple can be avoided
I'm all for it, for sure.
But the cost of creating a tuple is an argument that even I
no longer consider relevant, especially in the context of other constructs
like yield-from, which are (currently) not even efficient (O(n)-wise).

The discussion had better stay design-oriented rather than get bogged
down in small constant-factor overheads.

But I agree that returned tuples are not a nice pattern to be used all
the time.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer at stackless.com>
Software Consulting          :     Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121     :    *Starship* http://starship.python.net/
14482 Potsdam                :     PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776  fax +49 (30) 700143-0023
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
       whom do you want to sponsor today?   http://www.stackless.com/



From thomas at kluyver.me.uk  Sat Dec  1 17:30:49 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Sat, 1 Dec 2012 16:30:49 +0000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
Message-ID: <CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>

I think annotations are potentially very useful for things like
introspection and static analysis. For instance, your IDE could warn you if
you pass a parameter that doesn't match the type specified in an
annotation. In these cases, the code reading the annotations isn't coupled
with the function definitions.

I'm not aiming to restrict annotations, just to establish some conventions
to make them useful. We have a convention, for instance, that attributes
with a leading underscore are private. That's a useful basis that everyone
understands, so when you do obj.<tab> in IPython, it doesn't show those
attributes by default. I'd like to have some conventions of that nature
around annotations.

Thomas


On 1 December 2012 14:59, Andrew Svetlov <andrew.svetlov at gmail.com> wrote:

> I think code related to annotations is tightly coupled with annotated
> function usage context (decorator, metaclass, function caller).
> So annotation really can mean anything and it depends from context.
> I don't see use case when context need to ignore unexpected
> annotation. In my practice annotation is always expected if specified,
> absence of annotation for parameter is mark to do nothing with it (it
> can be allowed or disabled depending of context requirements).
> The same for multiple annotations. If your context allow it, that's
> up to you. Exact kind of composition to use depends from context; it
> can be tuple, dict, user-defined composition object.
>
> My point is: we dont need to restrict annotations in any way. If some
> libraries want to share annotations that means they are tightly enough
> coupled and can make rules for itself. All other code can go in the
> wild.
>
> On Sat, Dec 1, 2012 at 2:28 PM, Thomas Kluyver <thomas at kluyver.me.uk>
> wrote:
> > Function annotations (PEP 3107) are a very interesting new feature, but
> so
> > far have gone largely unused. The only project I've seen using them is
> plac,
> > a command-line option parser. One reason for this is that because
> function
> > annotations can be used to mean anything, we're wary of doing anything in
> > case we interfere with some other use case. A recent thread on
> ipython-dev
> > touched on this [1], and we'd like to suggest some conventions to make
> > annotations useful for everyone.
> >
> > 1. Code inspecting annotations should be prepared to ignore annotations
> it
> > can't understand.
> >
> > 2. Code creating annotations should use wrapper classes to indicate what
> the
> > annotation means. For instance, we are contemplating a way to specify
> > options for a parameter, to be used in tab completion, so we would do
> > something like this:
> >
> > from IPython.core.completer import options
> > def my_io(filename, mode: options('read','write') ='read'):
> >     ...
> >
> > 3. There are a couple of important exceptions to 2:
> > - Annotations that are simply a string can be used like a docstring, to
> be
> > displayed to the user. Inspecting code should not expect to be able to
> parse
> > any machine-readable information out of these strings.
> > - Annotations that are a built-in type (int, str, etc.) indicate that the
> > value should always be an instance of that type. Inspecting code may use
> > these for type checking, introspection, optimisation, or other such
> > purposes. Note that for now, I have limited this to built-in types, so
> other
> > types can be used for other purposes, but this could be extended. For
> > instance, the ABCs from collections (collections.Mapping et al.) could
> well
> > be added to this category.
> >
> > 4. There should be a convention for attaching multiple annotations to one
> > value. I propose that all code using annotations expects to handle
> > tuples/lists of annotations. (We also considered dictionaries, but the
> > result is long and ugly). So in this definition:
> >
> > def my_io(filename, mode: (options('read','write'), str, 'The mode in
> which
> > to open the file') ='read'):
> >     ...
> >
> > the mode parameter has a set of options (ignored by frameworks that don't
> > recognise it), should always be a string, and has a description.
> >
> > Any thoughts and suggestions are welcome.
> >
> > As an aside, we may also create a couple of decorators to fill in
> > __annotations__ on Python 2, something like:
> >
> > @return_annotation('A file object')
> > @annotations(mode=(options('read','write'), str, 'The mode in which to
> open
> > the file'))
> > def my_io(filename, mode='read'):
> >     ...
> >
> > [1]
> http://mail.scipy.org/pipermail/ipython-dev/2012-November/010697.html
> >
> >
> > Thanks,
> > Thomas
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > http://mail.python.org/mailman/listinfo/python-ideas
> >
>
>
>
> --
> Thanks,
> Andrew Svetlov
>

From ncoghlan at gmail.com  Sun Dec  2 07:58:57 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Dec 2012 16:58:57 +1000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
Message-ID: <CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>

On Sun, Dec 2, 2012 at 4:26 PM, Robert McGibbon <rmcgibbo at gmail.com> wrote:

> By being tolerant and well behaved when confronted with annotations that
> our library doesn't understand, I think we can use function annotations
> without a short-range decorator that translates their information into
> some other structure. If other annotation-using libraries are also willing
> to ignore our tabbing annotations if/when they encounter them, then can't
> we all get along smoothly?
>
> (For reference, the feature will look/work something like this)
>
> In [1]: def foo(filename : tab_glob('*.txt')):  # tab completion that
> recommends files/directories that match a glob pattern
> ...          pass
> ...
> In [2]: foo(<TAB>
> 'a.txt'        'b.txt'
> 'c.txt'         'dir/'
>

You're missing the other key reason for requiring decorators that interpret
function annotations: they're there for the benefit of *readers*, not just
other software. Given your definition above, I don't know what the
annotations are for, except by recognising the "tab_glob" call. However,
that then breaks as soon as the expression is put into a named variable
earlier in the file:

    def foo(filename : text_files): # What does this mean?
        pass

But the reader can be told *explicitly* what the annotations are related to
via a decorator:

    @tab_expansion
    def foo(filename : text_files): # Oh, it's just a tab expansion specifier
        pass

Readers no longer have to guess from context, and if the tab_expansion
decorator creates IPython-specific metadata, then the interpreter doesn't
need to guess either.
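A minimal sketch of how such a decorator might transfer the annotations into purpose-specific metadata (the tab_expansion name comes from the example above, but the __tab_completions__ attribute and the implementation details are assumptions, not actual IPython API):

```python
# Hypothetical sketch: move annotations into tab-completion-specific
# metadata so neither readers nor other tools have to guess their purpose.
def tab_expansion(func):
    # Record the annotations under a purpose-specific attribute...
    func.__tab_completions__ = dict(func.__annotations__)
    # ...and clear the generic mapping so other annotation consumers
    # don't try to reinterpret them.
    func.__annotations__ = {}
    return func

text_files = '*.txt'  # stand-in for a real completion specifier

@tab_expansion
def foo(filename: text_files):
    pass
```

After decoration, introspection tools look only at the purpose-specific attribute, and the generic annotations mapping stays free for other uses.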

(Note that you *can* use ordinary mechanisms like class decorators,
metaclasses, post-creation modification of classes and IDE snippet
inclusion to avoid the need to type out the "this is what these annotations
mean" decorator explicitly. However, that's just an application of Python's
standard abstraction tools, rather than a further special case convention)

Mixing annotations intended for different consumers is a fundamentally bad
idea, as it encourages unreadable code and complex dances to avoid stepping
on each other's toes. It's better to design a *separate* API that supports
composition by passing the per-parameter details directly to a decorator
factory (which then adds appropriate named attributes to the function),
with annotations used just as syntactic sugar for simple cases where no
composition is involved.
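Sketched concretely (all names here are illustrative, not a real library's API), the underlying API passes per-parameter details straight to a decorator factory, which stores them on its own named attribute, so independent consumers compose without touching annotations at all:

```python
# Hedged sketch of the pattern described above: each decorator factory
# takes per-parameter details directly and writes them to its own
# function attribute, so multiple consumers never collide.
def completion_hints(**hints):
    def decorator(func):
        func.__completion_hints__ = hints
        return func
    return decorator

def type_hints(**hints):
    def decorator(func):
        func.__type_hints__ = hints
        return func
    return decorator

# Because each decorator uses a separate attribute, the two consumers
# compose cleanly on the same function:
@completion_hints(mode="read|write")
@type_hints(mode=str)
def my_io(filename, mode='read'):
    pass
```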

The important first question to ask is "How would we solve this if
annotations didn't exist?" and only *then* look at the shorthand case for
function-annotations. For cases where function annotations make code more
complex or less robust, *don't use them*.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Dec  2 05:58:27 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Dec 2012 14:58:27 +1000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
Message-ID: <CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>

On Sun, Dec 2, 2012 at 2:30 AM, Thomas Kluyver <thomas at kluyver.me.uk> wrote:

> I think annotations are potentially very useful for things like
> introspection and static analysis. For instance, your IDE could warn you if
> you pass a parameter that doesn't match the type specified in an
> annotation. In these cases, the code reading the annotations isn't coupled
> with the function definitions.
>
> I'm not aiming to restrict annotations, just to establish some conventions
> to make them useful. We have a convention, for instance, that attributes
> with a leading underscore are private. That's a useful basis that everyone
> understands, so when you do obj.<tab> in IPython, it doesn't show those
> attributes by default. I'd like to have some conventions of that nature
> around annotations.


Indeed, composability is a problem with annotations. I suspect the only way
to resolve this systematically is to adopt a convention where annotations
are used *strictly* for short-range communication with an associated
decorator that transfers the annotation details to a *different*
purpose-specific location for long-term introspection.

Furthermore, if composability is going to be possible in general,
annotations can really *only* be used as a convenience API, with an
underlying API where the necessary details are supplied directly to the
decorator. For example, here's an example using the main decorator API for
a cffi callback declaration [1]:

    @cffi.callback("int (char *, int)")
    def my_cb(arg1, arg2):
        ...

The problem with this is that it can get complicated to map C-level types
to parameter names as the function signature gets more complicated. So,
what you may want to do is write a decorator that builds the CFFI signature
from annotations on the individual parameters:

    @annotated_cffi_callback
    def my_cb(arg1: "char *", arg2: "int") -> "int":
        ...

The decorator would turn that into an ordinary call to cffi.callback, so
future introspection wouldn't look at the annotations mapping at all, it
would look directly at the CFFI metadata.
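As a rough illustration of that decorator (a sketch only: cffi is not imported here, and a real version would pass the assembled signature to cffi.callback rather than storing it on the function):

```python
import inspect

def annotated_cffi_callback(func):
    # Assemble the C signature string from the per-parameter annotations.
    sig = inspect.signature(func)
    args = ", ".join(p.annotation for p in sig.parameters.values())
    c_signature = "{} ({})".format(sig.return_annotation, args)
    # A real implementation would now call cffi.callback(c_signature)(func);
    # here we just record the derived signature for inspection.
    func.c_signature = c_signature
    return func

@annotated_cffi_callback
def my_cb(arg1: "char *", arg2: "int") -> "int":
    ...
```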

Annotations should probably only ever be introspected by their associated
decorator, and if you really want to apply multiple decorators with
annotation support to a single function, you're going to have to fall back
to the non-annotation based API for at least some of them. Once you start
trying to overload the annotation field with multiple annotations, the
readability gain for closer association with the individual parameters is
counterbalanced by the loss of association between the subannotations and
their corresponding decorators.

[1] http://cffi.readthedocs.org/en/latest/#callbacks

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rmcgibbo at gmail.com  Sun Dec  2 11:12:06 2012
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Sun, 2 Dec 2012 02:12:06 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
Message-ID: <4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>

Nick,

Thanks! You make a very convincing argument.

Especially if this represents the collective recommendation of the Python core development team on the proper conventions surrounding the use of function annotations, I would encourage you to make it more widely known (blogs, etc.). As Python 3.x adoption continues to move forward, this type of thing could become an issue if shmucks like me start using the annotation feature more widely.

-Robert

On Dec 1, 2012, at 10:58 PM, Nick Coghlan wrote:

> Mixing annotations intended for different consumers is a fundamentally bad idea, as it encourages unreadable code and complex dances to avoid stepping on each other's toes. It's better to design a *separate* API that supports composition by passing the per-parameter details directly to a decorator factory (which then adds appropriate named attributes to the function), with annotations used just as syntactic sugar for simple cases where no composition is involved.


From ncoghlan at gmail.com  Sun Dec  2 12:43:34 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Dec 2012 21:43:34 +1000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
Message-ID: <CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>

On Sun, Dec 2, 2012 at 8:12 PM, Robert McGibbon <rmcgibbo at gmail.com> wrote:

> Nick,
>
> Thanks! You make a very convincing argument.
>
> Especially if this represents the collective recommendation of the python
> core development team on the proper conventions surrounding the use of
> function annotations, I would encourage you guys to perhaps make it more
> widely known (blogs, etc). As python 3.x adoption continues to move
> forward, this type of thing could become an issue if shmucks like me start
> using the annotation feature more widely.
>

Last time it came up, the collective opinion on python-dev was still to
leave PEP 8 officially neutral on the topic so that people could experiment
more freely with annotations and the community could help figure out what
worked well and what didn't. Admittedly this was long enough ago that I
don't remember the details, just the obvious consequence that PEP 8 remains
largely silent on the matter, aside from declaring that function
annotations are off-limits for standard library modules: "The Python
standard library will not use function annotations as that would result in
a premature commitment to a particular annotation style. Instead, the
annotations are left for users to discover and experiment with useful
annotation styles."

Obviously, I'm personally rather less open-minded on the topic of
*composition* in particular, as that's a feature I'm firmly convinced
should be left in the hands of ordinary decorator usage. I believe trying
to contort annotations to handle that cause is almost certain to result in
something less readable than the already possible decorator equivalent.

However, the flip-side of the argument is that if we assume my opinion is
correct and document it as an official recommendation in PEP 8, then many
people won't even *try* to come up with good approaches to composition for
function annotations. Maybe there *is* an elegant, natural solution out
there that's superior to using explicit calls to decorator factories for
the cases that involve composition. If PEP 8 declares "just use decorator
factories for cases involving composition, and always design your APIs with
a non-annotation based fallback for such cases", would we be inadvertently
shutting down at least some of the very experimentation we intended to
allow?

After all, while I don't think the composition proposal in this thread
reached the bar of being more readable than just composing decorator
factories to handle more complex cases, I *do* think it is quite a decent
attempt.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From thomas at kluyver.me.uk  Sun Dec  2 16:25:24 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Sun, 2 Dec 2012 15:25:24 +0000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
Message-ID: <CAOvn4qg0hSCb1U99eksiK3n+qXfD2Qm0fT2tatuqYx6PDJ43cQ@mail.gmail.com>

On 2 December 2012 11:43, Nick Coghlan <ncoghlan at gmail.com> wrote:

> However, the flip-side of the argument is that if we assume my opinion is
> correct and document it as an official recommendation in PEP 8, then many
> people won't even *try* to come up with good approaches to composition for
> function annotations. Maybe there *is* an elegant, natural solution out
> there that's superior to using explicit calls to decorator factories for
> the cases that involve composition. If PEP 8 declares "just use decorator
> factories for cases involving composition, and always design your APIs with
> a non-annotation based fallback for such cases", would we be inadvertently
> shutting down at least some of the very experimentation we intended to
> allow?


My concern with this is that it's tricky to experiment with composition. If
you want to simultaneously use annotations for, say, one framework that
checks argument types, and one that documents individual arguments based on
annotations, they need to be using the same mechanism to compose annotation
values. Alternatively, the first one to access the annotations could
decompose the values, leaving them in a form the second can understand -
but that sounds brittle and opaque.

Another proposed mechanism (Robert's idea) which I didn't mention above is
to override __add__, so that multiple annotations can be composed like this:

def my_io(filename, mode: tab('read','write') + typed(str) ='read'):
    ...
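One way such __add__-based composition might be implemented (a sketch of the idea only, not Robert's actual proposal; the tab and typed classes are illustrative):

```python
class Annotation:
    # Adding two annotations produces a composite holding both.
    def __add__(self, other):
        return CompositeAnnotation([self, other])

class CompositeAnnotation(Annotation):
    def __init__(self, parts):
        self.parts = parts

    # Further additions extend the existing composite.
    def __add__(self, other):
        self.parts.append(other)
        return self

class tab(Annotation):
    def __init__(self, *options):
        self.options = options

class typed(Annotation):
    def __init__(self, type_):
        self.type_ = type_

def my_io(filename, mode: tab('read', 'write') + typed(str) = 'read'):
    ...

# Each consumer walks .parts and picks out the annotations it recognises.
composed = my_io.__annotations__['mode']
```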

As a possible workaround, here's a decorator for decorators that makes the
following two definitions equivalent:
https://gist.github.com/4189289

@check_argtypes
def checked1(a:int, b:str):
    pass

@check_argtypes(a=int, b=str)
def checked2(a, b):
    pass

With this, it's easy to use annotations where possible, and you benefit
from the extra clarity, but it's equally simple to pass the values as
arguments to the decorator, for instance if the annotations are already in
use for something else. It should also work under Python 2, using the
non-annotated version.
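In case the gist link rots, a self-contained sketch of such a dual-mode decorator might look like this (an assumed implementation for illustration, not Thomas's actual code):

```python
import functools
import inspect

def check_argtypes(*args, **types):
    """Usable bare (@check_argtypes, reading annotations) or as a
    factory (@check_argtypes(a=int, b=str), types passed explicitly)."""
    def make_wrapper(func, argtypes):
        @functools.wraps(func)
        def wrapper(*a, **kw):
            bound = inspect.signature(func).bind(*a, **kw)
            for name, value in bound.arguments.items():
                expected = argtypes.get(name)
                if expected is not None and not isinstance(value, expected):
                    raise TypeError("%s must be %s" % (name, expected.__name__))
            return func(*a, **kw)
        return wrapper

    if args and callable(args[0]) and not types:
        # Bare usage: pull the expected types from the annotations.
        func = args[0]
        return make_wrapper(func, dict(func.__annotations__))
    # Factory usage: the expected types were passed as keyword arguments.
    return lambda func: make_wrapper(func, types)

@check_argtypes
def checked1(a: int, b: str):
    return a, b

@check_argtypes(a=int, b=str)
def checked2(a, b):
    return a, b
```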

Thomas

From steve at pearwood.info  Sun Dec  2 23:23:25 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 03 Dec 2012 09:23:25 +1100
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
Message-ID: <50BBD4DD.7010703@pearwood.info>

On 02/12/12 22:43, Nick Coghlan wrote:

> Last time it came up, the collective opinion on python-dev was still to
> leave PEP 8 officially neutral on the topic so that people could experiment
> more freely with annotations and the community could help figure out what
> worked well and what didn't. Admittedly this was long enough ago that I
> don't remember the details, just the obvious consequence that PEP 8 remains
> largely silent on the matter, aside from declaring that function
> annotations are off-limits for standard library modules: "The Python
> standard library will not use function annotations as that would result in
> a premature commitment to a particular annotation style. Instead, the
> annotations are left for users to discover and experiment with useful
> annotation styles."

I fear that this was a strategic mistake. The result, it seems to me, is that
annotations have been badly neglected.

I can't speak for others, but I heavily use the standard library as a guide
to what counts as good practice in Python. I'm not a big user of third party
libraries, and most of those are for 2.x, so with the lack of annotations in
the std lib I've had no guidance as to what sort of things annotations could
be used for apart from "type checking".

I'm sure that I'm not the only one.



-- 
Steven


From andrew.svetlov at gmail.com  Mon Dec  3 00:33:59 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Mon, 3 Dec 2012 01:33:59 +0200
Subject: [Python-ideas] WSAPoll and tulip
In-Reply-To: <CADiSq7cVciR1=C5osZKK1Umi+YqJA9xq=rXr6dt-h3sQVXwa4g@mail.gmail.com>
References: <20121127123325.GH90314@snakebite.org>
	<20121127154204.5fc81457@pitrou.net>
	<20121127150330.GB91191@snakebite.org>
	<CAP7+vJJPgWh4gY=1=yxWYtP9pyaOyy4DHDT592d=7LiP_BnzVQ@mail.gmail.com>
	<CAL3CFcXT5MkoOpbuwAhDsFjBneS76_zhaHMHgovQ4HZD2pg8eA@mail.gmail.com>
	<CAP7+vJ+MmO6TZqs_JoW9W5OLc_i3Wgp47mYOu3D5fBoCaeYsVQ@mail.gmail.com>
	<CAL3CFcUwtjOxkPm2enqC8sQYF2H=jzZeT6zM3PXQr80zFU0n7Q@mail.gmail.com>
	<CAP7+vJJDhmu=UzcdFCXew=DmmBqc-LVCqceNZrbRicn9GhrKeA@mail.gmail.com>
	<CADiSq7cVciR1=C5osZKK1Umi+YqJA9xq=rXr6dt-h3sQVXwa4g@mail.gmail.com>
Message-ID: <CAL3CFcUdm3gnsN+OhjahNaYDHUt30ZOies3wLvYGu3ZnPimMVA@mail.gmail.com>

Created http://bugs.python.org/issue16596 for jumping over yields.
Please review.

On Wed, Nov 28, 2012 at 11:41 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> That will need to be well highlighted in What's New,  as it could be very
> confusing if the iterator is never called again.
>
> --
> Sent from my phone, thus the relative brevity :)



-- 
Thanks,
Andrew Svetlov


From aquavitae69 at gmail.com  Mon Dec  3 06:05:22 2012
From: aquavitae69 at gmail.com (David Townshend)
Date: Mon, 3 Dec 2012 07:05:22 +0200
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <50BBD4DD.7010703@pearwood.info>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<50BBD4DD.7010703@pearwood.info>
Message-ID: <CAEgL-ferpYFii-8iv562COGacX4qA1Pd3GyD0J_+6P7yJRKVyA@mail.gmail.com>

> I fear that this was a strategic mistake. The result, it seems to me, is
> that annotations have been badly neglected.
>
> I can't speak for others, but I heavily use the standard library as a
> guide to what counts as good practice in Python. I'm not a big user of
> third party libraries, and most of those are for 2.x, so with the lack of
> annotations in the std lib I've had no guidance as to what sort of things
> annotations could be used for apart from "type checking".
>
> I'm sure that I'm not the only one.
>
> --
> Steven
>

+1

From raymond.hettinger at gmail.com  Mon Dec  3 09:09:17 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 3 Dec 2012 00:09:17 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
Message-ID: <9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>


On Dec 2, 2012, at 3:43 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

>  Admittedly this was long enough ago that I don't remember the details, just the obvious consequence that PEP 8 remains largely silent on the matter, aside from declaring that function annotations are off-limits for standard library modules: 

PEP 8 is not "largely silent" on the subject:

"''
The Python standard library will not use function annotations as that would result in a premature commitment to a particular annotation style. Instead, the annotations are left for users to discover and experiment with useful annotation styles.

Early core developer attempts to use function annotations revealed inconsistent, ad-hoc annotation styles. For example:

[str] was ambiguous as to whether it represented a list of strings or a value that could be either str or None.
The notation open(file:(str,bytes)) was used for a value that could be either bytes or str rather than a 2-tuple containing a str value followed by a bytesvalue.
The annotation seek(whence:int) exhibited an mix of over-specification and under-specification: int is too restrictive (anything with __index__ would be allowed) and it is not restrictive enough (only the values 0, 1, and 2 are allowed). Likewise, the annotation write(b: bytes) was also too restrictive (anything supporting the buffer protocol would be allowed).
Annotations such as read1(n: int=None) were self-contradictory since None is not an int. Annotations such as source_path(self, fullname:str) -> objectwere confusing about what the return type should be.
In addition to the above, annotations were inconsistent in the use of concrete types versus abstract types: int versus Integral and set/frozenset versus MutableSet/Set.
Some annotations in the abstract base classes were incorrect specifications. For example, set-to-set operations require other to be another instance of Setrather than just an Iterable.
A further issue was that annotations become part of the specification but weren't being tested.
In most cases, the docstrings already included the type specifications and did so with greater clarity than the function annotations. In the remaining cases, the docstrings were improved once the annotations were removed.
The observed function annotations were too ad-hoc and inconsistent to work with a coherent system of automatic type checking or argument validation. Leaving these annotations in the code would have made it more difficult to make changes later so that automated utilities could be supported.
'''


Raymond

From solipsis at pitrou.net  Mon Dec  3 10:21:14 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 3 Dec 2012 10:21:14 +0100
Subject: [Python-ideas] Conventions for function annotations
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
Message-ID: <20121203102114.58a63d2b@pitrou.net>

Le Mon, 3 Dec 2012 00:09:17 -0800,
Raymond Hettinger
<raymond.hettinger at gmail.com> a écrit :
> 
> Early core developer attempts to use function annotations revealed
> inconsistent, ad-hoc annotation styles. For example:
> 
> [str] was ambiguous as to whether it represented a list of strings or
> a value that could be either str or None. The notation
> open(file:(str,bytes)) was used for a value that could be either
> bytes or str rather than a 2-tuple containing a str value followed by
> a bytes value. The annotation seek(whence:int) exhibited a mix of
> over-specification and under-specification: int is too restrictive
> (anything with __index__ would be allowed) and it is not restrictive
> enough (only the values 0, 1, and 2 are allowed). Likewise, the
> annotation write(b: bytes) was also too restrictive (anything
> supporting the buffer protocol would be allowed). Annotations such as
> read1(n: int=None) were self-contradictory since None is not an int.
> Annotations such as source_path(self, fullname:str) -> object were
> confusing about what the return type should be. In addition to the
> above, annotations were inconsistent in the use of concrete types
> versus abstract types: int versus Integral and set/frozenset versus
> MutableSet/Set. Some annotations in the abstract base classes were
> incorrect specifications. For example, set-to-set operations require
> other to be another instance of Set rather than just an Iterable.

In short, we have discovered that declarative typing isn't very
useful :-)

Regards

Antoine.




From p.f.moore at gmail.com  Mon Dec  3 10:30:50 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 3 Dec 2012 09:30:50 +0000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CACac1F9M7y+Kse8pNr5igjmq7VPrM0esD1LnCaJY8-Jt6Hj_yA@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<20121203102114.58a63d2b@pitrou.net>
	<CACac1F9M7y+Kse8pNr5igjmq7VPrM0esD1LnCaJY8-Jt6Hj_yA@mail.gmail.com>
Message-ID: <CACac1F8fw2zBCsdrKBOSGTyVUDEnwgikwkBy_RpBMgnGcPtwNw@mail.gmail.com>

Sorry, should have gone to the list

On 3 December 2012 09:30, Paul Moore <p.f.moore at gmail.com> wrote:
> On 3 December 2012 09:21, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> In short, we have discovered that declarative typing isn't very
>> useful :-)
>
> .. but haven't thought of any other useful applications of
> annotations, and nor has the collective community on PyPI.
>
> Annotations seem like a solution looking for a problem, to me. (Which
> is a shame, as they look like a pretty cool solution)
> Paul


From rmcgibbo at gmail.com  Mon Dec  3 10:41:15 2012
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Mon, 3 Dec 2012 01:41:15 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CACac1F8fw2zBCsdrKBOSGTyVUDEnwgikwkBy_RpBMgnGcPtwNw@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<20121203102114.58a63d2b@pitrou.net>
	<CACac1F9M7y+Kse8pNr5igjmq7VPrM0esD1LnCaJY8-Jt6Hj_yA@mail.gmail.com>
	<CACac1F8fw2zBCsdrKBOSGTyVUDEnwgikwkBy_RpBMgnGcPtwNw@mail.gmail.com>
Message-ID: <5A715C4D-18C8-4BF0-B972-761BFDDAC3F3@gmail.com>

The IPython community has thought of using annotations to do argument specific
tab completion in the interactive interpreter.

For example, a load function whose first argument is supposed to be files matching 
a certain glob pattern might use a function annotation on that argument to specify
the glob pattern.
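For illustration, such a tab_glob annotation might be little more than a pattern holder that the completer consults (a hypothetical sketch, not the actual IPython implementation):

```python
import fnmatch

class tab_glob:
    """Hypothetical annotation: complete an argument against a glob pattern."""
    def __init__(self, pattern):
        self.pattern = pattern

    def completions(self, candidates):
        # The completer would pass in the entries of the current directory.
        return sorted(c for c in candidates if fnmatch.fnmatch(c, self.pattern))

def load(filename: tab_glob('*.txt')):
    ...

spec = load.__annotations__['filename']
```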

A sympy maintainer, Aaron Meurer,  has also expressed interest in using this feature
-- as implemented in ipython -- to annotate sympy functions' return values by type
to facilitate tab completion for chained calls like f(x).<TAB>

I'm working on this feature for IPython (PR: Function annotation based hooks into the tab completion system).
I've already benefited a lot from the discussion on this thread in terms of the design
of the API. Specifically Nick Coghlan's arguments have been very enlightening.
Comments, suggestions, contributions, etc are welcome!

-Robert


On Dec 3, 2012, at 1:30 AM, Paul Moore wrote:

> Sorry, should have gone to the list
> 
> On 3 December 2012 09:30, Paul Moore <p.f.moore at gmail.com> wrote:
>> On 3 December 2012 09:21, Antoine Pitrou <solipsis at pitrou.net> wrote:
>>> In short, we have discovered that declarative typing isn't very
>>> useful :-)
>> 
>> .. but haven't thought of any other useful applications of
>> annotations, and nor has the collective community on PyPI.
>> 
>> Annotations seem like a solution looking for a problem, to me. (Which
>> is a shame, as they look like a pretty cool solution)
>> Paul
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From thomas at kluyver.me.uk  Mon Dec  3 11:52:26 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Mon, 3 Dec 2012 10:52:26 +0000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CACac1F8fw2zBCsdrKBOSGTyVUDEnwgikwkBy_RpBMgnGcPtwNw@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<20121203102114.58a63d2b@pitrou.net>
	<CACac1F9M7y+Kse8pNr5igjmq7VPrM0esD1LnCaJY8-Jt6Hj_yA@mail.gmail.com>
	<CACac1F8fw2zBCsdrKBOSGTyVUDEnwgikwkBy_RpBMgnGcPtwNw@mail.gmail.com>
Message-ID: <CAOvn4qi__wUUqOZr4LfZ9oO4bTqPhxZgLU6CSj-gfiBLOautNA@mail.gmail.com>

On 3 December 2012 09:30, Paul Moore <p.f.moore at gmail.com> wrote:

> > .. but haven't thought of any other useful applications of
> > annotations, and nor has the collective community on PyPI.
>

I suspect that the lack of applications is partly due to people not knowing
about them, code still having to support Python 2, and an absence of
guidelines about how to use them safely.

For our part, I think we'll push forwards following Nick's suggestions -
annotations to be accessed by closely coupled decorators only.

Thanks all,
Thomas

From ncoghlan at gmail.com  Mon Dec  3 12:08:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 3 Dec 2012 21:08:01 +1000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
Message-ID: <CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>

On Mon, Dec 3, 2012 at 6:09 PM, Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:

>
> On Dec 2, 2012, at 3:43 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
>  Admittedly this was long enough ago that I don't remember the details,
> just the obvious consequence that PEP 8 remains largely silent on the
> matter, aside from declaring that function annotations are off-limits for
> standard library modules:
>
>
> PEP 8 is not "largely silent" on the subject:
>

It's effectively silent on the matters at hand, which are:

* the advisability of using annotations without an associated decorator
that makes the interpretation currently in play explicit (while the
examples given do illustrate why *not* doing this is a bad idea, PEP 8 doesn't
explicitly state that conclusion, merely "we're not going to use them in
the standard library at this point")
* the advisability of providing a pure annotations API, without any
fallback to an explicit decorator factory
* the advisability of handling composition within the annotations
themselves, rather than by falling back to explicit decorator factories
* the advisability of using the __annotations__ dictionary for long-term
introspection, rather than using the decorator to move the information to a
purpose-specific location in a separate function attribute

I would be *quite delighted* if people are open to the idea of making a
much stronger recommendation along the following lines explicit in PEP 8:

==================

* If function annotations are used, it is recommended that:
    * the annotation details should be designed with a specific practical
use case in mind
    * the annotations are used solely as a form of syntactic sugar for
passing arguments to a decorator factory that would otherwise accept
explicit per-parameter arguments
    * the decorator factory name should provide the reader of the code with
a strong hint as to the intended meaning of the parameter annotations (or
at least a convenient reference point to look up in the documentation)
    * in simple cases, using parameter and return type annotations will
then allow the per-parameter details to be mapped easily by both the code
author and later readers without requiring repetition of parameter names or
careful alignment of factory arguments with parameter positions.
    * the explicit form remains available to handle more complex situations
(such as applying multiple decorators to the same function) without
requiring complicated conventions for composing independent annotations on
a single function

==================

In relation to the last point, I consider composing annotations to be
analogous to composing function arguments. Writing:

    @g
    @f
    def annotated(arg1: (a, x), arg2: (b, y), arg3: (c, z)):
        ...

instead of the much simpler:

    @g(x, y, z)
    @f(a, b, c)
    def annotated(arg1, arg2, arg3):
        ...

is analogous to writing:

    args = [(a, x), (b, y), (c, z)]
    f(*(x[0] for x in args))
    g(*(x[1] for x in args))

instead of the more obvious:

    f(a, b, c)
    g(x, y, z)
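A rough sketch of the convention being recommended here -- a closely coupled decorator that both interprets the annotations and moves the resulting metadata to a purpose-specific attribute, leaving __annotations__ free for other tools. The `positive` marker and the decorator itself are invented purely for illustration:

```python
import functools

def positive(func):
    """Hypothetical closely coupled decorator: it alone defines what the
    annotations on the decorated function mean (parameters annotated
    'positive' must be > 0)."""
    code = func.__code__
    names = code.co_varnames[:code.co_argcount]
    checked = [n for n in names if func.__annotations__.get(n) == 'positive']

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = dict(zip(names, args))
        bound.update(kwargs)
        for name in checked:
            if bound[name] <= 0:
                raise ValueError(name + ' must be positive')
        return func(*args, **kwargs)

    # Move the interpreted metadata to a purpose-specific attribute.
    wrapper._positive_params = checked
    return wrapper

@positive
def area(width: 'positive', height: 'positive'):
    return width * height
```

An explicit factory form, e.g. `@positive_params('width', 'height')`, would remain available for the more complex cases (multiple decorators on one function) without any composition conventions inside the annotations.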

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From andrew.svetlov at gmail.com  Mon Dec  3 12:45:51 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Mon, 3 Dec 2012 13:45:51 +0200
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
Message-ID: <CAL3CFcV6WZ76pMPJZqGYvqijPN5_zKzrJkvQE_XWJxKCo0cgKg@mail.gmail.com>

On Mon, Dec 3, 2012 at 1:08 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> * the advisability of using the __annotations__ dictionary for long-term
> introspection, rather than using the decorator to move the information to a
> purpose-specific location in a separate function attribute
My 5 cents: perhaps you don't need to use __annotations__ at all; the
Signature object (PEP 362) gives a more convenient way of gathering
information about a function's spec.
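For reference, a quick sketch of what PEP 362's Signature object exposes (assuming Python 3.3+, where inspect.signature is available); note it surfaces the same annotation data per parameter rather than as one raw dict:

```python
import inspect

def scale(x: float, factor: float = 2.0) -> float:
    return x * factor

sig = inspect.signature(scale)
for name, param in sig.parameters.items():
    # Each Parameter carries its own annotation and default.
    print(name, param.annotation, param.default)

# The underlying storage is still the function's __annotations__ dict:
print(sig.return_annotation is scale.__annotations__['return'])  # → True
```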


From ncoghlan at gmail.com  Mon Dec  3 12:51:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 3 Dec 2012 21:51:01 +1000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAL3CFcV6WZ76pMPJZqGYvqijPN5_zKzrJkvQE_XWJxKCo0cgKg@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<CAL3CFcV6WZ76pMPJZqGYvqijPN5_zKzrJkvQE_XWJxKCo0cgKg@mail.gmail.com>
Message-ID: <CADiSq7eBPWYd7j35HQ5Tfk=Sk0e1dYVVWC7C4cE4f59SbrMm7A@mail.gmail.com>

On Mon, Dec 3, 2012 at 9:45 PM, Andrew Svetlov <andrew.svetlov at gmail.com>wrote:

> On Mon, Dec 3, 2012 at 1:08 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > * the advisability of using the __annotations__ dictionary for long-term
> > introspection, rather than using the decorator to move the information
> to a
> > purpose-specific location in a separate function attribute
> My 5 cents: perhaps you don't need to use __annotations__ at all; the
> Signature object (PEP 362) gives a more convenient way of gathering
> information about a function's spec.
>

I don't quite understand that comment - PEP 362 is purely an access
mechanism. The underlying storage is still in __annotations__ (at least as
far as any annotations are concerned).

However, using separate storage is a natural consequence of also providing
an explicit decorator factory API, so I didn't bring it up.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From andrew.svetlov at gmail.com  Mon Dec  3 12:53:59 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Mon, 3 Dec 2012 13:53:59 +0200
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7eBPWYd7j35HQ5Tfk=Sk0e1dYVVWC7C4cE4f59SbrMm7A@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<CAL3CFcV6WZ76pMPJZqGYvqijPN5_zKzrJkvQE_XWJxKCo0cgKg@mail.gmail.com>
	<CADiSq7eBPWYd7j35HQ5Tfk=Sk0e1dYVVWC7C4cE4f59SbrMm7A@mail.gmail.com>
Message-ID: <CAL3CFcX6VWveiLUVB5PKA=gXJsQmuyHLwb_bNPt3KNLrBNO28A@mail.gmail.com>

OK, you're right. I was talking about the access mechanism only.

On Mon, Dec 3, 2012 at 1:51 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Mon, Dec 3, 2012 at 9:45 PM, Andrew Svetlov <andrew.svetlov at gmail.com>
> wrote:
>>
>> On Mon, Dec 3, 2012 at 1:08 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> > * the advisability of using the __annotations__ dictionary for long-term
>> > introspection, rather than using the decorator to move the information
>> > to a
>> > purpose-specific location in a separate function attribute
>> My 5 cents: perhaps you don't need to use __annotations__ at all; the
>> Signature object (PEP 362) gives a more convenient way of gathering
>> information about a function's spec.
>
>
> I don't quite understand that comment - PEP 362 is purely an access
> mechanism. The underlying storage is still in __annotations__ (at least as
> far as any annotations are concerned).
>
> However, using separate storage is a natural consequence of also providing
> an explicit decorator factory API, so I didn't bring it up.
>
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia



-- 
Thanks,
Andrew Svetlov


From barry at python.org  Mon Dec  3 16:34:16 2012
From: barry at python.org (Barry Warsaw)
Date: Mon, 3 Dec 2012 10:34:16 -0500
Subject: [Python-ideas] Conventions for function annotations
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
Message-ID: <20121203103416.03094472@resist.wooz.org>

On Dec 03, 2012, at 09:08 PM, Nick Coghlan wrote:

>I would be *quite delighted* if people are open to the idea of making a
>much stronger recommendation along the following lines explicit in PEP 8:

I am -1 for putting any of what followed in PEP 8, and in fact, I think the
existing examples at the bottom of PEP 8 are inappropriate.

PEP 8 should be prescriptive of explicit Python coding styles.  Think "do
this, not that".  It should be as minimal as possible, and in general provide
rules that can be easily referenced and perhaps automated (e.g. pep8.py).

Some of the existing text in PEP 8 already doesn't fall under that rubric, but
it's close enough (e.g. designing for inheritance).

I don't think annotations reach the level of consensus or practical experience
needed to be added to PEP 8.

OTOH, I wouldn't oppose a new informational PEP labeled "Annotations Best
Practices", where some of these principles can be laid out and explored.

Cheers,
-Barry

From guido at python.org  Mon Dec  3 18:27:35 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Dec 2012 09:27:35 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <20121203103416.03094472@resist.wooz.org>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
Message-ID: <CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>

Hm. I agree PEP 8 seems an odd place for Nick's recommendation. Even
if I were to agree with his proposal I would think it belongs in a
different PEP than PEP 8.

But personally I haven't given up on using annotations to give type
hints -- I think they can at times be a useful augmentation to
static analysis (whose use I see mostly as an aid to human readers
and/or tools like linters, IDEs, and refactoring tools, not for
guiding compiler optimizations). I know of several projects (both
public and private) for improving the state of the art of Python
static analysis with this goal in mind. With the advent of e.g.
TypeScript and Dart in the JavaScript world, optional type annotations
for dynamic languages appear to be becoming more fashionable, and
maybe we can get some use out of them.

FWIW, as far as e.g. 'int' being both overspecified and
underspecified: I don't care about the underspecification so much,
that's always going to happen; and for the overspecification, we can
either use some abstract class instead, or simply state that the
occurrence of certain concrete types must be taken as a shorthand for
a specific abstract type. This could be part of the registration call
of the concrete type, or something.

Obviously this would require inventing and standardizing notations for
things like "list of X", "tuple with items X, Y, Z", "either X or Y",
and so on, as well as a standard way of combining annotations intended
for different tools.

*This* would be a useful discussion. What to do in the interim... I
think the current language in PEP 8 is just fine until we have a
better story.

--Guido

On Mon, Dec 3, 2012 at 7:34 AM, Barry Warsaw <barry at python.org> wrote:
> On Dec 03, 2012, at 09:08 PM, Nick Coghlan wrote:
>
>>I would be *quite delighted* if people are open to the idea of making a
>>much stronger recommendation along the following lines explicit in PEP 8:
>
> I am -1 for putting any of what followed in PEP 8, and in fact, I think the
> existing examples at the bottom of PEP 8 are inappropriate.
>
> PEP 8 should be prescriptive of explicit Python coding styles.  Think "do
> this, not that".  It should be as minimal as possible, and in general provide
> rules that can be easily referenced and perhaps automated (e.g. pep8.py).
>
> Some of the existing text in PEP 8 already doesn't fall under that rubric, but
> it's close enough (e.g. designing for inheritance).
>
> I don't think annotations reach the level of consensus or practical experience
> needed to be added to PEP 8.
>
> OTOH, I wouldn't oppose a new informational PEP labeled "Annotations Best
> Practices", where some of these principles can be laid out and explored.

-- 
--Guido van Rossum (python.org/~guido)


From ncoghlan at gmail.com  Tue Dec  4 00:02:58 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 4 Dec 2012 09:02:58 +1000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
Message-ID: <CADiSq7fy3hW_Mze1jJND7MeknPKJzVhWU8KPXzfHnbO+TQV++Q@mail.gmail.com>

So long as any type hinting semantics are associated with a "@type_hints"
decorator, none of those ideas conflict with my suggestions for good
annotation usage practices.

The explicit decorators effectively end up serving as dialect specifiers
for the annotations, for the benefit of other software (by moving the
metadata out to purpose specific attributes) and for readers (simply by
being present).

Anyway, the reactions here confirmed my recollection of a lack of consensus
amongst the core team. I'll just put something up on my own site, instead.

Cheers,
Nick.

--
Sent from my phone, thus the relative brevity :)
On Dec 4, 2012 3:28 AM, "Guido van Rossum" <guido at python.org> wrote:

> Hm. I agree PEP 8 seems an odd place for Nick's recommendation. Even
> if I were to agree with his proposal I would think it belongs in a
> different PEP than PEP 8.
>
> But personally I haven't given up on using annotations to give type
> hints -- I think it can at some times be a useful augmentation to
> static analysis (whose use I see mostly as an aid to human readers
> and/or tools like linters, IDEs, and refactoring tools, not for
> guiding compiler optimizations). I know of several projects (both
> public and private) for improving the state of the art of Python
> static analysis with this goal in mind. With the advent of e.g.
> TypeScript and Dart in the JavaScript world, optional type annotations
> for dynamic languages appear to be becoming more fashionable, and
> maybe we can get some use out of them.
>
> FWIW, as far as e.g. 'int' being both overspecified and
> underspecified: I don't care about the underspecification so much,
> that's always going to happen; and for the overspecification, we can
> either use some abstract class instead, or simply state that the
> occurrence of certain concrete types must be taken as a shorthand for
> a specific abstract type. This could be part of the registration call
> of the concrete type, or something.
>
> Obviously this would require inventing and standardizing notations for
> things like "list of X", "tuple with items X, Y, Z", "either X or Y",
> and so on, as well as a standard way of combining annotations intended
> for different tools.
>
> *This* would be a useful discussion. What to do in the interim... I
> think the current language in PEP 8 is just fine until we have a
> better story.
>
> --Guido
>
> On Mon, Dec 3, 2012 at 7:34 AM, Barry Warsaw <barry at python.org> wrote:
> > On Dec 03, 2012, at 09:08 PM, Nick Coghlan wrote:
> >
> >>I would be *quite delighted* if people are open to the idea of making a
> >>much stronger recommendation along the following lines explicit in PEP 8:
> >
> > I am -1 for putting any of what followed in PEP 8, and in fact, I think
> the
> > existing examples at the bottom of PEP 8 are inappropriate.
> >
> > PEP 8 should be prescriptive of explicit Python coding styles.  Think "do
> > this, not that".  It should be as minimal as possible, and in general
> provide
> > rules that can be easily referenced and perhaps automated (e.g. pep8.py).
> >
> > Some of the existing text in PEP 8 already doesn't fall under that
> rubric, but
> > it's close enough (e.g. designing for inheritance).
> >
> > I don't think annotations reach the level of consensus or practical
> experience
> > needed to be added to PEP 8.
> >
> > OTOH, I wouldn't oppose a new informational PEP labeled "Annotations Best
> > Practices", where some of these principles can be laid out and explored.
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From aquavitae69 at gmail.com  Tue Dec  4 10:37:07 2012
From: aquavitae69 at gmail.com (David Townshend)
Date: Tue, 4 Dec 2012 11:37:07 +0200
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7fy3hW_Mze1jJND7MeknPKJzVhWU8KPXzfHnbO+TQV++Q@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<CADiSq7fy3hW_Mze1jJND7MeknPKJzVhWU8KPXzfHnbO+TQV++Q@mail.gmail.com>
Message-ID: <CAEgL-fcHaDvyUG4+NM2RQb_ffDMhW4T-9zS0NzgQatVqvfbUkw@mail.gmail.com>

Just thought of a couple of usages which don't fit into the decorator
model.  The first is using the return annotation for early binding:

    def func(seq) -> dict(sorted=sorted):
        return func.__annotations__['return']['sorted'](seq)

Strangely enough, this seems to run slightly faster than

    def func(seq, sorted=sorted):
        return sorted(seq)

My test shows the first running in about 0.376s and the second in about
0.382s (Python 3.3, 64-bit).
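A difference that small is easily lost in noise, so it is worth re-measuring locally. A rough harness along these lines (using timeit's `globals` parameter, which assumes Python 3.5+) reproduces the comparison; the absolute numbers will vary by machine and interpreter version:

```python
import timeit

def func_ann(seq) -> dict(sorted=sorted):
    # Early binding via the return annotation.
    return func_ann.__annotations__['return']['sorted'](seq)

def func_def(seq, sorted=sorted):
    # Conventional early binding via a default argument.
    return sorted(seq)

data = list(range(100))

for stmt in ('func_ann(data)', 'func_def(data)'):
    best = min(timeit.repeat(stmt, number=10000, repeat=3, globals=globals()))
    print(stmt, best)
```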


The second is passing information to base classes.  This is a rather
contrived example which could easily be solved (better) in plenty of other
ways, but it does illustrate a pattern which someone else may be able to
turn into a genuine use case.

class NumberBase:

    def adjust(self, value):
        return self.adjust.__annotations__['return'](value)


class NegativeInteger(NumberBase):

    def adjust(self, value) -> int:
        return super().adjust(-value)


>>> ni = NegativeInteger()
>>> ni.adjust(4.3)
-4


Cheers

David


On Tue, Dec 4, 2012 at 1:02 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> So long as any type hinting semantics are associated with a "@type_hints"
> decorator, none of those ideas conflict with my suggestions for good
> annotation usage practices.
>
> The explicit decorators effectively end up serving as dialect specifiers
> for the annotations, for the benefit of other software (by moving the
> metadata out to purpose specific attributes) and for readers (simply by
> being present).
>
> Anyway, the reactions here confirmed my recollection of a lack of
> consensus amongst the core team. I'll just put something up on my own site,
> instead.
>
> Cheers,
> Nick.
>
> --
> Sent from my phone, thus the relative brevity :)
> On Dec 4, 2012 3:28 AM, "Guido van Rossum" <guido at python.org> wrote:
>
>> Hm. I agree PEP 8 seems an odd place for Nick's recommendation. Even
>> if I were to agree with his proposal I would think it belongs in a
>> different PEP than PEP 8.
>>
>> But personally I haven't given up on using annotations to give type
>> hints -- I think it can at some times be a useful augmentation to
>> static analysis (whose use I see mostly as an aid to human readers
>> and/or tools like linters, IDEs, and refactoring tools, not for
>> guiding compiler optimizations). I know of several projects (both
>> public and private) for improving the state of the art of Python
>> static analysis with this goal in mind. With the advent of e.g.
>> TypeScript and Dart in the JavaScript world, optional type annotations
>> for dynamic languages appear to be becoming more fashionable, and
>> maybe we can get some use out of them.
>>
>> FWIW, as far as e.g. 'int' being both overspecified and
>> underspecified: I don't care about the underspecification so much,
>> that's always going to happen; and for the overspecification, we can
>> either use some abstract class instead, or simply state that the
>> occurrence of certain concrete types must be taken as a shorthand for
>> a specific abstract type. This could be part of the registration call
>> of the concrete type, or something.
>>
>> Obviously this would require inventing and standardizing notations for
>> things like "list of X", "tuple with items X, Y, Z", "either X or Y",
>> and so on, as well as a standard way of combining annotations intended
>> for different tools.
>>
>> *This* would be a useful discussion. What to do in the interim... I
>> think the current language in PEP 8 is just fine until we have a
>> better story.
>>
>> --Guido
>>
>> On Mon, Dec 3, 2012 at 7:34 AM, Barry Warsaw <barry at python.org> wrote:
>> > On Dec 03, 2012, at 09:08 PM, Nick Coghlan wrote:
>> >
>> >>I would be *quite delighted* if people are open to the idea of making a
>> >>much stronger recommendation along the following lines explicit in PEP
>> 8:
>> >
>> > I am -1 for putting any of what followed in PEP 8, and in fact, I think
>> the
>> > existing examples at the bottom of PEP 8 are inappropriate.
>> >
>> > PEP 8 should be prescriptive of explicit Python coding styles.  Think
>> "do
>> > this, not that".  It should be as minimal as possible, and in general
>> provide
>> > rules that can be easily referenced and perhaps automated (e.g.
>> pep8.py).
>> >
>> > Some of the existing text in PEP 8 already doesn't fall under that
>> rubric, but
>> > it's close enough (e.g. designing for inheritance).
>> >
>> > I don't think annotations reach the level of consensus or practical
>> experience
>> > needed to be added to PEP 8.
>> >
>> > OTOH, I wouldn't oppose a new informational PEP labeled "Annotations
>> Best
>> > Practices", where some of these principles can be laid out and explored.
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>

From haael at interia.pl  Tue Dec  4 10:58:04 2012
From: haael at interia.pl (haael at interia.pl)
Date: Tue, 04 Dec 2012 10:58:04 +0100
Subject: [Python-ideas] New __reference__ hook
Message-ID: <lmmobamdjivxejhjhspo@xsga>


Hi, guys.

Python 3 is very close to becoming a holy grail of programming languages, in the sense that almost everything can be redefined. However, there is still one thing missing: immutable copy-on-assign numeric types.
Consider this part of code:

a = 1
b = a
a += 1
assert a == b + 1

The object "1" gets assigned to the "a" variable, then another independent copy gets assigned to the "b" variable, then the value in the "a" variable gets modified without affecting the second.
The problem is - this behaviour can not be recreated in user-defined classes:

a = MyInteger(1)
b = a
a += 1
assert a == b + 1

The "a" and "b" variables both point to the same object. This differs from what one might expect of numeric types.

My proposal is to define another hook that gets called when an object is referenced.

class MyInteger:
	def __reference__(self, context):
		return copy.copy(self)

Each time the reference count of an object would normally be incremented, this method would be called and the returned object referenced instead. The default implementation would of course be to return self. The context argument would give the object some information about how it is being referenced.

This will allow easy implementation of such concepts as singletons, copy-on-write, immutables and even simplify things like reference loops.

The most obvious use case would be implementations of mathematical types like vectors, polynomials and so on. I encountered this problem when writing a simple vector library: I had to explicitly copy each object on assignment, which is particularly annoying. The programmer's intuition for numeric-like types is for them to be immutable, yet to support augmented assignment operators. This cannot be implemented transparently in current Python, hence my proposal.

Cheers,
haael.




From steve at pearwood.info  Tue Dec  4 12:14:44 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 04 Dec 2012 22:14:44 +1100
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <lmmobamdjivxejhjhspo@xsga>
References: <lmmobamdjivxejhjhspo@xsga>
Message-ID: <50BDDB24.7050106@pearwood.info>

On 04/12/12 20:58, haael at interia.pl wrote:
>
> Hi, guys.
>
> Python 3 is very close to becoming a holy grail of programming languages,
> in the sense that almost everything can be redefined. However, there is
> still one thing missing: immutable copy-on-assign numeric types.
> Consider this part of code:

I dispute that "everything can be redefined" is the holy grail of
programming languages. If it were, why isn't everyone using Forth?


> a = 1
> b = a
> a += 1
> assert a == b + 1
>
> The object "1" gets assigned to the "a" variable,

Correct, for some definition of "assigned" and "variable".


> then another independent copy gets assigned to the "b" variable,

Completely, utterly wrong.


>then the value in the "a" variable gets modified

Incorrect.


> without affecting the second.
> The problem is - this behaviour can not be recreated in user-defined
>  classes:

Of course it can.

py> from decimal import Decimal  # A pure-Python class, prior to Python 3.3
py> a = Decimal(1)
py> b = a
py> a += 1
py> assert a == b + 1
py> print a, b
2 1

If you prefer another example, use fractions.Fraction, also pure Python
and immutable, with support for augmented assignment.
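The same test passes verbatim with Fraction:

```python
from fractions import Fraction

a = Fraction(1)
b = a
a += 1          # creates a new Fraction; b still refers to the old object
assert a == b + 1
assert (a, b) == (2, 1)
```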


I don't mean to be rude, or dismissive, but this is pretty basic Python
stuff. Please start with the Python data and execution models:

http://docs.python.org/2/reference/datamodel.html
http://docs.python.org/2/reference/executionmodel.html


although I must admit I don't find either of them especially clear. But
in simple terms, you need to reset your thinking: your assumptions about
what Python does are incorrect.

When you say:

a = 1

you are *binding* the name "a" to the object 1. When you then follow by
saying:

b = a

you bind the name "b" to the *same* object 1. It is not a copy. It is not
a "copy on assignment" or any other clever trick. You can prove to
yourself that they are the same object:


py> a = 1
py> b = a
py> a is b
True
py> id(a), id(b)
(140087472, 140087472)


When you then call

a += 1

this does not modify anything. Int objects (and floats, Decimal, strings,
and many others) are *immutable* -- they cannot be modified. So `a += 1`
creates a new object, 2, and binds it to the name "a". But the binding
from object 1 to name "b" is not touched.

If this is still not clear, I recommend you take the discussion onto one
of the other Python mailing lists, especially tutor at python.org or
python-list at python.org, which are more appropriate for discussing these
things.



-- 
Steven


From jstpierre at mecheye.net  Tue Dec  4 17:43:34 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Tue, 4 Dec 2012 11:43:34 -0500
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <50BBD4DD.7010703@pearwood.info>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<50BBD4DD.7010703@pearwood.info>
Message-ID: <CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>

Indeed. I've looked at annotations before, but I never understood the
purpose. It seemed like a feature that was designed and implemented without
a concrete goal in mind, one where the community was supposed to discover
the goal itself.

So, if I may ask, what was the original goal of annotations? The PEP gives
some suggestions, but doesn't settle on anything concrete. Was it designed
to be an aid to IDEs, or to static analysis tools that inspect source code?
Something for applications themselves to munge through to provide special
behaviors, like a command-line parser or a runtime type checker?

The local decorator influence might work, but it has the problem that it can
only be used once before we fall back to the old method. Would you
rather:

    @tab_expand(filename=glob('*.txt'))
    @types
    def read_from_filename(filename:str, num_bytes:int) -> bytes:
        pass

or

    @tab_expand(filename=glob('*.txt'))
    @types(filename=str, num_bytes=int, return_=bytes)
    def read_from_filename(filename, num_bytes):
        pass

For consistency's sake, I'd prefer the latter.

Note that we could take a convention, like Thomas suggests, and adopt both:

    @tab_expand
    @types
    def read_from_filename(filename:(str, glob('*.txt')), num_bytes:int) ->
bytes:
        pass

But that's a "worst of both worlds" approach: we lose the locality of which
argument applies to which decorator (unless we make up rules about
positioning in the tuple or something), and we gunk up the function
signature, all to use a fancy new Python 3 feature.

With a restricted and narrow focus, I could see them gaining adoption, but
for now, it seems like extra syntax was introduced simply for the sake of
having extra syntax.



On Sun, Dec 2, 2012 at 5:23 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> On 02/12/12 22:43, Nick Coghlan wrote:
>
>  Last time it came up, the collective opinion on python-dev was still to
>> leave PEP 8 officially neutral on the topic so that people could
>> experiment
>> more freely with annotations and the community could help figure out what
>> worked well and what didn't. Admittedly this was long enough ago that I
>> don't remember the details, just the obvious consequence that PEP 8
>> remains
>> largely silent on the matter, aside from declaring that function
>> annotations are off-limits for standard library modules: "The Python
>> standard library will not use function annotations as that would result in
>> a premature commitment to a particular annotation style. Instead, the
>> annotations are left for users to discover and experiment with useful
>> annotation styles."
>>
>
> I fear that this was a strategic mistake. The result, it seems to me, is
> that
> annotations have been badly neglected.
>
> I can't speak for others, but I heavily use the standard library as a guide
> to what counts as good practice in Python. I'm not a big user of third
> party
> libraries, and most of those are for 2.x, so with the lack of annotations
> in
> the std lib I've had no guidance as to what sort of things annotations
> could
> be used for apart from "type checking".
>
> I'm sure that I'm not the only one.
>
>
>
> --
> Steven
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121204/d4bd9fa8/attachment.html>

From thomas at kluyver.me.uk  Tue Dec  4 17:51:13 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Tue, 4 Dec 2012 16:51:13 +0000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<50BBD4DD.7010703@pearwood.info>
	<CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>
Message-ID: <CAOvn4qhB3etXwrDAb3K2zXg4zA4UBLxdbFZqOU6vvNcdkhiM8g@mail.gmail.com>

On 4 December 2012 16:43, Jasper St. Pierre <jstpierre at mecheye.net> wrote:

> The local decorator influence might work, but that has the problem of only
> being able to be used once before we fall back to the old method. Would you
> rather:
>
>     @tab_expand(filename=glob('*.txt'))
>     @types
>     def read_from_filename(filename:str, num_bytes:int) -> bytes:
>         pass
>
> or
>
>     @tab_expand(filename=glob('*.txt'))
>     @types(filename=str, num_bytes=int, return_=bytes)
>     def read_from_filename(filename, num_bytes):
>         pass
>
> For consistency's sake, I'd prefer the latter.
>

Using the decorator decorator I posted (https://gist.github.com/4189289 ),
you could use these interchangeably, so the annotations are just a
convenient alternative syntax for when you think they'd make the code more
readable.
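The interchangeable style can be sketched roughly like this (a hypothetical helper, not the gist's actual code): a decorator factory that accepts per-argument info either as keyword arguments or, when applied bare, from the function's annotations.

```python
import functools

def flexible(dec):
    """Make a decorator usable both bare (reading annotations) and
    with explicit keyword arguments. Hypothetical sketch."""
    @functools.wraps(dec)
    def wrapper(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            func = args[0]                       # used bare: @types
            return dec(func, **func.__annotations__)
        return lambda func: dec(func, **kwargs)  # used as @types(x=int, ...)
    return wrapper

@flexible
def types(func, **argtypes):
    func.argtypes = argtypes                     # toy semantics: just record
    return func

@types
def f(x: int, y: str):
    pass

@types(x=int, y=str)
def g(x, y):
    pass

assert f.argtypes == g.argtypes == {"x": int, "y": str}
```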

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121204/82d15ffc/attachment.html>

From ned at nedbatchelder.com  Tue Dec  4 18:12:12 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Tue, 04 Dec 2012 12:12:12 -0500
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<50BBD4DD.7010703@pearwood.info>
	<CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>
Message-ID: <50BE2EEC.9000402@nedbatchelder.com>

On 12/4/2012 11:43 AM, Jasper St. Pierre wrote:
> Indeed. I've looked at annotations before, but I never understood the 
> purpose. It seemed like a feature that was designed and implemented 
> without some goal in mind, and where the community was supposed to 
> discover the goal themselves.
>
> So, if I may ask, what was the original goal of annotations? The PEP 
> gives some suggestions, but doesn't leave anything concrete. Was it 
> designed to be an aid to IDEs, or static analysis tools that inspect 
> source code? Something for applications themselves to munge through to 
> provide special behaviors, like a command line parser, or runtime 
> static checker?

A telling moment for me was during an early Py3k keynote at PyCon 
(perhaps it was in Dallas or Chicago?), Guido couldn't remember the word 
"annotation," and said, "you know, those things that aren't type 
declarations?"  :-)

--Ned.


From guido at python.org  Tue Dec  4 18:56:01 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Dec 2012 09:56:01 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <50BE2EEC.9000402@nedbatchelder.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<50BBD4DD.7010703@pearwood.info>
	<CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>
	<50BE2EEC.9000402@nedbatchelder.com>
Message-ID: <CAP7+vJK=F+JKH1_vQQxKmADajhxFxCN3xzfoG_o9LoMJKunZFA@mail.gmail.com>

On Tue, Dec 4, 2012 at 9:12 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
> On 12/4/2012 11:43 AM, Jasper St. Pierre wrote:
>>
>> Indeed. I've looked at annotations before, but I never understood the
>> purpose. It seemed like a feature that was designed and implemented without
>> some goal in mind, and where the community was supposed to discover the goal
>> themselves.

On the contrary. There were too many use cases that immediately looked
important, and we couldn't figure out which ones would be the most
important or how to combine them, so we decided to take a two-step
approach: in step 1, we designed the syntax, whereas in step 2, we
would design the semantics. The idea was very clear that once the
syntax was settled people would be free to experiment with different
semantics -- just not in the stdlib. The idea was also that
eventually, from all those experiments, one would emerge that would be
fit for the stdlib.

The process was somewhat similar to the way decorators were
introduced. In Python 2.3, we introduced things like staticmethod,
classmethod and property. But we *didn't* introduce the @ syntax,
because we couldn't agree about it at that point. Then, for 2.4, we
sorted out the proper syntax, having by then conclusively discovered
that the original way of using e.g. classmethod (an assignment after
the end of the method definition) was hard on the human reader.

(Of course, you may note that for decorators, we decided on semantics
first, syntax second. But no two situations are quite the same, and in
the case of annotations, without syntax it would be nearly impossible
to experiment with semantics.)

>> So, if I may ask, what was the original goal of annotations? The PEP gives
>> some suggestions, but doesn't leave anything concrete. Was it designed to be
>> an aid to IDEs, or static analysis tools that inspect source code? Something
>> for applications themselves to munge through to provide special behaviors,
>> like a command line parser, or runtime static checker?

Pretty much all of the above to some extent. But for me personally,
the main goal was always to arrive at a notation to specify type
constraints (and maybe other constraints) for arguments and return
values. I've toyed at various times with specific ways of combining
types. E.g. list[int] might mean a list of integers, and dict[str,
tuple[float, float, float, bool]] might mean a dict mapping strings to
tuples of three floats and a bool. But I felt it was much harder to
get consensus about such a notation than about the syntax for argument
annotations (think about how many objections you can bring in to these
two examples :-) -- I've always had a strong desire to use "var: type
= default" and to make the type a runtime expression to be evaluated
at the same time as the default.
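For a rough feel of how such a notation could be prototyped without any new syntax, here is a sketch using __getitem__ on a metaclass (hypothetical helper names, no checking semantics implied, and not anything being proposed here):

```python
class _Parametrizable(type):
    """Hypothetical sketch: allow List[int]-style composition via
    __getitem__ on a metaclass. The result is just a named subclass
    that records its parameters."""
    def __getitem__(cls, params):
        if not isinstance(params, tuple):
            params = (params,)
        name = "%s[%s]" % (cls.__name__,
                           ", ".join(p.__name__ for p in params))
        return _Parametrizable(name, (cls,), {"__params__": params})

class List(metaclass=_Parametrizable):
    pass

class Dict(metaclass=_Parametrizable):
    pass

# a hypothetical spelling of the dict-of-tuples example above
ann = Dict[str, List[int]]
assert ann.__name__ == "Dict[str, List[int]]"
```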

> A telling moment for me was during an early Py3k keynote at PyCon (perhaps
> it was in Dallas or Chicago?), Guido couldn't remember the word
> "annotation," and said, "you know, those things that aren't type
> declarations?"  :-)

Heh. :-)

-- 
--Guido van Rossum (python.org/~guido)


From masklinn at masklinn.net  Tue Dec  4 19:12:18 2012
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 4 Dec 2012 19:12:18 +0100
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
Message-ID: <55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>

On 2012-12-03, at 18:27 , Guido van Rossum wrote:
> 
> Obviously this would require inventing and standardizing notations for
> things like "list of X", "tuple with items X, Y, Z", "either X or Y",
> and so on, as well as a standard way of combining annotations intended
> for different tools.

I've always felt that __getitem__ and __or__/__ror__ on type (1) look
rather good and (2) resemble both informal type specs and the type specs
of other languages. But that's the issue with annotations being ordinary
Python syntax: experimenting with such notations requires changing things
fairly deep inside Python.

The most bothersome part is that I "feel" "either X or Y" (aka `X | Y`)
should be a set of types (and thus the same as {X, Y}[0]), but that doesn't
work with `isinstance` or `issubclass`. Likewise, `(a, b, c)` in an
annotation feels like it should mean the same as `tuple[a, b, c]` ("a
tuple with 3 items of types a, b and c respectively"), but that's at odds
with the same type-checking functions.

The first could be fixed by slightly relaxing the constraints of
isinstance and issubclass, but not the second.

[0] which works rather neatly for anonymous unions as `|` is the union
    of two sets, so the arithmetic would be `type | type -> typeset`,
    `type | typeset -> typeset` and `typeset | typeset -> typeset`,
    libraries could offer opaque types/typesets which would be composable
    without their users having to know whether they're type atoms or
    typesets
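The typeset arithmetic in [0] can be sketched with plain classes today (hypothetical TypeSet helper); defining __instancecheck__ on the typeset even sidesteps the isinstance problem, since isinstance() consults it on the second argument's type:

```python
class TypeSet:
    """Sketch of the `type | type -> typeset` arithmetic from [0]."""
    def __init__(self, *members):
        flat = set()
        for m in members:
            # flatten nested typesets so members are always plain classes
            flat.update(m.types if isinstance(m, TypeSet) else {m})
        self.types = frozenset(flat)

    def __or__(self, other):
        return TypeSet(self, other)

    __ror__ = __or__

    def __instancecheck__(self, obj):
        # lets isinstance() accept a typeset where a class is expected
        return isinstance(obj, tuple(self.types))

number = TypeSet(int, float)
number_or_text = number | str          # typeset | type -> typeset
assert isinstance(3.5, number_or_text)
assert not isinstance(b"x", number_or_text)
```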


From ericsnowcurrently at gmail.com  Tue Dec  4 19:22:46 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 4 Dec 2012 11:22:46 -0700
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<50BBD4DD.7010703@pearwood.info>
	<CAA0H+QRCj1-3rZ0oW73ChjomRjUrwwdyzDanH-1d6sxHYqRt9g@mail.gmail.com>
Message-ID: <CALFfu7AYSYMuJuWSQZw79vyRsSP6om9AYcZPLnTHhcYgR8r0Wg@mail.gmail.com>

Check out http://www.artima.com/weblogs/viewpost.jsp?thread=89161

-eric

On Tue, Dec 4, 2012 at 9:43 AM, Jasper St. Pierre <jstpierre at mecheye.net> wrote:
> Indeed. I've looked at annotations before, but I never understood the
> purpose. It seemed like a feature that was designed and implemented without
> some goal in mind, and where the community was supposed to discover the goal
> themselves.
>
> So, if I may ask, what was the original goal of annotations? The PEP gives
> some suggestions, but doesn't leave anything concrete. Was it designed to be
> an aid to IDEs, or static analysis tools that inspect source code? Something
> for applications themselves to munge through to provide special behaviors,
> like a command line parser, or runtime static checker?
>
> The local decorator influence might work, but that has the problem of only
> being able to be used once before we fall back to the old method. Would you
> rather:
>
>     @tab_expand(filename=glob('*.txt'))
>     @types
>     def read_from_filename(filename:str, num_bytes:int) -> bytes:
>         pass
>
> or
>
>     @tab_expand(filename=glob('*.txt'))
>     @types(filename=str, num_bytes=int, return_=bytes)
>     def read_from_filename(filename, num_bytes):
>         pass
>
> For consistency's sake, I'd prefer the latter.
>
> Note that we could take a convention, like Thomas suggests, and adopt both:
>
>     @tab_expand
>     @types
>     def read_from_filename(filename:(str, glob('*.txt')), num_bytes:int) ->
> bytes:
>         pass
>
> But that's a "worst of both worlds" approach: we lose the locality of which
> argument applies to which decorator (unless we make up rules about
> positioning in the tuple or something), and we gunk up the function
> signature, all to use a fancy new Python 3 feature.
>
> With a restricted and narrow focus, I could see them gaining adoption, but
> for now, it seems like extra syntax was introduced simply for the point of
> having extra syntax.
>
>
>
> On Sun, Dec 2, 2012 at 5:23 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>>
>> On 02/12/12 22:43, Nick Coghlan wrote:
>>
>>> Last time it came up, the collective opinion on python-dev was still to
>>> leave PEP 8 officially neutral on the topic so that people could
>>> experiment
>>> more freely with annotations and the community could help figure out what
>>> worked well and what didn't. Admittedly this was long enough ago that I
>>> don't remember the details, just the obvious consequence that PEP 8
>>> remains
>>> largely silent on the matter, aside from declaring that function
>>> annotations are off-limits for standard library modules: "The Python
>>> standard library will not use function annotations as that would result
>>> in
>>> a premature commitment to a particular annotation style. Instead, the
>>> annotations are left for users to discover and experiment with useful
>>> annotation styles."
>>
>>
>> I fear that this was a strategic mistake. The result, it seems to me, is
>> that
>> annotations have been badly neglected.
>>
>> I can't speak for others, but I heavily use the standard library as a
>> guide
>> to what counts as good practice in Python. I'm not a big user of third
>> party
>> libraries, and most of those are for 2.x, so with the lack of annotations
>> in
>> the std lib I've had no guidance as to what sort of things annotations
>> could
>> be used for apart from "type checking".
>>
>> I'm sure that I'm not the only one.
>>
>>
>>
>> --
>> Steven
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
>
>
>
> --
>   Jasper
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


From barry at python.org  Tue Dec  4 20:39:50 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 4 Dec 2012 14:39:50 -0500
Subject: [Python-ideas] New __reference__ hook
References: <lmmobamdjivxejhjhspo@xsga>
	<50BDDB24.7050106@pearwood.info>
Message-ID: <20121204143950.02880d94@resist.wooz.org>

On Dec 04, 2012, at 10:14 PM, Steven D'Aprano wrote:

>I dispute that "everything can be redefined" is the holy grail of
>programming languages. If it were, why isn't everyone using Forth?

On the readability scale, where Python is pretty close to a 10 (almost
everyone can read almost all Python), Perl is a 4 (hard to read your own code
after a week or so), Forth is a 1 (you can't even read your own code after
your fingers stop moving).

:)

a-forth-enthusiast-from-way-back-in-the-day-ly y'rs,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121204/97e7ac24/attachment.pgp>

From mikegraham at gmail.com  Wed Dec  5 15:10:14 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Wed, 5 Dec 2012 09:10:14 -0500
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <lmmobamdjivxejhjhspo@xsga>
References: <lmmobamdjivxejhjhspo@xsga>
Message-ID: <CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>

On Tue, Dec 4, 2012 at 4:58 AM, <haael at interia.pl> wrote:

>
> Python 3 is very close to become a holy grail of programming languages in
> the sense that almost everything could be redefined. However, there is
> still one thing missing: the immutable copy-on-assign numeric types.
> Consider this part of code:
>
> a = 1
> b = a
> a += 1
> assert a == b + 1
>
> The object "1" gets assigned to the "a" variable, then another independent
> copy gets assigned to the "b" variable, then the value in the "a" variable
> gets modified without affecting the second.
> The problem is - this behaviour can not be recreated in user-defined
> classes:
>
> a = MyInteger(1)
> b = a
> a += 1
> assert a == b + 1
>
> The "a" and "b" variables both point to the same object. This is a
> difference on what one might expect with numeric types.
>

You misunderstand Python's semantics. Python never implicitly copies
anything. Some types, like int, are immutable, so you can't meaningfully
distinguish between copying and not copying.

Any name like `a` can be rebound (`a = ...`), and rebinding never mutates
the object. Some objects can be mutated, but that is done by some means
other than rebinding a name.

I don't know what problem you had defining MyInteger. Here is a definition
(albeit very, very sloppy) that passes your test:

class MyInteger(object):
    def __init__(self, i):
        self._i = i

    def __add__(self, other):
        if isinstance(other, MyInteger):
            other = other._i
        return MyInteger(self._i + other)

    def __eq__(self, other):
        if isinstance(other, MyInteger):
            other = other._i
        return self._i == other

Mike
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/dd063981/attachment.html>

From random832 at fastmail.us  Wed Dec  5 17:06:28 2012
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Wed, 05 Dec 2012 11:06:28 -0500
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
Message-ID: <1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>

On Wed, Dec 5, 2012, at 9:10, Mike Graham wrote:
> I don't know what problem you had defining MyInteger. Here is a definition (albeit comprised of very, very sloppy code) that passes your test

Most likely he thought he had to define __iadd__ for += to work.
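For an immutable design, *not* defining __iadd__ is the point: `+=` then falls back to __add__ and rebinds the name. A mutating __iadd__ gives the other behaviour (a sketch, with a hypothetical MutableInt):

```python
class MutableInt:
    """A *mutating* __iadd__, presumably the behaviour haael expected
    to avoid: every name bound to the object observes the change."""
    def __init__(self, i):
        self.i = i

    def __iadd__(self, other):
        self.i += other      # mutate in place
        return self          # rebind the name to the very same object

a = MutableInt(1)
b = a
a += 1
assert a is b and b.i == 2   # b changed too: no copy anywhere
```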


From jstpierre at mecheye.net  Wed Dec  5 18:05:32 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Wed, 5 Dec 2012 12:05:32 -0500
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
Message-ID: <CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>

And? What's wrong with an __iadd__ that's exactly the same as Mike's
__add__?


On Wed, Dec 5, 2012 at 11:06 AM, <random832 at fastmail.us> wrote:

> On Wed, Dec 5, 2012, at 9:10, Mike Graham wrote:
> > I don't know what problem you had defining MyInteger. Here is a
> definition (albeit comprised of very, very sloppy code) that passes your
> test
>
> Most likely he thought he had to define __iadd__ for += to work.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/4801aa85/attachment.html>

From sturla at molden.no  Wed Dec  5 19:09:47 2012
From: sturla at molden.no (Sturla Molden)
Date: Wed, 05 Dec 2012 19:09:47 +0100
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
Message-ID: <50BF8DEB.9040101@molden.no>

On 05.12.2012 18:05, Jasper St. Pierre wrote:
> And? What's wrong with an __iadd__ that's exactly the same as Mike's
> __add__?

I think it was a Java-confusion. He thought numbers were copied on 
assignment. But there is no difference between value types and object 
types in Python. Ints and floats are immutable, but they are not value 
types as in Java.

But apart from that, I think allowing overloading of the binding 
operator "=" might be a good idea. A special method __bind__ could 
return the object to be bound:

    a = b

should then bind the name "a" to the return value of

    b.__bind__()

if b implements __bind__.

Sure, it could be used to implement copy on assignment. But it would 
also do other things like allowing lazy evaluation of an expression.

NumPy code like

    z = a*x + b*y + c

could avoid creating three temporary arrays if there was a __bind__ 
function called on "=". This is a big thing, cf. the difference between 
NumPy and numexpr:

    z = numexpr.evaluate("""a*x + b*y + c""")

The reason numerical expressions must be written as strings to be
efficient in Python is that there is no __bind__ function.
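Even without a __bind__ hook, the deferred evaluation can be sketched by hand with operator overloading (a hypothetical Lazy type, with scalars standing in for arrays); __bind__ would merely make the final forcing step automatic:

```python
def _val(x):
    return x.evaluate() if isinstance(x, Lazy) else x

class Lazy:
    """Operators build thunks instead of computing immediately,
    mimicking what a __bind__ hook could trigger for array expressions."""
    def __init__(self, value=None, thunk=None):
        self._value, self._thunk = value, thunk

    def evaluate(self):
        if self._thunk is not None:       # force the deferred expression once
            self._value, self._thunk = self._thunk(), None
        return self._value

    def __mul__(self, other):
        return Lazy(thunk=lambda: _val(self) * _val(other))

    __rmul__ = __mul__

    def __add__(self, other):
        return Lazy(thunk=lambda: _val(self) + _val(other))

    __radd__ = __add__

a, x, b, y, c = Lazy(2), Lazy(10), Lazy(3), Lazy(100), 5
z = a * x + b * y + c         # no arithmetic has happened yet
assert z.evaluate() == 2 * 10 + 3 * 100 + 5
```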



Sturla




From guido at python.org  Wed Dec  5 19:17:53 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Dec 2012 10:17:53 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAEgL-fcHaDvyUG4+NM2RQb_ffDMhW4T-9zS0NzgQatVqvfbUkw@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<CADiSq7fy3hW_Mze1jJND7MeknPKJzVhWU8KPXzfHnbO+TQV++Q@mail.gmail.com>
	<CAEgL-fcHaDvyUG4+NM2RQb_ffDMhW4T-9zS0NzgQatVqvfbUkw@mail.gmail.com>
Message-ID: <CAP7+vJKVeDzEmBLM7nadQW4zFhOa1Es8r_TR84WGveNgBypFyw@mail.gmail.com>

On Tue, Dec 4, 2012 at 1:37 AM, David Townshend <aquavitae69 at gmail.com> wrote:
> Just thought of a couple of usages which don't fit into the decorator model.
> The first is using the return annotation for early binding:
>
>     def func(seq) -> dict(sorted=sorted):
>         return func.__annotations__['return']['sorted'](seq)

You've got to be kidding...

> Strangely enough, this seems to run slightly faster than
>
>     def func(seq, sorted=sorted):
>         return sorted(seq)
>
> My test shows the first running in about 0.376s and the second in about
> 0.382s (python 3.3, 64bit).

Surely that's some kind of random variation. It's only a 2% difference.

> The second is passing information to base classes.  This is a rather
> contrived example which could easily be solved (better) in plenty of other
> ways, but it does illustrate a pattern which someone else may be able to
> turn into a genuine use case.
>
> class NumberBase:
>
>     def adjust(self, value):
>         return self.adjust.__annotations__['return'](value)
>
>
> class NegativeInteger(NumberBase):
>
>     def adjust(self, value) -> int:
>         return super().adjust(-value)
>
>
>>>> ni = NegativeInteger()
>>>> ni.adjust(4.3)
> -4

This looks like a contrived way to use what is semantically equivalent
to function attributes. The base class could write

  def adjust(self, value):
    return self.adjust.adjuster(value)

and the subclass could write

  def adjust(self, value):
    return super().adjust(-value)
  adjust.adjuster = int

Or invent a decorator to set the attribute:

  @set(adjuster=int)
  def adjust(self, value):
    return super().adjust(-value)

But both of these feel quite awkward compared to just using a class attribute.

class NumberBase:

  def adjust(self, value):
    return self.adjuster(value)

class NegativeInteger(NumberBase):

  adjuster = int
  # No need to override adjust()

IOW, this is not a line of thought to pursue.

-- 
--Guido van Rossum (python.org/~guido)


From masklinn at masklinn.net  Wed Dec  5 19:51:12 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 5 Dec 2012 19:51:12 +0100
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <50BF8DEB.9040101@molden.no>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
Message-ID: <63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>

On 2012-12-05, at 19:09 , Sturla Molden wrote:

> On 05.12.2012 18:05, Jasper St. Pierre wrote:
>> And? What's wrong with an __iadd__ that's exactly the same as Mike's
>> __add__?
> 
> I think it was a Java-confusion. He thought numbers were copied on assignment. But there is no difference between value types and object types in Python. Ints and floats are immutable, but they are not value types as in Java.
> 
> But apart from that, I think allowing overloading of the binding operator "=" might be a good idea. A special method __bind__ could return the object to be bound:
> 
>   a = b
> 
> should then bind the name "a" to the return value of
> 
>   b.__bind__()
> 
> if b implements __bind__.

Sounds odd and full of strange edge-cases. Would bind also get called
when providing parameters to a function call? When putting an object in
a literal of some sort? When returning an object from a function/method?
If not, why not?

> Sure, it could be used to implement copy on assignment. But it would also do other things like allowing lazy evaluation of an expression.
> 
> NumPy code like
> 
>   z = a*x + b*y + c
> 
> could avoid creating three temporary arrays if there was a __bind__ function called on "=".

Why? z could just be a "lazy value" at this point, basically a manual
building of thunks, only reifying them when necessary (whenever that
is). It's not like numpy *has* to create three temporary arrays, just
that it *does*.



From bruce at leapyear.org  Wed Dec  5 19:54:22 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 5 Dec 2012 10:54:22 -0800
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <50BF8DEB.9040101@molden.no>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
Message-ID: <CAGu0AnsUzN52Aayt+VYHKLtCZVua_VVpcaa_DWkCvUwKiuAPOw@mail.gmail.com>

On Wed, Dec 5, 2012 at 10:09 AM, Sturla Molden <sturla at molden.no> wrote:

>
> But apart from that, I think allowing overloading of the binding operator
> "=" might be a good idea. A special method __bind__ could return the object
> to be bound:
>
>    a = b
>
> should then bind the name "a" to the return value of
>
>    b.__bind__()
>
> if b implements __bind__.
>

It seems a bit more complicated than that. Take the example below. When is
__bind__ going to be called? After a is multiplied by x, b is multiplied by
y, etc. or before? If after, that doesn't accomplish lazy evaluation as
below. If before, then somehow this has to convert to a form that calls
z.__bind__(something) and what is that something?

>
> Sure, it could be used to implement copy on assignment. But it would also
> do other things like allowing lazy evaluation of an expression.
>
> NumPy code like
>
>    z = a*x + b*y + c
>
> could avoid creating three temporary arrays if there was a __bind__
> function called on "=". This is a big thing, cf. the difference between
> NumPy and numexpr:
>
>    z = numexpr.evaluate("""a*x + b*y + c""")
>
> The reason numerical expressions must be written as strings to be
> efficient in Python is because there is no __bind__ function.
>

There is another way to write expressions that don't get evaluated:

lambda: a*x + b*y + c


So you could write this as z.bind(lambda: rhs) or if this is important
enough there could be a new bind operator:

lhs @= rhs


which is equivalent to

lhs.__bind__(lambda: rhs)


I think overriding = so sometimes it does regular binding and sometimes
this magic binding would be confusing and dangerous. It means that every
assignment operates differently if the lhs is already bound. Consider the
difference between

t = a + b
typo = a+b

t @= a+b
typo @= a+b


where typo was supposed to be t but was mistyped. In the first set, line 1
does __bind__ to a+b while line 2 just adds a and b and does a normal
binding. In the second set, the first does __bind__ while the second raises
an exception that typo is not bound.

It's even worse in the context of something like this:

d = {}
for i in range(2):
    d['x'] = a + i


in the first pass through the loop this is a regular assignment. In the
second pass it may call __bind__ depending on what the value of a + 0 is.
Ick.

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/a3de59b8/attachment.html>

From guido at python.org  Wed Dec  5 20:22:33 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Dec 2012 11:22:33 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
Message-ID: <CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>

On Tue, Dec 4, 2012 at 10:12 AM, Masklinn <masklinn at masklinn.net> wrote:
> On 2012-12-03, at 18:27 , Guido van Rossum wrote:
>>
>> Obviously this would require inventing and standardizing notations for
>> things like "list of X", "tuple with items X, Y, Z", "either X or Y",
>> and so on, as well as a standard way of combining annotations intended
>> for different tools.
>
> I've always felt that __getitem__ and __or__/__ror__ on type 1. looked
> rather good and 2. looked similar to informal type specs and type specs
> of other languages. Although that's the issue with annotations being
> Python syntax: it requires changing stuff fairly deep into Python to
> be able to experiment.

So, instead of using

def foo(a: int, b: str) -> float:
  <blah>

you use

from experimental_type_annotations import Int, Str, Float

def foo(a: Int, b: Str) -> Float:
  <blah>

And now we're ready for experimentation.

[Warning: none of this is particularly new; I've had these things in
my brain for years, as the referenced Artima blog post made clear.]

> The most bothersome part is that I "feel" "either X or Y" (aka `X | Y`)
> should be a set of type (and thus the same as {X, Y}[0]) but that doesn't
> work with `isinstance` or `issubclass`. Likewise, `(a, b, c)` in an
> annotation feels like it should mean the same as `tuple[a, b, c]` ("a
> tuple with 3 items of types resp. a, b and c") but that's at odds with
> the same type-checking functions.

Note that in Python 3 you can override isinstance, by defining
__instancecheck__ in the class:
http://docs.python.org/3/reference/datamodel.html?highlight=__instancecheck__#class.__instancecheck__

So it shouldn't be a problem to make isinstance(42, Int) work.

We can also make things like List[Int] and Dict[Str, Float] work, and
even rig it so that

  isinstance([1, 2, 3], List[Int]) == True

while

  isinstance([1, 2, 'booh'], List[Int]) == False

Of course there are many bikeshedding topics like whether we should
ever write List -- maybe we should write Iterable or Sequence instead,
and maybe we have to be able to express mutability, and so on. The
numeric tower (PEP 3141) is also good to keep in mind. I think that's
all solvable once we start experimenting a bit.
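
To make the experiment concrete, here is a rough sketch of the mechanics: a metaclass whose __getitem__ gives you List[Int] and whose __instancecheck__ checks the elements. All names here (List, ListMeta, Int) are stand-ins invented for illustration, not a real or proposed API.

```python
class ListMeta(type):
    def __getitem__(cls, item_type):
        # List[Int] -> a new class that remembers the element type
        return ListMeta('List[%s]' % item_type.__name__,
                        (cls,), {'item_type': item_type})

    def __instancecheck__(cls, obj):
        # Called by isinstance(obj, List[...]) per the data model
        if not isinstance(obj, list):
            return False
        item_type = getattr(cls, 'item_type', None)
        if item_type is None:
            return True  # bare List: any list matches
        return all(isinstance(x, item_type) for x in obj)

class List(metaclass=ListMeta):
    pass

Int = int  # stand-in; a real experiment would define its own Int

print(isinstance([1, 2, 3], List[Int]))       # True
print(isinstance([1, 2, 'booh'], List[Int]))  # False
```

Runtime checking stays opt-in: nothing happens unless somebody actually calls isinstance() with one of these objects.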

Some important issues to bikeshed over:

- Tuples. Sometimes you want to say e.g. "a tuple of integers, don't
mind the length"; other times you want to say e.g. "a tuple of fixed
length containing an int and two strs". Perhaps the former should be
expressed using ImmutableSequence[Int] and the second as Tuple[Int,
Str, Str].

- Unions. We need a way to say "either X or Y". Given that we're
defining our own objects we may actually be able to get away with
writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
still work. It would also be useful to have a shorthand for "either T
or None", written as Optional[T] or Optional(T).

- Whether to design notations to express other constraints. E.g.
"integer in range(10, 100)", or "one of the strings 'r', 'w' or 'a'",
etc. You can go crazy on this.

- Composability (Nick's pet peeve, in that he is against it). I
propose that we reserve plain tuples for this. If an annotation has
the form "x: (P, Q)" then that ought to mean that x must conform to
both P and Q. Even though Nick doesn't like this, I don't think we
should do everything with decorators. Surely, the decorators approach
is good for certain use cases, and should take precedence if it is
used. But e.g. IDEs that use annotations for suggestions and
refactoring should not require everything to be decorated -- that
would just make the code too busy.

- Runtime enforcement. What should we use type annotations for? IDEs,
static checkers (linters) and refactoring tools only need the
annotations when they are parsing the code. While it is tempting to
invent some kind of runtime checking that automatically checks the
actual types against the annotations whenever a function is called, I
think this is rarely useful, and often prohibitively slow. So I'd say
don't focus on this. Instead, explicit type assertions like "assert
isinstance(x, List[Int])" might be used, sparingly, for those cases
where we'd otherwise write a manual assertion with the same meaning
(which is also sparingly!). A decorator to do this might be useful
(especially if there's a separate mechanism for turning actual
checking on or off through some configuration mechanism).

> The first could be fixable by relaxing slightly the constraints of
> isinstance and issubclass, but not so for the second.
>
> [0] which works rather neatly for anonymous unions as `|` is the union
>     of two sets, so the arithmetic would be `type | type -> typeset`,
>     `type | typeset -> typeset` and `typeset | typeset -> typeset`,
>     libraries could offer opaque types/typesets which would be composable
>     without their users having to know whether they're type atoms or
>     typesets

I like this for declaring union types. I don't like it for composing
constraints that are intended for different tools.

-- 
--Guido van Rossum (python.org/~guido)


From ericsnowcurrently at gmail.com  Wed Dec  5 20:42:41 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 5 Dec 2012 12:42:41 -0700
Subject: [Python-ideas] A bind protocol (was Re:  New __reference__ hook)
Message-ID: <CALFfu7BwhpPNAZrHrHszf8jsiDjFoUYgQ06DtF7ymFdhBA58WQ@mail.gmail.com>

On Wed, Dec 5, 2012 at 11:09 AM, Sturla Molden <sturla at molden.no> wrote:
> But apart from that, I think allowing overloading of the binding operator
> "=" might be a good idea. A special method __bind__ could return the object
> to be bound:
>
>    a = b
>
> should then bind the name "a" to the return value of
>
>    b.__bind__()
>
> if b implements __bind__.

Keep in mind that descriptors already give you that for classes.
There are other workarounds if you *really* have to have this
functionality.  You're right that globals (module body namespace) and
locals (function body namespace) do not have that capability[1].
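
For the class-attribute case, the hook Eric mentions already exists today via the descriptor protocol's __set__. A minimal illustration (Logged is an invented example class, not an existing API):

```python
class Logged:
    """Data descriptor that observes every rebinding of its attribute."""
    def __init__(self, name):
        self.name = name
        self.history = []

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self          # accessed on the class: return descriptor
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        # Runs on every `instance.attr = value` -- the binding is hooked.
        self.history.append(value)
        obj.__dict__[self.name] = value

class C:
    x = Logged('x')

c = C()
c.x = 42
c.x = 43
print(c.x)           # 43
print(C.x.history)   # [42, 43]
```

Exactly this kind of interception is what module-level and function-local names lack, which is what a generic bind protocol would have to add.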

The main case I've heard for a generic "bind" protocol is for DRY.
For instance, you create a new object with some name as an argument
and then bind that object to that name in the current running
namespace.  This has been brought up before[2], with the canonical
example of namedtuple (along with arguments on why it's not a big
deal[3]).

I'd expect such an API to look something like this:

object.__bind__(name, namespace)
object.__unbind__(name, namespace, replacement=None)

namespace is the mapping for the locals/object (a.k.a. vars()) where
the name is going to be bound.  When an object is already bound to a
name, __unbind__() would be called first on the current object.  In
that case, replacement would be the object that is replacing the
currently bound one.  At a high level the whole binding operation
would look something like this:

def bind(ns, name, obj):
    if name in ns:
        ns[name].__unbind__(name, ns, obj)
    obj.__bind__(name, ns)
    ns[name] = obj  # or whatever

If you wanted to get fancy, both methods could return a boolean
indicating that the name should *not* be bound/unbound (respectively):

def bind(ns, name, obj):
    if name in ns:
        if not ns[name].__unbind__(name, ns, obj):
            return
    if obj.__bind__(name, ns):
        ns[name] = obj  # or whatever

The bind protocol could also be used in the fallback behavior of
augmented assignment operations.
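
A runnable toy of the sketched protocol, applied to a plain dict namespace; the dunder names and bind() helper mirror the pseudo-code above, and none of this is a real protocol:

```python
class Tracked:
    """Object that records when it is bound to or unbound from a name."""
    def __init__(self, label):
        self.label = label
        self.events = []

    def __bind__(self, name, ns):
        self.events.append(('bind', name))
        return True                     # True: allow the binding

    def __unbind__(self, name, ns, replacement=None):
        self.events.append(('unbind', name))
        return True                     # True: allow the unbinding

def bind(ns, name, obj):
    # The "fancy" variant: either side may veto by returning False.
    if name in ns and hasattr(ns[name], '__unbind__'):
        if not ns[name].__unbind__(name, ns, obj):
            return
    if not hasattr(obj, '__bind__') or obj.__bind__(name, ns):
        ns[name] = obj

ns = {}
a, b = Tracked('a'), Tracked('b')
bind(ns, 'x', a)    # binds a
bind(ns, 'x', b)    # unbinds a, then binds b
print(a.events)     # [('bind', 'x'), ('unbind', 'x')]
print(b.events)     # [('bind', 'x')]
```

The hasattr() guards also hint at the cost problem: every single assignment would pay for at least one attribute lookup, which is the performance concern raised below.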

Ultimately, considering how often things are bound/unbound, I'd worry
that it would be too expensive for any bind API to see the light of
day.

-eric

[1] You *can* use your own module class to get it for "globals", sort
of.  This wouldn't quite work for the globals associated with
functions defined in the module.
[2] http://mail.python.org/pipermail/python-ideas/2011-March/009233.html,
and others.
[3] http://mail.python.org/pipermail/python-ideas/2011-March/009277.html


From benhoyt at gmail.com  Wed Dec  5 20:52:08 2012
From: benhoyt at gmail.com (Ben Hoyt)
Date: Thu, 6 Dec 2012 08:52:08 +1300
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
Message-ID: <CAL9jXCGHw_tEa6HY4ZfLZpwe83-rmQnV3gQROkBApKyXQ3_WaQ@mail.gmail.com>

> - Tuples. Sometimes you want to say e.g. "a tuple of integers, don't
> mind the length"; other times you want to say e.g. "a tuple of fixed
> length containing an int and two strs". Perhaps the former should be
> expressed using ImmutableSequence[Int] and the second as Tuple[Int,
> Str, Str].

Nice, that seems very explicit. ImmutableSequence is long, but clear.
In this specific case, should it be just Sequence, and a mutable one
would be MutableSequence (to be consistent with collections.abc
names)?

> - Unions. We need a way to say "either X or Y". Given that we're
> defining our own objects we may actually be able to get away with
> writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
> still work. It would also be useful to have a shorthand for "either T
> or None", written as Optional[T] or Optional(T).

Definitely useful to have a notation for "either T or None", as it's a
pretty heavily-used pattern. But what about using the same approach,
something like "T | None" or "T | NoneType". Though if you use the
real None rather than experimental_type_annotations.None, is that
confusing? In any case, it seems unnecessary to have a special
Optional(T) notation when you've already got the simple "T1 | T2"
notation.

> - Whether to design notations to express other constraints. E.g.
> "integer in range(10, 100)", or "one of the strings 'r', 'w' or 'a'",
> etc. You can go crazy on this.

Yes, I think this is dangerous territory -- it could get crazy very
fast. Statically typed languages don't have this. Then again, I guess
type annotations have the potential to be *more* powerful in this
regard. Still, it'd have to be an awfully nice and general notation
for it to be useful. Even then, your "def" line complete with
type/constraint annotations may get far too long to be readable...

-Ben


From bruce at leapyear.org  Wed Dec  5 21:01:47 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 5 Dec 2012 12:01:47 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
Message-ID: <CAGu0Anvq+T-S7LjoCg11Evnce5dZRwcojVNB0xgJ+ee1aMv-vg@mail.gmail.com>

On Wed, Dec 5, 2012 at 11:22 AM, Guido van Rossum <guido at python.org> wrote:

> - Unions. We need a way to say "either X or Y". Given that we're
> defining our own objects we may actually be able to get away with
> writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
> still work. It would also be useful to have a shorthand for "either T
> or None", written as Optional[T] or Optional(T).
>

Optional is not the same as "or None" to me:

Dict(a=Int, b=Int | None, c=Optional(Int))


suggests that b is required but might be None while c is not required,
i.e., {'a': 3, 'b': None} is allowed while {'a': 3, 'c': None} is not.

Ditto for Tuples:

Tuple[Int, Str | None, Optional(Int)]

where (3, None) matches as does (3, 'a', 4) but not (3, None, None).

Optionals might be restricted to the end as matching in the middle would be
complicated and possibly error-prone:

Tuple[Int, Optional(Int | None), Int | Str, Int | None]

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/fae12413/attachment.html>

From sturla at molden.no  Wed Dec  5 21:09:49 2012
From: sturla at molden.no (Sturla Molden)
Date: Wed, 5 Dec 2012 21:09:49 +0100
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
	<63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>
Message-ID: <5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>


Den 5. des. 2012 kl. 19:51 skrev Masklinn <masklinn at masklinn.net>:

> 
> Why? z could just be a "lazy value" at this point, basically a manual
> building of thunks, only reifying them when necessary (whenever that
> is). It's not like numpy *has* to create three temporary arrays, just
> that it *does*.
> 

It has to, because it does not know when to flush an expression. This, strangely enough, accounts for most of the speed difference between Python/NumPy and e.g. Fortran 95. A Fortran 95 compiler can compile an array expression as a single loop. NumPy cannot, because the binary operators do not tell it when an expression is "finalized". That is why the numexpr JIT compiler evaluates Python expressions as strings, and needs to include a parser and whatnot. Today, most numerical code is memory bound, not compute bound, as CPUs are immensely faster than RAM. So what keeps numerical/scientific code written in Python slower than C or Fortran today is mostly the creation of temporary array objects (i.e. memory access), not the computations per se. If we could get rid of temporary arrays, Python code could possibly achieve 80% of Fortran 95 speed. For scientists that would mean we don't need to write any more Fortran or C.

But perhaps it is possible to do this with AST magic? I don't know. Nor do I know if __bind__ is the best way to do this. Perhaps not. But I do know that automatically detecting when to "flush a compound expression with (NumPy?) arrays" would be the holy grail for scientific computing with Python. A binary operator x+y would just return a symbolic representation of the expression, but when the full expression needs to be flushed we can e.g. ask OpenCL or LLVM to generate the code on the fly. It would turn numerical computing into something similar to dynamic HTML. And we know how good Python is at generating structured text on the fly.

Sturla





From masklinn at masklinn.net  Wed Dec  5 21:34:43 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 5 Dec 2012 21:34:43 +0100
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
Message-ID: <4F969DC0-2B67-4C35-B0E7-EEEAD992E840@masklinn.net>

On 2012-12-05, at 20:22 , Guido van Rossum wrote:
> 
>> The most bothersome part is that I "feel" "either X or Y" (aka `X | Y`)
>> should be a set of type (and thus the same as {X, Y}[0]) but that doesn't
>> work with `isinstance` or `issubclass`. Likewise, `(a, b, c)` in an
>> annotation feels like it should mean the same as `tuple[a, b, c]` ("a
>> tuple with 3 items of types resp. a, b and c") but that's at odds with
>> the same type-checking functions.
> 
> Note that in Python 3 you can override isinstance, by defining
> __instancecheck__ in the class:
> http://docs.python.org/3/reference/datamodel.html?highlight=__instancecheck__#class.__instancecheck__
> 
> So it shouldn't be a problem to make isinstance(42, Int) work.

My problem there was more about having e.g. Int | Float return a set,
but isinstance not working with a set. But indeed it could return a
TypeSet which would implement __instancecheck__.
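
A sketch of that TypeSet idea: `Int | Float` yields a set-like object that plugs into isinstance() via __instancecheck__ (which Python 3 looks up on the type of the second argument). Everything here is hypothetical, just to show the mechanics.

```python
class TypeSet:
    """A set of type atoms; `|` unions them, isinstance() checks membership."""
    def __init__(self, *types):
        members = set()
        for t in types:
            # Flatten nested TypeSets so unions stay flat
            members |= t.members if isinstance(t, TypeSet) else {t}
        self.members = frozenset(members)

    def __or__(self, other):
        return TypeSet(self, other)
    __ror__ = __or__

    def __instancecheck__(self, obj):
        # Delegate to ordinary isinstance with a tuple of concrete types
        return isinstance(obj, tuple(self.members))

Int = TypeSet(int)
Float = TypeSet(float)

number = Int | Float
print(isinstance(1, number))     # True
print(isinstance(1.5, number))   # True
print(isinstance('x', number))   # False
```

Because `|` flattens, libraries can hand out opaque atoms or sets and users can compose them without caring which they got, as in the footnote above.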

> - Tuples. Sometimes you want to say e.g. "a tuple of integers, don't
> mind the length"; other times you want to say e.g. "a tuple of fixed
> length containing an int and two strs". Perhaps the former should be
> expressed using ImmutableSequence[Int] and the second as Tuple[Int,
> Str, Str].



> - Unions. We need a way to say "either X or Y". Given that we're
> defining our own objects we may actually be able to get away with
> writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
> still work. It would also be useful to have a shorthand for "either T
> or None", written as Optional[T] or Optional(T).

Well if `|` is the "union operator", as Ben notes `T | None` works well,
is clear and is sufficient. Though that's if and only if "Optional[T]"
is equivalent to "T or None" which Bruce seems to disagree with. There's
some history with this pattern:
http://journal.stuffwithstuff.com/2010/08/23/void-null-maybe-and-nothing/
(bottom section, from "Or Some Other Solution")

> - Whether to design notations to express other constraints. E.g.
> "integer in range(10, 100)", or "one of the strings 'r', 'w' or 'a'",
> etc. You can go crazy on this.

Yes this is going in Oleg territory, a sound core is probably a
good starting idea. Although basic enumerations ("one of the strings
'r', 'w' or 'a'") could be rather neat.

> - Composability (Nick's pet peeve, in that he is against it). I
> propose that we reserve plain tuples for this. If an annotation has
> the form "x: (P, Q)" then that ought to mean that x must conform to
> both P and Q. Even though Nick doesn't like this, I don't think we
> should do everything with decorators. Surely, the decorators approach
> is good for certain use cases, and should take precedence if it is
> used. But e.g. IDEs that use annotations for suggestions and
> refactoring should not require everything to be decorated -- that
> would just make the code too busy.
> 
> - Runtime enforcement. What should we use type annotations for? IDEs,
> static checkers (linters) and refactoring tools only need the
> annotations when they are parsing the code.

For IDEs, that's pretty much all the time though, either they're parsing
the code or they're trying to perform static analysis on it, which uses
the annotations.

> While it is tempting to
> invent some kind of runtime checking that automatically checks the
> actual types against the annotations whenever a function is called, I
> think this is rarely useful, and often prohibitively slow.

Could be useful for debug or testing runs though, in the same way
event-based profilers are prohibitively slow and can't be enabled all
the time but are still useful. Plus it might be possible to
enable/disable this mechanism with little to no source modification via
sys.setprofile (I'm not sure what hooks it provides exactly and the
documentation is rather sparse, so I'm not sure if the function object
itself is available to the setprofile callback, looking at
Lib/profile.py it might only get the code object).

From ericsnowcurrently at gmail.com  Wed Dec  5 21:40:44 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 5 Dec 2012 13:40:44 -0700
Subject: [Python-ideas] A bind protocol (was Re: New __reference__ hook)
In-Reply-To: <CALFfu7BwhpPNAZrHrHszf8jsiDjFoUYgQ06DtF7ymFdhBA58WQ@mail.gmail.com>
References: <CALFfu7BwhpPNAZrHrHszf8jsiDjFoUYgQ06DtF7ymFdhBA58WQ@mail.gmail.com>
Message-ID: <CALFfu7BMJGjwvXTVGxtYQkowdLPMkdCY-P9dMDuPKhpvqYKLTA@mail.gmail.com>

(from the "Re: New __reference__ hook" thread)

On Wed, Dec 5, 2012 at 11:54 AM, Bruce Leban <bruce at leapyear.org> wrote:
> There is another way to write expressions that don't get evaluated:
>
> lambda: a*x + b*y + c
>
>
> So you could write this as z.bind(lambda: rhs) or if this is important
> enough there could be a new bind operator:
>
> lhs @= rhs
>
>
> which is equivalent to
>
> lhs.__bind__(lambda: rhs)

The lazy/lambda part aside, such an operator would somewhat help with
performance concerns and allow the "binder" to control when the
"bindee" gets notified.

-eric


From masklinn at masklinn.net  Wed Dec  5 21:45:51 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 5 Dec 2012 21:45:51 +0100
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
	<63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>
	<5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
Message-ID: <D6A5A5DE-6F9B-40EB-A783-64F5514B88E6@masklinn.net>

On 2012-12-05, at 21:09 , Sturla Molden wrote:

> 
> Den 5. des. 2012 kl. 19:51 skrev Masklinn <masklinn at masklinn.net>:
> 
>> 
>> Why? z could just be a "lazy value" at this point, basically a manual
>> building of thunks, only reifying them when necessary (whenever that
>> is). It's not like numpy *has* to create three temporary arrays, just
>> that it *does*.
>> 
> 
> It has to, because it does not know when to flush an expression.

That tends to be the hard thing to decide, but it should be possible to
handle most cases, e.g. evaluate the thunks when elements are requested
(similar to generators, but force the whole thunk at once), when printing,
etc?

Or use the numexpr approach and perform the reification explicitly.

> But perhaps it is possible to do this with AST magic? I don't know.

I'm not sure there's even a need for AST magic (although you could also
play with that by writing operations within lambdas I guess, I've never
done much AST analysis/rewriting), it could simply use an approach
similar to SQLAlchemy's ClauseElement: when applying an operation to
e.g. an array, rather than perform it just return a representation of
the operation itself (effectively rebuild some sort of AST), new
operations on *that* would simply build the tree further (composing the
thunk), and an explicit evaluation call or implicit evaluation due to
e.g. accessing stuff would compile the "potential" operation and perform
the actual computations.
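
A toy version of that approach, with scalars standing in for arrays; the names (Lazy, evaluate) are invented, and a real array library would emit a fused loop / OpenCL / LLVM kernel at the reification point instead of this naive recursive eval.

```python
import operator

class Lazy:
    """Operators build an expression tree; evaluate() reifies it."""
    def __init__(self, value=None, op=None, args=()):
        self.value, self.op, self.args = value, op, args

    def __add__(self, other):
        return Lazy(op=operator.add, args=(self, _wrap(other)))
    __radd__ = __add__

    def __mul__(self, other):
        return Lazy(op=operator.mul, args=(self, _wrap(other)))
    __rmul__ = __mul__

    def evaluate(self):
        # Explicit reification: walk the tree once, compute once.
        if self.op is None:
            return self.value
        return self.op(*(a.evaluate() for a in self.args))

def _wrap(x):
    return x if isinstance(x, Lazy) else Lazy(value=x)

a, x, b, y, c = (Lazy(value=v) for v in (2, 3, 4, 5, 6))
z = a*x + b*y + c      # no arithmetic happens here, only tree building
print(z.evaluate())    # 2*3 + 4*5 + 6 = 32
```

An implicit reification (e.g. in __getitem__ or __str__) could be layered on top, but the explicit evaluate() call is exactly the numexpr-style approach mentioned above.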

> A binary operator x+y would just return a symbolic representation of the
> expression, but when the full expression needs to be flushed we can e.g.
> ask OpenCL or LLVM to generate the code on the fly.

Indeed.



From jsbueno at python.org.br  Wed Dec  5 21:48:06 2012
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Wed, 5 Dec 2012 18:48:06 -0200
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
	<63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>
	<5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
Message-ID: <CAH0mxTRTQkSxJLHZCmAu3J1f45bGg7YaStA0Opxss-qmXsv1nw@mail.gmail.com>

On 5 December 2012 18:09, Sturla Molden <sturla at molden.no> wrote:
>
> Den 5. des. 2012 kl. 19:51 skrev Masklinn <masklinn at masklinn.net>:
>
>>
>> Why? z could just be a "lazy value" at this point, basically a manual
>> building of thunks, only reifying them when necessary (whenever that
>> is). It's not like numpy *has* to create three temporary arrays, just
>> that it *does*.
>>
>
> It has to, because it does not know when to flush an expression. This, strangely enough, accounts for most of the speed difference between Python/NumPy and e.g. Fortran 95. A Fortran 95 compiler can compile an array expression as a single loop. NumPy cannot, because the binary operators do not tell when an expression is "finalized". That is why the numexpr JIT compiler evaluates Python expressions as strings, and needs to include a parser and whatnot. Today, most numerical code is memory bound, not compute bound, as CPUs are immensely faster than RAM. So what keeps numerical/scientific code written in Python slower than C or Fortran today is mostly the creation of temporary array objects, i.e. memory access, not the computations per se. If we could get rid of temporary arrays, Python code could possibly achieve 80 % of Fortran 95 speed. For scientists that would mean we don't need to write any more Fortran or C.
>
> But perhaps it is possible to do this with AST magic? I don't know. Nor do I know if __bind__ is the best way to do this. Perhaps not. But I do know that automatically detecting when to "flush a compound expression with (NumPy?) arrays" would be the holy grail for scientific computing with Python. A binary operator x+y would just return a symbolic representation of the expression, but when the full expression needs to be flushed we can e.g. ask OpenCL or LLVM to generate the code on the fly. It would turn numerical computing into something similar to dynamic HTML. And we know how good Python is at generating structured text on the fly.

Today that can be achieved by crafting a class that overrides all the
operators to perform literal transforms, plus a "flush" or "calculate"
method. Sympy does something like that, and it would not be hard to
write a module that does the same with numpy arrays. In this
particular use case, we'd have the full benefit of "explicit is better
than implicit".
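Such a class could be sketched as below: operators only compose closures, and the explicit calculate() performs one fused pass with no temporaries. Plain Python lists stand in for numpy arrays here, and the names (Deferred, calculate) are purely illustrative; a real version would wrap ndarrays and could hand the recorded expression to numexpr.

```python
class Deferred:
    """Array-like object whose operators defer work until calculate()."""
    def __init__(self, data=None, func=None):
        self.data = data
        # Leaf nodes read their own data; composite nodes get a func.
        self.func = func or (lambda i: self.data[i])

    def _lift(self, other, op):
        f = self.func
        g = other.func if isinstance(other, Deferred) else (lambda i, v=other: v)
        # Keep a reference to some leaf data purely to know the length.
        return Deferred(data=self.data, func=lambda i: op(f(i), g(i)))

    def __add__(self, other):
        return self._lift(other, lambda a, b: a + b)

    def __mul__(self, other):
        return self._lift(other, lambda a, b: a * b)

    def calculate(self):
        # Explicit flush: one fused loop, no temporary arrays.
        return [self.func(i) for i in range(len(self.data))]

a, x = Deferred([1, 2]), Deferred([3, 4])
b, y = Deferred([5, 5]), Deferred([0, 1])
z = a * x + b * y + 10      # builds closures only; nothing computed yet
print(z.calculate())        # prints [13, 23]
```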

  js
 -><-

>
> Sturla
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From ubershmekel at gmail.com  Wed Dec  5 21:50:41 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Wed, 5 Dec 2012 15:50:41 -0500
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
	<63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>
	<5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
Message-ID: <CANSw7Ky3q_m7rh4=2ngseWyAFz2jzb-SeaNYobEN-CU80UkOkQ@mail.gmail.com>

On Wed, Dec 5, 2012 at 3:09 PM, Sturla Molden <sturla at molden.no> wrote:

>
> But perhaps it is possible to do this with AST magic? I don't know. Nor do
> I know if __bind__ is the best way to do this. Perhaps not. But I do know
> that automatically detecting when to "flush a compound expression with
> (NumPy?) arrays" would be the holy grail for scientific computing with
> Python. A binary operator x+y would just return a symbolic representation
> of the expression, but when the full expression needs to be flushed we can
> e.g. ask OpenCL or LLVM to generate the code on the fly. It would turn
> numerical computing into something similar to dynamic HTML. And we know how
> good Python is at generating structured text on the fly.
>
> Sturla
>
>
Not all pixel fiddling can be solved using array calculus, so there will
always be C involved at some point.

Still, this could be a great advancement. Though I don't think bind time is
the right time to evaluate anything, as it would drive performance-minded
programmers to "one-line" everything. Using intermediate variable names to
explain an algorithm is crucial for readability, in my experience.

Creating intermediate objects that are only evaluated when the programmer
explicitly demands it is the way to go. E.g. "evalf" in sympy:
http://scipy-lectures.github.com/advanced/sympy.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/2bde017e/attachment.html>

From amauryfa at gmail.com  Wed Dec  5 21:59:23 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 5 Dec 2012 21:59:23 +0100
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <CAH0mxTRTQkSxJLHZCmAu3J1f45bGg7YaStA0Opxss-qmXsv1nw@mail.gmail.com>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
	<63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>
	<5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
	<CAH0mxTRTQkSxJLHZCmAu3J1f45bGg7YaStA0Opxss-qmXsv1nw@mail.gmail.com>
Message-ID: <CAGmFidb5QcoKowjBwn_Nvnv3wTvbBTyob2d1mqTK+f-o1UNz8g@mail.gmail.com>

On 5 December 2012 18:09, Sturla Molden <sturla at molden.no> wrote:
> >
> > Den 5. des. 2012 kl. 19:51 skrev Masklinn <masklinn at masklinn.net>:
> >
> >>
> >> Why? z could just be a "lazy value" at this point, basically a manual
> >> building of thunks, only reifying them when necessary (whenever that
> >> is). It's not like numpy *has* to create three temporary arrays, just
> >> that it *does*.
> >>
> >
> > It has to, because it does not know when to flush an expression. This
> strangely enough, accounts for most of the speed difference between
> Python/NumPy and e.g. Fortran 95. A Fortran 95 compiler can compile an
> array expression as a single loop. NumPy cannot, because the binary
> operators do not tell when an expression is "finalized". That is why the
> numexpr JIT compiler evaluates Python expressions as strings, and needs to
> include a parser and whatnot. Today, most numerical code is memory bound,
> not compute bound, as CPUs are immensely faster than RAM. So what keeps
> numerical/scientific code written in Python slower than C or Fortran today
> is mostly the creation of temporary array objects, i.e. memory access, not
> the computations per se. If we could get rid of temporary arrays, Python
> code could possibly achieve 80 % of Fortran 95 speed. For scientists that
> would mean we don't need to write any more Fortran or C.
> >
> > But perhaps it is possible to do this with AST magic? I don't know. Nor
> do I know if __bind__ is the best way to do this. Perhaps not. But I do
> know that automatically detecting when to "flush a compound expression with
> (NumPy?) arrays" would be the holy grail for scientific computing with
> Python. A binary operator x+y would just return a symbolic representation
> of the expression, but when the full expression needs to be flushed we can
> e.g. ask OpenCL or LLVM to generate the code on the fly. It would turn
> numerical computing into something similar to dynamic HTML. And we know how
> good Python is at generating structured text on the fly.
>
>
FYI, the numpy module shipped with PyPy does exactly this: the operations
are recorded in some AST structure, which is evaluated only when the first
item of the array is read.
This is completely transparent to the user, or to other parts of the
interpreter.

PyPy uses JIT techniques to generate machine code specialized for the
particular AST, and is typically 2x to 5x faster than NumPy, probably
because a lot of allocations/copies are avoided.

-- 
Amaury Forgeot d'Arc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/96229ab7/attachment.html>

From thomas at kluyver.me.uk  Wed Dec  5 22:33:07 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Wed, 5 Dec 2012 21:33:07 +0000
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
	<63C70C75-1B39-4565-9AF0-1199DA6370C8@masklinn.net>
	<5539A564-7FD9-41E5-9BA5-14BB829A9CE7@molden.no>
Message-ID: <CAOvn4qh05-hwPLx+dqQTbFJ73OoUaKH5RweW6o9M2FEtv_Athg@mail.gmail.com>

On 5 December 2012 20:09, Sturla Molden <sturla at molden.no> wrote:

> But perhaps it is possible to do this with AST magic?


As far as I understand it, numba [1] does this kind of AST magic (among
other things).

https://github.com/numba/numba

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/d1587c39/attachment.html>

From tjreedy at udel.edu  Wed Dec  5 22:59:15 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 05 Dec 2012 16:59:15 -0500
Subject: [Python-ideas] New __reference__ hook
In-Reply-To: <50BF8DEB.9040101@molden.no>
References: <lmmobamdjivxejhjhspo@xsga>
	<CAEBZo3Owwmwa=0S2yfap7vpHpyLcWj=is0Gug+NdSZ9NVP6Zrw@mail.gmail.com>
	<1354723588.24521.140661162241293.22C960DA@webmail.messagingengine.com>
	<CAA0H+QR+yW0ii03hskEuUFhbzZgBg-K=GMUYOOM2W4EqRJk1AQ@mail.gmail.com>
	<50BF8DEB.9040101@molden.no>
Message-ID: <k9og3o$l62$1@ger.gmane.org>

On 12/5/2012 1:09 PM, Sturla Molden wrote:
> On 05.12.2012 18:05, Jasper St. Pierre wrote:
>> And? What's wrong with an __iadd__ that's exactly the same as Mike's
>> __add__?
>
> I think it was a Java-confusion. He thought numbers were copied on
> assignment. But there is no difference between value types and object
> types in Python. Ints and floats are immutable, but they are not value
> types as in Java.
>
> But apart from that, I think allowing overloading of the binding
> operator "=" might be a good idea.

An assignment statement mutates the current local namespace. The 
'current local namespace' is a hidden input to the operation performed by 
all statements, but it need not be a Python object. The key symbol '=' 
is not an operator and 'a = b' is not an expression.

> A special method __bind__ could
> return the object to be bound:
>
>     a = b
>
> should then bind the name "a" to the return value of
>
>     b.__bind__()

If one wants to perform 'a = f(b)' or 'a = b.meth()' instead of 'a = b', 
then one should just explicitly say so.

> if b implements __bind__.
>
> Sure, it could be used to implement copy on assignment. But it would
> also do other things like allowing lazy evaluation of an expression.
>
> NumPy code like
>
>     z = a*x + b*y + c
>
> could avoid creating three temporary arrays if there was a __bind__
> function called on "=".

No, z = (a*x + b*y + c).__bind__(), which is how you defined __bind__ as 
working, still requires that the expression be evaluated to an object.

The definition of Python requires computation of a*x, b*y, (a*x + b*y), 
and finally (a*x + b*y) + c in that order. Either '*' or '+' may have 
side-effects.

> This is a big thing, cf. the difference between
> NumPy and numexpr:
>
>     z = numexpr.evaluate("""a*x + b*y + c""")

> The reason numerical expressions must be written as strings to be
> efficient in Python is because there is no __bind__ function.

No, it is because the semantics of Python require inefficiency that can 
only be removed by a special parser-compiler with additional knowledge 
of the relevant object class, method, and instance properties. Such 
knowledge allows code to be re-written without changing the effect. For 
Fortran arrays, the needed information includes the number and length of 
each dimension. These are either declared or parameterized and passed as 
arguments.

https://code.google.com/p/numexpr/wiki/Overview
says that numexpr.evaluate(ex) first calls compile(ex), but does not say 
whether it has compile() produce CPython bytecode or only an AST. In 
either case, it converts arraywise operations to blockwise operations 
inside loops run on a custom C-coded virtual machine. They imply that 
this is not as good as elementwise operations compiled to native machine 
code. In any case, it knows that numpy array operations are side-effect 
free and must use the runtime dimension and size info.

-- 
Terry Jan Reedy



From guido at python.org  Thu Dec  6 00:01:16 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Dec 2012 15:01:16 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAGu0Anvq+T-S7LjoCg11Evnce5dZRwcojVNB0xgJ+ee1aMv-vg@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
	<CAGu0Anvq+T-S7LjoCg11Evnce5dZRwcojVNB0xgJ+ee1aMv-vg@mail.gmail.com>
Message-ID: <CAP7+vJJL1ywwDJT-SpsNbE5O=0bcmQtJbV8SFX4iMB9sdYCAGA@mail.gmail.com>

On Wed, Dec 5, 2012 at 12:01 PM, Bruce Leban <bruce at leapyear.org> wrote:
>
>
> On Wed, Dec 5, 2012 at 11:22 AM, Guido van Rossum <guido at python.org> wrote:
>>
>> - Unions. We need a way to say "either X or Y". Given that we're
>> defining our own objects we may actually be able to get away with
>> writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
>> still work. It would also be useful to have a shorthand for "either T
>> or None", written as Optional[T] or Optional(T).
>
>
> Optional is not the same as "or None" to me:
>
> Dict(a=Int, b=Int | None, c=Optional(Int))
>
>
> suggests that b is required but might be None while c is not required, i.e.,
> {'a': 3, b: None} is allowed while {'a': 3, c: None} is not.
>
> Ditto for Tuples:
>
> Tuple[Int, Str | None, Optional(Int)]
>
> where (3, None) matches as does (3, 'a', 4) but not (3, None, None).
>
> Optionals might be restricted to the end as matching in the middle would be
> complicated and possibly error-prone:
>
> Tuple[Int, Optional(Int | None), Int | Str, Int | None]

Those are not the semantics I had in mind for Optional.

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Thu Dec  6 00:06:01 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Dec 2012 15:06:01 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <4F969DC0-2B67-4C35-B0E7-EEEAD992E840@masklinn.net>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
	<4F969DC0-2B67-4C35-B0E7-EEEAD992E840@masklinn.net>
Message-ID: <CAP7+vJ+Qusnw=oqKoEvX8dVCR2rY67hC=UC36bn5Sm9hE9cJWg@mail.gmail.com>

On Wed, Dec 5, 2012 at 12:34 PM, Masklinn <masklinn at masklinn.net> wrote:
> On 2012-12-05, at 20:22 , Guido van Rossum wrote:
>>
>>> The most bothersome part is that I "feel" "either X or Y" (aka `X | Y`)
>>> should be a set of type (and thus the same as {X, Y}[0]) but that doesn't
>>> work with `isinstance` or `issubclass`. Likewise, `(a, b, c)` in an
>>> annotation feels like it should mean the same as `tuple[a, b, c]` ("a
>>> tuple with 3 items of types resp. a, b and c") but that's at odds with
>>> the same type-checking functions.
>>
>> Note that in Python 3 you can override isinstance, by defining
>> __instancecheck__ in the class:
>> http://docs.python.org/3/reference/datamodel.html?highlight=__instancecheck__#class.__instancecheck__
>>
>> So it shouldn't be a problem to make isinstance(42, Int) work.
>
> My problem there was more about having e.g. Int | Float return a set,
> but isinstance not working with a set. But indeed it could return a
> TypeSet which would implement __instancecheck__.

Right, that's what I meant.
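A rough sketch of that idea, for concreteness: a set-like object returned by the union operator that still cooperates with isinstance() via __instancecheck__ (the PEP 3119 hook). The name TypeSet follows the discussion above; everything else is illustrative.

```python
class TypeSet:
    """A union of types that isinstance() accepts via __instancecheck__."""
    def __init__(self, *types):
        self.types = frozenset(types)

    def __or__(self, other):
        # Int | Float style composition; accept plain classes or TypeSets.
        others = other.types if isinstance(other, TypeSet) else {other}
        return TypeSet(*(self.types | set(others)))

    def __instancecheck__(self, obj):
        # Called by isinstance(obj, <TypeSet instance>).
        return isinstance(obj, tuple(self.types))

IntOrFloat = TypeSet(int) | float
print(isinstance(3, IntOrFloat))      # prints True
print(isinstance("x", IntOrFloat))    # prints False
```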

>> - Tuples. Sometimes you want to say e.g. "a tuple of integers, don't
>> mind the length"; other times you want to say e.g. "a tuple of fixed
>> length containing an int and two strs". Perhaps the former should be
>> expressed using ImmutableSequence[Int] and the second as Tuple[Int,
>> Str, Str].
>
>
>
>> - Unions. We need a way to say "either X or Y". Given that we're
>> defining our own objects we may actually be able to get away with
>> writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
>> still work. It would also be useful to have a shorthand for "either T
>> or None", written as Optional[T] or Optional(T).
>
> Well if `|` is the "union operator", as Ben notes `T | None` works well,
> is clear and is sufficient. Though that's if and only if "Optional[T]"
> is equivalent to "T or None" which Bruce seems to disagree with. There's
> some history with this pattern:
> http://journal.stuffwithstuff.com/2010/08/23/void-null-maybe-and-nothing/
> (bottom section, from "Or Some Other Solution")

Actually, I find "T|None" somewhat impure, since None is not a type
but a value. If you were allow this, what about "T|False"? And then
what about "True|None"? (There's no way to make the latter work!) And
I think "T|NoneType" is obscure; hence my proposal of Optional(T).
(Not Optional[T], since Optional is not a type.)

>> - Whether to design notations to express other constraints. E.g.
>> "integer in range(10, 100)", or "one of the strings 'r', 'w' or 'a'",
>> etc. You can go crazy on this.
>
> Yes this is going in Oleg territory, a sound core is probably a
> good starting idea. Although basic enumerations ("one of the strings
> 'r', 'w' or 'a'") could be rather neat.
>
>> - Composability (Nick's pet peeve, in that he is against it). I
>> propose that we reserve plain tuples for this. If an annotation has
>> the form "x: (P, Q)" then that ought to mean that x must conform to
>> both P and Q. Even though Nick doesn't like this, I don't think we
>> should do everything with decorators. Surly, the decorators approach
>> is good for certain use cases, and should take precedence if it is
>> used. But e.g. IDEs that use annotations for suggestions and
>> refactoring should not require everything to be decorated -- that
>> would just make the code too busy.
>>
>> - Runtime enforcement. What should we use type annotations for? IDEs,
>> static checkers (linters) and refactoring tools only need the
>> annotations when they are parsing the code.
>
> For IDEs, that's pretty much all the time though, either they're parsing
> the code or they're trying to perform static analysis on it, which uses
> the annotations.

Yeah, they're parsing it, but they're not executing it.

>> While it is tempting to
>> invent some kind of runtime checking that automatically checks the
>> actual types against the annotations whenever a function is called, I
>> think this is rarely useful, and often prohibitively slow.
>
> Could be useful for debug or testing runs though, in the same way
> event-based profilers are prohibitively slow and can't be enabled all
> the time but are still useful. Plus it might be possible to
> enable/disable this mechanism with little to no source modification via
> sys.setprofile (I'm not sure what hooks it provides exactly and the
> documentation is rather sparse, so I'm not sure if the function object
> itself is available to the setprofile callback, looking at
> Lib/profile.py it might only get the code object).

Hence my idea of using a decorator to enable this on specific functions.

-- 
--Guido van Rossum (python.org/~guido)


From bruce at leapyear.org  Thu Dec  6 01:13:51 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 5 Dec 2012 16:13:51 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJJL1ywwDJT-SpsNbE5O=0bcmQtJbV8SFX4iMB9sdYCAGA@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
	<CAGu0Anvq+T-S7LjoCg11Evnce5dZRwcojVNB0xgJ+ee1aMv-vg@mail.gmail.com>
	<CAP7+vJJL1ywwDJT-SpsNbE5O=0bcmQtJbV8SFX4iMB9sdYCAGA@mail.gmail.com>
Message-ID: <CAGu0AnuLi+=wVArPCuEevEoD-HYBre3NVKC=HKn5vPEtc4yhEw@mail.gmail.com>

On Wed, Dec 5, 2012 at 3:01 PM, Guido van Rossum <guido at python.org> wrote:

>
> Those are not the semantics I had in mind for Optional.


I know that. My point was that the standard meaning of the word optional is
that something may or may not be given (or whatever the applicable verb
is). That's quite different from saying it must be provided but may be
None. Since you invited a bit of bikeshedding, I felt it was appropriate to
point that out and then I got distracted by discussing the alternative that
you weren't talking about. Sorry that was confusing.

In C#, this is called Nullable and you can write Nullable<String> to
indicate the type (String or null type). The shorthand for that is String?.

If you want a shorthand to specify that None is allowed, I'd suggest ~Str.

--- Bruce

P.S. Optional[T] is not literally a shorthand for T | None as the former is
11 characters and the latter is 10 characters even if we include and count
the spaces. :-)

P.P.S. I don't think Str | None rather than Str | NoneType is confusing.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121205/117ce778/attachment.html>

From ncoghlan at gmail.com  Thu Dec  6 06:27:21 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Dec 2012 15:27:21 +1000
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
Message-ID: <CADiSq7fdm5oGPE8dx2g7WE0GzL3N1Q9hfWr0JtaS7uD1MG9hAA@mail.gmail.com>

On Thu, Dec 6, 2012 at 5:22 AM, Guido van Rossum <guido at python.org> wrote:

> - Composability (Nick's pet peeve, in that he is against it). I
> propose that we reserve plain tuples for this. If an annotation has
> the form "x: (P, Q)" then that ought to mean that x must conform to
> both P and Q. Even though Nick doesn't like this, I don't think we
> should do everything with decorators. Surely, the decorators approach
> is good for certain use cases, and should take precedence if it is
> used. But e.g. IDEs that use annotations for suggestions and
> refactoring should not require everything to be decorated -- that
> would just make the code too busy.
>

I'm not against using composition within a particular set of annotation
semantics, I'm against developing a convention for arbitrary composition of
annotations with *different* semantics.

Instead, I'm advocating for the following guidelines to avoid treading on
each other's toes when experimenting with annotations and to leave scope for
us to define standard annotation semantics at a future date:

1. Always use a decorator that expresses the annotation semantics in use
(e.g. tab completion, type descriptions, parameter documentation)
2. Always *move* the annotations out to purpose-specific storage as part of
the decorator (don't leave them in the annotations storage)
3. When analysing a function later, use only the purpose-specific
attribute(s), not the raw annotations storage
4. To support composition with other sets of annotation semantics, always
provide an alternate API that accepts the per-parameter details directly
(e.g. by name or index) rather than relying solely on the annotations

The reason for this is so that if, at some future point in the time,
python-dev agrees to bless some particular set of semantics as *the*
meaning of function annotations (such as the type hinting system being
discussed), then that won't break anything. Otherwise, if people believe
that it's OK for them to simply assume that the contents of the annotations
mean whatever they mean for their particular project, then it *will* cause
problems further down the road as annotations written for one set of
semantics (e.g. tab completion, parameter documentation) get interpreted by
a processor expecting different semantics (e.g. type hinting).

Here's how your example experiment would look under such a scheme:

    from experimental_type_annotations import type_hints, Int, Str, Float

    # After type_hints runs, foo.__annotations__ would be empty, and the type
    # hinting data would instead be stored in (e.g.) a foo._type_hints attribute.
    @type_hints
    def foo(a: Int, b: Str) -> Float:
        <blah>

This is then completely clear and unambiguous:
- readers can see clearly that these annotations are intended as type hints
- the type hinting processor can see that there *is* type hinting
information available, due to the presence of a _type_hints attribute
- other automated processors see that there are no "default" annotations
(which is good, since there is currently no such thing as "default"
annotation semantics)

Furthermore, (as noted elsewhere in the thread) an alternate API can then
easily be provided that supports composition with other annotations:

    @type_hints(Int, Str, _return=Float)
    def foo(a, b):
        <blah>
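For concreteness, a decorator following these guidelines might be sketched as below. It supports both the bare form (harvesting annotations and moving them out, per guideline 2) and the argument form (never reading annotations at all, per guideline 4). The names type_hints and _type_hints follow the example above but are hypothetical; builtin types stand in for Int/Str/Float.

```python
import inspect

def type_hints(*args, _return=None, **kwargs):
    def apply(func, hints):
        func._type_hints = hints        # purpose-specific storage (guideline 2)
        func.__annotations__ = {}       # *move* the annotations, don't copy
        return func

    if len(args) == 1 and inspect.isfunction(args[0]) and not kwargs and _return is None:
        # Bare form: @type_hints -- harvest the function's own annotations.
        func = args[0]
        return apply(func, dict(func.__annotations__))

    # Argument form: @type_hints(int, str, _return=float) -- composes with
    # other annotation users because it never touches __annotations__ input.
    def decorator(func):
        params = list(inspect.signature(func).parameters)
        hints = dict(zip(params, args))
        hints.update(kwargs)
        if _return is not None:
            hints["return"] = _return
        return apply(func, hints)
    return decorator

@type_hints(int, str, _return=float)
def foo(a, b):
    return float(a)

print(foo._type_hints)      # prints {'a': <class 'int'>, 'b': <class 'str'>, 'return': <class 'float'>}
```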

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121206/f60781b4/attachment.html>

From tjreedy at udel.edu  Thu Dec  6 06:27:33 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 06 Dec 2012 00:27:33 -0500
Subject: [Python-ideas] A bind protocol (was Re: New __reference__ hook)
In-Reply-To: <CALFfu7BMJGjwvXTVGxtYQkowdLPMkdCY-P9dMDuPKhpvqYKLTA@mail.gmail.com>
References: <CALFfu7BwhpPNAZrHrHszf8jsiDjFoUYgQ06DtF7ymFdhBA58WQ@mail.gmail.com>
	<CALFfu7BMJGjwvXTVGxtYQkowdLPMkdCY-P9dMDuPKhpvqYKLTA@mail.gmail.com>
Message-ID: <k9pacc$gsf$1@ger.gmane.org>

On 12/5/2012 3:40 PM, Eric Snow wrote:
> (from the "Re: New __reference__ hook" thread)
>
> On Wed, Dec 5, 2012 at 11:54 AM, Bruce Leban <bruce at leapyear.org> wrote:
>> There is another way to write expressions that don't get evaluated:
>>
>> lambda: a*x + b*y + c
>>
>>
>> So you could write this as z.bind(lambda: rhs) or if this is important
>> enough there could be a new bind operator:
>>
>> lhs @= rhs
>>
>>
>> which is equivalent to
>>
>> lhs.__bind__(lambda: rhs)

This makes no sense to me. The targets of bind statements are not Python 
objects and do not have methods. They may be 'slots' in a Python object 
or may be turned into Python objects (strings), but within functions 
they are not. In CPython, function local names are turned into C ints or 
uints.

> The lazy/lambda part aside, such an operator would somewhat help with
> performance concerns and allow the "binder" to control when the
> "bindee" gets notified.

So this does not make much sense either.

-- 
Terry Jan Reedy



From guido at python.org  Thu Dec  6 06:54:25 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Dec 2012 21:54:25 -0800
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CADiSq7fdm5oGPE8dx2g7WE0GzL3N1Q9hfWr0JtaS7uD1MG9hAA@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
	<CADiSq7fdm5oGPE8dx2g7WE0GzL3N1Q9hfWr0JtaS7uD1MG9hAA@mail.gmail.com>
Message-ID: <CAP7+vJKpMMyHRCRL4JBp7_a4syoxwdfVLhT=P7stDTt8EKnjDw@mail.gmail.com>

Hi Nick,

I understand your position completely (and I did before). I just disagree. :-)

I think that requiring the experiment I am proposing to use a
decorator on each function that uses it (rather than just an import at
the top of the module) will cause too much friction, and the
experiment won't get off the ground. That's why I am proposing a
universal composition convention:

When an annotation for a particular argument is a tuple, then any
framework or decorator that tries to assign meanings to annotations
must search the items of the tuple for one that it can understand. For
the experimental type annotation system I am proposing this should be
simple enough -- the type annotation system can require that the
things it cares about must all be subclasses of a specific base class
(let's call it TypeConstraint). If the annotation is not a tuple, it
should be interpreted as a singleton tuple.
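The search a processor would perform under this convention could look something like the sketch below: scan each annotation (treating a non-tuple as a singleton tuple) and claim only the items that subclass your own marker base class. TypeConstraint matches the name proposed above; everything else is illustrative.

```python
class TypeConstraint:
    """Marker base class; a processor claims only subclasses of this."""

class IntC(TypeConstraint):
    pass

def claimed_constraints(func):
    """Return {param: [constraints we understand]} per the tuple convention."""
    result = {}
    for name, ann in func.__annotations__.items():
        items = ann if isinstance(ann, tuple) else (ann,)   # singleton rule
        ours = [a for a in items
                if isinstance(a, type) and issubclass(a, TypeConstraint)]
        if ours:
            result[name] = ours     # unclaimed items are silently left alone
    return result

# The string annotation belongs to some other framework and is ignored here.
def foo(a: (IntC, "docs: the first operand"), b: IntC):
    pass

print(claimed_constraints(foo))     # prints {'a': [<class '__main__.IntC'>], 'b': [<class '__main__.IntC'>]}
```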

Yes, it is possible that a mistake leaves an annotation unclaimed. But
that's no worse than currently, where all annotations are ignored. And
for TypeConstraint there is no runtime behavior anyway (unless you
*do* add a decorator) -- its annotations are there for other tools to
parse and interpret. It's like pylint directives -- if you
accidentally misspell it 'pylnt' you get no error (but you may still
notice that something's fishy, because when pylint runs it doesn't
suppress the thing you tried to suppress :-).

--Guido

On Wed, Dec 5, 2012 at 9:27 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Thu, Dec 6, 2012 at 5:22 AM, Guido van Rossum <guido at python.org> wrote:
>>
>> - Composability (Nick's pet peeve, in that he is against it). I
>> propose that we reserve plain tuples for this. If an annotation has
>> the form "x: (P, Q)" then that ought to mean that x must conform to
>> both P and Q. Even though Nick doesn't like this, I don't think we
>> should do everything with decorators. Surely, the decorators approach
>> is good for certain use cases, and should take precedence if it is
>> used. But e.g. IDEs that use annotations for suggestions and
>> refactoring should not require everything to be decorated -- that
>> would just make the code too busy.
>
>
> I'm not against using composition within a particular set of annotation
> semantics, I'm against developing a convention for arbitrary composition of
> annotations with *different* semantics.
>
> Instead, I'm advocating for the following guidelines to avoid treading on
> each other's toes when experimenting with annotations and to leave scope for
> us to define standard annotation semantics at a future date:
>
> 1. Always use a decorator that expresses the annotation semantics in use
> (e.g. tab completion, type descriptions, parameter documentation)
> 2. Always *move* the annotations out to purpose-specific storage as part of
> the decorator (don't leave them in the annotations storage)
> 3. When analysing a function later, use only the purpose-specific
> attribute(s), not the raw annotations storage
> 4. To support composition with other sets of annotation semantics, always
> provide an alternate API that accepts the per-parameter details directly
> (e.g. by name or index) rather than relying solely on the annotations
>
> The reason for this is so that if, at some future point in time,
> python-dev agrees to bless some particular set of semantics as *the* meaning
> of function annotations (such as the type hinting system being discussed),
> then that won't break anything. Otherwise, if people believe that it's OK
> for them to simply assume that the contents of the annotations mean whatever
> they mean for their particular project, then it *will* cause problems
> further down the road as annotations written for one set of semantics (e.g.
> tab completion, parameter documentation) get interpreted by a processor
> expecting different semantics (e.g. type hinting).
>
> Here's how your example experiment would look under such a scheme:
>
>     from experimental_type_annotations import type_hints, Int, Str, Float
>
>     # After type_hints runs, foo.__annotations__ would be empty, and
>     # the type hinting data would instead be stored in (e.g.) a
>     # foo._type_hints attribute.
>     @type_hints
>     def foo(a: Int, b: Str) -> Float:
>         <blah>
>
> This is then completely clear and unambiguous:
> - readers can see clearly that these annotations are intended as type hints
> - the type hinting processor can see that there *is* type hinting
> information available, due to the presence of a _type_hints attribute
> - other automated processors see that there are no "default" annotations
> (which is good, since there is currently no such thing as "default"
> annotation semantics)
>
> Furthermore, (as noted elsewhere in the thread) an alternate API can then
> easily be provided that supports composition with other annotations:
>
>     @type_hints(Int, Str, _return=Float)
>     def foo(a, b):
>         <blah>
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia



-- 
--Guido van Rossum (python.org/~guido)
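
A minimal sketch of the @type_hints decorator pattern Nick describes above
(purely illustrative; the attribute name _type_hints is his example, and the
implementation details here are assumptions):

```python
def type_hints(f):
    """Move raw annotations into purpose-specific storage (guideline 2),
    leaving the shared __annotations__ space empty for other frameworks."""
    f._type_hints = dict(f.__annotations__)  # purpose-specific attribute
    f.__annotations__.clear()                # don't squat on the shared space
    return f

@type_hints
def foo(a: int, b: str) -> float:
    return float(a)
```

A later analysis tool then consults only foo._type_hints (guideline 3) and
sees an empty __annotations__, so no "default" semantics are implied.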


From aquavitae69 at gmail.com  Thu Dec  6 08:23:31 2012
From: aquavitae69 at gmail.com (David Townshend)
Date: Thu, 6 Dec 2012 09:23:31 +0200
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJKVeDzEmBLM7nadQW4zFhOa1Es8r_TR84WGveNgBypFyw@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<CADiSq7fy3hW_Mze1jJND7MeknPKJzVhWU8KPXzfHnbO+TQV++Q@mail.gmail.com>
	<CAEgL-fcHaDvyUG4+NM2RQb_ffDMhW4T-9zS0NzgQatVqvfbUkw@mail.gmail.com>
	<CAP7+vJKVeDzEmBLM7nadQW4zFhOa1Es8r_TR84WGveNgBypFyw@mail.gmail.com>
Message-ID: <CAEgL-ffa3Bw8iKy2N3Y5Ac4WFtJkUVfPV-byZEfSD+YNrxPcOA@mail.gmail.com>

On Wed, Dec 5, 2012 at 8:17 PM, Guido van Rossum <guido at python.org> wrote:

> On Tue, Dec 4, 2012 at 1:37 AM, David Townshend <aquavitae69 at gmail.com>
> wrote:
> > Just thought of a couple of usages which don't fit into the decorator
> model.
> > The first is using the return annotation for early binding:
> >
> >     def func(seq) -> dict(sorted=sorted):
> >         return func.__annotations__['return']['sorted'](seq)
>
> You've got to be kidding...
>

> > Strangely enough, this seems to run slightly faster than
> >
> >     def func(seq, sorted=sorted):
> >         return sorted(seq)
> >
> > My test shows the first running in about 0.376s and the second in about
> > 0.382s (python 3.3, 64bit).
>
> Surely that's some kind of random variation. It's only a 2% difference.
>

It's consistent.  I ran several tests and came out with the same 2%
difference every time.


> IOW, this is not a line of thought to pursue.
>
>
I wasn't suggesting that this is a good idea, I was merely trying to point
out that there are currently ways of using annotations beyond type
declarations with decorators, and that there may be other use cases out
there which will work well.  Documenting recommendations that annotations
only be used with decorators, or only be used for type declarations, will
limit the possibilities because nobody will bother to look further, and if
they do, the ideas will no doubt be shut down as being bad style because
they go against the recommended usage.  I thought that limiting annotations
like this was what you wanted to avoid?

Having said that, I've never found a good use for annotations in my own
code, so I'm not emotionally invested one way or the other.  I do think
that the best usage I've seen is exactly what is being discussed here and
it would be great if there were some prescribed use for annotations.
Perhaps people would actually use them then.

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121206/73393b3c/attachment.html>

From masklinn at masklinn.net  Thu Dec  6 09:43:34 2012
From: masklinn at masklinn.net (Masklinn)
Date: Thu, 6 Dec 2012 09:43:34 +0100
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJ+Qusnw=oqKoEvX8dVCR2rY67hC=UC36bn5Sm9hE9cJWg@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
	<4F969DC0-2B67-4C35-B0E7-EEEAD992E840@masklinn.net>
	<CAP7+vJ+Qusnw=oqKoEvX8dVCR2rY67hC=UC36bn5Sm9hE9cJWg@mail.gmail.com>
Message-ID: <33AD9673-BFDD-4C1E-8149-BAC13ADB29BB@masklinn.net>

On 2012-12-06, at 00:06 , Guido van Rossum wrote:
> 
>>> - Unions. We need a way to say "either X or Y". Given that we're
>>> defining our own objects we may actually be able to get away with
>>> writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
>>> still work. It would also be useful to have a shorthand for "either T
>>> or None", written as Optional[T] or Optional(T).
>> 
>> Well if `|` is the "union operator", as Ben notes `T | None` works well,
>> is clear and is sufficient. Though that's if and only if "Optional[T]"
>> is equivalent to "T or None" which Bruce seems to disagree with. There's
>> some history with this pattern:
>> http://journal.stuffwithstuff.com/2010/08/23/void-null-maybe-and-nothing/
>> (bottom section, from "Or Some Other Solution")
> 
> Actually, I find "T|None" somewhat impure, since None is not a type
> but a value. If you were allow this, what about "T|False"? And then
> what about "True|None"? (There's no way to make the latter work!) And
> I think "T|NoneType" is obscure; hence my proposal of Optional(T).
> (Not Optional[T], since Optional is not a type.)

Why would Optional not be a type? It's consistent with the Option or Maybe
types in languages that have them, or C#'s Nullable.
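
For what it's worth, an Optional that works with isinstance() can be sketched
in a few lines (hypothetical code, not an actual proposal's implementation):

```python
class Optional:
    """Hypothetical Optional(T): matches instances of T, or None."""
    def __init__(self, typ):
        self.typ = typ

    def __instancecheck__(self, obj):
        # isinstance(x, C) consults type(C).__instancecheck__, so instances
        # of this class can stand in for a "type" in isinstance() checks.
        return obj is None or isinstance(obj, self.typ)
```

So whether or not Optional is spelled as a real type, isinstance() support is
not the obstacle.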


From andrew.svetlov at gmail.com  Thu Dec  6 15:17:35 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Thu, 6 Dec 2012 16:17:35 +0200
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
Message-ID: <CAL3CFcUr2CwxvrkLf5HEXk+HfjHXARKq8-22etiCJK8KGM6u=w@mail.gmail.com>

On Wed, Dec 5, 2012 at 9:22 PM, Guido van Rossum <guido at python.org> wrote:
> - Unions. We need a way to say "either X or Y". Given that we're
> defining our own objects we may actually be able to get away with
> writing e.g. "Int | Str" or "Str | List[Str]", and isinstance() would
> still work. It would also be useful to have a shorthand for "either T
> or None", written as Optional[T] or Optional(T).

Just to note: there is the https://github.com/Deepwalker/trafaret library,
intended for checking complex structures.


From random832 at fastmail.us  Thu Dec  6 20:56:07 2012
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Thu, 06 Dec 2012 14:56:07 -0500
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <33AD9673-BFDD-4C1E-8149-BAC13ADB29BB@masklinn.net>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
	<CAL3CFcVwH=vf9FkfMvYqH17JDvqvfcE3ULpuB0HU-Gpmr4EMTg@mail.gmail.com>
	<CAOvn4qg-kfri06XWDKUc3P-c=wdayMg-4Z7=EKkJS5u=ituOJw@mail.gmail.com>
	<CADiSq7fVyf+cqP84i+Q+xLt7hOh81pCcLqChYV90e+PCtS20kg@mail.gmail.com>
	<A0120E65-50AF-4ABA-A22E-E36224184C58@gmail.com>
	<CADiSq7dfKVn154NYsO10vg+AjEb3_THm5DOn6=aYFWyRB+iyXg@mail.gmail.com>
	<4B6491A4-315B-4C39-A0F2-42F0EFB42ADA@gmail.com>
	<CADiSq7drwn6eBG-i9myA6PYxdHLhw7=N5jLUDmKD2RN=c8nX4Q@mail.gmail.com>
	<9C9240AB-CE2D-4E0D-B74C-526EEB09AEB5@gmail.com>
	<CADiSq7cD9LrsmAy90+dgyfWXSSQkQ7aAaBsDR+x4Oxkg2BbcoQ@mail.gmail.com>
	<20121203103416.03094472@resist.wooz.org>
	<CAP7+vJJKXgK_Pc+Y+6jQyGW6CPOc44uhjW+ZCNWdaD5MTa9EmQ@mail.gmail.com>
	<55D9BA21-0C74-4958-A9C7-0C0969366F93@masklinn.net>
	<CAP7+vJL3oAbBPf49hS0Z6kKY6AY++sMgp_-0RpmGJoHHTtUEKQ@mail.gmail.com>
	<4F969DC0-2B67-4C35-B0E7-EEEAD992E840@masklinn.net>
	<33AD9673-BFDD-4C1E-8149-BAC13ADB29BB@masklinn.net>
Message-ID: <1354823767.1386.140661162801849.0482DD14@webmail.messagingengine.com>

On Thu, Dec 6, 2012, at 3:43, Masklinn wrote:
> Why would Optional not be a type? It's coherent with Option or Maybe
> types in languages with such features, or C#'s Nullable.

C#'s Nullable doesn't really work outside a static typing system - when
you assign a Nullable to an 'object' or a 'dynamic', you get either the
original type (e.g. Int32) or a null reference (which has no type). It's
a real type only as far as the static typing system goes: it can be the
type of a field or a local variable, it _cannot_ be the type of an
object on the heap.

And since Python doesn't have static typing...


From dreamingforward at gmail.com  Fri Dec  7 22:45:25 2012
From: dreamingforward at gmail.com (Mark Adam)
Date: Fri, 7 Dec 2012 15:45:25 -0600
Subject: [Python-ideas] Graph class
Message-ID: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>

I have a decent semi-recursive Graph class that I think could be a
good addition to the Collections module.  It probably needs some
refactoring, but I'm posting here to see if there's any interest.

For those who aren't too abreast of CS theory, a graph is one of the
most abstract data structures in computer science, encompassing trees
and lists. I'm a bit surprised that no one's offered one up yet, so
I'll present mine.

The code is at http://github.com/theProphet/Social-Garden under the
pangaia directory, called graph.py.  It has a default dictionary
(defdict.py) dependency that I made before Python came up with its
own (another place for refactoring).

Cheers,

MarkJ


From thomas at kluyver.me.uk  Fri Dec  7 23:22:42 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Fri, 7 Dec 2012 22:22:42 +0000
Subject: [Python-ideas] Graph class
In-Reply-To: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
Message-ID: <CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>

On 7 December 2012 21:45, Mark Adam <dreamingforward at gmail.com> wrote:

> I have a decent semi-recursive Graph class that I think could be a
> good addition to the Collections module.  It probably needs some
> refactoring, but I'm posting here to see if there's any interest.
>

For reference, there was a previous idea to make some kind of standard
Graph API:
http://wiki.python.org/moin/PythonGraphApi

When I had to implement a really simple DAG myself, I based it on this
Graph ABC library:
http://www.linux.it/~della/GraphABC/

Best wishes,
Thomas

From tjreedy at udel.edu  Sat Dec  8 08:17:17 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 08 Dec 2012 02:17:17 -0500
Subject: [Python-ideas] Graph class
In-Reply-To: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
Message-ID: <k9upi9$fmq$1@ger.gmane.org>

On 12/7/2012 4:45 PM, Mark Adam wrote:
> I have a decent semi-recursive Graph class that I think could be a
> good addition to the Collections module.  It probably needs some
> refactoring, but I'm posting here to see if there's any interest.
>
> For those who aren't too abreast of CS theory, a graph is one of the
> most abstract data structures in computer science, encompassing trees,
> and lists. I'm a bit surprised that no one's offered one up yet, so
> I'll present mine.

I believe there are multiple graph modules and packages, but none is 
really dominant. That is partly because there are multiple representations 
and the best one depends on the problem.

> The code is at http://github.com/theProphet/Social-Garden under the
> pangaia directly called graph.py.  It has a default dictionary
> (defdict.py) dependency that I made before Python came up with it on
> it's own (another place for refactoring).

-- 
Terry Jan Reedy



From storchaka at gmail.com  Sat Dec  8 09:07:40 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 08 Dec 2012 10:07:40 +0200
Subject: [Python-ideas] Graph class
In-Reply-To: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
Message-ID: <k9usgh$273$1@ger.gmane.org>

On 07.12.12 23:45, Mark Adam wrote:
> I have a decent semi-recursive Graph class that I think could be a
> good addition to the Collections module.  It probably needs some
> refactoring, but I'm posting here to see if there's any interest.
>
> For those who aren't too abreast of CS theory, a graph is one of the
> most abstract data structures in computer science, encompassing trees,
> and lists. I'm a bit surprised that no one's offered one up yet, so
> I'll present mine.
>
> The code is at http://github.com/theProphet/Social-Garden under the
> pangaia directly called graph.py.  It has a default dictionary
> (defdict.py) dependency that I made before Python came up with it on
> it's own (another place for refactoring).

A graph is too abstract a concept. There are a lot of possible 
implementations of graphs, and every non-trivial program contains some 
(maybe implicit) graphs.

For some implementations, see Magnus Lie Hetland, "Python Algorithms: 
Mastering Basic Algorithms in the Python Language".



From dreamingforward at gmail.com  Sun Dec  9 02:29:56 2012
From: dreamingforward at gmail.com (Mark Adam)
Date: Sat, 8 Dec 2012 19:29:56 -0600
Subject: [Python-ideas] Graph class
In-Reply-To: <CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
Message-ID: <CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>

On Fri, Dec 7, 2012 at 4:22 PM, Thomas Kluyver <thomas at kluyver.me.uk> wrote:
> On 7 December 2012 21:45, Mark Adam <dreamingforward at gmail.com> wrote:
>>
>> I have a decent semi-recursive Graph class that I think could be a
>> good addition to the Collections module.  It probably needs some
>> refactoring, but I'm posting here to see if there's any interest.
>
> For reference, there was a previous idea to make some kind of standard
Graph
> API:
> http://wiki.python.org/moin/PythonGraphApi

All very interesting.  I'm going to suggest a sort of "meta-discussion"
about why -- despite the power of graphs as a data structure -- such a
feature has not stabilized into a workable solution for inclusion in a
high-level language like Python.

I identify the following points of wavering:

1) the naming of methods (add_edge vs add(1, 2)):  *aesthetic grounds,*
2) what methods to include (degree + neighbors, or the standard dict's
__len__ + __getitem__):  *API grounds*
3) how much flexibility to offer (directed graphs, multi-graphs, edge
weights with arbitrary labeling, etc.):  *functionality grounds*
4) what underlying data structure to use (sparse adjacency dicts, matrices,
etc.):  *representation conflicts*.

And upon further thought, it looks like only a killer application could
ever settle the issue(s) to make it part of the standard library.

mark

From p.f.moore at gmail.com  Sun Dec  9 12:40:05 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 9 Dec 2012 11:40:05 +0000
Subject: [Python-ideas] Graph class
In-Reply-To: <CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
Message-ID: <CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>

On 9 December 2012 01:29, Mark Adam <dreamingforward at gmail.com> wrote:
> All very interesting.  I'm going to suggest a sort of "meta-discussion"
> about why -- despite the power of graphs as a data structure -- such a
> feature has not stabilized into a workable solution for inclusion in a
> high-level language like Python.
>
> I identity the following points of "wavery":
>
> 1) the naming of methods (add_edge, vs add(1,2)):  aesthetic grounds,
> 2) what methods to include (degree + neighbors or the standard dict's
> __len__ + __getitem__):  API grounds
> 3) how much flexibility to be offered (directed, multi-graphs, edge weights
> with arbitrary labeling, etc.):  functionality grounds
> 3) what underlying data structure to use (sparse adjacency dicts, matrices,
> etc):  representation conflicts.

4) Whether the library requires some sort of "Vertex" type or works
with arbitrary values, and similarly whether there is a defined "Edge"
class, or whether edges can be labelled, weighted, etc. with arbitrary
Python values.
5) Granularity - if all I want is a depth-first search algorithm, why
pull in a dependency on 100 graph algorithms I'm not interested in?

My feeling is that graphs are right on the borderline of a data
structure that is simple enough that people invent their own rather
than bother conforming to a "standard" model but complex enough that
it's worth using library functions rather than getting the details
wrong. In C, there are many examples of this type of "borderline"
stuff - linked lists, maps, sorting and searching algorithms, etc. In
Python, lists, dictionaries, sorting, etc are all "self evidently"
basic building blocks, but graphs hit that borderline area.

Paul


From allyourcode at gmail.com  Sun Dec  9 20:33:43 2012
From: allyourcode at gmail.com (Daniel Wong)
Date: Sun, 9 Dec 2012 11:33:43 -0800 (PST)
Subject: [Python-ideas] Conventions for function annotations
In-Reply-To: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
References: <CAOvn4qi4pvDDmcTwafTTLeLX7H6BhVb+eJ+9XgT6cHr_gxA7sQ@mail.gmail.com>
Message-ID: <2c69dfe4-5c90-4670-b747-7734e37ccb83@googlegroups.com>

This proposal looks great. The only thing is that I don't understand the 
point of annotations in the first place, since Python has decorators. As 
the last part of your post describes, decorators can be used to do the same 
thing. With decorators, it is even possible to use annotation-like syntax:

import inspect

def defaults_as_parameter_metadata(f):
  names, args_name, kwargs_name, defaults = inspect.getargspec(f)
  assert len(names) == len(defaults)  # To keep this example simple...

  f.parameter_metadata = {}
  for name, meta in zip(names, defaults):
    f.parameter_metadata[name] = meta
  f.__defaults__ = ()  # Again, for simplicity.
  return f


@defaults_as_parameter_metadata
def make_ice_cream(flavor=(options('vanilla', 'chocolate', ...),
                           str, "What kind of delicious do you want?"),
                   quantity=(positive, double,
                             "How much (in pounds) do you want?")):
  ...


I know this addresses a different issue, but I was directed to this thread 
from an answer that I got on StackOverflow 
<http://stackoverflow.com/questions/13784713/what-good-are-python-function-annotations/13785381#13785381>, 
and this thread seems related enough. Sorry if I'm going off the rails here.

On Saturday, December 1, 2012 4:28:50 AM UTC-8, Thomas Kluyver wrote:
>
> Function annotations (PEP 3107) are a very interesting new feature, but so 
> far have gone largely unused. The only project I've seen using them is 
> plac, a command-line option parser. One reason for this is that because 
> function annotations can be used to mean anything, we're wary of doing 
> anything in case we interfere with some other use case. A recent thread on 
> ipython-dev touched on this [1], and we'd like to suggest some conventions 
> to make annotations useful for everyone.
>
> 1. Code inspecting annotations should be prepared to ignore annotations it 
> can't understand.
>
> 2. Code creating annotations should use wrapper classes to indicate what 
> the annotation means. For instance, we are contemplating a way to specify 
> options for a parameter, to be used in tab completion, so we would do 
> something like this:
>
> from IPython.core.completer import options
> def my_io(filename, mode: options('read','write') ='read'):
>     ...
>
> 3. There are a couple of important exceptions to 2:
> - Annotations that are simply a string can be used like a docstring, to be 
> displayed to the user. Inspecting code should not expect to be able to 
> parse any machine-readable information out of these strings.
> - Annotations that are a built-in type (int, str, etc.) indicate that the 
> value should always be an instance of that type. Inspecting code may use 
> these for type checking, introspection, optimisation, or other such 
> purposes. Note that for now, I have limited this to built-in types, so 
> other types can be used for other purposes, but this could be extended. For 
> instance, the ABCs from collections (collections.Mapping et al.) could well 
> be added to this category.
>
> 4. There should be a convention for attaching multiple annotations to one 
> value. I propose that all code using annotations expects to handle 
> tuples/lists of annotations. (We also considered dictionaries, but the 
> result is long and ugly). So in this definition:
>
> def my_io(filename, mode: (options('read','write'), str, 'The mode in 
> which to open the file') ='read'):
>     ...
>
> the mode parameter has a set of options (ignored by frameworks that don't 
> recognise it), should always be a string, and has a description.
>
> Any thoughts and suggestions are welcome.
>
> As an aside, we may also create a couple of decorators to fill in 
> __annotations__ on Python 2, something like:
>
> @return_annotation('A file object')
> @annotations(mode=(options('read','write'), str, 'The mode in which to 
> open the file'))
> def my_io(filename, mode='read'):
>     ...
>
> [1] http://mail.scipy.org/pipermail/ipython-dev/2012-November/010697.html
>
>
> Thanks,
> Thomas
>

From dreamingforward at gmail.com  Sun Dec  9 21:31:32 2012
From: dreamingforward at gmail.com (Mark Adam)
Date: Sun, 9 Dec 2012 14:31:32 -0600
Subject: [Python-ideas] Fwd:  Graph class
In-Reply-To: <CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
Message-ID: <CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>

Meant this to go to the whole list.  Sorry.

On Sun, Dec 9, 2012 at 5:40 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 9 December 2012 01:29, Mark Adam <dreamingforward at gmail.com> wrote:
>> All very interesting.  I'm going to suggest a sort of "meta-discussion"
>> about why -- despite the power of graphs as a data structure -- such a
>> feature has not stabilized into a workable solution for inclusion in a
>> high-level language like Python.
>>
>> I identity the following points of "wavery":
>>
>> 1) the naming of methods (add_edge, vs add(1,2)):  aesthetic grounds,
>> 2) what methods to include (degree + neighbors or the standard dict's
>> __len__ + __getitem__):  API grounds
>> 3) how much flexibility to be offered (directed, multi-graphs, edge weights
>> with arbitrary labeling, etc.):  functionality grounds
>> 4) what underlying data structure to use (sparse adjacency dicts, matrices,
>> etc):  representation conflicts.
>
> 4) Whether the library requires some sort of "Vertex" type, or works
> with arbitrary values, similarly whether there is a defined "Edge"
> class or edges can be labelled, weighted, etc with arbitrary Python
> values.

This I put under #3 (functionality grounds), "edge weights with
arbitrary labeling"; Vertexes with arbitrary values would, I think,
be included.

> 5) Granularity - if all I want is a depth-first search algorithm, why
> pull in a dependency on 100 graph algorithms I'm not interested in?

Hmm, I would call this "5) comprehensiveness: whether to include every
graph algorithm known to mankind."

> My feeling is that graphs are right on the borderline of a data
> structure that is simple enough that people invent their own rather
> than bother conforming to a "standard" model but complex enough that
> it's worth using library functions rather than getting the details
> wrong.

But this is also why (on both counts) it would be good to include it
in the standard library.  The *simplicity* of a graph makes everyone
re-implement it, or (worse) work with some cruder work-around (like a
dict of dicts, where it's not at all clear you're dealing with an
actual graph).  But imagine if, for example, Subversion used a Python
graph class to track all branches and nodes in its revision control
system.  Then how easy it would be for third parties to make tools to
view repos, or for other developers to come in and work with the dev
team: they're already familiar with the standard graph class
structure.

And for the *complex enough* case, obviously it helps to have a
standard library that provides the sophistication of a graph class for
you.  There are a lot of obvious uses for a graph, but a beginning
programmer who doesn't know of it won't *think* of it and will fall
back on some crude work-around.
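
For concreteness, the dict-of-dicts work-around versus a tiny standard-ish
class could look like the following sketch (illustrative only; the names and
API are made up, not a concrete proposal):

```python
from collections import defaultdict

class Graph:
    """A minimal undirected graph over hashable values (adjacency sets)."""

    def __init__(self):
        self._adj = defaultdict(set)

    def add_edge(self, u, v):
        # Undirected: record the edge in both directions.
        self._adj[u].add(v)
        self._adj[v].add(u)

    def neighbors(self, u):
        return set(self._adj[u])

    def dfs(self, start):
        """Return nodes reachable from start, in depth-first visit order."""
        seen, stack, order = set(), [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            order.append(node)
            stack.extend(self._adj[node] - seen)
        return order
```

Even this much makes intent explicit in a way a bare dict of dicts does not.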

Mark


From solipsis at pitrou.net  Sun Dec  9 21:53:45 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 9 Dec 2012 21:53:45 +0100
Subject: [Python-ideas] Graph class
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
Message-ID: <20121209215345.496fc6ef@pitrou.net>

On Fri, 7 Dec 2012 15:45:25 -0600
Mark Adam <dreamingforward at gmail.com>
wrote:
> I have a decent semi-recursive Graph class that I think could be a
> good addition to the Collections module.  It probably needs some
> refactoring, but I'm posting here to see if there's any interest.
> 
> For those who aren't too abreast of CS theory, a graph is one of the
> most abstract data structures in computer science, encompassing trees,
> and lists. I'm a bit surprised that no one's offered one up yet, so
> I'll present mine.
> 
> The code is at http://github.com/theProphet/Social-Garden under the
> pangaia directory, called graph.py.  It has a default dictionary
> (defdict.py) dependency that I wrote before Python came up with one on
> its own (another place for refactoring).

Do you know networkx?
http://networkx.lanl.gov/

Regards

Antoine.




From stephen at xemacs.org  Mon Dec 10 03:08:00 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 10 Dec 2012 11:08:00 +0900
Subject: [Python-ideas]  Fwd:  Graph class
In-Reply-To: <CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
Message-ID: <87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>

Mark Adam writes:

 > graph).  But imagine if, for example, Subversion used a python graph
 > class to track all branches and nodes in it's distributed revision
 > control system.  Then how easy it would be for third parties to make
 > tools to view repos or other developers to come in and work with the
 > dev team:  they're already familiar with the standard graph class
 > structure.

This is a fallacy.  As has been pointed out, there is a variety of
graphs, a large variety of computations to be done on and in them, and
a huge variety in algorithms for dealing with those varied tasks.  For
a "standard" graph class to be useful enough to become the OOWTDI, it
would need to deal with a large fraction of those aspects of graph
theory.

Even so, people would only really internalize the parts they need for
the present task, forgetting or (worse) misremembering functionality
that doesn't work for them right now.  Corner cases would force many
tasks to be done outside of the standard class.  Differences in taste
would surely result in a large number of API variants to reflect
users' preferred syntaxes for representing graphs, and so on.  I think
making a "Graph" class that has a chance of becoming the OOWTDI is a
big task.  Not as big as SciPy, say, but then, SciPy isn't being
proposed for stdlib inclusion, either.

As usual for stdlib additions, I think this discussion would best be
advanced not by "going all meta", but rather by proposing specific
packages (either already available, perhaps on PyPI, or new ones --
but with actual code) for inclusion.  The "meta" discussion should be
conducted with specific reference to the advantages or shortcomings of
those specific packages.  N.B. A reasonably comprehensive package that
has seen significant real-world use, and preferably has a primary
distribution point of PyPI, would be the shortest path to inclusion.



From techtonik at gmail.com  Wed Dec 12 10:14:21 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 12 Dec 2012 12:14:21 +0300
Subject: [Python-ideas] Python is not perfect - let's add 'Wart' status to
	track
Message-ID: <CAPkN8xJBteq_eMAO=OyMqbG1vGCE5XkWm_nnLhUonR1ZDHUBww@mail.gmail.com>

I want to query all warts for specific Python 2.x versions to see how
they are fixed in 3.x.

Right now these warts are hidden beneath the "invalid" label, which IMHO
does as much damage to the language's development as BC breaks.

How about adding a 'Wart' resolution to the closed status on the tracker?
-- 
anatoly t.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121212/e665776c/attachment.html>

From solipsis at pitrou.net  Wed Dec 12 10:45:44 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 12 Dec 2012 10:45:44 +0100
Subject: [Python-ideas] Python is not perfect - let's add 'Wart' status
	to track
References: <CAPkN8xJBteq_eMAO=OyMqbG1vGCE5XkWm_nnLhUonR1ZDHUBww@mail.gmail.com>
Message-ID: <20121212104544.06203118@pitrou.net>

Le Wed, 12 Dec 2012 12:14:21 +0300,
anatoly techtonik <techtonik at gmail.com> a
écrit :

> I want to query all warts for specific Python 2.x versions to see how
> are they fixed in 3.x.
> 
> Right now these warts are hidden beneath the "invalid" labels, which
> IMHO does as much damage to the language development as BC breaks.
> 
> How about adding 'Wart' resolution to the closed status on tracker?

That's what "won't fix" is for: things that we agree should ideally be
fixed but that we keep frozen for compatibility or other reasons.

Regards

Antoine.




From storchaka at gmail.com  Wed Dec 12 18:40:03 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 12 Dec 2012 19:40:03 +0200
Subject: [Python-ideas] Docstrings for namedtuple
Message-ID: <kaafhm$b7n$1@ger.gmane.org>

What interface is better for specifying namedtuple field docstrings?

     Point = namedtuple('Point', 'x y',
                        doc='Point: 2-dimensional coordinate',
                        field_docs=['abscissa', 'ordinate'])

or

     Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
                        doc='Point: 2-dimensional coordinate')

?

http://bugs.python.org/issue16669



From solipsis at pitrou.net  Wed Dec 12 20:35:47 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 12 Dec 2012 20:35:47 +0100
Subject: [Python-ideas] Docstrings for namedtuple
References: <kaafhm$b7n$1@ger.gmane.org>
Message-ID: <20121212203547.1d08a044@pitrou.net>

On Wed, 12 Dec 2012 19:40:03 +0200
Serhiy Storchaka <storchaka at gmail.com>
wrote:
> What interface is better for specifying namedtuple field docstrings?
> 
>      Point = namedtuple('Point', 'x y',
>                         doc='Point: 2-dimensional coordinate',
>                         field_docs=['abscissa', 'ordinate'])

field_docs={'x': 'abscissa', 'y': 'ordinate'} perhaps?




From mal at egenix.com  Wed Dec 12 20:56:40 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 12 Dec 2012 20:56:40 +0100
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <kaafhm$b7n$1@ger.gmane.org>
References: <kaafhm$b7n$1@ger.gmane.org>
Message-ID: <50C8E178.3040106@egenix.com>

On 12.12.2012 18:40, Serhiy Storchaka wrote:
> What interface is better for specifying namedtuple field docstrings?
> 
>     Point = namedtuple('Point', 'x y',
>                        doc='Point: 2-dimensional coordinate',
>                        field_docs=['abscissa', 'ordinate'])
> 
> or
> 
>     Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
>                        doc='Point: 2-dimensional coordinate')
> 
> ?
>
> http://bugs.python.org/issue16669

IMO, attributes should be documented in the existing doc parameter,
not separately. This makes the intention clear and the code overall
more readable.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Dec 12 2012)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-12-05: Released eGenix pyOpenSSL 0.13 ...    http://egenix.com/go37
2012-11-28: Released eGenix mx Base 3.2.5 ...     http://egenix.com/go36
2013-01-22: Python Meeting Duesseldorf ...                 41 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From storchaka at gmail.com  Wed Dec 12 20:57:58 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 12 Dec 2012 21:57:58 +0200
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <20121212203547.1d08a044@pitrou.net>
References: <kaafhm$b7n$1@ger.gmane.org> <20121212203547.1d08a044@pitrou.net>
Message-ID: <kaanka$q2p$1@ger.gmane.org>

On 12.12.12 21:35, Antoine Pitrou wrote:
> field_docs={'x': 'abscissa', 'y': 'ordinate'} perhaps?

This would force repeating every field name twice.

If we have such docs_dict, we can use it as:

field_names = ['x', 'y']
Point = namedtuple('Point', field_names,
     field_docs=list(map(docs_dict.get, field_names)))

or as

Point = namedtuple('Point',
     [(f, docs_dict.get(f)) for f in field_names])

With an ordered dict it can be even simpler:

Point = namedtuple('Point', ordered_dict.keys(),
     field_docs=list(ordered_dict.values()))

or

Point = namedtuple('Point', ordered_dict.items())




From storchaka at gmail.com  Wed Dec 12 21:12:24 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 12 Dec 2012 22:12:24 +0200
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <50C8E178.3040106@egenix.com>
References: <kaafhm$b7n$1@ger.gmane.org> <50C8E178.3040106@egenix.com>
Message-ID: <kaaofa$22u$1@ger.gmane.org>

On 12.12.12 21:56, M.-A. Lemburg wrote:
> IMO, attributes should be documented in the existing doc parameter,
> not separately. This makes the intention clear and the code overall
> more readable.

Sorry, I didn't understand what you meant. There is no doc parameter for
namedtuple yet.

To override the class docstring we can use the inheritance idiom. But
there is no way to change a field docstring: all field docstrings are
generated from the template 'Alias for field number {index:d}'.
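
For readers unfamiliar with it, the inheritance idiom for overriding the
class docstring looks roughly like this (the base-class name is
illustrative):

```python
from collections import namedtuple

_PointBase = namedtuple('Point', 'x y')

class Point(_PointBase):
    """Point: 2-dimensional coordinate."""
    __slots__ = ()  # keep the tuple's low memory footprint

p = Point(1, 2)
```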




From mal at egenix.com  Wed Dec 12 21:19:19 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 12 Dec 2012 21:19:19 +0100
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <kaaofa$22u$1@ger.gmane.org>
References: <kaafhm$b7n$1@ger.gmane.org> <50C8E178.3040106@egenix.com>
	<kaaofa$22u$1@ger.gmane.org>
Message-ID: <50C8E6C7.6010807@egenix.com>

On 12.12.2012 21:12, Serhiy Storchaka wrote:
> On 12.12.12 21:56, M.-A. Lemburg wrote:
>> IMO, attributes should be documented in the existing doc parameter,
>> not separately. This makes the intention clear and the code overall
>> more readable.
> 
> Sorry, I didn't understand what you mean. There is no doc parameter for namedtuple yet.

Ah, sorry. Please scratch the "existing" in my reply :-)

+1 on a doc parameter on namedtuple() - property() already has such
   a parameter, which is probably why I got confused.

-0 on having separate doc strings for the fields. Their meaning will
   usually be clear from the main doc string.

> For overloading class docstring we can use inheritance idiom. But there is no way to change field
> docstring. All field docstrings generated using template 'Alias for field number {index:d}'.

Yes, I've seen that:
http://docs.python.org/2/library/collections.html?highlight=namedtuple#collections.namedtuple

It may not be too helpful, but it's an accurate description of the
field's purpose :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Dec 12 2012)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-12-05: Released eGenix pyOpenSSL 0.13 ...    http://egenix.com/go37
2012-11-28: Released eGenix mx Base 3.2.5 ...     http://egenix.com/go36
2013-01-22: Python Meeting Duesseldorf ...                 41 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From dreamingforward at gmail.com  Sat Dec 15 07:19:41 2012
From: dreamingforward at gmail.com (Mark Adam)
Date: Sat, 15 Dec 2012 00:19:41 -0600
Subject: [Python-ideas] Fwd: Graph class
In-Reply-To: <87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>

On Sun, Dec 9, 2012 at 8:08 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Mark Adam writes:
>
>  > graph).  But imagine if, for example, Subversion used a python graph
>  > class to track all branches and nodes in it's distributed revision
>  > control system.  Then how easy it would be for third parties to make
>  > tools to view repos or other developers to come in and work with the
>  > dev team:  they're already familiar with the standard graph class
>  > structure.
>
> This is a fallacy.  As has been pointed out, there is a variety of
> graphs, a large variety of computations to be done on and in them, and
> a huge variety in algorithms for dealing with those varied tasks.

Yes, but the basic data structure concept is NOT in development; it is
already well-developed.  The remaining issues of API are wholly
secondary.   The usefulness of a graph is unquestioned and creates
cross-functionality across a large number of possible interesting
domains.   Really it's like having a car when everyone else is walking
(...but should the car be a Buick or a Toyota? a four-door or two? --
is all beside the point)

But your point about proposing an actual implementation, rather than
spending a lot of time arguing over it all, is well-taken.  With any
luck, I'll try to distill networkx with my own work and put it all
together.  haha

mark


From tack at urandom.ca  Sun Dec 16 02:36:18 2012
From: tack at urandom.ca (Jason Tackaberry)
Date: Sat, 15 Dec 2012 20:36:18 -0500
Subject: [Python-ideas] Late to the async party (PEP 3156)
Message-ID: <50CD2592.5010507@urandom.ca>

Hi python-ideas,

I've been somewhat living under a rock for the past few months and 
consequently I missed the ideal window of opportunity to weigh in on the 
async discussions this fall that culminated in PEP 3156.

I've been reading through those discussions in the archives.  I've not 
finished digesting it all, and I'm somewhat torn in that I feel I should 
shut up until I read everything to date so as not to decrease the SNR, 
but on the other hand, knowing myself, I strongly suspect this would 
result in my never speaking up.  And so, at risk of lowering the SNR ...

First let me say that PEP 3156 makes me very, very happy.

Over the past few years I've been exploring these very ideas with a 
little-used library called Kaa.  I'm not offering it up as a paragon of 
proper async library design, but I wanted to share some of my 
experiences in case they could be useful to the PEP.

     https://github.com/freevo/kaa-base/
     http://api.freevo.org/kaa-base/

It does seem like many similar design choices were made.  In particular, 
I'm happy that an explicit yield will be used rather than the greenlet 
style of implicit suspension/reentry.   Even after I've been using them 
for years, coroutines often feel like a form of magic, and an explicit 
yield is more aligned with the principle of least surprise.

With Kaa, our future-style object is called an InProgress (so forgive 
the differing terminology in the remainder of this post):

     http://api.freevo.org/kaa-base/async/inprogress.html

A couple properties of InProgress objects that I've found have practical 
value:

  * they can be aborted, which raises a special InProgressAborted inside
    the coroutine function so it can perform cleanup actions
      o what makes this tricky is the question of what to do with any
        currently yielded tasks.  If A yields B and A is aborted, should
        B be aborted?  What if the same B task is being yielded by C? 
        Should C also be aborted, even if it's considered a sibling of
        A?  (For example, suppose B is a task that is refreshing some
        common cache that both A and C want to make sure is up-to-date
        before they move on.)
      o if the decision is B should be aborted, then within A, 'yield B'
        will raise an exception because A is aborted, but 'yield B'
        within C will raise because B was aborted.  So there needs to be
        some mechanism to distinguish between these cases.  (My approach
        was to have an origin attribute on the exception.)
      o if A yields B, it may want to prevent B from being aborted if A
        is aborted.  (My approach was to have a noabort() method in
        InProgress objects to return a new, unabortable InProgress
        object that A can then yield.)
      o alternatively, the saner implementation may be to do nothing to
        B when A is aborted and require A catch InProgressAborted and
        explicitly abort B if that's the desired behaviour
      o discussion in the PEP on cancellation has some TBDs so perhaps
        the above will be food for thought
  * they have a timeout() method, which returns a new InProgress object
    representing the task that will abort when the timeout elapses if
    the task doesn't finish
      o it's noteworthy that timeout() returns a /new/ InProgress and
        the original task continues on even if the timeout occurs -- by
        default that is, unless you do timeout(abort=True)
      o I didn't see much discussion in the PEP on timeouts, but I think
        this is an important feature that should be standardized
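
As a rough analogy for the noabort() idea in stdlib terms (Kaa's
InProgress is its own class; a concurrent.futures.Future merely stands
in for it here), a wrapper can absorb cancellation without propagating
it to the underlying task:

```python
from concurrent.futures import Future

def noabort(task):
    # Return a new future that mirrors task's outcome, but whose own
    # cancellation does NOT propagate to (or abort) the wrapped task.
    outer = Future()

    def relay(done):
        if outer.cancelled():
            return  # the caller gave up; the task ran to completion anyway
        exc = done.exception()
        if exc is not None:
            outer.set_exception(exc)
        else:
            outer.set_result(done.result())

    task.add_done_callback(relay)
    return outer

inner = Future()
outer = noabort(inner)
outer.cancel()        # caller aborts its *view* of the task...
inner.set_result(42)  # ...while the underlying task still completes
```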


Coroutines in Kaa use "yield" rather than "yield from" but the general 
approach looks very similar to what's been proposed:

     http://api.freevo.org/kaa-base/async/coroutines.html

The @coroutine decorator causes the decorated function to return an 
InProgress.  Coroutines can of course yield other coroutines, but, more 
fundamentally, anything else that returns an InProgress object, which 
could be a @threaded function, or even an ordinary function that 
explicitly creates and returns an InProgress object.

There are some features of Kaa's implementation that could be worth 
considering:

  * it is possible to yield a special object (called NotFinished) that
    allows a coroutine to "time slice" as a form of cooperative multitasking
  * coroutines can have certain policies that control invocation
    behaviour.  The most obvious ones to describe are
    POLICY_SYNCHRONIZED which ensures that multiple invocations of the
    same coroutine are serialized, and POLICY_SINGLETON which
    effectively ignores subsequent invocations if it's already running
  * it is possible to have a special progress object passed into the
    coroutine function so that the coroutine's progress can be
    communicated to an outside observer


Once you've standardized on a way to manage the lifecycle of an 
in-progress asynchronous task, threads are a natural extension:

     http://api.freevo.org/kaa-base/async/threads.html

The important element here is that @threaded decorated functions can be 
yielded by coroutines.  This means that truly blocking tasks can be 
wrapped in a thread but invocation from a coroutine is identical to any 
other coroutine.  Consequently, a threaded task could later be 
implemented as a coroutine (or more generally via event loop hooks) 
without any API changes.
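
A bare-bones sketch of that shape in stdlib terms (Kaa's actual
@threaded returns an InProgress; here a concurrent.futures.Future
stands in for it):

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)

def threaded(func):
    # The decorated function returns a future immediately rather than
    # blocking, so a coroutine scheduler can treat ("yield") it exactly
    # like any other in-progress task.
    def wrapper(*args, **kwargs):
        return _pool.submit(func, *args, **kwargs)
    return wrapper

@threaded
def blocking_add(a, b):
    return a + b  # imagine real blocking I/O here

fut = blocking_add(2, 3)
```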

I think I'll stop here.  There's plenty more definition, discussion, and 
examples in the links above.  Hopefully some ideas can be salvaged for 
PEP 3156, but even if that's not the case, I'll be happy to know they 
were considered and rejected rather than not considered at all.

Cheers,
Jason.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121215/802f41fe/attachment.html>

From guido at python.org  Sun Dec 16 06:37:15 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 15 Dec 2012 21:37:15 -0800
Subject: [Python-ideas] Late to the async party (PEP 3156)
In-Reply-To: <50CD2592.5010507@urandom.ca>
References: <50CD2592.5010507@urandom.ca>
Message-ID: <CAP7+vJJ0RbHS0fU1zPNAsLt83RDtyVmH1GeWFfNiUZva_t+5ow@mail.gmail.com>

Hi Jason,

I don't think you've missed anything. I had actually planned to keep
PEP 3156 unpublished for a bit longer, since I'm not done writing the
reference implementation -- I'm sure that many of the issues currently
marked open or TBD will be resolved that way. There hasn't been any
public discussion since the last threads on python-ideas some weeks
ago -- however I've met in person with some Twisted folks and
exchanged private emails with some other interested parties.

You've also correctly noticed that the PEP is weakest in the area of
cancellation (and timeouts aren't even mentioned in the current
draft). I'm glad you have some experience in this area, and I'll try
to study your solutions and suggestions in more detail soon.

For integration with threads, I'm thinking that the PEP currently has
the minimum needed with wrap_future() and run_in_executor() -- but
I'll read your link on threads and see what I may be missing.

(More later, but I don't want you to think you posted into a black hole!)

--Guido

On Sat, Dec 15, 2012 at 5:36 PM, Jason Tackaberry <tack at urandom.ca> wrote:
> Hi python-ideas,
>
> I've been somewhat living under a rock for the past few months and
> consequently I missed the ideal window of opportunity to weigh in on the
> async discussions this fall that culminated into PEP 3156.
>
> I've been reading through those discussions in the archives.  I've not
> finished digesting it all, and I'm somewhat torn in that I feel I should
> shut up until I read everything to date so as not to decrease the SNR, but
> on the other hand, knowing myself, I strongly suspect this would result in
> my never speaking up.  And so, at risk of lowering the SNR ...
>
> First let me say that PEP 3156 makes me very, very happy.
>
> Over the past few years I've been exploring these very ideas with a
> little-used library called Kaa.  I'm not offering it up as a paragon of
> proper async library design, but I wanted to share some of my experiences in
> case they could be useful to the PEP.
>
>     https://github.com/freevo/kaa-base/
>     http://api.freevo.org/kaa-base/
>
> It does seem like many similar design choices were made.  In particular, I'm
> happy that an explicit yield will be used rather than the greenlet style of
> implicit suspension/reentry.   Even after I've been using them for years,
> coroutines often feel like a form of magic, and an explicit yield is more
> aligned with the principle of least surprise.
>
> With Kaa, our future-style object is called an InProgress (so forgive the
> differing terminology in the remainder of this post):
>
>     http://api.freevo.org/kaa-base/async/inprogress.html
>
> A couple properties of InProgress objects that I've found have practical
> value:
>
> they can be aborted, which raises a special InProgressAborted inside the
> coroutine function so it can perform cleanup actions
>
> what makes this tricky is the question of what to do to any currently
> yielded tasks?  If A yields B and A is aborted, should B be aborted?  What
> if the same B task is being yielded by C?  Should C also be aborted, even if
> it's considered a sibling of A?  (For example, suppose B is a task that is
> refreshing some common cache that both A and C want to make sure is
> up-to-date before they move on.)
> if the decision is B should be aborted, then within A, 'yield B' will raise
> an exception because A is aborted, but 'yield B' within C will raise because
> B was aborted.  So there needs to be some mechanism to distinguish between
> these cases.  (My approach was to have an origin attribute on the
> exception.)
> if A yields B, it may want to prevent B from being aborted if A is aborted.
> (My approach was to have a noabort() method in InProgress objects to return
> a new, unabortable InProgress object that A can then yield.)
> alternatively, the saner implementation may be to do nothing to B when A is
> aborted and require A catch InProgressAborted and explicitly abort B if
> that's the desired behaviour
> discussion in the PEP on cancellation has some TBDs so perhaps the above
> will be food for thought
>
> they have a timeout() method, which returns a new InProgress object
> representing the task that will abort when the timeout elapses if the task
> doesn't finish
>
> it's noteworthy that timeout() returns a new InProgress and the original
> task continues on even if the timeout occurs -- by default that is, unless
> you do timeout(abort=True)
> I didn't see much discussion in the PEP on timeouts, but I think this is an
> important feature that should be standardized
>
>
> Coroutines in Kaa use "yield" rather than "yield from" but the general
> approach looks very similar to what's been proposed:
>
>     http://api.freevo.org/kaa-base/async/coroutines.html
>
> The @coroutine decorator causes the decorated function to return an
> InProgress.  Coroutines can of course yield other coroutines, but, more
> fundamentally, anything else that returns an InProgress object, which could
> be a @threaded function, or even an ordinary function that explicitly
> creates and returns an InProgress object.
>
> There are some features of Kaa's implementation that could be worth
> considering:
>
> it is possible to yield a special object (called NotFinished) that allows a
> coroutine to "time slice" as a form of cooperative multitasking
> coroutines can have certain policies that control invocation behaviour.  The
> most obvious ones to describe are POLICY_SYNCHRONIZED which ensures that
> multiple invocations of the same coroutine are serialized, and
> POLICY_SINGLETON which effectively ignores subsequent invocations if it's
> already running
> it is possible to have a special progress object passed into the coroutine
> function so that the coroutine's progress can be communicated to an outside
> observer
>
>
> Once you've standardized on a way to manage the lifecycle of an in-progress
> asynchronous task, threads are a natural extension:
>
>     http://api.freevo.org/kaa-base/async/threads.html
>
> The important element here is that @threaded decorated functions can be
> yielded by coroutines.  This means that truly blocking tasks can be wrapped
> in a thread but invocation from a coroutine is identical to any other
> coroutine.  Consequently, a threaded task could later be implemented as a
> coroutine (or more generally via event loop hooks) without any API changes.
>
> I think I'll stop here.  There's plenty more definition, discussion, and
> examples in the links above.  Hopefully some ideas can be salvaged for PEP
> 3156, but even if that's not the case, I'll be happy to know they were
> considered and rejected rather than not considered at all.
>
> Cheers,
> Jason.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (python.org/~guido)


From solipsis at pitrou.net  Sun Dec 16 11:16:02 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 16 Dec 2012 11:16:02 +0100
Subject: [Python-ideas] Late to the async party (PEP 3156)
References: <50CD2592.5010507@urandom.ca>
	<CAP7+vJJ0RbHS0fU1zPNAsLt83RDtyVmH1GeWFfNiUZva_t+5ow@mail.gmail.com>
Message-ID: <20121216111602.383ebf4d@pitrou.net>

On Sat, 15 Dec 2012 21:37:15 -0800
Guido van Rossum <guido at python.org> wrote:
> Hi Jason,
> 
> I don't think you've missed anything. I had actually planned to keep
> PEP 3156 unpublished for a bit longer, since I'm not done writing the
> reference implementation -- I'm sure that many of the issues currently
> marked open or TBD will be resolved that way. There hasn't been any
> public discussion since the last threads on python-ideas some weeks
> ago -- however I've met in person with some Twisted folks and
> exchanged private emails with some other interested parties.

For the record, have you looked at the pyuv API? It's rather nicely
orthogonal, although it lacks a way to stop the event loop.
https://pyuv.readthedocs.org/en

Regards

Antoine.




From eliben at gmail.com  Sun Dec 16 14:22:44 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Sun, 16 Dec 2012 05:22:44 -0800
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <kaafhm$b7n$1@ger.gmane.org>
References: <kaafhm$b7n$1@ger.gmane.org>
Message-ID: <CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>

On Wed, Dec 12, 2012 at 9:40 AM, Serhiy Storchaka <storchaka at gmail.com>wrote:

> What interface is better for specifying namedtuple field docstrings?
>
>     Point = namedtuple('Point', 'x y',
>                        doc='Point: 2-dimensional coordinate',
>                        field_docs=['abscissa', 'ordinate'])
>
> or
>
>     Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
>                        doc='Point: 2-dimensional coordinate')
>
> ?
>
>
This may be a good time to say that personally I always disliked
namedtuple's creation syntax. It is unpleasant in two respects:

1. You have to repeat the name
2. You have to specify the fields in a space-separated string

I wish there was an alternative of something like:

@namedtuple
class Point:
  x = 0
  y = 0
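
Such a decorator is straightforward to sketch on top of the existing
factory (namedtuple_class is a hypothetical name, and the sketch assumes
the class body's definition order is preserved):

```python
from collections import namedtuple

def namedtuple_class(cls):
    # Hypothetical decorator: build a namedtuple from the class body,
    # so the type name is never repeated as a string.
    fields = [name for name in vars(cls) if not name.startswith('_')]
    return namedtuple(cls.__name__, fields)

@namedtuple_class
class Point:
    x = 0
    y = 0

p = Point(1, 2)
```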

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121216/80cf1a35/attachment.html>

From eliben at gmail.com  Sun Dec 16 14:24:08 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Sun, 16 Dec 2012 05:24:08 -0800
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
Message-ID: <CAF-Rda_AssLvPM5i3LjGZttYQ4moTHY4QrnN0dqMQqYwRTe5gQ@mail.gmail.com>

This may be a good time to say that personally I always disliked
> namedtuple's creation syntax. It is unpleasant in two respects:
>
> 1. You have to repeat the name
> 2. You have to specify the fields in a space-separated string
>
> I wish there was an alternative of something like:
>
> @namedtuple
> class Point:
>   x = 0
>   y = 0
>
>
And to the point of Serhiy's original topic, with this syntax there would
be no need to invent yet another non-standard way to specify things like
docstrings.

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121216/d03a7704/attachment.html>

From vinay_sajip at yahoo.co.uk  Sun Dec 16 14:44:46 2012
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sun, 16 Dec 2012 13:44:46 +0000 (UTC)
Subject: [Python-ideas] Late to the async party (PEP 3156)
References: <50CD2592.5010507@urandom.ca>
	<CAP7+vJJ0RbHS0fU1zPNAsLt83RDtyVmH1GeWFfNiUZva_t+5ow@mail.gmail.com>
	<20121216111602.383ebf4d@pitrou.net>
Message-ID: <loom.20121216T144354-160@post.gmane.org>

Antoine Pitrou <solipsis at ...> writes:

> For the record, have you looked at the pyuv API? It's rather nicely
> orthogonal, although it lacks a way to stop the event loop.
> https://pyuv.readthedocs.org/en

That link gives a 404, but you can use

https://pyuv.readthedocs.org/en/latest/

Regards,

Vinay Sajip



From jsbueno at python.org.br  Sun Dec 16 15:06:03 2012
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Sun, 16 Dec 2012 12:06:03 -0200
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAF-Rda_AssLvPM5i3LjGZttYQ4moTHY4QrnN0dqMQqYwRTe5gQ@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
	<CAF-Rda_AssLvPM5i3LjGZttYQ4moTHY4QrnN0dqMQqYwRTe5gQ@mail.gmail.com>
Message-ID: <CAH0mxTTn=7CxZX97n9g6yUNv1DVEQQsYMV4po_AspEpm+LZx-A@mail.gmail.com>

On 16 December 2012 11:24, Eli Bendersky <eliben at gmail.com> wrote:

>
>
> This may be a good time to say that personally I always disliked
>> namedtuple's creation syntax. It is unpleasant in two respects:
>>
>> 1. You have to repeat the name
>> 2. You have to specify the fields in a space-separated string
>>
>> I wish there was an alternative of something like:
>>
>> @namedtuple
>> class Point:
>>   x = 0
>>   y = 0
>>
>>
> And to the point of Serhiy's original topic, with this syntax there would
> be no need to invent yet another non-standard way to specify things like
> docstrings.
>

While we are at it,
why not simply:

class Point(namedtuple):
   x = 0
   y = 0

?

>
> Eli
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>

From _ at lvh.cc  Sun Dec 16 15:39:10 2012
From: _ at lvh.cc (Laurens Van Houtven)
Date: Sun, 16 Dec 2012 15:39:10 +0100
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAH0mxTTn=7CxZX97n9g6yUNv1DVEQQsYMV4po_AspEpm+LZx-A@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
	<CAF-Rda_AssLvPM5i3LjGZttYQ4moTHY4QrnN0dqMQqYwRTe5gQ@mail.gmail.com>
	<CAH0mxTTn=7CxZX97n9g6yUNv1DVEQQsYMV4po_AspEpm+LZx-A@mail.gmail.com>
Message-ID: <CAE_Hg6bVB0KVjCdoMfnkHYkn1W3AEBK-9EpqeVFE=JQ5fubxsg@mail.gmail.com>

Err, can class bodies ever be order-sensitive? I was under the impression
names bound there work just like names bound anywhere...

Unless of course that magical decorator is secretly an AST hack, in which
case, yes, it can do whatever it wants :)


On Sun, Dec 16, 2012 at 3:06 PM, Joao S. O. Bueno <jsbueno at python.org.br>wrote:

>
>
> On 16 December 2012 11:24, Eli Bendersky <eliben at gmail.com> wrote:
>
>>
>>
>>  This may be a good time to say that personally I always disliked
>>> namedtuple's creation syntax. It is unpleasant in two respects:
>>>
>>> 1. You have to repeat the name
>>> 2. You have to specify the fields in a space-separated string
>>>
>>> I wish there was an alternative of something like:
>>>
>>> @namedtuple
>>> class Point:
>>>   x = 0
>>>   y = 0
>>>
>>>
>> And to the point of Serhiy's original topic, with this syntax there would
>> be no need to invent yet another non-standard way to specify things like
>> docstrings.
>>
>
> While we are at it,
> why not simply:
>
> class Point(namedtuple):
>    x = 0
>    y = 0
>
> ?
>
>>
>> Eli
>>
>>
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>


-- 
cheers
lvh

From pyideas at rebertia.com  Sun Dec 16 15:49:12 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 16 Dec 2012 06:49:12 -0800
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAE_Hg6bVB0KVjCdoMfnkHYkn1W3AEBK-9EpqeVFE=JQ5fubxsg@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
	<CAF-Rda_AssLvPM5i3LjGZttYQ4moTHY4QrnN0dqMQqYwRTe5gQ@mail.gmail.com>
	<CAH0mxTTn=7CxZX97n9g6yUNv1DVEQQsYMV4po_AspEpm+LZx-A@mail.gmail.com>
	<CAE_Hg6bVB0KVjCdoMfnkHYkn1W3AEBK-9EpqeVFE=JQ5fubxsg@mail.gmail.com>
Message-ID: <CAMZYqRQ8vo4BUj2P1opKKpC9XN-inNQMCrMqU+QtdSjo-TEBvw@mail.gmail.com>

> On Sun, Dec 16, 2012 at 3:06 PM, Joao S. O. Bueno <jsbueno at python.org.br>
> wrote:
>> On 16 December 2012 11:24, Eli Bendersky <eliben at gmail.com> wrote:
>>>> This may be a good time to say that personally I always disliked
>>>> namedtuple's creation syntax. It is unpleasant in two respects:
>>>>
>>>> 1. You have to repeat the name
>>>> 2. You have to specify the fields in a space-separated string
>>>>
>>>> I wish there was an alternative of something like:
>>>>
>>>> @namedtuple
>>>> class Point:
>>>>   x = 0
>>>>   y = 0
>>>>
>>>
>>> And to the point of Serhiy's original topic, with this syntax there would
>>> be no need to invent yet another non-standard way to specify things like
>>> docstrings.
>>
>>
>> While we are at it,
>> why not simply:
>>
>> class Point(namedtuple):
>>    x = 0
>>    y = 0
>>
>> ?
>>
On Sun, Dec 16, 2012 at 6:39 AM, Laurens Van Houtven <_ at lvh.cc> wrote:
> Err, can class bodies ever be order-sensitive?

Yep. You just have to define a metaclass with a __prepare__() that
returns an OrderedDict (or similar).

http://docs.python.org/3.4/reference/datamodel.html#preparing-the-class-namespace
http://docs.python.org/2/library/collections.html#ordereddict-objects
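A minimal illustration of the mechanism Chris describes (`OrderedMeta` and `_field_order` are illustrative names):

```python
from collections import OrderedDict

class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases):
        # The class body executes with this mapping as its namespace,
        # so assignment order is preserved.
        return OrderedDict()

    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, dict(ns))
        cls._field_order = [k for k in ns if not k.startswith('_')]
        return cls

class Point(metaclass=OrderedMeta):
    y = 0   # deliberately not alphabetical
    x = 0

# Point._field_order is ['y', 'x'] -- definition order, not sorted
```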

Cheers,
Chris
--
http://rebertia.com


From solipsis at pitrou.net  Sun Dec 16 15:52:07 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 16 Dec 2012 15:52:07 +0100
Subject: [Python-ideas] Docstrings for namedtuple
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
Message-ID: <20121216155207.072707c1@pitrou.net>

On Sun, 16 Dec 2012 05:22:44 -0800
Eli Bendersky <eliben at gmail.com> wrote:
> On Wed, Dec 12, 2012 at 9:40 AM, Serhiy Storchaka <storchaka at gmail.com>wrote:
> 
> > What interface is better for specifying namedtuple field docstrings?
> >
> >     Point = namedtuple('Point', 'x y',
> >                        doc='Point: 2-dimensional coordinate',
> >                        field_docs=['abscissa', 'ordinate'])
> >
> > or
> >
> >     Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
> >                        doc='Point: 2-dimensional coordinate')
> >
> > ?
> >
> >
> This may be a good time to say that personally I always disliked
> namedtuple's creation syntax. It is unpleasant in two respects:
> 
> 1. You have to repeat the name
> 2. You have to specify the fields in a space-separated string
> 
> I wish there was an alternative of something like:
> 
> @namedtuple
> class Point:
>   x = 0
>   y = 0

+1, this would be very nice. It would also allow default values as
shown above, which is a useful feature.

Regards

Antoine.




From vinay_sajip at yahoo.co.uk  Sun Dec 16 16:36:26 2012
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sun, 16 Dec 2012 15:36:26 +0000 (UTC)
Subject: [Python-ideas] Graph class
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
Message-ID: <loom.20121216T144624-97@post.gmane.org>

Mark Adam <dreamingforward at ...> writes:

> But your issue of proposing an actual implementation is well-taken
> rather than spend a lot of time arguing over it all.  With any luck,
> I'll try to distill networkx with my work and put it all together.

In terms of use cases, you might be interested in potential users of any stdlib
graph library. I'm working on distlib [1], which evolved out of distutils2 and
uses graphs in a couple of places:

1. A dependency graph for distributions. This came directly from distutils2,
   though I've added a couple of bits to it such as topological sorting and
   determination of strongly-connected components.
2. A lightweight sequencer for build steps, added to avoid the approach in
   distutils/distutils2 which makes it harder than necessary to handle custom
   build steps. I didn't use the graph system used in point 1, as it was too
   specific, and I haven't had time to look at refactoring it.

There's another potential use case in the area of packaging, though perhaps not
in distlib itself: the idea of generating build artifacts based on their
dependencies. Ideally, this would consider not only build artifacts and their
dependencies, but also the builders themselves as part of the graph.
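For a sense of what the dependency-graph piece involves, here is a minimal topological sort via Kahn's algorithm (an illustrative sketch, not distlib's actual code; the example graph is made up):

```python
from collections import deque

def topo_sort(deps):
    """deps maps each node to the set of nodes it depends on."""
    indegree = {n: len(d) for n, d in deps.items()}
    dependents = {n: [] for n in deps}
    for node, ds in deps.items():
        for d in ds:
            dependents[d].append(node)
    ready = deque(n for n, k in indegree.items() if k == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for m in dependents[node]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(deps):
        raise ValueError("dependency cycle detected")
    return order

# dependencies are listed before the things that need them
order = topo_sort({"app": {"lib"}, "lib": {"base"}, "base": set()})
```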

Regards,

Vinay Sajip

[1] https://distlib.readthedocs.org/en/latest/




From guido at python.org  Sun Dec 16 16:39:14 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Dec 2012 07:39:14 -0800
Subject: [Python-ideas] Late to the async party (PEP 3156)
In-Reply-To: <loom.20121216T144354-160@post.gmane.org>
References: <50CD2592.5010507@urandom.ca>
	<CAP7+vJJ0RbHS0fU1zPNAsLt83RDtyVmH1GeWFfNiUZva_t+5ow@mail.gmail.com>
	<20121216111602.383ebf4d@pitrou.net>
	<loom.20121216T144354-160@post.gmane.org>
Message-ID: <CAP7+vJKp79Aq8dsnazbXVqfbQY4s65mr8F4GZyx2mWTXbNi5aA@mail.gmail.com>

I have to ask someone who has experience with libuv to comment on my PEP --
those docs are very low level and don't explain how things work together or
why features are needed.

I also have to explain my goals and motivations. But not now.

--Guido

On Sunday, December 16, 2012, Vinay Sajip wrote:

> Antoine Pitrou <solipsis at ...> writes:
>
> > For the record, have you looked at the pyuv API? It's rather nicely
> > orthogonal, although it lacks a way to stop the event loop.
> > https://pyuv.readthedocs.org/en
>
> That link gives a 404, but you can use
>
> https://pyuv.readthedocs.org/en/latest/
>
> Regards,
>
> Vinay Sajip
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
--Guido van Rossum (on iPad)

From guido at python.org  Sun Dec 16 16:41:07 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Dec 2012 07:41:07 -0800
Subject: [Python-ideas] Graph class
In-Reply-To: <loom.20121216T144624-97@post.gmane.org>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
Message-ID: <CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>

I think of graphs and trees as patterns, not data structures.



-- 
--Guido van Rossum (on iPad)

From guido at python.org  Sun Dec 16 17:27:53 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Dec 2012 08:27:53 -0800
Subject: [Python-ideas] Late to the async party (PEP 3156)
In-Reply-To: <50CD2592.5010507@urandom.ca>
References: <50CD2592.5010507@urandom.ca>
Message-ID: <CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com>

On Sat, Dec 15, 2012 at 5:36 PM, Jason Tackaberry <tack at urandom.ca> wrote:

> With Kaa, our future-style object is called an InProgress (so forgive the
> differing terminology in the remainder of this post):
>
>     http://api.freevo.org/kaa-base/async/inprogress.html
>
> A couple properties of InProgress objects that I've found have practical
> value:
>
>    - they can be aborted, which raises a special InProgressAborted inside
>    the coroutine function so it can perform cleanup actions
>       - what makes this tricky is the question of what to do to any
>       currently yielded tasks?  If A yields B and A is aborted, should B be
>       aborted?  What if the same B task is being yielded by C?  Should C also be
>       aborted, even if it's considered a sibling of A?  (For example, suppose B
>       is a task that is refreshing some common cache that both A and C want to
>       make sure is up-to-date before they move on.)
>       - if the decision is B should be aborted, then within A, 'yield B'
>       will raise an exception because A is aborted, but 'yield B' within C will
>       raise because B was aborted.  So there needs to be some mechanism to
>       distinguish between these cases.  (My approach was to have an origin
>       attribute on the exception.)
>        - if A yields B, it may want to prevent B from being aborted if A
>       is aborted.  (My approach was to have a noabort() method in InProgress
>       objects to return a new, unabortable InProgress object that A can then
>       yield.)
>       - alternatively, the saner implementation may be to do nothing to B
>       when A is aborted and require A catch InProgressAborted and explicitly
>       abort B if that's the desired behaviour
>       - discussion in the PEP on cancellation has some TBDs so perhaps
>       the above will be food for thought
>
> The PEP is definitely weak. Here are some thoughts/proposals though:

   - You can't cancel a coroutine; however you can cancel a Task, which is
   a Future wrapping a stack of coroutines linked via yield-from.
   - Cancellation only takes effect when a task is suspended.
   - When you cancel a Task, the most deeply nested coroutine (the one that
   caused it to be suspended) receives a special exception (I propose to reuse
   concurrent.futures.CancelledError from PEP 3148). If it doesn't catch this
   it bubbles all the way to the Task, and then out from there.
   - However when a coroutine in one Task uses yield-from to wait for
   another Task, the latter does not automatically get cancelled. So this is a
   difference between "yield from foo()" and "yield from Task(foo())", which
   otherwise behave pretty similarly. Of course the first Task could catch the
   exception and cancel the second task -- that is its responsibility though
   and not the default behavior.
   - PEP 3156 has a par() helper which lets you block for multiple
   tasks/coroutines in parallel. It takes arguments which are either
   coroutines, Tasks, or other Futures; it wraps the coroutines in Tasks to
   run them independently and just waits for the other arguments. Proposal:
   when the Task containing the par() call is cancelled, the par() call
   intercepts the cancellation and by default cancels those coroutines that
   were passed in "bare" but not the arguments that were passed in as Tasks or
   Futures. Some keyword argument to par() may be used to change this behavior
   to "cancel none" or "cancel all" (exact API spec TBD).



>    - they have a timeout() method, which returns a new InProgress object
>    representing the task that will abort when the timeout elapses if the task
>    doesn't finish
>       - it's noteworthy that timeout() returns a *new* InProgress and the
>       original task continues on even if the timeout occurs -- by default that
>       is, unless you do timeout(abort=True)
>        - I didn't see much discussion in the PEP on timeouts, but I think
>       this is an important feature that should be standardized
>
> Interesting. In Tulip v1 (the experimental version I wrote before PEP
3156) the Task() constructor has an optional timeout argument. It works by
scheduling a callback at the given time in the future, and the callback
simply cancels the task (which is a no-op if the task has already
completed). It works okay, except it generates tracebacks that are
sometimes logged and sometimes not properly caught -- though some of that
may be my messy test code. The exception raised by a timeout is the same
CancelledError, which is somewhat confusing. I wonder if Task.cancel()
shouldn't take an exception with which to cancel the task.
(TimeoutError in PEP 3148 has a different role, it is when the timeout on a
specific wait expires, so e.g. fut.result(timeout=2) waits up to 2 seconds
for fut to complete, and if not, the call raises TimeoutError, but the code
running in the executor is unaffected.)
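The PEP 3148 behaviour described in that parenthetical is easy to demonstrate: only the wait times out, while the work in the executor carries on.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

with ThreadPoolExecutor(max_workers=1) as pool:
    fut = pool.submit(time.sleep, 0.5)   # blocking work in the executor
    timed_out = False
    try:
        fut.result(timeout=0.01)         # only this *wait* times out
    except TimeoutError:
        timed_out = True
    still_running = not fut.done()       # the sleep itself is unaffected
```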



>
> Coroutines in Kaa use "yield" rather than "yield from" but the general
> approach looks very similar to what's been proposed:
>
>     http://api.freevo.org/kaa-base/async/coroutines.html
>
> The @coroutine decorator causes the decorated function to return an
> InProgress.  Coroutines can of course yield other coroutines, but, more
> fundamentally, anything else that returns an InProgress object, which could
> be a @threaded function, or even an ordinary function that explicitly
> creates and returns an InProgress object.
>

We've had long discussions about yield vs. yield-from. The latter is way
more efficient and that's enough for me to push it through. When using
yield, each yield causes you to bounce to the scheduler, which has to do a
lot of work to decide what to do next, even if that is just resuming the
suspended generator; and the scheduler is responsible for keeping track of
the stack of generators. When using yield-from, calling another coroutine
as a subroutine is almost free and doesn't involve the scheduler at all;
thus it's much cheaper, and the scheduler can be simpler (doesn't need to
keep track of the stack). Also stack traces and debugging are better.
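The difference is easy to see with bare generators: with yield-from, a value yielded in a nested frame surfaces directly to the outermost caller, and resuming goes straight back down, with no per-level scheduling in between.

```python
def leaf():
    data = yield "need-io"      # suspension surfaces directly to the caller
    return data * 2

def mid():
    result = yield from leaf()  # subroutine call: no scheduler bounce here
    return result + 1

g = mid()
request = next(g)               # "need-io" came straight from leaf()
try:
    g.send(10)                  # resumes leaf() directly
except StopIteration as e:
    final = e.value             # 10 * 2 + 1
```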


>
> There are some features of Kaa's implementation that could be worth
> considering:
>
>    - it is possible to yield a special object (called NotFinished) that
>    allows a coroutine to "time slice" as a form of cooperative multitasking
>
> I can recommend yield from tulip.sleep(0) for that.


>
>    - coroutines can have certain policies that control invocation
>    behaviour.  The most obvious ones to describe are POLICY_SYNCHRONIZED which
>    ensures that multiple invocations of the same coroutine are serialized, and
>    POLICY_SINGLETON which effectively ignores subsequent invocations if it's
>    already running
>    - it is possible to have a special progress object passed into the
>    coroutine function so that the coroutine's progress can be communicated to
>    an outside observer
>
>
These seem pretty esoteric and can probably be implemented in user code if
needed.


>
>
>
> Once you've standardized on a way to manage the lifecycle of an
> in-progress asynchronous task, threads are a natural extension:
>
>     http://api.freevo.org/kaa-base/async/threads.html
>
> The important element here is that @threaded decorated functions can be
> yielded by coroutines.  This means that truly blocking tasks can be wrapped
> in a thread but invocation from a coroutine is identical to any other
> coroutine.  Consequently, a threaded task could later be implemented as a
> coroutine (or more generally via event loop hooks) without any API changes.
>

As I said, I think wait_for_future() and run_in_executor() in the PEP give
you all you need. The @threaded decorator you propose is just sugar; if a
user wants to take an existing API and convert it from a coroutine to
threaded without requiring changes to the caller, they can just introduce a
helper that is run in a thread with run_in_executor().


> I think I'll stop here.  There's plenty more definition, discussion, and
> examples in the links above.  Hopefully some ideas can be salvaged for PEP
> 3156, but even if that's not the case, I'll be happy to know they were
> considered and rejected rather than not considered at all.
>

Thanks for your very useful contribution! Kaa looks like an interesting
system. Is it ported to Python 3 yet? Maybe you could look into integrating
with the PEP 3156 event loop and/or scheduler.

-- 
--Guido van Rossum (python.org/~guido)

From tack at urandom.ca  Sun Dec 16 20:11:48 2012
From: tack at urandom.ca (Jason Tackaberry)
Date: Sun, 16 Dec 2012 14:11:48 -0500
Subject: [Python-ideas] Late to the async party (PEP 3156)
In-Reply-To: <CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com>
References: <50CD2592.5010507@urandom.ca>
	<CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com>
Message-ID: <50CE1CF4.4080704@urandom.ca>

On 12-12-16 11:27 AM, Guido van Rossum wrote:
> The PEP is definitely weak. Here are some thoughts/proposals though:
>
>   * You can't cancel a coroutine; however you can cancel a Task, which
>     is a Future wrapping a stack of coroutines linked via yield-from.
>

I'll just underline your statement that "you can't cancel a coroutine" 
here, since I'm referencing it later.

This distinction between "bare" coroutines, Futures, and Tasks is a bit 
foreign to me, since in Kaa all coroutines return (a subclass of) 
InProgress objects.

The Tasks section in the PEP says that a bare coroutine (is this the 
same as the previously defined "coroutine object"?) has much less 
overhead than a Task but it's not clear to me why that would be, as both 
would ultimately need to be managed by the scheduler, wouldn't they?

I could imagine that a coroutine object is implemented as a C object for 
performance, and a Task is a Python class, and maybe that explains the 
difference.  But then why differentiate between Future and Task 
(particularly because they have the same interface, so I can't draw an 
analogy with jQuery's Deferreds and Promises, where Promises are a 
restricted form of Deferreds for public consumption to attach callbacks).


>   * Cancellation only takes effect when a task is suspended.
>

Yes, this is intuitive.


>   * When you cancel a Task, the most deeply nested coroutine (the one
>     that caused it to be suspended) receives a special exception (I
>     propose to reuse concurrent.futures.CancelledError from PEP 3148).
>     If it doesn't catch this it bubbles all the way to the Task, and
>     then out from there.
>

So if the most deeply nested coroutine catches the CancelledError and 
doesn't reraise, it can prevent its cancellation?

I took a similar approach, except that coroutines can't abort their own 
cancellation, and whether or not the nested coroutines actually get 
cancelled depends on whether something else was interested in their result.

Consider a coroutine chain where A yields B yields C yields D, and we do 
B.abort()

  * if only C was interested in D's result, then D will get an
    InProgressAborted raised inside it (at whatever point it's currently
    suspended).  If something other than C was also waiting on D, D will
    not be affected
  * similarly, if only B was interested in C's result, then C will get
    an InProgressAborted raised inside it (at yield D).
  * B will get InProgressAborted raised inside it (at yield C)
  * for B, C and D, the coroutines will not be reentered and they are
    not allowed to yield a value that suggests they expect reentry. 
    There's nothing a coroutine can do to prevent its own demise.
  * A will get an InProgressAborted raised inside it (at yield B)
  * In all the above cases, the InProgressAborted instance has an origin
    attribute that is B's InProgress object
  * Although B, C, and D are now aborted, A isn't aborted.  It's allowed
    to yield again.
  * with Kaa, coroutines are abortable by default (so they are like
    Tasks always).  But in this example, B can prevent C from being
    aborted by yielding C().noabort()


There are quite a few scenarios to consider: A yields B and B is 
cancelled or raises; A yields B and A is cancelled or raises; A yields 
B, C yields B, and A is cancelled or raises; A yields B, C yields B, and 
A or C is cancelled or raises; A yields par(B,C,D) and B is cancelled or 
raises; etc, etc.

In my experience, there's no one-size-fits-all behaviour, and the best 
we can do is have sensible default behaviour with some API (different 
functions, kwargs, etc.) to control the cancellation propagation logic.


>   * However when a coroutine in one Task uses yield-from to wait for
>     another Task, the latter does not automatically get cancelled. So
>     this is a difference between "yield from foo()" and "yield from
>     Task(foo())", which otherwise behave pretty similarly. Of course
>     the first Task could catch the exception and cancel the second
>     task -- that is its responsibility though and not the default
>     behavior.
>

Ok, so nested bare coroutines will get cancelled implicitly, but nested 
Tasks won't?

I'm having a bit of difficulty with this one.  You said that coroutines 
can't be cancelled, but Tasks can be.  But here, if they are being 
yielded, the opposite behaviour applies: yielded coroutines /are/ 
cancelled if a Task is cancelled, but yielded tasks /aren't/.

Or have I misunderstood?


>   * PEP 3156 has a par() helper which lets you block for multiple
>     tasks/coroutines in parallel. It takes arguments which are either
>     coroutines, Tasks, or other Futures; it wraps the coroutines in
>     Tasks to run them independently an just waits for the other
>     arguments. Proposal: when the Task containing the par() call is
>     cancelled, the par() call intercepts the cancellation and by
>     default cancels those coroutines that were passed in "bare" but
>     not the arguments that were passed in as Tasks or Futures. Some
>     keyword argument to par() may be used to change this behavior to
>     "cancel none" or "cancel all" (exact API spec TBD).
>

Here again, par() would cancel a bare coroutine but not Tasks.  It's 
consistent with your previous bullet but seems to contradict your first 
bullet that you can't cancel a coroutine.

I guess the distinction is you can't explicitly cancel a coroutine, but 
coroutines can be implicitly cancelled?

As I discussed previously, one of those tasks might be yielded by some 
other active coroutine, and so cancelling it may not be the right thing 
to do.  Being able to control this behaviour is important, whether 
that's a par() kwarg, or special method like noabort() that constructs 
an unabortable Task instance.

Kaa has similar constructs to allow yielding a collection of InProgress 
objects (whatever they might represent: coroutines, threaded functions, 
etc.).  In particular, it allows you to yield multiple tasks and resume 
when ALL of them complete (InProgressAll), or when ANY of them complete 
(InProgressAny).  For example:

     @kaa.coroutine()
     def is_any_host_up(*hosts):
         try:
             # ping() is a coroutine
             yield kaa.InProgressAny(ping(host) for host in hosts).timeout(5, abort=True)
         except kaa.TimeoutException:
             yield False
         else:
             yield True


More details here:

http://api.freevo.org/kaa-base/async/inprogress.html#inprogress-collections

From what I understand of the proposed par(), it would require ALL of 
the supplied futures to complete, but there are many use-cases for the 
ANY variant as well.
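For what it's worth, PEP 3148's thread-backed futures already expose both variants through concurrent.futures.wait(); a PEP 3156 par() could offer something analogous:

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def ping(host, delay):              # stand-in for a real network ping
    time.sleep(delay)
    return host

with ThreadPoolExecutor(max_workers=2) as pool:
    futs = [pool.submit(ping, "slow-host", 0.5),
            pool.submit(ping, "fast-host", 0.01)]
    # the ANY variant: resume as soon as one future finishes
    done, pending = wait(futs, return_when=FIRST_COMPLETED)
    first = next(iter(done)).result()
```

With `return_when=ALL_COMPLETED` (the default) the same call gives the ALL variant.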


> Interesting. In Tulip v1 (the experimental version I wrote before PEP 
> 3156) the Task() constructor has an optional timeout argument. It 
> works by scheduling a callback at the given time in the future, and 
> the callback simply cancels the task (which is a no-op if the task has 
> already completed). It works okay, except it generates tracebacks that 
> are sometimes logged and sometimes not properly caught -- though some 
> of that may be my messy test code. The exception raised by a timeout 
> is the same CancelledError, which is somewhat confusing. I wonder if 
> Task.cancel() shouldn't take an exception with which to cancel the 
> task. (TimeoutError in PEP 3148 has a different role, it is when 
> the timeout on a specific wait expires, so e.g. fut.result(timeout=2) 
> waits up to 2 seconds for fut to complete, and if not, the call raises 
> TimeoutError, but the code running in the executor is unaffected.)

FWIW, the equivalent in Kaa which is InProgress.abort() does take an 
optional exception, which must subclass InProgressAborted.  If None, a 
new InProgressAborted is created.  InProgress.timeout(t) will start a 
timer that invokes InProgress.abort(TimeoutException()) 
(TimeoutException subclasses InProgressAborted).

It sounds like your proposed implementation works like:

    @tulip.coroutine()
    def foo():
       try:
          result = yield from Task(othercoroutine()).result(timeout=2)
       except TimeoutError:
          pass  # ... othercoroutine() still lives on

I think Kaa's syntax is cleaner but it seems functionally the same:

    @kaa.coroutine()
    def foo():
       try:
          result = yield othercoroutine().timeout(2)
       except kaa.TimeoutException:
          pass  # ... othercoroutine() still lives on

It's also possible to conveniently ensure that othercoroutine() is 
aborted if the timeout elapses:

    try:
       result = yield othercoroutine().timeout(2, abort=True)
    except kaa.TimeoutException:
       pass  # ... othercoroutine() is aborted


> We've had long discussions about yield vs. yield-from. The latter is 
> way more efficient and that's enough for me to push it through. When 
> using yield, each yield causes you to bounce to the scheduler, which 
> has to do a lot of work to decide what to do next, even if that is 
> just resuming the suspended generator; and the scheduler is 
> responsible for keeping track of the stack of generators. When using 
> yield-from, calling another coroutine as a subroutine is almost free 
> and doesn't involve the scheduler at all; thus it's much cheaper, and 
> the scheduler can be simpler (doesn't need to keep track of the 
> stack). Also stack traces and debugging are better. 

But this sounds like a consequence of a particular implementation, doesn't it?

A @kaa.coroutine() decorated function is entered right away when 
invoked, and the decorator logic does as much as it can until the 
underlying generator yields an unfinished InProgress that needs to be 
waited for (or kaa.NotFinished).  Once it yields, /then/ the decorator sets up 
the necessary hooks with the scheduler / event loop.

This means you can nest a stack of coroutines without involving the 
scheduler until something truly asynchronous needs to take place.

Have I misunderstood?


>       * coroutines can have certain policies that control invocation
>         behaviour.  The most obvious ones to describe are
>         POLICY_SYNCHRONIZED which ensures that multiple invocations of
>         the same coroutine are serialized, and POLICY_SINGLETON which
>         effectively ignores subsequent invocations if it's already running
>       * it is possible to have a special progress object passed into
>         the coroutine function so that the coroutine's progress can be
>         communicated to an outside observer
>
>
> These seem pretty esoteric and can probably be implemented in user code 
> if needed.

I'm fine with that, provided the flexibility is there to allow for it.


> As I said, I think wait_for_future() and run_in_executor() in the PEP 
> give you all you need. The @threaded decorator you propose is just 
> sugar; if a user wants to take an existing API and convert it from a 
> coroutine to threaded without requiring changes to the caller, they 
> can just introduce a helper that is run in a thread with 
> run_in_executor().

Also works for me. :)


> Thanks for your very useful contribution! Kaa looks like an 
> interesting system. Is it ported to Python 3 yet? Maybe you could look 
> into integrating with the PEP 3156 event loop and/or scheduler.

Kaa does work with Python 3, yes, although it still lacks much-needed 
unit tests, so I'm not completely confident it has the same functional 
coverage as it does on Python 2.

I'm definitely interested in having it conform to whatever shakes out of 
PEP 3156, which is why I'm speaking up now. :)


I have a couple of other subjects I should bring up:

Tasks/Futures as "signals": it's often necessary to be able to resume a 
coroutine based on some condition other than e.g. any IO tasks it's 
waiting on.  For example, in one application, I have a 
(POLICY_SINGLETON) coroutine that works off a download queue.  If 
there's nothing in the queue, it's suspended at a yield.  It's the 
coroutine equivalent of a dedicated thread. [1]

It must be possible to "wake" the queue manager when I enqueue a job for 
it.  Kaa has this notion of "signals" which is similar to the gtk+ style 
of signals in that you can attach callbacks to them and emit them.  
Signals can be represented as InProgress objects, which means they can 
be yielded from coroutines and used in InProgressAny/All objects.

So my download manager coroutine can yield an InProgressAny of all the 
active download coroutines /and/ the "new job enqueued" signal, and 
execution will resume as soon as any of those conditions is met.

Is there anything in your current proposal that would allow for this 
use-case?

[1] https://github.com/jtackaberry/stagehand/blob/master/src/manager.py#L390
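(A toy illustration of the pattern: a gtk+-style signal whose emission
resumes a suspended worker generator.  The Signal/Manager classes below are
made up for the example and are not Kaa's InProgress/InProgressAny API.)

```python
# Sketch: wake a suspended queue-worker coroutine when a job is enqueued.
import collections

class Signal:
    """Minimal gtk+-style signal: attach callbacks, emit fires them once."""
    def __init__(self):
        self._callbacks = []
    def connect(self, cb):
        self._callbacks.append(cb)
    def emit(self):
        # Swap out the list first so callbacks can reconnect safely.
        cbs, self._callbacks = self._callbacks, []
        for cb in cbs:
            cb()

class Manager:
    def __init__(self):
        self.queue = collections.deque()
        self.enqueued = Signal()
        self.processed = []
        self._gen = self._worker()
        next(self._gen)  # run until the worker suspends on the signal

    def _worker(self):
        while True:
            while self.queue:
                self.processed.append(self.queue.popleft())
            # Queue drained: suspend until the 'enqueued' signal fires.
            self.enqueued.connect(lambda: self._gen.send(None))
            yield

    def add_job(self, job):
        self.queue.append(job)
        self.enqueued.emit()   # resumes the worker if it was suspended

m = Manager()
m.add_job('a')
m.add_job('b')
```

In Kaa the signal is itself yieldable (as an InProgress), so the worker can
wait on it together with other in-flight operations in one InProgressAny.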


Another pain point for me has been this notion of unhandled asynchronous 
exceptions.  Asynchronous tasks are represented as an InProgress object, 
and if a task fails, accessing InProgress.result will raise the 
exception, at which point it's considered handled. This attribute access 
could happen at any time during the lifetime of the InProgress object, 
outside the task's call stack.

The desirable behaviour is that when the InProgress object is destroyed, 
if there's an exception attached to it from a failed task that hasn't 
been accessed, we should output the stack as an unhandled exception.  In 
Kaa, I do this with a weakref destroy callback, but this isn't ideal 
because with GC, the InProgress might not be destroyed until well after 
the exception is relevant.

I make every effort to remove reference cycles and generally get the 
InProgress object destroyed as early as possible, but this changes 
subtly between Python versions.

How will unhandled asynchronous exceptions be handled with tulip?
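(For comparison, one pattern, roughly what asyncio later settled on for its
Futures, is to record whether the exception was ever retrieved and complain
at finalization.  The MiniFuture below is a toy sketch, not tulip's API, and
it also shows the GC wrinkle mentioned above: a raised exception's traceback
can keep the future alive in a cycle.)

```python
import gc

unhandled_log = []  # stand-in for a real logging/reporting call

class MiniFuture:
    def __init__(self):
        self._exc = None
        self._retrieved = False
    def set_exception(self, exc):
        self._exc = exc
    def result(self):
        if self._exc is not None:
            self._retrieved = True  # exception is now considered handled
            raise self._exc
    def __del__(self):
        # Nobody ever looked at the failure: report it at finalization.
        if self._exc is not None and not self._retrieved:
            unhandled_log.append(repr(self._exc))

f = MiniFuture()
f.set_exception(ValueError('boom'))
try:
    f.result()           # retrieved: considered handled, nothing reported
except ValueError:
    pass
del f
# Raising attached a traceback to the exception, whose frame references f:
# a reference cycle, so finalization may wait for a GC pass -- exactly the
# "destroyed well after the exception is relevant" problem described above.
gc.collect()

g = MiniFuture()
g.set_exception(ValueError('lost'))
del g                    # never retrieved: reported immediately (no cycle)
```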

Thanks!
Jason.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121216/53d2fc32/attachment.html>

From tjreedy at udel.edu  Sun Dec 16 21:05:36 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 16 Dec 2012 15:05:36 -0500
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
Message-ID: <kal9if$70i$1@ger.gmane.org>

On 12/16/2012 8:22 AM, Eli Bendersky wrote:

> This may be a good time to say that personally I always disliked
> namedtuple's creation syntax. It is unpleasant in two respects:
>
> 1. You have to repeat the name
> 2. You have to specify the fields in a space-separated string
>
> I wish there was an alternative of something like:
>
> @namedtuple
> class Point:
>    x = 0
>    y = 0

Pretty easy, once one figures out metaclass basics.

import collections as co

class ntmeta():
     def __prepare__(name, bases, **kwds):
         return co.OrderedDict()
     def __new__(cls, name, bases, namespace):
         print(namespace) # shows why filter is needed
         return co.namedtuple(name,
                 filter(lambda s: s[0] != '_', namespace))

class Point(metaclass=ntmeta):
     x = 0
     y = 0

p = Point(1,2)
print(p)
#
OrderedDict([('__module__', '__main__'), ('__qualname__', 'Point'), 
('x', 0), ('y', 0)])
Point(x=1, y=2)

To use the filtered namespace values as defaults (Antoine's suggestion), 
first replace namedtuple() with its body.
Then modify the header of generated name.__new__. For Point, change

def __new__(_cls, x, y):
#to
def __new__(_cls, x=0, y=0):

Also change the newclass docstring. For Point, change
     'Point(x, y)'
to
     'Point(x=0, y=0)'
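(A lighter alternative to copying in namedtuple's body, offered as a sketch:
assign `__defaults__` on the generated class's `__new__`, which makes the
class-body values act as defaults without editing any generated source.
The docstring would still need updating separately.)

```python
import collections as co

class ntmeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # Preserve definition order of the class-body attributes.
        return co.OrderedDict()
    def __new__(mcls, name, bases, namespace):
        fields = [k for k in namespace if not k.startswith('_')]
        nt = co.namedtuple(name, fields)
        # Class-body values become defaults for the generated __new__.
        nt.__new__.__defaults__ = tuple(namespace[k] for k in fields)
        return nt

class Point(metaclass=ntmeta):
    x = 0
    y = 0
```

With this, Point(), Point(1, 2), and Point(y=5) all work as expected.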

-- 
Terry Jan Reedy



From timothy.c.delaney at gmail.com  Sun Dec 16 22:08:18 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Mon, 17 Dec 2012 08:08:18 +1100
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <kal9if$70i$1@ger.gmane.org>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
	<kal9if$70i$1@ger.gmane.org>
Message-ID: <CAN8CLgkVoWS-OX9vpLRxWoGV9XJ0=cCcwzQKw2LySEhtcxsM=A@mail.gmail.com>

It can be made a bit more intelligent. I haven't done anything with
docstrings here, but it wouldn't be hard to add. This automatically handles
defaults (you can call the namedtuple with either zero parameters or the
exact number). You can specify __rename__ = True, which will then only
exclude __dunder_names__ (otherwise all names starting with an underscore
are excluded). You can also pass verbose=[True|False] to the subclass
constructor.

import collections

class NamedTupleMetaClass(type):
    # The prepare function
    @classmethod
    def __prepare__(metacls, name, bases): # No keywords in this case
        return collections.OrderedDict()

    # The metaclass invocation
    def __new__(cls, name, bases, classdict):
        fields = collections.OrderedDict()
        rename = False
        verbose = False

        for f in classdict:
            if f == '__rename__':
                rename = classdict[f]
            elif f == '__verbose__':
                verbose = classdict[f]

        for f in classdict:
            if f.startswith('_'):
                if not rename:
                    continue

                if f.startswith('__') and f.endswith('__'):
                    continue

            fields[f] = classdict[f]

        result = type.__new__(cls, name, bases, classdict)
        result.fields = fields
        result.rename = rename
        result.verbose = verbose
        return result

class NamedTuple(metaclass=NamedTupleMetaClass):
    def __new__(cls, *p, **kw):
        print(p)
        if not p:
            p = cls.fields.values()

        try:
            verbose = kw['verbose']
        except KeyError:
            verbose = cls.verbose

        return collections.namedtuple(cls.__name__, list(cls.fields),
                                      rename=cls.rename, verbose=verbose)(*p)

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64
bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import namedtuple_baseclass
>>> class Point(namedtuple_baseclass.NamedTuple):
...     x = 0
...     y = 0
...
>>> print(Point())
Point(x=0, y=0)
>>> print(Point(1, 2))
Point(x=1, y=2)
>>> print(Point(1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".\namedtuple_baseclass.py", line 38, in __new__
    return collections.namedtuple(cls.__name__, list(cls.fields),
rename=cls.rename, verbose=cls.verbose)(*p)
TypeError: __new__() missing 1 required positional argument: 'y'
>>> print(Point(1, 2, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".\namedtuple_baseclass.py", line 38, in __new__
    return collections.namedtuple(cls.__name__, list(cls.fields),
rename=cls.rename, verbose=cls.verbose)(*p)
TypeError: __new__() takes 3 positional arguments but 4 were given
>>>

Tim Delaney



From timothy.c.delaney at gmail.com  Sun Dec 16 22:09:21 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Mon, 17 Dec 2012 08:09:21 +1100
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAN8CLgkVoWS-OX9vpLRxWoGV9XJ0=cCcwzQKw2LySEhtcxsM=A@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
	<kal9if$70i$1@ger.gmane.org>
	<CAN8CLgkVoWS-OX9vpLRxWoGV9XJ0=cCcwzQKw2LySEhtcxsM=A@mail.gmail.com>
Message-ID: <CAN8CLgmco1opghFTpeYcYNLVftNAEXFMJAiucQUzDLj16NLh9A@mail.gmail.com>

And ignore that extra debugging print in there ;)

class NamedTuple(metaclass=NamedTupleMetaClass):
    def __new__(cls, *p, **kw):
        if not p:
            p = cls.fields.values()

        try:
            verbose = kw['verbose']
        except KeyError:
            verbose = cls.verbose

        return collections.namedtuple(cls.__name__, list(cls.fields),
                                      rename=cls.rename, verbose=verbose)(*p)

Tim Delaney



From timothy.c.delaney at gmail.com  Sun Dec 16 22:21:39 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Mon, 17 Dec 2012 08:21:39 +1100
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAN8CLgmco1opghFTpeYcYNLVftNAEXFMJAiucQUzDLj16NLh9A@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
	<kal9if$70i$1@ger.gmane.org>
	<CAN8CLgkVoWS-OX9vpLRxWoGV9XJ0=cCcwzQKw2LySEhtcxsM=A@mail.gmail.com>
	<CAN8CLgmco1opghFTpeYcYNLVftNAEXFMJAiucQUzDLj16NLh9A@mail.gmail.com>
Message-ID: <CAN8CLgk_1gpMi4NSxjJDv+XTGeKkgpx5zSBHUFBjNbD-pw7gpQ@mail.gmail.com>

An improvement would be to cache the namedtuple types so that each only
gets created once.

Tim Delaney



From timothy.c.delaney at gmail.com  Sun Dec 16 22:55:10 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Mon, 17 Dec 2012 08:55:10 +1100
Subject: [Python-ideas] Docstrings for namedtuple
In-Reply-To: <CAN8CLgk_1gpMi4NSxjJDv+XTGeKkgpx5zSBHUFBjNbD-pw7gpQ@mail.gmail.com>
References: <kaafhm$b7n$1@ger.gmane.org>
	<CAF-Rda_HQTehs__49JjMh6UvFDb7mLoCmwFG3+Uz_WhoV7A5AA@mail.gmail.com>
	<kal9if$70i$1@ger.gmane.org>
	<CAN8CLgkVoWS-OX9vpLRxWoGV9XJ0=cCcwzQKw2LySEhtcxsM=A@mail.gmail.com>
	<CAN8CLgmco1opghFTpeYcYNLVftNAEXFMJAiucQUzDLj16NLh9A@mail.gmail.com>
	<CAN8CLgk_1gpMi4NSxjJDv+XTGeKkgpx5zSBHUFBjNbD-pw7gpQ@mail.gmail.com>
Message-ID: <CAN8CLgmjzOXcYM3ChPCO6WQAUj5pJD7PDUT64s=X7aA5xRySYQ@mail.gmail.com>

Improved version, with caching (verbose and non-verbose versions are
different classes) and only parsing the fields once per class.

import collections

class NamedTupleMetaClass(type):
    # The prepare function
    @classmethod
    def __prepare__(metacls, name, bases): # No keywords in this case
        return collections.OrderedDict()

    # The metaclass invocation
    def __new__(cls, name, bases, classdict):
        result = type.__new__(cls, name, bases, classdict)
        result._classdict = classdict
        return result

class NamedTuple(metaclass=NamedTupleMetaClass):
    _cache = {}

    def __new__(cls, *p, **kw):
        verbose = False

        try:
            verbose = kw_verbose = kw['verbose']
        except KeyError:
            kw_verbose = None

        try:
            nt, fields = cls._cache[cls.__module__, cls.__qualname__, verbose]
        except KeyError:
            classdict = cls._classdict
            fields = collections.OrderedDict()
            rename = False

            for f in classdict:
                if f == '__rename__':
                    rename = classdict[f]
                elif f == '__verbose__':
                    verbose = classdict[f]

            for f in classdict:
                if f.startswith('_'):
                    if not rename:
                        continue

                    if f.startswith('__') and f.endswith('__'):
                        continue

                fields[f] = classdict[f]

            if kw_verbose is not None:
                verbose = kw_verbose

            nt = collections.namedtuple(cls.__name__, fields.keys(),
                                        rename=rename, verbose=verbose)
            fields = list(fields.values())
            cls._cache[cls.__module__, cls.__qualname__, verbose] = nt, fields

        if not p:
            p = fields

        return nt(*p)

Tim Delaney



From oscar.j.benjamin at gmail.com  Sun Dec 16 23:41:28 2012
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Sun, 16 Dec 2012 22:41:28 +0000
Subject: [Python-ideas] Fwd: Graph class
In-Reply-To: <CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
Message-ID: <CAHVvXxRYAHJkBBnxXkdqo9yKpoOP3hz_UpgmTUb+Y6aoqxbYtw@mail.gmail.com>

On 9 December 2012 20:31, Mark Adam <dreamingforward at gmail.com> wrote:
>
> On Sun, Dec 9, 2012 at 5:40 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> > On 9 December 2012 01:29, Mark Adam <dreamingforward at gmail.com> wrote:
> >> All very interesting.  I'm going to suggest a sort of "meta-discussion"
> >> about why -- despite the power of graphs as a data structure -- such a
> >> feature has not stabilized into a workable solution for inclusion in a
> >> high-level language like Python.
> >>
> >> I identify the following points of "wavery":
> >>
> >> 1) the naming of methods (add_edge, vs add(1,2)):  aesthetic grounds,
> >> 2) what methods to include (degree + neighbors or the standard dict's
> >> __len__ + __getitem__):  API grounds
> >> 3) how much flexibility to be offered (directed, multi-graphs, edge weights
> >> with arbitrary labeling, etc.):  functionality grounds
> >> 4) what underlying data structure to use (sparse adjacency dicts, matrices,
> >> etc):  representation conflicts.

For all the reasons above I don't much see the utility of implementing
some kind of standard graph class. There are too many possibilities
for any one implementation to be generally applicable. I have
implemented graphs in Python many times and I very often find that the
detail of what I want to do leads me to create a different
implementation. Another consideration that you've not mentioned is the
occasional need for an OrderedGraph that keeps track of some kind of
order for its vertices.

> > 4) Whether the library requires some sort of "Vertex" type, or works
> > with arbitrary values, similarly whether there is a defined "Edge"
> > class or edges can be labelled, weighted, etc with arbitrary Python
> > values.
>
> This I put under #3 (functionality grounds) "edge weights with
> arbitrary labeling"; vertices with arbitrary values I think would be
> included.

Having implemented graphs a few times now, I have come to the
conclusion that it is a good idea to make the restriction that the
vertices should be hashable. Otherwise, how would you get O(1)
behaviour for methods like has_edge()?

> > 5) Granularity - if all I want is a depth-first search algorithm, why
> > pull in a dependency on 100 graph algorithms I'm not interested in?
>
> Hmm, I would call this "5) comprehensiveness: whether to include every
> graph algorithm known to mankind."

This is the one part of a graph library that is really useful.
Creating a class or a data structure that represents a graph in some
way is trivially easy. Creating trustworthy implementations of all the
graph-theoretic algorithms with the right kind of big-O behaviour is
not.

> > My feeling is that graphs are right on the borderline of a data
> > structure that is simple enough that people invent their own rather
> > than bother conforming to a "standard" model but complex enough that
> > it's worth using library functions rather than getting the details
> > wrong.

What details would you get wrong? I contend that it is very easy to
implement a graph without getting any of the details wrong. It is the
graph algorithms that are hard, not the data structure. Here are a
couple of examples:

# A directed graph as a dict mapping each vertex to its set of successors:
G = {
    'A':{'B', 'C'},
    'B':{'A'},
    'C':{'B'}
}

# The same graph as an adjacency matrix (M[i][j] == 1 means an edge i -> j,
# with A, B, C numbered 0, 1, 2):
M = [[0, 1, 1],
     [1, 0, 0],
     [0, 1, 0]]

You may want to wrap the above in some kind of class in which case
you'll end up with something like the following (from a private
project - modified a little before posting so it may not work now):

from collections import defaultdict

class Graph:

    def __init__(self, nodes, edges):
        self._nodes = frozenset(nodes)
        self._edges = defaultdict(set)
        for n1, n2 in edges:
            self._edges[n1].add(n2)

    @property
    def nodes(self):
        return iter(self._nodes)

    @property
    def edges(self):
        for n1 in self._nodes:
            for n2 in self.edges_node(n1):
                yield (n1, n2)

    def edges_node(self, node):
        return iter(self._edges[node])

    def has_edge(self, nfrom, nto):
        return nto in self._edges[nfrom]

    def __str__(self):
        return '\n'.join(self._iterdot())

    def _iterdot(self):
        yield 'digraph G {'
        for n in self.nodes:
            yield '    %s;' % n
        for nfrom, nto in self.edges:
            yield '    %s -> %s;' % (nfrom, nto)
        yield '}'

G2 = Graph('ABC', [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'A')])

The above class is unusual in the sense that it is a pure Graph class.
Normally I would simply be adding a few graphy methods onto a class
that represents a network of some kind.

The problem that I found with current support for graphs in Python is
not the lack of an appropriate data structure. Rather the problem is
that implementations of graph-theoretic algorithms (as in e.g.
pygraph) are tied to a specific Graph class that I didn't want to or
couldn't use in my own project. This means that to determine if you
have something that represents a strongly connected graph you first
need to create a separate redundant data structure and then pass that
into the algorithm.

What would be more useful than a new Graph class would be
implementations of graph algorithms that can easily be applied to any
representation of a graph. As an example, I can write a function for
determining if a graph is a DAG using a small subset of the possible
methods that a Graph class would have:

def is_dag(nodes, edges_node):
    '''Determine if a directed graph is acyclic

    nodes is an iterable yielding all vertices in the graph
    edges_node(node) is an iterable giving all nodes that node connects to
    '''
    visited = set()
    visiting = set()
    for node in nodes:
        if node not in visited:
            if has_backedge(node, edges_node, visited, visiting):
                return False
    return True

def has_backedge(node, edges_node, visited, visiting):
    '''Helper for is_dag()'''
    if node in visiting:
        return True
    if node in visited:
        # Already fully explored from an earlier start node; no back-edge
        # can be found here, and skipping it keeps the search O(V + E).
        return False
    visited.add(node)
    visiting.add(node)
    for childnode in edges_node(node):
        if has_backedge(childnode, edges_node, visited, visiting):
            return True
    visiting.remove(node)
    return False

This can be used with the Graph class or just as easily with the
dict-of-sets like so:

is_dag(G2.nodes, G2.edges_node)
is_dag(G, G.__getitem__)

It is possible to do something like this for all of the graph
algorithms and I think that a library like this would be more useful
than a new Graph type.


From hannu at krosing.net  Mon Dec 17 00:28:04 2012
From: hannu at krosing.net (Hannu Krosing)
Date: Mon, 17 Dec 2012 00:28:04 +0100
Subject: [Python-ideas] Graph class
In-Reply-To: <CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
	<CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
Message-ID: <50CE5904.9090102@krosing.net>

On 12/16/2012 04:41 PM, Guido van Rossum wrote:
> I think of graphs and trees as patterns, not data structures.

How do you draw the line between what is a data structure and what is
a pattern?

Do you have any ideas on how to represent "patterns" in the
Python standard library?

By a set of samples?

By (a set of) classes realising the patterns?

By a set of functions working on existing structures which
implement the pattern?

Duck-typing should lend itself well to this last approach.
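As a tiny illustration of that last approach (a hypothetical sketch, not an
existing stdlib module), graph helpers can duck-type over any mapping from
vertex to an iterable of neighbours, so no Graph class is required:

```python
def degree(graph, node):
    """Number of outgoing edges of `node`."""
    return sum(1 for _ in graph[node])

def has_edge(graph, nfrom, nto):
    """True if there is an edge nfrom -> nto."""
    return nto in set(graph[nfrom])

# Works directly on a plain dict-of-sets:
G = {'A': {'B', 'C'}, 'B': {'A'}, 'C': {'B'}}
```

The same functions would work unchanged on any class exposing mapping-style
item access.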

Do we currently have any modules in the standard library which are
more patterns and less data structures?

-----------------------
Hannu
>
>
>
> -- 
> --Guido van Rossum (on iPad)
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From geertj at gmail.com  Mon Dec 17 12:08:27 2012
From: geertj at gmail.com (Geert Jansen)
Date: Mon, 17 Dec 2012 12:08:27 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
Message-ID: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>

Hi,

below is some feedback on the EventLoop API as implemented in tulip.
I am interested in this for an (alternate) dbus interface that I've written
for Python that supports evented IO. I'm hoping tulip's EventLoop could be an
abstraction as well as a default implementation that allows me to support
just one event interface.

I looked at it from two angles:

 1. Does EventLoop provide everything that is needed from a library writer's
    point of view?
 2. Can EventLoop efficiently expose a subset of the functionality of
    some of the main event loop implementations out there today
    (I looked at libuv, libev and Qt)?

First some code pointers...

 * https://github.com/geertj/looping - Here I've implemented the EventLoop
   interface for libuv, libev and Qt. It includes a slightly modified version of
   tulip's "polling.py" where I've implemented some of the suggestions below.
   It also adds support for Python 2.6/2.7, as the Python Qt interface (PySide)
   doesn't support Python 3 yet.

 * https://github.com/geertj/python-dbusx - A Python interface for libdbus that
   supports evented IO using an EventLoop interface. This module also
   tests all the different loops from "looping" by running D-BUS tests with them
   (looping itself doesn't have tests yet).

My main points of feedback are below:

* It would be nice to have repeatable timers. Repeatable timers are expected
  for example by libdbus when integrating it with an event loop.

  Without repeatable timers, I could emulate a repeatable timer by using
  call_later() and adding a new timer every time the timer fires. This would
  be an inefficient interface though for event loops that natively support
  repeatable timers.

  This could possibly be done by adding a "repeat" argument to call_later().
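The emulation mentioned above could look roughly like the following (the
loop is assumed to expose call_later(delay, callback) returning a handle
with a cancel() method, as tulip's does; the helper class itself is
hypothetical):

```python
class RepeatingTimer:
    """Emulate a repeating timer by re-arming a one-shot call_later()."""

    def __init__(self, loop, interval, callback):
        self._loop = loop
        self._interval = interval
        self._callback = callback
        self._handle = loop.call_later(interval, self._fire)

    def _fire(self):
        self._callback()
        # Re-arm: creating one new one-shot timer per tick is exactly the
        # inefficiency described above for natively-repeating loops.
        self._handle = self._loop.call_later(self._interval, self._fire)

    def cancel(self):
        self._handle.cancel()
```

A native "repeat" argument would avoid allocating a fresh timer object on
every tick.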

* It would be nice to have a way to call a callback once per loop iteration.
  An example here is dispatching in libdbus. The easiest way to do this is
  to call dbus_connection_dispatch() every iteration of the loop (a more
  complicated way exists to get notifications when the dispatch status
  changes, but it is edge triggered and difficult to get right).

  This could possibly be implemented by adding a "repeat" argument to
  call_soon().

* A useful semantic for run_once() would be to run the callbacks for
  readers and writers in the same iteration as when the FD got ready.

  This allows for the idiom below when expecting a single event to happen
  on a file descriptor from outside the event loop:

    # handle_read() sets the "ready" flag
    loop.add_reader(fd, handle_read)
    while not ready:
        loop.run_once()

  I use this idiom for example in a blocking method_call() method that calls
  into a D-BUS method.

  Currently, the handle_read() callback would be called in the iteration
  *after* the FD became readable. So this would not work, unless some more
  IO becomes available.

  As far as I can see libev, libuv and Qt all work like this.

* If remove_reader() / remove_writer() accepted the DelayedCall instance
  returned by their add_xxx() cousins, that would allow for multiple
  callbacks per FD. Not all event loops support this (libuv doesn't; libev
  and Qt do), but the ones that do could have this functionality exposed.
  For event loops that don't support it, an exception could be raised when
  adding multiple callbacks per FD.

  Support for multiple callbacks per FD could be advertised as a capability.

* After a DelayedCall is cancelled, it would also be very useful to have a
  second method to enable it again. Having that functionality is more
  efficient than creating a new event. For example, the D-BUS event loop
  integration API has specific methods for toggling events on and off that
  you need to provide.

* (Nitpick) Multiplexing absolute and relative timeouts for the "when"
  argument in call_later() is a little too smart in my view and can lead
  to bugs.

With some input, I'd be happy to produce patches.

Regards,
Geert Jansen


From guido at python.org  Sun Dec 16 23:23:51 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Dec 2012 14:23:51 -0800
Subject: [Python-ideas] Late to the async party (PEP 3156)
In-Reply-To: <50CE1CF4.4080704@urandom.ca>
References: <50CD2592.5010507@urandom.ca>
	<CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com>
	<50CE1CF4.4080704@urandom.ca>
Message-ID: <CAP7+vJKZH-34k4VOAoHPmwaawcVEM4H9=ZeHeK6MTKNFvwzbPw@mail.gmail.com>

On Sun, Dec 16, 2012 at 11:11 AM, Jason Tackaberry <tack at urandom.ca> wrote:

>  On 12-12-16 11:27 AM, Guido van Rossum wrote:
>
> The PEP is definitely weak. Here are some thoughts/proposals though:
>
>    - You can't cancel a coroutine; however you can cancel a Task, which
>    is a Future wrapping a stack of coroutines linked via yield-from.
>
>
> I'll just underline your statement that "you can't cancel a coroutine"
> here, since I'm referencing it later.
>
> This distinction between "bare" coroutines, Futures, and Tasks is a bit
> foreign to me, since in Kaa all coroutines return (a subclass of)
> InProgress objects.
>

Task is a subclass of Future; a Future may wrap some I/O or some other
system call, but a Task wraps a coroutine. Bare coroutines are introduced
by PEP 380, so it's no surprise you have to get used to them. But trust me
they are useful.

I have a graphical representation in my head; drawing with a computer is
not my strong point, but here's some ASCII art:

  [Task: coroutine -> coroutine -> ... -> coroutine)

The -> arrow represents a yield from, and each coroutine has its own stack
frame (the frame's back pointer points left). The leftmost coroutine is the
one you pass to Task(); the rightmost one is the one whose code is
currently running. When it blocks for I/O, the entire stack is suspended;
the Task object is given to the scheduler for resumption when the I/O
completes.

I'm drawing a '[' to the left of the Task because it is a definite end
point; I'm drawing a ')' to the right of the last coroutine because
whenever the coroutine uses yield from another one gets added to the right.

When a coroutine blocks for a Future, it looks like this:

  [Task: coroutine -> coroutine -> ... -> coroutine -> Future]

(I'm using ']' here to suggest that the Future is also an end point.)

When it blocks for a Task, it ends up looking like this:

  [Task 1: coroutine -> ... -> coroutine -> [Task 2: coroutine -> ... ->
coroutine)]


>  The Tasks section in the PEP says that a bare coroutine (is this the same
> as the previously defined "coroutine object"?)
>

Yes.


>  has much less overhead than a Task but it's not clear to me why that
> would be, as both would ultimately need to be managed by the scheduler,
> wouldn't they?
>

No.

This takes a lot of time to wrap your head around but it is important to
get this. This is because "yield from" is built into the language, and
because of the way it is defined to behave. Suppose you have this:

  def inner():
    yield 1
    yield 2

  def outer():
    yield 'A'
    yield from inner()
    yield 'B'

  def main():  # Not a generator -- no yield in sight!
    for x in outer():
      print(x)

The output of calling main() is as follows:

A
1
2
B

There is no scheduler in sight, this is basic Python 3. The Python 2
equivalent would have the middle line of outer() replaced by

    for x in inner():
      yield x

(It's more complicated when 'yield from' is used as an expression and when
sending values or throwing exceptions into the outer generator, but that
doesn't matter for this part of the explanation.)
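For completeness, a minimal example showing that sent values and return
values pass transparently through 'yield from' (plain Python 3.3, no
scheduler involved; the names are made up for illustration):

```python
def inner():
    x = yield 'ready'       # receives the value passed to send()
    return x * 2            # becomes the value of 'yield from inner()'

def outer():
    result = yield from inner()  # send() reaches inner() directly
    yield result
```

Driving this with g = outer(), next(g) yields 'ready' straight from
inner(); g.send(5) resumes inner(), whose return value is then yielded by
outer().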

Given that a coroutine function (despite being marked with
@tulip.coroutine) is just a generator, when one coroutine invokes another
via 'yield from', the scheduler doesn't find out about this at all.

However, if a coroutine uses 'yield' instead of 'yield from', the scheduler
*does* hear about it. The mechanism for this is best understood by looking
at the Python 2 equivalent: each arrow in my diagrams stands for 'yield
from', which you can replace by a for loop yielding each value, and thus
the value yielded by the innermost coroutine ends up being yielded by the
outermost one to the scheduler.

The trick is that 'yield from' is implemented more efficiently than the
equivalent for loop.

Another thing to keep in mind is that when you use yield from with a Future
(or a Task, which is a subclass of Future), the Future has an __iter__()
method that uses 'yield' (*not* 'yield from') to signal the scheduler that
it is waiting for some I/O. (There's debate about whether you tell the
scheduler what kind of I/O it should perform before invoking 'yield' or as
a parameter to 'yield', but that's immaterial for understanding this part
of the explanation.)
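A toy model of that mechanism (a sketch of the idea only, not tulip's
actual Future class) makes the bare 'yield' visible:

```python
class ToyFuture:
    """A Future whose __iter__ signals the scheduler with a bare yield,
    then delivers its result via StopIteration when resumed."""

    _PENDING = object()

    def __init__(self):
        self._result = self._PENDING

    def set_result(self, value):
        self._result = value

    def __iter__(self):
        if self._result is self._PENDING:
            yield self          # bare 'yield': this is what the scheduler sees
        return self._result     # picked up by 'yield from' as its value
```

A coroutine doing 'result = yield from fut' thus yields the Future itself
out to the scheduler, and receives the result when the scheduler resumes it.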


> I could imagine that a coroutine object is implemented as a C object for
> performance,
>

Kind of -- the transfer is built into the Python interpreter.


> and a Task is a Python class, and maybe that explains the difference.  But
> then why differentiate between Future and Task (particularly because they
> have the same interface, so I can't draw an analogy with jQuery's Deferreds
> and Promises, where Promises are a restricted form of Deferreds for public
> consumption to attach callbacks).
>

You'll have to ask a Twisted person about Deferreds; they have all kinds of
extra useful functionality related to error handling and chaining.
(Apparently Deferreds became popular in the JS world after they were
introduced in Twisted.) I find Deferreds elusive, and my PEP won't have
them. (Coroutines take their place as the preferred way to write user
code.) AFAICT a Promise is more like a Future, which is a much simpler
thing.

Another difference between bare coroutines and Tasks: a bare coroutine
*only* runs when another coroutine that is running is waiting for it using
'yield from'. But a coroutine wrapped in a Task will be run by the
scheduler even when nobody is waiting for it. (In Kaa's world, which is
similar to Twisted's @inlineCallbacks, Monocle, and Google App Engine's
NDB, every coroutine is wrapped in something like a task.)

This is the reason why the par() operation needs to wrap bare coroutine
arguments in Tasks.


>
>    - Cancellation only takes effect when a task is suspended.
>
>
> Yes, this is intuitive.
>
>
>
>
>    - When you cancel a Task, the most deeply nested coroutine (the one
>    that caused it to be suspended) receives a special exception (I propose to
>    reuse concurrent.futures.CancelledError from PEP 3148). If it doesn't catch
>    this it bubbles all the way to the Task, and then out from there.
>
>
> So if the most deeply nested coroutine catches the CancelledError and
> doesn't reraise, it can prevent its cancellation?
>

Yes. That's probably something you shouldn't be doing though. Also,
cancel() sets a flag on the Task that remains set, and when the coroutine
suspends itself in response to the CancelledError, the scheduler will just
throw the exception into it again.

Or perhaps it should throw something that's harder to catch? There's some
similarity with the close() method on generators introduced by PEP 342;
this causes GeneratorExit to be thrown into the generator (if it's not
terminated), and if the generator chooses to catch and ignore this, the
generator is declared dead anyway.


> I took a similar approach, except that coroutines can't abort their own
> cancellation, and whether or not the nested coroutines actually get
> cancelled depends on whether something else was interested in their result.
>

Yeah, since you have a Task/Future at every level you are forced to do it
that way.


>  Consider a coroutine chain where A yields B yields C yields D, and we do
> B.abort()
>
>    - if only C was interested in D's result, then D will get an
>    InProgressAborted raised inside it (at whatever point it's currently
>    suspended).  If something other than C was also waiting on D, D will not be
>    affected
>     - similarly, if only B was interested in C's result, then C will get
>    an InProgressAborted raised inside it (at yield D).
>     - B will get InProgressAborted raised inside it (at yield C)
>     - for B, C and D, the coroutines will not be reentered and they are
>    not allowed to yield a value that suggests they expect reentry.  There's
>    nothing a coroutine can do to prevent its own demise.
>    - A will get an InProgressAborted raised inside it (at yield B)
>    - In all the above cases, the InProgressAborted instance has an origin
>    attribute that is B's InProgress object
>    - Although B, C, and D are now aborted, A isn't aborted.  It's allowed
>    to yield again.
>    - with Kaa, coroutines are abortable by default (so they are like
>    Tasks always).  But in this example, B can prevent C from being aborted by
>    yielding C().noabort()
>
>
> There are quite a few scenarios to consider: A yields B and B is cancelled
> or raises; A yields B and A is cancelled or raises; A yields B, C yields B,
> and A is cancelled or raises; A yields B, C yields B, and A or C is
> cancelled or raises; A yields par(B,C,D) and B is cancelled or raises; etc,
> etc.
>
> In my experience, there's no one-size-fits-all behaviour, and the best we
> can do is have sensible default behaviour with some API (different
> functions, kwargs, etc.) to control the cancellation propagation logic.
>

Yeah, I think that the default behavior I sketched in my previous message
is fine, and the user can implement other behaviors through a combination
of Task wrappers, catching exceptions, and explicitly cancelling tasks.


>
>    - However when a coroutine in one Task uses yield-from to wait for
>    another Task, the latter does not automatically get cancelled. So this is a
>    difference between "yield from foo()" and "yield from Task(foo())", which
>    otherwise behave pretty similarly. Of course the first Task could catch the
>    exception and cancel the second task -- that is its responsibility though
>    and not the default behavior.
>
>
> Ok, so nested bare coroutines will get cancelled implicitly, but nested
> Tasks won't?
>

Correct. If you have a simple stack-like usage pattern there's no need to
introduce a Task; Tasks are useful if you want to decouple the stacks, e.g.
have two other places both wait for the same Task (or for some other
Future, for that matter).


> I'm having a bit of difficulty with this one.  You said that coroutines
> can't be cancelled, but Tasks can be.  But here, if they are being yielded,
> the opposite behaviour applies: yielded coroutines *are* cancelled if a
> Task is cancelled, but yielded tasks *aren't*.
>
> Or have I misunderstood?
>

I hope my explanation above of the relationship between Tasks and bare
coroutines helps. I can see how it gets confusing if you are used to
thinking in terms of a system where there is always a Task involved when
one coroutine waits for another.


>    - PEP 3156 has a par() helper which lets you block for multiple
>    tasks/coroutines in parallel. It takes arguments which are either
>    coroutines, Tasks, or other Futures; it wraps the coroutines in Tasks to
>    run them independently and just waits for the other arguments. Proposal:
>    when the Task containing the par() call is cancelled, the par() call
>    intercepts the cancellation and by default cancels those coroutines that
>    were passed in "bare" but not the arguments that were passed in as Tasks or
>    Futures. Some keyword argument to par() may be used to change this behavior
>    to "cancel none" or "cancel all" (exact API spec TBD).
>
>
> Here again, par() would cancel a bare coroutine but not Tasks.  It's
> consistent with your previous bullet but seems to contradict your first
> bullet that you can't cancel a coroutine.
>
> I guess the distinction is you can't explicitly cancel a coroutine, but
> coroutines can be implicitly cancelled?
>

Right.


> As I discussed previously, one of those tasks might be yielded by some
> other active coroutine, and so cancelling it may not be the right thing to
> do.  Being able to control this behaviour is important, whether that's a
> par() kwarg, or special method like noabort() that constructs an
> unabortable Task instance.
>

I think we're in violent agreement. :-)


> Kaa has similar constructs to allow yielding a collection of InProgress
> objects (whatever they might represent: coroutines, threaded functions,
> etc.).  In particular, it allows you to yield multiple tasks and resume
> when ALL of them complete (InProgressAll), or when ANY of them complete
> (InProgressAny).  For example:
>
>     @kaa.coroutine()
>     def is_any_host_up(*hosts):
>         try:
>             # ping() is a coroutine
>             yield kaa.InProgressAny(ping(host) for host in hosts).timeout(5, abort=True)
>         except kaa.TimeoutException:
>             yield False
>         else:
>             yield True
>
>
> More details here:
>
>
> http://api.freevo.org/kaa-base/async/inprogress.html#inprogress-collections
>
> From what I understand of the proposed par() it would require ALL of
> the supplied futures to complete, but there are many use-cases for the ANY
> variant as well.
>

Good point. I'd forgotten about this while writing the PEP, but Tulip v1
has this. The way to spell it is a little awkward and I could use some
fresh ideas though. In Tulip v1 you can write

  ready_tasks = yield from wait_any(set_of_tasks)

The result ready_tasks is a set of tasks that are done; it has at least one
element. This is a generalization of

  ready_tasks = yield from wait_for(N, set_of_tasks)

which returns a set of size at least N done tasks; set N to the length of
the input to implement waiting for all. But the semantics of always
returning a set (even when N == 1) are somewhat awkward, and ideally you
probably want something that you can call in a loop until all tasks are
done, e.g.

  todo = <initial list of tasks>
  while todo:
    result = yield from wait_one(todo)
    <use result>

Here wait_one(todo) blocks until at least one task in todo is done, then
removes it from todo, and returns its result (or raises its exception).


>  Interesting. In Tulip v1 (the experimental version I wrote before PEP
> 3156) the Task() constructor has an optional timeout argument. It works by
> scheduling a callback at the given time in the future, and the callback
> simply cancel the task (which is a no-op if the task has already
> completed). It works okay, except it generates tracebacks that are
> sometimes logged and sometimes not properly caught -- though some of that
> may be my messy test code. The exception raised by a timeout is the same
> CancelledError, which is somewhat confusing. I wonder if Task.cancel()
> shouldn't take an exception with which to cancel the task.
> (TimeoutError in PEP 3148 has a different role, it is when the timeout on a
> specific wait expires, so e.g. fut.result(timeout=2) waits up to 2 seconds
> for fut to complete, and if not, the call raises TimeoutError, but the code
> running in the executor is unaffected.)
>
>
> FWIW, the equivalent in Kaa which is InProgress.abort() does take an
> optional exception, which must subclass InProgressAborted.  If None, a new
> InProgressAborted is created.  InProgress.timeout(t) will start a timer
> that invokes InProgress.abort(TimeoutException()) (TimeoutException
> subclasses InProgressAborted).
>
> It sounds like your proposed implementation works like:
>
>    @tulip.coroutine()
>    def foo():
>       try:
>          result = yield from Task(othercoroutine()).result(timeout=2)
>
>
Actually in Tulip you never combine result() with 'yield from' and you
never use timeout=N with result(); this line would be written as follows:

    result = yield from Task(othercoroutine(), timeout=2)


>       except TimeoutError:
>          # ... othercoroutine() still lives on
>
>
> I think Kaa's syntax is cleaner but it seems functionally the same:
>
>    @kaa.coroutine()
>    def foo():
>       try:
>          result = yield othercoroutine().timeout(2)
>       except kaa.TimeoutException:
>          # ... othercoroutine() still lives on
>
>
> It's also possible to conveniently ensure that othercoroutine() is aborted
> if the timeout elapses:
>
>    try:
>       result = yield othercoroutine().timeout(2, abort=True)
>    except kaa.TimeoutException:
>       # ... othercoroutine() is aborted
>
>
When do you use that?

 We've had long discussions about yield vs. yield-from. The latter is way
> more efficient and that's enough for me to push it through. When using
> yield, each yield causes you to bounce to the scheduler, which has to do a
> lot of work to decide what to do next, even if that is just resuming the
> suspended generator; and the scheduler is responsible for keeping track of
> the stack of generators. When using yield-from, calling another coroutine
> as a subroutine is almost free and doesn't involve the scheduler at all;
> thus it's much cheaper, and the scheduler can be simpler (doesn't need to
> keep track of the stack). Also stack traces and debugging are better.
>
>
> But this sounds like a consequence of a particular implementation, isn't
> it?
>

These semantics are built into the language as of Python 3.3; sure, it's a
quality of implementation issue to make it as fast as possible, but the
stack trace semantics are not optional, and if CPython can make it
efficient then other implementations will try to compete by making it even
more efficient. :-)

There are still some possible optimizations beyond what Python 3.3
currently does; maybe 3.3.1 or 3.4 will speed it up even more. (In
particular, in the ideal implementation, a yield at a deeply nested
coroutine should reach the caller of the outermost coroutine in O(1) time
rather than O(N), where N is the stack depth. I think it is currently O(N)
with a rather small constant factor.)


> A @kaa.coroutine() decorated function is entered right away when invoked,
> and the decorator logic does as much as it can until the underlying
> generator yields an unfinished InProgress that needs to wait for (or
> kaa.NotFinished).  Once it yields, *then* the decorator sets up the
> necessary hooks with the scheduler / event loop.
>

That's a good optimization if your semantics require a Task at every level.
But IIRC (from implementing something like this myself for NDB) it is quite
subtle to get it right in all edge cases. And you still have at least two
Python function invocations for every level of coroutine invocation.


> This means you can nest a stack of coroutines without involving the
> scheduler until something truly asynchronous needs to take place.
>
> Have I misunderstood?
>

Misunderstood what? You are describing Kaa here. :-)


>
>>    - coroutines can have certain policies that control invocation
>>    behaviour.  The most obvious ones to describe are POLICY_SYNCHRONIZED which
>>    ensures that multiple invocations of the same coroutine are serialized, and
>>    POLICY_SINGLETON which effectively ignores subsequent invocations if it's
>>    already running
>>    - it is possible to have a special progress object passed into the
>>    coroutine function so that the coroutine's progress can be communicated to
>>    an outside observer
>>
>>
>  These seem pretty esoteric and can probably be implemented in user code
> if needed.
>
>
> I'm fine with that, provided the flexibility is there to allow for it.
>
>
>
>  As I said, I think wait_for_future() and run_in_executor() in the PEP
> give you all you need. The @threaded decorator you propose is just sugar;
> if a user wants to take an existing API and convert it from a coroutine to
> threaded without requiring changes to the caller, they can just introduce a
> helper that is run in a thread with run_in_executor().
>
>
> Also works for me. :)
>
>
>
>  Thanks for your very useful contribution! Kaa looks like an interesting
> system. Is it ported to Python 3 yet? Maybe you could look into integrating
> with the PEP 3156 event loop and/or scheduler.
>
>
> Kaa does work with Python 3, yes, although it still lacks much-needed
> unit tests, so I'm not completely confident it has the same functional
> coverage as Python 2.
>
> I'm definitely interested in having it conform to whatever shakes out of
> PEP 3156, which is why I'm speaking up now. :)
>

I'm sorry I don't have a reference implementation available yet. I hope to
finish one before Christmas.


> I've a couple other subjects I should bring up:
>
> Tasks/Futures as "signals": it's often necessary to be able to resume a
> coroutine based on some condition other than e.g. any IO tasks it's waiting
> on.  For example, in one application, I have a (POLICY_SINGLETON) coroutine
> that works off a download queue.  If there's nothing in the queue, it's
> suspended at a yield.  It's the coroutine equivalent of a dedicated thread.
> [1]
>
> It must be possible to "wake" the queue manager when I enqueue a job for
> it.  Kaa has this notion of "signals" which is similar to the gtk+ style of
> signals in that you can attach callbacks to them and emit them.  Signals
> can be represented as InProgress objects, which means they can be yielded
> from coroutines and used in InProgressAny/All objects.
>

(Aside: I can never get used to that terminology; I am too used to the UNIX
meaning of "signal". It sounds like a publish-subscribe mechanism.)


>
> So my download manager coroutine can yield an InProgressAny of all the
> active download coroutines *and* the "new job enqueued" signal, and
> execution will resume as long as any of those conditions are met.
>
> Is there anything in your current proposal that would allow for this
> use-case?
>
> [1]
> https://github.com/jtackaberry/stagehand/blob/master/src/manager.py#L390
>

That example is a little beyond my comprehension. I'm guessing though that
you could probably cobble something like this together from the wait_one()
primitive I described above. Or perhaps we need a set of synchronization
primitives similar to those provided by threading.py: Lock, Condition,
Semaphore, Event, Barrier, and some variations.



> Another pain point for me has been this notion of unhandled asynchronous
> exceptions.  Asynchronous tasks are represented as an InProgress object,
> and if a task fails, accessing InProgress.result will raise the exception
> at which point it's considered handled.  This attribute access could happen
> at any time during the lifetime of the InProgress object, outside the
> task's call stack.
>
> The desirable behaviour is that when the InProgress object is destroyed,
> if there's an exception attached to it from a failed task that hasn't been
> accessed, we should output the stack as an unhandled exception.  In Kaa, I
> do this with a weakref destroy callback, but this isn't ideal because with
> GC, the InProgress might not be destroyed until well after the exception is
> relevant.
>
> I make every effort to remove reference cycles and generally get the
> InProgress object destroyed as early as possible, but this changes subtly
> between Python versions.
>
> How will unhandled asynchronous exceptions be handled with tulip?
>

That's actually a clever idea: log the exception when the Task object is
destroyed if it hasn't been raised (from result()) or inspected (using
exception()) at least once. I know these have been haunting me in NDB -- it
logs all, some or none of the exceptions depending on the log settings, but
that's not right, and your approach is much better.

So it may come down to implementation cleverness to try and GC Task objects
sooner rather than later -- which will also depend on the Python
implementation. In the end, debugging convenience cannot help but depend on
the implementation.
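The idea can be sketched in a few lines. The `MiniFuture` class below is hypothetical (not the tulip API); real code would also capture the traceback and route the message through the logging configuration:

```python
import logging

class MiniFuture:
    """Sketch: remember whether a stored exception was ever retrieved,
    and log it as unhandled when the object is destroyed if not."""

    def __init__(self):
        self._exc = None
        self._retrieved = False

    def set_exception(self, exc):
        self._exc = exc

    def exception(self):
        self._retrieved = True
        return self._exc

    def result(self):
        self._retrieved = True
        if self._exc is not None:
            raise self._exc

    def __del__(self):
        if self._exc is not None and not self._retrieved:
            logging.error('Future/Task destroyed with unretrieved '
                          'exception: %r', self._exc)

f = MiniFuture()
f.set_exception(ValueError('boom'))
f.exception()  # retrieving marks it handled; nothing logged at GC time
```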

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Dec 17 18:47:22 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 09:47:22 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
Message-ID: <CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>

On Mon, Dec 17, 2012 at 3:08 AM, Geert Jansen <geertj at gmail.com> wrote:
> below is some feedback on the EventLoop API as implemented in tulip.

Great feedback! I hope you will focus on PEP 3156
(http://www.python.org/dev/peps/pep-3156/) and Tulip v2 next; Tulip v2
isn't written but is quickly taking shape in the 'tulip' subdirectory
of the Tulip project.

> I am interested in this for an (alternate) dbus interface that I've written
> for Python that supports evented IO. I'm hoping tulip's EventLoop could be an
> abstraction as well as a default implementation that allows me to support
> just one event interface.

Nice. The more interop this event loop offers the better. I don't know
much about dbus, though, so occasionally my responses may not make any
sense -- please be gentle and educate me when my ignorance gets in the
way of understanding.

> I looked at it from two angles:
>
>  1. Does EventLoop provide everything that is needed from a library writer
>     point of view?
>  2. Can EventLoop efficiently expose a subset of the functionality of
>     some of the main event loop implementations out there today
>     (i looked at libuv, libev and Qt).
>
> First some code pointers...
>
>  * https://github.com/geertj/looping - Here i've implemented the EventLoop
>    interface for libuv, libev and Qt. It includes a slightly modified version of
>    tulip's "polling.py" where I've implemented some of the suggestions below.
>    It also adds support for Python 2.6/2.7 as the Python Qt interface (PySide)
>    doesn't support Python 3 yet.

Cool. For me, right now, Python 2 compatibility is a distraction, but
I am not against others adding it. I'll be happy to consider small
tweaks to the PEP to make this easier. Exception: I'm not about to
give up on 'yield from'; but that doesn't seem your focus anyway.

>  * https://github.com/geertj/python-dbusx - A Python interface for libdbus that
>    supports evented IO using an EventLoop interface. This module also
>    tests all the different loops from "looping" by doing D-BUS tests with them
>    (looping itself doesn't have tests yet).

I'm actually glad to see there are so many event loop implementations
around. This suggests to me that there's a real demand for this type
of functionality, and I'd be real happy if PEP 3156 and Tulip came to
improve the interop situation (especially for Python 3.3 and beyond).

> My main points of feedback are below:
>
> * It would be nice to have repeatable timers. Repeatable timers are expected
>   for example by libdbus when integrating it with an event loop.
>
>   Without repeatable timers, I could emulate a repeatable timer by using
>   call_later() and adding a new timer every time the timer fires. This would
>   be an inefficient interface though for event loops that natively support
>   repeatable timers.
>
>   This could possibly be done by adding a "repeat" argument to call_later().

I've not used repeatable timers myself but I see them in several other
interfaces. I do think they deserve a different method call to set
them up, even if the implementation will just be to add a repeat field
to the DelayedCall. When I start a timer with a 2 second repeat, does
it run now and then 2, 4, 6, ... seconds after, or should the first
run be in 2 seconds? Or are these separate parameters? Strawman
proposal: it runs in 2 seconds and then every 2 seconds. The API would
be event_loop.call_repeatedly(interval, callback, *args), returning a
DelayedCall with an interval attribute set to the interval value.
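Pending a dedicated method, the strawman semantics can be emulated on top of call_later() alone: each firing re-arms itself. A sketch, with a toy loop standing in for the PEP 3156 event loop (the `ToyLoop` class and its simulated clock are illustrative only, not part of the proposal):

```python
import heapq, itertools

class ToyLoop:
    """Minimal stand-in for an event loop: a timer heap driven by a
    simulated clock, just enough to demonstrate the rescheduling idea."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal times
        self.now = 0.0                     # simulated clock

    def call_later(self, delay, callback, *args):
        heapq.heappush(self._heap,
                       (self.now + delay, next(self._counter), callback, args))

    def run_until(self, deadline):
        # Pop and run timers in order until the deadline is reached.
        while self._heap and self._heap[0][0] <= deadline:
            when, _, cb, args = heapq.heappop(self._heap)
            self.now = when
            cb(*args)

def call_repeatedly(loop, interval, callback, *args):
    # Strawman semantics: first call after `interval`, then every
    # `interval`; the wrapper re-arms itself after each firing.
    def wrapper():
        callback(*args)
        loop.call_later(interval, wrapper)
    loop.call_later(interval, wrapper)

ticks = []
loop = ToyLoop()
call_repeatedly(loop, 2.0, ticks.append, 'tick')
loop.run_until(6.0)   # fires at t=2, 4, 6
print(ticks)          # -> ['tick', 'tick', 'tick']
```

A real implementation would return a DelayedCall so the repetition can be cancelled between firings; the sketch omits that bookkeeping.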

(BTW, can someone *please* come up with a better name for DelayedCall?
It's tedious and doesn't abbreviate well. But I don't want to name the
class 'Callback' since I already use 'callback' for function objects
that are used as callbacks.)

> * It would be nice to have a way to call a callback once per loop iteration.
>   An example here is dispatching in libdbus. The easiest way to do this is
>   to call dbus_connection_dispatch() every iteration of the loop (a more
>   complicated way exists to get notifications when the dispatch status
>   changes, but it is edge triggered and difficult to get right).
>
>   This could possibly be implemented by adding a "repeat" argument to
>   call_soon().

Again, I'd rather introduce a new method. What should the semantics
be? Is this called just before or after we potentially go to sleep, or
at some other point, or at the very top or bottom of run_once()?

> * A useful semantic for run_once() would be to run the callbacks for
>   readers and writers in the same iteration as when the FD got ready.

Good catch, I've struggled with this. I ended up not needing to call
run_once(), so I've left it out of the PEP. I agree if there's a
strong enough use case for it (what's yours?) it should probably be
redesigned. Another thing I don't like about it is that a callback
that calls call_soon() with itself will starve I/O completely. OTOH
that's perhaps no worse than a callback containing an infinite loop;
and there's something to say for the semantics that if a callback just
schedules another callback as an immediate 'continuation', it's
reasonable to run that before even attempting to poll for I/O.

>   This allows for the idiom below when expecting a single event to happen
>   on a file descriptor from outside the event loop:
>
>     # handle_read() sets the "ready" flag
>     loop.add_reader(fd, handle_read)
>     while not ready:
>         loop.run_once()
>
>   I use this idiom for example in a blocking method_call() method that calls
>   into a D-BUS method.
>
>   Currently, the handle_read() callback would be called in the iteration
>   *after* the FD became readable. So this would not work, unless some more
>   IO becomes available.
>
>   As far as I can see libev, libuv and Qt all work like this.

Hm, okay, it seems reasonable to support that. (My original intent
with run_once() was to allow mixing multiple event loops -- you'd just
call each event loop's run_once() equivalent in a round-robin
fashion.)

How about the following semantics for run_once():

1. compute deadline as the smallest of:
    - the time until the first event in the timer heap, if non empty
    - 0 if the ready queue is non empty
    - Infinity(*)

2. poll for I/O with the computed deadline, adding anything that is
ready to the ready queue

3. run items from the ready queue until it is empty

(*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
infinity, with the idea that if somehow a race condition added
something to the ready queue just as we went to sleep, and there's no
I/O at all, the system will recover eventually. But I've also heard
people worried about power conservation on mobile devices (or laptops)
complain about servers that wake up regularly even when there is no
work to do. Thoughts? I think I'll leave this out of the PEP, but what
should Tulip do?
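Step 1 of those semantics reduces to a small timeout computation. A sketch (the names are illustrative, not the tulip internals), where returning None asks the pollster to block indefinitely:

```python
import time

def compute_poll_timeout(timer_heap, ready_queue, infinity=None):
    """Step 1 of the proposed run_once(): how long may poll() block?
    timer_heap holds (when, callback) pairs ordered by time."""
    if ready_queue:
        return 0  # callbacks already pending: poll must not sleep
    if timer_heap:
        # Sleep at most until the first timer is due.
        return max(0.0, timer_heap[0][0] - time.monotonic())
    return infinity  # nothing to do: block "forever" (or a safety cap)

print(compute_poll_timeout([], ['cb']))  # -> 0
```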

> * If remove_reader() / remove_writer() would accept the DelayedCall instance
>   returned by their add_xxx() cousins, then that would allow for multiple
>   callbacks per FD. Not all event loops support this (libuv doesn't, libev
>   and Qt do), but for the ones that do, their functionality could
>   be exposed like this. For event loops that don't support this, an exception
>   could be raised when adding multiple callbacks per FD.

Hm. The PEP currently states that you can call cancel() on the
DelayedCall returned by e.g. add_reader() and it will act as if you
called remove_reader(). (Though I haven't implemented this yet --
either there would have to be a cancel callback on the DelayedCall or
the effect would be delayed.)

But multiple callbacks per FD seems a different issue -- currently
add_reader() just replaces the previous callback if one is already
set. Since not every event loop can support this, I'm not sure it
ought to be in the PEP, and making it optional sounds like a recipe
for trouble (a library that depends on this may break subtly or only
under pressure). Also, what's the use case? If you really need this
you are free to implement a mechanism on top of the standard in user
code that dispatches to multiple callbacks -- that sounds like a small
amount of work if you really need it, but it sounds like an attractive
nuisance to put this in the spec.
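Such a user-code dispatcher is indeed a small amount of work. A sketch, assuming only the PEP's single-callback add_reader()/remove_reader() interface (the `FakeLoop` below merely records registrations so the idea can be demonstrated without real I/O):

```python
class MultiReader:
    """Fan out one fd's readability to several callbacks, on top of a
    loop that supports only one read callback per fd."""

    def __init__(self, loop):
        self.loop = loop
        self._callbacks = {}  # fd -> list of callbacks

    def add_reader(self, fd, callback):
        cbs = self._callbacks.setdefault(fd, [])
        if not cbs:
            # First callback for this fd: register our dispatcher once.
            self.loop.add_reader(fd, self._dispatch, fd)
        cbs.append(callback)

    def remove_reader(self, fd, callback):
        cbs = self._callbacks[fd]
        cbs.remove(callback)
        if not cbs:
            del self._callbacks[fd]
            self.loop.remove_reader(fd)

    def _dispatch(self, fd):
        # Iterate over a copy: a callback may add/remove readers.
        for cb in list(self._callbacks.get(fd, [])):
            cb()

class FakeLoop:
    """Records registrations for demonstration purposes only."""
    def __init__(self):
        self.readers = {}
    def add_reader(self, fd, callback, *args):
        self.readers[fd] = (callback, args)
    def remove_reader(self, fd):
        self.readers.pop(fd, None)

loop = FakeLoop()
mr = MultiReader(loop)
hits = []
mr.add_reader(4, lambda: hits.append('a'))
mr.add_reader(4, lambda: hits.append('b'))
cb, args = loop.readers[4]   # simulate the fd becoming readable
cb(*args)
print(hits)  # -> ['a', 'b']
```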

>   Support for multiple callbacks per FD could be advertised as a capability.

I'm not keen on having optional functionality as I explained above.
(In fact, I probably will change the PEP to make those APIs that are
currently marked as optional required -- it will just depend on the
platform which paradigm performs better, but using the
transport/protocol abstraction will automatically select the best
paradigm).

> * After a DelayedCall is cancelled, it would also be very useful to have a
>   second method to enable it again. Having that functionality is more
>   efficient than creating a new event. For example, the D-BUS event loop
>   integration API has specific methods for toggling events on and off that
>   you need to provide.

Really? Doesn't this functionality imply that something (besides user
code) is holding on to the DelayedCall after it is cancelled? It seems
iffy to have to bend over backwards to support this alternate way of
doing something that we can already do, just because (on some
platform?) it might shave a microsecond off callback registration.

> * (Nitpick) Multiplexing absolute and relative timeouts for the "when"
>   argument in call_later() is a little too smart in my view and can lead
>   to bugs.

Agreed; that's why I left it out of the PEP. The v2 implementation
will use time.monotonic().

> With some input, I'd be happy to produce patches.

I hope I've given you enough input; it's probably better to discuss
the specs first before starting to code. But please do review the
tulip v2 code in the tulip subdirectory; if you want to help, I'll
be happy to give you commit privileges to that repo, or I'll take
patches if you send them.

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Mon Dec 17 19:19:25 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 10:19:25 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
Message-ID: <CAP7+vJ+xXN_Y50SDVYYDb7jhp4_29u2mZB-2DG7WBmzmGo0WQw@mail.gmail.com>

On Mon, Dec 17, 2012 at 9:47 AM, Guido van Rossum <guido at python.org> wrote:
> I hope I've given you enough input; it's probably better to discuss
> the specs first before starting to code. But please do review the
> tulip v2 code in the tulip subdirectory; if you want to help, I'll
> be happy to give you commit privileges to that repo, or I'll take
> patches if you send them.

Patches against PEP 3156 are also welcome! (The repo is at hg.python.org/peps)

-- 
--Guido van Rossum (python.org/~guido)


From solipsis at pitrou.net  Mon Dec 17 20:57:34 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 17 Dec 2012 20:57:34 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
Message-ID: <20121217205734.103dc4f2@pitrou.net>

On Mon, 17 Dec 2012 09:47:22 -0800
Guido van Rossum <guido at python.org> wrote:
> 
> (BTW, can someone *please* come up with a better name for DelayedCall?
> It's tedious and doesn't abbreviate well. But I don't want to name the
> class 'Callback' since I already use 'callback' for function objects
> that are used as callbacks.)

Does it need to be abbreviated? I don't think users have to spell
"DelayedCall" at all (they just call call_later()).
That said, some proposals:
- Timer (might be mixed up with threading.Timer)
- Deadline
- RDV (French abbrev. for rendez-vous)

Regards

Antoine.




From ronan.lamy at gmail.com  Mon Dec 17 21:33:23 2012
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Mon, 17 Dec 2012 20:33:23 +0000
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
Message-ID: <50CF8193.4040501@gmail.com>

Le 17/12/2012 17:47, Guido van Rossum a écrit :

> (BTW, can someone *please* come up with a better name for DelayedCall?
> It's tedious and doesn't abbreviate well. But I don't want to name the
> class 'Callback' since I already use 'callback' for function objects
> that are used as callbacks.)

It seems to me that a DelayedCall is nothing but a frozen, reified 
function call. That it's a reified thing is already obvious from the 
fact that it's an object, so how about naming it just "Call"? "Delayed" 
is actually only one of the possible relations between the object and 
the actual call - it could also represent a cancelled call, or a cached 
one, or ...?

This idea has some implications for the design: in particular, it means 
that .cancel() should be a method of the EventLoop, not of Call. So Call 
would only have the attributes 'callback' (I'd prefer 'func' or similar) 
and 'args', and one method to execute the call.

HTH,
Ronan Lamy



From guido at python.org  Mon Dec 17 21:49:46 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 12:49:46 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <20121217205734.103dc4f2@pitrou.net>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<20121217205734.103dc4f2@pitrou.net>
Message-ID: <CAP7+vJLz6FKV0SAne0htWDVbJdQMKFPLnsXtNAejq+LozN=TBA@mail.gmail.com>

On Mon, Dec 17, 2012 at 11:57 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Mon, 17 Dec 2012 09:47:22 -0800
> Guido van Rossum <guido at python.org> wrote:
>>
>> (BTW, can someone *please* come up with a better name for DelayedCall?
>> It's tedious and doesn't abbreviate well. But I don't want to name the
>> class 'Callback' since I already use 'callback' for function objects
>> that are used as callbacks.)
>
> Does it need to be abbreviated? I don't think users have to spell
> "DelayedCall" at all (they just call call_later()).

They save the result in a variable. Naming that variable delayed_call
feels awkward. In my code I've called it 'dcall' but that's not great
either.

> That said, some proposals:
> - Timer (might be mixed up with threading.Timer)

But often there's no time involved...

> - Deadline

Same...

> - RDV (French abbrev. for rendez-vous)

Hmmmm. :-)

Maybe Callback is okay after all? The local variable can be 'cb'.

-- 
--Guido van Rossum (python.org/~guido)


From solipsis at pitrou.net  Mon Dec 17 21:56:26 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 17 Dec 2012 21:56:26 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<20121217205734.103dc4f2@pitrou.net>
	<CAP7+vJLz6FKV0SAne0htWDVbJdQMKFPLnsXtNAejq+LozN=TBA@mail.gmail.com>
Message-ID: <20121217215626.762ac2d3@pitrou.net>

On Mon, 17 Dec 2012 12:49:46 -0800
Guido van Rossum <guido at python.org> wrote:
> On Mon, Dec 17, 2012 at 11:57 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > On Mon, 17 Dec 2012 09:47:22 -0800
> > Guido van Rossum <guido at python.org> wrote:
> >>
> >> (BTW, can someone *please* come up with a better name for DelayedCall?
> >> It's tedious and doesn't abbreviate well. But I don't want to name the
> >> class 'Callback' since I already use 'callback' for function objects
> >> that are used as callbacks.)
> >
> > Does it need to be abbreviated? I don't think users have to spell
> > "DelayedCall" at all (they just call call_later()).
> 
> They save the result in a variable. Naming that variable delayed_call
> feels awkward. In my code I've called it 'dcall' but that's not great
> either.
> 
> > That said, some proposals:
> > - Timer (might be mixed up with threading.Timer)
> 
> But often there's no time involved...

Ah, I see you use the same class for add_reader() and friends. I was
assuming that, like in Twisted, DelayedCall was only returned by
call_later().

Is it useful to return a DelayedCall in add_reader()? Is it so that you
can remove the reader? But you already define remove_reader() for that,
so I'm not sure what an alternative way to do it brings :-)

Regards

Antoine.




From guido at python.org  Mon Dec 17 22:01:35 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 13:01:35 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <50CF8193.4040501@gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF8193.4040501@gmail.com>
Message-ID: <CAP7+vJLUkH3uz7OL-XZeJ=_FdE5=JG_ViEb4Dge57F5GCPJt2w@mail.gmail.com>

On Mon, Dec 17, 2012 at 12:33 PM, Ronan Lamy <ronan.lamy at gmail.com> wrote:
> Le 17/12/2012 17:47, Guido van Rossum a écrit :
>
>> (BTW, can someone *please* come up with a better name for DelayedCall?
>> It's tedious and doesn't abbreviate well. But I don't want to name the
>> class 'Callback' since I already use 'callback' for function objects
>> that are used as callbacks.)
>
> It seems to me that a DelayedCall is nothing but a frozen, reified function
> call. That it's a reified thing is already obvious from the fact that it's
> an object, so how about naming it just "Call"? "Delayed" is actually only
> one of the possible relations between the object and the actual call - it
> could also represent a cancelled call, or a cached one, or ...

Call is not a bad suggestion for the name. Let me mull that over.

> This idea has some implications for the design: in particular, it means that
> .cancel() should be a method of the EventLoop, not of Call. So Call would
> only have the attributes 'callback' (I'd prefer 'func' or similar) and
> 'args', and one method to execute the call.

Not sure. Cancelling it must set a flag on the object, since the
object could be buried deep inside any number of data structures owned
by the event loop: e.g. the ready queue, the pollster's readers or
writers (dicts mapping FD to DelayedCall), or the timer heap.

When you cancel a call you don't immediately remove it from its data
structure -- instead, when you get to it naturally (e.g. its time
comes up) you notice that it's been cancelled and ignore it. The one
place where this is awkward is when it's a FD reader or writer -- it
won't come up if the FD doesn't get any new I/O, and it's even
possible that the FD is closed. (I don't actually know what epoll(),
kqueue() etc. do when one of the FDs is closed, but none of the
behaviors I can think of are particularly convenient...) I had thought
of giving the DelayedCall a 'cancel callback' that is used if/when it
is cancelled, and for readers/writers it could be something that calls
remove_reader/writer with the right FD. (Maybe I need multiple
cancel-callbacks, in case the same object is used as a callback for
multiple queues.)

Hm, this gets messy.

(Another thing in this area: pyftpdlib's event loop keeps track of how
many calls are cancelled, and if a large number are cancelled it
reconstructs the heap. The use case is apparently registering lots of
callbacks far in the future and then cancelling them all. Not sure how
good a use case that is. But I admit that it would be easier if
cancelling was a method on the event loop.)

PS. Cancelling a future is a different thing. There you still want the
callback to be called, you just want it to notice that the operation
was cancelled. Same for tasks.

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Mon Dec 17 22:07:21 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 13:07:21 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <20121217215626.762ac2d3@pitrou.net>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<20121217205734.103dc4f2@pitrou.net>
	<CAP7+vJLz6FKV0SAne0htWDVbJdQMKFPLnsXtNAejq+LozN=TBA@mail.gmail.com>
	<20121217215626.762ac2d3@pitrou.net>
Message-ID: <CAP7+vJJ=fJXmG1i4kzR-j01Tn_2ASYPEWnajMbQz1LQuncWYtg@mail.gmail.com>

On Mon, Dec 17, 2012 at 12:56 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Mon, 17 Dec 2012 12:49:46 -0800
> Guido van Rossum <guido at python.org> wrote:
>> On Mon, Dec 17, 2012 at 11:57 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> > On Mon, 17 Dec 2012 09:47:22 -0800
>> > Guido van Rossum <guido at python.org> wrote:
>> >>
>> >> (BTW, can someone *please* come up with a better name for DelayedCall?
>> >> It's tedious and doesn't abbreviate well. But I don't want to name the
>> >> class 'Callback' since I already use 'callback' for function objects
>> >> that are used as callbacks.)
>> >
>> > Does it need to be abbreviated? I don't think users have to spell
>> > "DelayedCall" at all (they just call call_later()).
>>
>> They save the result in a variable. Naming that variable delayed_call
>> feels awkward. In my code I've called it 'dcall' but that's not great
>> either.
>>
>> > That said, some proposals:
>> > - Timer (might be mixed up with threading.Timer)
>>
>> But often there's no time involved...
>
> Ah, I see you use the same class for add_reader() and friends. I was
> assuming that, like in Twisted, DelayedCall was only returned by
> call_later().
>
> Is it useful to return a DelayedCall in add_reader()? Is it so that you
> can remove the reader? But you already define remove_reader() for that,
> so I'm not sure what an alternative way to do it brings :-)

I'm not sure myself. I added it to the PEP (with a question mark)
because I use DelayedCalls to represent I/O callbacks internally --
it's handy to have an object that represents a function plus its
arguments, and I also have a shortcut for adding such objects to the
ready queue (the ready queue *also* stores DelayedCalls).

It is probably a mistake offering two ways to cancel an I/O callback;
but I'm not sure whether to drop remove_{reader,writer} or whether to
drop cancelling the callback. (The latter would mean that
add_{reader,writer} should not return anything.) I *think* I'll keep
remove_* and drop callback cancellation, because the entity that most
likely wants to revoke the callback already has the file descriptor in
hand (it comes with the socket, which they need anyway so they can
call its recv/send method), but they would have to hold on to the
callback object separately. OTOH callback objects might make it
possible to have multiple callbacks per FD, which I currently don't
support. (See discussion earlier in this thread.)

-- 
--Guido van Rossum (python.org/~guido)


From greg.ewing at canterbury.ac.nz  Mon Dec 17 23:00:35 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Dec 2012 11:00:35 +1300
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
Message-ID: <50CF9603.6040409@canterbury.ac.nz>

Guido van Rossum wrote:
> (*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
> infinity, with the idea that if somehow a race condition added
> something to the ready queue just as we went to sleep, and there's no
> I/O at all, the system will recover eventually.

I don't see how such a race condition can occur in a
cooperative multitasking system. There are no true
interrupts that can cause something to happen when
you're not expecting it. So I'd say let infinity
really mean infinity.

-- 
Greg


From solipsis at pitrou.net  Mon Dec 17 23:11:34 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 17 Dec 2012 23:11:34 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
Message-ID: <20121217231134.19ede507@pitrou.net>

On Tue, 18 Dec 2012 11:00:35 +1300
Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
> > (*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
> > infinity, with the idea that if somehow a race condition added
> > something to the ready queue just as we went to sleep, and there's no
> > I/O at all, the system will recover eventually.
> 
> I don't see how such a race condition can occur in a
> cooperative multitasking system. There are no true
> interrupts that can cause something to happen when
> you're not expecting it. So I'd say let infinity
> really mean infinity.

Most event loops out there allow you to schedule callbacks from other
(preemptive, OS-level) threads.

Regards

Antoine.




From geertj at gmail.com  Mon Dec 17 23:57:51 2012
From: geertj at gmail.com (Geert Jansen)
Date: Mon, 17 Dec 2012 23:57:51 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
Message-ID: <CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>

On Mon, Dec 17, 2012 at 6:47 PM, Guido van Rossum <guido at python.org> wrote:

> Cool. For me, right now, Python 2 compatibility is a distraction, but
> I am not against others adding it. I'll be happy to consider small
> tweaks to the PEP to make this easier. Exception: I'm not about to
> give up on 'yield from'; but that doesn't seem your focus anyway.

Correct - my focus right now is on the event loop only. I intend to
have a deeper look at the coroutine scheduler as well later (right now
i'm using greenlets for that).

> I've not used repeatable timers myself but I see them in several other
> interfaces. I do think they deserve a different method call to set
> them up, even if the implementation will just be to add a repeat field
> to the DelayedCall. When I start a timer with a 2 second repeat, does
> it run now and then 2, 4, 6, ... seconds after, or should the first
> run be in 2 seconds? Or are these separate parameters? Strawman
> proposal: it runs in 2 seconds and then every 2 seconds. The API would
> be event_loop.call_repeatedly(interval, callback, *args), returning a
> DelayedCall with an interval attribute set to the interval value.

That would work (in 2 secs, then 4, 6, ...). This is the Qt QTimer model.

Both libev and libuv have a slightly more general timer that takes a
timeout and a repeat value. When the timeout reaches zero, the timer
will fire, and if repeat != 0, it will re-seed the timeout to that
value.

I haven't seen any real need for such a timer where interval !=
repeat, and in any case it can pretty cheaply be emulated by adding a
new timer on the first expiration only. So your call_repeatedly() call
above should be fine.
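
(For concreteness, here is a sketch of that emulation on top of a
one-shot call_later(); the ToyLoop class and its fake clock are purely
illustrative, not tulip's API.)

```python
import heapq

class ToyLoop:
    """Timer-only toy loop with a fake clock, to illustrate rescheduling."""
    def __init__(self):
        self._time = 0.0
        self._heap = []
        self._counter = 0   # tie-breaker so equal deadlines never compare callbacks

    def time(self):
        return self._time

    def call_later(self, delay, callback, *args):
        self._counter += 1
        heapq.heappush(self._heap, (self._time + delay, self._counter, callback, args))

    def call_repeatedly(self, interval, callback, *args):
        # Strawman semantics from the thread: first run after `interval`,
        # then every `interval` seconds, by rescheduling a one-shot timer.
        def wrapper():
            callback(*args)
            self.call_later(interval, wrapper)
        self.call_later(interval, wrapper)

    def run_until(self, deadline):
        # Advance the fake clock, firing due timers in order.
        while self._heap and self._heap[0][0] <= deadline:
            when, _, cb, args = heapq.heappop(self._heap)
            self._time = when
            cb(*args)
        self._time = deadline

fired = []
loop = ToyLoop()
loop.call_repeatedly(2.0, lambda: fired.append(loop.time()))
loop.run_until(7.0)
print(fired)  # [2.0, 4.0, 6.0]
```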

> (BTW, can someone *please* come up with a better name for DelayedCall?
> It's tedious and doesn't abbreviate well. But I don't want to name the
> class 'Callback' since I already use 'callback' for function objects
> that are used as callbacks.)

libev uses the generic term "Watcher", libuv uses "Handle". But their
APIs are structured a bit differently from tulip, so I'm not sure if
those names would make sense. They support many different types of
events (including more esoteric events like process watches, on-fork
handlers, and wall-clock timer events). Each event has its own class,
named after the event type, that inherits from "Watcher" or
"Handle". When an event is created, you pass it a reference to its
loop. You manage the event fully through the event instance (e.g.
starting it, setting its callback and other parameters, stopping it).
The loop has only a few methods, notably "run" and "run_once".

So for example, you'd say:

loop = Loop()
timer = Timer(loop)
timer.start(2.0, callback)
loop.run()

The advantages of this approach are that naming is easier, and that you
can also have a natural place to put methods that update the event
after you created it. For example, you might want to temporarily
suspend a timer or change its interval.

I quite liked the fresh approach taken by tulip, so that's why I tried
to stay within its design. However, the disadvantage is that modifying
events after you've created them is difficult (unless you create one
DelayedCall subtype per event in which case you're probably better off
creating those events through their constructor in the first place).

>> * It would be nice to have a way to call a callback once per loop iteration.
>>   An example here is dispatching in libdbus. The easiest way to do this is
>>   to call dbus_connection_dispatch() every iteration of the loop (a more
>>   complicated way exists to get notifications when the dispatch status
>>   changes, but it is edge triggered and difficult to get right).
>>
>>   This could possibly be implemented by adding a "repeat" argument to
>>   call_soon().
>
> Again, I'd rather introduce a new method. What should the semantics
> be? Is this called just before or after we potentially go to sleep, or
> at some other point, or at the very top or bottom of run_once()?

That is a good question. Both libuv and libev have both options. The
one that is called before we go to sleep is called a "Prepare"
handler, the one after we come back from sleep a "Check" handler. The
libev documentation has some words on check and prepare handlers here:

http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#code_ev_prepare_code_and_code_ev_che

I am not sure both are needed, but I can't foresee all the consequences.

> How about the following semantics for run_once():
>
> 1. compute deadline as the smallest of:
>     - the time until the first event in the timer heap, if non empty
>     - 0 if the ready queue is non empty
>     - Infinity(*)
>
> 2. poll for I/O with the computed deadline, adding anything that is
> ready to the ready queue
>
> 3. run items from the ready queue until it is empty

I think doing this would work but I again can't fully foresee all the
consequences. Let me play with this a little.
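
(As a sanity check, the deadline computation from step 1 can be written
down directly; the function name and the (when, callback) heap layout
are assumptions for illustration, not tulip internals.)

```python
def compute_poll_timeout(timer_heap, ready_queue, now):
    # Deadline rule from the proposal: 0 if callbacks are already
    # ready, else the time until the nearest timer, else None
    # (block in poll until I/O arrives -- "infinity").
    if ready_queue:
        return 0.0
    if timer_heap:
        return max(0.0, timer_heap[0][0] - now)
    return None

print(compute_poll_timeout([], [], 10.0))                  # None
print(compute_poll_timeout([(12.5, 'cb')], [], 10.0))      # 2.5
print(compute_poll_timeout([(12.5, 'cb')], ['cb'], 10.0))  # 0.0
```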

> (*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
> infinity, with the idea that if somehow a race condition added
> something to the ready queue just as we went to sleep, and there's no
> I/O at all, the system will recover eventually. But I've also heard
> people worried about power conservation on mobile devices (or laptops)
> complain about servers that wake up regularly even when there is no
> work to do. Thoughts? I think I'll leave this out of the PEP, but what
> should Tulip do?

I had a look at libuv and libev. They take two different approaches:

* libev uses a ~60 second timeout by default. The reason is subtle.
Libev supports a wall-clock time event that fires when a certain
wall-clock time has passed. Having a non-infinite timeout will allow
it to pick up changes to the system time (e.g. by NTP), which would
change when the wall-clock timer needs to run.

* libuv does not have a wall-clock timer and uses an infinite timeout.

In my view it would be best for tulip to use an infinite timeout
unless at some point a wall-clock timer is added. That will help
with power management. Regarding race conditions, I think they should
be solved in other ways (e.g. by having a special method that can post
callbacks to the loop in a thread-safe way and possibly write to a
self-pipe).
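
(The self-pipe approach mentioned here can be sketched in a few lines;
SelfPipeLoop is a toy, POSIX-only illustration, not a proposed API.)

```python
import os, select, threading, collections

class SelfPipeLoop:
    """Sketch of the self-pipe trick: another thread writes one byte to
    wake the poll; the callback itself travels via a thread-safe deque."""
    def __init__(self):
        self._rfd, self._wfd = os.pipe()
        self._pending = collections.deque()

    def call_soon_threadsafe(self, callback, *args):
        self._pending.append((callback, args))
        os.write(self._wfd, b'\x00')  # wake up the selector

    def run_once(self, timeout=None):
        ready, _, _ = select.select([self._rfd], [], [], timeout)
        if ready:
            os.read(self._rfd, 4096)  # drain the wakeup bytes
        while self._pending:
            cb, args = self._pending.popleft()
            cb(*args)

results = []
loop = SelfPipeLoop()
t = threading.Thread(target=loop.call_soon_threadsafe,
                     args=(results.append, 42))
t.start(); t.join()
loop.run_once(timeout=1.0)
print(results)  # [42]
```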

> Hm. The PEP currently states that you can call cancel() on the
> DelayedCall returned by e.g. add_reader() and it will act as if you
> called remove_reader(). (Though I haven't implemented this yet --
> either there would have to be a cancel callback on the DelayedCall or
> the effect would be delayed.)

Right now I think that cancelling a DelayedCall is not safe. It could
busy-loop if the fd is ready.

> But multiple callbacks per FD seems a different issue -- currently
> add_reader() just replaces the previous callback if one is already
> set. Since not every event loop can support this, I'm not sure it
> ought to be in the PEP, and making it optional sounds like a recipe
> for trouble (a library that depends on this may break subtly or only
> under pressure). Also, what's the use case? If you really need this
> you are free to implement a mechanism on top of the standard in user
> code that dispatches to multiple callbacks -- that sounds like a small
> amount of work if you really need it, but it sounds like an attractive
> nuisance to put this in the spec.

A not-so-good use case is libraries like libdbus that don't document
their assumptions regarding this. For example, I have to provide an
"add watch" function that creates a new watch (a watch is just a
generic term for an FD event that can be read, write or read|write). I
have observed that it only ever sets one read and one write watch per
FD.

If we go for one reader/writer per FD, then it's probably fine, but it
would be nice if code that does install multiple readers/writers per
FD would get an exception rather than silently updating the callback.
The requirement could be that you need to remove the event before you
can add a new event for the same FD.
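
(A sketch of that strict policy; the Selector class and its error
message are hypothetical, not tulip code.)

```python
class Selector:
    """Strict policy: one reader per FD; adding a second one raises
    instead of silently replacing the callback."""
    def __init__(self):
        self._readers = {}

    def add_reader(self, fd, callback, *args):
        if fd in self._readers:
            raise ValueError(
                "fd %d already has a reader; remove_reader() it first" % fd)
        self._readers[fd] = (callback, args)

    def remove_reader(self, fd):
        # True if a reader was actually removed, False otherwise.
        return self._readers.pop(fd, None) is not None

sel = Selector()
sel.add_reader(3, print, "data")
print(sel.remove_reader(3))   # True
print(sel.remove_reader(3))   # False
sel.add_reader(4, print)
try:
    sel.add_reader(4, print)
except ValueError:
    print("duplicate rejected")
```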

>> * After a DelayedCall is cancelled, it would also be very useful to have a
>>   second method to enable it again. Having that functionality is more
>>   efficient than creating a new event. For example, the D-BUS event loop
>>   integration API has specific methods for toggling events on and off that
>>   you need to provide.
>
> Really? Doesn't this functionality imply that something (besides user
> code) is holding on to the DelayedCall after it is cancelled?

Not that I can see. At least not for libuv and libev.

> It seems
> iffy to have to bend over backwards to support this alternate way of
> doing something that we can already do, just because (on some
> platform?) it might shave a microsecond off callback registration.

According to the libdbus documentation there is a separate function to
toggle an event on/off because that could be implemented without
allocating memory.

But actually there's one kind-of idiomatic use for this that I've seen
quite a few times in libraries. Assume you have a library that defines
a connection. Often, you create two events for that connection in the
constructor: a "write_event" and a "read_event". The read_event is
normally enabled, but gets temporarily disabled when you need to
throttle input. The write_event is normally disabled except when you
get a short write on output.

Just enabling/disabling these events is a bit more friendly to the
programmer IMHO than having to cancel and recreate them when needed.
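
(The idiom can be sketched with toy Event/Connection classes; all names
here are illustrative, not from any of the libraries discussed.)

```python
class Event:
    """Toy watcher with libev/libuv-style enable/disable toggling."""
    def __init__(self, callback):
        self.callback = callback
        self.enabled = False
    def start(self): self.enabled = True
    def stop(self): self.enabled = False

class Connection:
    """read_event is normally on; write_event is normally off until a
    short write leaves buffered output behind."""
    def __init__(self, on_readable, on_writable):
        self.read_event = Event(on_readable)
        self.write_event = Event(on_writable)
        self.read_event.start()
    def throttle_input(self): self.read_event.stop()
    def unthrottle_input(self): self.read_event.start()
    def short_write(self): self.write_event.start()
    def buffer_drained(self): self.write_event.stop()

conn = Connection(lambda: None, lambda: None)
print(conn.read_event.enabled, conn.write_event.enabled)  # True False
conn.short_write(); conn.throttle_input()
print(conn.read_event.enabled, conn.write_event.enabled)  # False True
```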

>> * (Nitpick) Multiplexing absolute and relative timeouts for the "when"
>>   argument in call_later() is a little too smart in my view and can lead
>>   to bugs.
>
> Agreed; that's why I left it out of the PEP. The v2 implementation
> will use time.monotonic(),
>
>> With some input, I'd be happy to produce patches.
>
> I hope I've given you enough input; it's probably better to discuss
> the specs first before starting to code. But please do review the
> tulip v2 code in the tulip subdirectory; if you want to help you I'll
> be happy to give you commit privileges to that repo, or I'll take
> patches if you send them.

OK great. Let me work on this over the next couple of days and
hopefully come up with something.

Regards,
Geert


From guido at python.org  Tue Dec 18 01:00:55 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 16:00:55 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <20121217231134.19ede507@pitrou.net>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
	<20121217231134.19ede507@pitrou.net>
Message-ID: <CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>

On Mon, Dec 17, 2012 at 2:11 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Tue, 18 Dec 2012 11:00:35 +1300
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> Guido van Rossum wrote:
>> > (*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
>> > infinity, with the idea that if somehow a race condition added
>> > something to the ready queue just as we went to sleep, and there's no
>> > I/O at all, the system will recover eventually.
>>
>> I don't see how such a race condition can occur in a
>> cooperative multitasking system. There are no true
>> interrupts that can cause something to happen when
>> you're not expecting it. So I'd say let infinity
>> really mean infinity.
>
> Most event loops out there allow you to schedule callbacks from other
> (preemptive, OS-level) threads.

That's what call_soon_threadsafe() is for. But bugs happen (in either
user code or library code). And yes, call_soon_threadsafe() will use a
self-pipe on UNIX. (I hope someone else will write the Windows main
loop.)

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Tue Dec 18 01:40:47 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 16:40:47 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
Message-ID: <CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>

On Mon, Dec 17, 2012 at 2:57 PM, Geert Jansen <geertj at gmail.com> wrote:
> On Mon, Dec 17, 2012 at 6:47 PM, Guido van Rossum <guido at python.org> wrote:
>> I've not used repeatable timers myself but I see them in several other
>> interfaces. I do think they deserve a different method call to set
>> them up, even if the implementation will just be to add a repeat field
>> to the DelayedCall. When I start a timer with a 2 second repeat, does
>> it run now and then 2, 4, 6, ... seconds after, or should the first
>> run be in 2 seconds? Or are these separate parameters? Strawman
>> proposal: it runs in 2 seconds and then every 2 seconds. The API would
>> be event_loop.call_repeatedly(interval, callback, *args), returning a
>> DelayedCall with an interval attribute set to the interval value.
>
> That would work (in 2 secs, then 4, 6, ...). This is the Qt QTimer model.
>
> Both libev and libuv have a slightly more general timer that takes a
> timeout and a repeat value. When the timeout reaches zero, the timer
> will fire, and if repeat != 0, it will re-seed the timeout to that
> value.
>
> I haven't seen any real need for such a timer where interval !=
> repeat, and in any case it can pretty cheaply be emulated by adding a
> new timer on the first expiration only. So your call_repeatedly() call
> above should be fine.

I'm trying to stick to a somewhat minimalistic design here; repeated
timers sound fine; extra complexities seem redundant. (What's next --
built-in support for exponential back-off? :-)

>> (BTW, can someone *please* come up with a better name for DelayedCall?
>> It's tedious and doesn't abbreviate well. But I don't want to name the
>> class 'Callback' since I already use 'callback' for function objects
>> that are used as callbacks.)
>
> libev uses the generic term "Watcher", libuv uses "Handle". But their
> APIs are structured a bit differently from tulip, so I'm not sure if
> those names would make sense. They support many different types of
> events (including more esoteric events like process watches, on-fork
> handlers, and wall-clock timer events). Each event has its own class,
> named after the event type, that inherits from "Watcher" or
> "Handle". When an event is created, you pass it a reference to its
> loop. You manage the event fully through the event instance (e.g.
> starting it, setting its callback and other parameters, stopping it).
> The loop has only a few methods, notably "run" and "run_once".

I see. That's a fundamentally different API style, and one I'm less
familiar with. DelayedCall isn't meant to be that at all -- it's just
meant to be this object that (a) is sortable by time (needed for
heapq) and (b) can be cancelled (useful functionality in general). I
expect that at least one of the reasons for libuv etc. to do it their
way is probably that the languages are different -- Python has keyword
arguments to pass options, while C/C++ must use something else.

Anyway, Handler sounds like a pretty good name. Let me think it over.

> So for example, you'd say:
>
> loop = Loop()
> timer = Timer(loop)
> timer.start(2.0, callback)
> loop.run()
>
> The advantages of this approach are that naming is easier, and that you
> can also have a natural place to put methods that update the event
> after you created it. For example, you might want to temporarily
> suspend a timer or change its interval.

Ah, that's where the desire to cancel and restart a callback comes from.

> I quite liked the fresh approach taken by tulip, so that's why I tried
> to stay within its design. However, the disadvantage is that modifying
> events after you've created them is difficult (unless you create one
> DelayedCall subtype per event in which case you're probably better off
> creating those events through their constructor in the first place).

I wonder how often one needs to modify an event after it's been in use
for a while. The mutation API seems mostly useful to separate
construction from setting various parameters (to avoid insane
overloading of the constructor).

>>> * It would be nice to have a way to call a callback once per loop iteration.
>>>   An example here is dispatching in libdbus. The easiest way to do this is
>>>   to call dbus_connection_dispatch() every iteration of the loop (a more
>>>   complicated way exists to get notifications when the dispatch status
>>>   changes, but it is edge triggered and difficult to get right).
>>>
>>>   This could possibly be implemented by adding a "repeat" argument to
>>>   call_soon().
>>
>> Again, I'd rather introduce a new method. What should the semantics
>> be? Is this called just before or after we potentially go to sleep, or
>> at some other point, or at the very top or bottom of run_once()?
>
> That is a good question. Both libuv and libev have both options. The
> one that is called before we go to sleep is called a "Prepare"
> handler, the one after we come back from sleep a "Check" handler. The
> libev documentation has some words on check and prepare handlers here:
>
> http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#code_ev_prepare_code_and_code_ev_che
>
> I am not sure both are needed, but I can't foresee all the consequences.

I'm still not convinced that both are needed. However they are easy to
add, so if the need really does arise in practical use I am fine with
evolving the API that way. Until then, let's stick to KISS.

>> How about the following semantics for run_once():
>>
>> 1. compute deadline as the smallest of:
>>     - the time until the first event in the timer heap, if non empty
>>     - 0 if the ready queue is non empty
>>     - Infinity(*)
>>
>> 2. poll for I/O with the computed deadline, adding anything that is
>> ready to the ready queue
>>
>> 3. run items from the ready queue until it is empty
>
> I think doing this would work but I again can't fully foresee all the
> consequences. Let me play with this a little.

It's hard to foresee all the consequences. But it looks good to me too, so
I'll implement it this way. Maybe the Twisted folks have wisdom in
this area (though quite often, when pressed, they admit that their
APIs are not ideal, and have warts due to backward compatibility :-).

>> (*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
>> infinity, with the idea that if somehow a race condition added
>> something to the ready queue just as we went to sleep, and there's no
>> I/O at all, the system will recover eventually. But I've also heard
>> people worried about power conservation on mobile devices (or laptops)
>> complain about servers that wake up regularly even when there is no
>> work to do. Thoughts? I think I'll leave this out of the PEP, but what
>> should Tulip do?
>
> I had a look at libuv and libev. They take two different approaches:
>
> * libev uses a ~60 second timeout by default. This reason is subtle.
> Libev supports a wall-clock time event that fires when a certain
> wall-clock time has passed. Having a non-infinite timeout will allow
> it to pick up changes to the system time (e.g. by NTP), which would
> change when the wall-clock timer needs to run.
>
> * libuv does not have a wall-clock timer and uses an infinite timeout.

I've not actually ever seen a use case for the wall-clock timer, so
I've taken it out.

> In my view it would be best for tulip to use an infinite timeout
> unless at some point a wall-clock timer is added. That will help
> with power management. Regarding race conditions, I think they should
> be solved in other ways (e.g. by having a special method that can post
> callbacks to the loop in a thread-safe way and possibly write to a
> self-pipe).

Right, a self-pipe is already there. I'll stick with infinity in
Tulip, but an implementation can of course do what it wants to.

>> Hm. The PEP currently states that you can call cancel() on the
>> DelayedCall returned by e.g. add_reader() and it will act as if you
>> called remove_reader(). (Though I haven't implemented this yet --
>> either there would have to be a cancel callback on the DelayedCall or
>> the effect would be delayed.)
>
> Right now I think that cancelling a DelayedCall is not safe. It could
> busy-loop if the fd is ready.

That's because I'm not done implementing it. :-) But the more I think
about it the more I don't like calling cancel() on a read/write
handler.

>> But multiple callbacks per FD seems a different issue -- currently
>> add_reader() just replaces the previous callback if one is already
>> set. Since not every event loop can support this, I'm not sure it
>> ought to be in the PEP, and making it optional sounds like a recipe
>> for trouble (a library that depends on this may break subtly or only
>> under pressure). Also, what's the use case? If you really need this
>> you are free to implement a mechanism on top of the standard in user
>> code that dispatches to multiple callbacks -- that sounds like a small
>> amount of work if you really need it, but it sounds like an attractive
>> nuisance to put this in the spec.
>
> A not-so-good use case is libraries like libdbus that don't document
> their assumptions regarding this. For example, I have to provide an
> "add watch" function that creates a new watch (a watch is just a
> generic term for an FD event that can be read, write or read|write). I
> have observed that it only ever sets one read and one write watch per
> FD.
>
> If we go for one reader/writer per FD, then it's probably fine, but it
> would be nice if code that does install multiple readers/writers per
> FD would get an exception rather than silently updating the callback.
> The requirement could be that you need to remove the event before you
> can add a new event for the same FD.

That makes sense. If we wanted to be fancy we could have several
different APIs: add (must not be set), set (may be set), replace (must
be set). But I think just offering the add and remove APIs is nicely
minimalistic and lets you do everything else with ease. (I'll make the
remove API return True if it did remove something, False otherwise.)

>>> * After a DelayedCall is cancelled, it would also be very useful to have a
>>>   second method to enable it again. Having that functionality is more
>>>   efficient than creating a new event. For example, the D-BUS event loop
>>>   integration API has specific methods for toggling events on and off that
>>>   you need to provide.
>>
>> Really? Doesn't this functionality imply that something (besides user
>> code) is holding on to the DelayedCall after it is cancelled?
>
> Not that I can see. At least not for libuv and libev.

Never mind, this is just due to the difference in API style. I'm going
to ignore it unless I get a lot more pushback.

>> It seems
>> iffy to have to bend over backwards to support this alternate way of
>> doing something that we can already do, just because (on some
>> platform?) it might shave a microsecond off callback registration.
>
> According to the libdbus documentation there is a separate function to
> toggle an event on/off because that could be implemented without
> allocating memory.

Yeah, not gonna happen in Python. :-)

> But actually there's one kind-of idiomatic use for this that I've seen
> quite a few times in libraries. Assume you have a library that defines
> a connection. Often, you create two events for that connection in the
> constructor: a "write_event" and a "read_event". The read_event is
> normally enabled, but gets temporarily disabled when you need to
> throttle input. The write_event is normally disabled except when you
> get a short write on output.
>
> Just enabling/disabling these events is a bit more friendly to the
> programmer IMHO than having to cancel and recreate them when needed.

The methods on the Transport class take care of this at a higher
level: pause() and resume() to suspend reading, and the write() method
takes care of buffering and so on.
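
(A toy sketch of how pause()/resume() can sit on top of
add_reader()/remove_reader(); FakeLoop and the Transport internals
shown here are assumptions, not tulip's actual implementation.)

```python
class FakeLoop:
    """Stand-in for the event loop's add_reader/remove_reader."""
    def __init__(self):
        self.readers = {}
    def add_reader(self, fd, callback):
        self.readers[fd] = callback
    def remove_reader(self, fd):
        return self.readers.pop(fd, None) is not None

class Transport:
    """Toy transport: pause()/resume() toggle the reader registration."""
    def __init__(self, loop, fd, data_received):
        self._loop, self._fd, self._cb = loop, fd, data_received
        self._loop.add_reader(fd, self._cb)
    def pause(self):
        self._loop.remove_reader(self._fd)
    def resume(self):
        self._loop.add_reader(self._fd, self._cb)

loop = FakeLoop()
transport = Transport(loop, 5, lambda: None)
transport.pause()
print(5 in loop.readers)   # False
transport.resume()
print(5 in loop.readers)   # True
```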

>>> * (Nitpick) Multiplexing absolute and relative timeouts for the "when"
>>>   argument in call_later() is a little too smart in my view and can lead
>>>   to bugs.
>>
>> Agreed; that's why I left it out of the PEP. The v2 implementation
>> will use time.monotonic(),
>>
>>> With some input, I'd be happy to produce patches.
>>
>> I hope I've given you enough input; it's probably better to discuss
>> the specs first before starting to code. But please do review the
>> tulip v2 code in the tulip subdirectory; if you want to help you I'll
>> be happy to give you commit privileges to that repo, or I'll take
>> patches if you send them.
>
> OK great. Let me work on this over the next couple of days and
> hopefully come up with something.

Excellent. Please do check back regularly for additions to the tulip
subdirectory!

-- 
--Guido van Rossum (python.org/~guido)


From ncoghlan at gmail.com  Tue Dec 18 04:20:53 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Dec 2012 13:20:53 +1000
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
	<CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>
Message-ID: <CADiSq7c1c+nfD6SJFFY_qS5PhHgb8O6dYwdhvLyYwr=QnPbraA@mail.gmail.com>

On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido at python.org> wrote:

> I see. That's a fundamentally different API style, and one I'm less
> familiar with. DelayedCall isn't meant to be that at all -- it's just
> meant to be this object that (a) is sortable by time (needed for
> heapq) and (b) can be cancelled (useful functionality in general). I
> expect that at least one of the reasons for libuv etc. to do it their
> way is probably that the languages are different -- Python has keyword
> arguments to pass options, while C/C++ must use something else.
>
> Anyway, Handler sounds like a pretty good name. Let me think it over.
>

Is DelayedCall a subclass of Future, like Task? If so, FutureCall might
work.

> >>> * It would be nice to have a way to call a callback once per loop iteration.
> >>>   An example here is dispatching in libdbus. The easiest way to do this is
> >>>   to call dbus_connection_dispatch() every iteration of the loop (a more
> >>>   complicated way exists to get notifications when the dispatch status
> >>>   changes, but it is edge triggered and difficult to get right).
> >>>
> >>>   This could possibly be implemented by adding a "repeat" argument to
> >>>   call_soon().
> >>
> >> Again, I'd rather introduce a new method. What should the semantics
> >> be? Is this called just before or after we potentially go to sleep, or
> >> at some other point, or at the very top or bottom of run_once()?
> >
> > That is a good question. Both libuv and libev have both options. The
> > one that is called before we go to sleep is called a "Prepare"
> > handler, the one after we come back from sleep a "Check" handler. The
> > libev documentation has some words on check and prepare handlers here:
> >
> > http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#code_ev_prepare_code_and_code_ev_che
> >
> > I am not sure both are needed, but I can't foresee all the consequences.
>
> I'm still not convinced that both are needed. However they are easy to
> add, so if the need really does arise in practical use I am fine with
> evolving the API that way. Until then, let's stick to KISS.
>


> > * libev uses a ~60 second timeout by default. This reason is subtle.
> > Libev supports a wall-clock time event that fires when a certain
> > wall-clock time has passed. Having a non-infinite timeout will allow
> > it to pick up changes to the system time (e.g. by NTP), which would
> > change when the wall-clock timer needs to run.
> >
> > * libuv does not have a wall-clock timer and uses an infinite timeout.
>
> I've not actually ever seen a use case for the wall-clock timer, so
> I've taken it out.
>

If someone really does want a wall-clock timer with a given granularity, it
can be handled by adding a repeating timer with that granularity (with the
obvious consequences for low power modes).
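
(One way to sketch that: a check function run from the repeating
monotonic timer that compares the wall clock against the target; the
helper name and closure layout here are invented for illustration.)

```python
def make_wall_clock_check(target, callback, clock):
    """Return a function to run from a repeating monotonic timer: it
    fires `callback` once the wall clock passes `target`, so an
    NTP-style jump is noticed within one granularity interval."""
    state = {'fired': False}
    def check():
        if not state['fired'] and clock() >= target:
            state['fired'] = True
            callback()
    return check

alarms = []
wall = [100.0]                     # fake wall clock we can jump around
check = make_wall_clock_check(105.0, lambda: alarms.append(True),
                              lambda: wall[0])
check(); print(alarms)             # []
wall[0] = 106.0                    # simulate an NTP step forward
check(); print(alarms)             # [True]
```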


> >> But multiple callbacks per FD seems a different issue -- currently
> >> add_reader() just replaces the previous callback if one is already
> >> set. Since not every event loop can support this, I'm not sure it
> >> ought to be in the PEP, and making it optional sounds like a recipe
> >> for trouble (a library that depends on this may break subtly or only
> >> under pressure). Also, what's the use case? If you really need this
> >> you are free to implement a mechanism on top of the standard in user
> >> code that dispatches to multiple callbacks -- that sounds like a small
> >> amount of work if you really need it, but it sounds like an attractive
> >> nuisance to put this in the spec.
> >
> > A not-so-good use case is libraries like libdbus that don't document
> > their assumptions regarding this. For example, I have to provide an
> > "add watch" function that creates a new watch (a watch is just a
> > generic term for an FD event that can be read, write or read|write). I
> > have observed that it only ever sets one read and one write watch per
> > FD.
> >
> > If we go for one reader/writer per FD, then it's probably fine, but it
> > would be nice if code that does install multiple readers/writers per
> > FD would get an exception rather than silently updating the callback.
> > The requirement could be that you need to remove the event before you
> > can add a new event for the same FD.
>
> That makes sense. If we wanted to be fancy we could have several
> different APIs: add (must not be set), set (may be set), replace (must
> be set). But I think just offering the add and remove APIs is nicely
> minimalistic and lets you do everything else with ease. (I'll make the
> remove API return True if it did remove something, False otherwise.)
>

Perhaps the best bet would be to have the standard API allow multiple
callbacks, and emulate that on systems which don't natively support
multiple callbacks for a single event?

Otherwise, I don't see how an event loop could efficiently expose access to
the multiple callback APIs without requiring awkward fallbacks in the code
interacting with the event loop. Given that the natural fallback
implementation is reasonably clear (i.e. a single callback that calls all
of the other callbacks), why force reimplementing that on users rather than
event loop authors?
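
(The natural fallback can be sketched as an adapter object owning the
single native registration; MultiReaderAdapter and LoopStub are
hypothetical names, not proposed API.)

```python
class LoopStub:
    """Stand-in loop that supports a single callback per FD."""
    def __init__(self):
        self.readers = {}
    def add_reader(self, fd, callback):
        self.readers[fd] = callback

class MultiReaderAdapter:
    """One native registration that fans out to many user callbacks."""
    def __init__(self, loop, fd):
        self._callbacks = []
        loop.add_reader(fd, self._dispatch)   # single underlying reader
    def add_callback(self, callback):
        self._callbacks.append(callback)
    def remove_callback(self, callback):
        self._callbacks.remove(callback)
    def _dispatch(self):
        for cb in list(self._callbacks):      # copy: callbacks may unregister
            cb()

loop = LoopStub()
mux = MultiReaderAdapter(loop, 7)
hits = []
mux.add_callback(lambda: hits.append('a'))
mux.add_callback(lambda: hits.append('b'))
loop.readers[7]()   # simulate the FD becoming readable
print(hits)         # ['a', 'b']
```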

Related, the protocol/transport API design may end up needing to consider
the gather/scatter problem (i.e. fanning out data from a single transport
to multiple consumers, as well as feeding data from multiple producers into
a single underlying transport). Actual *implementations* of such tools
shouldn't be needed in the standard suite, but at least understanding how
you would go about writing multiplexers and demultiplexers can be a good
test of a stacked I/O design.

> > Just enabling/disabling these events is a bit more friendly to the
> > programmer IMHO than having to cancel and recreate them when needed.
>
> The methods on the Transport class take care of this at a higher
> level: pause() and resume() to suspend reading, and the write() method
> takes care of buffering and so on.
>

And the main advantage of handling that at a higher level is that suitable
buffering designs are going to be transport specific.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/ccaae0ef/attachment.html>

From ncoghlan at gmail.com  Tue Dec 18 04:26:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Dec 2012 13:26:38 +1000
Subject: [Python-ideas] Graph class
In-Reply-To: <50CE5904.9090102@krosing.net>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
	<CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
	<50CE5904.9090102@krosing.net>
Message-ID: <CADiSq7dU6=wHj8+4fyhRvXw55MZpT=v9TBJw+G_L=eUwF_Q1FQ@mail.gmail.com>

On Mon, Dec 17, 2012 at 9:28 AM, Hannu Krosing <hannu at krosing.net> wrote:

>  On 12/16/2012 04:41 PM, Guido van Rossum wrote:
>
> I think of graphs and trees as patterns, not data structures.
>
>
> How do you draw line between what is data structure and what is pattern ?
>

A rough rule of thumb is that if it's harder to remember the configuration
options in the API than it is to just write a purpose-specific function,
it's probably better as a pattern that can be tweaked for a given use case
than it is as an actual data structure.

More generally, ABCs and magic methods are used to express patterns (like
iteration), which may be implemented by various data structures.

A graph library that focused on defining a good abstraction (and adapters)
that allowed graph algorithms to be written that worked with multiple
existing Python graph data structures could be quite interesting.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/cff3927b/attachment.html>

From guido at python.org  Tue Dec 18 05:01:18 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Dec 2012 20:01:18 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CADiSq7c1c+nfD6SJFFY_qS5PhHgb8O6dYwdhvLyYwr=QnPbraA@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
	<CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>
	<CADiSq7c1c+nfD6SJFFY_qS5PhHgb8O6dYwdhvLyYwr=QnPbraA@mail.gmail.com>
Message-ID: <CAP7+vJJOo0kvOiHek59eT-mUBfocw6gP3YpMCR6uhPJR+6LVtA@mail.gmail.com>

On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido at python.org> wrote:

[A better name for DelayedCall]
>> Anyway, Handler sounds like a pretty good name. Let me think it over.

> Is DelayedCall a subclass of Future, like Task? If so, FutureCall might
> work.

No, they're completely unrelated. (I'm even thinking of renaming its
cancel() to avoid the confusion.)

I still like Handler best. In fact, if I'd thought of Handler before,
I wouldn't have asked for a better name. :-)

Going once, going twice...

[Wall-clock timers]
> If someone really does want a wall-clock timer with a given granularity, it
> can be handled by adding a repeating timer with that granularity (with the
> obvious consequences for low power modes).

+1.

[Multiple calls per FD]
>> That makes sense. If we wanted to be fancy we could have several
>> different APIs: add (must not be set), set (may be set), replace (must
>> be set). But I think just offering the add and remove APIs is nicely
>> minimalistic and lets you do everything else with ease. (I'll make the
>> remove API return True if it did remove something, False otherwise.)

> Perhaps the best bet would be to have the standard API allow multiple
> callbacks, and emulate that on systems which don't natively support multiple
> callbacks for a single event?

Hm. AFAIK Twisted doesn't support this either. Antoine, do you know? I
didn't see it in the Tornado event loop either.

> Otherwise, I don't see how an event loop could efficiently expose access to
> the multiple callback APIs without requiring awkward fallbacks in the code
> interacting with the event loop. Given that the natural fallback
> implementation is reasonably clear (i.e. a single callback that calls all of
> the other callbacks), why force reimplementing that on users rather than
> event loop authors?
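For concreteness, the fallback described there (a single registered callback that fans out to a user-level list) might look roughly like this; add_reader()/remove_reader() follow the draft API names, everything else is a hypothetical sketch:

```python
class MultiCallbackReaders:
    """Emulate several read callbacks per FD on top of a loop that
    supports only one callback per FD (hypothetical sketch)."""

    def __init__(self, loop):
        self._loop = loop        # assumed to offer add_reader()/remove_reader()
        self._callbacks = {}     # fd -> list of callbacks

    def add_reader(self, fd, callback):
        cbs = self._callbacks.setdefault(fd, [])
        if not cbs:
            # First callback for this fd: register the one dispatcher.
            self._loop.add_reader(fd, self._dispatch, fd)
        cbs.append(callback)

    def remove_reader(self, fd, callback):
        cbs = self._callbacks.get(fd, [])
        try:
            cbs.remove(callback)
        except ValueError:
            return False
        if not cbs:
            del self._callbacks[fd]
            self._loop.remove_reader(fd)
        return True

    def _dispatch(self, fd):
        # Copy the list so callbacks may add/remove during dispatch.
        for cb in list(self._callbacks.get(fd, ())):
            cb()

# Tiny demonstration with a stand-in loop object:
class FakeLoop:
    def __init__(self):
        self.readers = {}
    def add_reader(self, fd, callback, *args):
        self.readers[fd] = (callback, args)
    def remove_reader(self, fd):
        del self.readers[fd]

loop = FakeLoop()
readers = MultiCallbackReaders(loop)
hits = []
def cb_a(): hits.append("a")
def cb_b(): hits.append("b")
readers.add_reader(4, cb_a)
readers.add_reader(4, cb_b)
dispatch, args = loop.readers[4]
dispatch(*args)   # simulate the loop reporting fd 4 as readable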

But what's the use case?

I don't think our goal should be to offer APIs for any feature that
any event loop might offer. It's not quite a least-common denominator
either though -- it's about offering commonly needed functionality,
and interoperability.

Also, event loop implementations are allowed to offer additional APIs
on their implementation. If the need for multiple handlers per FD only
exists on those platforms where the platform's event loop supports it,
no harm is done if the functionality is only available through a
platform-specific API.

But still, I don't understand the use case. Possibly it is using file
descriptors as a more general signaling mechanism? That sounds pretty
platform specific anyway (on Windows, FDs must represent sockets).

If someone shows me a real-world use case I may change my mind.

> Related, the protocol/transport API design may end up needing to consider
> the gather/scatter problem (i.e. fanning out data from a single transport to
> multiple consumers, as well as feeding data from multiple producers into a
> single underlying transport). Actual *implementations* of such tools
> shouldn't be needed in the standard suite, but at least understanding how
> you would go about writing multiplexers and demultiplexers can be a good
> test of a stacked I/O design.

Twisted supports this for writing through its writeSequence(), which
appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told
me that Twisted rarely uses the platform's scatter/gather primitives,
because they are so damn hard to use, and the kernel implementation
often just joins the buffers together before passing it to the regular
send()...)

But regardless, I don't think scatter/gather would use multiple
callbacks per FD.

I think it would be really hard to benefit from reading into multiple
buffers in Python.

>> > Just enabling/disabling these events is a bit more friendly to the
>> > programmer IMHO than having to cancel and recreate them when needed.
>>
>> The methods on the Transport class take care of this at a higher
>> level: pause() and resume() to suspend reading, and the write() method
>> takes care of buffering and so on.

> And the main advantage of handling that at a higher level is that suitable
> buffering designs are going to be transport specific.

+1

-- 
--Guido van Rossum (python.org/~guido)


From ncoghlan at gmail.com  Tue Dec 18 08:21:37 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Dec 2012 17:21:37 +1000
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJJOo0kvOiHek59eT-mUBfocw6gP3YpMCR6uhPJR+6LVtA@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
	<CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>
	<CADiSq7c1c+nfD6SJFFY_qS5PhHgb8O6dYwdhvLyYwr=QnPbraA@mail.gmail.com>
	<CAP7+vJJOo0kvOiHek59eT-mUBfocw6gP3YpMCR6uhPJR+6LVtA@mail.gmail.com>
Message-ID: <CADiSq7f2RrZ1mpnJwunszdoaW7usKAtkwRCkT5PziZEZGoqgdw@mail.gmail.com>

On Tue, Dec 18, 2012 at 2:01 PM, Guido van Rossum <guido at python.org> wrote:

> On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Also, event loop implementations are allowed to offer additional APIs
> on their implementation. If the need for multiple handlers per FD only
> exists on those platforms where the platform's event loop supports it,
> no harm is done if the functionality is only available through a
> platform-specific API.
>

Sure, but since we know this capability is offered by multiple event loops,
it would be good if there was a defined way to go about exposing it.


> But still, I don't understand the use case. Possibly it is using file
> descriptors as a more general signaling mechanism? That sounds pretty
> platform specific anyway (on Windows, FDs must represent sockets).
>
> If someone shows me a real-world use case I may change my mind.
>

The most likely use case that comes to mind is monitoring and debugging
(i.e. the event loop equivalent of a sys.settrace). Being able to tap into
a datastream (e.g. to dump it to a console or pipe it to a monitoring
process) can be really powerful, and being able to do it at the Python
level means you have this kind of capability even without root access to
the machine to run Wireshark.

There are other more obscure signal analysis use cases that occur to me,
but those could readily be handled with a custom transport implementation
that duplicated that data stream, so I don't think there's any reason to
worry about those.
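Such a duplicating tap could be written as an ordinary protocol wrapper; a minimal sketch (TapProtocol and the monitor callable are hypothetical, the method names follow the draft protocol API):

```python
class TapProtocol:
    """Wrap a protocol and mirror its incoming data to a monitor
    callable, leaving the wrapped protocol's view untouched."""

    def __init__(self, wrapped, monitor):
        self._wrapped = wrapped
        self._monitor = monitor      # e.g. a log-file write or console dump

    def connection_made(self, transport):
        self._wrapped.connection_made(transport)

    def data_received(self, data):
        self._monitor(data)              # duplicate to the monitoring sink
        self._wrapped.data_received(data)

    def connection_lost(self, exc):
        self._wrapped.connection_lost(exc)

# Demonstration with a trivial collecting protocol:
class Collector:
    def __init__(self):
        self.received = []
    def connection_made(self, transport):
        self.transport = transport
    def data_received(self, data):
        self.received.append(data)
    def connection_lost(self, exc):
        self.exc = exc

seen = []
inner = Collector()
tap = TapProtocol(inner, seen.append)
tap.data_received(b"hello")
```

Because the tap looks exactly like a protocol, it can be inserted without the wrapped code noticing.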

> > Related, the protocol/transport API design may end up needing to consider
> > the gather/scatter problem (i.e. fanning out data from a single transport to
> > multiple consumers, as well as feeding data from multiple producers into a
> > single underlying transport). Actual *implementations* of such tools
> > shouldn't be needed in the standard suite, but at least understanding how
> > you would go about writing multiplexers and demultiplexers can be a good
> > test of a stacked I/O design.
>
> Twisted supports this for writing through its writeSequence(), which
> appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told
> me that Twisted rarely uses the platform's scatter/gather primitives,
> because they are so damn hard to use, and the kernel implementation
> often just joins the buffers together before passing it to the regular
> send()...)
>
> But regardless, I don't think scatter/gather would use multiple
> callbacks per FD.
>
> I think it would be really hard to benefit from reading into multiple
> buffers in Python.
>

Sorry, I wasn't quite clear on what I meant by gather/scatter and it's more
a protocol thing than an event loop thing.

Specifically, gather/scatter interfaces are most useful for multiplexed
transports. The ones I'm particularly familiar with are traditional
telephony transports like E1 links, with 15 time-division-multiplexed
channels on the wire (and a signalling timeslot), as well a few different
HF comms protocols. When reading from one of those, you have a
demultiplexing component which is reading the serial data coming in on the
wire and making it look like 15 distinct data channels from the
application's point of view. Similarly, the output multiplexer takes 15
streams of data from the application and interleaves them into the single
stream on the wire.

The rise of packet switching means that sharing connections like that is
increasingly less common, though, so gather/scatter devices are
correspondingly less useful in a networking context. The only modern use
cases I can think of that someone might want to handle with Python are
things like sharing a single USB or classic serial connection amongst
multiple data streams. However, I suspect the standard transport and
protocol API definitions already proposed should also suffice for the
gather/scatter use case, as such a component would largely work like any
other protocol-as-transport adapter, with the difference being that there
would be a many-to-one relationship between the number of interfaces on the
application side and those on the communications side.
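A toy demultiplexer along those lines, assuming only that channel protocols expose data_received() (everything here is a hypothetical illustration of the adapter shape, not a proposed API):

```python
class TdmDemultiplexer:
    """Toy time-division demultiplexer: incoming frames are nchannels
    bytes long and byte i belongs to channel i. Channel protocols only
    need a data_received() method. Real TDM framing is far richer; this
    only illustrates the many-to-one adapter shape."""

    def __init__(self, channels):
        self._channels = list(channels)
        self._nchan = len(self._channels)
        self._buffer = b""

    def data_received(self, data):
        self._buffer += data
        # Deliver complete frames only; keep the remainder buffered.
        while len(self._buffer) >= self._nchan:
            frame = self._buffer[:self._nchan]
            self._buffer = self._buffer[self._nchan:]
            for i, proto in enumerate(self._channels):
                proto.data_received(frame[i:i + 1])

class Channel:
    def __init__(self):
        self.data = b""
    def data_received(self, data):
        self.data += data

chans = [Channel() for _ in range(3)]
demux = TdmDemultiplexer(chans)
demux.data_received(b"abcd")   # one full frame "abc", "d" stays buffered
demux.data_received(b"ef")     # completes the second frame "def"
```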

(Technically, gather/scatter components can also be used the other way
around to distribute a single data stream across multiple transports, but
that use case is even less likely to come up when programming in Python.
Multi-channel HF data comms is the only possibility that really comes to
mind.)

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/d73c5cbb/attachment.html>

From solipsis at pitrou.net  Tue Dec 18 08:29:55 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Dec 2012 08:29:55 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
	<CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>
	<CADiSq7c1c+nfD6SJFFY_qS5PhHgb8O6dYwdhvLyYwr=QnPbraA@mail.gmail.com>
	<CAP7+vJJOo0kvOiHek59eT-mUBfocw6gP3YpMCR6uhPJR+6LVtA@mail.gmail.com>
Message-ID: <20121218082955.77e325e0@pitrou.net>

On Mon, 17 Dec 2012 20:01:18 -0800
Guido van Rossum <guido at python.org> wrote:
> [Multiple calls per FD]
> >> That makes sense. If we wanted to be fancy we could have several
> >> different APIs: add (must not be set), set (may be set), replace (must
> >> be set). But I think just offering the add and remove APIs is nicely
> >> minimalistic and lets you do everything else with ease. (I'll make the
> >> remove API return True if it did remove something, False otherwise.)
> 
> > Perhaps the best bet would be to have the standard API allow multiple
> > callbacks, and emulate that on systems which don't natively support multiple
> > callbacks for a single event?
> 
> Hm. AFAIK Twisted doesn't support this either. Antoine, do you know? I
> didn't see it in the Tornado event loop either.

I think neither Twisted nor Tornado support it. add_reader() /
add_writer() APIs are not for the end user, they are a building block
for the framework to write higher-level abstractions.
(although, Tornado being quite low-level, you can end up having to use
add_reader() / add_writer() anyway - e.g. for UDP)

It also doesn't seem to me to make a lot of sense to allow multiplexing
at the event loop level. It is probably a protocol- or transport- level
feature (depending on the protocol and transport, obviously :-)).

Nick mentions debugging / monitoring, but I don't understand how you do
that with a write callback (or a read callback, actually, since
reading from a socket will consume the data and make it unavailable
for other readers). You really need to do it at a protocol/transport's
write()/data_received() level.

Regards

Antoine.




From benoitc at gunicorn.org  Tue Dec 18 08:25:17 2012
From: benoitc at gunicorn.org (Benoit Chesneau)
Date: Tue, 18 Dec 2012 08:25:17 +0100
Subject: [Python-ideas] Late to the async party (PEP 3156)
In-Reply-To: <20121216111602.383ebf4d@pitrou.net>
References: <50CD2592.5010507@urandom.ca>
	<CAP7+vJJ0RbHS0fU1zPNAsLt83RDtyVmH1GeWFfNiUZva_t+5ow@mail.gmail.com>
	<20121216111602.383ebf4d@pitrou.net>
Message-ID: <37A96766-6709-4B85-9005-7221A753A2FF@gunicorn.org>


On Dec 16, 2012, at 11:16 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Sat, 15 Dec 2012 21:37:15 -0800
> Guido van Rossum <guido at python.org> wrote:
>> Hi Jason,
>> 
>> I don't think you've missed anything. I had actually planned to keep
>> PEP 3156 unpublished for a bit longer, since I'm not done writing the
>> reference implementation -- I'm sure that many of the issues currently
>> marked open or TBD will be resolved that way. There hasn't been any
>> public discussion since the last threads on python-ideas some weeks
>> ago -- however I've met in person with some Twisted folks and
>> exchanged private emails with some other interested parties.
> 
> For the record, have you looked at the pyuv API? It's rather nicely
> orthogonal, although it lacks a way to stop the event loop.
> https://pyuv.readthedocs.org/en
> 


For now the only way to stop the event loop is either to stop all the events, or to drive its execution yourself one iteration at a time:

    while True:
        if loop.run_once():
            ...
            continue

If you have any questions about it I can help. I plan to use it in my own lib and already use it in gaffer [1]. One of the advantages of libuv is its multi-platform support: on Windows it uses IOCP, on Unix, plain socket APIs, etc.


- benoît

[1] http://github.com/benoitc/gaffer

From ncoghlan at gmail.com  Tue Dec 18 08:39:39 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Dec 2012 17:39:39 +1000
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <20121218082955.77e325e0@pitrou.net>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
	<CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>
	<CADiSq7c1c+nfD6SJFFY_qS5PhHgb8O6dYwdhvLyYwr=QnPbraA@mail.gmail.com>
	<CAP7+vJJOo0kvOiHek59eT-mUBfocw6gP3YpMCR6uhPJR+6LVtA@mail.gmail.com>
	<20121218082955.77e325e0@pitrou.net>
Message-ID: <CADiSq7cTR24zF=sf1CpNZ=wT-kjw1aDSHaF+MryeTJ5QJqENOw@mail.gmail.com>

On Tue, Dec 18, 2012 at 5:29 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> Nick mentions debugging / monitoring, but I don't understand how you do
> that with a write callback (or a read callback, actually, since
> reading from a socket will consume the data and make it unavailable
> for other readers). You really need to do it at a protocol/transport's
> write()/data_received() level.
>

Yeah, monitoring probably falls into the same gather/scatter design model
as demultiplexing (receive side) and multi-channel transports (transmit
side).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/5a1f3581/attachment.html>

From geertj at gmail.com  Tue Dec 18 08:26:00 2012
From: geertj at gmail.com (Geert Jansen)
Date: Tue, 18 Dec 2012 08:26:00 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
	<20121217231134.19ede507@pitrou.net>
	<CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>
Message-ID: <CADbA=FUZcfZfNi8OXyYbcRY9X+YTnvwZmUckK_QNUX_kn6h7ig@mail.gmail.com>

On Tue, Dec 18, 2012 at 1:00 AM, Guido van Rossum <guido at python.org> wrote:
> On Mon, Dec 17, 2012 at 2:11 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> On Tue, 18 Dec 2012 11:00:35 +1300
>> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>> Guido van Rossum wrote:
>>> > (*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
>>> > infinity, with the idea that if somehow a race condition added
>>> > something to the ready queue just as we went to sleep, and there's no
>>> > I/O at all, the system will recover eventually.
>>>
>>> I don't see how such a race condition can occur in a
>>> cooperative multitasking system. There are no true
>>> interrupts that can cause something to happen when
>>> you're not expecting it. So I'd say let infinity
>>> really mean infinity.
>>
>> Most event loops out there allow you to schedule callbacks from other
>> (preemptive, OS-level) threads.
>
> That's what call_soon_threadsafe() is for. But bugs happen (in either
> user code or library code). And yes, call_soon_threadsafe() will use a
> self-pipe on UNIX. (I hope someone else will write the Windows main
> loop.)

I needed a self-pipe on Windows before. See below. With this, the
select() based loop might work unmodified on Windows.

https://gist.github.com/4325783

Of course it wouldn't be as efficient as an IOCP based loop.
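For reference, the usual loopback trick for emulating socketpair() on Windows looks roughly like this (a sketch of the standard technique, not the gist's actual code):

```python
import socket

def socketpair_compat():
    """Emulate socket.socketpair() with a loopback TCP connection so
    that a select()-based loop can be woken on Windows (sketch of the
    usual trick)."""
    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsock.bind(("127.0.0.1", 0))        # let the OS pick a free port
    lsock.listen(1)
    csock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    csock.connect(lsock.getsockname())
    ssock, _ = lsock.accept()
    lsock.close()                        # the listener is no longer needed
    return ssock, csock
```

Writing a byte to one end then makes the other end readable, which is all a self-pipe wakeup needs.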

Regards,
Geert


From tjreedy at udel.edu  Tue Dec 18 10:06:39 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 18 Dec 2012 04:06:39 -0500
Subject: [Python-ideas] Graph class
In-Reply-To: <CADiSq7dU6=wHj8+4fyhRvXw55MZpT=v9TBJw+G_L=eUwF_Q1FQ@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
	<CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
	<50CE5904.9090102@krosing.net>
	<CADiSq7dU6=wHj8+4fyhRvXw55MZpT=v9TBJw+G_L=eUwF_Q1FQ@mail.gmail.com>
Message-ID: <kapbn3$661$1@ger.gmane.org>

On 12/17/2012 10:26 PM, Nick Coghlan wrote:
> On Mon, Dec 17, 2012 at 9:28 AM, Hannu Krosing
> <hannu at krosing.net
> <mailto:hannu at krosing.net>> wrote:
>
>     On 12/16/2012 04:41 PM, Guido van Rossum wrote:
>>     I think of graphs and trees as patterns, not data structures.
>
>     How do you draw line between what is data structure and what is
>     pattern ?
>
>
> A rough rule of thumb is that if it's harder to remember the
> configuration options in the API than it is to just write a
> purpose-specific function, it's probably better as a pattern that can be
> tweaked for a given use case than it is as an actual data structure.
>
> More generally, ABCs and magic methods are used to express patterns
> (like iteration), which may be implemented by various data structures.
>
> A graph library that focused on defining a good abstraction (and
> adapters) that allowed graph algorithms to be written that worked with
> multiple existing Python graph data structures could be quite interesting.

I was just thinking that what is needed, at least as a first step, is a 
graph API, like the DB API, that would allow the writing of algorithms 
against one API and adapters to various implementations. I expect to be 
writing some graph algorithms (in Python) in the next year and will try 
to keep that idea in mind and see if it makes any sense, versus just 
whipping up an implementation that fits the particular problem.

-- 
Terry Jan Reedy



From solipsis at pitrou.net  Tue Dec 18 11:01:36 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Dec 2012 11:01:36 +0100
Subject: [Python-ideas] PEP 3156 feedback
Message-ID: <20121218110136.1f85cfae@pitrou.net>


Hello,

Here is my own feedback on the in-progress PEP 3156. Please discard it
if it's too early to give feedback :-))

Event loop API
--------------

I would like to say that I prefer Tornado's model: for each primitive
provided by Tornado, you can pass an explicit Loop instance which you
instantiated manually.
There is no module function or policy object hiding this mechanism:
it's simple, explicit and flexible (in other words: if you want a
per-thread event loop, just do it yourself using TLS :-)).
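A per-thread loop in that style is only a few lines (sketch; get_thread_loop and loop_factory are hypothetical names, not part of any proposed API):

```python
import threading

_tls = threading.local()

def get_thread_loop(loop_factory):
    """Per-thread event loop without any global policy object: each
    thread lazily creates and caches its own loop instance."""
    try:
        return _tls.loop
    except AttributeError:
        _tls.loop = loop_factory()
        return _tls.loop
```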

There are some requirements I've found useful:

- being able to instantiate multiple loops, either at the same time or
  serially (this is especially nice for unit tests; Twisted has to use
  a dedicated test runner just because their reactor doesn't support
  multiple instances or restarts)

- being able to stop a loop explicitly: having to unregister all
  handlers or delayed calls is a PITA in non-trivial situations (for
  example you might have multiple protocol instances, each with a bunch
  of timers, some perhaps even in third-party libraries; keeping track
  of all this is the event loop's job)

* The optional sock_*() methods: how about having different ABCs, e.g.
  the EventLoop ABC for basic behaviour, and the NetworkedEventLoop ABC
  adding the socket helpers?

Protocols and transports
------------------------

We probably want to provide a Protocol base class and encourage people
to inherit it. It can provide useful functionality (perhaps write()
and writelines() shims? it can make mocking easier).

My own opinion about Twisted's API is that the Factory class is often
useless, and adds a cognitive burden. If you need a place to track all
protocols of a given kind (e.g. all connections), you can do it
yourself. Also, the Factory implies that you don't control how exactly
your protocol gets instantiated (unless you override some method on the
Factory whose name escapes me: it is cumbersome).

So, when creating a client, I would pass it a protocol instance.
When creating a server, I would pass it a protocol class. Here the base
Protocol class comes into play, its __init__() could take the transport
as argument and set the "transport" attribute with it. Further args
could be optionally passed to the constructor:

class MyProtocol(Protocol):
    def __init__(self, transport, my_personal_attribute):
        Protocol.__init__(self, transport)
        self.my_personal_attribute = my_personal_attribute
    ...

def listen(ioloop):
    # Each new connection will instantiate a MyProtocol with "foobar"
    # for my_personal_attribute.
    ioloop.listen_tcp(("0.0.0.0", 8080), MyProtocol, "foobar")

(The hypothetical listen_tcp() is just a name: perhaps it's actually
start_serving(). It should accept any callable, not just a class:
therefore, you can define complex behaviour if you like)


I think the transport / protocol registration must be done early, not in
connection_made(). Sometimes you will want to do things on a protocol
before you know a connection is established, for example queue things
to write on the transport. A use case is a reconnecting TCP client:
the protocol will continue existing at times when the connection is
down.
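A minimal sketch of such a protocol, queuing writes while disconnected (BufferingProtocol is a hypothetical name; the method names follow the draft protocol API):

```python
class BufferingProtocol:
    """The protocol outlives individual connections: writes made while
    disconnected are queued and flushed once a transport appears."""

    def __init__(self):
        self.transport = None
        self._pending = []

    def write(self, data):
        if self.transport is None:
            self._pending.append(data)    # no connection yet (or right now)
        else:
            self.transport.write(data)

    def connection_made(self, transport):
        self.transport = transport
        for data in self._pending:
            transport.write(data)
        del self._pending[:]

    def connection_lost(self, exc):
        self.transport = None             # queue again until reconnected

# Demonstration with a stand-in transport:
class FakeTransport:
    def __init__(self):
        self.written = []
    def write(self, data):
        self.written.append(data)

proto = BufferingProtocol()
proto.write(b"early")          # queued: connection not yet established
t = FakeTransport()
proto.connection_made(t)
proto.write(b"late")
```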

Unconnected protocols need their own base class and API:
data_received()'s signature should be (data, remote_addr) or
(remote_addr, data). Same for write().

* writelines() sounds ambiguous for datagram protocols: does it send
  those "lines" as a single datagram, or one separate datagram per
  "line"? The equivalent code suggests the latter, but which one makes
  more sense?

* connection_lost(): you definitely want to know whether it's you or the
  other end who closed the connection. Typically, if the other end
  closed the connection, you will have to run some cleanup steps, and
  perhaps even log an error somewhere (if the connection was closed
  unexpectedly).
  Actually, I'm not sure it's useful to call connection_lost() when you
  closed the connection yourself: are there any use cases?


Regards

Antoine.




From shane at umbrellacode.com  Tue Dec 18 11:47:41 2012
From: shane at umbrellacode.com (Shane Green)
Date: Tue, 18 Dec 2012 02:47:41 -0800
Subject: [Python-ideas] Python-ideas Digest, Vol 73, Issue 38
In-Reply-To: <mailman.4948.1355815301.29568.python-ideas@python.org>
References: <mailman.4948.1355815301.29568.python-ideas@python.org>
Message-ID: <84FBCF87-EB2E-4887-9184-C5CE3B074ABA@umbrellacode.com>

Sending the demultiplexed data through 15 pipes, so that the application is actually dealing with 15 streams of data via single-callback notifications from the event loop, seems like the more KISS approach in this case?





Shane Green 
www.umbrellacode.com
805-452-9666 | shane at umbrellacode.com

On Dec 17, 2012, at 11:21 PM, python-ideas-request at python.org wrote:

> Send Python-ideas mailing list submissions to
> 	python-ideas at python.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> 	http://mail.python.org/mailman/listinfo/python-ideas
> or, via email, send a message with subject or body 'help' to
> 	python-ideas-request at python.org
> 
> You can reach the person managing the list at
> 	python-ideas-owner at python.org
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Python-ideas digest..."
> Today's Topics:
> 
>   1. Re: Graph class (Nick Coghlan)
>   2. Re: async: feedback on EventLoop API (Guido van Rossum)
>   3. Re: async: feedback on EventLoop API (Nick Coghlan)
> 
> From: Nick Coghlan <ncoghlan at gmail.com>
> Subject: Re: [Python-ideas] Graph class
> Date: December 17, 2012 7:26:38 PM PST
> To: Hannu Krosing <hannu at krosing.net>
> Cc: Vinay Sajip <vinay_sajip at yahoo.co.uk>, "python-ideas at python.org" <python-ideas at python.org>
> 
> 
> On Mon, Dec 17, 2012 at 9:28 AM, Hannu Krosing <hannu at krosing.net> wrote:
> On 12/16/2012 04:41 PM, Guido van Rossum wrote:
>> I think of graphs and trees as patterns, not data structures.
> 
> How do you draw line between what is data structure and what is pattern ?
> 
> A rough rule of thumb is that if it's harder to remember the configuration options in the API than it is to just write a purpose-specific function, it's probably better as a pattern that can be tweaked for a given use case than it is as an actual data structure.
> 
> More generally, ABCs and magic methods are used to express patterns (like iteration), which may be implemented by various data structures.
> 
> A graph library that focused on defining a good abstraction (and adapters) that allowed graph algorithms to be written that worked with multiple existing Python graph data structures could be quite interesting.
> 
> Cheers,
> Nick.
> 
> -- 
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> 
> 
> 
> From: Guido van Rossum <guido at python.org>
> Subject: Re: [Python-ideas] async: feedback on EventLoop API
> Date: December 17, 2012 8:01:18 PM PST
> To: Nick Coghlan <ncoghlan at gmail.com>, Antoine Pitrou <solipsis at pitrou.net>
> Cc: python-ideas at python.org
> 
> 
> On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> On Tue, Dec 18, 2012 at 10:40 AM, Guido van Rossum <guido at python.org> wrote:
> 
> [A better name for DelayedCall]
>>> Anyway, Handler sounds like a pretty good name. Let me think it over.
> 
>> Is DelayedCall a subclass of Future, like Task? If so, FutureCall might
>> work.
> 
> No, they're completely related. (I'm even thinking of renaming its
> cancel() to avoid the confusion?
> 
> I still like Handler best. In fact, if I'd thought of Handler before,
> I wouldn't have asked for a better name. :-)
> 
> Going once, going twice...
> 
> [Wall-clock timers]
>> If someone really does want a wall-clock timer with a given granularity, it
>> can be handled by adding a repeating timer with that granularity (with the
>> obvious consequences for low power modes).
> 
> +1.
> 
> [Multiple calls per FD]
>>> That makes sense. If we wanted to be fancy we could have several
>>> different APIs: add (must not be set), set (may be set), replace (must
>>> be set). But I think just offering the add and remove APIs is nicely
>>> minimalistic and lets you do everything else with ease. (I'll make the
>>> remove API return True if it did remove something, False otherwise.)
> 
>> Perhaps the best bet would be to have the standard API allow multiple
>> callbacks, and emulate that on systems which don't natively support multiple
>> callbacks for a single event?
> 
> Hm. AFAIK Twisted doesn't support this either. Antoine, do you know? I
> didn't see it in the Tornado event loop either.
> 
>> Otherwise, I don't see how an event loop could efficiently expose access to
>> the multiple callback APIs without requiring awkward fallbacks in the code
>> interacting with the event loop. Given that the natural fallback
>> implementation is reasonably clear (i.e. a single callback that calls all of
>> the other callbacks), why force reimplementing that on users rather than
>> event loop authors?
> 
> But what's the use case?
> 
> I don't think our goal should be to offer APIs for any feature that
> any event loop might offer. It's not quite a least-common denominator
> either though -- it's about offering commonly needed functionality,
> and interoperability.
> 
> Also, event loop implementations are allowed to offer additional APIs
> on their implementation. If the need for multiple handlers per FD only
> exists on those platforms where the platform's event loop supports it,
> no harm is done if the functionality is only available through a
> platform-specific API.
> 
> But still, I don't understand the use case. Possibly it is using file
> descriptors as a more general signaling mechanism? That sounds pretty
> platform specific anyway (on Windows, FDs must represent sockets).
> 
> If someone shows me a real-world use case I may change my mind.
> 
>> Related, the protocol/transport API design may end up needing to consider
>> the gather/scatter problem (i.e. fanning out data from a single transport to
>> multiple consumers, as well as feeding data from multiple producers into a
>> single underlying transport). Actual *implementations* of such tools
>> shouldn't be needed in the standard suite, but at least understanding how
>> you would go about writing multiplexers and demultiplexers can be a good
>> test of a stacked I/O design.
> 
> Twisted supports this for writing through its writeSequence(), which
> appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told
> me that Twisted rarely uses the platform's scatter/gather primitives,
> because they are so damn hard to use, and the kernel implementation
> often just joins the buffers together before passing it to the regular
> send()...)
> 
> But regardless, I don't think scatter/gather would use multiple
> callbacks per FD.
> 
> I think it would be really hard to benefit from reading into multiple
> buffers in Python.
> 
>>>> Just enabling/disabling these events is a bit more friendly to the
>>>> programmer IMHO than having to cancel and recreate them when needed.
>>> 
>>> The methods on the Transport class take care of this at a higher
>>> level: pause() and resume() to suspend reading, and the write() method
>>> takes care of buffering and so on.
> 
>> And the main advantage of handling that at a higher level is that suitable
>> buffering designs are going to be transport specific.
> 
> +1
> 
> -- 
> --Guido van Rossum (python.org/~guido)
> 
> 
> 
> 
> From: Nick Coghlan <ncoghlan at gmail.com>
> Subject: Re: [Python-ideas] async: feedback on EventLoop API
> Date: December 17, 2012 11:21:37 PM PST
> To: Guido van Rossum <guido at python.org>
> Cc: Antoine Pitrou <solipsis at pitrou.net>, python-ideas at python.org
> 
> 
> On Tue, Dec 18, 2012 at 2:01 PM, Guido van Rossum <guido at python.org> wrote:
> On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Also, event loop implementations are allowed to offer additional APIs
> on their implementation. If the need for multiple handlers per FD only
> exists on those platforms where the platform's event loop supports it,
> no harm is done if the functionality is only available through a
> platform-specific API.
> 
> Sure, but since we know this capability is offered by multiple event loops, it would be good if there was a defined way to go about exposing it.
>  
> But still, I don't understand the use case. Possibly it is using file
> descriptors as a more general signaling mechanism? That sounds pretty
> platform specific anyway (on Windows, FDs must represent sockets).
> 
> If someone shows me a real-world use case I may change my mind.
> 
> The most likely use case that comes to mind is monitoring and debugging (i.e. the event loop equivalent of a sys.settrace). Being able to tap into a datastream (e.g. to dump it to a console or pipe it to a monitoring process) can be really powerful, and being able to do it at the Python level means you have this kind of capability even without root access to the machine to run Wireshark.
> 
> There are other more obscure signal analysis use cases that occur to me, but those could readily be handled with a custom transport implementation that duplicated that data stream, so I don't think there's any reason to worry about those.
> 
> > Related, the protocol/transport API design may end up needing to consider
> > the gather/scatter problem (i.e. fanning out data from a single transport to
> > multiple consumers, as well as feeding data from multiple producers into a
> > single underlying transport). Actual *implementations* of such tools
> > shouldn't be needed in the standard suite, but at least understanding how
> > you would go about writing multiplexers and demultiplexers can be a good
> > test of a stacked I/O design.
> 
> Twisted supports this for writing through its writeSequence(), which
> appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told
> me that Twisted rarely uses the platform's scatter/gather primitives,
> because they are so damn hard to use, and the kernel implementation
> often just joins the buffers together before passing it to the regular
> send()...)
> 
> But regardless, I don't think scatter/gather would use multiple
> callbacks per FD.
> 
> I think it would be really hard to benefit from reading into multiple
> buffers in Python.
>  
> Sorry, I wasn't quite clear on what I meant by gather/scatter and it's more a protocol thing than an event loop thing.
> 
> Specifically, gather/scatter interfaces are most useful for multiplexed transports. The ones I'm particularly familiar with are traditional telephony transports like E1 links, with 15 time-division-multiplexed channels on the wire (and a signalling timeslot), as well a few different HF comms protocols. When reading from one of those, you have a demultiplexing component which is reading the serial data coming in on the wire and making it look like 15 distinct data channels from the application's point of view. Similarly, the output multiplexer takes 15 streams of data from the application and interleaves them into the single stream on the wire.
> 
> The rise of packet switching means that sharing connections like that is increasingly less common, though, so gather/scatter devices are correspondingly less useful in a networking context. The only modern use cases I can think of that someone might want to handle with Python are things like sharing a single USB or classic serial connection amongst multiple data streams. However, I suspect the standard transport and protocol API definitions already proposed should also suffice for the gather/scatter use case, as such a component would largely work like any other protocol-as-transport adapter, with the difference being that there would be a many-to-one relationship between the number of interfaces on the application side and those on the communications side.
> 
> (Technically, gather/scatter components can also be used the other way around to distribute a single data stream across multi transports, but that use case is even less likely to come up when programming in Python. Multi-channel HF data comms is the only possibility that really comes to mind)
> 
> -- 
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> 
> 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/6a340f02/attachment.html>

From amauryfa at gmail.com  Tue Dec 18 11:54:40 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 18 Dec 2012 11:54:40 +0100
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <20121218110136.1f85cfae@pitrou.net>
References: <20121218110136.1f85cfae@pitrou.net>
Message-ID: <CAGmFidaLyOLpGEXd8Fc5nMOkYT=c2W-bXc8M5vYvkOhNQ25RMg@mail.gmail.com>

2012/12/18 Antoine Pitrou <solipsis at pitrou.net>

> My own opinion about Twisted's API is that the Factory class is often
> useless, and adds a cognitive burden. If you need a place to track all
> protocols of a given kind (e.g. all connections), you can do it
> yourself. Also, the Factory implies that you don't control how exactly
> your protocol gets instantiated (unless you override some method on the
> Factory I'm missing the name of: it is cumbersome).
>
> So, when creating a client, I would pass it a protocol instance.
>

Factories are useful for implementing clients that reconnect automatically:
the framework needs to spawn a new protocol object for each attempt.
The connect method could take a protocol class,
but how would you implement the reconnect strategy?
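As a sketch of that point (all names here are illustrative, not the PEP 3156
API): a reconnecting client needs a callable it can invoke once per attempt,
because a single protocol instance cannot safely be handed to two connections.

```python
# Illustrative sketch only: why automatic reconnection wants a factory.
# Class and method names are hypothetical, not the PEP 3156 API.

class EchoProtocol:
    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport


class ReconnectingClient:
    """Spawns a fresh protocol object for every (re)connection attempt."""

    def __init__(self, protocol_factory):
        # Any zero-argument callable works: a class, a partial, a lambda.
        self.protocol_factory = protocol_factory
        self.generation = 0

    def connect(self, transport):
        # Each attempt (initial or after a drop) gets a *new* protocol
        # object; a bare pre-built instance could not express this.
        self.generation += 1
        proto = self.protocol_factory()
        proto.connection_made(transport)
        return proto


client = ReconnectingClient(EchoProtocol)
p1 = client.connect("fake-transport-1")
p2 = client.connect("fake-transport-2")  # simulated reconnect
```

With a protocol class (or any callable) in hand, the reconnect strategy is
just a loop around `connect()`; with a single instance it is not obvious
what state the second attempt should start from.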

When creating a server, I would pass it a protocol class. Here the base
> Protocol class comes into play, its __init__() could take the transport
> as argument and set the "transport" attribute with it. Further args
> could be optionally passed to the constructor:
>
> class MyProtocol(Protocol):
>     def __init__(self, transport, my_personal_attribute):
>         Protocol.__init__(self, transport)
>         self.my_personal_attribute = my_personal_attribute
>
>     ...
>
> def listen(ioloop):
>     # Each new connection will instantiate a MyProtocol with "foobar"
>     # for my_personal_attribute.
>     ioloop.listen_tcp(("0.0.0.0", 8080), MyProtocol, "foobar")
>

This is indeed very similar to a factory function (a callback that creates
the protocol). Anything with a __call__ would be acceptable, IMO.
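For instance (a sketch; `listen_tcp` above is hypothetical, so this only
shows the callable side), extra constructor arguments can be bound up front
instead of being threaded through the event loop API:

```python
# Sketch: any callable can serve as the protocol factory, so extra
# constructor arguments are bound with functools.partial or a lambda.
# MyProtocol mirrors Antoine's example; the transport here is a stub.
import functools

class MyProtocol:
    def __init__(self, transport, my_personal_attribute):
        self.transport = transport
        self.my_personal_attribute = my_personal_attribute

# Two equivalent "factories" the loop could call with the transport:
factories = [
    functools.partial(MyProtocol, my_personal_attribute="foobar"),
    lambda transport: MyProtocol(transport, "foobar"),
]

protos = [make("fake-transport") for make in factories]
```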

(The hypothetical listen_tcp() is just a name: perhaps it's actually
> start_serving(). It should accept any callable, not just a class:
> therefore, you can define complex behaviour if you like)
>
>
> I think the transport / protocol registration must be done early, not in
> connection_made(). Sometimes you will want to do things on a protocol
> before you know a connection is established, for example queue things
> to write on the transport. An use case is a reconnecting TCP client:
> the protocol will continue existing at times when the connection is
> down.
>

We should be clear on what a protocol is. In my mind, a protocol manages
the events on a given transport; it will also probably buffer data.
For example, data for the HTTP protocol always starts with "GET ...
HTTP/1.0\r\n".
If a protocol can change transports in the middle, it can be difficult to
track which socket you write to or receive from, and to manage your
buffers correctly.

An alternative could be a "reset()" method, but then we are not far from a
factory class.


> * connection_lost(): you definitely want to know whether it's you or the
>   other end who closed the connection. Typically, if the other end
>   closed the connection, you will have to run some cleanup steps, and
>   perhaps even log an error somewhere (if the connection was closed
>   unexpectedly).
>   Actually, I'm not sure it's useful to call connection_lost() when you
>   closed the connection yourself: are there any use cases?
>

The "yourself" can be in another part of the code; some protocols will
certainly close the connection when they receive unexpected data.
Also, this example from Twisted documentation:
    attempt = myEndpoint.connect(myFactory)
    reactor.callLater(30, attempt.cancel)
Even if these lines appear in my code, it's easier to have all errors caught
in one place. The alternative would be:
    attempt = myEndpoint.connect(myFactory)
    def cancel_attempt_and_notify_error():
        attempt.cancel()
        notify_error("cancelled after timeout")
    reactor.callLater(30, cancel_attempt_and_notify_error)


-- 
Amaury Forgeot d'Arc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/1e2fc687/attachment.html>

From solipsis at pitrou.net  Tue Dec 18 12:27:30 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Dec 2012 12:27:30 +0100
Subject: [Python-ideas] PEP 3156 feedback
References: <20121218110136.1f85cfae@pitrou.net>
	<CAGmFidaLyOLpGEXd8Fc5nMOkYT=c2W-bXc8M5vYvkOhNQ25RMg@mail.gmail.com>
Message-ID: <20121218122730.4b230781@pitrou.net>

Le Tue, 18 Dec 2012 11:54:40 +0100,
"Amaury Forgeot d'Arc"
<amauryfa at gmail.com> a écrit :
> 2012/12/18 Antoine Pitrou
> <solipsis at pitrou.net>
> 
> > My own opinion about Twisted's API is that the Factory class is
> > often useless, and adds a cognitive burden. If you need a place to
> > track all protocols of a given kind (e.g. all connections), you can
> > do it yourself. Also, the Factory implies that you don't control
> > how exactly your protocol gets instantiated (unless you override
> > some method on the Factory I'm missing the name of: it is
> > cumbersome).
> >
> > So, when creating a client, I would pass it a protocol instance.
> >
> 
> Factories are useful to implement clients that reconnect
> automatically: the framework needs to spawn a new protocol object.
> The connect method could take a protocol class,
> but how would you implement the reconnect strategy?

I view it differently: the *same* protocol *instance* should be re-used
for the new connection. That's because the protocol can keep data that
lasts longer than a single connection (many protocols have session ids
or other state that can persist across connections: this is typical
of RPC APIs affecting the state of always-running equipment).
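Sketched out (method names follow the PEP 3156 draft; everything else is
illustrative), the point is that session state lives on the instance and
survives the drop:

```python
# Sketch: one protocol *instance* reused across connections, so session
# state (here a session id) outlives any single connection.

class SessionProtocol:
    def __init__(self):
        self.session_id = None
        self.connections_seen = 0

    def connection_made(self, transport):
        self.connections_seen += 1
        if self.session_id is None:
            self.session_id = 42     # pretend we negotiated one
        # else: resume the existing session over the new connection

    def connection_lost(self, exc):
        pass                         # session_id is deliberately kept


proto = SessionProtocol()
proto.connection_made("conn-1")
proto.connection_lost(None)
proto.connection_made("conn-2")      # reconnect: same session resumes
```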

> We should be clear on what a protocol is. In my mind, a protocol
> manages the events on a given transport; it will also probably buffer
> data. For example, data for the HTTP protocol always starts with
> "GET ... HTTP/1.0\r\n".
> If a protocol can change transports in the middle, it can be
> difficult to track
> which socket you write to or receive from, and manage your buffers
> correctly.
> 
> An alternative could be a "reset()" method, but then we are not far
> from a factory class.

Well, the problem when switching transports is that you want to:
- wait for all outgoing data to be flushed
- migrate all pending incoming data to the new transport

IMO, this begs for a solution on the transport side, not on the client
side (some kind of migrate() API on the transport?). In other words,
you switch transports, but you keep the same protocol instance: when
your FTP protocol switches from plain TCP to TLS, it remembers the
current directory, etc.

> Also, this example from Twisted documentation:
>     attempt = myEndpoint.connect(myFactory)
>     reactor.callLater(30, attempt.cancel)
> Even if these lines appear in my code, it's easier to have all errors
> caught in one place.

Ah, I think there's a misunderstanding.  Protocol.connection_lost()
should be called when an *established* connection is lost.

Indeed, there should be a separate Protocol.connection_failed() method
for when the connect() call never succeeds (it either times out or fails
with an error). And this is a reason why it is better for the transport
to be registered early on the protocol (or vice-versa) :-)
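A toy sketch of that split (connection_failed() is the proposal being made
here, not an adopted PEP 3156 method; the driver lines at the bottom just
simulate the event loop):

```python
# Sketch: connection_failed() for a connect() that never succeeds,
# connection_lost() only for an *established* connection.

class Protocol:
    def connection_made(self, transport): pass
    def connection_lost(self, exc): pass
    def connection_failed(self, exc): pass   # proposed addition

class LoggingProtocol(Protocol):
    def __init__(self):
        self.events = []

    def connection_made(self, transport):
        self.events.append("made")

    def connection_lost(self, exc):
        # exc is None for a local close, an exception for a peer drop
        self.events.append("lost" if exc is None else "lost-unexpectedly")

    def connection_failed(self, exc):
        self.events.append("failed")         # never got established


p = LoggingProtocol()
p.connection_failed(OSError("timed out"))    # connect() attempt failed
p.connection_made("transport")
p.connection_lost(None)                      # clean local close
```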

Regards

Antoine.




From oscar.j.benjamin at gmail.com  Tue Dec 18 13:08:50 2012
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Tue, 18 Dec 2012 12:08:50 +0000
Subject: [Python-ideas] Graph class
In-Reply-To: <kapbn3$661$1@ger.gmane.org>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
	<CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
	<50CE5904.9090102@krosing.net>
	<CADiSq7dU6=wHj8+4fyhRvXw55MZpT=v9TBJw+G_L=eUwF_Q1FQ@mail.gmail.com>
	<kapbn3$661$1@ger.gmane.org>
Message-ID: <CAHVvXxSZxQDHqOcTwi=BNsmSqnD5HHHRrG7QU_cRDoQkKa-NPQ@mail.gmail.com>

On 18 December 2012 09:06, Terry Reedy <tjreedy at udel.edu> wrote:
> On 12/17/2012 10:26 PM, Nick Coghlan wrote:
>> A graph library that focused on defining a good abstraction (and
>> adapters) that allowed graph algorithms to be written that worked with
>> multiple existing Python graph data structures could be quite interesting.
>
>
> I was just thinking that what is needed, at least as a first step, is a
> graph api, like the db api, that would allow the writing of algorithms to
> one api and adapters to various implementations. I expect to be writing some
> graph algorithms (in Python) in the next year and will try to keep that idea
> in mind and see if it makes any sense, versus just whipping up an
> implementation that fits the particular problem.

I'd be interested to use (and possibly to contribute to) a graph
library of this type on PyPI. I have some suggestions about the
appropriate level of abstraction below.

The graph algorithms that are the most useful can be written in terms
of two things:
1) An iterator over the nodes
2) A way to map each node into an iterator over its children (or partners)

It is also required to place some restriction on how the nodes can be
used. Descriptions of graph algorithms refer to marking/colouring the
nodes of a graph. If the nodes are instances of user defined classes,
then you can do this in a relatively literal sense by adding
attributes to the nodes, but this is fragile in the event of errors,
not thread-safe, etc.

Really though, the idea of marking the nodes just means that you need
an O(1) method for determining if a node has been marked or checking
the value that it was marked with. In Python this is easily done with
sets and dicts, which is not fragile and is thread-safe etc. (provided
the graph is not being mutated). This requires that the nodes be
hashable.

In the thread about deleting keys from a dict yesterday it occurred to
me (after MRAB's suggestion) that you can still apply the same methods
to non-hashable objects. That is, provided you have a situation where
node equality is determined by node identity you can just use id(node)
in each hash table. While this method works equally well for
user-defined class instances it does not work for immutable types
where, for example, two strings may be equal but have differing id()s.

One way to cover all cases is simply to provide a hashkey argument to
each algorithm that defaults to the identity function (lambda x: x),
but may be replaced by the id function in appropriate cases. This
means that all of the graph algorithms that I would want can be
implemented with a basic signature that goes like:

def strongly_connected(nodes, edgesfunc, hashkey=None):
    '''
    `nodes` is an iterable over the nodes of the graph.
    `edgesfunc(node)` is an iterable over the children of node.
    `hashkey` is an optional key function to apply when adding
    nodes to a hash-table. For mutable objects where identity
    is equality use `hashkey=id`.
    '''
    if hashkey is None:
        # Would be great to have operator.identity here
        hashkey = lambda x: x
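A complete (if much simpler) algorithm written against that same signature
might look like this sketch; only the node-to-children function and the
optional hashkey are needed, and the "marking" lives in a local set:

```python
# Sketch: reachability written against the proposed signature style.
# edgesfunc(node) yields the children of node; hashkey maps nodes to
# hashable marks (use hashkey=id for unhashable identity-equal nodes).

def reachable(start, edgesfunc, hashkey=None):
    """Return the set of hashkey(node) values reachable from `start`."""
    if hashkey is None:
        hashkey = lambda x: x
    seen = {hashkey(start)}
    stack = [start]
    while stack:
        node = stack.pop()
        for child in edgesfunc(node):
            key = hashkey(child)
            if key not in seen:          # O(1) "is it marked?" check
                seen.add(key)
                stack.append(child)
    return seen


# Adjacency-dict graph with a cycle; nodes are hashable ints, so the
# default identity hashkey is fine.
graph = {1: [2, 3], 2: [3], 3: [1], 4: []}
result = reachable(1, graph.__getitem__)
```

Note that the graph itself appears only through `graph.__getitem__`; any
data structure that can map a node to its children plugs in the same way.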

There are some cases where optimisation is possible given additional
information. One example: I think it is possible to conclude that an
undirected graph contains at least one cycle if |E|>=|V|, so in this
case an optional hint parameter could give a shortcut for some graphs.
Generally, though, there are few algorithms where other quantities are
either required or are sufficient for all input graphs (exceptions to
the sufficient part of this rule are typically the relatively easy
algorithms like determining the mean degree).

Once you have algorithms that are implemented in this way it becomes
possible to piece them together as a concrete graph class, a mixin, an
ABC, a decorator that works like @functools.total_ordering or some
other class-based idiom. Crucially, though, unlike all of these class
based approaches, defining the algorithms firstly in a functional way
makes it easy to apply them to any data structure composed of
elementary types or of classes that you yourself cannot write or
subclass.


Oscar


From shane at umbrellacode.com  Tue Dec 18 13:37:58 2012
From: shane at umbrellacode.com (Shane Green)
Date: Tue, 18 Dec 2012 04:37:58 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <84FBCF87-EB2E-4887-9184-C5CE3B074ABA@umbrellacode.com>
References: <mailman.4948.1355815301.29568.python-ideas@python.org>
	<84FBCF87-EB2E-4887-9184-C5CE3B074ABA@umbrellacode.com>
Message-ID: <5E168314-653B-4A4E-AFD1-D438EAADEE39@umbrellacode.com>

Sorry for the utter lack of formatting etiquette in my previous responses, everyone. My message (the next sentence) was in response to the message below. Sorry for the confusion.

	Sending the demultiplexed data through 15 pipes, so that the application actually deals with 15 streams of data via single-callback notifications from the event loop, seems like the more KIS (keep it simple) approach in this case.
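Roughly like this sketch (io.BytesIO stands in for real OS pipes to keep it
portable, and the byte-per-timeslot framing is a stand-in for real E1
framing; all names are invented):

```python
# Sketch: a demultiplexer fans one interleaved stream out into
# per-channel "pipes", so the application sees 15 independent streams,
# each with its own ordinary single read callback downstream.
import io

N_CHANNELS = 15

class Demultiplexer:
    def __init__(self, n_channels):
        self.pipes = [io.BytesIO() for _ in range(n_channels)]

    def data_received(self, frame):
        # Toy framing assumption: byte i of the frame belongs to
        # channel i % n_channels (one byte per timeslot).
        for i, byte in enumerate(frame):
            self.pipes[i % len(self.pipes)].write(bytes([byte]))


demux = Demultiplexer(N_CHANNELS)
demux.data_received(bytes(range(30)))   # two full 15-timeslot frames
```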



> From: Nick Coghlan <ncoghlan at gmail.com>
> Subject: Re: [Python-ideas] async: feedback on EventLoop API
> Date: December 17, 2012 11:21:37 PM PST
> To: Guido van Rossum <guido at python.org>
> Cc: Antoine Pitrou <solipsis at pitrou.net>, python-ideas at python.org
> 
> 
> 
> 
> 
> 
>> Sorry, I wasn't quite clear on what I meant by gather/scatter and it's more a protocol thing than an event loop thing.
>> 
>> Specifically, gather/scatter interfaces are most useful for multiplexed transports. The ones I'm particularly familiar with are traditional telephony transports like E1 links, with 15 time-division-multiplexed channels on the wire (and a signalling timeslot), as well a few different HF comms protocols. When reading from one of those, you have a demultiplexing component which is reading the serial data coming in on the wire and making it look like 15 distinct data channels from the application's point of view. Similarly, the output multiplexer takes 15 streams of data from the application and interleaves them into the single stream on the wire.
>> 
>> The rise of packet switching means that sharing connections like that is increasingly less common, though, so gather/scatter devices are correspondingly less useful in a networking context. The only modern use cases I can think of that someone might want to handle with Python are things like sharing a single USB or classic serial connection amongst multiple data streams. However, I suspect the standard transport and protocol API definitions already proposed should also suffice for the gather/scatter use case, as such a component would largely work like any other protocol-as-transport adapter, with the difference being that there would be a many-to-one relationship between the number of interfaces on the application side and those on the communications side.
>> 
>> (Technically, gather/scatter components can also be used the other way around to distribute a single data stream across multi transports, but that use case is even less likely to come up when programming in Python. Multi-channel HF data comms is the only possibility that really comes to mind)
>> 
>> -- 
>> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>> 
>> 
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/5086f164/attachment.html>

From ubershmekel at gmail.com  Tue Dec 18 15:24:01 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Tue, 18 Dec 2012 16:24:01 +0200
Subject: [Python-ideas] Graph class
In-Reply-To: <CAHVvXxSZxQDHqOcTwi=BNsmSqnD5HHHRrG7QU_cRDoQkKa-NPQ@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
	<CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
	<50CE5904.9090102@krosing.net>
	<CADiSq7dU6=wHj8+4fyhRvXw55MZpT=v9TBJw+G_L=eUwF_Q1FQ@mail.gmail.com>
	<kapbn3$661$1@ger.gmane.org>
	<CAHVvXxSZxQDHqOcTwi=BNsmSqnD5HHHRrG7QU_cRDoQkKa-NPQ@mail.gmail.com>
Message-ID: <CANSw7KzUdPDdoQn_JqwhtHdASDfzyOvEb4=UwfXvgWX5w9qkoA@mail.gmail.com>

On Tue, Dec 18, 2012 at 2:08 PM, Oscar Benjamin
<oscar.j.benjamin at gmail.com>wrote:

> On 18 December 2012 09:06, Terry Reedy <tjreedy at udel.edu> wrote:
> > On 12/17/2012 10:26 PM, Nick Coghlan wrote:
> >> A graph library that focused on defining a good abstraction (and
> >> adapters) that allowed graph algorithms to be written that worked with
> >> multiple existing Python graph data structures could be quite
> interesting.
> >
> >
> > I was just thinking that what is needed, at least as a first step, is a
> > graph api, like the db api, that would allow the writing of algorithms to
> > one api and adapters to various implementations. I expect to be writing
> some
> > graph algorithms (in Python) in the next year and will try to keep that
> idea
> > in mind and see if it makes any sense, versus just whipping up a
> > implementation that fits the particular problem.
>
> I'd be interested to use (and possibly to contribute to) a graph
> library of this type on PyPI. I have some suggestions about the
> appropriate level of abstraction below.
>
> The graph algorithms that are the most useful can be written in terms
> of two things:
> 1) An iterator over the nodes
> 2) A way to map each node into an iterator over its children (or partners)
>
>

Some graphs don't care about the nodes; all their information is in the
edges. That's why most graph frameworks have iter_edges and iter_nodes
functions. I'm not sure what the clean way to represent the optional
directionality of edges is, though.

Some example API's from networkx:

http://networkx.lanl.gov/reference/classes.html
http://networkx.lanl.gov/reference/classes.digraph.html
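One possible shape for that interface (a sketch, not networkx's actual API:
the class and the per-instance `directed` flag are invented here):

```python
# Sketch: a minimal adapter giving a plain dict-of-sets graph the
# iter_nodes()/iter_edges() interface, with directionality carried as
# a per-graph flag rather than encoded in the edge tuples themselves.

class DictGraph:
    def __init__(self, adjacency, directed=True):
        self.adjacency = adjacency
        self.directed = directed

    def iter_nodes(self):
        return iter(self.adjacency)

    def iter_edges(self):
        seen = set()
        for u, targets in self.adjacency.items():
            for v in targets:
                if not self.directed and (v, u) in seen:
                    continue             # report undirected edges once
                seen.add((u, v))
                yield (u, v)


g = DictGraph({"a": {"b"}, "b": {"a", "c"}, "c": set()}, directed=False)
edges = sorted(g.iter_edges())
```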
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121218/693d07ff/attachment.html>

From tjreedy at udel.edu  Tue Dec 18 17:06:29 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 18 Dec 2012 11:06:29 -0500
Subject: [Python-ideas] Graph class
In-Reply-To: <CANSw7KzUdPDdoQn_JqwhtHdASDfzyOvEb4=UwfXvgWX5w9qkoA@mail.gmail.com>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
	<CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
	<50CE5904.9090102@krosing.net>
	<CADiSq7dU6=wHj8+4fyhRvXw55MZpT=v9TBJw+G_L=eUwF_Q1FQ@mail.gmail.com>
	<kapbn3$661$1@ger.gmane.org>
	<CAHVvXxSZxQDHqOcTwi=BNsmSqnD5HHHRrG7QU_cRDoQkKa-NPQ@mail.gmail.com>
	<CANSw7KzUdPDdoQn_JqwhtHdASDfzyOvEb4=UwfXvgWX5w9qkoA@mail.gmail.com>
Message-ID: <kaq4a7$ams$1@ger.gmane.org>

On 12/18/2012 9:24 AM, Yuval Greenfield wrote:
>
> On Tue, Dec 18, 2012 at 2:08 PM, Oscar Benjamin
> <oscar.j.benjamin at gmail.com
> <mailto:oscar.j.benjamin at gmail.com>> wrote:

>     I'd be interested to use (and possibly to contribute to) a graph
>     library of this type on PyPI. I have some suggestions about the
>     appropriate level of abstraction below.
>
>     The graph algorithms that are the most useful can be written in terms
>     of two things:
>     1) An iterator over the nodes

Or iterable if re-iteration is needed.

>     2) A way to map each node into an iterator over its children (or
>     partners)

A callable could be either an iterator class or a generator function.

> Some graphs don't care for the nodes, all their information is in the
> edges. That's why most graph frameworks have iter_edges and iter_nodes
> functions. I'm not sure what's the clean way to represent the
> optional directionality of edges though.
>
> Some example API's from networkx:
>
> http://networkx.lanl.gov/reference/classes.html
> http://networkx.lanl.gov/reference/classes.digraph.html

Thank you both for the 'thought food'. Defining things in terms of 
iterables and iterators instead of (for instance) sets is certainly the 
Python3 way.

Oscar, I don't consider hashability an issue. General class instances 
are hashable by default. One can even consider such instances as 
hashable facades for unhashable dicts. Giving each instance a list 
attribute does the same for lists.

The more important question, it seems to me, is whether to represent 
nodes by counts and let the algorithm do its bookkeeping in private 
structures, or to represent them by externally defined instances that 
the algorithm mutates.

-- 
Terry Jan Reedy



From guido at python.org  Tue Dec 18 18:03:07 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 09:03:07 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CADiSq7f2RrZ1mpnJwunszdoaW7usKAtkwRCkT5PziZEZGoqgdw@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<CADbA=FXM0JEirL_DHsO_fh_-ffU_BVON2d903mAc58QaiOa-fg@mail.gmail.com>
	<CAP7+vJKEyFy_XS4iy2fsbHHfso2=eFaN9cevkjjkEwbQThUSiQ@mail.gmail.com>
	<CADiSq7c1c+nfD6SJFFY_qS5PhHgb8O6dYwdhvLyYwr=QnPbraA@mail.gmail.com>
	<CAP7+vJJOo0kvOiHek59eT-mUBfocw6gP3YpMCR6uhPJR+6LVtA@mail.gmail.com>
	<CADiSq7f2RrZ1mpnJwunszdoaW7usKAtkwRCkT5PziZEZGoqgdw@mail.gmail.com>
Message-ID: <CAP7+vJJQkWcLy8BFh1KgdVYJgqp-zWSUJrZK36K=HUDpA6Ejxg@mail.gmail.com>

On Mon, Dec 17, 2012 at 11:21 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, Dec 18, 2012 at 2:01 PM, Guido van Rossum <guido at python.org> wrote:
>>
>> On Mon, Dec 17, 2012 at 7:20 PM, Nick Coghlan <ncoghlan at gmail.com>
>> wrote:
>>
>> Also, event loop implementations are allowed to offer additional APIs
>> on their implementation. If the need for multiple handlers per FD only
>> exists on those platforms where the platform's event loop supports it,
>> no harm is done if the functionality is only available through a
>> platform-specific API.
>
>
> Sure, but since we know this capability is offered by multiple event loops,
> it would be good if there was a defined way to go about exposing it.

Only if there's a use case.

>> But still, I don't understand the use case. Possibly it is using file
>> descriptors as a more general signaling mechanism? That sounds pretty
>> platform specific anyway (on Windows, FDs must represent sockets).
>>
>> If someone shows me a real-world use case I may change my mind.

> The most likely use case that comes to mind is monitoring and debugging
> (i.e. the event loop equivalent of a sys.settrace). Being able to tap into a
> datastream (e.g. to dump it to a console or pipe it to a monitoring process)
> can be really powerful, and being able to do it at the Python level means
> you have this kind of capability even without root access to the machine to
> run Wireshark.

I can't see how that would work. Once one callback reads the data the
other callback won't see it. There's also the issue of ordering.

Solving this seems easier by implementing a facade for the event loop
that wraps certain callbacks, and installing it using a custom event
loop policy.
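A toy sketch of that facade idea (ToyLoop stands in for a real event loop;
only the `add_reader` name follows the PEP 3156 draft, the rest is
invented for illustration):

```python
# Sketch: a monitoring facade wraps each reader callback before handing
# it to the underlying loop, so there is still exactly one real
# callback per FD while every dispatch gets traced.

class ToyLoop:
    def __init__(self):
        self.readers = {}

    def add_reader(self, fd, callback):
        self.readers[fd] = callback

    def fire(self, fd):                  # stand-in for the poll step
        self.readers[fd]()


class TraceLoop:
    """Facade: delegates to a real loop, logging each dispatch."""

    def __init__(self, loop, log):
        self._loop = loop
        self._log = log

    def add_reader(self, fd, callback):
        def traced():
            self._log.append(fd)         # monitoring hook fires first
            callback()                   # then the single real callback
        self._loop.add_reader(fd, traced)


log, hits = [], []
loop = TraceLoop(ToyLoop(), log)
loop.add_reader(3, lambda: hits.append("read fd 3"))
loop._loop.fire(3)                       # simulate FD 3 becoming readable
```

Installing such a facade through a custom event loop policy would make the
tracing transparent to application code.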

So, I still don't see the use case.

> There are other more obscure signal analysis use cases that occur to me, but
> those could readily be handled with a custom transport implementation that
> duplicated that data stream, so I don't think there's any reason to worry
> about those.

Right, that seems a better way to go about it.

>> Twisted supports this for writing through its writeSequence(), which
>> appears in Tulip and PEP 3156 as writelines(). (Though IIRC Glyph told
>> me that Twisted rarely uses the platform's scatter/gather primitives,
>> because they are so damn hard to use, and the kernel implementation
>> often just joins the buffers together before passing it to the regular
>> send()...)
>>
>> But regardless, I don't think scatter/gather would use multiple
>> callbacks per FD.
>>
>> I think it would be really hard to benefit from reading into multiple
>> buffers in Python.

> Sorry, I wasn't quite clear on what I meant by gather/scatter and it's more
> a protocol thing than an event loop thing.
>
> Specifically, gather/scatter interfaces are most useful for multiplexed
> transports. The ones I'm particularly familiar with are traditional
> telephony transports like E1 links, with 15 time-division-multiplexed
> channels on the wire (and a signalling timeslot), as well as a few different HF
> comms protocols. When reading from one of those, you have a demultiplexing
> component which is reading the serial data coming in on the wire and making
> it look like 15 distinct data channels from the application's point of view.
> Similarly, the output multiplexer takes 15 streams of data from the
> application and interleaves them into the single stream on the wire.
>
> The rise of packet switching means that sharing connections like that is
> increasingly less common, though, so gather/scatter devices are
> correspondingly less useful in a networking context. The only modern use
> cases I can think of that someone might want to handle with Python are
> things like sharing a single USB or classic serial connection amongst
> multiple data streams. However, I suspect the standard transport and
> protocol API definitions already proposed should also suffice for the
> gather/scatter use case, as such a component would largely work like any
> other protocol-as-transport adapter, with the difference being that there
> would be a many-to-one relationship between the number of interfaces on the
> application side and those on the communications side.
>
> (Technically, gather/scatter components can also be used the other way
> around to distribute a single data stream across multiple transports, but that
> use case is even less likely to come up when programming in Python.
> Multi-channel HF data comms is the only possibility that really comes to
> mind)

I'm glad you talked yourself out of that objection. :-)

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Tue Dec 18 17:59:55 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 08:59:55 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CADbA=FUZcfZfNi8OXyYbcRY9X+YTnvwZmUckK_QNUX_kn6h7ig@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
	<20121217231134.19ede507@pitrou.net>
	<CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>
	<CADbA=FUZcfZfNi8OXyYbcRY9X+YTnvwZmUckK_QNUX_kn6h7ig@mail.gmail.com>
Message-ID: <CAP7+vJKKU4WWFmZrsGhH5r0xVKe7_9RKUszJT1Z5tZCx2v3PjA@mail.gmail.com>

On Mon, Dec 17, 2012 at 11:26 PM, Geert Jansen <geertj at gmail.com> wrote:
> I needed a self-pipe on Windows before. See below. With this, the
> select() based loop might work unmodified on Windows.
>
> https://gist.github.com/4325783

Thanks! Before I paste this into Tulip, is there any kind of copyright on this?

> Of course it wouldn't be as efficient as an IOCP based loop.

The socket loop is definitely handy on Windows in a pinch. I have
plans for an IOCP-based loop based on Richard Oudkerk's 'proactor'
branch of Tulip v1, but I don't have a Windows machine to test it on
ATM (hopefully that'll change once I am actually at Dropbox).
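For readers without access to the gist, the usual self-pipe technique on Windows (not necessarily the gist's exact code) bounces a connection through an ephemeral localhost listener, since Unix-domain socketpair() isn't available there:

```python
import socket

def socketpair(family=socket.AF_INET, type=socket.SOCK_STREAM, proto=0):
    """Emulate Unix socketpair() portably: connect two sockets through
    a temporary listening socket on the loopback interface. A sketch;
    production code should also verify the accepted peer's address."""
    lsock = socket.socket(family, type, proto)
    lsock.bind(('127.0.0.1', 0))       # let the OS pick a free port
    lsock.listen(1)
    addr = lsock.getsockname()
    csock = socket.socket(family, type, proto)
    csock.connect(addr)
    ssock, _ = lsock.accept()
    lsock.close()
    return ssock, csock
```

Either end of the pair can then serve as the self-pipe that wakes up a select()-based loop from another thread.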

-- 
--Guido van Rossum (python.org/~guido)


From geertj at gmail.com  Tue Dec 18 18:10:13 2012
From: geertj at gmail.com (Geert Jansen)
Date: Tue, 18 Dec 2012 18:10:13 +0100
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJKKU4WWFmZrsGhH5r0xVKe7_9RKUszJT1Z5tZCx2v3PjA@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
	<20121217231134.19ede507@pitrou.net>
	<CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>
	<CADbA=FUZcfZfNi8OXyYbcRY9X+YTnvwZmUckK_QNUX_kn6h7ig@mail.gmail.com>
	<CAP7+vJKKU4WWFmZrsGhH5r0xVKe7_9RKUszJT1Z5tZCx2v3PjA@mail.gmail.com>
Message-ID: <CADbA=FX1g0-6yQxVF-zHjQCRgJUwXuqrTJqOyX7c7aRfES5cuA@mail.gmail.com>

On Tue, Dec 18, 2012 at 5:59 PM, Guido van Rossum <guido at python.org> wrote:
> On Mon, Dec 17, 2012 at 11:26 PM, Geert Jansen <geertj at gmail.com> wrote:
>> I needed a self-pipe on Windows before. See below. With this, the
>> select() based loop might work unmodified on Windows.
>>
>> https://gist.github.com/4325783
>
> Thanks! Before I paste this into Tulip, is there any kind of copyright on this?

[include list]

I wrote the code. I hereby put it in the public domain.

Regards,
Geert


From guido at python.org  Tue Dec 18 19:02:05 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 10:02:05 -0800
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <20121218110136.1f85cfae@pitrou.net>
References: <20121218110136.1f85cfae@pitrou.net>
Message-ID: <CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>

On Tue, Dec 18, 2012 at 2:01 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>
> Here is my own feedback on the in-progress PEP 3156. Please discard it
> if it's too early to give feedback :-))

Thank you, it's very much to the point.

> Event loop API
> --------------
>
> I would like to say that I prefer Tornado's model: for each primitive
> provided by Tornado, you can pass an explicit Loop instance which you
> instantiated manually.
> There is no module function or policy object hiding this mechanism:
> it's simple, explicit and flexible (in other words: if you want a
> per-thread event loop, just do it yourself using TLS :-)).

It sounds though as if the explicit loop is optional, and still
defaults to some global default loop?
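The per-thread pattern Antoine alludes to is indeed only a few lines with threading.local (class and method names below are illustrative, not from the PEP):

```python
import threading

class PerThreadLoopPolicy:
    """Illustrative sketch: each thread lazily gets its own loop
    instance from the supplied factory."""

    def __init__(self, loop_factory):
        self._local = threading.local()
        self._loop_factory = loop_factory

    def get_event_loop(self):
        # threading.local attributes are per-thread, so each thread
        # creates and caches its own loop on first access.
        loop = getattr(self._local, 'loop', None)
        if loop is None:
            loop = self._local.loop = self._loop_factory()
        return loop
```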

Having one global loop shared by multiple threads is iffy though. Only
one thread should be *running* the loop, otherwise the loop can't be
used as a mutual exclusion device. Worse, all primitives for adding
and removing callbacks/handlers must be made threadsafe, and then
basically the entire event loop becomes full of locks, which seems
wrong to me.

PEP 3156 lets the loop implementation choose the policy, which seems
safer than letting the user choose a policy that may or may not be
compatible with the loop's implementation.

Steve Dower keeps telling me that on Windows 8 the loop is built into
the OS. The Windows 8 loop also seems to be eager to use threads, so I
don't know if it can be relied on to serialize callbacks, but there is
probably a way to do that, or else the Python wrapper could add a lock
around callbacks.

> There are some requirements I've found useful:
>
> - being able to instantiate multiple loops, either at the same time or
>   serially (this is especially nice for unit tests; Twisted has to use
>   a dedicated test runner just because their reactor doesn't support
>   multiple instances or restarts)

Serially, for unit tests: definitely. The loop policy has
init_event_loop() for this, which forcibly creates a new loop.

At the same time: that seems to be an esoteric use case and not
favorable to interop with Twisted.

I want the loop to be mostly out of the way of the user, at least for
users using the high-level APIs (tasks, futures, transports,
protocols). In fact, just for this reason it may be better if the
protocol-creating methods had wrapper functions that just called
get_event_loop() and then called the corresponding method on the loop,
so the user code doesn't have to call get_event_loop() at all, ever
(or at least, if you call it, you should feel a slight tinge of guilt
about using a low-level API :-).

> - being able to stop a loop explicitly: having to unregister all
>   handlers or delayed calls is a PITA in non-trivial situations (for
>   example you might have multiple protocol instances, each with a bunch
>   of timers, some perhaps even in third-party libraries; keeping track
>   of all this is the event loop's job)

I've been convinced of that too. I'm just procrastinating on the
implementation at this point.

TBH the details of what you should put in your main program will
probably change a few times before we're done...

> * The optional sock_*() methods: how about having different ABCs, e.g.
>   the EventLoop ABC for basic behaviour, and the NetworkedEventLoop ABC
>   adding the socket helpers?

Hm. That smells of Twisted's tree of interfaces, which I'm honestly
trying to get away from (and Glyph didn't push back on that :-).

I'm actually leaning towards requiring these for all loop
implementations -- surely they can all be emulated using each other.

But I'm not totally wedded to that either. I need more experience
using the stuff first. And Steve Dower says he's not interested in any
of the async I/O stuff (I suppose he means sockets), just in futures
and coroutines. So maybe the socket operations do have to be optional.
In that case, I propose to add inquiry functions that can tell you
whether certain groups of APIs are supported. Though you can probably
get away with hasattr(loop, 'sock_recv') and so on.
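That duck-typed inquiry is a one-liner per API group (the particular grouping below is an assumption, not something the PEP defines):

```python
_SOCKET_API = ('sock_recv', 'sock_sendall', 'sock_connect', 'sock_accept')

def supports_socket_api(loop):
    # Capability check by duck typing: the loop offers the optional
    # low-level socket group iff all of its methods are present.
    return all(hasattr(loop, name) for name in _SOCKET_API)
```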

> Protocols and transports
> ------------------------
>
> We probably want to provide a Protocol base class and encourage people
> to inherit it.

Glyph suggested that too, and hinted that it does some useful stuff
that users otherwise forget. I'm a bit worried though that the
functionality of the base implementation becomes the de-facto standard
rather than the PEP. (Glyph mentions that the base class has a method
that sets self.transport and without it lots of other stuff breaks.)

> It can provide useful functionality (perhaps write()
> and writelines() shims? it can make mocking easier).

Those are transport methods though.

> My own opinion about Twisted's API is that the Factory class is often
> useless, and adds a cognitive burden. If you need a place to track all
> protocols of a given kind (e.g. all connections), you can do it
> yourself. Also, the Factory implies that you don't control how exactly
> your protocol gets instantiated (unless you override some method on the
> Factory I'm missing the name of: it is cumbersome).

Yeah, Glyph complains that people laugh at Twisted for using factories. :-)

> So, when creating a client, I would pass it a protocol instance.

Heh. That's how I started, and Glyph told me to pass a protocol
factory. It can just be a Protocol subclass though, as long as the
constructor has the right signature. So maybe we can avoid calling it
protocol_factory and name it protocol_class instead.

I struggled with what to do if the socket cannot be connected and
hence the transport not created. If you've already created the
protocol you're in a bit of trouble at that point. I proposed to call
connection_lost() in that case (without ever having called
connection_made()) but Glyph suggested that would be asking for rare
bugs (the connection_lost() code might not expect a half-initialized
protocol instance). Glyph proposed instead that create_transport()
should return a Future and the error should be that Future's
exception, and I like that much better.
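The convention can be illustrated with concurrent.futures (the real create_transport() would of course return the event loop's own Future type and connect asynchronously; this sketch only shows how a failed connect surfaces):

```python
from concurrent.futures import Future

def create_transport_sketch(protocol_factory, connect):
    """Sketch: a connection error becomes the returned Future's
    exception, so connection_lost() is never invoked on a
    half-initialized protocol. 'connect' stands in for the real
    socket connection step."""
    fut = Future()
    try:
        connect()
    except OSError as exc:
        fut.set_exception(exc)
    else:
        fut.set_result(protocol_factory())
    return fut
```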

> When creating a server, I would pass it a protocol class. Here the base
> Protocol class comes into play, its __init__() could take the transport
> as argument and set the "transport" attribute with it. Further args
> could be optionally passed to the constructor:
>
> class MyProtocol(Protocol):
>     def __init__(self, transport, my_personal_attribute):
>         Protocol.__init__(self, transport)
>         self.my_personal_attribute = my_personal_attribute
>     ...
>
> def listen(ioloop):
>     # Each new connection will instantiate a MyProtocol with "foobar"
>     # for my_personal_attribute.
>     ioloop.listen_tcp(("0.0.0.0", 8080), MyProtocol, "foobar")
>
> (The hypothetical listen_tcp() is just a name: perhaps it's actually
> start_serving(). It should accept any callable, not just a class:
> therefore, you can define complex behaviour if you like)

I agree that it should be a callable, not necessarily a class. I don't
think it should take the transport -- that's what connection_made() is
for. I don't think we should make the API have additional arguments
either; you can use a lambda or functools.partial to pass those in.
(There are too many other arguments to start_serving() to make it
convenient or clear to have a *args, I think, though maybe we could
rearrange the argument order.)
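With a partial, Antoine's extra-argument example needs no change to the loop API at all (MyProtocol mirrors his sketch; the start_serving() call is the PEP's assumed entry point):

```python
import functools

class MyProtocol:
    def __init__(self, my_personal_attribute):
        self.my_personal_attribute = my_personal_attribute

    def connection_made(self, transport):
        self.transport = transport

# The event loop only ever calls protocol_factory() with no arguments,
# once per connection; the partial closes over the per-server args:
protocol_factory = functools.partial(MyProtocol, "foobar")
# e.g. loop.start_serving(protocol_factory, "0.0.0.0", 8080)
```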

> I think the transport / protocol registration must be done early, not in
> connection_made(). Sometimes you will want to do things on a protocol
> before you know a connection is established, for example queue things
> to write on the transport. A use case is a reconnecting TCP client:
> the protocol will continue existing at times when the connection is
> down.

Hm. That seems a pretty advanced use case. I think it is better
handled by passing a "factory function" that returns a pre-created
protocol:

pr = MyProtocol(...)
ev.create_transport(lambda: pr, host, port)

However you do this, such a protocol object must expect multiple
connection_made - connection_lost cycles, which sounds to me like
asking for trouble. So maybe it's better to have a thin protocol class
that is newly instantiated for each reconnection but given a pointer
to a more permanent data structure that carries state between
reconnections.
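A sketch of that split (all names invented for illustration):

```python
class Session:
    """Permanent state that survives reconnections."""
    def __init__(self):
        self.pending = []          # writes queued while disconnected

class ThinProtocol:
    """Created fresh for every (re)connection, so it only ever sees one
    connection_made/connection_lost cycle; durable state lives in the
    shared Session."""
    def __init__(self, session):
        self.session = session
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport
        # Flush anything queued while the connection was down.
        for data in self.session.pending:
            transport.write(data)
        self.session.pending.clear()

    def connection_lost(self, exc):
        self.transport = None
```

Each reconnect would then pass `lambda: ThinProtocol(session)` as the factory.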

> Unconnected protocols need their own base class and API:
> data_received()'s signature should be (data, remote_addr) or
> (remote_addr, data). Same for write().

You mean UDP? Let's put that off until later. But yes, it probably
needs more thought.

> * writelines() sounds ambiguous for datagram protocols: does it send
>   those "lines" as a single datagram, or one separate datagram per
>   "line"? The equivalent code suggests the latter, but which one makes
>   more sense?

It is the transport's choice. Twisted has writeSequence(), which is
just as ambiguous.

> * connection_lost(): you definitely want to know whether it's you or the
>   other end who closed the connection. Typically, if the other end
>   closed the connection, you will have to run some cleanup steps, and
>   perhaps even log an error somewhere (if the connection was closed
>   unexpectedly).

Glyph's idea was to always pass an exception and use special exception
subclasses to distinguish the three cases (clean eof from other end,
self.close(), self.abort()). I resisted this but maybe it's the only
way?
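Glyph's three-way split might look like this (all class names hypothetical):

```python
class ConnectionDone(Exception):
    """Base class for the argument always passed to connection_lost()."""

class CleanEOF(ConnectionDone):
    """The other end closed the connection cleanly."""

class LocallyClosed(ConnectionDone):
    """We called transport.close() ourselves."""

class LocallyAborted(ConnectionDone):
    """We called transport.abort() ourselves."""

def closed_by_peer(exc):
    # A protocol can distinguish the cases with isinstance checks.
    return isinstance(exc, CleanEOF)
```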

>   Actually, I'm not sure it's useful to call connection_lost() when you
>   closed the connection yourself: are there any use cases?

Well, close() first has to finish writing buffered data, so any
cleanup needs to be done asynchronously after that is taken care of.
AFAIK Twisted always calls it, and I think that's the best approach to
ensure cleanup is always taken care of.

-- 
--Guido van Rossum (python.org/~guido)


From oscar.j.benjamin at gmail.com  Tue Dec 18 19:21:42 2012
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Tue, 18 Dec 2012 18:21:42 +0000
Subject: [Python-ideas] Graph class
In-Reply-To: <kaq4a7$ams$1@ger.gmane.org>
References: <CAMjeLr8vWXQKhnVxipc_oa2uQ=fZborvOWLpyiHnOKZzpNbCfA@mail.gmail.com>
	<CAOvn4qgkX4+4B9PmmoRmHC+HG-6s+HHb6BiQfzEo+RTaFVdz+Q@mail.gmail.com>
	<CAMjeLr9wymGhGaOEa=qK-=nK6_A6ZCz+9ivBnz2rwAFh3AmX8A@mail.gmail.com>
	<CACac1F-K7iOEf60uR9x5SwSPEzXqHkbbFw-zmVxguogF=Qy-hg@mail.gmail.com>
	<CAMjeLr9KdF6n5U29BTYt7wmQd1GQv0n-3zs8V+n6UfoEuoy=Jg@mail.gmail.com>
	<CAMjeLr-hmit-GKuicXSXv8xuZASFqWAK6pLGjNvJ_+gMhq0vpQ@mail.gmail.com>
	<87txru1wxr.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMjeLr_2aSj44hsBa1P2J47Xh-DzHVcr0etcmHpKQnXn0yRQMw@mail.gmail.com>
	<loom.20121216T144624-97@post.gmane.org>
	<CAP7+vJL7kLAkqMjkWhaqEeaLVec3Q8HpvRRVmdRikOOgb_k0wg@mail.gmail.com>
	<50CE5904.9090102@krosing.net>
	<CADiSq7dU6=wHj8+4fyhRvXw55MZpT=v9TBJw+G_L=eUwF_Q1FQ@mail.gmail.com>
	<kapbn3$661$1@ger.gmane.org>
	<CAHVvXxSZxQDHqOcTwi=BNsmSqnD5HHHRrG7QU_cRDoQkKa-NPQ@mail.gmail.com>
	<CANSw7KzUdPDdoQn_JqwhtHdASDfzyOvEb4=UwfXvgWX5w9qkoA@mail.gmail.com>
	<kaq4a7$ams$1@ger.gmane.org>
Message-ID: <CAHVvXxQLnXZLZyaeg6HKZN=tkAzd8y4HUviTUfyHmCWMCRbwMQ@mail.gmail.com>

On 18 December 2012 16:06, Terry Reedy <tjreedy at udel.edu> wrote:
> On 12/18/2012 9:24 AM, Yuval Greenfield wrote:
>> On Tue, Dec 18, 2012 at 2:08 PM, Oscar Benjamin
>> <oscar.j.benjamin at gmail.com
>> <mailto:oscar.j.benjamin at gmail.com>> wrote:
>>
>>     The graph algorithms that are the most useful can be written in terms
>>     of two things:
>>     1) An iterator over the nodes
>
>
> Or iterable if re-iteration is needed.

True. Although there aren't many cases where re-iteration is needed.
The main exception would be if you wanted to instantiate a new Graph
as a result of the algorithm. For example a transitive pruning
function could be written to accept a factory like

def transitive_prune(nodes, childfunc, factory):
    return factory(nodes, pruned_edges(nodes, childfunc))

in which case you need to be able to iterate once over the nodes for
the pruning algorithm and once to construct the new graph. In these
cases, the fact that you want to instantiate a new graph suggests that
your original graph was a concrete data structure so that it is
probably okay to require an iterable. To mutate the graph in place the
user would need to supply a function to remove edges:

def transitive_prune(nodes, childfunc, remove_edge):
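Filled out, that in-place variant could read as follows (a sketch assuming a finite acyclic graph with hashable nodes; the body is my own, not from the thread):

```python
def transitive_prune(nodes, childfunc, remove_edge):
    """Remove edge (u, w) whenever w is still reachable from u some
    other way, i.e. compute the transitive reduction in place."""
    for u in list(nodes):
        for w in list(childfunc(u)):
            # Depth-first search from u's *other* children for an
            # indirect path to w.
            stack = [c for c in childfunc(u) if c != w]
            seen = set()
            found = False
            while stack:
                n = stack.pop()
                if n == w:
                    found = True
                    break
                if n in seen:
                    continue
                seen.add(n)
                stack.extend(childfunc(n))
            if found:
                remove_edge(u, w)
```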

>>     2) A way to map each node into an iterator over its children (or
>>     partners)
>
> A callable could be either an iterator class or a generator function.
>
>> Some graphs don't care for the nodes, all their information is in the
>> edges. That's why most graph frameworks have iter_edges and iter_nodes
>> functions.

This is true. Some algorithms would rather have this information.
There are also a few that can proceed just from a particular node
rather than needing an iterable over all nodes.

>> I'm not sure what's the clean way to represent the
>> optional directionality of edges though.

I would have said that each API entry point should state how it will
interpret the edges. Are there algorithms that simultaneously make
sense for directed and undirected graphs while needing to behave
differently in the two cases (in which case is it really the same
algorithm)?

> Oscar, I don't consider hashability an issue. General class instances are
> hashable by default. One can even consider such instances as hashable
> facades for unhashable dicts. Giving each instance a list attribute does the
> same for lists.

True, I've not found hashability to be a problem in practice.

> The more important question, it seems to me, is whether to represent nodes
> by counts and let the algorithm do its bookkeeping in private structures, or
> to represent them by externally defined instances that the algorithm
> mutates.

I don't think I understand: How would the "externally defined instances" work?

Do you mean that the caller of a function would supply functions like
mark(), is_marked(), set_colour(), get_colour() and so on? If that's
the case what would the advantages be? I can think of one: if desired
the algorithm could be made to store all of its computations in say a
database so that it would be very scalable. Though to me that seems
like quite a specialised case that would probably merit
reimplementing the desired algorithm anyway. Otherwise I guess it's a
lot simpler/safer to implement everything in private data structures.


Oscar


From sam-pydeas at rushing.nightmare.com  Tue Dec 18 19:55:27 2012
From: sam-pydeas at rushing.nightmare.com (Sam Rushing)
Date: Tue, 18 Dec 2012 10:55:27 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <50CF8193.4040501@gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF8193.4040501@gmail.com>
Message-ID: <50D0BC1F.30509@rushing.nightmare.com>

On 12/17/12 12:33 PM, Ronan Lamy wrote:
> It seems to me that a DelayedCall is nothing but a frozen, reified
> function call. That it's a reified thing is already obvious from the
> fact that it's an object, so how about naming it just "Call"?
> "Delayed" is actually only one of the possible relations between the
> object and the actual call - it could also represent a cancelled call,
> or a cached one, or ...?
In the functional world, these are called 'thunks'.  I don't know if
that's a more obvious name, but a fun one.

http://en.wikipedia.org/wiki/Thunk_(functional_programming)

-Sam



From solipsis at pitrou.net  Tue Dec 18 20:21:06 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Dec 2012 20:21:06 +0100
Subject: [Python-ideas] PEP 3156 feedback
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
Message-ID: <20121218202106.0ad96d3b@pitrou.net>

On Tue, 18 Dec 2012 10:02:05 -0800
Guido van Rossum <guido at python.org> wrote:
> > Event loop API
> > --------------
> >
> > I would like to say that I prefer Tornado's model: for each primitive
> > provided by Tornado, you can pass an explicit Loop instance which you
> > instantiated manually.
> > There is no module function or policy object hiding this mechanism:
> > it's simple, explicit and flexible (in other words: if you want a
> > per-thread event loop, just do it yourself using TLS :-)).
> 
> It sounds though as if the explicit loop is optional, and still
> defaults to some global default loop?

Yes.

> Having one global loop shared by multiple threads is iffy though. Only
> one thread should be *running* the loop, otherwise the loop can't be
> used as a mutual exclusion device. Worse, all primitives for adding
> and removing callbacks/handlers must be made threadsafe, and then
> basically the entire event loop becomes full of locks, which seems
> wrong to me.

Hmm, I don't think that's implied. Only call_soon_threadsafe() needs to
be thread-safe. Calling other methods from another thread is simply a
programming error. Since Tornado's and Twisted's global event loops
already work like that, I don't think the surprise will be huge for
users.

> > There are some requirements I've found useful:
> >
> > - being able to instantiate multiple loops, either at the same time or
> >   serially (this is especially nice for unit tests; Twisted has to use
> >   a dedicated test runner just because their reactor doesn't support
> >   multiple instances or restarts)
> 
> Serially, for unit tests: definitely. The loop policy has
> init_event_loop() for this, which forcibly creates a new loop.

Ah, nice.

> > Protocols and transports
> > ------------------------
> >
> > We probably want to provide a Protocol base class and encourage people
> > to inherit it.
> 
> Glyph suggested that too, and hinted that it does some useful stuff
> that users otherwise forget. I'm a bit worried though that the
> functionality of the base implementation becomes the de-facto standard
> rather than the PEP. (Glyph mentions that the base class has a method
> that sets self.transport and without it lots of other stuff breaks.)

Well, in the I/O stack we do have base classes with useful method
implementations too (IOBase and friends).

> > So, when creating a client, I would pass it a protocol instance.
> 
> Heh. That's how I started, and Glyph told me to pass a protocol
> factory. It can just be a Protocol subclass though, as long as the
> constructor has the right signature. So maybe we can avoid calling it
> protocol_factory and name it protocol_class instead.
> 
> I struggled with what to do if the socket cannot be connected and
> hence the transport not created. If you've already created the
> protocol you're in a bit of trouble at that point. I proposed to call
> connection_lost() in that case (without ever having called
> connection_made()) but Glyph suggested that would be asking for rare
> bugs (the connection_lost() code might not expect a half-initialized
> protocol instance).

I'm proposing something different: the transport should be created
before the socket is connected, and it should handle the connection
itself (by calling sock_connect() on the loop, perhaps).

Then:
- if connect() succeeds, protocol.connection_made() is called
- if connect() fails, protocol.connection_failed(exc) is called
(not connection_lost())

I think it makes more sense for the transport to do the connecting: why
should the I/O loop know about specific transports? Ideally, it should
only know about socket objects or fds.

I don't know if Twisted had a specific reason for having connectTCP()
and friends on the reactor (other than they want the reactor to be the
API entry point, perhaps). I'd be curious to hear about it.

> Glyph proposed instead that create_transport()
> should return a Future and the error should be that Future's
> exception, and I like that much better.

But then you have several API layers with different conventions:
connection_made() / connection_lost() use well-defined protocol
methods, while create_transport() returns you a Future on which you
must register success / failure callbacks.

> > I think the transport / protocol registration must be done early, not in
> > connection_made(). Sometimes you will want to do things on a protocol
> > before you know a connection is established, for example queue things
> > to write on the transport. A use case is a reconnecting TCP client:
> > the protocol will continue existing at times when the connection is
> > down.
> 
> Hm. That seems a pretty advanced use case. I think it is better
> handled by passing a "factory function" that returns a pre-created
> protocol:
> 
> pr = MyProtocol(...)
> ev.create_transport(lambda: pr, host, port)
> 
> However you do this, such a protocol object must expect multiple
> connection_made - connection_lost cycles, which sounds to me like
> asking for trouble.

It's quite straightforward actually (*). Of course, only a protocol
explicitly designed for use with a reconnecting client has to be
well-behaved in that regard.

(*) I'm using such a pattern at work, where I've stacked a protocol
abstraction on top of Tornado.

> > * connection_lost(): you definitely want to know whether it's you or the
> >   other end who closed the connection. Typically, if the other end
> >   closed the connection, you will have to run some cleanup steps, and
> >   perhaps even log an error somewhere (if the connection was closed
> >   unexpectedly).
> 
> Glyph's idea was to always pass an exception and use special exception
> subclasses to distinguish the three cases (clean eof from other end,
> self.close(), self.abort()). I resisted this but maybe it's the only
> way?

Perhaps both self.close() and self.abort() should pass None. So
"if error is None: return" is all you have to do to filter out the
boring case.
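In protocol code that convention reduces to a guard clause (the method body is illustrative):

```python
class MyProtocol:
    def __init__(self):
        self.errors = []

    def connection_lost(self, exc):
        if exc is None:
            return               # we closed/aborted it ourselves: boring
        self.errors.append(exc)  # unexpected close: clean up, log, etc.
```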

Regards

Antoine.




From shibturn at gmail.com  Tue Dec 18 20:41:55 2012
From: shibturn at gmail.com (Richard Oudkerk)
Date: Tue, 18 Dec 2012 19:41:55 +0000
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJKKU4WWFmZrsGhH5r0xVKe7_9RKUszJT1Z5tZCx2v3PjA@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
	<20121217231134.19ede507@pitrou.net>
	<CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>
	<CADbA=FUZcfZfNi8OXyYbcRY9X+YTnvwZmUckK_QNUX_kn6h7ig@mail.gmail.com>
	<CAP7+vJKKU4WWFmZrsGhH5r0xVKe7_9RKUszJT1Z5tZCx2v3PjA@mail.gmail.com>
Message-ID: <kaqgu3$60u$1@ger.gmane.org>

On 18/12/2012 4:59pm, Guido van Rossum wrote:
> On Mon, Dec 17, 2012 at 11:26 PM, Geert Jansen <geertj at gmail.com> wrote:
>> I needed a self-pipe on Windows before. See below. With this, the
>> select() based loop might work unmodified on Windows.
>>
>> https://gist.github.com/4325783
>
> Thanks! Before I paste this into Tulip, is there any kind of copyright on this?
>
>> Of course it wouldn't be as efficient as an IOCP based loop.
>
> The socket loop is definitely handy on Windows in a pinch. I have
> plans for an IOCP-based loop based on Richard Oudkerk's 'proactor'
> branch of Tulip v1, but I don't have a Windows machine to test it on
> ATM (hopefully that'll change once I am actually at Dropbox).
>

polling.py in the proactor branch already had an implementation of 
socketpair() for Windows;-)

Also note that on Windows a connecting socket needs to be added to wfds 
*and* xfds when you do

     ... = select(rfds, wfds, xfds, timeout)

If the connection fails then the handle is reported as being exceptional 
but *not* writable.

It might make sense to have add_connector()/remove_connector() which on 
Unix is just an alias for add_writer()/remove_writer().  This would be 
useful if tulip ever has a loop based on WSAPoll() for Windows (Vista 
and later), since WSAPoll() has an awkward bug concerning asynchronous 
connects.
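A sketch of the cross-platform dance Richard describes: register the connecting socket in *both* the write and exceptional sets, then let SO_ERROR report the real outcome on either platform (helper name and shape are mine, not from any proposed API):

```python
import os
import select
import socket

def finish_connect(sock, timeout):
    """Wait for a pending non-blocking connect() to resolve. Windows
    signals a failed connect via xfds rather than wfds, so the socket
    must be watched in both sets; SO_ERROR then distinguishes success
    from failure portably."""
    _, wfds, xfds = select.select([], [sock], [sock], timeout)
    if not wfds and not xfds:
        raise socket.timeout('connect timed out')
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        raise OSError(err, os.strerror(err))
```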

-- 
Richard



From guido at python.org  Tue Dec 18 21:41:04 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 12:41:04 -0800
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <20121218202106.0ad96d3b@pitrou.net>
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<20121218202106.0ad96d3b@pitrou.net>
Message-ID: <CAP7+vJLugoO8Yg=FX+JBt4pTsqkCXxwuQoNNM_MX-6mhr5Md6A@mail.gmail.com>

On Tue, Dec 18, 2012 at 11:21 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Tue, 18 Dec 2012 10:02:05 -0800 Guido van Rossum <guido at python.org> wrote:
>> > Protocols and transports
>> > ------------------------
>> >
>> > We probably want to provide a Protocol base class and encourage people
>> > to inherit it.
>>
>> Glyph suggested that too, and hinted that it does some useful stuff
>> that users otherwise forget. I'm a bit worried though that the
>> functionality of the base implementation becomes the de-facto standard
>> rather than the PEP. (Glyph mentions that the base class has a method
>> that sets self.transport and without it lots of other stuff breaks.)
>
> Well, in the I/O stack we do have base classes with useful method
> implementations too (IOBase and friends).

True. If we go that way they should be in the PEP as well.

>> > So, when creating a client, I would pass it a protocol instance.
>>
>> Heh. That's how I started, and Glyph told me to pass a protocol
>> factory. It can just be a Protocol subclass though, as long as the
>> constructor has the right signature. So maybe we can avoid calling it
>> protocol_factory and name it protocol_class instead.
>>
>> I struggled with what to do if the socket cannot be connected and
>> hence the transport not created. If you've already created the
>> protocol you're in a bit of trouble at that point. I proposed to call
>> connection_lost() in that case (without ever having called
>> connection_made()) but Glyph suggested that would be asking for rare
>> bugs (the connection_lost() code might not expect a half-initialized
>> protocol instance).
>
> I'm proposing something different: the transport should be created
> before the socket is connected, and it should handle the connection
> itself (by calling sock_connect() on the loop, perhaps).

That's a possible implementation technique. But it will still be
created implicitly by create_transport() or start_serving().

> Then:
> - if connect() succeeds, protocol.connection_made() is called
> - if connect() fails, protocol.connection_failed(exc) is called
> (not connection_lost())

That's what I had, but it just adds extra APIs to the abstract class.
Returning a Future that can succeed (probably returning the protocol)
or fail (with some exception) doesn't require adding new methods.
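
A rough sketch of that convention (synchronous, with invented names, just
to show the shape -- the real thing would resolve the Future from the
event loop after the async connect):

```python
from concurrent.futures import Future

class EchoProtocol:
    def connection_made(self, transport):
        self.transport = transport

def create_transport(protocol_class, connect):
    """Sketch: return a Future that succeeds with the protocol instance,
    or fails with the connection error -- no extra protocol methods."""
    fut = Future()
    try:
        transport = connect()          # stand-in for the actual socket connect
    except OSError as exc:
        fut.set_exception(exc)         # failure path: no connection_failed() needed
    else:
        proto = protocol_class()
        proto.connection_made(transport)
        fut.set_result(proto)          # success path: caller gets the protocol
    return fut
```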

> I think it makes more sense for the transport to do the connecting: why
> should the I/O loop know about specific transports? Ideally, it should
> only know about socket objects or fds.

Actually, there's one reason why the loop should know (something)
about transports: different loop implementations will want to use
different transport implementations to meet the same requirements.
E.g. an IOCP-based loop will use different transports than a UNIXy
*poll-based loop.

> I don't know if Twisted had a specific reason for having connectTCP()
> and friends on the reactor (other than they want the reactor to be the
> API entry point, perhaps). I'd be curious to hear about it.

That's the reason.

>> Glyph proposed instead that create_transport()
>> should return a Future and the error should be that Future's
>> exception, and I like that much better.
>
> But then you have several API layers with different conventions:
> connection_made() / connection_lost() use well-defined protocol
> methods, while create_transport() returns you a Future on which you
> must register success / failure callbacks.

Different layers have different needs. Note that if you're using
coroutines the Futures are very easy to use. And Twisted will just
wrap the Future in a Deferred.

>> > I think the transport / protocol registration must be done early, not in
>> > connection_made(). Sometimes you will want to do things on a protocol
>> > before you know a connection is established, for example queue things
>> > to write on the transport. A use case is a reconnecting TCP client:
>> > the protocol will continue existing at times when the connection is
>> > down.
>>
>> Hm. That seems a pretty advanced use case. I think it is better
>> handled by passing a "factory function" that returns a pre-created
>> protocol:
>>
>> pr = MyProtocol(...)
>> ev.create_transport(lambda: pr, host, port)
>>
>> However you do this, such a protocol object must expect multiple
>> connection_made - connection_lost cycles, which sounds to me like
>> asking for trouble.
>
> It's quite straightforward actually (*). Of course, only a protocol
> explicitly designed for use with a reconnecting client has to be
> well-behaved in that regard.

Yeah, but it still is an odd corner case. Anyway, I think I've shown
you how to do it in several different ways while still having a
protocol_factory argument.
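
For example, a protocol explicitly written to survive several
connection_made/connection_lost cycles might look roughly like this
(illustrative sketch, not PEP API), with a factory that reuses one instance:

```python
class ReconnectingProtocol:
    """Sketch of a protocol designed for a reconnecting client."""
    def __init__(self):
        self.transport = None
        self.pending = []              # writes queued while disconnected
        self.cycles = 0

    def write(self, data):
        if self.transport is not None:
            self.transport.write(data)
        else:
            self.pending.append(data)  # queue until (re)connected

    def connection_made(self, transport):
        self.transport = transport
        self.cycles += 1
        for data in self.pending:      # flush the queue on each new connection
            transport.write(data)
        self.pending.clear()

    def connection_lost(self, exc):
        self.transport = None

pr = ReconnectingProtocol()
protocol_factory = lambda: pr          # "factory" that returns the pre-created protocol
```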

> (*) I'm using such a pattern at work, where I've stacked a protocol
> abstraction on top of Tornado.
>
>> > * connection_lost(): you definitely want to know whether it's you or the
>> >   other end who closed the connection. Typically, if the other end
>> >   closed the connection, you will have to run some cleanup steps, and
>> >   perhaps even log an error somewhere (if the connection was closed
>> >   unexpectedly).
>>
>> Glyph's idea was to always pass an exception and use special exception
>> subclasses to distinguish the three cases (clean eof from other end,
>> self.close(), self.abort(). I resisted this but maybe it's the only
>> way?
>
> Perhaps both self.close() and self.abort() should pass None.

They do.

> So "if error is None: return" is all you have to do to filter out the
> boring case.

But a clean close from the other end (as opposed to an unexpected
disconnect e.g. due to a sudden network partition) also passes None. I
guess this is okay because in that case eof_received() is first
called. So I guess the PEP is already okay here. :-)
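
Concretely, a protocol can then tell the three cases apart like this
(a sketch of the convention, not code from the PEP):

```python
class LoggingProtocol:
    """Sketch: distinguish a clean close from an unexpected disconnect.
    A clean close by the peer is signalled by eof_received() arriving
    *before* connection_lost(None); an unexpected drop passes an exception."""
    def __init__(self):
        self.got_eof = False
        self.events = []

    def eof_received(self):
        self.got_eof = True
        self.events.append("eof")

    def connection_lost(self, exc):
        if exc is not None:
            self.events.append("error: %r" % (exc,))
        elif self.got_eof:
            self.events.append("peer closed cleanly")
        else:
            self.events.append("we closed")    # self.close()/self.abort()
```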

-- 
--Guido van Rossum (python.org/~guido)


From solipsis at pitrou.net  Tue Dec 18 21:44:22 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Dec 2012 21:44:22 +0100
Subject: [Python-ideas] PEP 3156 feedback
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<20121218202106.0ad96d3b@pitrou.net>
	<CAP7+vJLugoO8Yg=FX+JBt4pTsqkCXxwuQoNNM_MX-6mhr5Md6A@mail.gmail.com>
Message-ID: <20121218214422.323a4d2f@pitrou.net>

On Tue, 18 Dec 2012 12:41:04 -0800
Guido van Rossum <guido at python.org> wrote:
> > So "if error is None: return" is all you have to do to filter out the
> > boring case.
> 
> But a clean close from the other end (as opposed to an unexpected
> disconnect e.g. due to a sudden network partition) also passes None. I
> guess this is okay because in that case eof_received() is first
> called. So I guess the PEP is already okay here. :-)

Only if the protocol supports EOF, though? Or do you "emulate" by
calling eof_received() in any case?

Regards

Antoine.




From tjreedy at udel.edu  Tue Dec 18 22:15:17 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 18 Dec 2012 16:15:17 -0500
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CADbA=FX1g0-6yQxVF-zHjQCRgJUwXuqrTJqOyX7c7aRfES5cuA@mail.gmail.com>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
	<20121217231134.19ede507@pitrou.net>
	<CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>
	<CADbA=FUZcfZfNi8OXyYbcRY9X+YTnvwZmUckK_QNUX_kn6h7ig@mail.gmail.com>
	<CAP7+vJKKU4WWFmZrsGhH5r0xVKe7_9RKUszJT1Z5tZCx2v3PjA@mail.gmail.com>
	<CADbA=FX1g0-6yQxVF-zHjQCRgJUwXuqrTJqOyX7c7aRfES5cuA@mail.gmail.com>
Message-ID: <kaqmd9$mlv$1@ger.gmane.org>

On 12/18/2012 12:10 PM, Geert Jansen wrote:
> On Tue, Dec 18, 2012 at 5:59 PM, Guido van Rossum <guido at python.org> wrote:
>> On Mon, Dec 17, 2012 at 11:26 PM, Geert Jansen <geertj at gmail.com> wrote:
>>> I needed a self-pipe on Windows before. See below. With this, the
>>> select() based loop might work unmodified on Windows.
>>>
>>> https://gist.github.com/4325783
>>
>> Thanks! Before I paste this into Tulip, is there any kind of copyright on this?
>
> [include list]
>
> I wrote the code. I hereby put it in the public domain.

Sign a PSF contributor agreement if you have not done so yet and that 
should cover it for distribution with CPython.


-- 
Terry Jan Reedy



From guido at python.org  Tue Dec 18 23:39:47 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 14:39:47 -0800
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <20121218214422.323a4d2f@pitrou.net>
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<20121218202106.0ad96d3b@pitrou.net>
	<CAP7+vJLugoO8Yg=FX+JBt4pTsqkCXxwuQoNNM_MX-6mhr5Md6A@mail.gmail.com>
	<20121218214422.323a4d2f@pitrou.net>
Message-ID: <CAP7+vJ+EVA4P9wTFUKCYpdKBK8K658TgZHF7_nAyQwmzK3nYMQ@mail.gmail.com>

On Tue, Dec 18, 2012 at 12:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Tue, 18 Dec 2012 12:41:04 -0800
> Guido van Rossum <guido at python.org> wrote:
>> > So "if error is None: return" is all you have to do to filter out the
>> > boring case.
>>
>> But a clean close from the other end (as opposed to an unexpected
>> disconnect e.g. due to a sudden network partition) also passes None. I
>> guess this is okay because in that case eof_received() is first
>> called. So I guess the PEP is already okay here. :-)
>
> Only if the protocol supports EOF, though? Or do you "emulate" by
> calling eof_received() in any case?

EOF is part of TCP (although I'm sure it has a different name at the
protocol level). The sender can force it by using shutdown(SHUT_WR)
(== write_eof() in Tulip/PEP 3156) or just by closing the socket (if
they don't expect a response). The low-level reader detects this by
recv() returning an empty string. Of course, if the other end closed
both halves and you try to write before reading, send() may raise an
exception and then you'll not get the EOF. And then again, send() may
not raise an exception, it all depends on where stuff gets buffered.
But arguably you get what you ask for in that case.

I plan to call eof_received(), once, if and only if recv() returns an
empty byte string.

(The PEP says that eof_received() should call close() by default, but
I don't actually think that's correct -- it also is hard to put in the
abstract Protocol class unless a specific instance variable holding
the transport is made part of the spec, which I am hesitant to do. I
don't think that ignoring it by default is actually a problem.)
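
The EOF mechanics above, demonstrated with a local socket pair for
brevity (the shutdown/recv behavior is the same over TCP):

```python
import socket

# shutdown(SHUT_WR) is write_eof(): it tells the peer no more data is
# coming. The reader detects it by recv() returning an empty bytes object.
a, b = socket.socketpair()
a.sendall(b"hello")
a.shutdown(socket.SHUT_WR)             # == write_eof() in Tulip/PEP 3156

chunks = []
while True:
    data = b.recv(1024)
    if not data:                       # b"" == EOF: call eof_received() once
        break
    chunks.append(data)
a.close(); b.close()
```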

-- 
--Guido van Rossum (python.org/~guido)


From andrew.svetlov at gmail.com  Tue Dec 18 23:49:15 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Wed, 19 Dec 2012 00:49:15 +0200
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <CAP7+vJ+EVA4P9wTFUKCYpdKBK8K658TgZHF7_nAyQwmzK3nYMQ@mail.gmail.com>
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<20121218202106.0ad96d3b@pitrou.net>
	<CAP7+vJLugoO8Yg=FX+JBt4pTsqkCXxwuQoNNM_MX-6mhr5Md6A@mail.gmail.com>
	<20121218214422.323a4d2f@pitrou.net>
	<CAP7+vJ+EVA4P9wTFUKCYpdKBK8K658TgZHF7_nAyQwmzK3nYMQ@mail.gmail.com>
Message-ID: <CAL3CFcW1TmMtwGv=cQtMKZmNPTrUNxAAy53w6PifhK8R4MPeAg@mail.gmail.com>

About protocols: I think eventloop should support UDP datagrams as
well as operations with file descriptors which are not sockets at all.
I mean timerfd_create and inotify as examples.

On Wed, Dec 19, 2012 at 12:39 AM, Guido van Rossum <guido at python.org> wrote:
> On Tue, Dec 18, 2012 at 12:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> On Tue, 18 Dec 2012 12:41:04 -0800
>> Guido van Rossum <guido at python.org> wrote:
>>> > So "if error is None: return" is all you have to do to filter out the
>>> > boring case.
>>>
>>> But a clean close from the other end (as opposed to an unexpected
>>> disconnect e.g. due to a sudden network partition) also passes None. I
>>> guess this is okay because in that case eof_received() is first
>>> called. So I guess the PEP is already okay here. :-)
>>
>> Only if the protocol supports EOF, though? Or do you "emulate" by
>> calling eof_received() in any case?
>
> EOF is part of TCP (although I'm sure it has a different name at the
> protocol level). The sender can force it by using shutdown(SHUT_WR)
> (== write_eof() in Tulip/PEP 3156) or just by closing the socket (if
> they don't expect a response). The low-level reader detects this by
> recv() returning an empty string. Of course, if the other end closed
> both halves and you try to write before reading, send() may raise an
> exception and then you'll not get the EOF. And then again, send() may
> not raise an exception, it all depends on where stuff gets buffered.
> But arguably you get what you ask for in that case.
>
> I plan to call eof_received(), once, if and only if recv() returns an
> empty byte string.
>
> (The PEP says that eof_received() should call close() by default, but
> I don't actually think that's correct -- it also is hard to put in the
> abstract Protocol class unless a specific instance variable holding
> the transport is made part of the spec, which I am hesitant to do. I
> don't think that ignoring it by default is actually a problem.)
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



-- 
Thanks,
Andrew Svetlov


From guido at python.org  Tue Dec 18 23:51:05 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 14:51:05 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <kaqgu3$60u$1@ger.gmane.org>
References: <CADbA=FXxgj8L+g2BSbrPzqG7dp3MNHP92erdMkxMzgAN7EgggQ@mail.gmail.com>
	<CAP7+vJL1cyS98evDSXOPu-SaAnG4k0ZeM7Tpr=4qwVmhNB4Gmw@mail.gmail.com>
	<50CF9603.6040409@canterbury.ac.nz>
	<20121217231134.19ede507@pitrou.net>
	<CAP7+vJLy4Nug=oVqZQWLg2B=CZOew6FAC38m37sY4n__dvtiEQ@mail.gmail.com>
	<CADbA=FUZcfZfNi8OXyYbcRY9X+YTnvwZmUckK_QNUX_kn6h7ig@mail.gmail.com>
	<CAP7+vJKKU4WWFmZrsGhH5r0xVKe7_9RKUszJT1Z5tZCx2v3PjA@mail.gmail.com>
	<kaqgu3$60u$1@ger.gmane.org>
Message-ID: <CAP7+vJKtyoWi=e5VNZmRUBfG4V5Lt48LLTR2gbO17Zj2goKt7g@mail.gmail.com>

On Tue, Dec 18, 2012 at 11:41 AM, Richard Oudkerk <shibturn at gmail.com> wrote:
> polling.py in the proactor branch already had an implementation of
> socketpair() for Windows;-)

D'oh! And it always uses sockets for the "self-pipe". That makes sense.
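
For reference, the classic loopback-listener emulation (a sketch of the
general technique, not the polling.py code):

```python
import socket

def socketpair_emulated():
    """Sketch of socketpair() emulation for platforms without it:
    listen on loopback, connect to ourselves, accept."""
    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsock.bind(("127.0.0.1", 0))       # any free loopback port
    lsock.listen(1)
    addr = lsock.getsockname()
    csock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    csock.connect(addr)                # completes via the listen backlog
    ssock, _ = lsock.accept()
    lsock.close()
    return ssock, csock
```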

> Also note that on Windows a connecting socket needs to be added to wfds
> *and* xfds when you do
>
>     ... = select(rfds, wfds, xfds, timeout)
>
> If the connection fails then the handle is reported as being exceptional but
> *not* writable.

But SelectProactor in proactor.py doesn't seem to do this.

> It might make sense to have add_connector()/remove_connector() which on Unix
> is just an alias for add_writer()/remove_writer().  This would be useful if
> tulip ever has a loop based on WSAPoll() for Windows (Vista and later),
> since WSAPoll() has an awkward bug concerning asynchronous connects.

Can't we do this for all writers? (If we have to make a distinction,
so be it, but it seems easy to have latent bugs if some platforms
require you to make a different call but others don't care either
way.)
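
The Windows-aware pattern, collapsed into a blocking sketch (a real loop
would register callbacks instead of calling select() inline):

```python
import select

def wait_for_connect(sock, timeout):
    """Sketch: watch a connecting socket in *both* the writable and
    exceptional sets; on Windows a failed connect shows up as
    exceptional, not writable."""
    _, wfds, xfds = select.select([], [sock], [sock], timeout)
    if xfds:
        return "failed"
    if wfds:
        return "connected"
    return "timeout"
```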

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Wed Dec 19 00:00:38 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 15:00:38 -0800
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <CAL3CFcW1TmMtwGv=cQtMKZmNPTrUNxAAy53w6PifhK8R4MPeAg@mail.gmail.com>
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<20121218202106.0ad96d3b@pitrou.net>
	<CAP7+vJLugoO8Yg=FX+JBt4pTsqkCXxwuQoNNM_MX-6mhr5Md6A@mail.gmail.com>
	<20121218214422.323a4d2f@pitrou.net>
	<CAP7+vJ+EVA4P9wTFUKCYpdKBK8K658TgZHF7_nAyQwmzK3nYMQ@mail.gmail.com>
	<CAL3CFcW1TmMtwGv=cQtMKZmNPTrUNxAAy53w6PifhK8R4MPeAg@mail.gmail.com>
Message-ID: <CAP7+vJLA1dp_3d6Cat_+MM8C5LkbDSA0TXK7qrUns8082tkgXQ@mail.gmail.com>

On Tue, Dec 18, 2012 at 2:49 PM, Andrew Svetlov
<andrew.svetlov at gmail.com> wrote:
> About protocols: I think eventloop should support UDP datagrams

Supporting UDP should be relatively straightforward, I just haven't
used it in ages so I could use some help in describing the needed
APIs. There are a lot of recv() variants: recv(), recvfrom(),
recvmsg(), and then an _into() variant for each. And for sending
there's send()/sendall(), sendmsg(), and sendto(). I'd be ecstatic if
someone contributed code to tulip.
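
The basic datagram shape is simple enough; here is a blocking sketch of
the calls an event loop would wrap (addresses and payloads invented):

```python
import socket

# UDP is connectionless: sendto() addresses each packet,
# recvfrom() returns (data, peer_address) for each datagram.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"ping", addr)

data, peer = server.recvfrom(1024)     # one datagram, with sender address
server.sendto(b"pong", peer)
reply, _ = client.recvfrom(1024)
server.close(); client.close()
```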

> as well as operations with file descriptors which are not sockets at all.

That won't work on Windows though. On UNIX you can always use the
add/remove reader/writer APIs and make the calls yourself -- the
patterns in sock_recv() and sock_sendall() are simple enough. (These
are standardized in the PEP mainly because on Windows, with IOCP, the
expectation is that they won't use "ready callbacks" (polling using
select/*poll/kqueue) but instead Windows-specific APIs for starting
I/O operations with a "completion callback".)
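
The ready-callback pattern behind sock_recv(), collapsed into a blocking
sketch (a real loop would use add_reader() instead of an inline select()):

```python
import select
import socket
from concurrent.futures import Future

def sock_recv(sock, nbytes):
    """Sketch: wait until the fd is readable, then do the recv and
    resolve a Future with the result."""
    fut = Future()
    select.select([sock], [], [])      # stand-in for add_reader(fd, callback)
    fut.set_result(sock.recv(nbytes))  # fd is ready; recv won't block
    return fut
```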

> I mean timerfd_create and inotify as examples.

I think those will work -- they look very platform specific but in the
end there's nothing in the add/remove reader/writer API that prevents
you from using non-socket FDs on UNIX. (It's different on Windows,
where select() is the only pollster supported, and Windows select only
works with socket FDs.)

-- 
--Guido van Rossum (python.org/~guido)


From shane at umbrellacode.com  Wed Dec 19 04:44:03 2012
From: shane at umbrellacode.com (Shane Green)
Date: Tue, 18 Dec 2012 19:44:03 -0800
Subject: [Python-ideas]  async: feedback on EventLoop API
Message-ID: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>

Ignoring the API overlap with Futures/Promises for a moment, let me throw out this straw man approach to the event loop that seems to my naive eye like it pulls together a lot of these ideas.

Rather than passing in your callbacks, factories, etc., asynchronous APIs return a lightweight object you register your callback with.  Unlike promises, deferreds, etc., this is a one-time thing: only one callback can register with it.  However, it can be chained.  The registered callback is invoked with the output of the operation when it completes.  

Timer.wait(20).then(callme, *args, **kw)
# I could do 
Timer.wait(20).then(callme, *args, **kw).then(piped_from_callme) 

#I could not do 
handler = Timer.wait(20)
handler.then(callme)
handler.then(callme2) # this would throw an exception.

# I/O example?
sock.accept().then(handle_connection) # invokes handle_connection(conn, addr)
# Read some data
conn.read(1024).then(handle_incoming) # handle_incoming invoked with up to 1024 bytes, read asynchronously.
# Write some data
conn.write("data").then(handle_written) # handle_written invoked with the number of bytes written asynchronously.
# Connect HTTP channel and add it to HTTP dispatcher.
channel.connect((hostname,80)).then(dispatcher.add_channel)


# Listen to FD's for I/O events
descriptors.select(r, w, e).then(handle) # handle(readable, writables, oobs) 

It seems like only supporting a single callback per returned handle lets us circumvent a lot of the weight associated with normal promise/future/deferred pattern implementations, while the chaining could still come in handy: it may cover some of the use-cases being considered when multiple events per fd came up, and chaining is pretty powerful, especially when it comes at little cost.  The API would be much more extensive than "then()", of course, with things like "every", etc.; we'd have to pull examples from everything already discussed.  Just wanted to throw it out there to get beaten up ;-) 
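
A minimal sketch of what I mean (synchronous firing, just to show the
single-callback-plus-chaining shape; names are made up):

```python
class Handle:
    """One callback per handle; then() returns a new handle that is
    fired with the callback's return value, giving a chain."""
    def __init__(self):
        self._callback = None
        self._next = None
        self._result = None
        self._fired = False

    def then(self, func, *args, **kw):
        if self._callback is not None:
            raise RuntimeError("only one callback per handle")
        self._callback = (func, args, kw)
        self._next = Handle()          # the next link in the chain
        if self._fired:                # late registration: fire immediately
            self._fire(self._result)
        return self._next

    def _fire(self, value):
        self._fired = True
        self._result = value
        if self._callback is not None:
            func, args, kw = self._callback
            self._next._fire(func(value, *args, **kw))
```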



Shane Green 
www.umbrellacode.com
805-452-9666 | shane at umbrellacode.com


From jstpierre at mecheye.net  Wed Dec 19 04:51:25 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Tue, 18 Dec 2012 22:51:25 -0500
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
Message-ID: <CAA0H+QT9qVJ1fNMayOy_Lh9af5gLEDrxsOFmBnZ+P2_OevD9pQ@mail.gmail.com>

A lot of things become trivially easy if the assumption is that they can
never fail.

Deferreds/Promises/Tasks/Futures are about sane error handling, not sane
success handling.

(There's a few parts in the current proposal where this falls short, like
par, but that's another post)


On Tue, Dec 18, 2012 at 10:44 PM, Shane Green <shane at umbrellacode.com>wrote:

> Ignoring the API overlap with Futures/Promises for a moment, let me throw
> out this straw man approach to the event loop that seems to my naive eye
> like it pull together a lot of these ideas?
>
> Rather than passing in your callbacks, factories, etc., asynchronous APIs
> return a lightweight object you register your callback with.  Unlike
> promises, deferrers, etc., this is a one-time thing: only one callback can
> register with it.  However, it can be chained.  The registered callback is
> invoked with the output of the operation when it completes.
>
> Timer.wait(20).then(callme, *args, **kw)
> # I could do
> Timer.wait(20).then(callme, *args, **kw).then(piped_from_callme)
>
> #I could not do
> handler = Timer.wait(20)
> handler.then(callme)
> handler.then(callme2) # this would throw an exception.
>
> # I/O example?
> sock.accept().then(handle_connection) # invokes handle_connection(conn,
> addr)
> # Read some data
> conn.read(1024).then(handle_incoming) # handle_incoming invoked with up to
> 1024 bytes, read asynchronously.
> # Write some data
> conn.write("data").then(handle_written) # handle_written invoked with up
> number 5, giving number of bytes written async.
> # Connect HTTP channel and add it to HTTP dispatcher.
> channel.connect((hostname,80)).then(dispatcher.add_channel)
>
>
> # Listen to FD's for I/O events
> descriptors.select(r, w, e).then(handle) # handle(readable, writables,
> oobs)
>
> It seems like only supporting a single callback per returned handle lets
> us circumvent a lot of the weight associated with normal
> promise/future/deferred pattern type implementations, but the chaining
> could come in handy as it may cover some of the use-cases being considered
> when multiple events per fd came up, plus chaining is pretty powerful,
> especially when it comes at little cost.  The API would be much more
> extensive than "then()", of course, with things like "every", etc. we'd
> have to pull examples from everything already discussed.  Just wanted to
> throw out there to get beat up about ;-)
>
>
>
> Shane Green
> www.umbrellacode.com
> 805-452-9666 | shane at umbrellacode.com
>
>
>
>


-- 
  Jasper

From shane at umbrellacode.com  Wed Dec 19 04:55:52 2012
From: shane at umbrellacode.com (Shane Green)
Date: Tue, 18 Dec 2012 19:55:52 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAA0H+QT9qVJ1fNMayOy_Lh9af5gLEDrxsOFmBnZ+P2_OevD9pQ@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAA0H+QT9qVJ1fNMayOy_Lh9af5gLEDrxsOFmBnZ+P2_OevD9pQ@mail.gmail.com>
Message-ID: <69DB9EF6-CDBB-4C7A-B14C-D6F2F36BD217@umbrellacode.com>

Oh, I forgot error-backs.  True, though, error-handling is a bit more difficult.  I'm not sure I see it as being much more challenging than asynchronous callback error handling/reporting in general: it can still be executed in the exception context, etc.  And the chaining can even be used to attach extended error logging to all your callback chains, without losing or swallowing anything that wouldn't have been lost or swallowed by other approaches, unless I'm overlooking something. 





Shane Green 
www.umbrellacode.com
805-452-9666 | shane at umbrellacode.com

On Dec 18, 2012, at 7:51 PM, "Jasper St. Pierre" <jstpierre at mecheye.net> wrote:

> A lot of things become trivially easy if they assumption is that they can never fail.
> 
> Deferreds/Promises/Tasks/Futures are about sane error handling, not sane success handling.
> 
> (There's a few parts in the current proposal where this falls short, like par, but that's another post)
> 
> 
> On Tue, Dec 18, 2012 at 10:44 PM, Shane Green <shane at umbrellacode.com> wrote:
> Ignoring the API overlap with Futures/Promises for a moment, let me throw out this straw man approach to the event loop that seems to my naive eye like it pull together a lot of these ideas?
> 
> Rather than passing in your callbacks, factories, etc., asynchronous APIs return a lightweight object you register your callback with.  Unlike promises, deferrers, etc., this is a one-time thing: only one callback can register with it.  However, it can be chained.  The registered callback is invoked with the output of the operation when it completes.  
> 
> Timer.wait(20).then(callme, *args, **kw)
> # I could do 
> Timer.wait(20).then(callme, *args, **kw).then(piped_from_callme) 
> 
> #I could not do 
> handler = Timer.wait(20)
> handler.then(callme)
> handler.then(callme2) # this would throw an exception.
> 
> # I/O example?
> sock.accept().then(handle_connection) # invokes handle_connection(conn, addr)
> # Read some data
> conn.read(1024).then(handle_incoming) # handle_incoming invoked with up to 1024 bytes, read asynchronously.
> # Write some data
> conn.write("data").then(handle_written) # handle_written invoked with up number 5, giving number of bytes written async. 
> # Connect HTTP channel and add it to HTTP dispatcher.
> channel.connect((hostname,80)).then(dispatcher.add_channel)
> 
> 
> # Listen to FD's for I/O events
> descriptors.select(r, w, e).then(handle) # handle(readable, writables, oobs) 
> 
> It seems like only supporting a single callback per returned handle lets us circumvent a lot of the weight associated with normal promise/future/deferred pattern type implementations, but the chaining could come in handy as it may cover some of the use-cases being considered when multiple events per fd came up, plus chaining is pretty powerful, especially when it comes at little cost.  The API would be much more extensive than "then()", of course, with things like "every", etc. we'd have to pull examples from everything already discussed.  Just wanted to throw out there to get beat up about ;-) 
> 
> 
> 
> Shane Green 
> www.umbrellacode.com
> 805-452-9666 | shane at umbrellacode.com
> 
> 
> 
> 
> 
> 
> -- 
>   Jasper
> 


From shane at umbrellacode.com  Wed Dec 19 04:57:36 2012
From: shane at umbrellacode.com (Shane Green)
Date: Tue, 18 Dec 2012 19:57:36 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <69DB9EF6-CDBB-4C7A-B14C-D6F2F36BD217@umbrellacode.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAA0H+QT9qVJ1fNMayOy_Lh9af5gLEDrxsOFmBnZ+P2_OevD9pQ@mail.gmail.com>
	<69DB9EF6-CDBB-4C7A-B14C-D6F2F36BD217@umbrellacode.com>
Message-ID: <7C468382-2A39-47B3-A868-558207CAC02D@umbrellacode.com>

Or maybe I just misread your response.  Can you elaborate on what you mean by "not about sane success handling"?





Shane Green 
www.umbrellacode.com
805-452-9666 | shane at umbrellacode.com

On Dec 18, 2012, at 7:55 PM, Shane Green <shane at umbrellacode.com> wrote:

> Oh, I forgot error-backs.  True, though, error-handling is a bit more difficult.  I'm not sure I see it as being much more challenging that asynchronous callback error handling/reporting in general, though: it can still be executed in the exception context, etc.  And the chaining can even be used to attach extended error logging to all your callback chains, without losing or swallowing anything that wouldn't have been lost or swallowed by other approaches, unless I'm overlooking something. 
> 
> 
> 
> 
> 
> Shane Green 
> www.umbrellacode.com
> 805-452-9666 | shane at umbrellacode.com
> 
> On Dec 18, 2012, at 7:51 PM, "Jasper St. Pierre" <jstpierre at mecheye.net> wrote:
> 
>> A lot of things become trivially easy if they assumption is that they can never fail.
>> 
>> Deferreds/Promises/Tasks/Futures are about sane error handling, not sane success handling.
>> 
>> (There's a few parts in the current proposal where this falls short, like par, but that's another post)
>> 
>> 
>> On Tue, Dec 18, 2012 at 10:44 PM, Shane Green <shane at umbrellacode.com> wrote:
>> Ignoring the API overlap with Futures/Promises for a moment, let me throw out this straw man approach to the event loop that seems to my naive eye like it pull together a lot of these ideas?
>> 
>> Rather than passing in your callbacks, factories, etc., asynchronous APIs return a lightweight object you register your callback with.  Unlike promises, deferrers, etc., this is a one-time thing: only one callback can register with it.  However, it can be chained.  The registered callback is invoked with the output of the operation when it completes.  
>> 
>> Timer.wait(20).then(callme, *args, **kw)
>> # I could do 
>> Timer.wait(20).then(callme, *args, **kw).then(piped_from_callme) 
>> 
>> #I could not do 
>> handler = Timer.wait(20)
>> handler.then(callme)
>> handler.then(callme2) # this would throw an exception.
>> 
>> # I/O example?
>> sock.accept().then(handle_connection) # invokes handle_connection(conn, addr)
>> # Read some data
>> conn.read(1024).then(handle_incoming) # handle_incoming invoked with up to 1024 bytes, read asynchronously.
>> # Write some data
>> conn.write("data").then(handle_written) # handle_written invoked with up number 5, giving number of bytes written async. 
>> # Connect HTTP channel and add it to HTTP dispatcher.
>> channel.connect((hostname,80)).then(dispatcher.add_channel)
>> 
>> 
>> # Listen to FD's for I/O events
>> descriptors.select(r, w, e).then(handle) # handle(readable, writables, oobs) 
>> 
>> It seems like only supporting a single callback per returned handle lets us circumvent a lot of the weight associated with normal promise/future/deferred pattern type implementations, but the chaining could come in handy as it may cover some of the use-cases being considered when multiple events per fd came up, plus chaining is pretty powerful, especially when it comes at little cost.  The API would be much more extensive than "then()", of course, with things like "every", etc. we'd have to pull examples from everything already discussed.  Just wanted to throw out there to get beat up about ;-) 
>> 
>> 
>> 
>> Shane Green 
>> www.umbrellacode.com
>> 805-452-9666 | shane at umbrellacode.com
>> 
>> 
>> 
>> 
>> 
>> 
>> -- 
>>   Jasper
>> 
> 


From guido at python.org  Wed Dec 19 05:28:35 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 20:28:35 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
Message-ID: <CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>

The point of PEP 3156 is not to make using callbacks easy. It is to
make callbacks mostly disappear in favor of coroutines, but keeping
them around in order to provide interoperability with callback-based
frameworks such as Twisted or Tornado.

Your handlers appear to be an attempt at reinventing Twisted's
Deferred. But Deferred already exists, and it works perfectly fine
with the current callback-based event loop spec in the PEP. It's not
clear how your handlers will enable a coroutine to wait for the result
(or exception) however.
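
For comparison, the coroutine side needs something like this tiny
trampoline (a sketch only; the PEP builds the real thing on yield from):

```python
from concurrent.futures import Future

def run(coro):
    """Sketch of a trampoline: the coroutine yields Futures; the 'loop'
    resolves each one and sends the result back in."""
    try:
        fut = next(coro)
        while True:
            # a real loop would suspend here until the Future completes;
            # this sketch assumes it is already done
            fut = coro.send(fut.result())
    except StopIteration:
        pass

results = []

def fetch():
    f = Future()
    f.set_result(b"payload")           # pre-resolved for the sketch
    data = yield f                     # "wait" for the Future's result
    results.append(data)

run(fetch())
```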

--Guido

On Tue, Dec 18, 2012 at 7:44 PM, Shane Green <shane at umbrellacode.com> wrote:
> Ignoring the API overlap with Futures/Promises for a moment, let me throw
> out this straw man approach to the event loop that seems to my naive eye
> like it pull together a lot of these ideas?
>
> Rather than passing in your callbacks, factories, etc., asynchronous APIs
> return a lightweight object you register your callback with.  Unlike
> promises, deferrers, etc., this is a one-time thing: only one callback can
> register with it.  However, it can be chained.  The registered callback is
> invoked with the output of the operation when it completes.
>
> Timer.wait(20).then(callme, *args, **kw)
> # I could do
> Timer.wait(20).then(callme, *args, **kw).then(piped_from_callme)
>
> #I could not do
> handler = Timer.wait(20)
> handler.then(callme)
> handler.then(callme2) # this would throw an exception.
>
> # I/O example?
> sock.accept().then(handle_connection) # invokes handle_connection(conn,
> addr)
> # Read some data
> conn.read(1024).then(handle_incoming) # handle_incoming invoked with up to
> 1024 bytes, read asynchronously.
> # Write some data
> conn.write("data").then(handle_written) # handle_written invoked with up
> number 5, giving number of bytes written async.
> # Connect HTTP channel and add it to HTTP dispatcher.
> channel.connect((hostname,80)).then(dispatcher.add_channel)
>
>
> # Listen to FD's for I/O events
> descriptors.select(r, w, e).then(handle) # handle(readable, writables, oobs)
>
> It seems like only supporting a single callback per returned handle lets us
> circumvent a lot of the weight associated with normal
> promise/future/deferred pattern type implementations, but the chaining could
> come in handy as it may cover some of the use-cases being considered when
> multiple events per fd came up, plus chaining is pretty powerful, especially
> when it comes at little cost.  The API would be much more extensive than
> "then()", of course, with things like "every", etc. we'd have to pull
> examples from everything already discussed.  Just wanted to throw out there
> to get beat up about ;-)
>
>
>
> Shane Green
> www.umbrellacode.com
> 805-452-9666 | shane at umbrellacode.com
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (python.org/~guido)


From shane at umbrellacode.com  Wed Dec 19 05:47:55 2012
From: shane at umbrellacode.com (Shane Green)
Date: Tue, 18 Dec 2012 20:47:55 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
Message-ID: <07806C41-9384-442C-BDA1-3D0E04C6F441@umbrellacode.com>

Ah, I see.  I did not read through the PEP as I should have.  Given that I didn't do my homework, it would be an awesome coincidence if they enabled a coroutine to wait for the result (or exception) ;-)





Shane Green 
www.umbrellacode.com
805-452-9666 | shane at umbrellacode.com

On Dec 18, 2012, at 8:28 PM, Guido van Rossum <guido at python.org> wrote:

> The point of PEP 3156 is not to make using callbacks easy. It is to
> make callbacks mostly disappear in favor of coroutines, but keeping
> them around in order to provide interoperability with callback-based
> frameworks such as Twisted or Tornado.
> 
> Your handlers appear to be an attempt at reinventing Twisted's
> Deferred. But Deferred already exists, and it works perfectly fine
> with the current callback-based event loop spec in the PEP. It's not
> clear how your handlers will enable a coroutine to wait for the result
> (or exception) however.
> 
> --Guido
> 
> On Tue, Dec 18, 2012 at 7:44 PM, Shane Green <shane at umbrellacode.com> wrote:
>> Ignoring the API overlap with Futures/Promises for a moment, let me throw
>> out this straw man approach to the event loop that seems to my naive eye
>> like it pull together a lot of these ideas?
>> 
>> Rather than passing in your callbacks, factories, etc., asynchronous APIs
>> return a lightweight object you register your callback with.  Unlike
>> promises, deferrers, etc., this is a one-time thing: only one callback can
>> register with it.  However, it can be chained.  The registered callback is
>> invoked with the output of the operation when it completes.
>> 
>> Timer.wait(20).then(callme, *args, **kw)
>> # I could do
>> Timer.wait(20).then(callme, *args, **kw).then(piped_from_callme)
>> 
>> #I could not do
>> handler = Timer.wait(20)
>> handler.then(callme)
>> handler.then(callme2) # this would throw an exception.
>> 
>> # I/O example?
>> sock.accept().then(handle_connection) # invokes handle_connection(conn,
>> addr)
>> # Read some data
>> conn.read(1024).then(handle_incoming) # handle_incoming invoked with up to
>> 1024 bytes, read asynchronously.
>> # Write some data
>> conn.write("data").then(handle_written) # handle_written invoked with up
>> number 5, giving number of bytes written async.
>> # Connect HTTP channel and add it to HTTP dispatcher.
>> channel.connect((hostname,80)).then(dispatcher.add_channel)
>> 
>> 
>> # Listen to FD's for I/O events
>> descriptors.select(r, w, e).then(handle) # handle(readable, writables, oobs)
>> 
>> It seems like only supporting a single callback per returned handle lets us
>> circumvent a lot of the weight associated with normal
>> promise/future/deferred pattern type implementations, but the chaining could
>> come in handy as it may cover some of the use-cases being considered when
>> multiple events per fd came up, plus chaining is pretty powerful, especially
>> when it comes at little cost.  The API would be much more extensive than
>> "then()", of course, with things like "every", etc. we'd have to pull
>> examples from everything already discussed.  Just wanted to throw out there
>> to get beat up about ;-)
>> 
>> 
>> 
>> Shane Green
>> www.umbrellacode.com
>> 805-452-9666 | shane at umbrellacode.com
>> 
>> 
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>> 
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)


From jstpierre at mecheye.net  Wed Dec 19 05:45:34 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Tue, 18 Dec 2012 23:45:34 -0500
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
Message-ID: <CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>

I guess this is as good a place as any to bring this up, but we really need to
address issues with error handling and things like par().

par() has one way to handle errors: if one task (using it as a general term
to encompass futures and coroutines) fails, all tasks fail.

This is nowhere near acceptable. As a simple example, par(grab_page("
http://google.com"), grab_page("http://yahoo.com")) should not fail if one
of the two sites returns a 500; the results of another may still be useful
to us.

I can think of an approach that doesn't require passing more arguments to
par(), but may be absurdly silly: the results generated by par() are not
directly results returned by the task, but instead an intermediate wrapper
value that allows us to hoist the error handling into the caller.

    for intermediate in par(*tasks):
        try:
            result = intermediate.result()
        except ValueError as e:
            print("bad")
        else:
            print("good")

But this makes the trade-off that you can't immediately cancel all the
other tasks when one task fails.
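The intermediate-wrapper idea above can be sketched as follows. This is a hedged, synchronous stand-in (Intermediate and this par() are made-up names, and tasks are modeled as plain callables rather than real futures/coroutines), just to show how error handling hoists into the caller's loop:

```python
class Intermediate:
    """Wraps one task's outcome so the caller decides how to handle errors."""

    def __init__(self, value=None, exc=None):
        self._value, self._exc = value, exc

    def result(self):
        if self._exc is not None:
            raise self._exc     # re-raise the task's failure on demand
        return self._value


def par(*tasks):
    # Run each task and wrap its outcome instead of failing the whole batch.
    out = []
    for task in tasks:
        try:
            out.append(Intermediate(value=task()))
        except Exception as exc:
            out.append(Intermediate(exc=exc))
    return out


# Error handling lives in the caller, one sub-task at a time.
statuses = []
for intermediate in par(lambda: "ok", lambda: 1 / 0):
    try:
        intermediate.result()
    except ZeroDivisionError:
        statuses.append("bad")
    else:
        statuses.append("good")
# statuses == ["good", "bad"]
```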

The only true way to be notified when a task has finished, either with
success or with error, is a callback, which I think we should flesh out
entirely in our Futures model.

And, of course, we should make sure that we can handle the four situations
mentioned in [0], even if we don't solve them with callbacks.

[0] https://gist.github.com/3889970


On Tue, Dec 18, 2012 at 11:28 PM, Guido van Rossum <guido at python.org> wrote:

> The point of PEP 3156 is not to make using callbacks easy. It is to
> make callbacks mostly disappear in favor of coroutines, but keeping
> them around in order to provide interoperability with callback-based
> frameworks such as Twisted or Tornado.
>
> Your handlers appear to be an attempt at reinventing Twisted's
> Deferred. But Deferred already exists, and it works perfectly fine
> with the current callback-based event loop spec in the PEP. It's not
> clear how your handlers will enable a coroutine to wait for the result
> (or exception) however.
>
> --Guido
>
> On Tue, Dec 18, 2012 at 7:44 PM, Shane Green <shane at umbrellacode.com>
> wrote:
> > Ignoring the API overlap with Futures/Promises for a moment, let me throw
> > out this straw man approach to the event loop that seems to my naive eye
> > like it pull together a lot of these ideas?
> >
> > Rather than passing in your callbacks, factories, etc., asynchronous APIs
> > return a lightweight object you register your callback with.  Unlike
> > promises, deferrers, etc., this is a one-time thing: only one callback
> can
> > register with it.  However, it can be chained.  The registered callback
> is
> > invoked with the output of the operation when it completes.
> >
> > Timer.wait(20).then(callme, *args, **kw)
> > # I could do
> > Timer.wait(20).then(callme, *args, **kw).then(piped_from_callme)
> >
> > #I could not do
> > handler = Timer.wait(20)
> > handler.then(callme)
> > handler.then(callme2) # this would throw an exception.
> >
> > # I/O example?
> > sock.accept().then(handle_connection) # invokes handle_connection(conn,
> > addr)
> > # Read some data
> > conn.read(1024).then(handle_incoming) # handle_incoming invoked with up
> to
> > 1024 bytes, read asynchronously.
> > # Write some data
> > conn.write("data").then(handle_written) # handle_written invoked with up
> > number 5, giving number of bytes written async.
> > # Connect HTTP channel and add it to HTTP dispatcher.
> > channel.connect((hostname,80)).then(dispatcher.add_channel)
> >
> >
> > # Listen to FD's for I/O events
> > descriptors.select(r, w, e).then(handle) # handle(readable, writables,
> oobs)
> >
> > It seems like only supporting a single callback per returned handle lets
> us
> > circumvent a lot of the weight associated with normal
> > promise/future/deferred pattern type implementations, but the chaining
> could
> > come in handy as it may cover some of the use-cases being considered when
> > multiple events per fd came up, plus chaining is pretty powerful,
> especially
> > when it comes at little cost.  The API would be much more extensive than
> > "then()", of course, with things like "every", etc. we'd have to pull
> > examples from everything already discussed.  Just wanted to throw out
> there
> > to get beat up about ;-)
> >
> >
> >
> > Shane Green
> > www.umbrellacode.com
> > 805-452-9666 | shane at umbrellacode.com
> >
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > http://mail.python.org/mailman/listinfo/python-ideas
> >
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
  Jasper

From guido at python.org  Wed Dec 19 06:36:34 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 21:36:34 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
Message-ID: <CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>

On Tue, Dec 18, 2012 at 8:45 PM, Jasper St. Pierre
<jstpierre at mecheye.net> wrote:
> I guess this is a good place as any to bring this up, but we really need to
> address issues with error handling and things like par().
>
> par() has one way to handle errors: if one task (using it as a general term
> to encompass futures and coroutines) fails, all tasks fail.
>
> This is nowhere near acceptable. As a simple example,
> par(grab_page("http://google.com"), grab_page("http://yahoo.com")) should
> not fail if one of the two sites returns a 500; the results of another may
> still be useful to us.

Yes, there need to be a few variants. If you want all the results,
regardless of errors, we can provide a variant of par() whose result
is a list of futures instead of a list of results (or a single
exception). This could also add a timeout. There also needs to be a
way to take a set of tasks and wait for the first one to complete. (In
fact, put a timeout on this and you can build any other variant
easily.)

PEP 3148 probably shows the way here: it has as_completed() and
wait(), although we cannot emulate these APIs exactly, since they
block -- we need something you can use in a yield from, e.g.

fs = {set of Futures}
while fs:
    f = yield from wait_one(fs)  # Optionally with a timeout
    fs.remove(f)
    <use f>

(We could possibly do the remove() call in wait_one(), although that
may limit the argument type to a set.)
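A toy, runnable sketch of how this wait_one() pattern could behave. The minimal Future class and the hand-written driver loop are illustrative stand-ins for the real event loop and are not proposed API; wait_one() is a coroutine that suspends until some future in the set has completed:

```python
class Future:
    _PENDING = object()

    def __init__(self):
        self._result = Future._PENDING

    def done(self):
        return self._result is not Future._PENDING

    def set_result(self, value):
        self._result = value

    def result(self):
        return self._result


def wait_one(fs):
    """Coroutine: suspend until one future in fs is done, then return it."""
    while not any(f.done() for f in fs):
        yield                       # hand control back to the event loop
    return next(f for f in fs if f.done())


def consume(fs):
    # The usage pattern from the message above.
    fs, collected = set(fs), []
    while fs:
        f = yield from wait_one(fs)
        fs.remove(f)
        collected.append(f.result())
    return collected


# Drive the coroutine by hand, completing one future each time it suspends.
futures = [Future() for _ in range(3)]
coro = consume(futures)
pending, collected = list(futures), None
while True:
    try:
        next(coro)
    except StopIteration as stop:
        collected = stop.value      # the coroutine's return value
        break
    if pending:
        pending.pop().set_result(len(pending))
```

In a real implementation the loop body would be the event loop's poll step rather than this synthetic "complete one future per tick" driver.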

> I can think of an approach that doesn't require passing more arguments to
> par(), but may be absurdly silly: the results generated by par() are not
> directly results returned by the task, but instead an intermediate wrapper
> value that allows us to hoist the error handling into the caller.
>
>     for intermediate in par(*tasks):
>         try:
>             result = intermediate.result()
>         except ValueError as e:
>             print("bad")
>         else:
>             print("good")
>
> But this makes the trade-off that you can't immediately cancel all the other
> tasks when one task fails.

Yeah, that's the par() variant that returns futures instead of results.

> The only truly way to be notified when a task has finished, either with
> success, with error, is a callback, which I think we should flesh out
> entirely in our Futures model.

Proposal?

> And, of course, we should make sure that we can handle the four situations
> mentioned in [0] , even if we don't solve them with callbacks.
>
> [0] https://gist.github.com/3889970

That's longwinded and written in a confrontational style. Can you summarize?

-- 
--Guido van Rossum (python.org/~guido)


From jstpierre at mecheye.net  Wed Dec 19 07:10:36 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Wed, 19 Dec 2012 01:10:36 -0500
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
Message-ID: <CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>

On Wed, Dec 19, 2012 at 12:36 AM, Guido van Rossum <guido at python.org> wrote:

> On Tue, Dec 18, 2012 at 8:45 PM, Jasper St. Pierre
> <jstpierre at mecheye.net> wrote:
> > I guess this is a good place as any to bring this up, but we really need
> to
> > address issues with error handling and things like par().
> >
> > par() has one way to handle errors: if one task (using it as a general
> term
> > to encompass futures and coroutines) fails, all tasks fail.
> >
> > This is nowhere near acceptable. As a simple example,
> > par(grab_page("http://google.com"), grab_page("http://yahoo.com"))
> should
> > not fail if one of the two sites returns a 500; the results of another
> may
> > still be useful to us.
>
> Yes, there need to be a few variants. If you want all the results,
> regardless of errors, we can provide a variant of par() whose result
> is a list of futures instead of a list of results (or a single
> exception). This could also add a timeout. There also needs to be a
> way to take a set of tasks and wait for the first one to complete. (In
> fact, put a timeout on this and you can build any other variant
> easily.)
>
> PEP 3148 probably shows the way here, it has as_completed() and
> wait(), although we cannot emulate these APIs exactly (since they
> block -- we need something you can use in a yield from, e.g.
>
> fs = {set of Futures}
> while fs:
>     f = yield from wait_one(fs)  # Optionally with a timeout
>     fs.remove(f)
>     <use f>
>
> (We could possibly do the remove() call ih wait_one(), although that
> may limit the argument type to a set.)
>
> > I can think of an approach that doesn't require passing more arguments to
> > par(), but may be absurdly silly: the results generated by par() are not
> > directly results returned by the task, but instead an intermediate
> wrapper
> > value that allows us to hoist the error handling into the caller.
> >
> >     for intermediate in par(*tasks):
> >         try:
> >             result = intermediate.result()
> >         except ValueError as e:
> >             print("bad")
> >         else:
> >             print("good")
> >
> > But this makes the trade-off that you can't immediately cancel all the
> other
> > tasks when one task fails.
>
> Yeah, that's the par() variant that returns futures instead of results.
>
> > The only truly way to be notified when a task has finished, either with
> > success, with error, is a callback, which I think we should flesh out
> > entirely in our Futures model.
>
> Proposal?
>

I'm not sure if this will work out, but I think the par() could have some
sort of "immediate result" callback which fires when one of the sub-tasks
finishes. If we then take out the part where we fail and abort automatically,
we might have a close enough approximation:

    def fail_silently(par_task, subtask):
        try:
            return subtask.result()
        except Exception as e:
            print("grabbing failed", e)
            return None

    pages = list(yield par(grab_page("http://google.com"), grab_page("
http://yahoo.com"), subtask_completed=fail_silently))

Where par returns a list of values instead of a list of tasks. But maybe
the ability to manipulate the return value from the subtask completion
callback hands it a bit too much power.

I like the initial approach, but the details need fleshing out. I think it
would be neat if we could have several standard behaviors in the stdlib:
subtask_completed=fail_silently, subtask_completed=abort_task, etc.
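A minimal sketch of what such a par(..., subtask_completed=...) could look like. All names are hypothetical; tasks are modeled as plain callables run synchronously, and the par_task argument is stubbed out as None, purely to illustrate how the hook sees each finished sub-task:

```python
class Subtask:
    """Wraps a finished task so the hook can inspect result or exception."""

    def __init__(self):
        self._result = None
        self._exc = None

    def result(self):
        if self._exc is not None:
            raise self._exc
        return self._result


def par(*tasks, subtask_completed=None):
    results = []
    for task in tasks:
        sub = Subtask()
        try:
            sub._result = task()          # stand-in for running the task
        except Exception as exc:
            sub._exc = exc
        if subtask_completed is not None:
            # The hook decides what value (if any) to contribute.
            results.append(subtask_completed(None, sub))
        else:
            results.append(sub.result())  # default: re-raise on failure
    return results


def fail_silently(par_task, subtask):
    try:
        return subtask.result()
    except Exception:
        return None  # swallow the failure, keep the other results


def ok():
    return "page"

def boom():
    raise ValueError("HTTP 500")

pages = par(ok, boom, subtask_completed=fail_silently)
# pages == ["page", None]
```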

> And, of course, we should make sure that we can handle the four situations
> > mentioned in [0] , even if we don't solve them with callbacks.
> >
> > [0] https://gist.github.com/3889970
>
> That's longwinded and written in a confrontational style. Can you
> summarize?
>

Yeah, this was more of a lament about libraries like jQuery that implement the
CommonJS Promise/A specification incorrectly. It's really only relevant if we
choose to add errbacks, as it's about the composition and semantics of
callbacks/errbacks, and chaining the two.

> --
> --Guido van Rossum (python.org/~guido)
>



-- 
  Jasper

From guido at python.org  Wed Dec 19 07:24:50 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Dec 2012 22:24:50 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
	<CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
Message-ID: <CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>

On Tuesday, December 18, 2012, Jasper St. Pierre wrote:

> On Wed, Dec 19, 2012 at 12:36 AM, Guido van Rossum <guido at python.org> wrote:
>
>> On Tue, Dec 18, 2012 at 8:45 PM, Jasper St. Pierre
>> <jstpierre at mecheye.net> wrote:
>> > I guess this is a good place as any to bring this up, but we really
>> need to
>> > address issues with error handling and things like par().
>> >
>> > par() has one way to handle errors: if one task (using it as a general
>> term
>> > to encompass futures and coroutines) fails, all tasks fail.
>> >
>> > This is nowhere near acceptable. As a simple example,
>> > par(grab_page("http://google.com"), grab_page("http://yahoo.com"))
>> should
>> > not fail if one of the two sites returns a 500; the results of another
>> may
>> > still be useful to us.
>>
>> Yes, there need to be a few variants. If you want all the results,
>> regardless of errors, we can provide a variant of par() whose result
>> is a list of futures instead of a list of results (or a single
>> exception). This could also add a timeout. There also needs to be a
>> way to take a set of tasks and wait for the first one to complete. (In
>> fact, put a timeout on this and you can build any other variant
>> easily.)
>>
>> PEP 3148 probably shows the way here, it has as_completed() and
>> wait(), although we cannot emulate these APIs exactly (since they
>> block -- we need something you can use in a yield from, e.g.
>>
>> fs = {set of Futures}
>> while fs:
>>     f = yield from wait_one(fs)  # Optionally with a timeout
>>     fs.remove(f)
>>     <use f>
>>
>> (We could possibly do the remove() call ih wait_one(), although that
>> may limit the argument type to a set.)
>>
>> > I can think of an approach that doesn't require passing more arguments
>> to
>> > par(), but may be absurdly silly: the results generated by par() are not
>> > directly results returned by the task, but instead an intermediate
>> wrapper
>> > value that allows us to hoist the error handling into the caller.
>> >
>> >     for intermediate in par(*tasks):
>> >         try:
>> >             result = intermediate.result()
>> >         except ValueError as e:
>> >             print("bad")
>> >         else:
>> >             print("good")
>> >
>> > But this makes the trade-off that you can't immediately cancel all the
>> other
>> > tasks when one task fails.
>>
>> Yeah, that's the par() variant that returns futures instead of results.
>>
>> > The only truly way to be notified when a task has finished, either with
>> > success, with error, is a callback, which I think we should flesh out
>> > entirely in our Futures model.
>>
>> Proposal?
>>
>
> I'm not sure if this will work out, but I think the par() could have some
> sort of "immediate result" callback which fires when one of the sub-tasks
> fire. If we then take out the part where we fail and abort automatically,
> we might have a close enough approximation:
>
>     def fail_silently(par_task, subtask):
>         try:
>             return subtask.result()
>         except Exception as e:
>             print("grabbing failed", e)
>             return None
>
>     pages = list(yield par(grab_page("http://google.com"), grab_page("
> http://yahoo.com"), subtask_completed=fail_silently))
>
> Where par returns a list of values instead of a list of tasks. But maybe
> the ability to manipulate the return value from the subtask completion
> callback hands it a bit too much power.
>

That looks reasonable too, although the signature may need to be adjusted.
(How does it cancel the remaining tasks if it wants to? Or does par() do
that if this callback raises?) Maybe call it filter?

But what did you think of my wait_one() proposal? It may work better in a
coroutine, where callbacks are considered a nuisance.



> I like the initial approach, but the details need fleshing out. I think it
> would be neat if we could have several standard behaviors in the stdlib:
> subtask_completed=fail_silently, subtask_completed=abort_task, etc.
>
> > And, of course, we should make sure that we can handle the four
>> situations
>> > mentioned in [0] , even if we don't solve them with callbacks.
>> >
>> > [0] https://gist.github.com/3889970
>>
>> That's longwinded and written in a confrontational style. Can you
>> summarize?
>>
>
> Yeah, this was more at a lament at libraries like jQuery that implement
> the CommonJS Promise/A specification wrong. It's really only relevant if we
> choose to add errbacks, as it's about the composition and sematics between
> callbacks/errbacks, and chaining the two.
>

No, no, no! Please. No errbacks. No chaining. Coroutines have a different
way to spell those already: errbacks -> except clauses, chaining ->
multiple yield-froms in one coroutine, or call another coroutine. Please.

--Guido


-- 
--Guido van Rossum (on iPad)

From jstpierre at mecheye.net  Wed Dec 19 07:41:07 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Wed, 19 Dec 2012 01:41:07 -0500
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
	<CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
	<CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>
Message-ID: <CAA0H+QTc24JHbhiZZAfohzSNd=haUt_xLiXZ5m40SUqAEtf_pQ@mail.gmail.com>

On Wed, Dec 19, 2012 at 1:24 AM, Guido van Rossum <guido at python.org> wrote:

... snip ...

That looks reasonable too, although the signature may need to be adjusted.
> (How does it cancel the remaining tasks if it wants to? Or does par() do
> that if this callback raises?) maybe call it filter?
>

The subtask completion callback can call abort() on the overall par_task,
which could cancel the rest of the unfinished tasks.

    def abort_task(par_task, subtask):
        try:
            return subtask.result()
        except ValueError:
            par_task.abort()

The issue with this approach is that since the par() would return values
again, not tasks, we can't handle errors locally. Futures are also
immutable, so we can't modify the values after they resolve. Maybe we'd
have something like:

    def fail_silently(par_task, subtask):
        try:
            subtask.result()
        except ValueError as e:
            return Future.completed(None)  # an already-completed future
            # with a value of None (sorry, don't remember the exact spelling)
        else:
            return subtask

which allows us:

    for task in par(*tasks, subtask_completion=fail_silently):
        # ...

Which allows us both local and batch error handling. But it's very
verbose on the callback side. Hm.


> But what did you think of my wait_one() proposal? It may work beter in a
> coroutine, where callbacks are considered a nuisance.
>

To be honest, I didn't quite understand it. I'd have to go back and re-read
PEP 3148.


> I like the initial approach, but the details need fleshing out. I think it
>> would be neat if we could have several standard behaviors in the stdlib:
>> subtask_completed=fail_silently, subtask_completed=abort_task, etc.
>>
>> > And, of course, we should make sure that we can handle the four
>>> situations
>>> > mentioned in [0] , even if we don't solve them with callbacks.
>>> >
>>> > [0] https://gist.github.com/3889970
>>>
>>> That's longwinded and written in a confrontational style. Can you
>>> summarize?
>>>
>>
>> Yeah, this was more at a lament at libraries like jQuery that implement
>> the CommonJS Promise/A specification wrong. It's really only relevant if we
>> choose to add errbacks, as it's about the composition and sematics between
>> callbacks/errbacks, and chaining the two.
>>
>
> No, no, no! Please. No errbacks. No chaining. Coroutines have a different
> way to spell those already: errbacks -> except clauses, chaining ->
> multiple yield-froms in one coroutine, or call another coroutine. Please.
>
> --Guido
>
>
> --
> --Guido van Rossum (on iPad)
>



-- 
  Jasper

From g.rodola at gmail.com  Wed Dec 19 15:51:43 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Wed, 19 Dec 2012 15:51:43 +0100
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
Message-ID: <CAFYqXL_2W3W=x52y8tCaj=huXu+ZAhAgiq4tE5ULB5m+ne53ew@mail.gmail.com>

2012/12/18 Guido van Rossum <guido at python.org>
>
> On Tue, Dec 18, 2012 at 2:01 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > Event loop API
> > --------------
> >
> > I would like to say that I prefer Tornado's model: for each primitive
> > provided by Tornado, you can pass an explicit Loop instance which you
> > instantiated manually.
> > There is no module function or policy object hiding this mechanism:
> > it's simple, explicit and flexible (in other words: if you want a
> > per-thread event loop, just do it yourself using TLS :-)).
>
> It sounds though as if the explicit loop is optional, and still
> defaults to some global default loop?
>
> Having one global loop shared by multiple threads is iffy though. Only
> one thread should be *running* the loop, otherwise the loop can't be
> used as a mutual exclusion device. Worse, all primitives for adding
> and removing callbacks/handlers must be made threadsafe, and then
> basically the entire event loop becomes full of locks, which seems
> wrong to me.

The basic idea is to have multiple threads/processes, each running its
own IO loop.
No locks are required because each IO poller instance will deal with
its own socket-map / callbacks-queue and no resources are shared.
In asyncore this was achieved by introducing the "map" parameter.
Similarly to Tornado, pyftpdlib uses an "ioloop" parameter which can
be passed to all the classes which will handle the connection (the
handlers).
If "ioloop" is provided, all the handlers will use it (and register()
against it, call add_reader(), etc.); otherwise the "global" ioloop
instance will be used by default.
A dynamic IO poller like this is important because, in case the
connection handlers are forced to block for some reason, you can
switch from one concurrency model (async / non-blocking) to another
(multiple threads/processes) very easily.
See:
http://code.google.com/p/pyftpdlib/issues/detail?id=212#c9
http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/servers.py?spec=svn1137&r=1137

Hope this helps,

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From techtonik at gmail.com  Wed Dec 19 16:11:10 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 19 Dec 2012 18:11:10 +0300
Subject: [Python-ideas] Tree as a data structure (Was:  Graph class)
Message-ID: <CAPkN8xK_g0=TskvGjcOpbtgnffXZVC2ce8qrg=D7dGmDLfUmLQ@mail.gmail.com>

On Sun, Dec 16, 2012 at 6:41 PM, Guido van Rossum <guido at python.org> wrote:

> I think of graphs and trees as patterns, not data structures.


In my world strings, ints and lists are 1D data types, and a tree can be a
very important 2D data structure. Even if it is a pattern, this pattern is
vital for the transformation of structured data, because it allows one to
represent any data structure in a canonical format.

Speaking of tree as a data structure, I assume that it has a very basic
definition:

1. tree consists of nodes
2. some nodes are containers for other nodes
3. every node has properties
4. every node has 0 or 1 parent
5. every container has 1+ children
6. tree has a single starting root node
7. no child of a parent can be its ancestor
   (no cyclic dependencies between elements)

List of trees is a forest. Every subtree is a complete tree.
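The seven rules above can be captured in a few lines. This is a hypothetical sketch, not a proposal for a stdlib API; TreeNode and its method names are invented for illustration:

```python
class TreeNode:
    def __init__(self, name, **properties):
        self.name = name
        self.properties = properties   # rule 3: every node has properties
        self.parent = None             # rule 4: 0 or 1 parent
        self.children = []             # rules 2 and 5: containers hold nodes

    def add(self, child):
        # rule 7: a child may not be an ancestor of its parent (no cycles)
        node = self
        while node is not None:
            if node is child:
                raise ValueError("cycle: child is an ancestor of parent")
            node = node.parent
        child.parent = self
        self.children.append(child)
        return child

    def root(self):
        # rule 6: walking up from any node reaches the single root
        node = self
        while node.parent is not None:
            node = node.parent
        return node


# 'mylib.books' style access from the ideas above would build on something
# like this:
lib = TreeNode("mylib")
books = lib.add(TreeNode("books"))
book = books.add(TreeNode("book", title="On Trees"))
# book.root() is lib; books.add(lib) would raise ValueError
```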


To see which tree data type would be really useful in the Python distribution
(i.e. one that provides a simple, extensible and intuitive interface), I see
only one way: scratch some itches relevant to Python and then try to
scale the result to other problems. The outcome should answer the question:
what is a native tree type not suitable for?

More ideas:
[ ] Much experience in working with trees can be borrowed from XML and DOM
manipulation practice (jQuery and friends)
  [ ] every element in a tree can be accessed by an address specifier
such as 'root/node[3]/last'
  [ ] but it is also convenient to access tree data using node names, as
in 'mylib.books[:1]'
  [ ] and of course, you can run queries over trees

[ ] A tree is the base for any "Data Transformation Framework", as it
allows one to jump from "data type conversion" to "data structure
conversion and mapping"
  [ ] Trees can be converted to other trees and to more complicated
structures
  [ ] Conversion can be symmetrical or non-symmetrical
  [ ] Conversion can be lossy or lossless
  [ ] Conversion can be lossless and non-symmetrical at the same time

Trees can be used, for example, for real-time migration of issues from one
tracker to another. For managing changesets with additional meta
information. For presenting package dependencies and working with them. For
atomic (transactional) file management. For managing operating system
capability information. For logging setup. For debugging structures in
Python. For working with and converting binary file formats. For a common
AST transformation and query interface. For understanding how 2to3 fixers
work. For a common ground of visual representation, comparison and
transformation of data structures. That's probably enough of my itches.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121219/3f962fb7/attachment.html>

From jeff at jeffreyjenkins.ca  Wed Dec 19 17:07:31 2012
From: jeff at jeffreyjenkins.ca (Jeff Jenkins)
Date: Wed, 19 Dec 2012 11:07:31 -0500
Subject: [Python-ideas] Tree as a data structure (Was: Graph class)
In-Reply-To: <CAK6S7jKeX1jcsBR8JFYwH7mjYo3foCYtb_NR_08WsJjdY7K9Mg@mail.gmail.com>
References: <CAPkN8xK_g0=TskvGjcOpbtgnffXZVC2ce8qrg=D7dGmDLfUmLQ@mail.gmail.com>
	<CAK6S7jKeX1jcsBR8JFYwH7mjYo3foCYtb_NR_08WsJjdY7K9Mg@mail.gmail.com>
Message-ID: <CAK6S7jJ0niDqG1xt0Fwwu=QwisRnD4cHoU_7GZ0K5m7dNXoUBg@mail.gmail.com>

Trying again; this email address was apparently not on the list:

My experience dealing with trees is that the "tree" part is always
so simple that it isn't a big deal to re-implement it.  The problem is
dealing with all of the extra stuff that you need and the details of what
happens when you do different operations.  I think it makes more sense to
have an interface for a kind of thing you want to do with a tree (e.g.
sorted sets, or ordered maps) rather than the tree itself.



On Wed, Dec 19, 2012 at 10:55 AM, Jeff Jenkins <jeff at jeffreyjenkins.ca>wrote:

> My experience dealing with trees is always that the "tree" part is always
> so simple that it isn't a big deal to re-implement it.  The problem is
> dealing with all of the extra stuff that you need and the details of what
> happens when you do different operations.  I think it makes more sense to
> have an interface for a kind of thing you want to do with a tree (e.g.
> sorted sets, or ordered maps) rather than the tree itself.
>
>
> On Wed, Dec 19, 2012 at 10:11 AM, anatoly techtonik <techtonik at gmail.com>wrote:
>
>> On Sun, Dec 16, 2012 at 6:41 PM, Guido van Rossum <guido at python.org>wrote:
>>
>>> I think of graphs and trees as patterns, not data structures.
>>
>>
>> In my world strings, ints and lists are 1D data types, and tree can be a
>> very important 2D data structure. Even if it is a pattern, this pattern is
>> vital for the transformation of structured data, because it allows to
>> represent any data structure in canonical format.
>>
>> Speaking of tree as a data structure, I assume that it has a very basic
>> definition:
>>
>> 1. tree consists of nodes
>> 2. some nodes are containers for other nodes
>> 3. every node has properties
>> 4. every node has 0 or 1 parent
>> 5. every container has 1+ children
>> 6. tree has a single starting root node
>> 7. no child of a parent can be its ancestor
>>    (no cyclic dependencies between elements)
>>
>> List of trees is a forest. Every subtree is a complete tree.
>>
>>
>> To see which tree data type would be really useful in Python distribution
>> (e.g. provides a simple, extendable and intuitive interface), I see only
>> one way - is to scratching some itches relevant to Python and then try to
>> scale it to other problems. The outcome should be the answer -
>> what for native tree type is not suitable?
>>
>> More ideas:
>> [ ] Much experience for working with trees can be brought from XML and
>> DOM manipulation practices (jQuery and friends)
>>   [ ] every element in a tree can be accessed by its address specificator
>> as 'root/node[3]/last'
>>   [ ] but it is also convenient to access tree data using node names as
>> 'mylib.books[:1]'
>>   [ ] and of course, you can run queries over trees
>>
>> [ ] Tree is the base for any "Data Transformation Framework" as it allows
>> to jump from "data type conversion" to "data structure conversion and
>> mapping"
>>   [ ] Trees can be converted to other trees and to more complicated
>> structures
>>   [ ] Conversion can be symmetrical and non-symmetrical
>>   [ ] Conversion can be lossy and lossless
>>   [ ] Conversion can be lossless and non-symmetrical at the same time
>>
>> Trees can be used, for example, for realtime migration of issues from one
>> tracker to another. For managing changesets with additional meta
>> information. For presenting package dependencies and working with them. For
>> atomic (transactional) file management. For managing operating system
>> capability information. For logging setup. For debugging structures in
>> Python. For working and converting binary file formats. For the common AST
>> transformation and query interface. For the understanding how 2to3 fixers
>> work. For the common ground of visual representation, comparison and
>> transformation of data structures. That's probably enough of my itches.
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121219/54f0445c/attachment.html>

From jimjjewett at gmail.com  Wed Dec 19 17:38:03 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 19 Dec 2012 11:38:03 -0500
Subject: [Python-ideas] Tree as a data structure (Was: Graph class)
In-Reply-To: <CAPkN8xK_g0=TskvGjcOpbtgnffXZVC2ce8qrg=D7dGmDLfUmLQ@mail.gmail.com>
References: <CAPkN8xK_g0=TskvGjcOpbtgnffXZVC2ce8qrg=D7dGmDLfUmLQ@mail.gmail.com>
Message-ID: <CA+OGgf64Az2g3vq+3UY7WRA1y_Z0PWiSy==Co4-zfnCey6O+Bg@mail.gmail.com>

On 12/19/12, anatoly techtonik <techtonik at gmail.com> wrote:
> On Sun, Dec 16, 2012 at 6:41 PM, Guido van Rossum <guido at python.org> wrote:

>> I think of graphs and trees as patterns, not data structures.

> In my world strings, ints and lists are 1D data types, and tree can be a
> very important 2D data structure.

Yes; the catch is that the details of that data structure will differ
depending on the problem.  Most problems do not need the fancy
algorithms -- or the extra overhead that supports them.  Since a
simple tree (or graph) is easy to write, and the fiddly details are
often -- but not always -- wasted overhead, it doesn't make sense to
designate a single physical structure as "the" tree (or graph)
representation.  So it stays a pattern, rather than a concrete data
structure.

> Speaking of tree as a data structure, I assume that it has a very basic
> definition:

> 1. tree consists of nodes
> 2. some nodes are containers for other nodes

Are the leaves a different type, or just nodes that happen to have
zero children at the moment?

> 3. every node has properties

What sort of properties?
A single value of a given class, plus some binary flags that are
internal to the graph implementation?
A fixed set of values that occur on every node?  (Possibly differing
between leaves and regular nodes?)
A fixed value (used for ordering) plus an arbitrary collection that
can vary by node?


> More ideas:

>   [ ] every element in a tree can be accessed by its address specificator
> as 'root/node[3]/last'

That assumes an arbitrary number of children, and that the children
are ordered.  A sensible choice, but it adds way too much overhead for
some cases.

(And of course, the same goes for the overhead of balancing, etc.)

-jJ


From guido at python.org  Wed Dec 19 17:55:02 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Dec 2012 08:55:02 -0800
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <CAFYqXL_2W3W=x52y8tCaj=huXu+ZAhAgiq4tE5ULB5m+ne53ew@mail.gmail.com>
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<CAFYqXL_2W3W=x52y8tCaj=huXu+ZAhAgiq4tE5ULB5m+ne53ew@mail.gmail.com>
Message-ID: <CAP7+vJ+N70rETnqoAEmThCdGh+-AW_6zfDQL7dgN-1XjCVhcMw@mail.gmail.com>

On Wed, Dec 19, 2012 at 6:51 AM, Giampaolo Rodol? <g.rodola at gmail.com> wrote:
> 2012/12/18 Guido van Rossum <guido at python.org>
>>
>> On Tue, Dec 18, 2012 at 2:01 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> > Event loop API
>> > --------------
>> >
>> > I would like to say that I prefer Tornado's model: for each primitive
>> > provided by Tornado, you can pass an explicit Loop instance which you
>> > instantiated manually.
>> > There is no module function or policy object hiding this mechanism:
>> > it's simple, explicit and flexible (in other words: if you want a
>> > per-thread event loop, just do it yourself using TLS :-)).
>>
>> It sounds though as if the explicit loop is optional, and still
>> defaults to some global default loop?
>>
>> Having one global loop shared by multiple threads is iffy though. Only
>> one thread should be *running* the loop, otherwise the loop can't be
>> used as a mutual exclusion device. Worse, all primitives for adding
>> and removing callbacks/handlers must be made threadsafe, and then
>> basically the entire event loop becomes full of locks, which seems
>> wrong to me.
>
> The basic idea is to have multiple threads/processes, each running its
> own IO loop.

I understand that, and the Tulip implementation supports this. However
different frameworks may have different policies (e.g. AFAIK Twisted
only supports one reactor, period, and it is not threadsafe). I don't
want to put requirements in the PEP that *require* compliant
implementations to support the loop-per-thread model. OTOH I do want
compliant implementations to decide on their own policy. I guess the
minimal requirement for a compliant implementation is that callbacks
associated with the same loop are serialized and never executed
concurrently on different threads.

> No locks are required because each IO poller instance will deal with
> its own socket-map / callbacks-queue and no resources are shared.
> In asyncore this was achieved by introducing the "map" parameter.
> Similarly to Tornado, pyftpdlib uses an "ioloop" parameter which can
> be passed to all the classes which will handle the connection (the
> handlers).

Read the description in the PEP of the event loop policy, or the
default implementation in Tulip. It discourages user code from
creating new event loops (since the framework may not support this)
but does not prevent e.g. unit tests from creating a new loop for each
test (even Twisted supports that).

> If "ioloop" is provided all the handlers will use that (...and
> register() against it, add_reader() etc..) otherwise the "global"
> ioloop instance will be used (default).
> A dynamic IO poller like this is important because in case the
> connection handlers are forced to block for some reason, you can
> switch from a concurrency model (async / non-blocking) to another
> (multi threads/process) very easily.

Did you see run_in_executor() and wrap_future() in the PEP or in the
Tulip implementation? They make it perfectly simple to run something
in another thread (and the default implementation will use this to
call getaddrinfo(), since the stdlib wrappers for it have no async
version). The two APIs are even capable of using a ProcessPoolExecutor.
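This API survives essentially unchanged in today's asyncio, the library that eventually grew out of PEP 3156 and Tulip. A minimal sketch of pushing a blocking call into the default thread pool, using a stand-in blocking function rather than a real getaddrinfo() call:

```python
import asyncio
import time

def blocking_work(x):
    """Stand-in for a blocking call such as socket.getaddrinfo()."""
    time.sleep(0.01)
    return x * 2

async def main():
    loop = asyncio.get_running_loop()
    # None selects the default ThreadPoolExecutor; a
    # ProcessPoolExecutor instance could be passed here instead.
    return await loop.run_in_executor(None, blocking_work, 21)

print(asyncio.run(main()))  # 42
```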

> See:
> http://code.google.com/p/pyftpdlib/issues/detail?id=212#c9
> http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/servers.py?spec=svn1137&r=1137

Of course, if all you want is a server that creates a new thread or
process for each connection, PEP 3156 and Tulip are overkill -- in
that case there's no reason not to use the stdlib's SocketServer
class, which has supported this for over a decade. :-)

-- 
--Guido van Rossum (python.org/~guido)


From feedbackflow at gmail.com  Wed Dec 19 17:51:05 2012
From: feedbackflow at gmail.com (Bart Thate)
Date: Wed, 19 Dec 2012 17:51:05 +0100
Subject: [Python-ideas] context aware execution
In-Reply-To: <CAPTjJmrddAk3BL2gKxd3PeSj3dmSDEGONszMkMJzuh_FyoFBZw@mail.gmail.com>
References: <CADPXuAgW5uMRhNomqbLmnXKMuhxb6WzSnFsEjJCp-MX1WZK9pw@mail.gmail.com>
	<CAPTjJmrddAk3BL2gKxd3PeSj3dmSDEGONszMkMJzuh_FyoFBZw@mail.gmail.com>
Message-ID: <CADPXuAgcPMVB+NBu7g0GvaXL4tRyrLeZxB9uXO8cAR8VDcr4mg@mail.gmail.com>

Thanks for your response Chris !

Ha! The job of the madman is to do the things that are not "advisable" and
see what that gives: to learn why something is not advisable and, if
possible, whether there are ways to achieve things that were previously
overlooked.

I already do a lot of walking of the call stack to see where a function is
called from, mostly for logging purposes: what plugin registered this
callback, etc.

Lately I also needed to log the variable name of an object, and the idea
of being able to "break out of the namespace" got me thinking.

What I am thinking of is code that can examine the context it runs in.
The object, when called, can figure out what kind of space it is living
in and discover what kind of API other objects in that space offer.

This would be a pre-function that gets called before the actual
function/method and can thus determine whether a certain request can be
fulfilled.

I bet a decorator could be made that assigns certain caller-context
references to variables in the function/method?

I use one generic parameter object into which I can stuff lots of things,
but I would rather have the function be able to see what is around ;]

Think of sending JSON over the wire, reconstructing an object from it,
and then letting the object figure out what it can and cannot do in this
external environment.

The code I use now is this:

# life/plugs/context.py
#
#

""" show context. """

## basic import

import sys

## context command

def context(event):
    # grab the caller's frame and dump its code object's attributes
    frame = sys._getframe()
    code = frame.f_back.f_code
    for i in dir(code):
        print("%s => %s" % (i, getattr(code, i)))
    del frame  # break the reference cycle frame objects can create

context.cmnd = "context"

So much to explore ;]
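The decorator speculated about above could be sketched with the same sys._getframe() trick; all names here are hypothetical, not an existing API:

```python
import sys
import functools

def with_caller(func):
    """Hypothetical decorator: injects a snapshot of the caller's
    local namespace into the wrapped function as 'caller_locals'."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        frame = sys._getframe(1)  # the frame that called the function
        try:
            return func(*args, caller_locals=dict(frame.f_locals), **kwargs)
        finally:
            del frame  # avoid keeping the caller's frame alive
    return wrapper

@with_caller
def who_called_me(caller_locals=None):
    # the function can now "see what is around" in its caller
    return sorted(caller_locals)

def demo():
    x = 1
    y = 2
    return who_called_me()

print(demo())  # ['x', 'y']
```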
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121219/aa05d79f/attachment.html>

From guido at python.org  Wed Dec 19 18:26:33 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Dec 2012 09:26:33 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAA0H+QTc24JHbhiZZAfohzSNd=haUt_xLiXZ5m40SUqAEtf_pQ@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
	<CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
	<CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>
	<CAA0H+QTc24JHbhiZZAfohzSNd=haUt_xLiXZ5m40SUqAEtf_pQ@mail.gmail.com>
Message-ID: <CAP7+vJLqAe06+kOLWHZzP2My3aanM1xR3wAQ82WwpEtF7q62Og@mail.gmail.com>

On Tue, Dec 18, 2012 at 10:41 PM, Jasper St. Pierre
<jstpierre at mecheye.net> wrote:
> On Wed, Dec 19, 2012 at 1:24 AM, Guido van Rossum <guido at python.org> wrote:
>
> ... snip ...
>
>> That looks reasonable too, although the signature may need to be adjusted.
>> (How does it cancel the remaining tasks if it wants to? Or does par() do
>> that if this callback raises?) maybe call it filter?
>
>
> The subtask completion callback can call abort() on the overall par_task,

Tasks don't have abort(), I suppose you meant cancel().

> which could cancel the rest of the unfinished tasks.
>
>     def abort_task(par_task, subtask):
>         try:
>             return subtask.result()
>         except ValueError:
>             par_task.abort()
>
> The issue with this approach is that since the par() would return values
> again, not tasks, we can't handle errors locally. Futures are also
> immutable, so we can't modify the values after they resolve. Maybe we'd have
> something like:
>
>     def fail_silently(par_task, subtask):
>         try:
>             subtask.result()
>         except ValueError as e:
>             return Future.completed(None) # an already completed future that
> has a value of None, sorry, don't remember the exact spelling
>         else:
>             return subtask
>
> which allows us:
>
>     for task in par(*tasks, subtask_completion=fail_silently):
>         # ...
>
> Which allows us both local error handling, as well as batch error handling.
> But it's very verbose from the side of the callback. Hm.

Hm indeed. Unless you can get your thoughts straight I think I'd
rather go with the wait_one() API, which can be used to build anything
else you like, but doesn't require one to be quite so clever with
callbacks. (Did I say I hate callbacks?)

-- 
--Guido van Rossum (python.org/~guido)


From solipsis at pitrou.net  Wed Dec 19 19:55:55 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 19 Dec 2012 19:55:55 +0100
Subject: [Python-ideas] PEP 3156 feedback
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<CAFYqXL_2W3W=x52y8tCaj=huXu+ZAhAgiq4tE5ULB5m+ne53ew@mail.gmail.com>
	<CAP7+vJ+N70rETnqoAEmThCdGh+-AW_6zfDQL7dgN-1XjCVhcMw@mail.gmail.com>
Message-ID: <20121219195555.577593f2@pitrou.net>

On Wed, 19 Dec 2012 08:55:02 -0800
Guido van Rossum <guido at python.org> wrote:
> On Wed, Dec 19, 2012 at 6:51 AM, Giampaolo Rodol? <g.rodola-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:
> > 2012/12/18 Guido van Rossum <guido-+ZN9ApsXKcEdnm+yROfE0A at public.gmane.org>
> >>
> >> On Tue, Dec 18, 2012 at 2:01 AM, Antoine Pitrou <solipsis-xNDA5Wrcr86sTnJN9+BGXg at public.gmane.org> wrote:
> >> > Event loop API
> >> > --------------
> >> >
> >> > I would like to say that I prefer Tornado's model: for each primitive
> >> > provided by Tornado, you can pass an explicit Loop instance which you
> >> > instantiated manually.
> >> > There is no module function or policy object hiding this mechanism:
> >> > it's simple, explicit and flexible (in other words: if you want a
> >> > per-thread event loop, just do it yourself using TLS :-)).
> >>
> >> It sounds though as if the explicit loop is optional, and still
> >> defaults to some global default loop?
> >>
> >> Having one global loop shared by multiple threads is iffy though. Only
> >> one thread should be *running* the loop, otherwise the loop can't be
> >> used as a mutual exclusion device. Worse, all primitives for adding
> >> and removing callbacks/handlers must be made threadsafe, and then
> >> basically the entire event loop becomes full of locks, which seems
> >> wrong to me.
> >
> > The basic idea is to have multiple threads/processes, each running its
> > own IO loop.
> 
> I understand that, and the Tulip implementation supports this. However
> different frameworks may have different policies (e.g. AFAIK Twisted
> only supports one reactor, period, and it is not threadsafe). I don't
> want to put requirements in the PEP that *require* compliant
> implementations to support the loop-per-thread model.

Why not let implementations raise NotImplementedError when they don't
want to support certain use cases?

> Read the description in the PEP of the event loop policy, or the
> default implementation in Tulip. It discourages user code from
> creating new event loops (since the framework may not support this)
> but does not prevent e.g. unit tests from creating a new loop for each
> test (even Twisted supports that).

Is it the plan that code written for an event loop will always work with
another one?
Will tulip offer more than the GCD of the other event loops?

Regards

Antoine.




From guido at python.org  Thu Dec 20 00:40:36 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Dec 2012 15:40:36 -0800
Subject: [Python-ideas] PEP 3156 feedback
In-Reply-To: <20121219195555.577593f2@pitrou.net>
References: <20121218110136.1f85cfae@pitrou.net>
	<CAP7+vJJcQRZrRCoSgcpnB2-2d+xbFtFqHWoLJ8voqiHxJ7+sOQ@mail.gmail.com>
	<CAFYqXL_2W3W=x52y8tCaj=huXu+ZAhAgiq4tE5ULB5m+ne53ew@mail.gmail.com>
	<CAP7+vJ+N70rETnqoAEmThCdGh+-AW_6zfDQL7dgN-1XjCVhcMw@mail.gmail.com>
	<20121219195555.577593f2@pitrou.net>
Message-ID: <CAP7+vJKaZsc4NE6=eCdB4O58u5y0rggnK_VgEZuuL5nOmO6obA@mail.gmail.com>

On Wed, Dec 19, 2012 at 10:55 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> Why not let implementations raise NotImplementedError when they don't
> want to support certain use cases?

That's always a last resort, but the problem is that an app or library
can't be sure that everything will work, and the failure might be
subtle and late.

That said, my remark about the loop needing to be wholly threadsafe
was misguided. I think there are two reasonable policies with regards
to thread that any reasonable implementation could follow:

1. There's only one loop, it runs in a dedicated thread, and other
threads can only use call_soon_threadsafe().

2. There's (potentially) a loop per thread, and these are effectively
independent. (TBD: How would these pass work or results between one
another? Probably by calling call_soon_threadsafe() back and forth.)

The default implementation actually takes a halfway position: it
supports (2), but you must manually call init_event_loop() in each
thread except for the main thread, and you must call run() in each
thread, including the main thread. The requirement to call
init_event_loop() is to prevent code running in some random thread
trying to schedule a callback, which would never run because the thread
isn't calling run().
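Policy 1 can be sketched with the API that eventually shipped as asyncio: a foreign thread touches the loop only through call_soon_threadsafe(). The helper names below are mine:

```python
import asyncio
import threading

def run():
    loop = asyncio.new_event_loop()
    results = []

    def other_thread():
        # call_soon_threadsafe() is the one scheduling primitive
        # that may be used from outside the loop's own thread
        loop.call_soon_threadsafe(results.append, 'hello')
        loop.call_soon_threadsafe(loop.stop)

    t = threading.Thread(target=other_thread)
    loop.call_soon(t.start)  # start the thread once the loop is running
    loop.run_forever()       # runs until loop.stop() is processed
    t.join()
    loop.close()
    return results

print(run())  # ['hello']
```

Because both callbacks are posted from the same thread, they run in FIFO order: the append happens before the stop.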

When we get further along we may have a compliance test suite,
separate from the unittests (I am working on unittests but I'm aware
they aren't at all thorough yet).

> Is it the plan that code written for an event loop will always work with
> another one?

The plan is to make it easy to write code that will work with all (or
most) event loops, without making it impossible to write code that
depends on a specific event loop implementation. This is Python's
general attitude about platform-specific APIs.

> Will tulip offer more than the GCD of the other event loops?

People writing PEP 3156 compliant implementations on top of some other
event loop, whether it's Twisted or libuv, may have to emulate some
functionality, and there will also be some functionality that their
underlying loop supports that PEP 3156 doesn't. The goal is to offer a
wide enough range of features that it's possible to write many useful
types of apps without resorting to platform-specific APIs, and to make
these fast enough.

But if an app knows it will only be used with a certain loop
implementation it is free to use extra APIs that only that loop
offers. There's still a benefit in that situation: the app may be tied
to a platform, but it may still want to use some 3rd party libraries
that also require event loop integration, and by conforming to PEP
3156 the platform's loop implementation can ensure that such libraries
actually work and interact with the rest of the app in a reasonable
manner. (In particular, they should all use the same Future and Task
classes.)

-- 
--Guido van Rossum (python.org/~guido)


From jonathan at slenders.be  Thu Dec 20 23:52:27 2012
From: jonathan at slenders.be (Jonathan Slenders)
Date: Thu, 20 Dec 2012 23:52:27 +0100
Subject: [Python-ideas] An async facade?
Message-ID: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>

Hi All,

This week I finished some new Python syntax on a PyPy fork. It was an
experiment I was working on this week. We really needed a cleaner way of
writing asynchronous code. So, instead of using the yield keyword and an
@async decorator, we implemented the 'await' keyword, similar to C#.

So, because I just now subscribed to python-ideas, I cannot reply
immediately to the following thread:

An async facade? (was Re: [Python-Dev] Socket timeout and completion based
sockets)


Anyway, like C# does, I implemented the await keyword for Python, and I
should say that I'm really confident of the usability of the result.
Personally, I think this is a very clean solution for Twisted's
@defer.inlineCallbacks, Tornado's @gen.engine, and similar functions in
other async frameworks. We use it right now in a commercial web
environment, where third party users should be able to write asynchronous
code as easily as possible in a web based IDE.

https://bitbucket.org/jonathanslenders/pypy

Two interpreter hooks were added (both accept a callable as a parameter):

>>>> sys.setawaithandler(wrapper)
>>>> sys.setawaitresultwrapper(result_wrapper)


The first sets the I/O scheduler: a function for wrapping functions which
contain 'await' instead of 'yield'. This wrapper function will receive a
generator as input. So 'await' still acts like 'yield' for the
interpreter, but the result is automatically wrapped by this function if
the await keyword was found.

The second function wraps the return result of asynchronous functions.
So, unlike normal generators with 'yield' keywords, where 'await' has
been used we can still return a result. But this result will be wrapped
by this function, so that the scheduler will be able to recognize the
returned result.

This:

@defer.inlineCallbacks
def async_function(deferred_param):
    a = yield deferred_param
    b = yield some_call(a)
    yield defer.returnValue(b)


will now become:

def async_function(deferred_param):
    a = await deferred_param
    b = await some_call(a)
    return b


So, while still being explicit, it requires minimal syntax, and
allows distinguishing between when to 'wait' for an asynchronous task, and
when to pass the Future object around.
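For comparison, what the implicit @async decorator has to do under the hood can be sketched with a toy Future class and a trampoline that drives the generator; the names are illustrative and not Twisted's or Tornado's actual API:

```python
class Future:
    """Toy future: holds a result and fires callbacks once it is set."""
    def __init__(self):
        self._result = None
        self._done = False
        self._callbacks = []

    def set_result(self, value):
        self._result = value
        self._done = True
        for cb in self._callbacks:
            cb(self)

    def add_done_callback(self, cb):
        if self._done:
            cb(self)
        else:
            self._callbacks.append(cb)

def async_(gen_func):
    """Explicit version of what 'await' would make implicit: drive a
    generator that yields Futures, resuming it with each result, and
    deliver the final return value via a Future."""
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        outcome = Future()

        def step(value=None):
            try:
                fut = gen.send(value)
            except StopIteration as stop:
                # 'return x' inside a generator surfaces here (PEP 380)
                outcome.set_result(getattr(stop, 'value', None))
                return
            fut.add_done_callback(lambda f: step(f._result))

        step()
        return outcome
    return wrapper

@async_
def async_function(deferred_param):
    a = yield deferred_param   # 'await deferred_param' in the proposal
    return a * 2               # plain return, as with 'await'

f = Future()
out = async_function(f)
f.set_result(21)
print(out._result)  # 42
```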

I really have no idea whether this has been proposed before, I can only say
that we are using it and it works pretty well.

Cheers,
Jonathan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121220/fc67b368/attachment.html>

From jstpierre at mecheye.net  Fri Dec 21 00:17:12 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Thu, 20 Dec 2012 18:17:12 -0500
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
Message-ID: <CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>

Note that the "return b" is already being handled through the
"StopIteration" proposal.

I'm not a fan of the new syntax because it means that removing all the
"await" keywords from a method changes the return value. Requiring the
decorator means that this can cleanly be handled in all cases, even if the
decorator implementation is a bit ugly.

This means that all we have left is the "await" vs. "yield" vs. "yield
from" discussion. I don't think the new syntax is valuable enough to
warrant a new keyword.



On Thu, Dec 20, 2012 at 5:52 PM, Jonathan Slenders <jonathan at slenders.be>wrote:

> Hi All,
>
> This week I finished some Python syntax on a Pypy fork. It was an
> experiment I was working on this week. We really needed a cleaner way of
> writing asynchronous code. So, instead of using the yield keyword and an
> @async decorator, we implemented the 'await' keyword, similar to c#.
>
> So, because I just now subscribed to python-ideas, I cannot reply
> immediately to the following thread:
>
> An async facade? (was Re: [Python-Dev] Socket timeout and completion based
> sockets)
>
>
> Anyway, like c# does, I implemented the await keyword for Python, and
> should say that I'm really confident of the usability of the result. Personally,
> I think this is a very clean solution for Twisted's @defer.inlineCalbacks,
> Tornado's @gen.engine, and similar functions in other async frameworks. We
> use it right now in a commercial web environment, where third party users
> should have to be able to write asynchronous code as easy as possible in a
> web based IDE.
>
> https://bitbucket.org/jonathanslenders/pypy
>
> Two interpreter hooks were added: (both accept a callable as parameter.)
>
> >>>> sys.setawaithandler(wrapper)
> >>>> sys.setawaitresultwrapper(result_wrapper)
>
>
> The first will set the I/O scheduler  a functions for wrapping others
> functions which contain 'await' instead of 'yield'. This wrapper function
> will receive a generator as input. So, 'await' still acts like 'yield' for
> the interpreter, but the result is automatically wrapped by this function,
> if the await keyword was found.
>
> The second function will wrap the return result of asynchronous functions.
> So, unlike normal generators with 'yield' keywords, where 'await' has been
> used, we still can return a result. But this result will be wrapped by this
> function, so that the generator in the scheduler  will be able
> te recognize the returned result.
>
> This:
>
> @defer.inlineCallbacks
> def async_function(deferred_param):
>     a = yield deferred_param
>     b = yield some_call(a)
>     yield defer.returnValue(b)
>
>
> will now become:
>
> def async_function(deferred_param):
>     a = await deferred_param
>     b = await some_call(a)
>     return b
>
>
> So, while still being explicit, it requires minimal syntax, and
> allows distinguishing between when to 'wait' for an asynchronous task, and
> when to pass the Future object around.
>
> I really have no idea whether this has been proposed before, I can only
> say that we are using it and it works pretty well.
>
> Cheers,
> Jonathan
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>


-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121220/3160b2e3/attachment.html>

From tjreedy at udel.edu  Fri Dec 21 00:21:20 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 20 Dec 2012 18:21:20 -0500
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
Message-ID: <50D39D70.5010802@udel.edu>

On 12/20/2012 5:52 PM, Jonathan Slenders wrote:

Please post plain text rather than html (same for all python.org lists). 
Html posts often come out a bit weird.

 > Anyway, like c# does, I implemented the await keyword for Python,

On my reader, this is normal size text.

 > Personally, I think this is a very clean solution for Twisted's

While this was half sized micro text. (It is normal here because by 
default Thunderbird converts to plain text for newsgroups and I am 
posting via news.gmane.org.) The alternation between full and 
half-height characters makes your post hard to read.

-- 
Terry Jan Reedy



From jonathan at slenders.be  Fri Dec 21 00:34:55 2012
From: jonathan at slenders.be (Jonathan Slenders)
Date: Fri, 21 Dec 2012 00:34:55 +0100
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
	<CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
Message-ID: <CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>

Just as removing "yield" changes the return value of a function. Nothing
different.

For me +1 for the "StopIteration" proposal. That's certainly better, and
more generic than what I said.

So, the difference is still that the "await" proposal makes the @async
decorator implicit. I'm still in favor of this because, in asynchronous
code, you can have very many functions with this decorator. And if someone
forgets it, getting a generator object instead of a Future is quite
different in semantics.

P.S. excuse me, Terry.



2012/12/21 Jasper St. Pierre <jstpierre at mecheye.net>

> Note that the "return b" is already being handled through the
> "StopIteration" proposal.
>
> I'm not a fan of the new syntax because it means that removing all the
> "await" keywords from a method changes the return value. Requiring the
> decorator means that this can cleanly be handled in all cases, even if the
> decorator implementation is a bit ugly.
>
> This means that all we have left is the "await" vs. "yield" vs. "yield
> from" discussion. I don't think the new syntax is valuable enough to warrant
> a new keyword.
>
>
>
> On Thu, Dec 20, 2012 at 5:52 PM, Jonathan Slenders <jonathan at slenders.be>wrote:
>
>> Hi All,
>>
>> This week I finished some Python syntax on a Pypy fork. It was an
>> experiment I was working on this week. We really needed a cleaner way of
>> writing asynchronous code. So, instead of using the yield keyword and an
>> @async decorator, we implemented the 'await' keyword, similar to c#.
>>
>> So, because I just now subscribed to python-ideas, I cannot reply
>> immediately to the following thread:
>>
>> An async facade? (was Re: [Python-Dev] Socket timeout and completion
>> based sockets)
>>
>>
>> Anyway, like c# does, I implemented the await keyword for Python, and
>> should say that I'm really confident in the usability of the result. Personally,
>> I think this is a very clean solution for Twisted's @defer.inlineCallbacks,
>> Tornado's @gen.engine, and similar functions in other async frameworks. We
>> use it right now in a commercial web environment, where third party users
>> should be able to write asynchronous code as easily as possible in a
>> web based IDE.
>>
>> https://bitbucket.org/jonathanslenders/pypy
>>
>> Two interpreter hooks were added: (both accept a callable as parameter.)
>>
>> >>>> sys.setawaithandler(wrapper)
>> >>>> sys.setawaitresultwrapper(result_wrapper)
>>
>>
>> The first will set, in the I/O scheduler, a function for wrapping
>> functions which contain 'await' instead of 'yield'. This wrapper function
>> will receive a generator as input. So, 'await' still acts like 'yield' for
>> the interpreter, but the result is automatically wrapped by this function
>> if the await keyword was found.
>>
>> The second function will wrap the return result
>> of asynchronous functions. So, unlike normal generators with 'yield'
>> keywords, where 'await' has been used, we still can return a result. But
>> this result will be wrapped by this function, so that the generator in
>> the scheduler will be able to recognize the returned result.
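The two hooks described above can be emulated, roughly, with an ordinary
decorator-style driver. The sketch below is purely illustrative and is not
the actual PyPy patch: Future, async_call, and the use of 'yield' as a
stand-in for the proposed 'await' are all assumptions made for the example.

```python
class Future:
    """A trivial, already-resolved future, for demonstration only."""
    def __init__(self, result):
        self._result = result

    def result(self):
        return self._result


def async_call(gen):
    """Drive a generator that yields Futures; plays the role of the
    wrapper a sys.setawaithandler()-style hook would install."""
    value = None
    try:
        while True:
            fut = gen.send(value)   # resume at the next 'await'
            value = fut.result()    # a real scheduler would suspend here
    except StopIteration as e:
        # 'return b' inside the generator surfaces here (PEP 380);
        # this is where the result wrapper would be applied.
        return getattr(e, 'value', None)


def async_function(param):
    a = yield Future(param)         # stands in for: a = await param
    b = yield Future(a * 2)         # stands in for: b = await some_call(a)
    return b


print(async_call(async_function(21)))  # 42
```

Note that 'return b' inside a generator requires Python 3.3 or later.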
>>
>> This:
>>
>> @defer.inlineCallbacks
>> def async_function(deferred_param):
>>     a = yield deferred_param
>>     b = yield some_call(a)
>>     yield defer.returnValue(b)
>>
>>
>> will now become:
>>
>> def async_function(deferred_param):
>>     a = await deferred_param
>>     b = await some_call(a)
>>     return b
>>
>>
>> So, while still being explicit, it requires minimal syntax, and
>> allows distinguishing between when to 'wait' for an asynchronous task, and
>> when to pass the Future object around.
>>
>> I really have no idea whether this has been proposed before, I can only
>> say that we are using it and it works pretty well.
>>
>> Cheers,
>> Jonathan
>>
>>
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>>
>
>
> --
>   Jasper
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/aa44c027/attachment.html>

From guido at python.org  Fri Dec 21 00:46:03 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Dec 2012 15:46:03 -0800
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
Message-ID: <CAP7+vJ+xkJ6Am0tne2e-fccXTR-aVeod3OdyQHaGVuygRZwU3A@mail.gmail.com>

Have you read PEP 3156 and PEP 380?

Instead of await, Python 3.3 has yield from, with the same semantics.
This is somewhat more verbose, but has the advantage that it doesn't
introduce a new keyword, and it's already in Python 3.3, so you can
start using it now -- no fork of the language required.
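For comparison, here is a minimal sketch of the same async_function written
against Python 3.3's 'yield from'. The trivial_future() and run() helpers
are stand-ins invented for this example, not PEP 3156 API.

```python
def trivial_future(value):
    """A minimal coroutine standing in for an awaitable that resolves
    to value immediately."""
    return value
    yield  # never reached; makes this function a generator


def some_call(a):
    # A coroutine that 'awaits' another coroutine via yield from.
    return (yield from trivial_future(a * 2))


def async_function(param):
    a = yield from trivial_future(param)
    b = yield from some_call(a)
    return b


def run(coro):
    """Minimal driver: exhaust the generator and return its value."""
    try:
        while True:
            next(coro)
    except StopIteration as e:
        return e.value


print(run(async_function(21)))  # 42
```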

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Fri Dec 21 00:49:44 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Dec 2012 15:49:44 -0800
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
	<CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
	<CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>
Message-ID: <CAP7+vJ+EPq3hYyfYMfBRAOV957jTk3-ZMzOTEsMeyCYM=OTd8Q@mail.gmail.com>

On Thu, Dec 20, 2012 at 3:34 PM, Jonathan Slenders <jonathan at slenders.be> wrote:
> So, the difference is still that the "await" proposal makes the @async
> decorator implicit. I'm still in favor of this because in asynchronous code,
> you can have really many functions with this decorator. And if someone
> forgets about that, getting a generator object instead of a Future is quite
> different in semantics.

Carefully read PEP 3156, and the tulip implementation:
http://code.google.com/p/tulip/source/browse/tulip/tasks.py . The
@coroutine decorator is technically redundant when you use yield from.

-- 
--Guido van Rossum (python.org/~guido)


From solipsis at pitrou.net  Fri Dec 21 11:51:18 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 21 Dec 2012 10:51:18 +0000 (UTC)
Subject: [Python-ideas] Tree as a data structure (Was: Graph class)
References: <CAPkN8xK_g0=TskvGjcOpbtgnffXZVC2ce8qrg=D7dGmDLfUmLQ@mail.gmail.com>
	<CA+OGgf64Az2g3vq+3UY7WRA1y_Z0PWiSy==Co4-zfnCey6O+Bg@mail.gmail.com>
Message-ID: <loom.20121221T114654-559@post.gmane.org>

Jim Jewett <jimjjewett at ...> writes:
> 
> On 12/19/12, anatoly techtonik <techtonik at ...> wrote:
> > On Sun, Dec 16, 2012 at 6:41 PM, Guido van Rossum <guido at ...> wrote:
> 
> >> I think of graphs and trees as patterns, not data structures.
> 
> > In my world strings, ints and lists are 1D data types, and tree can be a
> > very important 2D data structure.
> 
> Yes; the catch is that the details of that data structure will differ
> depending on the problem.  Most problems do not need the fancy
> algorithms -- or the extra overhead that supports them.  Since a
> simple tree (or graph) is easy to write, and the fiddly details are
> often -- but not always -- wasted overhead, it doesn't make sense to
> designate a single physical structure as "the" tree (or graph)
> representation.
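Jim's point that a simple tree is easy to write can be made concrete: a
plain dict of child lists covers many everyday cases. This sketch is
illustrative only, not a proposed stdlib structure.

```python
from collections import defaultdict

# Adjacency representation: node -> list of children.
tree = defaultdict(list)
tree['root'] += ['left', 'right']
tree['left'] += ['leaf']


def depth_first(node, tree):
    """Pre-order traversal over the child-list representation."""
    yield node
    for child in tree[node]:
        yield from depth_first(child, tree)


print(list(depth_first('root', tree)))  # ['root', 'left', 'leaf', 'right']
```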

Do you care about the overhead of an OrderedDict? As long as you are not 
manipulating a huge amount of data, a generic tree structure such as provided
by e.g. the networkx library is perfectly fine.

And if you want to reimplement a more optimized structure, sure, that's fine.
But that's not an argument against a generic data structure that would be
sufficient for 99.9% of all use cases.

Regards

Antoine.




From jstpierre at mecheye.net  Fri Dec 21 12:38:53 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Fri, 21 Dec 2012 06:38:53 -0500
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJLqAe06+kOLWHZzP2My3aanM1xR3wAQ82WwpEtF7q62Og@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
	<CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
	<CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>
	<CAA0H+QTc24JHbhiZZAfohzSNd=haUt_xLiXZ5m40SUqAEtf_pQ@mail.gmail.com>
	<CAP7+vJLqAe06+kOLWHZzP2My3aanM1xR3wAQ82WwpEtF7q62Og@mail.gmail.com>
Message-ID: <CAA0H+QTUimdD9X7qskJKjTU3=rGuxdgVqB7cKmxAPm3w-eGVuQ@mail.gmail.com>

I read over the wait_one() proposal again, and I still don't understand it,
so it would need more explanation for me.

But I don't see the point of avoiding callbacks. In this case, we have two
or more in-flight requests that can be finished at any time. This does not
have a synchronous code equivalent -- callbacks are pretty much the only
mechanism we can use to be notified when something is done.



On Wed, Dec 19, 2012 at 12:26 PM, Guido van Rossum <guido at python.org> wrote:

> On Tue, Dec 18, 2012 at 10:41 PM, Jasper St. Pierre
> <jstpierre at mecheye.net> wrote:
> > On Wed, Dec 19, 2012 at 1:24 AM, Guido van Rossum <guido at python.org>
> wrote:
> >
> > ... snip ...
> >
> >> That looks reasonable too, although the signature may need to be
> adjusted.
> >> (How does it cancel the remaining tasks if it wants to? Or does par() do
> >> that if this callback raises?) maybe call it filter?
> >
> >
> > The subtask completion callback can call abort() on the overall par_task,
>
> Tasks don't have abort(), I suppose you meant cancel().
>
> > which could cancel the rest of the unfinished tasks.
> >
> >     def abort_task(par_task, subtask):
> >         try:
> >             return subtask.result()
> >         except ValueError:
> >             par_task.abort()
> >
> > The issue with this approach is that since par() would return values
> > again, not tasks, we can't handle errors locally. Futures are also
> > immutable, so we can't modify the values after they resolve. Maybe we'd
> have
> > something like:
> >
> >     def fail_silently(par_task, subtask):
> >         try:
> >             subtask.result()
> >         except ValueError as e:
> >             return Future.completed(None) # an already completed future
> that
> > has a value of None, sorry, don't remember the exact spelling
> >         else:
> >             return subtask
> >
> > which allows us:
> >
> >     for task in par(*tasks, subtask_completion=fail_silently):
> >         # ...
> >
> > Which allows us both local error handling, as well as batch error
> handling.
> > But it's very verbose from the side of the callback. Hm.
>
> Hm indeed. Unless you can get your thoughts straight I think I'd
> rather go with the wait_one() API, which can be used to build anything
> else you like, but doesn't require one to be quite so clever with
> callbacks. (Did I say I hate callbacks?)
>
> --
> --Guido van Rossum (python.org/~guido)
>



-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/baf5d6ba/attachment.html>

From jonathan at slenders.be  Fri Dec 21 12:47:53 2012
From: jonathan at slenders.be (Jonathan Slenders)
Date: Fri, 21 Dec 2012 12:47:53 +0100
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAP7+vJ+EPq3hYyfYMfBRAOV957jTk3-ZMzOTEsMeyCYM=OTd8Q@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
	<CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
	<CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>
	<CAP7+vJ+EPq3hYyfYMfBRAOV957jTk3-ZMzOTEsMeyCYM=OTd8Q@mail.gmail.com>
Message-ID: <CAKfyG3xMPkNLY1Y9Q3XsLo7=2aAj1wtz2q-PzgcW=92eSo6GMA@mail.gmail.com>

Thank you, Guido! I didn't know about this PEP, but it looks interesting.
I'll try to find some spare time this weekend to read through the PEP,
maybe giving some feedback.

Cheers!



2012/12/21 Guido van Rossum <guido at python.org>

> On Thu, Dec 20, 2012 at 3:34 PM, Jonathan Slenders <jonathan at slenders.be>
> wrote:
> > So, the difference is still that the "await" proposal makes the @async
> > decorator implicit. I'm still in favor of this because in asynchronous
> code,
> > you can have really many functions with this decorator. And if someone
> > forgets about that, getting a generator object instead of a Future is
> quite
> > different in semantics.
>
> Carefully read PEP 3156, and the tulip implementation:
> http://code.google.com/p/tulip/source/browse/tulip/tasks.py . The
> @coroutine decorator is technically redundant when you use yield from.
>
> --
> --Guido van Rossum (python.org/~guido)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/c41335ca/attachment.html>

From geertj at gmail.com  Fri Dec 21 15:31:33 2012
From: geertj at gmail.com (Geert Jansen)
Date: Fri, 21 Dec 2012 15:31:33 +0100
Subject: [Python-ideas] Tulip patches
Message-ID: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>

Hi,

[if this is not the right forum to post patches for tulip, please
redirect me to the correct one. There doesn't appear to be a mailing
list for tulip at the moment. And this list is where most/all of the
discussion is taking place.]

Please find attached 4 patches:

0001-run-fd-callbacks.patch

This patch will run callbacks for readers and writers in the same loop
iteration as when the fd got ready. Copying from my previous email,
this is to support the following idiom:

    # handle_read() sets the "ready" flag
    loop.add_reader(fd, handle_read)
    while not ready:
        loop.run_once()

The patch currently dispatches callbacks twice in each iteration, once
before blocking and once after. I tried to dispatch only once after
blocking, but this made the SSL transport test hang. The reason is
that the create_transport task is scheduled with call_soon(), and only
when the task first runs, a file descriptor is added. So unless you
dispatch before blocking, this task will never get started.

0002-call-every-iteration.patch

This adds a call_every_iteration() method to the event loop. This
callback runs *before* blocking.

0003-fix-typo.patch
0004-remove-wrong-comments.patch

Two trivial patches.

Regards,
Geert
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-run-fd-callbacks.patch
Type: application/octet-stream
Size: 2088 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/69b01974/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-call-every-iteration.patch
Type: application/octet-stream
Size: 2699 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/69b01974/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-fix-typo.patch
Type: application/octet-stream
Size: 640 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/69b01974/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-remove-wrong-comments.patch
Type: application/octet-stream
Size: 856 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/69b01974/attachment-0003.obj>

From guido at python.org  Fri Dec 21 16:45:46 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 07:45:46 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAA0H+QTUimdD9X7qskJKjTU3=rGuxdgVqB7cKmxAPm3w-eGVuQ@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
	<CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
	<CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>
	<CAA0H+QTc24JHbhiZZAfohzSNd=haUt_xLiXZ5m40SUqAEtf_pQ@mail.gmail.com>
	<CAP7+vJLqAe06+kOLWHZzP2My3aanM1xR3wAQ82WwpEtF7q62Og@mail.gmail.com>
	<CAA0H+QTUimdD9X7qskJKjTU3=rGuxdgVqB7cKmxAPm3w-eGVuQ@mail.gmail.com>
Message-ID: <CAP7+vJL0_-ss7bHZPBtpb2ikY8pq48Qju8Ckm+qMSBOiv_m=9g@mail.gmail.com>

On Fri, Dec 21, 2012 at 3:38 AM, Jasper St. Pierre <jstpierre at mecheye.net>
wrote:
> I read over the wait_one() proposal again, and I still don't understand
it,
> so it would need more explanation to me.
>
> But I don't see the point of avoiding callbacks. In this case, we have two
> or more in-flight requests that can be finished at any time. This does not
> have a synchronous code equivalent -- callbacks are pretty much the only
> mechanism we can use to be notified when something is done.

Perhaps you haven't quite gotten used to coroutines? There are callbacks
underneath making it all work, but the user code rarely sees those. Let's
start with the following *synchronous* code as an example.

def indexer(urls):
    # urls is a set of strings
    done = {}  # dict mapping url to (data, links)
    while urls:
        url = urls.pop()
        data = urlfetch(url)
        links = parse(data)
        done[url] = (data, links)
        for link in links:
            if link not in urls and link not in done:
                urls.add(link)
    return done

(Let's hope this is indexing a small static site and not the entire
internet. :-)

Now suppose we make urlfetch() a coroutine and we want to run all the
urlfetches in parallel. The toplevel index() function becomes a coroutine
too. We use the convention that coroutines' names end in _async, to remind
us that they return Futures. The phrase "x = yield from foo_async()" is
equivalent to the synchronous call "x = foo()".

@coroutine
def indexer_async(urls):
    done = {}
    # A dict mapping tasks to urls:
    running = {Task(urlfetch_async(url)): url for url in urls}
    while running:
        # The yield from will return a Future
        tsk = *yield from* wait_one_async(running)
        url = running.pop(tsk)
        data = tsk.result()  # May raise
        links = parse(data)
        done[url] = (data, links)
        for link in links:
            if link not in urls and link not in done:
                urls.add(link)
                tsk = Task(urlfetch_async(link))
                running[tsk] = link
    return done

This creates len(urls) initial tasks to fetch the urls, and creates new
tasks as new links are parsed. The assumption here is that the only blocking
I/O is done in the urlfetch_async() task. The indexer blocks at the *yield
from* in the marked line, at which point any or all of the urlfetch tasks
get to run some, and once one of them completes, wait_one_async() returns
that task. (A task is a Future that wraps a coroutine, by the way.
wait_one_async() works with Futures too.) We then inspect the completed
task with .result(), which gives us the data, which we parse as usual. The
data structures are a little more elaborate because we have to keep track
of the mapping from task to url. We add new tasks to the running dict as
soon as we have parsed their links, so they can all get started.

Note that in PEP 3156, I don't use the _async convention, but everything in
this example will work there once wait_one() is added.

Also note that the trick is that wait_one_async() must return a Future
whose result is another Future. The first Future is used (and thrown away)
by *yield from*; that Future's result is one of the original Futures
representing a completed task.

I hope this is clearer. I'm not saying this is the best or only way of
writing an async indexer using yield from (and I left out error handling)
but hopefully it is an illustrative example.
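The mechanics described above can be sketched with concurrent.futures,
which tulip's futures resemble. This wait_one_async() is an illustrative
reimplementation, not the PEP 3156 API, and the first-completion guard
below is a sketch rather than a thread-safe implementation.

```python
from concurrent.futures import Future


def wait_one_async(futures):
    """Return a Future whose result is whichever input future
    completes first."""
    waiter = Future()

    def on_done(completed):
        # Only the first completion wins; later ones are ignored.
        if not waiter.done():
            waiter.set_result(completed)

    for f in futures:
        f.add_done_callback(on_done)
    return waiter


# Demo: resolve one of three futures and observe which one comes back.
futs = [Future() for _ in range(3)]
waiter = wait_one_async(futs)
futs[1].set_result('data')
first = waiter.result(timeout=1)
print(first is futs[1], first.result())  # True data
```

In coroutine code the caller would write `tsk = yield from waiter` instead
of calling waiter.result() directly.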

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/b0b61eb1/attachment.html>

From jstpierre at mecheye.net  Fri Dec 21 16:57:06 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Fri, 21 Dec 2012 10:57:06 -0500
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAP7+vJL0_-ss7bHZPBtpb2ikY8pq48Qju8Ckm+qMSBOiv_m=9g@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
	<CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
	<CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>
	<CAA0H+QTc24JHbhiZZAfohzSNd=haUt_xLiXZ5m40SUqAEtf_pQ@mail.gmail.com>
	<CAP7+vJLqAe06+kOLWHZzP2My3aanM1xR3wAQ82WwpEtF7q62Og@mail.gmail.com>
	<CAA0H+QTUimdD9X7qskJKjTU3=rGuxdgVqB7cKmxAPm3w-eGVuQ@mail.gmail.com>
	<CAP7+vJL0_-ss7bHZPBtpb2ikY8pq48Qju8Ckm+qMSBOiv_m=9g@mail.gmail.com>
Message-ID: <CAA0H+QT2Oy60hLA0qPOMO7TemKEGCg+DDr3wt=m2P3o2Wraphg@mail.gmail.com>

On Fri, Dec 21, 2012 at 10:45 AM, Guido van Rossum <guido at python.org> wrote:
... snip ... (gmail messed up parsing this, apparently)

Aha, that cleared it up, thanks. wait_one_async() takes an iterable of
tasks, and returns a Future that will fire when a Future completes,
containing that Future.

I can't think of anything *wrong* with that, except that if anything, 1) it
feels like a bit of an abuse to use Futures this way, 2) it feels a bit
low-level.

-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/bf1135d7/attachment.html>

From guido at python.org  Fri Dec 21 16:57:04 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 07:57:04 -0800
Subject: [Python-ideas] Tulip patches
In-Reply-To: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
References: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
Message-ID: <CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>

On Fri, Dec 21, 2012 at 6:31 AM, Geert Jansen <geertj at gmail.com> wrote:
>
> Hi,
>
> [if this is not the right forum to post patches for tulip, please
> redirect me to the correct one. There doesn't appear to be a mailing
> list for tulip at the moment. And this list is where most/all of the
> discussion is taking place.]

This is a fine place, but you would make my life even easier by
uploading the patches to codereview.appspot.com, so I can review them
and send comments in-line.

I've given you checkin permissions. Please send a contributor form to
the PSF (http://www.python.org/psf/contrib/contrib-form/).

> Please find attached 4 patches:
>
> 0001-run-fd-callbacks.patch
>
> This patch will run callbacks for readers and writers in the same loop
> iteration as when the fd got ready. Copying from my previous email,
> this is to support the following idiom:
>
>     # handle_read() sets the "ready" flag
>     loop.add_reader(fd, handle_read)
>     while not ready:
>         loop.run_once()
>
> The patch currently dispatches callbacks twice in each iteration, once
> before blocking and once after. I tried to dispatch only once after
> blocking, but this made the SSL transport test hang. The reason is
> that the create_transport task is scheduled with call_soon(), and only
> when the task first runs, a file descriptor is added. So unless you
> dispatch before blocking, this task will never get started.

Interesting. Go ahead and submit.

> 0002-call-every-iteration.patch
>
> This adds a call_every_iteration() method to the event loop. This
> callback runs *before* blocking.

There's one odd thing here: you remove cancelled everytime handlers
*after* already scheduling them. It would seem to make more sense to
schedule them first. Also, a faster way to do this would be

    self._everytime = [handler for handler in self._everytime if not handler.cancelled]

(Even if you iterate from the back, remove() is still O(N), so if half
the handlers are to be removed, your original code would be O(N**2).)
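In isolation, the suggested one-pass filter looks like this; Handler is a
stand-in for tulip's handler objects, invented for the example.

```python
class Handler:
    def __init__(self, name, cancelled=False):
        self.name = name
        self.cancelled = cancelled


everytime = [Handler('a'), Handler('b', cancelled=True), Handler('c')]

# One O(N) pass keeps only live handlers, instead of calling
# list.remove() per cancelled handler (O(N) each, O(N**2) total).
everytime = [h for h in everytime if not h.cancelled]

print([h.name for h in everytime])  # ['a', 'c']
```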

> 0003-fix-typo.patch
> 0004-remove-wrong-comments.patch
>
> Two trivial patches.

Go ahead!

PS. If you want to set up a mailing list or other cleverness I can set
you up as a project admin. (I currently have all patches mailed to me
but we may want to set up a separate list for that.)

--
--Guido van Rossum (python.org/~guido)


From guido at python.org  Fri Dec 21 17:03:21 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 08:03:21 -0800
Subject: [Python-ideas] async: feedback on EventLoop API
In-Reply-To: <CAA0H+QT2Oy60hLA0qPOMO7TemKEGCg+DDr3wt=m2P3o2Wraphg@mail.gmail.com>
References: <3B90AC3A-C73B-4BD1-9BE8-9ECF21F0D243@umbrellacode.com>
	<CAP7+vJJrn2arYjYGgkALt8wGJPMCqw3oQaMS3OqCdRSjCjOg+w@mail.gmail.com>
	<CAA0H+QQpfV6K4jYEZ5N13cCJyv+RJRgspVA_T_-N8re0hgfwxw@mail.gmail.com>
	<CAP7+vJLSpaw5bL9pMXCTP0fEhe93NkaK1Y8bCH5_THQ-RNqcyw@mail.gmail.com>
	<CAA0H+QQF-WPEviYhAd9iWNdCmNh_OL0q9iytUj5mCEgGz1hDJw@mail.gmail.com>
	<CAP7+vJJ2bkQEGPXp6TW-DCKee-r67_fD2=42W6d9xWgtKE5P3g@mail.gmail.com>
	<CAA0H+QTc24JHbhiZZAfohzSNd=haUt_xLiXZ5m40SUqAEtf_pQ@mail.gmail.com>
	<CAP7+vJLqAe06+kOLWHZzP2My3aanM1xR3wAQ82WwpEtF7q62Og@mail.gmail.com>
	<CAA0H+QTUimdD9X7qskJKjTU3=rGuxdgVqB7cKmxAPm3w-eGVuQ@mail.gmail.com>
	<CAP7+vJL0_-ss7bHZPBtpb2ikY8pq48Qju8Ckm+qMSBOiv_m=9g@mail.gmail.com>
	<CAA0H+QT2Oy60hLA0qPOMO7TemKEGCg+DDr3wt=m2P3o2Wraphg@mail.gmail.com>
Message-ID: <CAP7+vJLnB5Bunn8Rk0XbGyNMbowGbGNbJa0ov4sdx3ejyi0B+A@mail.gmail.com>

On Fri, Dec 21, 2012 at 7:57 AM, Jasper St. Pierre
<jstpierre at mecheye.net> wrote:
> Aha, that cleared it up, thanks. wait_one_async() takes an iterable of
> tasks, and returns a Future that will fire when a Future completes,
> containing that Future.
>
> I can't think of anything *wrong* with that, except that if anything, 1) it
> feels like a bit of an abuse to use Futures this way, 2) it feels a bit
> low-level.

But not more low-level than callbacks. Once you're used to coroutines
and Futures, you don't want things that use callbacks. Fortunately
there's an easy way to turn a callback into a Future:

f = Future()
old_style_async(callback=f.set_result)
result = yield from f

Assuming old_style_async() calls its callback with one arg, a useful
result, that result will now end up in the variable 'result'. If this
happens a lot it's easy to wrap it in a helper function, so you can
write:

result = yield from wrap_in_future(old_style_async)
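A minimal sketch of such a helper, using concurrent.futures.Future for
illustration; old_style_async is a stand-in for any callback-based API,
and in coroutine code the caller would write `result = yield from f`
rather than calling f.result() directly.

```python
from concurrent.futures import Future


def wrap_in_future(func, *args):
    """Adapt a callback-taking function into one returning a Future."""
    f = Future()
    func(*args, callback=f.set_result)
    return f


def old_style_async(x, callback):
    # Pretend this completes asynchronously and invokes the callback
    # with one useful result.
    callback(x * 2)


f = wrap_in_future(old_style_async, 21)
print(f.result())  # 42
```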
-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Fri Dec 21 18:30:46 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 09:30:46 -0800
Subject: [Python-ideas] Tulip patches
In-Reply-To: <CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
References: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
	<CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
Message-ID: <CAP7+vJJnFFJ-ATWHi2hRnNNuZXdxc9NGjyjosZRXSZ5ah4dnAw@mail.gmail.com>

On Fri, Dec 21, 2012 at 7:57 AM, Guido van Rossum <guido at python.org> wrote:
> On Fri, Dec 21, 2012 at 6:31 AM, Geert Jansen <geertj at gmail.com> wrote:
>> Please find attached 4 patches:
>>
>> 0001-run-fd-callbacks.patch
>>
>> This patch will run callbacks for readers and writers in the same loop
>> iteration as when the fd got ready. Copying from my previous email,
>> this is to support the following idiom:
>>
>>     # handle_read() sets the "ready" flag
>>     loop.add_reader(fd, handle_read)
>>     while not ready:
>>         loop.run_once()
>>
>> The patch currently dispatches callbacks twice in each iteration, once
>> before blocking and once after. I tried to dispatch only once after
>> blocking, but this made the SSL transport test hang. The reason is
>> that the create_transport task is scheduled with call_soon(), and only
>> when the task first runs, a file descriptor is added. So unless you
>> dispatch before blocking, this task will never get started.
>
> Interesting. Go ahead and submit.

Whoa! I just figured out the problem. You don't have to run the ready
queue twice. You just have to set the poll timeout to 0 if there's
anything in the ready queue. Please send me an updated patch before
submitting.
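The fix can be sketched as follows: instead of dispatching the ready queue
twice per iteration, poll with a timeout of 0 whenever callbacks are
already queued, so the loop never blocks while work is pending. The
MiniLoop structure here is illustrative, not tulip's actual event loop.

```python
import collections
import select


class MiniLoop:
    def __init__(self):
        self.ready = collections.deque()
        self.readers = {}  # fd -> callback

    def call_soon(self, cb, *args):
        self.ready.append((cb, args))

    def run_once(self, timeout=1.0):
        # Key line: don't block in the poller if there is queued work.
        poll_timeout = 0 if self.ready else timeout
        if self.readers:
            r, _, _ = select.select(list(self.readers), [], [], poll_timeout)
            for fd in r:
                self.ready.append((self.readers[fd], ()))
        # Dispatch everything queued, exactly once per iteration.
        while self.ready:
            cb, args = self.ready.popleft()
            cb(*args)


loop = MiniLoop()
results = []
loop.call_soon(results.append, 'ran')
loop.run_once()
print(results)  # ['ran']
```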

-- 
--Guido van Rossum (python.org/~guido)


From felipecruz at loogica.net  Fri Dec 21 19:09:05 2012
From: felipecruz at loogica.net (Felipe Cruz)
Date: Fri, 21 Dec 2012 16:09:05 -0200
Subject: [Python-ideas] Tulip patches
In-Reply-To: <CAP7+vJJnFFJ-ATWHi2hRnNNuZXdxc9NGjyjosZRXSZ5ah4dnAw@mail.gmail.com>
References: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
	<CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
	<CAP7+vJJnFFJ-ATWHi2hRnNNuZXdxc9NGjyjosZRXSZ5ah4dnAw@mail.gmail.com>
Message-ID: <CAEynwxs03W4ghtrCEPWbx45Now+w22-zbjwy9GMBbwyOv2Knzg@mail.gmail.com>

Hi!

I've been working on some tests for the pollers (Kqueue, Epoll ..) that may
interest you guys. My goal is to create test cases for each poller
situation (ie: how to detect client disconnection with epoll and unix
pipes? or tcp sockets..) and understand how all those pollers differ
from each other and how we can map generic events onto all those possible
underlying implementations.

I already did some Epoll and Kqueue tests here:
https://bitbucket.org/felipecruz/tulip/commits

best regards,
Felipe Cruz


2012/12/21 Guido van Rossum <guido at python.org>

> On Fri, Dec 21, 2012 at 7:57 AM, Guido van Rossum <guido at python.org>
> wrote:
> > On Fri, Dec 21, 2012 at 6:31 AM, Geert Jansen <geertj at gmail.com> wrote:
> >> Please find attached 4 patches:
> >>
> >> 0001-run-fd-callbacks.patch
> >>
> >> This patch will run callbacks for readers and writers in the same loop
> >> iteration as when the fd got ready. Copying from my previous email,
> >> this is to support the following idiom:
> >>
> >>     # handle_read() sets the "ready" flag
> >>     loop.add_reader(fd, handle_read)
> >>     while not ready:
> >>         loop.run_once()
> >>
> >> The patch currently dispatches callbacks twice in each iteration, once
> >> before blocking and once after. I tried to dispatch only once after
> >> blocking, but this made the SSL transport test hang. The reason is
> >> that the create_transport task is scheduled with call_soon(), and only
> >> when the task first runs, a file descriptor is added. So unless you
> >> dispatch before blocking, this task will never get started.
> >
> > Interesting. Go ahead and submit.
>
> Whoa! I just figured out the problem. You don't have to run the ready
> queue twice. You just have to set the poll timeout to 0 if there's
> anything in the ready queue. Please send me an updated patch before
> submitting.
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From guido at python.org  Fri Dec 21 19:38:35 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 10:38:35 -0800
Subject: [Python-ideas] Tulip patches
In-Reply-To: <CAEynwxs03W4ghtrCEPWbx45Now+w22-zbjwy9GMBbwyOv2Knzg@mail.gmail.com>
References: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
	<CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
	<CAP7+vJJnFFJ-ATWHi2hRnNNuZXdxc9NGjyjosZRXSZ5ah4dnAw@mail.gmail.com>
	<CAEynwxs03W4ghtrCEPWbx45Now+w22-zbjwy9GMBbwyOv2Knzg@mail.gmail.com>
Message-ID: <CAP7+vJKTpnP9g1bMEF7wFE4GqAoxcMYfNJFWTw86Aa7CbJv9-g@mail.gmail.com>

On Fri, Dec 21, 2012 at 10:09 AM, Felipe Cruz <felipecruz at loogica.net> wrote:
> I've been working in some tests to the pollers (Kqueue, Epoll ..) that may
> interest you guys.. My goal is to create test cases for each poller
> situation (ie: how to detect client disconnection with epoll and unix pipes?
> or tcp sockets..) and understand how all those pollers are different from
> each other and how we can map a generic events with all those possible
> underlying implementations.

That goal sounds great.

> I already  did some Epoll and Kqueue tests here:
> https://bitbucket.org/felipecruz/tulip/commits

Hm... Your clone is behind, a lot has changed since you made those
commits. You may have to merge from the main repo.

Specific comments:

- I prefer to use my existing test infrastructure rather than 3rd
party tools; dependencies in this early stage make it too hard for
people to experiment. (It's okay to add a rule to the Makefile to
invoke your favorite test discovery tool; but it's not okay to add
imports to the Python code that depends on a 3rd party test
framework.)

- Your code to add flags or eventmask to the events list seems
incomplete -- the UnixEventLoop doesn't expect poll() to return a
tuple so all my own tests break...

-- 
--Guido van Rossum (python.org/~guido)


From felipecruz at loogica.net  Fri Dec 21 19:46:58 2012
From: felipecruz at loogica.net (Felipe Cruz)
Date: Fri, 21 Dec 2012 16:46:58 -0200
Subject: [Python-ideas] Tulip patches
In-Reply-To: <CAP7+vJKTpnP9g1bMEF7wFE4GqAoxcMYfNJFWTw86Aa7CbJv9-g@mail.gmail.com>
References: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
	<CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
	<CAP7+vJJnFFJ-ATWHi2hRnNNuZXdxc9NGjyjosZRXSZ5ah4dnAw@mail.gmail.com>
	<CAEynwxs03W4ghtrCEPWbx45Now+w22-zbjwy9GMBbwyOv2Knzg@mail.gmail.com>
	<CAP7+vJKTpnP9g1bMEF7wFE4GqAoxcMYfNJFWTw86Aa7CbJv9-g@mail.gmail.com>
Message-ID: <CAEynwxvnvO+nMrOhzJoHBJiZ153LH3byRoVtPwS2BkVGPD6C_Q@mail.gmail.com>

Hi Guido!

I was just hacking without thinking about actually making patches. I can
make patches without third-party dependencies and no Makefile modifications. :)

I'll fix the second issue.




2012/12/21 Guido van Rossum <guido at python.org>

> On Fri, Dec 21, 2012 at 10:09 AM, Felipe Cruz <felipecruz at loogica.net>
> wrote:
> > I've been working in some tests to the pollers (Kqueue, Epoll ..) that
> may
> > interest you guys.. My goal is to create test cases for each poller
> > situation (ie: how to detect client disconnection with epoll and unix
> pipes?
> > or tcp sockets..) and understand how all those pollers are different from
> > each other and how we can map a generic events with all those possible
> > underlying implementations.
>
> That goal sounds great.
>
> > I already  did some Epoll and Kqueue tests here:
> > https://bitbucket.org/felipecruz/tulip/commits
>
> Hm... Your clone is behind, a lot has changed since you made those
> commits. You may have to merge from the main repo.
>
> Specific comments:
>
> - I prefer to use my existing test infrastructure rather than 3rd
> party tools; dependencies in this early stage make it too hard for
> people to experiment. (It's okay to add a rule to the Makefile to
> invoke your favorite test discovery tool; but it's not okay to add
> imports to the Python code that depends on a 3rd party test
> framework.)
>
> - Your code to add flags or eventmask to the events list seems
> incomplete -- the UnixEventLoop doesn't expect poll() to return a
> tuple so all my own tests break...
>
> --
> --Guido van Rossum (python.org/~guido)
>

From jnoller at gmail.com  Fri Dec 21 20:06:47 2012
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 21 Dec 2012 14:06:47 -0500
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
Message-ID: <FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>



On Friday, December 21, 2012 at 1:57 PM, Guido van Rossum wrote:

> Dear python-dev *and* python-ideas,
> 
> I am posting PEP 3156 here for early review and discussion. As you can
> see from the liberally sprinkled TBD entries it is not done, but I am
> about to disappear on vacation for a few weeks and I am reasonably
> happy with the state of things so far. (Of course feedback may change
> this. :-) Also, there has already been some discussion on python-ideas
> (and even on Twitter) so I don't want python-dev to feel out of the
> loop -- this *is* a proposal for a new standard library module. (But
> no, I haven't picked the module name yet. :-)
> 
> There's an -- also incomplete -- reference implementation at
> http://code.google.com/p/tulip/ -- unlike the first version of tulip,
> this version actually has (some) unittests.
> 
> Let the bikeshedding begin!
> 
> (Oh, happy holidays too. :-)
> 
> -- 
> --Guido van Rossum (python.org/~guido (http://python.org/~guido))
> 
I really do like tulip as the name. It's quite pretty. 




From guido at python.org  Fri Dec 21 20:09:39 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 11:09:39 -0800
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
Message-ID: <CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>

On Fri, Dec 21, 2012 at 11:06 AM, Jesse Noller <jnoller at gmail.com> wrote:
> I really do like tulip as the name. It's quite pretty.

I chose it because Twisted and Tornado both start with T. But those
have kind of dark associations; I wanted to offset that with something
lighter. (OTOH we could use a black tulip as a logo. :-)

Regardless, it's not the kind of name we tend to use for the stdlib.
It'll probably end up being asynclib or something...

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Fri Dec 21 19:57:12 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 10:57:12 -0800
Subject: [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted
Message-ID: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>

Dear python-dev *and* python-ideas,

I am posting PEP 3156 here for early review and discussion. As you can
see from the liberally sprinkled TBD entries it is not done, but I am
about to disappear on vacation for a few weeks and I am reasonably
happy with the state of things so far. (Of course feedback may change
this. :-) Also, there has already been some discussion on python-ideas
(and even on Twitter) so I don't want python-dev to feel out of the
loop -- this *is* a proposal for a new standard library module. (But
no, I haven't picked the module name yet. :-)

There's an -- also incomplete -- reference implementation at
http://code.google.com/p/tulip/ -- unlike the first version of tulip,
this version actually has (some) unittests.

Let the bikeshedding begin!

(Oh, happy holidays too. :-)

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
PEP: 3156
Title: Asynchronous IO Support Rebooted
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum <guido at python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2012
Post-History: TBD

Abstract
========

This is a proposal for asynchronous I/O in Python 3, starting with
Python 3.3.  Consider this the concrete proposal that is missing from
PEP 3153.  The proposal includes a pluggable event loop API, transport
and protocol abstractions similar to those in Twisted, and a
higher-level scheduler based on ``yield from`` (PEP 380).  A reference
implementation is in the works under the code name tulip.


Introduction
============

The event loop is the place where most interoperability occurs.  It
should be easy for (Python 3.3 ports of) frameworks like Twisted,
Tornado, or ZeroMQ to either adapt the default event loop
implementation to their needs using a lightweight wrapper or proxy, or
to replace the default event loop implementation with an adaptation of
their own event loop implementation.  (Some frameworks, like Twisted,
have multiple event loop implementations.  This should not be a
problem since these all have the same interface.)

It should even be possible for two different third-party frameworks to
interoperate, either by sharing the default event loop implementation
(each using its own adapter), or by sharing the event loop
implementation of either framework.  In the latter case two levels of
adaptation would occur (from framework A's event loop to the standard
event loop interface, and from there to framework B's event loop).
Which event loop implementation is used should be under control of the
main program (though a default policy for event loop selection is
provided).

Thus, two separate APIs are defined:

- getting and setting the current event loop object
- the interface of a conforming event loop and its minimum guarantees

An event loop implementation may provide additional methods and
guarantees.

The event loop interface does not depend on ``yield from``.  Rather, it
uses a combination of callbacks, additional interfaces (transports and
protocols), and Futures.  The latter are similar to those defined in
PEP 3148, but have a different implementation and are not tied to
threads.  In particular, they have no wait() method; the user is
expected to use callbacks.

For users (like myself) who don't like using callbacks, a scheduler is
provided for writing asynchronous I/O code as coroutines using the PEP
380 ``yield from`` expressions.  The scheduler is not pluggable;
pluggability occurs at the event loop level, and the scheduler should
work with any conforming event loop implementation.

For interoperability between code written using coroutines and other
async frameworks, the scheduler has a Task class that behaves like a
Future.  A framework that interoperates at the event loop level can
wait for a Future to complete by adding a callback to the Future.
Likewise, the scheduler offers an operation to suspend a coroutine
until a callback is called.

Limited interoperability with threads is provided by the event loop
interface; there is an API to submit a function to an executor (see
PEP 3148) which returns a Future that is compatible with the event
loop.


Non-goals
=========

Interoperability with systems like Stackless Python or
greenlets/gevent is not a goal of this PEP.


Specification
=============

Dependencies
------------

Python 3.3 is required.  No new language or standard library features
beyond Python 3.3 are required.  No third-party modules or packages
are required.

Module Namespace
----------------

The specification here will live in a new toplevel package.  Different
components will live in separate submodules of that package.  The
package will import common APIs from their respective submodules and
make them available as package attributes (similar to the way the
email package works).

The name of the toplevel package is currently unspecified.  The
reference implementation uses the name 'tulip', but the name will
change to something more boring if and when the implementation is
moved into the standard library (hopefully for Python 3.4).

Until the boring name is chosen, this PEP will use 'tulip' as the
toplevel package name.  Classes and functions given without a module
name are assumed to be accessed via the toplevel package.

Event Loop Policy: Getting and Setting the Event Loop
-----------------------------------------------------

To get the current event loop, use ``get_event_loop()``.  This returns
an instance of the ``EventLoop`` class defined below or an equivalent
object.  It is possible that ``get_event_loop()`` returns a different
object depending on the current thread, or depending on some other
notion of context.

To set the current event loop, use ``set_event_loop(event_loop)``,
where ``event_loop`` is an instance of the ``EventLoop`` class or
equivalent.  This uses the same notion of context as
``get_event_loop()``.

For the benefit of unit tests and other special cases there's a third
policy function: ``init_event_loop()``, which creates a new EventLoop
instance and calls ``set_event_loop()`` with it.  TBD: Maybe we should
have a ``create_default_event_loop_instance()`` function instead?

To change the way the above three functions work
(including their notion of context), call
``set_event_loop_policy(policy)``, where ``policy`` is an event loop
policy object.  The policy object can be any object that has methods
``get_event_loop()``, ``set_event_loop(event_loop)``
and ``init_event_loop()`` behaving like
the functions described above.  The default event loop policy is an
instance of the class ``DefaultEventLoopPolicy``.  The current event loop
policy object can be retrieved by calling ``get_event_loop_policy()``.

An event loop policy may but does not have to enforce that there is
only one event loop in existence.  The default event loop policy does
not enforce this, but it does enforce that there is only one event
loop per thread.
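To illustrate that contract (this is not tulip's actual code), a minimal per-thread policy might look like the following sketch, with the real ``EventLoop`` class stubbed out:

```python
import threading

class _StubEventLoop:
    """Stand-in for the EventLoop class specified below."""

class DefaultEventLoopPolicy:
    """One event loop per thread, created on demand by get_event_loop()."""

    def __init__(self):
        self._local = threading.local()  # per-thread storage = the "context"

    def get_event_loop(self):
        if getattr(self._local, "loop", None) is None:
            self.init_event_loop()
        return self._local.loop

    def set_event_loop(self, event_loop):
        self._local.loop = event_loop

    def init_event_loop(self):
        self.set_event_loop(_StubEventLoop())

policy = DefaultEventLoopPolicy()
loop = policy.get_event_loop()
# Repeated calls in the same thread return the same loop; another thread
# would get its own loop, since the context here is the thread.
assert policy.get_event_loop() is loop
```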

Event Loop Interface
--------------------

(A note about times: as usual in Python, all timeouts, intervals and
delays are measured in seconds, and may be ints or floats.  The
accuracy and precision of the clock are up to the implementation; the
default implementation uses ``time.monotonic()``.)

A conforming event loop object has the following methods:

- ``run()``.  Runs the event loop until there is nothing left to do.
  This means, in particular:

  - No more calls scheduled with ``call_later()``,
    ``call_repeatedly()``, ``call_soon()``, or
    ``call_soon_threadsafe()``, except for cancelled calls.

  - No more registered file descriptors.  It is up to the registering
    party to unregister a file descriptor when it is closed.

  Note: ``run()`` blocks until the termination condition is met,
  or until ``stop()`` is called.

  Note: if you schedule a call with ``call_repeatedly()``, ``run()``
  will not exit until you cancel it.

  TBD: How many variants of this do we really need?

- ``stop()``.  Stops the event loop as soon as it is convenient.  It
  is fine to restart the loop with ``run()`` (or one of its variants)
  subsequently.

  Note: How soon exactly is up to the implementation.  All immediate
  callbacks that were already scheduled to run before ``stop()`` is
  called must still be run, but callbacks scheduled after it is called
  (or scheduled to be run later) will not be run.

- ``run_forever()``.  Runs the event loop until ``stop()`` is called.

- ``run_until_complete(future, timeout=None)``.  Runs the event loop
  until the Future is done.  If a timeout is given, it waits at most
  that long.  If the Future is done, its result is returned, or its
  exception is raised; if the timeout expires before the Future is
  done, or if ``stop()`` is called, ``TimeoutError`` is raised (but
  the Future is not cancelled).  This cannot be called when the event
  loop is already running.
  
  Note: This API is most useful for tests and the like.  It should not
  be used as a substitute for ``yield from future`` or other ways to
  wait for a Future (e.g. registering a done callback).

- ``run_once(timeout=None)``.  Run the event loop for a little while.
  If a timeout is given, an I/O poll will block at most that
  long; otherwise, the I/O poll is not constrained in time.

  Note: Exactly how much work this does is up to the implementation.
  One constraint: if a callback immediately schedules itself using
  ``call_soon()``, causing an infinite loop, ``run_once()`` should
  still return.

- ``call_later(delay, callback, *args)``.  Arrange for
  ``callback(*args)`` to be called approximately ``delay`` seconds in
  the future, once, unless cancelled.  Returns
  a ``Handler`` object representing the callback, whose
  ``cancel()`` method can be used to cancel the callback.

- ``call_repeatedly(interval, callback, *args)``.  Like ``call_later()``
  but calls the callback repeatedly, every ``interval`` seconds,
  until the ``Handler`` returned is cancelled.  The first call is in
  ``interval`` seconds.

- ``call_soon(callback, *args)``.  Equivalent to ``call_later(0,
  callback, *args)``.

- ``call_soon_threadsafe(callback, *args)``.  Like
  ``call_soon(callback, *args)``, but when called from another thread
  while the event loop is blocked waiting for I/O, unblocks the event
  loop.  This is the *only* method that is safe to call from another
  thread or from a signal handler.  (To schedule a callback for a
  later time in a threadsafe manner, you can use
  ``ev.call_soon_threadsafe(ev.call_later, when, callback, *args)``.)

- TBD: A way to register a callback that is already wrapped in a
  ``Handler``.  Maybe ``call_soon()`` could just check
  ``isinstance(callback, Handler)``?  It should silently skip
  a cancelled callback.
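As an illustration of the intended ``call_later()``/``Handler`` semantics (registration order preserved, cancelled callbacks silently skipped), here is a toy scheduler; ``TimerQueue`` is a made-up name, not part of the proposal:

```python
import heapq
import time

class Handler:
    def __init__(self, when, callback, args):
        self.when, self.callback, self.args = when, callback, args
        self.cancelled = False

    def cancel(self):
        self.cancelled = True

class TimerQueue:
    """Toy scheduler: run_ready() runs due, non-cancelled callbacks."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker preserves registration order

    def call_later(self, delay, callback, *args):
        h = Handler(time.monotonic() + delay, callback, args)
        heapq.heappush(self._heap, (h.when, self._count, h))
        self._count += 1
        return h

    def run_ready(self):
        now = time.monotonic()
        while self._heap and self._heap[0][0] <= now:
            h = heapq.heappop(self._heap)[2]
            if not h.cancelled:  # cancelled handlers are simply skipped
                h.callback(*h.args)

calls = []
q = TimerQueue()
q.call_later(0, calls.append, "foo")
h = q.call_later(0, calls.append, "bar")
h.cancel()
q.call_later(0, calls.append, "baz")
q.run_ready()
print(calls)  # ['foo', 'baz']
```

Note the use of ``time.monotonic()``, matching the clock the default implementation is specified to use.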

Some methods in the standard conforming interface return Futures:

- ``wrap_future(future)``.  This takes a PEP 3148 Future (i.e., an
  instance of ``concurrent.futures.Future``) and returns a Future
  compatible with the event loop (i.e., a ``tulip.Future`` instance).

- ``run_in_executor(executor, function, *args)``.  Arrange to call
  ``function(*args)`` in an executor (see PEP 3148).  Returns a Future
  whose result on success is the return value of that call.  This is
  equivalent to ``wrap_future(executor.submit(function, *args))``.  If
  ``executor`` is ``None``, a default ``ThreadPoolExecutor`` with 5
  threads is used.  (TBD: Should the default executor be shared
  between different event loops?  Should we even have a default
  executor?  Should we be able to set its thread count?  Should we even
  have this method?)

- ``set_default_executor(executor)``.  Set the default executor used
  by ``run_in_executor()``.
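Since ``run_in_executor()`` is specified as equivalent to ``wrap_future(executor.submit(...))``, its behavior can be sketched directly on top of a PEP 3148 executor. The ``wrap_future()`` step is elided here, and creating a fresh default executor per call, as below, is wasteful; both simplifications are only for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_executor(executor, function, *args):
    """Sketch: submit to a PEP 3148 executor.  A real event loop would
    additionally wrap the returned Future via wrap_future() so that its
    callbacks run in the loop's context."""
    if executor is None:
        # The PEP's proposed default: a 5-thread ThreadPoolExecutor.
        executor = ThreadPoolExecutor(max_workers=5)
    return executor.submit(function, *args)

fut = run_in_executor(None, pow, 2, 10)
print(fut.result())  # 1024
```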

- ``getaddrinfo(host, port, family=0, type=0, proto=0, flags=0)``.
  Similar to the ``socket.getaddrinfo()`` function but returns a
  Future.  The Future's result on success will be a list of the same
  format as returned by ``socket.getaddrinfo()``.  The default
  implementation calls ``socket.getaddrinfo()`` using
  ``run_in_executor()``, but other implementations may choose to
  implement their own DNS lookup.

- ``getnameinfo(sockaddr, flags=0)``.  Similar to
  ``socket.getnameinfo()`` but returns a Future.  The Future's result
  on success will be a tuple ``(host, port)``.  Same implementation
  remarks as for ``getaddrinfo()``.

- ``create_transport(protocol_factory, host, port, **kwargs)``.
  Creates a transport and a protocol and ties them together.  Returns
  a Future whose result on success is a (transport, protocol) pair.
  Note that when the Future completes, the protocol's
  ``connection_made()`` method has not yet been called; that will
  happen when the connection handshake is complete.  When it is
  impossible to connect to the given host and port, the Future will
  raise an exception instead.

  Optional keyword arguments:

  - ``family``, ``type``, ``proto``, ``flags``: Address family,
    socket type, protocol, and miscellaneous flags to be passed through
    to ``getaddrinfo()``.  These all default to ``0`` except ``type``
    which defaults to ``socket.SOCK_STREAM``.

  - ``ssl``: Pass ``True`` to create an SSL transport (by default a
    plain TCP transport is created).  Or pass an ``ssl.SSLContext``
    object to override the default SSL context object to be used.

  TBD: Should this be called create_connection()?

- ``start_serving(...)``.  Enters a loop that accepts connections.
  TBD: Signature.  There are two possibilities:

  1. You pass it a non-blocking socket that you have already prepared
     with ``bind()`` and ``listen()`` (these system calls do not block
     AFAIK), a protocol factory (I hesitate to use this word :-), and
     optional flags that control the transport creation (e.g. ssl).

  2. Instead of a socket, you pass it a host and port, and some more
     optional flags (e.g. to control IPv4 vs IPv6, or to set the
     backlog value to be passed to ``listen()``).

  In either case, once it has a socket, it will wrap it in a
  transport, and then enter a loop accepting connections (the best way
  to implement such a loop depends on the platform).  Each time a
  connection is accepted, a transport and protocol are created for it.

  This should return an object that can be used to control the serving
  loop, e.g. to stop serving, abort all active connections, and (if
  supported) adjust the backlog or other parameters.  It may also have
  an API to inquire about active connections.  If version (2) is
  selected, it should probably return a Future whose result on success
  will be that control object, and which becomes done once the accept
  loop is started.

  TBD: It may be best to use version (2), since on some platforms the
  best way to start a server may not involve sockets (but will still
  involve transports and protocols).

  TBD: Be more specific.

TBD: Some platforms may not be interested in implementing all of
these, e.g. start_serving() may be of no interest to mobile apps.
(Although, there's a Minecraft server on my iPad...)

The following methods for registering callbacks for file descriptors
are optional.  If they are not implemented, accessing the method
(without calling it) raises ``AttributeError``.  The default
implementation provides them but the user normally doesn't use these
directly -- they are used by the transport implementations
exclusively.  Also, on Windows these may be present or not depending
on whether a select-based or IOCP-based event loop is used.  These
take integer file descriptors only, not objects with a fileno()
method.  The file descriptor should represent something pollable --
i.e. no disk files.

- ``add_reader(fd, callback, *args)``.  Arrange for
  ``callback(*args)`` to be called whenever file descriptor ``fd`` is
  ready for reading.  Returns a ``Handler`` object which can be
  used to cancel the callback.  Note that, unlike ``call_later()``,
  the callback may be called many times.  Calling ``add_reader()``
  again for the same file descriptor implicitly cancels the previous
  callback for that file descriptor.  (TBD: Returning a
  ``Handler`` that can be cancelled seems awkward.  Let's forget
  about that.)  (TBD: Change this to raise an exception if a handler
  is already set.)

- ``add_writer(fd, callback, *args)``.  Like ``add_reader()``,
  but registers the callback for writing instead of for reading.

- ``remove_reader(fd)``.  Cancels the current read callback for file
  descriptor ``fd``, if one is set.  A no-op if no callback is
  currently set for the file descriptor.  (The reason for providing
  this alternate interface is that it is often more convenient to
  remember the file descriptor than to remember the ``Handler``
  object.)  (TBD: Return ``True`` if a handler was removed, ``False``
  if not.)

- ``remove_writer(fd)``.  This is to ``add_writer()`` as
  ``remove_reader()`` is to ``add_reader()``.

- ``add_connector(fd, callback, *args)``.  Like ``add_writer()`` but
  meant to wait for ``connect()`` operations, which on some platforms
  require different handling (e.g. ``WSAPoll()`` on Windows).

- ``remove_connector(fd)``.  This is to ``remove_writer()`` as
  ``add_connector()`` is to ``add_writer()``.

TBD: What about multiple callbacks per fd?  The current semantics are
that ``add_reader()/add_writer()`` replace a previously registered
callback.  Change this to raise an exception if a callback is already
registered.
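The replace-on-re-registration semantics of ``add_reader()`` can be shown with a toy ``select()``-based registry (``ReaderRegistry`` is a made-up helper, not part of the spec):

```python
import select
import socket

class ReaderRegistry:
    """Toy fd -> callback map with add_reader()'s replace semantics."""

    def __init__(self):
        self._readers = {}

    def add_reader(self, fd, callback, *args):
        # Registering again for the same fd replaces the old callback.
        self._readers[fd] = (callback, args)

    def remove_reader(self, fd):
        # True if a callback was removed, False if none was set.
        return self._readers.pop(fd, None) is not None

    def poll_once(self, timeout=0):
        if not self._readers:
            return
        ready, _, _ = select.select(list(self._readers), [], [], timeout)
        for fd in ready:
            callback, args = self._readers[fd]
            callback(*args)

a, b = socket.socketpair()
got = []
reg = ReaderRegistry()
reg.add_reader(a.fileno(), lambda: got.append("old"))
reg.add_reader(a.fileno(), lambda: got.append(a.recv(100)))  # replaces
b.sendall(b"ping")
reg.poll_once(1.0)
print(got)  # [b'ping'] -- only the most recently registered callback ran
```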

The following methods for doing async I/O on sockets are optional.
They are an alternative to the previous set of optional methods,
intended for transport implementations on Windows using IOCP (if the
event loop supports it).  The socket argument has to be a non-blocking
socket.

- ``sock_recv(sock, n)``.  Receive up to ``n`` bytes from socket
  ``sock``.  Returns a Future whose result on success will be a
  bytes object.

- ``sock_sendall(sock, data)``.  Send bytes ``data`` to the socket
  ``sock``.  Returns a Future whose result on success will be
  ``None``.  (TBD: Is it better to emulate ``sendall()`` or ``send()``
  semantics?  I think ``sendall()`` -- but perhaps it should still
  be *named* ``send()``?)

- ``sock_connect(sock, address)``.  Connect to the given address.
  Returns a Future whose result on success will be ``None``.

- ``sock_accept(sock)``.  Accept a connection from a socket.  The
  socket must be in listening mode and bound to an address.  Returns a
  Future whose result on success will be a tuple ``(conn, peer)``
  where ``conn`` is a connected non-blocking socket and ``peer`` is
  the peer address.  (TBD: People tell me that this style of API is
  too slow for high-volume servers.  So there's also
  ``start_serving()`` above.  Then do we still need this?)

TBD: Optional methods are not so good.  Perhaps these should be
required?  It may still depend on the platform which set is more
efficient.
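To make the Future-returning style concrete, here is a deliberately naive sketch that fakes ``sock_recv()`` and ``sock_sendall()`` with a thread pool; a real IOCP- or select-based loop would of course not burn a thread per operation:

```python
from concurrent.futures import ThreadPoolExecutor
import socket

_pool = ThreadPoolExecutor(max_workers=2)  # illustration only

def sock_recv(sock, n):
    """Returns a Future whose result is up to n bytes from sock."""
    return _pool.submit(sock.recv, n)

def sock_sendall(sock, data):
    """Returns a Future whose result is None once all data is sent
    (sendall() semantics, as the PEP tentatively prefers)."""
    def _send():
        sock.sendall(data)
    return _pool.submit(_send)

a, b = socket.socketpair()
sock_sendall(b, b"hello").result()
fut = sock_recv(a, 5)
print(fut.result())  # b'hello'
```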

Callback Sequencing
-------------------

When two callbacks are scheduled for the same time, they are run
in the order in which they are registered.  For example::

  ev.call_soon(foo)
  ev.call_soon(bar)

guarantees that ``foo()`` is called before ``bar()``.

If ``call_soon()`` is used, this guarantee is true even if the system
clock were to run backwards.  This is also the case for
``call_later(0, callback, *args)``.  However, if ``call_later()`` is
used with a nonzero delay, all bets are off if the system clock
runs backwards.  (A good event loop implementation should use
``time.monotonic()`` to avoid problems when the clock runs
backward.  See PEP 418.)

Context
-------

All event loops have a notion of context.  For the default event loop
implementation, the context is a thread.  An event loop implementation
should run all callbacks in the same context.  An event loop
implementation should run only one callback at a time, so callbacks
can assume automatic mutual exclusion with other callbacks scheduled
in the same event loop.

Exceptions
----------

There are two categories of exceptions in Python: those that derive
from the ``Exception`` class and those that derive from
``BaseException``.  Exceptions deriving from ``Exception`` will
generally be caught and handled appropriately; for example, they will
be passed through by Futures, and they will be logged and ignored when
they occur in a callback.

However, exceptions deriving only from ``BaseException`` are never
caught, and will usually cause the program to terminate with a
traceback.  (Examples of this category include ``KeyboardInterrupt``
and ``SystemExit``; it is usually unwise to treat these the same as
most other exceptions.)

The Handler Class
-----------------

The various methods for registering callbacks (e.g. ``call_later()``)
all return an object representing the registration that can be used to
cancel the callback.  For want of a better name this object is called
a ``Handler``, although the user never needs to instantiate
this class directly.  There is one public method:

- ``cancel()``.  Attempt to cancel the callback.
  TBD: Exact specification.

Read-only public attributes:

- ``callback``.  The callback function to be called.

- ``args``.  The argument tuple with which to call the callback function.

- ``cancelled``.  True if ``cancel()`` has been called.

Note that some callbacks (e.g. those registered with ``call_later()``)
are meant to be called only once.  Others (e.g. those registered with
``add_reader()``) are meant to be called multiple times.

TBD: An API to call the callback (encapsulating the exception handling
necessary)?  Should it record how many times it has been called?
Maybe this API should just be ``__call__()``?  (But it should suppress
exceptions.)

TBD: Public attribute recording the realtime value when the callback
is scheduled?  (Since this is needed anyway for storing it in a heap.)

Futures
-------

The ``tulip.Future`` class here is intentionally similar to the
``concurrent.futures.Future`` class specified by PEP 3148, but there
are slight differences.  The supported public API is as follows,
indicating the differences with PEP 3148:

- ``cancel()``.
  TBD: Exact specification.

- ``cancelled()``.

- ``running()``.  Note that the meaning of this method is essentially
  "cannot be cancelled and isn't done yet".  (TBD: Would be nice if
  this could be set *and* cleared in some cases, e.g. sock_recv().)

- ``done()``.

- ``result()``.  Difference with PEP 3148: This has no timeout
  argument and does *not* wait; if the future is not yet done, it
  raises an exception.

- ``exception()``.  Difference with PEP 3148: This has no timeout
  argument and does *not* wait; if the future is not yet done, it
  raises an exception.

- ``add_done_callback(fn)``.  Difference with PEP 3148: The callback
  is never called immediately, and always in the context of the
  caller.  (Typically, a context is a thread.)  You can think of this
  as calling the callback through ``call_soon_threadsafe()``.  Note
  that the callback (unlike all other callbacks defined in this PEP,
  and ignoring the convention from the section "Callback Style" below)
  is always called with a single argument, the Future object.

The internal methods defined in PEP 3148 are not supported.  (TBD:
Maybe we do need to support these, in order to make it easy to write
user code that returns a Future?)

A ``tulip.Future`` object is not acceptable to the ``wait()`` and
``as_completed()`` functions in the ``concurrent.futures`` package.

A ``tulip.Future`` object is acceptable to a ``yield from`` expression
when used in a coroutine.  This is implemented through the
``__iter__()`` interface on the Future.  See the section "Coroutines
and the Scheduler" below.
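The mechanism can be sketched with a toy Future whose ``__iter__()`` yields itself while pending and returns its result once done (``MiniFuture`` is illustrative only, not the proposed implementation):

```python
class MiniFuture:
    """Toy Future supporting ``yield from fut`` via __iter__()."""
    _PENDING = object()

    def __init__(self):
        self._result = self._PENDING

    def done(self):
        return self._result is not self._PENDING

    def set_result(self, result):
        self._result = result

    def result(self):
        if not self.done():
            raise RuntimeError("result not ready")
        return self._result

    def __iter__(self):
        if not self.done():
            yield self  # tell the scheduler to suspend us until done
        return self.result()  # becomes the value of ``yield from``

def coro(fut):
    value = yield from fut
    return value * 2

fut = MiniFuture()
g = coro(fut)
assert next(g) is fut  # coroutine suspended, yielding the pending future
fut.set_result(21)
try:
    next(g)
except StopIteration as stop:
    print(stop.value)  # 42
```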

When a Future is garbage-collected, if it has an associated exception
but neither ``result()`` nor ``exception()`` nor ``__iter__()`` has
ever been called (or the latter hasn't raised the exception yet --
details TBD), the exception should be logged.  TBD: At what level?

In the future (pun intended) we may unify ``tulip.Future`` and
``concurrent.futures.Future``, e.g. by adding an ``__iter__()`` method
to the latter that works with ``yield from``.  To prevent accidentally
blocking the event loop by calling e.g. ``result()`` on a Future
that's not done yet, the blocking operation may detect that an event
loop is active in the current thread and raise an exception instead.
However, the current PEP strives to have no dependencies beyond Python
3.3, so changes to ``concurrent.futures.Future`` are off the table for
now.

Transports
----------

A transport is an abstraction on top of a socket or something similar
(for example, a UNIX pipe or an SSL connection).  Transports are
strongly influenced by Twisted and PEP 3153.  Users rarely implement
or instantiate transports -- rather, event loops offer utility methods
to set up transports.

Transports work in conjunction with protocols.  Protocols are
typically written without knowing or caring about the exact type of
transport used, and transports can be used with a wide variety of
protocols.  For example, an HTTP client protocol implementation may be
used with either a plain socket transport or an SSL transport.  The
plain socket transport can be used with many different protocols
besides HTTP (e.g. SMTP, IMAP, POP, FTP, IRC, SPDY).

Most connections have an asymmetric nature: the client and server
usually have very different roles and behaviors.  Hence, the interface
between transport and protocol is also asymmetric.  From the
protocol's point of view, *writing* data is done by calling the
``write()`` method on the transport object; this buffers the data and
returns immediately.  However, the transport takes a more active role
in *reading* data: whenever some data is read from the socket (or
other data source), the transport calls the protocol's
``data_received()`` method.

Transports have the following public methods:

- ``write(data)``.  Write some bytes.  The argument must be a bytes
  object.  Returns ``None``.  The transport is free to buffer the
  bytes, but it must eventually cause the bytes to be transferred to
  the entity at the other end, and it must maintain stream behavior.
  That is, ``t.write(b'abc'); t.write(b'def')`` is equivalent to
  ``t.write(b'abcdef')``, as well as to::

    t.write(b'a')
    t.write(b'b')
    t.write(b'c')
    t.write(b'd')
    t.write(b'e')
    t.write(b'f')

- ``writelines(iterable)``.  Equivalent to::

    for data in iterable:
        self.write(data)

- ``write_eof()``.  Close the writing end of the connection.
  Subsequent calls to ``write()`` are not allowed.  Once all buffered
  data is transferred, the transport signals to the other end that no
  more data will be received.  Some transports don't support this
  operation; in that case, calling ``write_eof()`` will raise an
  exception.  (Note: This used to be called ``half_close()``, but
  unless you already know what it is for, that name doesn't indicate
  *which* end is closed.)

- ``can_write_eof()``.  Return ``True`` if the transport supports
  ``write_eof()``, ``False`` if it does not.  (This method is needed
  because some protocols need to change their behavior when
  ``write_eof()`` is unavailable.  For example, in HTTP, to send data
  whose size is not known ahead of time, the end of the data is
  typically indicated using ``write_eof()``; however, SSL does not
  support this, and an HTTP protocol implementation would have to use
  the "chunked" transfer encoding in this case.  But if the data size
  is known ahead of time, the best approach in both cases is to use
  the Content-Length header.)

- ``pause()``.  Suspend delivery of data to the protocol until a
  subsequent ``resume()`` call.  Between ``pause()`` and ``resume()``,
  the protocol's ``data_received()`` method will not be called.  This
  has no effect on ``write()``.

- ``resume()``.  Restart delivery of data to the protocol via
  ``data_received()``.

- ``close()``.  Sever the connection with the entity at the other end.
  Any data buffered by ``write()`` will (eventually) be transferred
  before the connection is actually closed.  The protocol's
  ``data_received()`` method will not be called again.  Once all
  buffered data has been flushed, the protocol's ``connection_lost()``
  method will be called with ``None`` as the argument.  Note that
  this method does not wait for all that to happen.

- ``abort()``.  Immediately sever the connection.  Any data still
  buffered by the transport is thrown away.  Soon, the protocol's
  ``connection_lost()`` method will be called with ``None`` as
  argument.  (TBD: Distinguish in the ``connection_lost()`` argument
  between ``close()``, ``abort()``, or a close initiated by the other
  end?  Or add a transport method to inquire about this?  Glyph's
  proposal was to pass different exceptions for this purpose.)
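The stream guarantee of ``write()`` above can be illustrated with a
toy in-memory transport (``ToyTransport`` is a hypothetical class used
only for illustration, not part of this PEP):

```python
class ToyTransport:
    """Toy stand-in for a transport; just accumulates written bytes."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data):
        # A real transport is free to buffer and flush later; here we
        # simply accumulate, which is enough to show stream behavior.
        self._buffer.extend(data)

    def writelines(self, iterable):
        for data in iterable:
            self.write(data)

    def getvalue(self):
        return bytes(self._buffer)


t1, t2, t3 = ToyTransport(), ToyTransport(), ToyTransport()
t1.write(b'abcdef')
t2.write(b'abc'); t2.write(b'def')
t3.writelines([b'ab', b'cd', b'ef'])
# All three ways of writing produce the same byte stream.
assert t1.getvalue() == t2.getvalue() == t3.getvalue() == b'abcdef'
```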

TBD: Provide flow control the other way -- the transport may need to
suspend the protocol if the amount of data buffered becomes a burden.
Proposal: let the transport call ``protocol.pause()`` and
``protocol.resume()`` if they exist; if they don't exist, the
protocol doesn't support flow control.  (Perhaps different names
to avoid confusion between protocols and transports?)

Protocols
---------

Protocols are always used in conjunction with transports.  While a few
common protocols are provided (e.g. decent though not necessarily
excellent HTTP client and server implementations), most protocols will
be implemented by user code or third-party libraries.

A protocol must implement the following methods, which will be called
by the transport.  Consider these to be callbacks that are always
called by the event loop in the right context.  (See the "Context"
section above.)

- ``connection_made(transport)``.  Indicates that the transport is
  ready and connected to the entity at the other end.  The protocol
  should probably save the transport reference as an instance variable
  (so it can call its ``write()`` and other methods later), and may
  write an initial greeting or request at this point.

- ``data_received(data)``.  The transport has read some bytes from the
  connection.  The argument is always a non-empty bytes object.  There
  are no guarantees about the minimum or maximum size of the data
  passed along this way.  ``p.data_received(b'abcdef')`` should be
  treated as exactly equivalent to::

    p.data_received(b'abc')
    p.data_received(b'def')

- ``eof_received()``.  This is called when the other end called
  ``write_eof()`` (or something equivalent).  The default
  implementation calls ``close()`` on the transport, which causes
  ``connection_lost()`` to be called (eventually) on the protocol.

- ``connection_lost(exc)``.  The transport has been closed or aborted,
  has detected that the other end has closed the connection cleanly,
  or has encountered an unexpected error.  In the first three cases
  the argument is ``None``; for an unexpected error, the argument is
  the exception that caused the transport to give up.  (TBD: Do we
  need to distinguish between the first three cases?)

Here is a chart indicating the order and multiplicity of calls:

  1. ``connection_made()`` -- exactly once
  2. ``data_received()`` -- zero or more times
  3. ``eof_received()`` -- at most once
  4. ``connection_lost()`` -- exactly once
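The order charted above can be demonstrated with a minimal echo
protocol driven by a toy in-memory transport (both classes are
hypothetical sketches, not the PEP's real transports):

```python
class EchoProtocol:
    """Minimal protocol implementing the four callbacks above."""

    def __init__(self):
        self.calls = []

    def connection_made(self, transport):
        self.transport = transport      # keep it for later writes
        self.calls.append('connection_made')

    def data_received(self, data):
        self.calls.append('data_received')
        self.transport.write(data)      # echo the data back

    def eof_received(self):
        self.calls.append('eof_received')
        self.transport.close()          # default behavior per the PEP

    def connection_lost(self, exc):
        self.calls.append('connection_lost')


class LoopbackTransport:
    """Toy transport that lets a test feed data into the protocol."""

    def __init__(self, protocol):
        self.written = bytearray()
        self.protocol = protocol
        protocol.connection_made(self)  # 1. exactly once

    def write(self, data):
        self.written.extend(data)

    def feed(self, data):
        self.protocol.data_received(data)   # 2. zero or more times

    def feed_eof(self):
        self.protocol.eof_received()        # 3. at most once

    def close(self):
        self.protocol.connection_lost(None) # 4. exactly once


p = EchoProtocol()
t = LoopbackTransport(p)
t.feed(b'ping')
t.feed_eof()
assert p.calls == ['connection_made', 'data_received',
                   'eof_received', 'connection_lost']
assert bytes(t.written) == b'ping'
```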

TBD: Discuss whether user code needs to do anything to make sure that
protocol and transport aren't garbage-collected prematurely.

Callback Style
--------------

Most interfaces taking a callback also take positional arguments.  For
instance, to arrange for ``foo("abc", 42)`` to be called soon, you
call ``ev.call_soon(foo, "abc", 42)``.  To schedule the call
``foo()``, use ``ev.call_soon(foo)``.  This convention greatly reduces
the number of small lambdas required in typical callback programming.

This convention specifically does *not* support keyword arguments.
Keyword arguments are used to pass optional extra information about
the callback.  This allows graceful evolution of the API without
having to worry about whether a keyword might be significant to a
callee somewhere.  If you have a callback that *must* be called with a
keyword argument, you can use a lambda or ``functools.partial``.  For
example::

  ev.call_soon(functools.partial(foo, "abc", repeat=42))
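A toy event loop core (the ``MiniLoop`` class here is hypothetical;
the PEP's real event loop offers far more) shows how the
positional-argument convention works and why keyword arguments are
reserved for the API itself:

```python
import functools


class MiniLoop:
    """Toy event loop core illustrating the callback convention."""

    def __init__(self):
        self._ready = []

    def call_soon(self, callback, *args):
        # Positional args are stored and passed through to the callback.
        # Keyword args (none here) stay available for loop options.
        self._ready.append((callback, args))

    def run_once(self):
        ready, self._ready = self._ready, []
        for callback, args in ready:
            callback(*args)


results = []
loop = MiniLoop()
loop.call_soon(results.append, 'abc')
# A callback that needs a keyword argument is wrapped first:
loop.call_soon(functools.partial(results.append, 'def'))
loop.run_once()
assert results == ['abc', 'def']
```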

Choosing an Event Loop Implementation
-------------------------------------

TBD.  (This is about the choice to use e.g. select vs. poll vs. epoll,
and how to override the choice.  Probably belongs in the event loop
policy.)


Coroutines and the Scheduler
============================

This is a separate toplevel section because its status is different
from the event loop interface.  Usage of coroutines is optional, and
it is perfectly fine to write code using callbacks only.  On the other
hand, there is only one implementation of the scheduler/coroutine API,
and if you're using coroutines, that's the one you're using.

Coroutines
----------

A coroutine is a generator that follows certain conventions.  For
documentation purposes, all coroutines should be decorated with
``@tulip.coroutine``, but this cannot be strictly enforced.

Coroutines use the ``yield from`` syntax introduced in PEP 380,
instead of the original ``yield`` syntax.

The word "coroutine", like the word "generator", is used for two
different (though related) concepts:

- The function that defines a coroutine (a function definition
  decorated with ``tulip.coroutine``).  If disambiguation is needed,
  we call this a *coroutine function*.

- The object obtained by calling a coroutine function.  This object
  represents a computation or an I/O operation (usually a combination)
  that will complete eventually.  For disambiguation we call it a
  *coroutine object*.

Things a coroutine can do:

- ``result = yield from future`` -- suspends the coroutine until the
  future is done, then returns the future's result, or raises its
  exception, which will be propagated.

- ``result = yield from coroutine`` -- wait for another coroutine to
  produce a result (or raise an exception, which will be propagated).
  The ``coroutine`` expression must be a *call* to another coroutine.

- ``results = yield from tulip.par(futures_and_coroutines)`` -- Wait
  for a list of futures and/or coroutines to complete and return a
  list of their results.  If one of the futures or coroutines raises
  an exception, that exception is propagated, after attempting to
  cancel all other futures and coroutines in the list.

- ``return result`` -- produce a result to the coroutine that is
  waiting for this one using ``yield from``.

- ``raise exception`` -- raise an exception in the coroutine that is
  waiting for this one using ``yield from``.
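The ``return result`` and ``raise exception`` behaviors rely only on
plain PEP 380 generator semantics, which can be demonstrated without
any event loop at all:

```python
def compute():
    # 'return result' in a generator becomes the value of the
    # 'yield from' expression in the caller (PEP 380).
    yield            # pretend to wait for something
    return 42


def caller():
    result = yield from compute()   # waits for compute() to finish
    return result * 2


g = caller()
next(g)              # run to the first suspension point
try:
    g.send(None)     # resume; StopIteration carries the final value
except StopIteration as stop:
    assert stop.value == 84
```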

Calling a coroutine does not start its code running -- it is just a
generator, and the coroutine object returned by the call is really a
generator object, which doesn't do anything until you iterate over it.
In the case of a coroutine object, there are two basic ways to start
it running: call ``yield from coroutine`` from another coroutine
(assuming the other coroutine is already running!), or convert it to a
Task.

Coroutines can only run when the event loop is running.

Tasks
-----

A Task is an object that manages an independently running coroutine.
The Task interface is the same as the Future interface.  The task
becomes done when its coroutine returns or raises an exception; if it
returns a result, that becomes the task's result; if it raises an
exception, that becomes the task's exception.

Cancelling a task that's not done yet prevents its coroutine from
completing; in this case an exception is thrown into the coroutine
that it may catch to further handle cancellation, but it doesn't have
to (this is done using the standard ``close()`` method on generators,
described in PEP 342).

The ``par()`` function described above runs coroutines in parallel by
converting them to Tasks.  (Arguments that are already Tasks or
Futures are not converted.)

Tasks are also useful for interoperating between coroutines and
callback-based frameworks like Twisted.  After converting a coroutine
into a Task, callbacks can be added to the Task.

You may ask, why not convert all coroutines to Tasks?  The
``@tulip.coroutine`` decorator could do this.  However, this would
slow things down considerably in the case where one coroutine calls
another (and so on), as switching to a "bare" coroutine has much less
overhead than switching to a Task.

The Scheduler
-------------

The scheduler has no public interface.  You interact with it by using
``yield from future`` and ``yield from task``.  In fact, there is no
single object representing the scheduler -- its behavior is
implemented by the ``Task`` and ``Future`` classes using only the
public interface of the event loop, so it will work with third-party
event loop implementations, too.
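A toy version of this idea (all names hypothetical, and using plain
``yield`` plus ``concurrent.futures.Future`` rather than the PEP's
``yield from`` and event loop) shows how a Task-like wrapper can drive
a coroutine using only the public Future interface:

```python
from concurrent.futures import Future


def run_task(coro):
    """Drive a coroutine that yields Futures until it completes.

    This is only a sketch of the Task idea: the coroutine suspends by
    yielding a Future, and we resume it with that Future's result via
    add_done_callback() -- no scheduler object is needed.
    """
    task = Future()

    def advance(value=None):
        try:
            fut = coro.send(value)       # run until the next suspension
        except StopIteration as stop:    # the coroutine returned
            task.set_result(stop.value)
            return
        fut.add_done_callback(lambda f: advance(f.result()))

    advance()
    return task


waiting = []   # stands in for the event loop's pending I/O


def sleeper():
    f = Future()
    waiting.append(f)    # the "event loop" will complete this later
    value = yield f      # suspend until f is done
    return value * 10


task = run_task(sleeper())
assert not task.done()
waiting[0].set_result(7)    # simulate the I/O completing
assert task.result() == 70
```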

Sleeping
--------

TBD: ``yield sleep(seconds)``.  ``sleep(0)`` can be used to suspend
momentarily so that the event loop can poll for I/O.

Wait for First
--------------

TBD: Need an interface to wait for the first of a collection of Futures.

Coroutines and Protocols
------------------------

The best way to use coroutines to implement protocols is probably to
use a streaming buffer that gets filled by ``data_received()`` and can
be read asynchronously using methods like ``read(n)`` and
``readline()`` that return a Future.  When the connection is closed,
``read()`` should return a Future whose result is ``b''``, or raise an
exception if ``connection_lost()`` is called with an exception.

To write, the ``write()`` method (and friends) on the transport can be
used -- these do not return Futures.  A standard protocol
implementation should be provided that sets this up and kicks off the
coroutine when ``connection_made()`` is called.
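A synchronous sketch of such a buffer follows (``StreamBuffer`` is
hypothetical; the real ``read()`` would return a Future rather than a
direct result):

```python
class StreamBuffer:
    """Toy buffer filled by data_received() and drained by read()."""

    def __init__(self):
        self._data = bytearray()
        self._eof = False

    def feed_data(self, data):
        # Called from the protocol's data_received().
        self._data.extend(data)

    def feed_eof(self):
        # Called when the connection is closed.
        self._eof = True

    def read(self, n):
        # Synchronous stand-in: return up to n buffered bytes, or b''
        # once the connection is closed and the buffer is empty.
        if not self._data and self._eof:
            return b''
        chunk = bytes(self._data[:n])
        del self._data[:n]
        return chunk


buf = StreamBuffer()
buf.feed_data(b'hello world')
assert buf.read(5) == b'hello'
buf.feed_eof()
assert buf.read(100) == b' world'
assert buf.read(1) == b''      # EOF: connection closed, buffer empty
```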

TBD: Be more specific.

Cancellation
------------

TBD.  When a Task is cancelled its coroutine may see an exception at
any point where it is yielding to the scheduler (i.e., potentially at
any ``yield from`` operation).  We need to spell out which exception
is raised.

Also TBD: timeouts.


Open Issues
===========

- A debugging API?  E.g. something that logs a lot of stuff, or logs
  unusual conditions (like queues filling up faster than they drain)
  or even callbacks taking too much time...

- Do we need introspection APIs?  E.g. asking for the read callback
  given a file descriptor.  Or when the next scheduled call is.  Or
  the list of file descriptors registered with callbacks.

- Should we have ``future.add_callback(callback, *args)``, using the
  convention from the section "Callback Style" above, or should we
  stick with the PEP 3148 specification of
  ``future.add_done_callback(callback)`` which calls
  ``callback(future)``?  (Glyph suggested using a different method
  name since add_done_callback() does not guarantee that the callback
  will be called in the right context.)

- Returning a Future is relatively expensive, and it is quite possible
  that some types of calls *usually* complete immediately
  (e.g. writing small amounts of data to a socket).  A trick used by
  Richard Oudkerk in the tulip project's proactor branch makes calls
  like recv() either return a regular result or *raise* a Future.  The
  caller (likely a transport) must then write code like this::

    try:
        res = ev.sock_recv(sock, 8192)
    except Future as f:
        yield from sch.block_future(f)
        res = f.result()

- Do we need a larger vocabulary of operations for combining
  coroutines and/or futures?  E.g. in addition to par() we could have
  a way to run several coroutines sequentially (returning all results
  or passing the result of one to the next and returning the final
  result?).  We might also introduce explicit locks (though these will
  be a bit of a pain to use, as we can't use the ``with lock: block``
  syntax).  Anyway, I think all of these are easy enough to write
  using ``Task``.

  Proposal: ``f = yield from wait_one(fs)`` takes a set of Futures and
  sets f to the first of those that is done.  (Yes, this requires an
  intermediate Future to wait for.)  You can then write::

    while fs:
        f = yield from tulip.wait_one(fs)
        fs.remove(f)
        <inspect f>

- Support for datagram protocols, "connected" or otherwise?  Probably
  need more socket I/O methods, e.g. ``sock_sendto()`` and
  ``sock_recvfrom()``.  Or users can write their own (it's not rocket
  science).  Is it reasonable to map ``write()``, ``writelines()``,
  ``data_received()`` to single datagrams?

- Task or callback priorities?  (I hope not.)

- An EventEmitter in the style of NodeJS?  Or make this a separate
  PEP?  It's easy enough to do in user space, though it may benefit
  from standardization.  (See
  https://github.com/mnot/thor/blob/master/thor/events.py and
  https://github.com/mnot/thor/blob/master/doc/events.md for examples.)
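The "return a regular result or raise a Future" optimization from the
list above can be sketched with toy classes (all names here are
hypothetical; the toy Future derives from ``Exception`` only so that
``raise`` is legal, mirroring the ``try/except Future`` snippet
earlier):

```python
class ToyFuture(Exception):
    """Toy raisable Future used only to demonstrate the pattern."""

    def __init__(self):
        super().__init__()
        self._result = None
        self._done = False

    def set_result(self, value):
        self._result = value
        self._done = True

    def result(self):
        assert self._done, "result not set yet"
        return self._result


def sock_recv(buffered):
    """Hypothetical call: return data directly when already available,
    otherwise *raise* a Future for the caller to wait on."""
    if buffered:
        return buffered      # fast path: no Future object allocated
    raise ToyFuture()        # slow path: caller must block on this


# Fast path: the common case pays no Future-allocation cost.
assert sock_recv(b'data') == b'data'

# Slow path: the caller catches the Future and waits for it.
try:
    sock_recv(b'')
except ToyFuture as f:
    f.set_result(b'late')    # simulate the I/O completing
    assert f.result() == b'late'
```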


Acknowledgments
===============

Apart from PEP 3153, influences include PEP 380 and Greg Ewing's
tutorial for ``yield from``, Twisted, Tornado, ZeroMQ, pyftpdlib, tulip
(the author's attempts at synthesis of all these), wattle (Steve
Dower's counter-proposal), numerous discussions on python-ideas from
September through December 2012, a Skype session with Steve Dower and
Dino Viehland, email exchanges with Ben Darnell, an audience with
Niels Provos (original author of libevent), and two in-person meetings
with several Twisted developers, including Glyph, Brian Warner, David
Reid, and Duncan McGreggor.  Also, the author's previous work on async
support in the NDB library for Google App Engine was an important
influence.


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From _ at lvh.cc  Fri Dec 21 22:04:04 2012
From: _ at lvh.cc (Laurens Van Houtven)
Date: Fri, 21 Dec 2012 22:04:04 +0100
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
	<CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
Message-ID: <CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>

Looks reasonable to me :) Comments:

create_transport "combines" a transport and a protocol. Is that process
reversible? that might seem like an exotic thing (and I guess it kind of
is), but I've wanted this e.g for websockets, and I guess there's a few
other cases where it could be useful :)

eof_received on protocols seems unusual. What's the rationale?

I know we disagree that callbacks (of the line_received variety) are a good
idea for blocking IO (I think we should have universal protocol
implementations), but can we agree that they're what we want for tulip? If
so, I can try to figure out a way to get them to fit together :) I'm
assuming that this means you'd like protocols and transports in this PEP?

A generic comment on yield from APIs that I'm sure has been discussed in
some e-mail I missed: is there an obvious way to know up front whether
something needs to be yielded or yield frommed? In twisted, which is what
I'm used to it's all deferreds; but here a future's yield from but sleep's
yield?

Will comment more as I keep reading I'm sure :)


On Fri, Dec 21, 2012 at 8:09 PM, Guido van Rossum <guido at python.org> wrote:

> On Fri, Dec 21, 2012 at 11:06 AM, Jesse Noller <jnoller at gmail.com> wrote:
> > I really do like tulip as the name. It's quite pretty.
>
> I chose it because Twisted and Tornado both start with T. But those
> have kind of dark associations; I wanted to offset that with something
> lighter. (OTOH we could use a black tulip as a logo. :-)
>
> Regardless, it's not the kind of name we tend to use for the stdlib.
> It'll probably end up being asynclib or something...
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
cheers
lvh
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/262c7713/attachment.html>

From geertj at gmail.com  Fri Dec 21 22:59:23 2012
From: geertj at gmail.com (Geert Jansen)
Date: Fri, 21 Dec 2012 22:59:23 +0100
Subject: [Python-ideas] Tulip patches
In-Reply-To: <CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
References: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
	<CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
Message-ID: <CADbA=FWxVxbM+EjnjqGkcah6Y0isv6bT-8yATUetOs410ETgfQ@mail.gmail.com>

On Fri, Dec 21, 2012 at 4:57 PM, Guido van Rossum <guido at python.org> wrote:

> This is a fine place, but you would make my life even easier by
> uploading the patches to codereview.appspot.com, so I can review them
> and send comments in-line.

I tried to get Tulip added as a new repository there, but i'm probably
doing something wrong.. In the mean time i'm sending my updated
patches below..

> I've given you checkin permissions. Please send a contributor form to
> the PSF (http://www.python.org/psf/contrib/contrib-form/).

Done!

>> 0001-run-fd-callbacks.patch
[...]
> Interesting. Go ahead and submit.
[from your other email]
> Whoa! I just figured out the problem. You don't have to run the ready
> queue twice. You just have to set the poll timeout to 0 if there's
> anything in the ready queue. Please send me an updated patch before
> submitting.

New patch attached.

>> 0002-call-every-iteration.patch
[...]
> There's one odd thing here: you remove cancelled everytime handlers
> *after* already scheduling them. It would seem to make more sense to
> schedule them first. Also, a faster way to do this would be
>
>     self._everytime = [handler for handler in self._everytime if not handler.cancelled]
>
> (Even if you iterate from the back, remove() is still O(N), so if half
> the handlers are to be removed, your original code would be O(N**2).)

ACK regarding the comment on O(N^2). The reason i implemented it like
this is that i didn't want to regenerate the list at every iteration
of the loop (maybe i'm unduly worried though...). The attached patch
does as you suggest but only in case there are cancelled handlers.

> PS. If you want to set up a mailing list or other cleverness I can set
> you up as a project admin. (I currently have all patches mailed to me
> but we may want to set up a separate list for that.)

I'm happy to be an admin and set up a Google Groups for this. On the
other hand, tulip is supposed to become part of the standard library,
right? Maybe python-dev is as good a place to discuss tulip? Your
call..

I'll go ahead and commit the two trivial patches, and wait for your
ACK on the updated versions of the other two.

Regards,
Geert
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-run-fd-callbacks-v2.patch
Type: application/octet-stream
Size: 3151 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/8126bb79/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-call-every-iteration-v2.patch
Type: application/octet-stream
Size: 2791 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/8126bb79/attachment-0001.obj>

From jonathan at slenders.be  Fri Dec 21 23:26:09 2012
From: jonathan at slenders.be (Jonathan Slenders)
Date: Fri, 21 Dec 2012 23:26:09 +0100
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
	<CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
	<CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>
Message-ID: <CAKfyG3x54b2CSbzfgC0Hdxs5_q8kcZ0LCvcwEfv=mOyO12nX2Q@mail.gmail.com>

As far as I understand, "yield from" will always work, because a Future
object can act like an iterator, and you can delegate your own generator to
this iterator at the place of "yield from".
"yield" only works if the parameter behind yield is already a Future
object. Right Guido?

In case of sleep, sleep could be implemented to return a Future object.




2012/12/21 Laurens Van Houtven <_ at lvh.cc>

> A generic comment on yield from APIs that I'm sure has been discussed in
> some e-mail I missed: is there an obvious way to know up front whether
> something needs to be yielded or yield frommed? In twisted, which is what
> I'm used to it's all deferreds; but here a future's yield from but sleep's
> yield?
>
>
>
> On Fri, Dec 21, 2012 at 8:09 PM, Guido van Rossum <guido at python.org>wrote:
>
>> On Fri, Dec 21, 2012 at 11:06 AM, Jesse Noller <jnoller at gmail.com> wrote:
>> > I really do like tulip as the name. It's quite pretty.
>>
>> I chose it because Twisted and Tornado both start with T. But those
>> have kind of dark associations; I wanted to offset that with something
>> lighter. (OTOH we could use a black tulip as a logo. :-)
>>
>> Regardless, it's not the kind of name we tend to use for the stdlib.
>> It'll probably end up being asynclib or something...
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>
>
>
> --
> cheers
> lvh
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>

From jonathan at slenders.be  Fri Dec 21 23:21:05 2012
From: jonathan at slenders.be (Jonathan Slenders)
Date: Fri, 21 Dec 2012 23:21:05 +0100
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAKfyG3xMPkNLY1Y9Q3XsLo7=2aAj1wtz2q-PzgcW=92eSo6GMA@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
	<CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
	<CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>
	<CAP7+vJ+EPq3hYyfYMfBRAOV957jTk3-ZMzOTEsMeyCYM=OTd8Q@mail.gmail.com>
	<CAKfyG3xMPkNLY1Y9Q3XsLo7=2aAj1wtz2q-PzgcW=92eSo6GMA@mail.gmail.com>
Message-ID: <CAKfyG3yxCaVchv_6r9KvGyKmZ6BCP8Y1mbDVgGQ+vgKkMD_X_A@mail.gmail.com>

Just read through the PEP3156.
It's interesting to see. (I had no idea that yield from would return the
result of the generator. It's clever, given that at this point it behaves
different than a normal 'yield'.)

One question. Why does @coroutine not convert the generator into a Future
object right away?
Just like @defer.inlineCallbacks in Twisted. This has the advantage that
calling the function would simply start the coroutine.

The point of my 'await' experiment was that I could do the following:


>>> def do_something():
>>>    result = await "query" # Query could be a Task object.
>>>    return result

>>> do_something()
Task('do_something')

# (And there it starts executing)



It's very personal, but I find it nicer to see the name of the called
function as a Future instead of seeing a generator. Technically, coroutines
and generators may be the same, but normally you wouldn't write a for-loop
over a coroutine, and you can't make a Future of -say- an
xrange-generator. And when not calling from another coroutine (like from
the global scope during start-up), it's also a little more work to turn the
generator into a Future every time.

Here, "await" does what "yield" does. If you automatically turn coroutines
into a Future object when calling, you'll never need a "yield from" in this
case. I agree that "await" would be redundant, but somehow, if we had a
hint to the interpreter that it would turn generator functions into Future
objects during calling, that would be nice.

I'm happy to get convinced otherwise. :)

Jonathan


2012/12/21 Jonathan Slenders <jonathan at slenders.be>

> Thank you, Guido! I didn't know about this PEP, but it looks interesting.
> I'll try to find some spare time this weekend to read through the PEP,
> maybe giving some feedback.
>
> Cheers!
>
>
>
> 2012/12/21 Guido van Rossum <guido at python.org>
>
>> On Thu, Dec 20, 2012 at 3:34 PM, Jonathan Slenders <jonathan at slenders.be>
>> wrote:
>> > So, the difference is still that the "await" proposal makes the @async
>> > decorator implicit. I'm still in favor of this because in asynchronous
>> code,
>> > you can have really many functions with this decorator. And if someone
>> > forgets about that, getting a generator object instead of a Future is
>> quite
>> > different in semantics.
>>
>> Carefully read PEP 3156, and the tulip implementation:
>> http://code.google.com/p/tulip/source/browse/tulip/tasks.py . The
>> @coroutine decorator is technically redundant when you use yield from.
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>>
>
>

From ncoghlan at gmail.com  Sat Dec 22 00:07:51 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Dec 2012 09:07:51 +1000
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
	<CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
Message-ID: <CADiSq7dTi_QgH-JNU4VyA7m4e-=o9T9UXmuhx3MMLtRj3HvxBQ@mail.gmail.com>

We were tentatively calling it "concurrent.eventloop" at the 2011 language
summit.

--
Sent from my phone, thus the relative brevity :)

From guido at python.org  Sat Dec 22 01:45:53 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 16:45:53 -0800
Subject: [Python-ideas] Tulip patches
In-Reply-To: <CADbA=FWxVxbM+EjnjqGkcah6Y0isv6bT-8yATUetOs410ETgfQ@mail.gmail.com>
References: <CADbA=FVwNDqgPOHAa_zwnzdiBnw=PwdyXW6ijyVQTLd+oQioxw@mail.gmail.com>
	<CAP7+vJ+vny5oipaTDu_-cPfzEYy+YTvXx-HPeoTR7+4jR3D9Ow@mail.gmail.com>
	<CADbA=FWxVxbM+EjnjqGkcah6Y0isv6bT-8yATUetOs410ETgfQ@mail.gmail.com>
Message-ID: <CAP7+vJ+FSj931EXbA5VciLmdZ3wnFXz3+vJ+2+_GXFDxQ0wyyQ@mail.gmail.com>

On Fri, Dec 21, 2012 at 1:59 PM, Geert Jansen <geertj at gmail.com> wrote:
> On Fri, Dec 21, 2012 at 4:57 PM, Guido van Rossum <guido at python.org> wrote:
>
>> This is a fine place, but you would make my life even easier by
>> uploading the patches to codereview.appspot.com, so I can review them
>> and send comments in-line.
>
> I tried to get Tulip added as a new repository there, but i'm probably
> doing something wrong.. In the mean time i'm sending my updated
> patches below..

Yeah, sorry, the upload form is not to be used. You should use the
upload.py utility instead:
https://codereview.appspot.com/static/upload.py

>> I've given you checkin permissions. Please send a contributor form to
>> the PSF (http://www.python.org/psf/contrib/contrib-form/).
>
> Done!
>
>>> 0001-run-fd-callbacks.patch
> [...]
>> Interesting. Go ahead and submit.
> [from your other email]
>> Whoa! I just figured out the problem. You don't have to run the ready
>> queue twice. You just have to set the poll timeout to 0 if there's
>> anything in the ready queue. Please send me an updated patch before
>> submitting.
>
> New patch attached.

Looks good to me. Check it in!

>>> 0002-call-every-iteration.patch
> [...]
>> There's one odd thing here: you remove cancelled everytime handlers
>> *after* already scheduling them. It would seem to make more sense to
>> schedule them first. Also, a faster way to do this would be
>>
>>     self._everytime = [handler for handler in self._everytime if not handler.cancelled]
>>
>> (Even if you iterate from the back, remove() is still O(N), so if half
>> the handlers are to be removed, your original code would be O(N**2).)
>
> ACK regarding the comment on O(N^2). The reason I implemented it like
> this is that I didn't want to regenerate the list at every iteration
> of the loop (maybe I'm unduly worried though...). The attached patch
> does as you suggest, but only in case there are cancelled handlers.

LG, except:

- Maybe rename 'cancelled' to 'any_cancelled'.
- PEP 8 conformance: [foo bar], not [ foo bar ].

You can check it in after fixing those issues.
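For illustration, the pattern being agreed on here can be sketched roughly like this (the `Handler` and `EventLoop` names and the `_everytime` attribute follow the discussion, but this is a simplified stand-in, not tulip's actual code):

```python
class Handler:
    """Simplified stand-in for tulip's event handler object."""

    def __init__(self, callback):
        self.callback = callback
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class EventLoop:
    """Minimal loop fragment showing the 'run every iteration' list."""

    def __init__(self):
        self._everytime = []

    def _run_everytime_handlers(self):
        any_cancelled = False
        for handler in self._everytime:
            if handler.cancelled:
                any_cancelled = True
            else:
                handler.callback()
        # Rebuild the list in a single O(N) pass, but only when
        # something was actually cancelled: this avoids both the
        # O(N**2) cost of repeated remove() calls and the cost of
        # rebuilding the list on every loop iteration.
        if any_cancelled:
            self._everytime = [h for h in self._everytime
                               if not h.cancelled]
```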

>> PS. If you want to set up a mailing list or other cleverness I can set
>> you up as a project admin. (I currently have all patches mailed to me
>> but we may want to set up a separate list for that.)
>
> I'm happy to be an admin and set up a Google Groups for this.

Made you an admin. Go ahead.

> On the
> other hand, tulip is supposed to become part of the standard library,
> right? Maybe python-dev is as good a place as any to discuss tulip?
> Your call..

I think it's too soon to flood python-dev with every little detail
(though I just did post there about the PEP).

> I'll go ahead and commit the two trivial patches, and wait for your
> ACK on the updated versions of the other two.

Thanks!

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Dec 22 01:50:06 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 16:50:06 -0800
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAKfyG3yxCaVchv_6r9KvGyKmZ6BCP8Y1mbDVgGQ+vgKkMD_X_A@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
	<CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
	<CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>
	<CAP7+vJ+EPq3hYyfYMfBRAOV957jTk3-ZMzOTEsMeyCYM=OTd8Q@mail.gmail.com>
	<CAKfyG3xMPkNLY1Y9Q3XsLo7=2aAj1wtz2q-PzgcW=92eSo6GMA@mail.gmail.com>
	<CAKfyG3yxCaVchv_6r9KvGyKmZ6BCP8Y1mbDVgGQ+vgKkMD_X_A@mail.gmail.com>
Message-ID: <CAP7+vJJkTkCF=7SN-GKw5oCmofMk1jNL_fA-QsF51O04-Q2Pkw@mail.gmail.com>

On Fri, Dec 21, 2012 at 2:21 PM, Jonathan Slenders <jonathan at slenders.be> wrote:
> Just read through the PEP3156.
> It's interesting to see. (I had no idea that yield from would return the
> result of the generator. It's clever, given that at this point it behaves
> differently than a normal 'yield'.)
>
> One question. Why does @coroutine not convert the generator into a Future
> object right away?

Because once it is a Future, the scheduler has to get involved every
time it yields, even if the yield doesn't do any I/O but just
transfers control to a "subroutine". This is hard to get your head
around, but it is worth it.

> Just like @defer.inlineCallbacks in Twisted. This has the advantage that
> calling the function would simply start the coroutine.

But it would be much slower.

> The point of my 'await' experiment was that I could do the following:
>
>
>>>> def do_something():
>>>>    result = await "query" # Query could be a Task object.
>>>>    return result
>
>>>> do_something()
> Task('do_something')
>
> # (And there it starts executing)

Yeah, and the same works with yield from in Tulip. The @coroutine
decorator is not needed.

> It's very personal, but I find it nicer to see the name of the called
> function as a Future instead of seeing a generator. Technically, coroutines
> and generators may be the same, but normally you wouldn't write a for-loop
> over a coroutine, and you can't make a Future of -say- an xrange-generator.
> And when not calling from another coroutine (like from the global scope
> during start-up), it's also a little more work to turn the generator into a
> Future every time.
>
> Here, "await" does what "yield" does. If you automatically turn coroutines
> into a Future object when calling, you'll never need a "yield from" in this
> case. I agree that "await" would be redundant, but somehow, if we had a hint
> to the interpreter that it would turn generator functions into Future
> objects during calling, that would be nice.
>
> I'm happy to get convinced otherwise. :)

It's water under the bridge. We have PEP 380 in Python 3.3. I don't
want to change the language again in 3.4. Maybe after that we can
reconsider.

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Dec 22 02:02:09 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 17:02:09 -0800
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
	<CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
	<CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>
Message-ID: <CAP7+vJKzDC616V5LOSEV7=oLQ=96YiCNR1Xua46n3T0v09Zh5g@mail.gmail.com>

On Fri, Dec 21, 2012 at 1:04 PM, Laurens Van Houtven <_ at lvh.cc> wrote:
> Looks reasonable to me :) Comments:
>
> create_transport "combines" a transport and a protocol. Is that process
> reversible? That might seem like an exotic thing (and I guess it kind of
> is), but I've wanted this e.g for websockets, and I guess there's a few
> other cases where it could be useful :)

If you really need this, it's probably best to start out doing this as
a nonstandard extension of an implementation. The current
*implementation* makes it simple enough, but I don't think it's worth
complicating the PEP. Working code might convince me otherwise.

> eof_received on protocols seems unusual. What's the rationale?

Well how else would you indicate that the other end did a half-close
(in Twisted terminology)? You can't call connection_lost() because you
might still want to write more. E.g. this is how HTTP servers work if
there's no Content-length or chunked encoding on a request body: they
read until EOF, then do their thing and write the response.

> I know we disagree that callbacks (of the line_received variety) are a good
> idea for blocking IO (I think we should have universal protocol
> implementations), but can we agree that they're what we want for tulip? If
> so, I can try to figure out a way to get them to fit together :) I'm
> assuming that this means you'd like protocols and transports in this PEP?

Sorry, I have no idea what you're talking about. Can you clarify?

I do know that the PEP is weakest in specifying how a coroutine can
implement a transport. However, my plans are clear: in the old tulip
code there's a BufferedReader; somehow the coroutine will receive a
"stdin" and a "stdout", where "stdin" is a BufferedReader, which
has methods like read(), readline() etc. that return Futures and must
be invoked using yield from; and "stdout" is a transport, which has
write() and friends that don't return anything but just buffer stuff
and start the I/O asynchronously (and may try to slow down the protocol
by calling its pause() method).

> A generic comment on yield from APIs that I'm sure has been discussed in
> some e-mail I missed: is there an obvious way to know up front whether
> something needs to be yielded or yield frommed? In Twisted, which is what
> I'm used to, it's all Deferreds; but here a Future gets "yield from" while
> sleep gets a bare "yield"?

In PEP 3156 conformant code you're supposed always to use 'yield
from'. The only time you see a bare yield is when it's part of the
implementation's internals. (However I think tulip actually will
handle a yield the same way as a yield from, except that it's slower
because it makes a roundtrip to the scheduler, a.k.a. trampoline.)

> Will comment more as I keep reading I'm sure :)

Please do!

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Dec 22 02:03:26 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 17:03:26 -0800
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAKfyG3x54b2CSbzfgC0Hdxs5_q8kcZ0LCvcwEfv=mOyO12nX2Q@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
	<CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
	<CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>
	<CAKfyG3x54b2CSbzfgC0Hdxs5_q8kcZ0LCvcwEfv=mOyO12nX2Q@mail.gmail.com>
Message-ID: <CAP7+vJLf8=UE7PusHFL2CDP8j_JAze+OvWHYVKxAvn+vA-YXEg@mail.gmail.com>

On Fri, Dec 21, 2012 at 2:26 PM, Jonathan Slenders <jonathan at slenders.be> wrote:
> As far as I understand, "yield from" will always work, because a Future
> object can act like an iterator, and you can delegate your own generator to
> this iterator at the place of "yield from".
> "yield" only works if the parameter behind yield is already a Future object.
> Right Guido?

Correct! Sounds like you got it now.

That's the magic of yield from.

> In case of sleep, sleep could be implemented to return a Future object.

It does; in tulip/futures.py:

def sleep(when, result=None):
    future = Future()
    future._event_loop.call_later(when, future.set_result, result)
    return future
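The reason "yield from future" works at all is that a Future can act as an iterator. A minimal illustrative sketch of that mechanism, not tulip's actual implementation:

```python
class Future:
    """Stripped-down future illustrating why 'yield from future'
    works: __iter__ yields the future itself (handing control to
    the scheduler), then returns the result once one is set."""

    _UNSET = object()

    def __init__(self):
        self._result = Future._UNSET

    def done(self):
        return self._result is not Future._UNSET

    def set_result(self, result):
        self._result = result

    def result(self):
        return self._result

    def __iter__(self):
        if not self.done():
            # Suspend here; the scheduler resumes us after
            # set_result() has been called.
            yield self
        # PEP 380: this becomes the value of "yield from future".
        return self.result()
```

A coroutine doing `result = yield from future` first yields the future out to the scheduler, then resumes and receives the result once `set_result()` has been called; if the future is already done, no yield happens at all.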

-- 
--Guido van Rossum (python.org/~guido)


From jstpierre at mecheye.net  Sat Dec 22 02:13:48 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Fri, 21 Dec 2012 20:13:48 -0500
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAP7+vJJkTkCF=7SN-GKw5oCmofMk1jNL_fA-QsF51O04-Q2Pkw@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
	<CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
	<CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>
	<CAP7+vJ+EPq3hYyfYMfBRAOV957jTk3-ZMzOTEsMeyCYM=OTd8Q@mail.gmail.com>
	<CAKfyG3xMPkNLY1Y9Q3XsLo7=2aAj1wtz2q-PzgcW=92eSo6GMA@mail.gmail.com>
	<CAKfyG3yxCaVchv_6r9KvGyKmZ6BCP8Y1mbDVgGQ+vgKkMD_X_A@mail.gmail.com>
	<CAP7+vJJkTkCF=7SN-GKw5oCmofMk1jNL_fA-QsF51O04-Q2Pkw@mail.gmail.com>
Message-ID: <CAA0H+QQUv30da390D8JRqYWWbrSFk9V+iDs68D3qQRJ4T-qkXw@mail.gmail.com>

On Fri, Dec 21, 2012 at 7:50 PM, Guido van Rossum <guido at python.org> wrote:

>
> It's water under the bridge. We have PEP 380 in Python 3.3. I don't
> want to change the language again in 3.4. Maybe after that we can
> reconsider.


One thing I'll say is that I think the coroutine decorator should convert
something like:

    @coroutine
    def blah():
        return "result"

into the generator equivalent. You can do a syntax hack with:

    @coroutine
    def blah():
        if 0: yield
        return "result"

but that feels bad. This sort of bug may seem unlikely, but a user may hit
it if they're commenting out code.

Maybe a generic @force_generator decorator might be useful...
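The conversion being suggested might be sketched like this (hypothetical: this `coroutine` is a stand-in written for illustration, not tulip's decorator):

```python
import functools
import inspect


def coroutine(func):
    """If func is already a generator function, pass it through
    unchanged. Otherwise wrap it so that calling it still produces
    a generator whose return value is func's result."""
    if inspect.isgeneratorfunction(func):
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Note: the wrapped body runs eagerly here, not on first
        # next(); acceptable for a sketch, but a visible difference
        # from real generator semantics.
        result = func(*args, **kwargs)

        def _gen():
            return result
            yield  # unreachable; its presence makes _gen a generator

        return _gen()

    return wrapper
```

With this, `def blah(): return "result"` decorated by `@coroutine` behaves like a PEP 380 generator: `next(blah())` raises `StopIteration("result")`, so `yield from blah()` produces `"result"`.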

-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/c521d8f8/attachment.html>

From guido at python.org  Sat Dec 22 02:16:12 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 17:16:12 -0800
Subject: [Python-ideas] An async facade?
In-Reply-To: <CAA0H+QQUv30da390D8JRqYWWbrSFk9V+iDs68D3qQRJ4T-qkXw@mail.gmail.com>
References: <CAKfyG3w9wPmmW4a=e8y2bJQ5CTv5zt3x21E7P3B+s=H4uGwt6w@mail.gmail.com>
	<CAA0H+QQfGY5ao-=be2S9jJENgmoM6EHRJgt4AYptY-j8oSLfQA@mail.gmail.com>
	<CAKfyG3wESUXfFuauavx8qt3f=jL0hd22=9sTjpx8CnBRY+zOgg@mail.gmail.com>
	<CAP7+vJ+EPq3hYyfYMfBRAOV957jTk3-ZMzOTEsMeyCYM=OTd8Q@mail.gmail.com>
	<CAKfyG3xMPkNLY1Y9Q3XsLo7=2aAj1wtz2q-PzgcW=92eSo6GMA@mail.gmail.com>
	<CAKfyG3yxCaVchv_6r9KvGyKmZ6BCP8Y1mbDVgGQ+vgKkMD_X_A@mail.gmail.com>
	<CAP7+vJJkTkCF=7SN-GKw5oCmofMk1jNL_fA-QsF51O04-Q2Pkw@mail.gmail.com>
	<CAA0H+QQUv30da390D8JRqYWWbrSFk9V+iDs68D3qQRJ4T-qkXw@mail.gmail.com>
Message-ID: <CAP7+vJ+46rF=6yxJKe5Bz4-GWKmjxg0ywBeDrkyn8iofXPrJhw@mail.gmail.com>

On Fri, Dec 21, 2012 at 5:13 PM, Jasper St. Pierre
<jstpierre at mecheye.net> wrote:
>
>
> On Fri, Dec 21, 2012 at 7:50 PM, Guido van Rossum <guido at python.org> wrote:
>>
>>
>> It's water under the bridge. We have PEP 380 in Python 3.3. I don't
>> want to change the language again in 3.4. Maybe after that we can
>> reconsider.
>
>
> One thing I'll say is that I think the coroutine decorator should convert
> something like:
>
>     @coroutine
>     def blah():
>         return "result"
>
> into the generator equivalent.

There's a tiny part of me that says that this might hide some bugs.
But mostly I agree and not doing it might make certain changes harder.
I did this in NDB too.

> You can do a syntax hack with:
>
>     @coroutine
>     def blah():
>         if 0: yield
>         return "result"
>
> but that feels bad. This sort of bug may seem unlikely, but a user may hit
> it if they're commenting out code.

Right.

> Maybe a generic @force_generator decorator might be useful...

That would be somebody else's PEP. :-)

-- 
--Guido van Rossum (python.org/~guido)


From jstpierre at mecheye.net  Sat Dec 22 02:17:16 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Fri, 21 Dec 2012 20:17:16 -0500
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAP7+vJKzDC616V5LOSEV7=oLQ=96YiCNR1Xua46n3T0v09Zh5g@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
	<CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
	<CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>
	<CAP7+vJKzDC616V5LOSEV7=oLQ=96YiCNR1Xua46n3T0v09Zh5g@mail.gmail.com>
Message-ID: <CAA0H+QS4P7GytLNJNXZoCEzccqOhdkiR20TAntWeKZHuUqNd4Q@mail.gmail.com>

On Fri, Dec 21, 2012 at 8:02 PM, Guido van Rossum <guido at python.org> wrote:

... snip ...

In PEP 3156 conformant code you're supposed always to use 'yield
> from'. The only time you see a bare yield is when it's part of the
> implementation's internals. (However I think tulip actually will
> handle a yield the same way as a yield from, except that it's slower
> because it makes a roundtrip to the scheduler, a.k.a. trampoline.)
>

Would it be possible to fail on "yield"? Silently being slower when you
forget to type a keyword is something I can imagine will crop up a lot by
mistake, and I don't think it's a good idea to silently be slower when the
only difference is five more characters.

> Will comment more as I keep reading I'm sure :)
>
> Please do!
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121221/fcc73cd9/attachment.html>

From guido at python.org  Sat Dec 22 02:24:15 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 17:24:15 -0800
Subject: [Python-ideas] [Python-Dev] PEP 3156 - Asynchronous IO Support
	Rebooted
In-Reply-To: <CAA0H+QS4P7GytLNJNXZoCEzccqOhdkiR20TAntWeKZHuUqNd4Q@mail.gmail.com>
References: <CAP7+vJLrbi0jJkQe6f+MLWv2WatO4FmGJWs28TrkfcpXfSE4vQ@mail.gmail.com>
	<FADD4950E0EA483BA34DEE1C1BCFBB14@gmail.com>
	<CAP7+vJLGstFdYqrzwfb2V1WZKTz0s67UY6kmnSnO+nZ0sfL=2g@mail.gmail.com>
	<CAE_Hg6ahOSXipk=fwpiFgaB3vEQSOnW4T8g5EpLYH1fe6rZypw@mail.gmail.com>
	<CAP7+vJKzDC616V5LOSEV7=oLQ=96YiCNR1Xua46n3T0v09Zh5g@mail.gmail.com>
	<CAA0H+QS4P7GytLNJNXZoCEzccqOhdkiR20TAntWeKZHuUqNd4Q@mail.gmail.com>
Message-ID: <CAP7+vJKHO=iuQb8pPO67Eug3d1X6N_n-UVOM_uy0d7O2mP2mdw@mail.gmail.com>

On Fri, Dec 21, 2012 at 5:17 PM, Jasper St. Pierre
<jstpierre at mecheye.net> wrote:
> On Fri, Dec 21, 2012 at 8:02 PM, Guido van Rossum <guido at python.org> wrote:
>
> ... snip ...
>
>> In PEP 3156 conformant code you're supposed always to use 'yield
>> from'. The only time you see a bare yield is when it's part of the
>> implementation's internals. (However I think tulip actually will
>> handle a yield the same way as a yield from, except that it's slower
>> because it makes a roundtrip to the scheduler, a.k.a. trampoline.)
>
>
> Would it be possible to fail on "yield"? Silently being slower when you
> forget to type a keyword is something I can imagine will crop up a lot by
> mistake, and I don't think it's a good idea to silently be slower when the
> only difference is five more characters.

That's also a possibility. If someone can figure out a patch that
would be great.

-- 
--Guido van Rossum (python.org/~guido)


From ncoghlan at gmail.com  Sat Dec 22 05:46:39 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Dec 2012 14:46:39 +1000
Subject: [Python-ideas] PEP 3156 feedback: wait_one vs par vs
	concurrent.futures.wait
Message-ID: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>

I figure python-ideas is still the best place for PEP 3156 feedback -
I think it's being revised too heavily for in-depth discussion on
python-dev to be a good idea, and I think spinning out a separate list
would lose too many people that are
interested-but-not-enough-to-subscribe-to-yet-another-mailing-list
(including me).

The current draft of the PEP suggests the use of par() for the barrier
operation (waiting for all futures and coroutines in a collection to
be ready), while tentatively suggesting wait_one() as the API for
waiting for the first completed operation in a collection. That
inconsistency is questionable all by itself, but there's a greater
stdlib-level inconsistency that I find more concerning.

The corresponding blocking API in concurrent.futures is the module
level "wait" function, which accepts a "return_when" parameter, with
the permitted values FIRST_COMPLETED, FIRST_EXCEPTION and
ALL_COMPLETED (the default). In the case where everything succeeds,
FIRST_EXCEPTION is the same as ALL_COMPLETED. This function also
accepts a timeout which allows the operation to finish early if the
operations take too long.

This flexibility also leads to a difference in the structure of the
return type: concurrent.futures.wait always returns a pair of sets,
with the first set being those futures which completed, while the
second contains those which remained incomplete at the time the call
returned.
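For reference, the blocking API being described behaves like this (runnable against the stdlib as it stands; the work() function is just a toy illustration):

```python
import concurrent.futures as cf
import time


def work(n):
    """Toy task: sleep briefly, then return its argument."""
    time.sleep(n / 20.0)
    return n


with cf.ThreadPoolExecutor(max_workers=3) as pool:
    fs = [pool.submit(work, n) for n in (1, 2, 3)]

    # return_when=FIRST_COMPLETED returns as soon as anything is done.
    done, not_done = cf.wait(fs, return_when=cf.FIRST_COMPLETED)
    assert len(done) >= 1

    # The default is ALL_COMPLETED; the result is always a pair of sets.
    done, not_done = cf.wait(fs)
    assert not not_done
    results = sorted(f.result() for f in done)
```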

It seems to me that this "wait" API can be applied directly to the
equivalent problems in the async space, and, accordingly, *should* be
applied so that the synchronous and asynchronous APIs remain as
consistent as possible.

The low level equivalent to par() would be:

    incomplete = <tasks, futures or coroutines>
    complete, incomplete = yield from tulip.wait(incomplete)
    assert not incomplete # Without a timeout, everything should complete
    for f in complete:
        # Handle the completed operations

Limiting the maximum execution time of any task to 10 seconds is
straightforward:

    incomplete = <tasks, futures or coroutines>
    complete, incomplete = yield from tulip.wait(incomplete, timeout=10)
    for f in incomplete:
        f.cancel() # Took too long, kill it
    for f in complete:
        # Handle the completed operations

The low level equivalent to the wait_one() example would become:

    incomplete = <tasks, futures or coroutines>
    while incomplete:
        complete, incomplete = yield from tulip.wait(
            incomplete, return_when=FIRST_COMPLETED)
        for f in complete:
            # Handle the completed operations

par() becomes easy to define as a coroutine:

    @coroutine
    def par(fs):
        complete, incomplete = yield from tulip.wait(
            fs, return_when=FIRST_EXCEPTION)
        for f in incomplete:
            f.cancel() # Something must have failed, so cancel the rest
        # If something failed, calling f.result() will raise that exception
        return [f.result() for f in complete]

Defining wait_one() is also straightforward (although it isn't clearly
superior to just using the underlying API directly):

    @coroutine
    def wait_one(fs):
        complete, incomplete = yield from tulip.wait(
            fs, return_when=FIRST_COMPLETED)
        return complete.pop()

The async equivalent to "as_completed" under this scheme is far more
interesting, as it would be an iterator that produces coroutines:

    def as_completed(fs):
        incomplete = fs
        complete = set()  # rebound via the nonlocal below
        while incomplete:
            # Phase 1 of the loop: yield a coroutine that actually
            # starts operations running
            @coroutine
            def _wait_for_some():
                nonlocal complete, incomplete
                complete, incomplete = yield from tulip.wait(
                    incomplete, return_when=FIRST_COMPLETED)
                return complete.pop().result()
            yield _wait_for_some()
            # Phase 2 of the loop: pass back the already complete operations
            while complete:
                # Note this use case for @coroutine *forcing* objects
                # to behave like a generator, as well as exploiting
                # the ability to avoid trips around the event loop
                @coroutine
                def _next_result():
                    return complete.pop().result()
                yield _next_result()

    # This is almost as easy to use as the synchronous equivalent, the only
    # difference is the use of "yield from f" instead of the
    # synchronous "f.result()"
    for f in as_completed(fs):
        next = yield from f

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From guido at python.org  Sat Dec 22 06:17:07 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 21:17:07 -0800
Subject: [Python-ideas] PEP 3156 feedback: wait_one vs par vs
	concurrent.futures.wait
In-Reply-To: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>
References: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>
Message-ID: <CAP7+vJJqydN=Rg8Bksn5b7+iEnXKRiqmoOSuAgY6OQ3+KK1y3Q@mail.gmail.com>

On Fri, Dec 21, 2012 at 8:46 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I figure python-ideas is still the best place for PEP 3156 feedback -
> I think it's being revised too heavily for in-depth discussion on
> python-dev to be a good idea, and I think spinning out a separate list
> would lose too many people that are
> interested-but-not-enough-to-subscribe-to-yet-another-mailing-list
> (including me).
>
> The current draft of the PEP suggests the use of par() for the barrier
> operation (waiting for all futures and coroutines in a collection to
> be ready), while tentatively suggesting wait_one() as the API for
> waiting for the first completed operation in a collection. That
> inconsistency is questionable all by itself, but there's a greater
> stdlib level inconsistency that I find more concerning
>
> The corresponding blocking API in concurrent.futures is the module
> level "wait" function, which accepts a "return_when" parameter, with
> the permitted values FIRST_COMPLETED, FIRST_EXCEPTION and
> ALL_COMPLETED (the default). In the case where everything succeeds,
> FIRST_EXCEPTION is the same as ALL_COMPLETED. This function also
> accepts a timeout which allows the operation to finish early if the
> operations take too long.
>
> This flexibility also leads to a difference in the structure of the
> return type: concurrent.futures.wait always returns a pair of sets,
> with the first set being those futures which completed, while the
> second contains those which remaining incomplete at the time the call
> returned.
>
> It seems to me that this "wait" API can be applied directly to the
> equivalent problems in the async space, and, accordingly, *should* be
> applied so that the synchronous and asynchronous APIs remain as
> consistent as possible.

You've convinced me. I've never used the wait() and as_completed()
APIs in c.f, but you're right that with the exception of requiring
'yield from' they can be carried over exactly, and given that we're
doing the same thing with Future, this is eminently reasonable.

I may not get to implementing these for two weeks (I'll be traveling
without a computer) but they will not be forgotten.

--Guido

> The low level equivalent to par() would be:
>
>     incomplete = <tasks, futures or coroutines>
>     complete, incomplete = yield from tulip.wait(incomplete)
>     assert not incomplete # Without a timeout, everything should complete
>     for f in complete:
>         # Handle the completed operations
>
> Limiting the maximum execution time of any task to 10 seconds is
> straightforward:
>
>     incomplete = <tasks, futures or coroutines>
>     complete, incomplete = yield from tulip.wait(incomplete, timeout=10)
>     for f in incomplete:
>         f.cancel() # Took too long, kill it
>     for f in complete:
>         # Handle the completed operations
>
> The low level equivalent to the wait_one() example would become:
>
>     incomplete = <tasks, futures or coroutines>
>     while incomplete:
>         complete, incomplete = yield from tulip.wait(incomplete,
> return_when=FIRST_COMPLETED)
>         for f in complete:
>             # Handle the completed operations
>
> par() becomes easy to define as a coroutine:
>
>     @coroutine
>     def par(fs):
>         complete, incomplete = yield from tulip.wait(fs,
> return_when=FIRST_EXCEPTION)
>         for f in incomplete:
>             f.cancel() # Something must have failed, so cancel the rest
>         # If something failed, calling f.result() will raise that exception
>         return [f.result() for f in complete]
>
> Defining wait_one() is also straightforward (although it isn't clearly
> superior to just
> using the underlying API directly):
>
>     @coroutine
>     def wait_one(fs):
>         complete, incomplete = yield from tulip.wait(fs,
> return_when=FIRST_COMPLETED)
>         return complete.pop()
>
> The async equivalent to "as_completed" under this scheme is far more
> interesting, as it would be an iterator that produces coroutines:
>
>     def as_completed(fs):
>         incomplete = fs
>         while incomplete:
>             # Phase 1 of the loop, we yield a coroutine that actually
> starts operations running
>             @coroutine
>             def _wait_for_some():
>                 nonlocal complete, incomplete
>                 complete, incomplete = yield from tulip.wait(fs,
> return_when=FIRST_COMPLETED)
>                 return complete.pop().result()
>             yield _wait_for_some()
>             # Phase 2 of the loop, we pass back the already complete operations
>             while complete:
>                 # Note this use case for @coroutine *forcing* objects
> to behave like a generator,
>                 # as well as exploiting the ability to avoid trips
> around the event loop
>                 @coroutine
>                 def _next_result():
>                     return complete.pop().result()
>                 yield _next_result()
>
>     # This is almost as easy to use as the synchronous equivalent, the
> only difference
>     # is the use of "yield from f" instead of the synchronous "f.result()"
>     for f in as_completed(fs):
>         next = yield from f
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Dec 22 07:20:12 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 22:20:12 -0800
Subject: [Python-ideas] PEP 3156 feedback: wait_one vs par vs
	concurrent.futures.wait
In-Reply-To: <CAP7+vJJqydN=Rg8Bksn5b7+iEnXKRiqmoOSuAgY6OQ3+KK1y3Q@mail.gmail.com>
References: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>
	<CAP7+vJJqydN=Rg8Bksn5b7+iEnXKRiqmoOSuAgY6OQ3+KK1y3Q@mail.gmail.com>
Message-ID: <CAP7+vJJ-FnugRuPi1oMZM+=Z8PdPEp0vx56oEhsZXqdHotppkA@mail.gmail.com>

On Fri, Dec 21, 2012 at 9:17 PM, Guido van Rossum <guido at python.org> wrote:
> On Fri, Dec 21, 2012 at 8:46 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> I figure python-ideas is still the best place for PEP 3156 feedback -
>> I think it's being revised too heavily for in-depth discussion on
>> python-dev to be a good idea, and I think spinning out a separate list
>> would lose too many people that are
>> interested-but-not-enough-to-subscribe-to-yet-another-mailing-list
>> (including me).
>>
>> The current draft of the PEP suggests the use of par() for the barrier
>> operation (waiting for all futures and coroutines in a collection to
>> be ready), while tentatively suggesting wait_one() as the API for
>> waiting for the first completed operation in a collection. That
>> inconsistency is questionable all by itself, but there's a greater
>> stdlib level inconsistency that I find more concerning
>>
>> The corresponding blocking API in concurrent.futures is the module
>> level "wait" function, which accepts a "return_when" parameter, with
>> the permitted values FIRST_COMPLETED, FIRST_EXCEPTION and
>> ALL_COMPLETED (the default). In the case where everything succeeds,
>> FIRST_EXCEPTION is the same as ALL_COMPLETED. This function also
>> accepts a timeout which allows the operation to finish early if the
>> operations take too long.
>>
>> This flexibility also leads to a difference in the structure of the
>> return type: concurrent.futures.wait always returns a pair of sets,
>> with the first set being those futures which completed, while the
>> second contains those which remaining incomplete at the time the call
>> returned.
>>
>> It seems to me that this "wait" API can be applied directly to the
>> equivalent problems in the async space, and, accordingly, *should* be
>> applied so that the synchronous and asynchronous APIs remain as
>> consistent as possible.
>
> You've convinced me. I've never used the wait() and as_completed()
> APIs in c.f, but you're right that with the exception of requiring
> 'yield from' they can be carried over exactly, and given that we're
> doing the same thing with Future, this is eminently reasonable.
>
> I may not get to implementing these for two weeks (I'll be traveling
> without a computer) but they will not be forgotten.

I did update the PEP. There are some questions about details; e.g. I
think the 'fs' argument should allow a mixture of Futures and
coroutines (the latter will be wrapped in Tasks) and the sets returned by
wait() should contain Futures and Tasks. You propose that
as_completed() returns an iterator whose items are coroutines; why not
Futures? (They're more versatile, even if slightly slower than
coroutines.) I can sort of see the reasoning but want to tease out
whether you meant it that way. Also, we can't have __next__() raise
TimeoutError, since it never blocks; the TimeoutError will have to be
raised by the coroutine (or Future) returned by __next__().

> --Guido
>
>> The low level equivalent to par() would be:
>>
>>     incomplete = <tasks, futures or coroutines>
>>     complete, incomplete = yield from tulip.wait(incomplete)
>>     assert not incomplete # Without a timeout, everything should complete
>>     for f in complete:
>>         # Handle the completed operations
>>
>> Limiting the maximum execution time of any task to 10 seconds is
>> straightforward:
>>
>>     incomplete = <tasks, futures or coroutines>
>>     complete, incomplete = yield from tulip.wait(incomplete, timeout=10)
>>     for f in incomplete:
>>         f.cancel() # Took too long, kill it
>>     for f in complete:
>>         # Handle the completed operations
>>
>> The low level equivalent to the wait_one() example would become:
>>
>>     incomplete = <tasks, futures or coroutines>
>>     while incomplete:
>>         complete, incomplete = yield from tulip.wait(incomplete,
>> return_when=FIRST_COMPLETED)
>>         for f in complete:
>>             # Handle the completed operations
>>
>> par() becomes easy to define as a coroutine:
>>
>>     @coroutine
>>     def par(fs):
>>         complete, incomplete = yield from tulip.wait(fs,
>> return_when=FIRST_EXCEPTION)
>>         for f in incomplete:
>>             f.cancel() # Something must have failed, so cancel the rest
>>         # If something failed, calling f.result() will raise that exception
>>         return [f.result() for f in complete]
>>
>> Defining wait_one() is also straightforward (although it isn't clearly
>> superior to just
>> using the underlying API directly):
>>
>>     @coroutine
>>     def wait_one(fs):
>>         complete, incomplete = yield from tulip.wait(
>>             fs, return_when=FIRST_COMPLETED)
>>         return complete.pop()
>>
>> The async equivalent to "as_completed" under this scheme is far more
>> interesting, as it would be an iterator that produces coroutines:
>>
>>     def as_completed(fs):
>>         complete, incomplete = set(), fs
>>         while incomplete:
>>             # Phase 1 of the loop, we yield a coroutine that actually
>>             # starts operations running
>>             @coroutine
>>             def _wait_for_some():
>>                 nonlocal complete, incomplete
>>                 complete, incomplete = yield from tulip.wait(
>>                     incomplete, return_when=FIRST_COMPLETED)
>>                 return complete.pop().result()
>>             yield _wait_for_some()
>>             # Phase 2 of the loop, we pass back the already complete operations
>>             while complete:
>>                 # Note this use case for @coroutine *forcing* objects
>>                 # to behave like a generator, as well as exploiting
>>                 # the ability to avoid trips around the event loop
>>                 @coroutine
>>                 def _next_result():
>>                     return complete.pop().result()
>>                 yield _next_result()
>>
>>     # This is almost as easy to use as the synchronous equivalent; the
>>     # only difference is the use of "yield from f" instead of the
>>     # synchronous "f.result()"
>>     for f in as_completed(fs):
>>         next = yield from f
>>
>> Cheers,
>> Nick.
>>
>> --
>> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
>
>
> --
> --Guido van Rossum (python.org/~guido)



-- 
--Guido van Rossum (python.org/~guido)


From ncoghlan at gmail.com  Sat Dec 22 09:04:58 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Dec 2012 18:04:58 +1000
Subject: [Python-ideas] PEP 3156 feedback: wait_one vs par vs
	concurrent.futures.wait
In-Reply-To: <CAP7+vJJ-FnugRuPi1oMZM+=Z8PdPEp0vx56oEhsZXqdHotppkA@mail.gmail.com>
References: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>
	<CAP7+vJJqydN=Rg8Bksn5b7+iEnXKRiqmoOSuAgY6OQ3+KK1y3Q@mail.gmail.com>
	<CAP7+vJJ-FnugRuPi1oMZM+=Z8PdPEp0vx56oEhsZXqdHotppkA@mail.gmail.com>
Message-ID: <CADiSq7c4mWeVxbHRqsbQAP6-MskE13BRakb1fXXabaMV0odFog@mail.gmail.com>

On Sat, Dec 22, 2012 at 4:20 PM, Guido van Rossum <guido at python.org> wrote:
> I did update the PEP. There are some questions about details; e.g. I
> think the 'fs' argument should allow a mixture of Futures and
> coroutines (the latter will be wrapped in Tasks) and the sets returned by
> wait() should contain Futures and Tasks.

Yes, I think I wrote my examples that way, even though I didn't say
that in the text.

> You propose that
> as_completed() returns an iterator whose items are coroutines; why not
> Futures? (They're more versatile even if slightly slower than
> coroutines.) I can sort of see the reasoning but want to tease out
> whether you meant it that way.

I deliberately chose to return coroutines. My rationale is to be able
to handle the case where multiple operations become ready without
having to make multiple trips around the event loop by having the
iterator switch between two modes: when the complete set is empty, it
yields a coroutine that calls wait and then returns the first complete
future, while when there are already complete futures available, it
yields a coroutine that just returns one of them immediately. It's
really the same rationale as that for having @coroutine not
automatically wrap things in Task - if we can avoid the event loop in
cases that don't actually need to wait for an event, that's a good
thing.

> Also, we can't have __next__() raise
> TimeoutError, since it never blocks; it will have to be the coroutine
> (or Future) returned by __next__().

Yeah, any exceptions should happen at the yield from call inside the
loop. I *think* my implementation achieves that (since the coroutine
instances it creates are passed out to the for loop for further
processing), but it's quite possible I missed something.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Sat Dec 22 10:14:59 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Dec 2012 19:14:59 +1000
Subject: [Python-ideas] Async context managers and iterators with tulip
Message-ID: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>

On Sat, Dec 22, 2012 at 4:17 PM, guido.van.rossum
<python-checkins at python.org> wrote:
> +- We might introduce explicit locks, though these will be a bit of a
> +  pain to use, as we can't use the ``with lock: block`` syntax
> +  (because to wait for a lock we'd have to use ``yield from``, which
> +  the ``with`` statement can't do).

Actually, I just realised that the following can work if the async
lock is defined appropriately:

    with yield from async_lock:
        ...

The secret is that async_lock would need to be a coroutine rather than
a context manager. *Calling* the coroutine would acquire the lock
(potentially registering a callback that is scheduled when the lock is
released) and return a context manager that releases the lock. The
async_lock itself wouldn't be a context manager, so you'd get an
immediate error if you left out the "yield from".
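
A minimal sketch of how such a lock could look, using plain generators and
no real event loop. The Lock class and the hand-driving at the end are
assumptions made for illustration; this is not the tulip API.

```python
import contextlib

class Lock:
    def __init__(self):
        self._held = False

    def acquire(self):
        # A coroutine: a real implementation would yield to the event loop
        # here until the lock is free; this sketch assumes it already is.
        if self._held:
            yield  # placeholder for "wait until released"
        self._held = True
        return self._release_on_exit()

    @contextlib.contextmanager
    def _release_on_exit(self):
        # The context manager handed back by acquire(); the release
        # happens on exit from the with block.
        try:
            yield
        finally:
            self._held = False

def user(lock):
    # Forgetting "yield from" would hand the with statement a bare
    # generator, which has no __enter__, so it fails immediately.
    with (yield from lock.acquire()):
        assert lock._held  # the critical section

lock = Lock()
try:
    next(user(lock))  # drive the coroutine by hand
except StopIteration:
    pass
assert not lock._held  # released by the context manager
```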

We'd be heading even further down the path of
two-languages-for-the-price-of-one if we did that, though (by which I
mean the fact that async code and synchronous code exist in parallel
universes - one, more familiar one, where the ability to block is
assumed, as is the fact that any operation may give concurrent code
the chance to execute, and the universe of Twisted, tulip, et al,
where possible suspension points are required to be explicitly marked
in the function where they occur).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From _ at lvh.cc  Sat Dec 22 13:26:44 2012
From: _ at lvh.cc (Laurens Van Houtven)
Date: Sat, 22 Dec 2012 13:26:44 +0100
Subject: [Python-ideas] Async context managers and iterators with tulip
In-Reply-To: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
Message-ID: <CAE_Hg6bhnF1S_yN9FSZ=7a-ju_Wv_5LQ-D9Ns=Hpoy8WKiUCeQ@mail.gmail.com>

I can't quite tell by the wording if you consider
two-languages-for-the-price-of-one a good thing or a bad thing; but I can
tell you that at least in Twisted, explicit suspension points have been a
definite boon :) While it may lead to issues in some things (e.g. new users
using blocking urllib calls in a callback), I find the net result much
easier to read and reason about.

cheers,
lvh
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121222/d237bac7/attachment.html>

From ncoghlan at gmail.com  Sat Dec 22 13:57:40 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Dec 2012 22:57:40 +1000
Subject: [Python-ideas] Async context managers and iterators with tulip
In-Reply-To: <CAE_Hg6bhnF1S_yN9FSZ=7a-ju_Wv_5LQ-D9Ns=Hpoy8WKiUCeQ@mail.gmail.com>
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
	<CAE_Hg6bhnF1S_yN9FSZ=7a-ju_Wv_5LQ-D9Ns=Hpoy8WKiUCeQ@mail.gmail.com>
Message-ID: <CADiSq7c7CoHQS0n5GqH-+k69V1WwV8qi-5s5rSWXGb2GPyHjhw@mail.gmail.com>

On Sat, Dec 22, 2012 at 10:26 PM, Laurens Van Houtven <_ at lvh.cc> wrote:
> I can't quite tell by the wording if you consider
> two-languages-for-the-price-of-one a good thing or a bad thing; but I can
> tell you that at least in Twisted, explicit suspension points have been a
> definite boon :) While it may lead to issues in some things (e.g. new users
> using blocking urllib calls in a callback), I find the net result much
> easier to read and reason about.

On balance, I consider it better than offering only greenlet-style
implicit switching (which is effectively equivalent to preemptive
threading, since any function call or operator may suspend the task).
I'm also a lot happier about it since realising that the model of
emitting futures and using "yield from f" where synchronous code would
use "f.result()" helps unify the two worlds.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From guido at python.org  Sat Dec 22 16:54:55 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Dec 2012 07:54:55 -0800
Subject: [Python-ideas] PEP 3156 feedback: wait_one vs par vs
	concurrent.futures.wait
In-Reply-To: <CADiSq7c4mWeVxbHRqsbQAP6-MskE13BRakb1fXXabaMV0odFog@mail.gmail.com>
References: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>
	<CAP7+vJJqydN=Rg8Bksn5b7+iEnXKRiqmoOSuAgY6OQ3+KK1y3Q@mail.gmail.com>
	<CAP7+vJJ-FnugRuPi1oMZM+=Z8PdPEp0vx56oEhsZXqdHotppkA@mail.gmail.com>
	<CADiSq7c4mWeVxbHRqsbQAP6-MskE13BRakb1fXXabaMV0odFog@mail.gmail.com>
Message-ID: <CAP7+vJ+cuNHPVbu+G=YyGsSzJOMsX4oCpD0c6mR690tBb4vmoA@mail.gmail.com>

On Sat, Dec 22, 2012 at 12:04 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sat, Dec 22, 2012 at 4:20 PM, Guido van Rossum <guido at python.org> wrote:
>> I did update the PEP. There are some questions about details; e.g. I
>> think the 'fs' argument should allow a mixture of Futures and
>> coroutines (the latter will be wrapped in Tasks) and the sets returned by
>> wait() should contain Futures and Tasks.
>
> Yes, I think I wrote my examples that way, even though I didn't say
> that in the text.

Good.

>> You propose that
>> as_completed() returns an iterator whose items are coroutines; why not
>> Futures? (They're more versatile even if slightly slower than
>> coroutines.) I can sort of see the reasoning but want to tease out
>> whether you meant it that way.
>
> I deliberately chose to return coroutines. My rationale is to be able
> to handle the case where multiple operations become ready without
> having to make multiple trips around the event loop by having the
> iterator switch between two modes: when the complete set is empty, it
> yields a coroutine that calls wait and then returns the first complete
> future, while when there are already complete futures available, it
> yields a coroutine that just returns one of them immediately. It's
> really the same rationale as that for having @coroutine not
> automatically wrap things in Task - if we can avoid the event loop in
> cases that don't actually need to wait for an event, that's a good
> thing.

I think I see it now. The first item yielded is the simplest thing
that can be used with yield-from, i.e. a coroutine. Then if multiple
futures are ready at once, you return an item of the same type, i.e. a
coroutine. This is essentially wrapping a Future in a coroutine! If we
could live with the items being alternately coroutines and Futures,
we could just return the Future in this case. BTW, yield from <future>
need not go to the scheduler if the Future is already done -- the
Future.__iter__ method should be:

    def __iter__(self):
        if not self.done():
            yield self  # This tells Task to wait for completion.
        return self.result()  # May raise too.

(I forgot this previously.)
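
A runnable toy version of that __iter__, with the surrounding Future
machinery cut down to the bare minimum, shows an already-done future
completing without a single trip to the scheduler:

```python
class Future:
    # Simplified stand-in; the real class has callbacks, cancellation, etc.
    def __init__(self):
        self._done = False
        self._result = None

    def done(self):
        return self._done

    def set_result(self, value):
        self._result = value
        self._done = True

    def result(self):
        return self._result

    def __iter__(self):
        if not self.done():
            yield self  # tells the Task to wait for completion
        return self.result()

def consumer(fut):
    return (yield from fut)

f = Future()
f.set_result(42)
g = consumer(f)
try:
    next(g)  # completes immediately: no intermediate yield occurs
    raise AssertionError('expected immediate completion')
except StopIteration as exc:
    assert exc.value == 42
```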

>> Also, we can't have __next__() raise
>> TimeoutError, since it never blocks; it will have to be the coroutine
>> (or Future) returned by __next__().
>
> Yeah, any exceptions should happen at the yield from call inside the
> loop. I *think* my implementation achieves that (since the coroutine
> instances it creates are passed out to the for loop for further
> processing), but it's quite possible I missed something.

It'll come out in implementation (in two weeks, maybe).

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Dec 22 17:01:19 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Dec 2012 08:01:19 -0800
Subject: [Python-ideas] Async context managers and iterators with tulip
In-Reply-To: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
Message-ID: <CAP7+vJJ5_yO-0aZUuxn4kqHupcENzwCU2JKu+6zMBA8=ByEuOQ@mail.gmail.com>

On Sat, Dec 22, 2012 at 1:14 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sat, Dec 22, 2012 at 4:17 PM, guido.van.rossum
> <python-checkins at python.org> wrote:
>> +- We might introduce explicit locks, though these will be a bit of a
>> +  pain to use, as we can't use the ``with lock: block`` syntax
>> +  (because to wait for a lock we'd have to use ``yield from``, which
>> +  the ``with`` statement can't do).
>
> Actually, I just realised that the following can work if the async
> lock is defined appropriately:
>
>     with yield from async_lock:
>         ...

Syntactically you'd have to say

    with (yield from async_lock):
        ....

> The secret is that async_lock would need to be a coroutine rather than
> a context manager. *Calling* the coroutine would acquire the lock
> (potentially registering a callback that is scheduled when the lock is
> released) and return a context manager that releases the lock. The
> async_lock itself wouldn't be a context manager, so you'd get an
> immediate error if you left out the "yield from".

Very nice.

> We'd be heading even further down the path of
> two-languages-for-the-price-of-one if we did that, though (by which I
> mean the fact that async code and synchronous code exist in parallel
> universes - one, more familiar one, where the ability to block is
> assumed, as is the fact that any operation may give concurrent code
> the chance to execute, and the universe of Twisted, tulip, et al,
> where possible suspension points are required to be explicitly marked
> in the function where they occur).

It's inevitable that some patterns work well together while others
don't. I see no big philosophical problem with this. Pragmatically,
we'll have plenty of places where existing stdlib modules can't be
used with tulip, and the tulip-compatible upgrade will have a
different API. (The trickiest part will be that the classic code, e.g.
urllib, must work in any thread and cannot rely on the existence of an
event loop. *Maybe* you can get by with
get_event_loop().run_until_complete(<future>) but that might still
depend on the default event loop policy. Food for thought.)
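
A toy illustration of that "classic wrapper" idea: a blocking facade over a
coroutine, with a hand-rolled driver standing in for the hypothetical
get_event_loop().run_until_complete() call. All names below are invented
for the sketch.

```python
def run_until_complete(coro):
    # Hand-rolled driver; a real event loop would block on I/O at each yield.
    try:
        while True:
            next(coro)
    except StopIteration as exc:
        return exc.value

def fetch_async(url):
    yield  # pretend to wait for the network here
    return 'response for ' + url

def fetch_sync(url):
    # Blocking API for classic code that has no event loop of its own.
    return run_until_complete(fetch_async(url))

assert fetch_sync('http://example.com') == 'response for http://example.com'
```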

-- 
--Guido van Rossum (python.org/~guido)


From guido at python.org  Sat Dec 22 17:03:29 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Dec 2012 08:03:29 -0800
Subject: [Python-ideas] Async context managers and iterators with tulip
In-Reply-To: <CADiSq7c7CoHQS0n5GqH-+k69V1WwV8qi-5s5rSWXGb2GPyHjhw@mail.gmail.com>
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
	<CAE_Hg6bhnF1S_yN9FSZ=7a-ju_Wv_5LQ-D9Ns=Hpoy8WKiUCeQ@mail.gmail.com>
	<CADiSq7c7CoHQS0n5GqH-+k69V1WwV8qi-5s5rSWXGb2GPyHjhw@mail.gmail.com>
Message-ID: <CAP7+vJK_ipr8Azra+8Uw1+VSNi6r+ioSF-3onV4bCTaiogOivA@mail.gmail.com>

On Sat, Dec 22, 2012 at 4:57 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sat, Dec 22, 2012 at 10:26 PM, Laurens Van Houtven <_ at lvh.cc> wrote:
>> I can't quite tell by the wording if you consider
>> two-languages-for-the-price-of-one a good thing or a bad thing; but I can
>> tell you that at least in Twisted, explicit suspension points have been a
>> definite boon :) While it may lead to issues in some things (e.g. new users
>> using blocking urllib calls in a callback), I find the net result much
>> easier to read and reason about.
>
> On balance, I consider it better than offering only greenlet-style
> implicit switching (which is effectively equivalent to preemptive
> threading, since any function call or operator may suspend the task).
> I'm also a lot happier about it since realising that the model of
> emitting futures and using "yield from f" where synchronous code would
> use "f.result()" helps unify the two worlds.

I wouldn't go so far as to call that unifying, but it definitely helps
people transition. Still, from experience with introducing NDB's async
in some internal App Engine software, it takes some getting used to
even for the best of developers. But it is worth it.

-- 
--Guido van Rossum (python.org/~guido)


From andrew.svetlov at gmail.com  Sat Dec 22 18:11:09 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sat, 22 Dec 2012 19:11:09 +0200
Subject: [Python-ideas] ``with from`` statement
Message-ID: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>

Crazy idea.
Guido van Rossum mentioned after working on PEP 3156 that context
managers cannot use the
``yield from`` statement inside the __enter__ and __exit__ magic methods.
An explicit call for entering and leaving a context (for locking, for
example) is not convenient.

What do you think about

with from f():
   do_our_work()


The ``with from ...`` construction calls the __enter_from__ generator
and iterates over it via ``yield from``.
The returned value is our context manager.

The same goes for __exit_from__: do ``yield from`` for that and stop on
StopIteration or an exception.

--
Thanks,
Andrew Svetlov


From guido at python.org  Sat Dec 22 18:15:00 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Dec 2012 09:15:00 -0800
Subject: [Python-ideas] ``with from`` statement
In-Reply-To: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>
References: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>
Message-ID: <CAP7+vJJFpWuH7_8njPkx1Ss37pHjNhRVrdv8wZOqGtFktmEQ4A@mail.gmail.com>

Nick already proposed "with (yield from ...): ..."

Maybe in 3.4 we can tweak the syntax so the parens are not needed.

I am quite glad that we had the foresight (when we designed 'with') to make
this possible.

On Saturday, December 22, 2012, Andrew Svetlov wrote:

> Crazy idea.
> Guido van Rossum mentioned after working on PEP 3156 that context
> managers cannot use the
> ``yield from`` statement inside the __enter__ and __exit__ magic methods.
> An explicit call for entering and leaving a context (for locking, for
> example) is not convenient.
>
> What do you think about
>
> with from f():
>    do_our_work()
>
>
> The ``with from ...`` construction calls the __enter_from__ generator
> and iterates over it via ``yield from``.
> The returned value is our context manager.
>
> The same goes for __exit_from__: do ``yield from`` for that and stop on
> StopIteration or an exception.
>
> --
> Thanks,
> Andrew Svetlov
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org <javascript:;>
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121222/9fc08254/attachment.html>

From andrew.svetlov at gmail.com  Sat Dec 22 18:22:55 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sat, 22 Dec 2012 19:22:55 +0200
Subject: [Python-ideas] ``with from`` statement
In-Reply-To: <CAP7+vJJFpWuH7_8njPkx1Ss37pHjNhRVrdv8wZOqGtFktmEQ4A@mail.gmail.com>
References: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>
	<CAP7+vJJFpWuH7_8njPkx1Ss37pHjNhRVrdv8wZOqGtFktmEQ4A@mail.gmail.com>
Message-ID: <CAL3CFcXbCooYarwEEtz0UQEG-7QZ4K0JR4tOaxt7pSiPnOgEgw@mail.gmail.com>

Yes, Nick's proposal is just awesome.

I cannot figure out: can __exit__ be a generator which uses ``yield
from`` inside as well?

On Sat, Dec 22, 2012 at 7:15 PM, Guido van Rossum <guido at python.org> wrote:
> Nick already proposed "with (yield from ...): ..."
>
> Maybe in 3.4 we can tweak the syntax so the parens are not needed.
>
> I am quite glad that we had the foresight (when we designed 'with') to make
> this possible.
>
>
> On Saturday, December 22, 2012, Andrew Svetlov wrote:
>>
>> Crazy idea.
>> Guido van Rossum mentioned after working on PEP 3156 that context
>> managers cannot use the
>> ``yield from`` statement inside the __enter__ and __exit__ magic methods.
>> An explicit call for entering and leaving a context (for locking, for
>> example) is not convenient.
>>
>> What do you think about
>>
>> with from f():
>>    do_our_work()
>>
>>
>> The ``with from ...`` construction calls the __enter_from__ generator
>> and iterates over it via ``yield from``.
>> The returned value is our context manager.
>>
>> The same goes for __exit_from__: do ``yield from`` for that and stop on
>> StopIteration or an exception.
>>
>> --
>> Thanks,
>> Andrew Svetlov
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
>
>
> --
> --Guido van Rossum (python.org/~guido)



-- 
Thanks,
Andrew Svetlov


From andrew.svetlov at gmail.com  Sat Dec 22 18:25:32 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sat, 22 Dec 2012 19:25:32 +0200
Subject: [Python-ideas] ``with from`` statement
In-Reply-To: <CAL3CFcXbCooYarwEEtz0UQEG-7QZ4K0JR4tOaxt7pSiPnOgEgw@mail.gmail.com>
References: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>
	<CAP7+vJJFpWuH7_8njPkx1Ss37pHjNhRVrdv8wZOqGtFktmEQ4A@mail.gmail.com>
	<CAL3CFcXbCooYarwEEtz0UQEG-7QZ4K0JR4tOaxt7pSiPnOgEgw@mail.gmail.com>
Message-ID: <CAL3CFcXi+kFa_0oztZE5U+83=4QajOW3jihnN8kqviBwy7dtWg@mail.gmail.com>

Python syntax looks like use of time machine day by day. I like it!

On Sat, Dec 22, 2012 at 7:22 PM, Andrew Svetlov
<andrew.svetlov at gmail.com> wrote:
> Yes, Nick's proposal is just awesome.
>
> I cannot figure out: can __exit__ be a generator which uses ``yield
> from`` inside as well?
>
> On Sat, Dec 22, 2012 at 7:15 PM, Guido van Rossum <guido at python.org> wrote:
>> Nick already proposed "with (yield from ...): ..."
>>
>> Maybe in 3.4 we can tweak the syntax so the parens are not needed.
>>
>> I am quite glad that we had the foresight (when we designed 'with') to make
>> this possible.
>>
>>
>> On Saturday, December 22, 2012, Andrew Svetlov wrote:
>>>
>>> Crazy idea.
>>> Guido van Rossum mentioned after working on PEP 3156 that context
>>> managers cannot use the
>>> ``yield from`` statement inside the __enter__ and __exit__ magic methods.
>>> An explicit call for entering and leaving a context (for locking, for
>>> example) is not convenient.
>>>
>>> What do you think about
>>>
>>> with from f():
>>>    do_our_work()
>>>
>>>
>>> The ``with from ...`` construction calls the __enter_from__ generator
>>> and iterates over it via ``yield from``.
>>> The returned value is our context manager.
>>>
>>> The same goes for __exit_from__: do ``yield from`` for that and stop on
>>> StopIteration or an exception.
>>>
>>> --
>>> Thanks,
>>> Andrew Svetlov
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>>
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>
>
>
> --
> Thanks,
> Andrew Svetlov



-- 
Thanks,
Andrew Svetlov


From solipsis at pitrou.net  Sat Dec 22 19:26:54 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 22 Dec 2012 19:26:54 +0100
Subject: [Python-ideas] ``with from`` statement
References: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>
	<CAP7+vJJFpWuH7_8njPkx1Ss37pHjNhRVrdv8wZOqGtFktmEQ4A@mail.gmail.com>
	<CAL3CFcXbCooYarwEEtz0UQEG-7QZ4K0JR4tOaxt7pSiPnOgEgw@mail.gmail.com>
	<CAL3CFcXi+kFa_0oztZE5U+83=4QajOW3jihnN8kqviBwy7dtWg@mail.gmail.com>
Message-ID: <20121222192654.2775cb00@pitrou.net>

On Sat, 22 Dec 2012 19:25:32 +0200
Andrew Svetlov <andrew.svetlov at gmail.com>
wrote:
> Python syntax looks like use of time machine day by day. I like it!

Not sure I like "with yield from". How do you intend to explain that to
an average programmer?

Regards

Antoine.




From guido at python.org  Sat Dec 22 20:09:22 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Dec 2012 11:09:22 -0800
Subject: [Python-ideas] ``with from`` statement
In-Reply-To: <20121222192654.2775cb00@pitrou.net>
References: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>
	<CAP7+vJJFpWuH7_8njPkx1Ss37pHjNhRVrdv8wZOqGtFktmEQ4A@mail.gmail.com>
	<CAL3CFcXbCooYarwEEtz0UQEG-7QZ4K0JR4tOaxt7pSiPnOgEgw@mail.gmail.com>
	<CAL3CFcXi+kFa_0oztZE5U+83=4QajOW3jihnN8kqviBwy7dtWg@mail.gmail.com>
	<20121222192654.2775cb00@pitrou.net>
Message-ID: <CAP7+vJK0k1-keakrS0i3JXO8t9C1tOiF_8Qb+Q-nuNiS+ztEXA@mail.gmail.com>

On Sat, Dec 22, 2012 at 10:26 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Sat, 22 Dec 2012 19:25:32 +0200
> Andrew Svetlov <andrew.svetlov at gmail.com>
> wrote:
>> Python syntax looks like use of time machine day by day. I like it!
>
> Not sure I like "with yield from". How do you intend to explain that to
> an average programmer?

Break it down into pieces. The general form is

  with <expr>: <block>

where <expr> can take many forms, including

  yield from <expr>

we just have to handwave a bit about the priorities, but that's
usually okay. People do get

 x = yield from <expr>

It's just that currently somehow you have to surround "yield from
<expr>" in an extra pair of parentheses everywhere except on the RHS
of an assignment; my other pet peeve in this area is that you must
write

  return (yield from <expr>)

(which I end up writing fairly regularly).

I assume that if we can make the parens optional for assignment, we
can make them optional in other places.
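
For the record, the contrast looks like this in 3.3 (both functions behave
identically; only the second spelling needs the parentheses):

```python
def inner():
    yield          # the suspension point
    return 'done'

def outer():
    x = yield from inner()       # no parens needed on the RHS of "="
    return x

def outer2():
    return (yield from inner())  # parens required in a return statement

for fn in (outer, outer2):
    g = fn()
    next(g)                      # run to the suspension point
    try:
        next(g)
    except StopIteration as exc:
        assert exc.value == 'done'
```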

-- 
--Guido van Rossum (python.org/~guido)


From tjreedy at udel.edu  Sat Dec 22 23:20:08 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 22 Dec 2012 17:20:08 -0500
Subject: [Python-ideas] ``with from`` statement
In-Reply-To: <CAP7+vJK0k1-keakrS0i3JXO8t9C1tOiF_8Qb+Q-nuNiS+ztEXA@mail.gmail.com>
References: <CAL3CFcWTUOKNcM3Tm88fUZVQr0mTpP-HqLx-caM2J-jN24kzOw@mail.gmail.com>
	<CAP7+vJJFpWuH7_8njPkx1Ss37pHjNhRVrdv8wZOqGtFktmEQ4A@mail.gmail.com>
	<CAL3CFcXbCooYarwEEtz0UQEG-7QZ4K0JR4tOaxt7pSiPnOgEgw@mail.gmail.com>
	<CAL3CFcXi+kFa_0oztZE5U+83=4QajOW3jihnN8kqviBwy7dtWg@mail.gmail.com>
	<20121222192654.2775cb00@pitrou.net>
	<CAP7+vJK0k1-keakrS0i3JXO8t9C1tOiF_8Qb+Q-nuNiS+ztEXA@mail.gmail.com>
Message-ID: <kb5bmv$6li$1@ger.gmane.org>

On 12/22/2012 2:09 PM, Guido van Rossum wrote:
> On Sat, Dec 22, 2012 at 10:26 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> On Sat, 22 Dec 2012 19:25:32 +0200
>> Andrew Svetlov <andrew.svetlov at gmail.com>
>> wrote:
>>> Python syntax looks like use of time machine day by day. I like it!
>>
>> Not sure I like "with yield from".

At the moment, that looks a bit dubious to me too. Maybe just because it 
is new (to me).

>> How do you intend to explain that to
>> an average programmer?
>
> Break it down into pieces. The general form is
>
>    with <expr>: <block>

with <expr> as <expr>: <block>

with yield from x() as y: ...

> where <expr> can take many forms, including
>
>    yield from <expr>
>
> we just have to handwave a bit about the priorities,

Too much dependence on implicit priorities makes the language more 
baroque and less clear. For instance, I am fine with having to 
parenthesize generator expressions (except in calls where it would 
result in doubled parens ((ge))). An explanation needs more than a 
handwave ;-).

> but that's usually okay. People do get
>
>   x = yield from <expr>

No problem because = cleanly breaks the statement. More of a problem is the 
difference between x coming from a value yielded (or returned?) by the callee 
and a value sent by the caller, as in x = yield y.

> It's just that currently somehow you have to surround "yield from
> <expr>" in an extra pair of parentheses everywhere except on the RHS
> of an assignment; my other pet peeve in this area is that you must
> write
>
>    return (yield from <expr>)
>
> (which I end up writing fairly regularly).

I can see how that seems like a nuisance. Why does omitting the parens 
bother me less here? Perhaps because return binds the expression to the 
location of the call in the calling expression.

> I assume that if we can make the parens optional for assignment, we
> can make them optional in other places.

If the grammar can be written to do that sufficiently clearly, then it 
should be explainable to people.

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Sun Dec 23 00:03:55 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 22 Dec 2012 18:03:55 -0500
Subject: [Python-ideas] Tkinter and tulip
Message-ID: <kb5e93$ojs$1@ger.gmane.org>

Though not mentioned much in the tulip discussion, tkinter is a third 
'T' package with its own event loop. (And by the way, I associate 
'tulip' with 'Floriade', with 10s of thousands of tulips in bloom. It 
was a +++ experience. But I suppose it is too cute for Python ;-)

Yesterday, tk/tkinter expert Kevin Walzer asked on python-list how to 
(easily) read a pipe asynchronously and post the result to a tk text 
widget. I don't know the answer now, but is my understanding correct 
that in the future a) there should be a tk loop adapter that could 
replace the default tulip loop and b) it would then be easy to add i/o 
events to the tk loop?

My personal interest is whether it will some day be possible to re-write 
IDLE to use tulip so one could edit in an edit pane while the shell pane 
asynchronously waits for and displays output from a 'long' computation.* 
It would also be nice if ^C could be made to work better -- which is to 
say, take effect sooner -- by decoupling key processing from socket 
reading. I am thinking that IDLE could be both a simple test and 
showcase for the usefulness of tulip.

*I currently put shell and edit windows side-by-side on my wide-screen 
monitor. I can imagine putting two panes in one window instead.

-- 
Terry Jan Reedy



From ncoghlan at gmail.com  Sun Dec 23 06:46:41 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Dec 2012 15:46:41 +1000
Subject: [Python-ideas] PEP 3156 feedback: wait_one vs par vs
	concurrent.futures.wait
In-Reply-To: <CAP7+vJ+cuNHPVbu+G=YyGsSzJOMsX4oCpD0c6mR690tBb4vmoA@mail.gmail.com>
References: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>
	<CAP7+vJJqydN=Rg8Bksn5b7+iEnXKRiqmoOSuAgY6OQ3+KK1y3Q@mail.gmail.com>
	<CAP7+vJJ-FnugRuPi1oMZM+=Z8PdPEp0vx56oEhsZXqdHotppkA@mail.gmail.com>
	<CADiSq7c4mWeVxbHRqsbQAP6-MskE13BRakb1fXXabaMV0odFog@mail.gmail.com>
	<CAP7+vJ+cuNHPVbu+G=YyGsSzJOMsX4oCpD0c6mR690tBb4vmoA@mail.gmail.com>
Message-ID: <CADiSq7egOLw-W1osbTvQ62o_dmRX9quvEbmTjfOchubdzjfw-Q@mail.gmail.com>

On Sun, Dec 23, 2012 at 1:54 AM, Guido van Rossum <guido at python.org> wrote:
> On Sat, Dec 22, 2012 at 12:04 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> I deliberately chose to return coroutines. My rationale is to be able
>> to handle the case where multiple operations become ready without
>> having to make multiple trips around the event loop by having the
>> iterator switch between two modes: when the complete set is empty, it
>> yields a coroutine that calls wait and then returns the first complete
>> future, while when there are already complete futures available, it
>> yields a coroutine that just returns one of them immediately. It's
>> really the same rationale as that for having @coroutine not
>> automatically wrap things in Task - if we can avoid the event loop in
>> cases that don't actually need to wait for an event, that's a good
>> thing.
>
> I think I see it now. The first item yielded is the simplest thing
> that can be used with yield-from, i.e. a coroutine. Then if multiple
> futures are ready at once, you return an item of the same type, i.e. a
> coroutine. This is essentially wrapping a Future in a coroutine! If we
> could live with the items being alternately coroutines and Futures,
> we could just return the Future in this case. BTW, yield from <future>
> need not go to the scheduler if the Future is already done -- the
> Future.__iter__ method should be:
>
>     def __iter__(self):
>         if not self.done():
>             yield self  # This tells Task to wait for completion.
>         return self.result()  # May raise too.
>
> (I forgot this previously.)

And I'd missed it completely :)

In that case, yeah, yielding any already completed Futures directly
from as_completed() should work. The "no completed operations" case
will still need a coroutine, though, as it needs to update the
"complete" and "incomplete" sets inside the iterator. Since we know
we're certain to hit the scheduler in that case, we may as well wrap
it directly in a task so we're always returning some kind of future.
The impl might end up looking something like:

    def as_completed(fs):
        complete = set()   # bound here so the nonlocal declaration works
        incomplete = set(fs)
        while incomplete:
            # Phase 1 of the loop, we yield a Task that waits for operations
            @coroutine
            def _wait_for_some():
                nonlocal complete, incomplete
                complete, incomplete = yield from tulip.wait(
                    incomplete, return_when=FIRST_COMPLETED)
                return complete.pop().result()
            yield Task(_wait_for_some())
            # Phase 2 of the loop, we pass back the already complete operations
            while complete:
                yield complete.pop()
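For what it's worth, the short-circuit in Future.__iter__ can be demonstrated without any event loop at all. The sketch below is hypothetical (a toy Future, not tulip's): it only shows that "yield from" over an already-done future finishes on the first next() call, with no trip to a scheduler, while a pending future is yielded up to the caller.

```python
# Toy stand-ins, not tulip's classes: just enough to show that
# 'yield from <future>' skips the scheduler when the future is done.

_UNSET = object()

class ToyFuture:
    def __init__(self):
        self._result = _UNSET

    def done(self):
        return self._result is not _UNSET

    def set_result(self, value):
        self._result = value

    def result(self):
        return self._result

    def __iter__(self):
        if not self.done():
            yield self  # This tells the Task/scheduler to wait.
        return self.result()  # May raise too.

def coro(fut):
    # A coroutine that waits on the future the yield-from way.
    value = yield from fut
    return value

f = ToyFuture()
f.set_result(42)
g = coro(f)
try:
    next(g)            # completes immediately: nothing was yielded
except StopIteration as stop:
    print(stop.value)  # 42
```

If the future is not yet done, the first next() instead returns the future itself, which is exactly the value a Task would hand to the event loop.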

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Sun Dec 23 06:48:00 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Dec 2012 15:48:00 +1000
Subject: [Python-ideas] Async context managers and iterators with tulip
In-Reply-To: <CAP7+vJK_ipr8Azra+8Uw1+VSNi6r+ioSF-3onV4bCTaiogOivA@mail.gmail.com>
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
	<CAE_Hg6bhnF1S_yN9FSZ=7a-ju_Wv_5LQ-D9Ns=Hpoy8WKiUCeQ@mail.gmail.com>
	<CADiSq7c7CoHQS0n5GqH-+k69V1WwV8qi-5s5rSWXGb2GPyHjhw@mail.gmail.com>
	<CAP7+vJK_ipr8Azra+8Uw1+VSNi6r+ioSF-3onV4bCTaiogOivA@mail.gmail.com>
Message-ID: <CADiSq7fGRjWaWtJbzDL4-dQLuNoj+=1cjLWCz8sW=58x5WnaXQ@mail.gmail.com>

On Sun, Dec 23, 2012 at 2:03 AM, Guido van Rossum <guido at python.org> wrote:
> On Sat, Dec 22, 2012 at 4:57 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> On balance, I consider it better than offering only greenlet-style
>> implicit switching (which is effectively equivalent to preemptive
>> threading, since any function call or operator may suspend the task).
>> I'm also a lot happier about it since realising that the model of
>> emitting futures and using "yield from f" where synchronous code would
>> use "f.result()" helps unify the two worlds.
>
> I wouldn't go so far as to call that unifying, but it definitely helps
> people transition. Still, from experience with introducing NDB's async
> in some internal App Engine software, it takes some getting used to
> even for the best of developers. But it is worth it.

Yes, "unify" was the wrong word - "align" would be better.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From guido at python.org  Sun Dec 23 07:20:37 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Dec 2012 22:20:37 -0800
Subject: [Python-ideas] PEP 3156 feedback: wait_one vs par vs
	concurrent.futures.wait
In-Reply-To: <CADiSq7egOLw-W1osbTvQ62o_dmRX9quvEbmTjfOchubdzjfw-Q@mail.gmail.com>
References: <CADiSq7fuf7=HXUYpgTSML31VHr2XsxQrX315hy8sEQw016oZ9A@mail.gmail.com>
	<CAP7+vJJqydN=Rg8Bksn5b7+iEnXKRiqmoOSuAgY6OQ3+KK1y3Q@mail.gmail.com>
	<CAP7+vJJ-FnugRuPi1oMZM+=Z8PdPEp0vx56oEhsZXqdHotppkA@mail.gmail.com>
	<CADiSq7c4mWeVxbHRqsbQAP6-MskE13BRakb1fXXabaMV0odFog@mail.gmail.com>
	<CAP7+vJ+cuNHPVbu+G=YyGsSzJOMsX4oCpD0c6mR690tBb4vmoA@mail.gmail.com>
	<CADiSq7egOLw-W1osbTvQ62o_dmRX9quvEbmTjfOchubdzjfw-Q@mail.gmail.com>
Message-ID: <CAP7+vJJ=Xpxk=748wMEPVU__i4AP1Hrot8zbT26CJvPBjYMHUQ@mail.gmail.com>

Yes, I like always returning a future.

On Saturday, December 22, 2012, Nick Coghlan wrote:

> On Sun, Dec 23, 2012 at 1:54 AM, Guido van Rossum <guido at python.org>
> wrote:
> > On Sat, Dec 22, 2012 at 12:04 AM, Nick Coghlan <ncoghlan at gmail.com>
> wrote:
> >> I deliberately chose to return coroutines. My rationale is to be able
> >> to handle the case where multiple operations become ready without
> >> having to make multiple trips around the event loop by having the
> >> iterator switch between two modes: when the complete set is empty, it
> >> yields a coroutine that calls wait and then returns the first complete
> >> future, while when there are already complete futures available, it
> >> yields a coroutine that just returns one of them immediately. It's
> >> really the same rationale as that for having @coroutine not
> >> automatically wrap things in Task - if we can avoid the event loop in
> >> cases that don't actually need to wait for an event, that's a good
> >> thing.
> >
> > I think I see it now. The first item yielded is the simplest thing
> > that can be used with yield-from, i.e. a coroutine. Then if multiple
> > futures are ready at once, you return an item of the same type, i.e. a
> > coroutine. This is essentially wrapping a Future in a coroutine! If we
> > could live with the items being alternatingly coroutines and Futures,
> > we could just return the Future in this case. BTW, yield from <future>
> > need not go to the scheduler if the Future is already done -- the
> > Future.__iter__ method should be:
> >
> >     def __iter__(self):
> >         if not self.done():
> >             yield self  # This tells Task to wait for completion.
> >         return self.result()  # May raise too.
> >
> > (I forgot this previously.)
>
> And I'd missed it completely :)
>
> In that case, yeah, yielding any already completed Futures directly
> from as_completed() should work. The "no completed operations" case
> will still need a coroutine, though, as it needs to update the
> "complete" and "incomplete" sets inside the iterator. Since we know
> we're certain to hit the scheduler in that case, we may as well wrap
> it directly in a task so we're always returning some kind of future.
> The impl might end up looking something like:
>
>     def as_completed(fs):
>         incomplete = fs
>         while incomplete:
>             # Phase 1 of the loop, we yield a Task that waits for operations
>             @coroutine
>             def _wait_for_some():
>                 nonlocal complete, incomplete
>                 complete, incomplete = yield from tulip.wait(fs,
>                     return_when=FIRST_COMPLETED)
>                 return complete.pop().result()
>             yield Task(_wait_for_some())
>             # Phase 2 of the loop, we pass back the already complete operations
>             while complete:
>                 yield complete.pop()
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>


-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121222/423f2052/attachment.html>

From guido at python.org  Sun Dec 23 07:24:12 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Dec 2012 22:24:12 -0800
Subject: [Python-ideas] Tkinter and tulip
In-Reply-To: <kb5e93$ojs$1@ger.gmane.org>
References: <kb5e93$ojs$1@ger.gmane.org>
Message-ID: <CAP7+vJJR0wr=iZr7oDGucyN_HhhioNm5-XwV1S7aEQ2X+JYx7g@mail.gmail.com>

I hadn't thought of Tkinter, but it is an excellent idea to see how it and
tulip could integrate. Maybe it is possible to add Tkinter as a file
descriptor to tulip? I won't have time to look into this myself for a while
but would love it if someone tried this and gave feedback.

--Guido

On Saturday, December 22, 2012, Terry Reedy wrote:

> Though not mentioned much in the tulip discussion, tkinter is a third 'T'
> package with its own event loop. (And by the way, I associate 'tulip' with
> 'Floriade', with 10s of thousands of tulips in bloom. It was a +++
> experience. But I suppose it is too cute for Python ;-)
>
> Yesterday, tk/tkinter expert Kevin Walzer asked on python-list how to
> (easily) read a pipe asynchronously and post the result to a tk text widget.
> I don't know the answer now, but is my understanding correct that in the
> future a) there should be a tk loop adapter that could replace the default
> tulip loop and b) it would then be easy to add i/o events to the tk loop?
>
> My personal interest is whether it will some day be possible to re-write
> IDLE to use tulip so one could edit in an edit pane while the shell pane
> asynchronously waits for and displays output from a 'long' computation.* It
> would also be nice if ^C could be made to work better -- which is to say,
> take effect sooner -- by decoupling key processing from socket reading. I
> am thinking that IDLE could be both a simple test and showcase for the
> usefulness of tulip.
>
> *I currently put shell and edit windows side-by-side on my wide-screen
> monitor. I can imagine putting two panes in one window instead.
>
> --
> Terry Jan Reedy
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121222/fd39c2b1/attachment.html>

From geertj at gmail.com  Sun Dec 23 12:06:31 2012
From: geertj at gmail.com (Geert Jansen)
Date: Sun, 23 Dec 2012 12:06:31 +0100
Subject: [Python-ideas] Async context managers and iterators with tulip
In-Reply-To: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
Message-ID: <CADbA=FVrbjiVzWBmq=EFNpbytqzVtLEVPXPDZMYBzixMvAREMQ@mail.gmail.com>

On Sat, Dec 22, 2012 at 10:14 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

[...]
> We'd be heading even further down the path of
> two-languages-for-the-price-of-one if we did that, though (by which I
> mean the fact that async code and synchronous code exist in parallel
> universes - one, more familiar one, where the ability to block is
> assumed, as is the fact that any operation may give concurrent code
> the chance to execute, and the universe of Twisted, tulip, et al,
> where possible suspension points are required to be explicitly marked
> in the function where they occur).

The two languages/parallel universes (sync and async) are a big concern
IMHO. I looked at a greenlet-based program that I'm writing, and I'm
using call stacks that are 10 deep or so. I would need to change all
these layers, from the scheduler down, to use yield-from to make my
program async.

The higher levels are typically application specific and could decide
to either be sync or async. For the lower levels (e.g. transports and
protocols): those are typically library code and you'd need two
versions. The latter can amount to quite a bit of duplication: there's
a lot of protocol code currently in the standard library.

I wonder if the greenlet idea was thrown out too early. If I
understand the discussion correctly, the #1 disadvantage that was
identified is that calling code does not know if called code will
switch or not. Therefore it doesn't know whether to lock, and where.

What about the following (straw man) approach to fix that issue using
greenlets: functions can state whether they are safe with regard to
switching, using a decorator. The default is off (non-safe). When at
some point in the call graph you need to switch, you only do this if
all frames starting from the current one up to the scheduler are
async-safe. This should be achievable without any language changes.

Usually the upper layers in a concurrent program are connection
handlers. These can be marked safe quite easily, as they usually only
use local state tied to the connection and are not called from other
connections. Any code that they call would need to be explicitly
marked async-safe, otherwise it could block.

I think the straw man above is identical to the current yield-from
approach in safety because there is no automatic asynchronicity.

However, this approach has the benefit that there can be one
implementation of lower layers (protocols and transports) that
supports both sync and async, and higher layers can use the natural
calling syntax that they are currently used to. Also making a program
async can be an incremental process, and you could use e.g. a
sys.settrace() handler to identify spots where safe code calls into
unsafe code.
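A rough, greenlet-free sketch of that straw man (all names here -- switch_safe, switch_allowed, scheduler -- are invented for illustration): a decorator records a function's code object as switch-safe, and a guard walks the call stack to decide whether a cooperative switch would be permitted.

```python
import sys

_SAFE_CODE = set()

def switch_safe(func):
    """Mark func as safe with regard to cooperative switching."""
    _SAFE_CODE.add(func.__code__)
    return func

def scheduler(task):
    """Stand-in for the scheduler frame that bounds the stack walk."""
    return task()

def switch_allowed():
    """True only if every frame between here and the scheduler is marked."""
    frame = sys._getframe(1)
    while frame is not None and frame.f_code is not scheduler.__code__:
        if frame.f_code not in _SAFE_CODE:
            return False  # an unmarked frame forbids switching
        frame = frame.f_back
    return True

@switch_safe
def safe_handler():
    return switch_allowed()   # every frame up to the scheduler is marked

@switch_safe
def handler_calling_unsafe():
    return unsafe_helper()

def unsafe_helper():          # not decorated, so it blocks switching
    return switch_allowed()

print(scheduler(safe_handler))            # True
print(scheduler(handler_calling_unsafe))  # False
```

This matches the claim in the text: safety is equivalent to the yield-from approach because nothing switches automatically; an unmarked frame anywhere on the stack vetoes the switch.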

Regards,
Geert


From solipsis at pitrou.net  Sun Dec 23 12:25:58 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 23 Dec 2012 12:25:58 +0100
Subject: [Python-ideas] Async context managers and iterators with tulip
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
	<CADbA=FVrbjiVzWBmq=EFNpbytqzVtLEVPXPDZMYBzixMvAREMQ@mail.gmail.com>
Message-ID: <20121223122558.7d6c7e36@pitrou.net>

On Sun, 23 Dec 2012 12:06:31 +0100
Geert Jansen <geertj at gmail.com> wrote:
> On Sat, Dec 22, 2012 at 10:14 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> [...]
> > We'd be heading even further down the path of
> > two-languages-for-the-price-of-one if we did that, though (by which I
> > mean the fact that async code and synchronous code exist in parallel
> > universes - one, more familiar one, where the ability to block is
> > assumed, as is the fact that any operation may give concurrent code
> > the chance to execute, and the universe of Twisted, tulip, et al,
> > where possible suspension points are required to be explicitly marked
> > in the function where they occur).
> 
> The two languages/parallel universes (sync and asyc) is a big concern
> IMHO. I looked at a greenlet based program that I'm writing and i'm
> using call stacks that are 10 deep or so. I would need to change all
> these layers from the scheduler down to use yield-from to make my
> program async.
> 
> The higher levels are typically application specific and could decide
> to either be sync or async. For the lower levels (e.g. transports and
> protocols): those are typically library code and you'd need two
> versions. The latter can amount to quite a bit of duplication: there's
> a lot of protocol code currently in the standard library.

Protocols written using a callback style (data_received(), etc.), as
pointed by Laurens, can be used with both blocking and non-blocking
coding styles. Only the transports would need to be duplicated, but
that's expected.
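As a sketch of that reuse (hypothetical class, not the tulip API): a protocol written against data_received()/eof_received() callbacks does not care whether bytes arrive from a blocking read loop or from event loop callbacks, so only the transport feeding it needs duplicating.

```python
class LineProtocol:
    """Callback-style protocol: accumulates bytes, emits complete lines."""
    def __init__(self):
        self.buffer = b""
        self.lines = []

    def data_received(self, data):
        self.buffer += data
        while b"\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\n", 1)
            self.lines.append(line)

    def eof_received(self):
        if self.buffer:
            self.lines.append(self.buffer)
            self.buffer = b""

# A 'blocking transport' just feeds chunks synchronously; an async
# transport would invoke the very same methods from event loop callbacks.
proto = LineProtocol()
for chunk in (b"he", b"llo\nwor", b"ld"):
    proto.data_received(chunk)
proto.eof_received()
print(proto.lines)  # [b'hello', b'world']
```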

Regards

Antoine.




From ncoghlan at gmail.com  Sun Dec 23 13:25:09 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Dec 2012 22:25:09 +1000
Subject: [Python-ideas] Async context managers and iterators with tulip
In-Reply-To: <CADbA=FVrbjiVzWBmq=EFNpbytqzVtLEVPXPDZMYBzixMvAREMQ@mail.gmail.com>
References: <CADiSq7f6NpeP_NDEKruJofX7QAsuf42ex=363cRMgL9BHJs+eg@mail.gmail.com>
	<CADbA=FVrbjiVzWBmq=EFNpbytqzVtLEVPXPDZMYBzixMvAREMQ@mail.gmail.com>
Message-ID: <CADiSq7enGtaiUb7_kJ3BxpQosDo2i+8d-+bDmedt+DW+JAMjgA@mail.gmail.com>

On Sun, Dec 23, 2012 at 9:06 PM, Geert Jansen <geertj at gmail.com> wrote:
> I wonder if the greenlet idea was thrown out too early. If I
> understand the discussion correctly, the #1 disadvantage that was
> identified is that calling code does not know if called code will
> switch or not. Therefore it doesn't know whether to lock, and where.

Greenlets aren't going anywhere. The thing is that "asynchronous
programming" is used to describe both an execution model that's
limited by the number of concurrent I/O operations rather than the
number of OS-level threads, and a programming model based on
cooperative (rather than preemptive) multi-threading.

Greenlets are designed to provide the scaling benefits of I/O limited
concurrency while continuing to use a preemptive multi-threading
programming model where any operation is permitted to block the thread
of execution (implicitly switching to another thread at the lowest
layer). That's *wonderful* for getting the scaling benefits of the
execution model without needing to rewrite a program to use a
drastically different programming model.

PEP 3156, on the other hand, is about providing the cooperative
multi-threading *programming* model. Greenlets can't do that, because
they're not intended to.

However, gevent/greenlets will still benefit from the explicit
asynchronous APIs in the future, as those protocols and transports
will be usable by the *networking* side of gevent. And that's a key
part of the aim here - reducing the duplication of effort between
gevent/Twisted/Tornado/et al by eventually allowing them to share more
of the event driven protocol stacks.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From techtonik at gmail.com  Sun Dec 23 17:21:43 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Sun, 23 Dec 2012 19:21:43 +0300
Subject: [Python-ideas] Tree as a data structure (Was: Graph class)
In-Reply-To: <CA+OGgf64Az2g3vq+3UY7WRA1y_Z0PWiSy==Co4-zfnCey6O+Bg@mail.gmail.com>
References: <CAPkN8xK_g0=TskvGjcOpbtgnffXZVC2ce8qrg=D7dGmDLfUmLQ@mail.gmail.com>
	<CA+OGgf64Az2g3vq+3UY7WRA1y_Z0PWiSy==Co4-zfnCey6O+Bg@mail.gmail.com>
Message-ID: <CAPkN8xK00b1opL8rLfOC4JYsL+cgWJKfEf=h74K_xmxOBWMQ_A@mail.gmail.com>

On Wed, Dec 19, 2012 at 7:38 PM, Jim Jewett <jimjjewett at gmail.com> wrote:

> On 12/19/12, anatoly techtonik <techtonik at gmail.com> wrote:
> > On Sun, Dec 16, 2012 at 6:41 PM, Guido van Rossum <guido at python.org>
> wrote:
>
> >> I think of graphs and trees as patterns, not data structures.
>
> > In my world strings, ints and lists are 1D data types, and tree can be a
> > very important 2D data structure.
>
> Yes; the catch is that the details of that data structure will differ
> depending on the problem.  Most problems do not need the fancy
> algorithms -- or the extra overhead that supports them.  Since a
> simple tree (or graph) is easy to write, and the fiddly details are
> often -- but not always -- wasted overhead, it doesn't make sense to
> designate a single physical structure as "the" tree (or graph)
> representation.  So it stays a pattern, rather than a concrete data
> structure.


Right. Creating a tree structure is not the problem. The problem arise when
you have to study the code or work collaboratively with other developers.
It takes time to see an ordinary namedtuple in the magic of some custom
made tuple subclass. But you can easily add a comment that it is a
reimplementation of namedtuple and the code immediately becomes clear.

With trees it is impossible to add such a comment, because there is no
known reference tree type you can refer to.

To sum up the patterns-vs-structures point: patterns and data
structures are interconnected. The absence of a tree definition makes it
really hard for developers to communicate about the usage, potential and
outcomes of a particular approach. What data structure or pattern do
we need for <this certain case>? A tree, but which tree exactly, and why?


> > Speaking of tree as a data structure, I assume that it has a very basic
> > definition:
>
> > 1. tree consists of nodes
> > 2. some nodes are containers for other nodes
>
> Are the leaves a different type, or just nodes that happen to have
> zero children at the moment?


For the 'reference tree' I'd choose the most common tree human beings work
with daily, can see and, as a result, easily imagine:
1. leaves cannot mutate into containers
2. the property structure of containers is different from that of leaves,
but they may share elements

Spoiler: this is the pattern, or data structure, of a filesystem tree.

I'd call a tree whose leaves can mutate into containers a 'mutable
tree', and one where leaves are containers with 0 elements a 'uniform
tree'. A 'flexible tree' could be a better name, but it is too generic to
draw a clear association to the behavior.


> > 3. every node has properties
>
> What sort of properties?
>

I've meant the user level properties, not internal required for maintaining
tree structure.


> A single value of a given class, plus some binary flags that are
> internal to the graph implementation?
>

I am afraid of becoming lost in the depths of implementation details,
because that is where the 2D concept jumps in. The 'reference tree' I
mentioned above has a 1:1 mapping between sets of user-level properties and
node types: each container node is "assigned" one user-level set of
properties (the given class) and each leaf node another. This is the
opposite of a tree where each node can have a different user class (set of
properties) assigned. The 2nd dimension is the mapping between node types
(leaf and container) and user-level types.


> A fixed set of values that occur on every node?  (Possibly differing
> between leaves and regular nodes?)
> A fixed value (used for ordering) plus an arbitrary collection that
> can vary by node?
>

For the 'reference tree' every leaf contains the same set of properties,
and each property has its own value. Every container has a different set of
properties, again with each property having its own value. I can't say
whether it should be implemented as a class, but I can propose how it
should behave.

For example, to access the filesystem, the syntax would be the following:

  for node in container.nodes():
    if isinstance(node, File):
      print node.name
      print node.hidden
    if isinstance(node, Directory):
      print node.name + '/'

From the other side I want to access:

  for file in directory.files():
    print file.name
    print file.hidden

The latter is more intuitive, but only possible if we can map the 'files'
accessor name to a 'node.type == leaf' query (which would be hardcoded for
the 'generic tree' implementation).

> More ideas:
>
> >   [ ] every element in a tree can be accessed by its address specificator
> > as 'root/node[3]/last'
>
> That assumes an arbitrary number of children, and that the children
> are ordered.  A sensible choice, but it adds way too much overhead for
> some cases.
>
> (And of course, the same goes for the overhead of balancing, etc.)


Maintaining data structure (order and nesting of elements) is the key
concept for a generic tree, and it also helps in development when you need
an easy way to "run a diff over it". Even for unordered children there
should be some way to sort them out for the comparisons.

One important operation over a tree can be a "data structure hash", which
can be used to detect whether the structure of some tree is equal to a
given structure. For this operation the actual values of the properties are
irrelevant; only the types and positions of the nodes, and the names of
their properties, matter. For the 'reference tree' we have a 1:1 mapping
between node type and user-level type, so the type of the node is not
relevant. If the set of fields is fixed, it is not relevant either, so only
the data structure itself (nesting and order of elements) plays a role.

Actually, after rereading, this sounds too abstract. When we compare
filesystem trees for identity, the name of a directory (container) is the
address that participates in the hash, and the order of elements is
irrelevant.

When we compare two data structures that a web framework passes to a
template engine, we are also not interested in the order of first-level
key:value pairs, but the names of those keys are important. This is only
the first level of the data structure, though; the data structure for the
values part can also be a tree where the order is important. So, for the
most generic comparison, there should be a way to present an unordered tree
in an ordered manner for hash comparison.
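One concrete reading of that "structure hash" idea (a toy sketch over plain dicts and lists, not a real tree type): hash only nesting, ordering and key names, never the property values, and present unordered containers in sorted-key order so they compare deterministically.

```python
def structure_key(node):
    """Build a hashable key describing structure only, not values."""
    if isinstance(node, dict):
        # unordered container: present it in sorted-key order so two
        # dicts with the same keys compare equal regardless of insertion
        return ('dict', tuple((k, structure_key(v))
                              for k, v in sorted(node.items())))
    if isinstance(node, list):
        # ordered container: order participates in the structure
        return ('list', tuple(structure_key(v) for v in node))
    return 'leaf'   # property values are irrelevant to the structure

a = {'x': [1, 2], 'y': {'z': 3}}
b = {'y': {'z': 99}, 'x': ['p', 'q']}
c = {'x': [1], 'y': {'z': 3}}

print(structure_key(a) == structure_key(b))  # True: same shape
print(structure_key(a) == structure_key(c))  # False: list lengths differ
```

Since the key is a nested tuple, hash(structure_key(tree)) gives the "data structure hash" directly.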


== More ideas (feel free to skip the brain dump or split it into different
thread)

For a generic, filesystem-like tree I need to iterate over the leaves in a
specified container, over the containers there, and over both leaves and
containers. I want to choose the failure behavior when I access a
non-existing node property, and given the default choice, I prefer to
avoid exceptions if possible: if a field doesn't exist, return None; if a
field doesn't have a value, supply an Empty class. In the data structure,
'None' is not a value but a statement of fact that there is no such field
in the data structure.

Why avoid exceptions? An exception is like an emergency procedure where you
lose the jet and cannot resume the flight from the point where you stopped.
You need to supply the parachute beforehand and make sure it fits in the
structure of your cabin. I mean that it is very hard to resume processing
after an exception if you're interrupted in the middle of a cycle.
Exceptions will occur anyway, but for the first iteration I'd like to see
exception-less data structure handling, using None semantics for absent
properties.

It will also make the check for field existence more consistent. Instead of
"if property.__name__ in node.__dict__", or even "if property in node", use
"if node.property != None", because the latter is not easy to confuse with
"if node in container".

Another question is whether the set of properties should be fixed or
expandable for a given node instance in a 'reference tree'. For flexibility
I like the latter, but for static analysis in an IDE it is better to get an
early warning when you assign a value to a non-existing tree node property.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121223/fb6c80aa/attachment.html>

From techtonik at gmail.com  Tue Dec 25 07:28:18 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 25 Dec 2012 09:28:18 +0300
Subject: [Python-ideas] Dynamic code NOPing
Message-ID: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>

For the logging module it would be extremely useful if Python included a way
to deactivate processing of certain blocks, to avoid trading off an
extensive logging harness against performance. For example, instead of
writing:

if log.DEBUG==True:
  log(factorial(2**15))

it should be possible to just write:

  log(factorial(2**15))

If log() is an instance of some Nopable class, the expression inside log's
parentheses is not evaluated.
-- 
anatoly t.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121225/95f8a923/attachment.html>

From andrew at ei-grad.ru  Tue Dec 25 08:23:53 2012
From: andrew at ei-grad.ru (Andrew Grigorev)
Date: Tue, 25 Dec 2012 13:23:53 +0600
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
Message-ID: <50D95489.8020307@ei-grad.ru>

It is possible now: you just have to move the resource-consuming
operations into the __str__ or __repr__ methods and rely on the logging
feature that a message's format string and arguments are not formatted
if the logging level is greater than the message level. For example:

from math import factorial
import logging

class Factorial:
    def __init__(self, n):
        self.n = n
    def calculate(self):
        return factorial(self.n)
    def __str__(self):
        return str(self.calculate())

logging.debug("Factorial of %d is %s", 2**15, Factorial(2**15))

25.12.2012 12:28, anatoly techtonik wrote:
> For the logging module it will be extremely useful if Python included 
> a way to disactivate processing certain blocks to avoid making 
> sacrifices between extensive logging harness and performance. For 
> example, instead of writing:
>
> if log.DEBUG==True:
>   log(factorial(2**15))
>
> It should be possible to just write:
>   log(factorial(2**15))
>
> if if log() is an instance of some Nopable class, the statement in 
> log's braces is not executed.
> -- 
> anatoly t.
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


-- 
Andrew

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121225/7e927afe/attachment.html>

From jstpierre at mecheye.net  Tue Dec 25 08:40:10 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Tue, 25 Dec 2012 02:40:10 -0500
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
Message-ID: <CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>

if __debug__:
    log(factorial(2**15))

Running python with -O will squash this statement. To have something
inline, you could also abuse assert statements to do the job.

def debug_log(x):
    log(x)
    return True

assert debug_log(factorial(2**15))

In optimized builds, the statement will be removed entirely.



On Tue, Dec 25, 2012 at 1:28 AM, anatoly techtonik <techtonik at gmail.com>wrote:

> For the logging module it will be extremely useful if Python included a
> way to disactivate processing certain blocks to avoid making sacrifices
> between extensive logging harness and performance. For example, instead of
> writing:
>
> if log.DEBUG==True:
>   log(factorial(2**15))
>
> It should be possible to just write:
>   log(factorial(2**15))
>
> if if log() is an instance of some Nopable class, the statement in log's
> braces is not executed.
> --
> anatoly t.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>


-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121225/1f9ca0dd/attachment.html>

From jkbbwr at gmail.com  Tue Dec 25 10:35:23 2012
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Tue, 25 Dec 2012 09:35:23 +0000
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
Message-ID: <CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>

Why not pass the function/method, args, kwargs to log.debug and let
log.debug decide if it should execute or not,
e.g.

log.debug(factorial, 2**15)


On Tue, Dec 25, 2012 at 7:40 AM, Jasper St. Pierre <jstpierre at mecheye.net>wrote:

> if __debug__:
>     log(factorial(2**15))
>
> Running python with -O will squash this statement. To have something
> inline, you could also abuse assert statements to do the job.
>
> def debug_log(x):
>     log(x)
>     return True
>
> assert debug_log(factorial(2**15))
>
> In optimized builds, the statement will be removed entirely.
>
>
>
> On Tue, Dec 25, 2012 at 1:28 AM, anatoly techtonik <techtonik at gmail.com>wrote:
>
>> For the logging module it will be extremely useful if Python included a
>> way to disactivate processing certain blocks to avoid making sacrifices
>> between extensive logging harness and performance. For example, instead of
>> writing:
>>
>> if log.DEBUG==True:
>>   log(factorial(2**15))
>>
>> It should be possible to just write:
>>   log(factorial(2**15))
>>
>> if if log() is an instance of some Nopable class, the statement in log's
>> braces is not executed.
>> --
>> anatoly t.
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>>
>
>
> --
>   Jasper
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>

From rene at stranden.com  Tue Dec 25 12:11:09 2012
From: rene at stranden.com (Rene Nejsum)
Date: Tue, 25 Dec 2012 12:11:09 +0100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
Message-ID: <5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>

Interesting alternatives, but they do not quite come across as flexible/useful enough?

Often debug statements have a lot of text and variables, like:

	log.debug( "The value of X, Y, Z is now: %d %s %d" % ( x, lookup(y), factorial(2**15)) )

It would be nice if the args to log.debug() were only evaluated if debug was on. But I don't think this is possible under the current Python evaluation rules.

But if debug() was indeed NOP'able, maybe it could be done ?

/Rene

On Dec 25, 2012, at 10:35 AM, Jakob Bowyer <jkbbwr at gmail.com> wrote:

> Why not pass the function/method, args, kwargs to log.debug and let log.debug decide if it should execute or not,
> e.g.
> 
> log.debug(factorial, 2**15)
> 
> 
> On Tue, Dec 25, 2012 at 7:40 AM, Jasper St. Pierre <jstpierre at mecheye.net> wrote:
> if __debug__:
>     log(factorial(2**15))
> 
> Running python with -O will squash this statement. To have something inline, you could also abuse assert statements to do the job.
> 
> def debug_log(x):
>     log(x)
>     return True
> 
> assert debug_log(factorial(2**15))
> 
> In optimized builds, the statement will be removed entirely.
> 
> 
> 
> On Tue, Dec 25, 2012 at 1:28 AM, anatoly techtonik <techtonik at gmail.com> wrote:
> For the logging module it will be extremely useful if Python included a way to disactivate processing certain blocks to avoid making sacrifices between extensive logging harness and performance. For example, instead of writing:
> 
> if log.DEBUG==True:
>   log(factorial(2**15))
> 
> It should be possible to just write:
>   log(factorial(2**15))
> 
> if if log() is an instance of some Nopable class, the statement in log's braces is not executed.
> -- 
> anatoly t.
> 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
> 
> 
> 
> 
> -- 
>   Jasper
> 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
> 
> 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From ncoghlan at gmail.com  Tue Dec 25 13:28:55 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Dec 2012 22:28:55 +1000
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
Message-ID: <CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>

On Tue, Dec 25, 2012 at 9:11 PM, Rene Nejsum <rene at stranden.com> wrote:
> But if debug() was indeed NOP'able, maybe it could be done ?

If someone *really* wants to do this, they can abuse assert statements
(which will be optimised out under "-O", just like code guarded by "if
__debug__"). That doesn't make it a good idea - you most need log
messages to investigate faults in production systems that you can't
(or are still trying to) reproduce in development and integration
environments. Compiling them out instead of deactivating them with
runtime configuration settings means you can't switch them on without
restarting the system with different options.

This does mean that you have to factor in the cost of logging into
your performance targets and hardware requirements, but the payoff is
an increased ability to correctly diagnose system faults (as well as
improving your ability to extract interesting metrics from log
messages).

Excessive logging calls certainly *can* cause performance problems due
to the function call overhead, as can careless calculation of
expensive values that aren't needed.  One alternative occasionally
noted is that you could design a logging API that accepts lazily
evaluated callables instead of ordinary parameters.

However, one danger of such expensive logging is that enabling that
logging level becomes infeasible in practice, because the performance
hit is too significant. The typical aim for logging is that your
overhead should be such that enabling it in production means your
servers run a little hotter, or your task takes a little longer, not
that your application grinds to a halt. One good way to achieve this
is to decouple the expensive calculations from the main application -
you instead log the necessary pieces of information, which can be
picked up by an external service and the calculation performed in a
separate process (or even on a separate machine) where it won't affect
the main application, and where you only calculate it if you actually
need it for some reason.
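[Editorial sketch: the decoupling described above boils down to logging the cheap inputs rather than the expensive derived value. A minimal illustration, with hypothetical names:]

```python
import logging

log = logging.getLogger("worker")

def process(job_id, n):
    # Cheap: record only the raw inputs needed to reconstruct the
    # expensive value later.
    log.debug("starting job=%s n=%d", job_id, n)
    # An external consumer of the log can run math.factorial(n)
    # offline, in a separate process or machine, and only for the
    # records it actually needs.
    return n  # placeholder result
```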

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From rene at stranden.com  Tue Dec 25 13:42:34 2012
From: rene at stranden.com (Rene Nejsum)
Date: Tue, 25 Dec 2012 13:42:34 +0100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
Message-ID: <563458C3-9580-46AE-B343-6987116A3F08@stranden.com>

I understand and agree with all your arguments on debugging.

At my company we typically make some kind of backend/server control software, with a LOT of debugging lines across many modules. We have 20+ debugging flags, and in different situations we enable a few of those; if we were to enable all at once it would definitely have an impact on production, but hopefully just a hotter CPU and a lot of disk space being used.

Debug statements in our code are probably one per 10-20 lines of code.

I think my main issue (and what I therefore read into the original suggestion) was the extra "if" statement at every log statement.

So doing:

if log.debug.enabled():
	log.debug( bla. bla. )

Adds 5-10% extra code lines, whereas if we could do:

log.debug( bla. bla )

at the same cost would save a lot of lines.

And when you have 43 lines in your editor, it will give you 3-5 more lines of real code to look at  :-)

/Rene



On Dec 25, 2012, at 1:28 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Tue, Dec 25, 2012 at 9:11 PM, Rene Nejsum <rene at stranden.com> wrote:
>> But if debug() was indeed NOP'able, maybe it could be done ?
> 
> If someone *really* wants to do this, they can abuse assert statements
> (which will be optimised out under "-O", just like code guarded by "if
> __debug__"). That doesn't make it a good idea - you most need log
> messages to investigate faults in production systems that you can't
> (or are still trying to) reproduce in development and integration
> environments. Compiling them out instead of deactivating them with
> runtime configuration settings means you can't switch them on without
> restarting the system with different options.
> 
> This does mean that you have to factor in the cost of logging into
> your performance targets and hardware requirements, but the payoff is
> an increased ability to correctly diagnose system faults (as well as
> improving your ability to extract interesting metrics from log
> messages).
> 
> Excessive logging calls certainly *can* cause performance problems due
> to the function call overhead, as can careless calculation of
> expensive values that aren't needed.  One alternatives occasional
> noted is that you could design a logging API that can accept lazily
> evaluated callables instead of ordinary parameters.
> 
> However, one danger of such expensive logging it that enabling that
> logging level becomes infeasible in practice, because the performance
> hit is too significant. The typical aim for logging is that your
> overhead should be such that enabling it in production means your
> servers run a little hotter, or your task takes a little longer, not
> that your application grinds to a halt. One good way to achieve this
> is to decouple the expensive calculations from the main application -
> you instead log the necessary pieces of information, which can be
> picked up by an external service and the calculation performed in a
> separate process (or even on a separate machine) where it won't affect
> the main application, and where you only calculate it if you actually
> need it for some reason.
> 
> Cheers,
> Nick.
> 
> -- 
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia



From ncoghlan at gmail.com  Tue Dec 25 14:00:40 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Dec 2012 23:00:40 +1000
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
Message-ID: <CADiSq7dzL0UAP4ZStbfLTHAqoTDD=ZVviMwCrDfUN6_-tVSYPQ@mail.gmail.com>

On Tue, Dec 25, 2012 at 10:42 PM, Rene Nejsum <rene at stranden.com> wrote:
> Add's 5-10% extra code lines, whereas if we could do:
>
> log.debug( bla. bla )
>
> at the same cost would save a lot of lines.

Right, that's where the lazy evaluation API idea comes in. When
there's no choice except to do the expensive calculation in-process
and you want to factor out the logging level check, it's possible to
replace the check with 7 characters embedded in the call:

    debug_lazy(lambda: bla. bla.)

You can also do much more sophisticated things with the logging event
handling system that only trigger if an event passes the initial
priority level check and gets submitted to the rest of the logging
machinery.
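[Editorial sketch: debug_lazy is not a stdlib API. A minimal version of such a wrapper, assuming a module-level logger, could look like this:]

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")

def debug_lazy(make_msg):
    # Only call make_msg() - and pay for the expensive calculation
    # inside it - if DEBUG records would actually be emitted.
    if log.isEnabledFor(logging.DEBUG):
        log.debug("%s", make_msg())

calls = []
def expensive():
    calls.append(1)
    return "bla bla"

debug_lazy(lambda: expensive())  # DEBUG disabled: expensive() never runs
```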

There's no magic wand we can wave to say "evaluate this immediately
sometimes, but lazily other times based on some unknown global state".
An API has to choose one or the other. The standard logging APIs
choose to do lazy evaluation of formatting calls, but eager evaluation
of the interpolated values in order to speed up the typical case of
readily accessible data - that's why the active level query API is
exposed. Another logging API could certainly make the other choice,
adapting to the standard APIs via the level query API. I don't know if
such an alternative API exists - my rule of thumb for logging calls is
if something is too expensive to calculate all the time, find a way to
instead pass the necessary pieces for external reconstruction to a
lazy formatting call rather than making a given level of logging
prohibitively expensive.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From rene at stranden.com  Tue Dec 25 14:24:52 2012
From: rene at stranden.com (Rene Nejsum)
Date: Tue, 25 Dec 2012 14:24:52 +0100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CADiSq7dzL0UAP4ZStbfLTHAqoTDD=ZVviMwCrDfUN6_-tVSYPQ@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<CADiSq7dzL0UAP4ZStbfLTHAqoTDD=ZVviMwCrDfUN6_-tVSYPQ@mail.gmail.com>
Message-ID: <DB9E42F0-D486-48FE-99F8-E36A3B508A86@stranden.com>

Thanks, I appreciate your answers and comments.

OT: Being brought up with a C/Java background, the depth of the Python language itself still amazes me. I wonder if there is a correlation to the Python community, where guys like Guido, Nick and all the others take time to answer questions in a friendly, informative and educational way.

Merry Christmas to all on the list.

/Rene

On Dec 25, 2012, at 2:00 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Tue, Dec 25, 2012 at 10:42 PM, Rene Nejsum <rene at stranden.com> wrote:
>> Add's 5-10% extra code lines, whereas if we could do:
>> 
>> log.debug( bla. bla )
>> 
>> at the same cost would save a lot of lines.
> 
> Right, that's where the lazy evaluation API idea comes in where
> there's no choice except to do the expensive calculation in process
> and you want to factor out the logging level check, it's possible to
> replace it with 7 characters embedded in the call:
> 
>    debug_lazy(lambda: bla. bla.)
> 
> You can also do much more sophisticated things with the logging event
> handling system that only trigger if an event passes the initial
> priority level check and gets submitted to the rest of the logging
> machinery.
> 
> There's no magic wand we can wave to say "evaluate this immediately
> sometimes, but lazily other times based on some unknown global state".
> An API has to choose one or the other. The standard logging APIs
> chooses do lazy evaluation of formatting calls, but eager evaluation
> of the interpolated values in order to speed up the typical case of
> readily accessible data - that's why the active level query API is
> exposed. Another logging API could certainly make the other choice,
> adapting to the standard APIs via the level query API. I don't know if
> such an alternative API exists - my rule of thumb for logging calls is
> if something is too expensive to calculate all the time, find a way to
> instead pass the necessary pieces for external reconstruction to a
> lazy formatting call rather than making a given level of logging
> prohibitively expensive.
> 
> Cheers,
> Nick.
> 
> -- 
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia



From shibturn at gmail.com  Tue Dec 25 15:43:10 2012
From: shibturn at gmail.com (Richard Oudkerk)
Date: Tue, 25 Dec 2012 14:43:10 +0000
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <50D95489.8020307@ei-grad.ru>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<50D95489.8020307@ei-grad.ru>
Message-ID: <kbce26$kfp$1@ger.gmane.org>

On 25/12/2012 7:23am, Andrew Grigorev wrote:
>
> class Factorial:
>      def __init__(self, n):
>          self.n = n
>      def calculate(self):
>          return factorial(self.n)
>      def __str__(self):
>          return str(self.calculate())
>
> logging.debug("Factorial of %d is %s", 2**15, Factorial(2**15))

A more generic alternative would be

     class str_partial(functools.partial):
         def __str__(self):
             return str(self())

     logging.debug("Factorial of %d is %s",
                   2**15, str_partial(factorial, 2**15))

-- 
Richard



From vinay_sajip at yahoo.co.uk  Tue Dec 25 15:45:20 2012
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Tue, 25 Dec 2012 14:45:20 +0000 (UTC)
Subject: [Python-ideas] Dynamic code NOPing
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
Message-ID: <loom.20121225T153325-693@post.gmane.org>

Rene Nejsum <rene at ...> writes:

> So doing:
> 
> if log.debug.enabled():
> 	log.debug( bla. bla. )
> 
> Add's 5-10% extra code lines, whereas if we could do:
> 
> log.debug( bla. bla )
> 
> at the same cost would save a lot of lines.

Bearing in mind that the first statement in the debug (and analogous methods) is
a check for the level, the only thing you gain by having the same check outside
the call is the cost of evaluating arguments. But you can also do this by
passing an arbitrary class as the message object, which lazily evaluates only
when needed. Contrived example:

class Message(object):
    def __init__(self, func, x, y): # params should be cheap to evaluate
        self.func = func
        self.x = x
        self.y = y

    def __str__(self):
        return str(self.func(self.x**self.y)) # expense is incurred here

logger.debug(Message(factorial, 2, 15))

With this setup, no if statements are needed in your code, and the expensive
computations only occur when required.
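[Editorial sketch: this works because logging only calls str() on the message object when a record is actually emitted, which can be checked directly. The class name here is illustrative:]

```python
import logging

class Counted:
    calls = 0
    def __str__(self):
        Counted.calls += 1
        return "expensive result"

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("lazy-demo")

log.debug(Counted())    # below the level cutoff: __str__ never runs
log.warning(Counted())  # emitted: __str__ runs when the record is formatted
```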

Regards,

Vinay Sajip



From ram.rachum at gmail.com  Tue Dec 25 22:46:22 2012
From: ram.rachum at gmail.com (Ram Rachum)
Date: Tue, 25 Dec 2012 13:46:22 -0800 (PST)
Subject: [Python-ideas] Allow accessing return value inside finally clause
Message-ID: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>

Say I have this function:

    def f():
        try:
            return whatever()
        finally:
            pass # I want to get what `whatever()` returned in here

I want to get the return value from inside the `finally` clause.

I understand that this is currently not possible. I'd like that to be 
possible because that would allow post-processing of a function's return 
value.

What do you think?


Thanks,
Ram.

From g.brandl at gmx.net  Tue Dec 25 22:55:41 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 25 Dec 2012 22:55:41 +0100
Subject: [Python-ideas] Allow accessing return value inside finally
	clause
In-Reply-To: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
References: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
Message-ID: <kbd7co$cmv$1@ger.gmane.org>

On 12/25/2012 10:46 PM, Ram Rachum wrote:
> Say I have this function:
> 
>     def f():
>         try:
>             return whatever()
>         finally:
>             pass # I want to get what `whatever()` returned in here
> 
> I want to get the return value from inside the `finally` clause.
> 
> I understand that this is currently not possible. I'd like that to be possible
> because that would allow post-processing of a function's return value.
>
> What do you think?

Please supply a more complete example of what you are trying to achieve.

As it is, I wonder what your motivation is for using a "finally", because
in the case of an exception, there won't even *be* a return value to
postprocess.

If you're trying to use try-finally as a sort of nonlocal exit mechanism (like
the famous "goto done" in CPython sources), you probably would be fine with

ret = None
try:
    if x:
        ret = blah
        return
    # more cases with returns here
finally:
    # post-process ret here
    return ret

But I would consider this an abuse of try-finally, especially since it
suppresses proper propagation of exceptions.

cheers,
Georg




From paul at colomiets.name  Tue Dec 25 23:24:22 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 26 Dec 2012 00:24:22 +0200
Subject: [Python-ideas] collections.sortedset proposal
Message-ID: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>

Hi,

I want to propose to include SortedSet data structure into collections module.

SortedSet (name borrowed from Redis) is basically a mapping of
(unique) keys to scores that allows fast slicing by ordinal number
and by score.

There are plenty of use cases for the sorted sets:

* Leaderboard for a game
* Priority queue (that supports task deletion)
* Timer list (e.g. can be used for tulip, supports deletion too)
* Caches with TTL-based, LFU or LRU eviction (including `functools.lru_cache`)
* Search databases with relevance scores
* Statistics (many use cases)
* Replacement for `collections.Counter` with faster `most_common()`

I have first draft of pure python implementation:

https://github.com/tailhook/sortedsets
http://pypi.python.org/pypi/sortedsets/1.0

The implementation is closely modeled on Redis. Internally it consists
of a dict for mapping between keys and scores, and a skiplist for
scores. So most operations are done with O(log n) time. The actual
performance is probably very slow for pure-python implementation, but
can be fixed by C code later. The asymptotic performance seems to be
OK.
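[Editorial sketch: a toy dict-plus-sorted-list illustration of the interface idea. This is not the API of the linked draft, which uses a skiplist so that updates are also O(log n); a plain bisect-maintained list makes add() O(n), but the read operations are easy to see:]

```python
import bisect

class TinySortedSet:
    """Toy sorted set: unique keys mapped to scores, with rank and
    score-range queries. Names here are illustrative only."""
    def __init__(self):
        self._score = {}   # key -> score
        self._items = []   # sorted list of (score, key) pairs

    def add(self, key, score):
        if key in self._score:
            old = (self._score[key], key)
            self._items.pop(bisect.bisect_left(self._items, old))
        self._score[key] = score
        bisect.insort(self._items, (score, key))

    def rank(self, key):
        # ordinal position of a key, ordered by score
        return bisect.bisect_left(self._items, (self._score[key], key))

    def range_by_score(self, lo, hi):
        # all keys whose score falls in [lo, hi]; chr(0x10FFFF) is a
        # crude "largest string" sentinel for the upper bound
        i = bisect.bisect_left(self._items, (lo,))
        j = bisect.bisect_right(self._items, (hi, chr(0x10FFFF)))
        return [key for _, key in self._items[i:j]]

s = TinySortedSet()
s.add("bob", 100)
s.add("carol", 200)
s.add("alice", 300)
```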

So my questions are:

1. Do you think SortedSets are eligible for inclusion to stdlib?
2. Do I need a PEP?
3. Any comments on the implementation?


P.S.: Sorted sets in redis are not the same thing as sorted sets in
blist. So maybe a better name?

--
Paul


From techtonik at gmail.com  Wed Dec 26 00:04:15 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 26 Dec 2012 02:04:15 +0300
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <loom.20121225T153325-693@post.gmane.org>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<loom.20121225T153325-693@post.gmane.org>
Message-ID: <CAPkN8x+uTKEq2KioAoQAzHcON1+hdtCxupoog+CELnqQQuq78w@mail.gmail.com>

On Tue, Dec 25, 2012 at 5:45 PM, Vinay Sajip <vinay_sajip at yahoo.co.uk>wrote:

> Rene Nejsum <rene at ...> writes:
>
> > So doing:
> >
> > if log.debug.enabled():
> >       log.debug( bla. bla. )
> >
> > Add's 5-10% extra code lines, whereas if we could do:
> >
> > log.debug( bla. bla )
> >
> > at the same cost would save a lot of lines.
>
> Bearing in mind that the first statement in the debug (and analogous
> methods) is
> a check for the level, the only thing you gain by having the same check
> outside
> the call is the cost of evaluating arguments. But you can also do this by
> passing an arbitrary class as the message object, which lazily evaluates
> only
> when needed. Contrived example:
>
> class Message(object):
>     def __init__(self, func, x, y): # params should be cheap to evaluate
>         self.func = func
>         self.x = x
>         self.y = y
>
>     def __str__(self):
>         return str(self.func(self.x**self.y)) # expense is incurred here
>
> logger.debug(Message(factorial, 2, 15))
>
> With this setup, no if statements are needed in your code, and the
> expensive
> computations only occur when required.
>

That's still two function calls and three assignments per logging call -
too expensive, and the syntax is unwieldy. I think everybody agrees now
that for the existing CPython implementation there is really no solution
to the problem of expensive logging calls vs. code clarity. You have to
implement an optimization workaround at the cost of readability.

The idea is to fix the interpreter by introducing a "feature block" - an
execution block that is executed only if it is enabled. The execution
block for the logging example below is defined by the function name
"debug" and braces ():

    debug( <block contents> )

debug is an object of 'feature' type, which is only executed/evaluated if
the feature is enabled in a table of features.

It might be possible to implement this as a custom version of PyPy. Then
hardcoding logic for treating logging calls as 'featured' should give an
immediate performance boost to any project. Still, it would be nice if
logging were built with a supported layout for easy optimization, or for
'from __future__ import features.logging'.

From steve at pearwood.info  Wed Dec 26 00:02:24 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 26 Dec 2012 10:02:24 +1100
Subject: [Python-ideas] Allow accessing return value inside finally
	clause
In-Reply-To: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
References: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
Message-ID: <50DA3080.8080808@pearwood.info>

On 26/12/12 08:46, Ram Rachum wrote:
> Say I have this function:
>
>      def f():
>          try:
>              return whatever()
>          finally:
>              pass # I want to get what `whatever()` returned in here
>
> I want to get the return value from inside the `finally` clause.
>
> I understand that this is currently not possible. I'd like that to be
> possible because that would allow post-processing of a function's return
> value.

The usual ways to do that are:


def f():
     return postprocess(whatever())


or:

def f():
     return whatever()

x = postprocess(f())



Are these usual solution not suitable for your use-case?



-- 
Steven


From rene at stranden.com  Wed Dec 26 00:36:22 2012
From: rene at stranden.com (Rene Nejsum)
Date: Wed, 26 Dec 2012 00:36:22 +0100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CAPkN8x+uTKEq2KioAoQAzHcON1+hdtCxupoog+CELnqQQuq78w@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<loom.20121225T153325-693@post.gmane.org>
	<CAPkN8x+uTKEq2KioAoQAzHcON1+hdtCxupoog+CELnqQQuq78w@mail.gmail.com>
Message-ID: <08359A6F-73DB-495A-A580-9B81DB975966@stranden.com>

I think we all agree that it cannot be done in Python right now?

But I doubt there will be support for a solution just for debugging, and I am having a hard time coming up with other examples.

Quick thought (very quick) and I am no expert, but maybe an acceptable/compatible solution could be:

def do_debug(*args):
	print('DEBUG:', args)

def nop_debug(*args):
	pass # Empty function

debug = do_debug
debug( "Some evaluated text %d %d %d" % (1, 2, fact(22)) )

debug = nop_debug
debug( "Will not be evaluated, since Python is clever enough to optimise out")

At least some kind of -O option could optimise this out ?

Then again, there are probably lots of reasons for this not to work :-)
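[Editorial sketch: one such reason can be demonstrated directly. Rebinding debug to a no-op does not stop Python from evaluating the argument expressions before the call, so the expensive work happens anyway:]

```python
calls = []

def expensive():
    calls.append(1)
    return 42

def nop_debug(*args):
    pass  # empty function

debug = nop_debug
# The argument expression is evaluated *before* nop_debug is entered,
# so expensive() still runs even though nothing is logged:
debug("value: %d" % expensive())
```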

/Rene


On Dec 26, 2012, at 12:04 AM, anatoly techtonik <techtonik at gmail.com> wrote:

> On Tue, Dec 25, 2012 at 5:45 PM, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
> Rene Nejsum <rene at ...> writes:
> 
> > So doing:
> >
> > if log.debug.enabled():
> >       log.debug( bla. bla. )
> >
> > Add's 5-10% extra code lines, whereas if we could do:
> >
> > log.debug( bla. bla )
> >
> > at the same cost would save a lot of lines.
> 
> Bearing in mind that the first statement in the debug (and analogous methods) is
> a check for the level, the only thing you gain by having the same check outside
> the call is the cost of evaluating arguments. But you can also do this by
> passing an arbitrary class as the message object, which lazily evaluates only
> when needed. Contrived example:
> 
> class Message(object):
>     def __init__(self, func, x, y): # params should be cheap to evaluate
>         self.func = func
>         self.x = x
>         self.y = y
> 
>     def __str__(self):
>         return str(self.func(self.x**self.y)) # expense is incurred here
> 
> logger.debug(Message(factorial, 2, 15))
> 
> With this setup, no if statements are needed in your code, and the expensive
> computations only occur when required.
> 
> That's still two function calls and three assignments per logging call. Too expensive and syntax unwieldy. I think everybody agrees now that for existing CPython implementation there is really no solution for the problem of expensive logging calls vs code clarity. You have to implement optimization workaround at the cost of readability.
> 
> The idea is to fix the interpreter, introducing a "feature block" - execution block that works only if it is enabled. Execution block for logging example below is defined by function name "debug" and braces ().
> 
>     debug( <block contents> )
> 
> debug is an object of 'feature' type, which is only executed/evaluated, if the feature is enabled in a table of features.
> 
> It might be possible to implement this as a custom version of PyPy. Then by hardcoding logic for treating logging call as 'featured'  should give an immediate performance boost to any project. Still it would be nice if logging was build with supported layout for easy optimization or for 'from __future__ import features.logging' .
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From wuwei23 at gmail.com  Wed Dec 26 00:54:42 2012
From: wuwei23 at gmail.com (alex23)
Date: Tue, 25 Dec 2012 15:54:42 -0800 (PST)
Subject: [Python-ideas] Allow accessing return value inside finally
	clause
In-Reply-To: <CANXboVZXTfz5ewm0bixmKR2SMj-dBBS=80_Ky3mbi+UW3sxLLA@mail.gmail.com>
References: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
	<kbd7co$cmv$1@ger.gmane.org>
	<CANXboVZXTfz5ewm0bixmKR2SMj-dBBS=80_Ky3mbi+UW3sxLLA@mail.gmail.com>
Message-ID: <a859719d-33f2-4f47-a6d3-2b0ebb9820c6@pp8g2000pbb.googlegroups.com>

On 26 Dec, 08:06, Ram Rachum <r... at rachum.com> wrote:
> Now of course, you can find other solutions to this problem. You can write
> a decorator to do the post-processing phase, or you could divide the whole
> thing into 2 functions. But I think that sometimes, the
> `finally`-postprocess idiom I propose will be the most succinct one.

I initially responded to say "use a decorator", but you're already
aware of the common pattern for dealing with this, and yet you'd
rather the language change instead?

> (Regarding lack of return value and propagating exceptions: This all sounds
> solvable to me. Why not let the `finally` clause detect what's going on and
> react appropriately? No `return` value? Don't postprocess. Exception
> raised? Don't interfere.)

People already struggle with understanding the semantics of try/
finally - you yourself demonstrated this in your first post by not
being aware that the 'return value' may not be set - and you want to
make it _more_ magic?

You're a programmer. At the end of the day, you're going to have to do
_some_ "heavy" lifting by yourself. Assign your return values to an
object that performs the post-processing on demand. Create a context
manager that does it when it exits. Write a loop to decorate your
functions if typing @decorator is so strenuous.

Making Python less clear to save yourself some typing isn't a decent
trade off.
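
The decorator pattern being referred to can be sketched as follows
(a minimal illustration; the names are made up, not from any library):

```python
import functools

def postprocess(after):
    """Decorator factory: run `after` on the decorated function's
    return value -- the explicit alternative to post-processing
    inside a `finally` clause."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return after(func(*args, **kwargs))
        return wrapper
    return decorator

@postprocess(str.upper)
def greet(name):
    return "hello, %s" % name
```

Here `greet("world")` returns "HELLO, WORLD": the post-processing step
is visible at the definition site instead of being buried in a
`finally` block.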


From wuwei23 at gmail.com  Wed Dec 26 00:55:22 2012
From: wuwei23 at gmail.com (alex23)
Date: Tue, 25 Dec 2012 15:55:22 -0800 (PST)
Subject: [Python-ideas] Allow accessing return value inside finally
	clause
In-Reply-To: <CANXboVYCCh3-=yGxO_EYOZVFeo6AGZ5r1ZK66O3N9L-yAeOKRg@mail.gmail.com>
References: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
	<50DA3080.8080808@pearwood.info>
	<CANXboVYCCh3-=yGxO_EYOZVFeo6AGZ5r1ZK66O3N9L-yAeOKRg@mail.gmail.com>
Message-ID: <0b2afe6b-7ea3-4e22-9696-c9a5021fefe5@t6g2000pba.googlegroups.com>

On 26 Dec, 09:11, Ram Rachum <r... at rachum.com> wrote:
> That works, sure. I've mentioned this in my email above. But I think that
> in some cases making the post-process in the `finally` clause will be more
> elegant.

What you deem "elegant", I see as "laziness".


From techtonik at gmail.com  Wed Dec 26 01:10:42 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 26 Dec 2012 03:10:42 +0300
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
Message-ID: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>

I am thinking about [python-wart] on SO. There is currently no list of
Python warts, and building a better language is impossible without clear
visibility of the warts in current implementations.

Why Roundup doesn't work ATM:
- warts are lost among other "won't fix" and "works for me" issues
- no way to edit the description to make it clearer
- no voting/stars to gauge how important an issue is
- no comment/noise filtering
and, most valuable of all:
- there is no query to list warts sorted by popularity, to explore
time-consuming areas of Python you are not aware of but which may pop up
one day

SO at least allows:
+ voting
+ community wiki edits
+ useful comment upvoting
+ sorted lists
+ user editable tags (adding new warts is easy)

This post is the result of facing numerous locals/settrace/exec issues
that are closed on the tracker. I also have my own list of other issues
(logging/subprocess) at the GC project, which I might be unable to
maintain in the future. There is also some undocumented stuff (subprocess
deadlocks) that I'm investigating, but don't have time to write up. So
I'd rather move this somewhere it can be updated.
-- 
anatoly t.

From greg.ewing at canterbury.ac.nz  Tue Dec 25 22:49:54 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 26 Dec 2012 10:49:54 +1300
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
Message-ID: <50DA1F82.7010600@canterbury.ac.nz>

Rene Nejsum wrote:
> Interesting alternatives, but they do not quite come across as
> flexible/useful enough?
> 
> Often debug statements have a lot of text and variables, like:
> 
> log.debug("The value of X, Y, Z is now: %d %s %d" % (x, lookup(y),
> factorial(2**15)))

That needn't be a problem:

    log.lazydebug(lambda: "The value of X, Y, Z is now: %d %s %d" %
       (x, lookup(y), factorial(2**15)))
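
(`lazydebug` is not an existing logging method; a minimal sketch of
such a helper on top of the standard logging module could be:)

```python
import logging

def lazydebug(logger, msg_func):
    # Hypothetical helper: only invoke the (possibly expensive)
    # zero-argument msg_func when DEBUG output would actually be emitted.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("%s", msg_func())
```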

-- 
Greg


From haoyi.sg at gmail.com  Wed Dec 26 03:50:21 2012
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Wed, 26 Dec 2012 10:50:21 +0800
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <50DA1F82.7010600@canterbury.ac.nz>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<50DA1F82.7010600@canterbury.ac.nz>
Message-ID: <CALruUQKKomVrdANrv1nHg-vJ2LjuVWx1apj0q=SrhsLdycGR8g@mail.gmail.com>

I think the lambda solution really is the best one. The additional
cost is the construction of one function object and one invocation per
logging call, which I suspect is about the lower limit.

It's also the most generally applicable: it has nothing specific to logging
in it at all! So it seems to me that if we were to change anything,
improving the lambdas (shorter syntax and/or optimizing away the overhead)
would be the way to go over some
string-interpolation-logging-specific special case in the interpreter.


On Wed, Dec 26, 2012 at 5:49 AM, Greg Ewing <greg.ewing at canterbury.ac.nz>wrote:

> Rene Nejsum wrote:
>
>> Interesting alternatives, but they do not quite come across as
>> flexible/useful enough?
>>
>> Often debug statements have a lot of text and variables, like:
>>
>> log.debug("The value of X, Y, Z is now: %d %s %d" % (x, lookup(y),
>> factorial(2**15)))
>>
>
> That needn't be a problem:
>
>    log.lazydebug(lambda: "The value of X, Y, Z is now: %d %s %d" %
>       (x, lookup(y), factorial(2**15)))
>
> --
> Greg
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From wuwei23 at gmail.com  Wed Dec 26 04:09:30 2012
From: wuwei23 at gmail.com (alex23)
Date: Tue, 25 Dec 2012 19:09:30 -0800 (PST)
Subject: [Python-ideas] Allow accessing return value inside finally
	clause
In-Reply-To: <CANXboVaS41=YU96t_F0FdcKBufYvUMBDwLQs4ZpxW2UXf61aXw@mail.gmail.com>
References: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
	<kbd7co$cmv$1@ger.gmane.org>
	<CANXboVZXTfz5ewm0bixmKR2SMj-dBBS=80_Ky3mbi+UW3sxLLA@mail.gmail.com>
	<a859719d-33f2-4f47-a6d3-2b0ebb9820c6@pp8g2000pbb.googlegroups.com>
	<CANXboVaS41=YU96t_F0FdcKBufYvUMBDwLQs4ZpxW2UXf61aXw@mail.gmail.com>
Message-ID: <5b1fc750-96d1-44a7-bb7c-1bf3ddd89aa2@jl13g2000pbb.googlegroups.com>

On 26 Dec, 10:44, Ram Rachum <r... at rachum.com> wrote:
> I don't think that this makes Python less clear;

How can you possibly say this?

You've changed the `finally` clause from _guaranteed_ execution to
something utterly inconsistent. In fact, finally blocks would need to
have _more_ code to guard against all of the different execution
models you're proposing here. I'm not sure why you think forcing me to
write more & less obvious code in a finally block is a better trade
off than you making clear, explicit use of decorators.

Ambiguity does not equate to clarity. Less typing doesn't either.
Creating small re-usable pieces of code that do the "hard work" for
you, however, _is a lot more clear_.

> I think it's just another
> minor feature that might be useful for some people, and for people who
> don't, it won't matter. How many people use the `for..else` feature, for
> example? Very, very few people do. I've used it only several times. But
> it's still part of Python because it helps in a few rare cases, so that
> makes it worth it *despite* the fact that it might confuse a newbie.

The behaviour of `for..else` doesn't change based on arbitrary
conditions, whereas what you propose is that the finally blocks
behaviour is _fundamentally_ different depending on whether the try
block is fully executed or not, whether an exception is raised or not.
This is absolutely not the same thing, and trying to pass this concern
off as "confusing to newbies" is rather disingenuous. The behaviour
would be _confusing to everybody_.

This is not a valid cost to save you from having to type a few more
keystrokes to decorate the return value.


From jstpierre at mecheye.net  Wed Dec 26 04:25:22 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Tue, 25 Dec 2012 22:25:22 -0500
Subject: [Python-ideas] Allow accessing return value inside finally
	clause
In-Reply-To: <5b1fc750-96d1-44a7-bb7c-1bf3ddd89aa2@jl13g2000pbb.googlegroups.com>
References: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
	<kbd7co$cmv$1@ger.gmane.org>
	<CANXboVZXTfz5ewm0bixmKR2SMj-dBBS=80_Ky3mbi+UW3sxLLA@mail.gmail.com>
	<a859719d-33f2-4f47-a6d3-2b0ebb9820c6@pp8g2000pbb.googlegroups.com>
	<CANXboVaS41=YU96t_F0FdcKBufYvUMBDwLQs4ZpxW2UXf61aXw@mail.gmail.com>
	<5b1fc750-96d1-44a7-bb7c-1bf3ddd89aa2@jl13g2000pbb.googlegroups.com>
Message-ID: <CAA0H+QTBY=9joe935nq1TWCpHb2+utXo-gvtcV6Pw3_yzFb6pw@mail.gmail.com>

Ram, please make sure you reply on-list. I cannot see your replies here.


On Tue, Dec 25, 2012 at 10:09 PM, alex23 <wuwei23 at gmail.com> wrote:

> On 26 Dec, 10:44, Ram Rachum <r... at rachum.com> wrote:
> > I don't think that this makes Python less clear;
>
> How can you possibly say this?
>
> You've changed the `finally` clause from _guaranteed_ execution to
> something utterly inconsistent. In fact, finally blocks would need to
> have _more_ code to guard against all of the different execution
> models you're proposing here. I'm not sure why you think forcing me to
> write more & less obvious code in a finally block is a better trade
> off than you making clear, explicit use of decorators.
>
> Ambiguity does not equate to clarity. Less typing doesn't either.
> Creating small re-usable pieces of code that do the "hard work" for
> you, however, _is a lot more clear_.
>
> > I think it's just another
> > minor feature that might be useful for some people, and for people who
> > don't, it won't matter. How many people use the `for..else` feature, for
> > example? Very, very few people do. I've used it only several times. But
> > it's still part of Python because it helps in a few rare cases, so that
> > makes it worth it *despite* the fact that it might confuse a newbie.
>
> The behaviour of `for..else` doesn't change based on arbitrary
> conditions, whereas what you propose is that the finally blocks
> behaviour is _fundamentally_ different depending on whether the try
> block is fully executed or not, whether an exception is raised or not.
> This is absolutely not the same thing, and trying to pass this concern
> off as "confusing to newbies" is rather disingenuous. The behaviour
> would be _confusing to everybody_.
>
> This is not a valid cost to save you from having to type a few more
> keystrokes to decorate the return value.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
  Jasper

From ned at nedbatchelder.com  Wed Dec 26 05:12:26 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Tue, 25 Dec 2012 23:12:26 -0500
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CAPkN8x+uTKEq2KioAoQAzHcON1+hdtCxupoog+CELnqQQuq78w@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<loom.20121225T153325-693@post.gmane.org>
	<CAPkN8x+uTKEq2KioAoQAzHcON1+hdtCxupoog+CELnqQQuq78w@mail.gmail.com>
Message-ID: <50DA792A.5020700@nedbatchelder.com>

On 12/25/2012 6:04 PM, anatoly techtonik wrote:
> > logger.debug(Message(factorial, 2, 15))
>
> > With this setup, no if statements are needed in your code, and the 
> expensive
> > computations only occur when required.
>
> That's still two function calls and three assignments per logging 
> call. Too expensive and syntax unwieldy. I think everybody agrees now 
> that for existing CPython implementation there is really no solution 
> for the problem of expensive logging calls vs code clarity. You have 
> to implement optimization workaround at the cost of readability.

Anatoly, do you have some measurements to justify the "too expensive" 
claim?  Also, do you have an actual example of expensive logging?  I 
doubt your real code is logging the factorial of 2**15.   What is 
actually in your debug log that is expensive?  It will be much easier to 
discuss solutions if we are talking about actual problems.

>
> The idea is to fix the interpreter, introducing a "feature block" - 
> execution block that works only if it is enabled. Execution block 
> for logging example below is defined by function name "debug" and 
> braces ().
>
>     debug( <block contents> )
>
> debug is an object of 'feature' type, which is only 
> executed/evaluated, if the feature is enabled in a table of features.
>

This feels both sketchy and strange, and not at all integrated with 
existing Python semantics.

--Ned.


From dstanek at dstanek.com  Wed Dec 26 05:46:39 2012
From: dstanek at dstanek.com (David Stanek)
Date: Tue, 25 Dec 2012 23:46:39 -0500
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <50DA792A.5020700@nedbatchelder.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<loom.20121225T153325-693@post.gmane.org>
	<CAPkN8x+uTKEq2KioAoQAzHcON1+hdtCxupoog+CELnqQQuq78w@mail.gmail.com>
	<50DA792A.5020700@nedbatchelder.com>
Message-ID: <CAO69NdkcXnPczgsdY=qiEQ8TdZwLMDVyPTECM--GFF79dgVTig@mail.gmail.com>

On Tue, Dec 25, 2012 at 11:12 PM, Ned Batchelder <ned at nedbatchelder.com>wrote:

> On 12/25/2012 6:04 PM, anatoly techtonik wrote:
>
>> > logger.debug(Message(factorial, 2, 15))
>>
>> > With this setup, no if statements are needed in your code, and the
>> expensive
>> > computations only occur when required.
>>
>> That's still two function calls and three assignments per logging call.
>> Too expensive and syntax unwieldy. I think everybody agrees now that for
>> existing CPython implementation there is really no solution for the problem
>> of expensive logging calls vs code clarity. You have to implement
>> optimization workaround at the cost of readability.
>>
>
> Anatoly, do you have some measurements to justify the "too expensive"
> claim?  Also, do you have an actual example of expensive logging?  I doubt
> your real code is logging the factorial of 2**15.   What is actually in
> your debug log that is expensive?  It will be much easier to discuss
> solutions if we are talking about actual problems.
>
>
I was thinking the same thing as I read through this thread. I'm
typically logging the result of a calculation, not doing a calculation
only because I'm logging.

On the other hand I have used a homegrown logging system (existed well
before Python's logging module) that allowed the following:

   >>> logger.warn('factorial = %s', lambda: factorial(2**15))

Instead of just outputting the string representation of the lambda the
logger would evaluate the function and str() the return value. Something
like this would be trivial to implement on top of Python's logging module.
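
One way to get that behaviour from the stdlib module is to exploit the
fact that logging formats its arguments lazily: wrap the callable in an
object whose str() triggers the call (a sketch, not David's actual
homegrown system):

```python
class Deferred:
    """Evaluate a zero-argument callable only when str() is applied,
    i.e. only when a handler actually formats the log record."""
    def __init__(self, func):
        self.func = func

    def __str__(self):
        return str(self.func())
```

With this, `logger.warning('factorial = %s', Deferred(lambda:
factorial(2**15)))` pays the cost of the computation only if the record
passes the level check and reaches a handler.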

-- 
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek
www: http://dstanek.com

From rosuav at gmail.com  Wed Dec 26 06:48:50 2012
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 26 Dec 2012 16:48:50 +1100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <50DA792A.5020700@nedbatchelder.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<loom.20121225T153325-693@post.gmane.org>
	<CAPkN8x+uTKEq2KioAoQAzHcON1+hdtCxupoog+CELnqQQuq78w@mail.gmail.com>
	<50DA792A.5020700@nedbatchelder.com>
Message-ID: <CAPTjJmqEZDXh3GSuoQX_OZfoxrmv7HNMg5qL9UEmM5eerVzdkA@mail.gmail.com>

On Wed, Dec 26, 2012 at 3:12 PM, Ned Batchelder <ned at nedbatchelder.com> wrote:
> Also, do you have an actual example of expensive logging?  I doubt your real
> code is logging the factorial of 2**15.   What is actually in your debug log
> that is expensive?  It will be much easier to discuss solutions if we are
> talking about actual problems.

Not specifically a Python logging issue, but what I periodically find
in my code is that there's an "internal representation" and an
"external representation" that have some sort of direct relationship.
I could easily log the internal form at many points, but that's not
particularly useful; logging the external involves either some hefty
calculations, or perhaps a linear search of some list of possibilities
(eg a reverse lookup of a constant - do you pay the cost of building
up a reverse dictionary, or just do the search?). Obviously that's
nothing like as expensive as 2**15!, but it makes more sense to be
logging "WM_MOUSEMOVE" than "Msg 512".
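
The reverse-dictionary side of that trade-off is cheap if built once up
front (the constants below are just illustrative values):

```python
def reverse_lookup(constants, prefix):
    # Map constant values back to their names, e.g. 512 -> "WM_MOUSEMOVE",
    # so logs can show the external representation instead of "Msg 512".
    return {value: name
            for name, value in constants.items()
            if name.startswith(prefix)}

WM_MOUSEMOVE = 0x0200
WM_LBUTTONDOWN = 0x0201
MSG_NAMES = reverse_lookup({"WM_MOUSEMOVE": WM_MOUSEMOVE,
                            "WM_LBUTTONDOWN": WM_LBUTTONDOWN}, "WM_")
```

After that, `MSG_NAMES.get(msg, "Msg %d" % msg)` turns the internal form
into a readable log string in O(1) per lookup.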

ChrisA


From ram.rachum at gmail.com  Tue Dec 25 14:56:45 2012
From: ram.rachum at gmail.com (Ram Rachum)
Date: Tue, 25 Dec 2012 05:56:45 -0800 (PST)
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
Message-ID: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>

When I have an OrderedDict, I want to be able to delete a slice of it. I 
want to be able to do:

    del ordered_dict[:3]

To delete the first 3 items, like I would do in a list.

Is there any reason why this shouldn't be implemented?


Thanks,
Ram.

From ncoghlan at gmail.com  Wed Dec 26 08:27:52 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Dec 2012 17:27:52 +1000
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
In-Reply-To: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
References: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
Message-ID: <CADiSq7dvsg-3oO1eMxbgGsPtfs0wKYfWqfruHsRcufSaePJb+w@mail.gmail.com>

(replying again, as the original somehow had a broken googlegroups.com
address instead of the proper python.org one)

On Tue, Dec 25, 2012 at 11:56 PM, Ram Rachum <ram.rachum at gmail.com> wrote:
> When I have an OrderedDict, I want to be able to delete a slice of it. I
> want to be able to do:
>
>     del ordered_dict[:3]
>
> To delete the first 3 items, like I would do in a list.
>
> Is there any reason why this shouldn't be implemented?

Yes, because if you want to do that, you need a list, not an ordered
dictionary. Don't try to lump every possible operation into one
incoherent uber-type. If you need mutable list-like behaviour *and*
mapping behaviour, you're better off with an ordinary mapping and a
separate list of keys.
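
A minimal sketch of that "mapping plus key list" combination (names are
illustrative only):

```python
# Keep the mapping and the ordering separately; slice deletion then
# operates on the list and is mirrored into the dict.
data = {"a": 1, "b": 2, "c": 3, "d": 4}
order = ["a", "b", "c", "d"]

def delete_slice(mapping, key_order, sl):
    for key in key_order[sl]:
        del mapping[key]
    del key_order[sl]

delete_slice(data, order, slice(None, 3))  # like `del seq[:3]`
```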

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From tjreedy at udel.edu  Wed Dec 26 08:45:09 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 26 Dec 2012 02:45:09 -0500
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
In-Reply-To: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
References: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
Message-ID: <kbe9um$emo$1@ger.gmane.org>

On 12/25/2012 8:56 AM, Ram Rachum wrote:
> When I have an OrderedDict, I want to be able to delete a slice of it. I
> want to be able to do:
>
>      del ordered_dict[:3]
>
> To delete the first 3 items, like I would do in a list.
>
> Is there any reason why this shouldn't be implemented?

An OrderedDict is a mapping (has the mapping api) with a defined 
iteration order (the order of entry). It is not a sequence and does not 
have the sequence api. Indeed, a DictList is not possible because dl[2] 
would look for the item associated with 2 as a key rather than 2 as a 
position. So od[2:3] would *not* be the same as od[2], violating the 
usual property of within-sequence length-1 slices.
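
The ambiguity is easy to demonstrate (a small illustration, not a
proposed API):

```python
from collections import OrderedDict

od = OrderedDict([(2, "two"), ("a", "alpha"), ("b", "beta")])

# od[2] is a *key* lookup, not positional indexing,
# so od[2:3] could not consistently mean "the third item".
value_for_key_2 = od[2]
third_key = list(od)[2]   # positional access goes via iteration order
```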

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Wed Dec 26 08:58:00 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 26 Dec 2012 02:58:00 -0500
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
Message-ID: <kbeamo$jlk$1@ger.gmane.org>

On 12/25/2012 5:24 PM, Paul Colomiets wrote:
> Hi,
>
> I want to propose to include SortedSet data structure into collections module.
>
> SortedSet (name borrowed from Redis) is a basically a mapping of
> (unique) keys to scores, that allows fast slicing by ordinal number
> and by score.

Since a set, in general, is not a mapping, I do not understand what you 
mean. If you mean a mapping from sorted position to item, then I would 
call it a sortedlist.

> There are plenty of use cases for the sorted sets:
>
> * Leaderboard for a game

This looks like an auto-sorted list.

> * Priority queue (that supports task deletion)

This looks like something else.

> * Timer list (e.g. can be used for tulip, supports deletion too)
> * Caches with TTL-based, LFU or LRU eviction (including `functools.lru_cache`)

These look like sorted lists.

> * Search databases with relevance scores
> * Statistics (many use cases)

These are rather vague.

> * Replacement for `collections.Counter` with faster `most_common()`

This looks like something else.

> I have first draft of pure python implementation:
>
> https://github.com/tailhook/sortedsets
> http://pypi.python.org/pypi/sortedsets/1.0
>
> The implementation is closely modeled on Redis. Internally it consists
> of a dict for mapping between keys and scores, and a skiplist for
> scores. So most operations are done with O(log n) time. The actual
> performance is probably very slow for pure-python implementation, but
> can be fixed by C code later. The asymptotic performance seems to be
> OK.
>
> So my questions are:
>
> 1. Do you think SortedSets are eligible for inclusion to stdlib?
> 2. Do I need a PEP?
> 3. Any comments on the implementation?

The standard answer is to release on PyPI and get community approval 
and adoption. Then a PEP, with a commitment to maintenance even while 
others interfere with your 'baby'. Long-time core committers sometimes 
get to cut the process short, but even Guido is starting his proposed 
async module/package with a PEP and publicly available code for 3.3.

-- 
Terry Jan Reedy



From storchaka at gmail.com  Wed Dec 26 09:58:12 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 26 Dec 2012 10:58:12 +0200
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
Message-ID: <kbee75$ch8$1@ger.gmane.org>

On 26.12.12 00:24, Paul Colomiets wrote:
> P.S.: Sorted sets in redis are not the same thing as sorted sets in
> blist. So maybe a better name?

SortedSet in Java (and some other languages) is something entirely 
different.



From paul at colomiets.name  Wed Dec 26 10:59:51 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 26 Dec 2012 11:59:51 +0200
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <kbeamo$jlk$1@ger.gmane.org>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
	<kbeamo$jlk$1@ger.gmane.org>
Message-ID: <CAA0gF6p0ccXp+p3Dp=uW+xRnQegVhGCjUfHAUTDAQ=VFKD-Wtg@mail.gmail.com>

Hi,

On Wed, Dec 26, 2012 at 9:58 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>> SortedSet (name borrowed from Redis) is a basically a mapping of
>> (unique) keys to scores, that allows fast slicing by ordinal number
>> and by score.
>
>
> Since a set, in general, is not a mapping, I do not understand what you
> mean. If you mean a mapping from sorted position to item, then I would call
> it a sortedlist.
>

Ok. My description is vague. Here is one from Redis documentation:

Redis Sorted Sets are, similarly to Redis Sets, non repeating
collections of Strings. The difference is that every member of a
Sorted Set is associated with score, that is used in order to take the
sorted set ordered, from the smallest to the greatest score. While
members are unique, scores may be repeated.

http://redis.io/topics/data-types

I was just so silly to suppose that everybody knows Redis data types.

>
>> There are plenty of use cases for the sorted sets:
>>
>> * Leaderboard for a game
>
>
> This looks like an auto-sorted list.
>

Yep. The crucial property is fast insertion and updates.

>
>> * Priority queue (that supports task deletion)
>
>
> This looks like something else.
>

Priority queue is basically an auto-sorted list too. No?

>
>> * Timer list (e.g. can be used for tulip, supports deletion too)
>> * Caches with TTL-based, LFU or LRU eviction (including
>> `functools.lru_cache`)
>
>
> These look like sorted lists.
>

Yup. But we can't call the data structure SortedList, because
elements must be unique.

>
>> * Search databases with relevance scores
>> * Statistics (many use cases)
>
>
> These are rather vague.
>

Yes. Included just to give some overview.

>
>> * Replacement for `collections.Counter` with faster `most_common()`
>
>
> This looks like something else.
>

Why? If you have a list sorted by counter values, you can have
`most_common()` by slicing.

> The standard answer is to list on or submit to pypi and get community
> approval and adoption. Then a pep with a commitment to maintenance even
> while others interfere with your 'baby'. Long-time core committers sometimes
> get to cut the process short, but even Guido is starting his propose async
> module/package with a pep and publicly available code for 3.3.
>

It's on PyPI now. I know the standard answer :) So is it that you don't
understand what SortedSets are and what a good name for the data
structure would be, or do you think it's useless?

The crucial point about adoption is that most of the time people don't
want to add an additional dependency for simple tasks like a priority
queue, even if it's faster or more featureful. And I think that
SortedSets in Redis have proved their usefulness as a data structure.

--
Paul


From ncoghlan at gmail.com  Wed Dec 26 11:12:11 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Dec 2012 20:12:11 +1000
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <CAA0gF6p0ccXp+p3Dp=uW+xRnQegVhGCjUfHAUTDAQ=VFKD-Wtg@mail.gmail.com>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
	<kbeamo$jlk$1@ger.gmane.org>
	<CAA0gF6p0ccXp+p3Dp=uW+xRnQegVhGCjUfHAUTDAQ=VFKD-Wtg@mail.gmail.com>
Message-ID: <CADiSq7e+3UgAatSAufhf=XrU+hSWhVcQSYsRs3Z7F1iWyNOUXw@mail.gmail.com>

On Wed, Dec 26, 2012 at 7:59 PM, Paul Colomiets <paul at colomiets.name> wrote:
> Hi,
>
> On Wed, Dec 26, 2012 at 9:58 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>>> SortedSet (name borrowed from Redis) is a basically a mapping of
>>> (unique) keys to scores, that allows fast slicing by ordinal number
>>> and by score.
>>
>>
>> Since a set, in general, is not a mapping, I do not understand what you
>> mean. If you mean a mapping from sorted position to item, then I would call
>> it a sortedlist.
>>
>
> Ok. My description is vague. Here is one from Redis documentation:
>
> Redis Sorted Sets are, similarly to Redis Sets, non repeating
> collections of Strings. The difference is that every member of a
> Sorted Set is associated with score, that is used in order to take the
> sorted set ordered, from the smallest to the greatest score. While
> members are unique, scores may be repeated.

Perhaps you mean a heap queue? The standard library doesn't have a
separate type for that, it just has some functions for treating a list
as a heap: http://docs.python.org/2/library/heapq.html

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From paul at colomiets.name  Wed Dec 26 11:59:40 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 26 Dec 2012 12:59:40 +0200
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <CADiSq7e+3UgAatSAufhf=XrU+hSWhVcQSYsRs3Z7F1iWyNOUXw@mail.gmail.com>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
	<kbeamo$jlk$1@ger.gmane.org>
	<CAA0gF6p0ccXp+p3Dp=uW+xRnQegVhGCjUfHAUTDAQ=VFKD-Wtg@mail.gmail.com>
	<CADiSq7e+3UgAatSAufhf=XrU+hSWhVcQSYsRs3Z7F1iWyNOUXw@mail.gmail.com>
Message-ID: <CAA0gF6qeC2uWC3p07qQnY2X2VFhhZQT+Wr3HQZ1d+=Vg99CZfg@mail.gmail.com>

Hi Nick,

On Wed, Dec 26, 2012 at 12:12 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Perhaps you mean a heap queue? The standard library doesn't have a
> separate type for that, it just has some functions for treating a list
> as a heap: http://docs.python.org/2/library/heapq.html
>

The problem with a heap queue (as implemented in Python) as a priority
queue or timer list is that it does not support deletion of tasks (at
least not in an efficient manner). For other use cases, e.g. a
leaderboard, heapq doesn't allow efficient slicing.

Or do you mean "heap queue" is a nice name for the data structure that
redis calls "sorted set"?
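
For reference, the usual workaround for heapq's missing deletion is
lazy invalidation, along the lines of the priority-queue recipe in the
heapq documentation; a sketch:

```python
import heapq
import itertools

class PriorityQueue:
    """Priority queue with deletion, via the lazy-invalidation trick:
    removed entries are marked dead and skipped on pop()."""
    _REMOVED = object()

    def __init__(self):
        self._heap = []
        self._entries = {}
        self._counter = itertools.count()  # tie-breaker; keeps tasks uncompared

    def add(self, task, priority):
        entry = [priority, next(self._counter), task]
        self._entries[task] = entry
        heapq.heappush(self._heap, entry)

    def remove(self, task):
        # Mark the entry dead instead of re-heapifying.
        self._entries.pop(task)[2] = self._REMOVED

    def pop(self):
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            if task is not self._REMOVED:
                del self._entries[task]
                return task
        raise KeyError("pop from an empty priority queue")
```

This covers deletion, but (as noted above) still does not give the
efficient slicing that the sorted-set structure provides.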

--
Paul


From wuwei23 at gmail.com  Wed Dec 26 11:58:27 2012
From: wuwei23 at gmail.com (alex23)
Date: Wed, 26 Dec 2012 02:58:27 -0800 (PST)
Subject: [Python-ideas] Allow accessing return value inside finally
	clause
In-Reply-To: <CANXboVa2WSCK7DMfaRiGE6OZNvySpsNJLU3zDGWTqe4mn68Lmw@mail.gmail.com>
References: <213d11b1-e7a5-4336-82a8-fca65a612ad6@googlegroups.com>
	<kbd7co$cmv$1@ger.gmane.org>
	<CANXboVZXTfz5ewm0bixmKR2SMj-dBBS=80_Ky3mbi+UW3sxLLA@mail.gmail.com>
	<a859719d-33f2-4f47-a6d3-2b0ebb9820c6@pp8g2000pbb.googlegroups.com>
	<CANXboVaS41=YU96t_F0FdcKBufYvUMBDwLQs4ZpxW2UXf61aXw@mail.gmail.com>
	<5b1fc750-96d1-44a7-bb7c-1bf3ddd89aa2@jl13g2000pbb.googlegroups.com>
	<CANXboVa2WSCK7DMfaRiGE6OZNvySpsNJLU3zDGWTqe4mn68Lmw@mail.gmail.com>
Message-ID: <854f85a6-9a74-439e-8075-6b3f2e7e721d@d2g2000pbd.googlegroups.com>

On Dec 26, 7:55 pm, Ram Rachum <r... at rachum.com> wrote:
> Alex: I'm getting the feeling that you misunderstand what I'm proposing
> here. I'm proposing that the return value will be accessible in the
> `finally` clause. In a similar (if shorter) way that the exception info is
> available by using `sys.exc_info()`.

I get what you're saying. What you haven't shown is how introducing
'return value' semantics to try/finally blocks does anything other
than make them more confusing to people. Functions have return values.
Decorators wrap functions and can thus be used to pre- or post-process
the in/outputs for the function. This is clearly defined and a well
known approach to your problem. The onus is on you to show how turning
try/finally into a kitchen-sink of behaviour will improve the
language, preferably without recourse to what you "think" or "feel".

A concrete use-case would help here, but I'm 100% convinced that
whatever you come up with, there'll be a better solution using
decorators that works right now.


From storchaka at gmail.com  Wed Dec 26 13:31:12 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 26 Dec 2012 14:31:12 +0200
Subject: [Python-ideas] Add support keyword arguments with suitable defaults
 for OSError and subclasses
Message-ID: <kbeqme$abj$1@ger.gmane.org>

Currently the OSError constructor does not support keyword arguments. It 
would be good to add support for the following keyword arguments: "errno", 
"strerror", and "filename".

If "strerror" is not specified, a standard error message corresponding 
to the errno is used. If "errno" is not specified for an OSError subclass, 
the errno associated with that subclass is used (if only one errno is 
associated with it).

For backward compatibility, perhaps keyword arguments should be 
incompatible with any positional arguments (or at least the suitable 
defaults should be used only if a keyword argument is specified).

Examples:

 >>> OSError(errno=errno.ENOENT)
FileNotFoundError(2, 'No such file or directory')
 >>> FileNotFoundError(filename='qwerty')
FileNotFoundError(2, 'No such file or directory')
 >>> FileNotFoundError(strerror='Bad file')
FileNotFoundError(2, 'Bad file')
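Until something like this exists, the proposed defaulting behaviour can be approximated with a small wrapper (a sketch only; `make_oserror` is a hypothetical helper, not an existing API):

```python
import errno
import os

def make_oserror(err=None, strerror=None, filename=None):
    # Hypothetical helper approximating the proposal: if strerror is
    # omitted, derive the standard message from the errno.
    if strerror is None and err is not None:
        strerror = os.strerror(err)
    return OSError(err, strerror, filename)

exc = make_oserror(err=errno.ENOENT)
print(type(exc).__name__, exc.errno, exc.strerror)
```

Note that on Python 3.3+ the errno-based subclass mapping already turns this into a `FileNotFoundError`; only the keyword/defaulting part is new.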



From ned at nedbatchelder.com  Wed Dec 26 14:21:16 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 26 Dec 2012 08:21:16 -0500
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
Message-ID: <50DAF9CC.3060208@nedbatchelder.com>

On 12/25/2012 7:10 PM, anatoly techtonik wrote:
> I am thinking about [python-wart] on SO. There is currently no list 
> of Python warts, and building a better language is impossible without 
> clear visibility of the warts in current implementations.
>
> Why Roundup doesn't work ATM.
> - warts are lost among other "won't fix" and "works for me" issues
> - no way to edit description to make it more clear
> - no voting/stars to perceive how important this issue is
> - no comment/noise filtering
> and the most valuable
> - there is no query to list warts sorted by popularity to explore 
> other time-consuming areas of Python you are not aware of, but which 
> can pop up one day
>
> SO at least allows:
> + voting
> + community wiki edits
> + useful comment upvoting
> + sorted lists
> + user editable tags (adding new warts is easy)
>

1) Stack Overflow probably won't accept this as a question.
2) a bunch of people answering "what is a wart" is not a way to get the 
Python community to agree on what needs to be changed in the language.  
People with ideas need to write them up thoughtfully with proposals for 
improvements, and then engage meaningfully in the discussion that follows.

You seem to think that people just need to identify "warts" and then we 
can start changing the language to remove them.  What you consider a 
"wart" is probably the result of a complex balance of competing forces.  
Changing Python is hard.  We take backward compatibility very seriously, 
and that sometimes makes it hard to "remove warts."

--Ned.

> This post is a result of facing numerous locals/settrace/exec 
> issues that are closed on the tracker. I also have my own list of other 
> issues (logging/subprocess) at the GC project, which I might be unable 
> to maintain in the future. There is also some undocumented stuff (subprocess 
> deadlocks) that I'm investigating, but don't have time for a write-up. 
> So I'd rather move this somewhere where it can be updated.
> -- 
> anatoly t.
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121226/a5cba8f6/attachment.html>

From ncoghlan at gmail.com  Wed Dec 26 15:32:02 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 27 Dec 2012 00:32:02 +1000
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <CAA0gF6qeC2uWC3p07qQnY2X2VFhhZQT+Wr3HQZ1d+=Vg99CZfg@mail.gmail.com>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
	<kbeamo$jlk$1@ger.gmane.org>
	<CAA0gF6p0ccXp+p3Dp=uW+xRnQegVhGCjUfHAUTDAQ=VFKD-Wtg@mail.gmail.com>
	<CADiSq7e+3UgAatSAufhf=XrU+hSWhVcQSYsRs3Z7F1iWyNOUXw@mail.gmail.com>
	<CAA0gF6qeC2uWC3p07qQnY2X2VFhhZQT+Wr3HQZ1d+=Vg99CZfg@mail.gmail.com>
Message-ID: <CADiSq7eMvtQDHLvcn9x9on-VOCyKfOu7vraXVXW78YYUipFb6A@mail.gmail.com>

On Wed, Dec 26, 2012 at 8:59 PM, Paul Colomiets <paul at colomiets.name> wrote:
> Hi Nick,
>
> On Wed, Dec 26, 2012 at 12:12 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Perhaps you mean a heap queue? The standard library doesn't have a
>> separate type for that, it just has some functions for treating a list
>> as a heap: http://docs.python.org/2/library/heapq.html
>>
>
> The problem with the heap queue (as implemented in Python) as a priority
> queue or a list of timers is that it does not support deletion of
> tasks (at least not in an efficient manner). For other use cases, e.g.
> for a leader board, heapq doesn't allow efficient slicing.
>
> Or do you mean "heap queue" is a nice name for the data structure that
> redis calls "sorted set"?

I mean if what you want is a heap queue with a more efficient
heappop() implementation (due to a different underlying data
structure), then it's probably clearer to call it that.
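For reference, the deletion limitation mentioned above has a standard workaround described in the heapq documentation: mark cancelled entries instead of removing them, and skip dead entries on pop. A minimal sketch:

```python
import heapq

# Lazy deletion: mark entries dead instead of removing them from the heap.
# Removal becomes O(1); dead entries are skipped during pop.
heap = []
entry_finder = {}
REMOVED = object()  # placeholder for a cancelled task

def add_task(task, priority):
    if task in entry_finder:
        remove_task(task)
    entry = [priority, task]
    entry_finder[task] = entry
    heapq.heappush(heap, entry)

def remove_task(task):
    entry = entry_finder.pop(task)
    entry[1] = REMOVED

def pop_task():
    while heap:
        priority, task = heapq.heappop(heap)
        if task is not REMOVED:
            del entry_finder[task]
            return task
    raise KeyError('pop from an empty priority queue')

add_task('a', 3)
add_task('b', 1)
add_task('c', 2)
remove_task('b')
print(pop_task())  # 'c': the lowest remaining priority
```

The dead entries still occupy heap slots until popped, so this trades memory for the O(1) cancellation.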

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From paul at colomiets.name  Wed Dec 26 16:09:27 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 26 Dec 2012 17:09:27 +0200
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <CADiSq7eMvtQDHLvcn9x9on-VOCyKfOu7vraXVXW78YYUipFb6A@mail.gmail.com>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
	<kbeamo$jlk$1@ger.gmane.org>
	<CAA0gF6p0ccXp+p3Dp=uW+xRnQegVhGCjUfHAUTDAQ=VFKD-Wtg@mail.gmail.com>
	<CADiSq7e+3UgAatSAufhf=XrU+hSWhVcQSYsRs3Z7F1iWyNOUXw@mail.gmail.com>
	<CAA0gF6qeC2uWC3p07qQnY2X2VFhhZQT+Wr3HQZ1d+=Vg99CZfg@mail.gmail.com>
	<CADiSq7eMvtQDHLvcn9x9on-VOCyKfOu7vraXVXW78YYUipFb6A@mail.gmail.com>
Message-ID: <CAA0gF6r1g1MKoq0G9dSGe882WQ2-YmzqhNDOeW=RHH8bymxcdw@mail.gmail.com>

Hi Nick,

On Wed, Dec 26, 2012 at 4:32 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> The problem with the heap queue (as implemented in Python) as a priority
>> queue or a list of timers is that it does not support deletion of
>> tasks (at least not in an efficient manner). For other use cases, e.g.
>> for a leader board, heapq doesn't allow efficient slicing.
>>
>> Or do you mean "heap queue" is a nice name for the data structure that
>> redis calls "sorted set"?
>
> I mean if what you want is a heap queue with a more efficient
> heappop() implementation (due to a different underlying data
> structure), then it's probably clearer to call it that.
>

The underlying data structure is a skip list, not a heap. It would be
strange to call it heap-something.  But, yes, for the discussion, the
similarity with a heap queue may be a better starting point.
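For comparison, a plain sorted list maintained with `bisect` already gives O(log n) search and natural slicing, at the cost of O(n) insertion and deletion; the skip-list structure is what removes that linear cost. A quick sketch of the bisect approach:

```python
import bisect

# A plain sorted list via bisect: O(log n) lookup and natural slicing,
# but O(n) insert/delete -- the cost a skip list avoids.
scores = []
for value in (50, 10, 40, 20, 30):
    bisect.insort(scores, value)

top_two = scores[-2:]                  # leader-board style slice
rank = bisect.bisect_left(scores, 30)  # index of 30 in sorted order
print(top_two, rank)  # [40, 50] 2
```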

--
Paul


From guido at python.org  Wed Dec 26 17:58:31 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 26 Dec 2012 08:58:31 -0800
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
In-Reply-To: <kbe9um$emo$1@ger.gmane.org>
References: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
	<kbe9um$emo$1@ger.gmane.org>
Message-ID: <CAP7+vJK03_Wr6wCpBEzqzH65ZC_KBbYv9G7aD5Y=FeJ47vWS2A@mail.gmail.com>

Perhaps the desired functionality can be spelled as a method? Would it be
easy to implement?

--Guido

On Wednesday, December 26, 2012, Terry Reedy wrote:

> On 12/25/2012 8:56 AM, Ram Rachum wrote:
>
>> When I have an OrderedDict, I want to be able to delete a slice of it. I
>> want to be able to do:
>>
>>      del ordered_dict[:3]
>>
>> To delete the first 3 items, like I would do in a list.
>>
>> Is there any reason why this shouldn't be implemented?
>>
>
> An OrderedDict is a mapping (it has the mapping API) with a defined iteration
> order (the order of entry). It is not a sequence and does not have the
> sequence API. Indeed, a DictList is not possible because dl[2] would look
> for the item associated with 2 as a key rather than 2 as a position. So
> od[2:3] would *not* be the same as od[2], violating the usual property of
> within-sequence length-1 slices.
>
> --
> Terry Jan Reedy
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
--Guido van Rossum (python.org/~guido)

From michelelacchia at gmail.com  Wed Dec 26 18:43:34 2012
From: michelelacchia at gmail.com (Michele Lacchia)
Date: Wed, 26 Dec 2012 18:43:34 +0100
Subject: [Python-ideas] collections.sortedset proposal
In-Reply-To: <CAA0gF6r1g1MKoq0G9dSGe882WQ2-YmzqhNDOeW=RHH8bymxcdw@mail.gmail.com>
References: <CAA0gF6oAqkY+-x-TUHQeS0TaMF8JAED5YwfJekON3ZbZG2qtNw@mail.gmail.com>
	<kbeamo$jlk$1@ger.gmane.org>
	<CAA0gF6p0ccXp+p3Dp=uW+xRnQegVhGCjUfHAUTDAQ=VFKD-Wtg@mail.gmail.com>
	<CADiSq7e+3UgAatSAufhf=XrU+hSWhVcQSYsRs3Z7F1iWyNOUXw@mail.gmail.com>
	<CAA0gF6qeC2uWC3p07qQnY2X2VFhhZQT+Wr3HQZ1d+=Vg99CZfg@mail.gmail.com>
	<CADiSq7eMvtQDHLvcn9x9on-VOCyKfOu7vraXVXW78YYUipFb6A@mail.gmail.com>
	<CAA0gF6r1g1MKoq0G9dSGe882WQ2-YmzqhNDOeW=RHH8bymxcdw@mail.gmail.com>
Message-ID: <CAFjP7=V63mS1EvWrhCVkGoFjAapSD6-7=E2zE9K3X6KwY7A4=w@mail.gmail.com>

For the record, there is another implementation of skiplists on PyPI:
http://pypi.python.org/pypi/skiplist/0.1.0


2012/12/26 Paul Colomiets <paul at colomiets.name>

> Hi Nick,
>
> On Wed, Dec 26, 2012 at 4:32 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> >> The problem with the heap queue (as implemented in Python) as a priority
> >> queue or a list of timers is that it does not support deletion of
> >> tasks (at least not in an efficient manner). For other use cases, e.g.
> >> for a leader board, heapq doesn't allow efficient slicing.
> >>
> >> Or do you mean "heap queue" is a nice name for the data structure that
> >> redis calls "sorted set"?
> >
> > I mean if what you want is a heap queue with a more efficient
> > heappop() implementation (due to a different underlying data
> > structure), then it's probably clearer to call it that.
> >
>
> The underlying data structure is a skip list, not a heap. It would be
> strange to call it heap-something.  But, yes, for the discussion, the
> similarity with a heap queue may be a better starting point.
>
> --
> Paul
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
Michele Lacchia

From storchaka at gmail.com  Wed Dec 26 18:54:03 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 26 Dec 2012 19:54:03 +0200
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
In-Reply-To: <CAP7+vJK03_Wr6wCpBEzqzH65ZC_KBbYv9G7aD5Y=FeJ47vWS2A@mail.gmail.com>
References: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
	<kbe9um$emo$1@ger.gmane.org>
	<CAP7+vJK03_Wr6wCpBEzqzH65ZC_KBbYv9G7aD5Y=FeJ47vWS2A@mail.gmail.com>
Message-ID: <kbfdju$28l$1@ger.gmane.org>

On 26.12.12 18:58, Guido van Rossum wrote:
> Perhaps the desired functionality can be spelled as a method? Would it
> be easy to implement?

This is a pretty trivial method.

     def drop_items(self, n, last=True):
         for i in range(n):
             self.popitem(last)

You can wrap it with "try/except KeyError" or add pre-execution checks 
if you like. I doubt such a trivial and uncommonly used method needs to 
be in the stdlib.
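A self-contained version of the popitem()-based approach, for illustration (`drop_items` here is a free function, not an existing OrderedDict method):

```python
from collections import OrderedDict

def drop_items(od, n, last=True):
    # Remove n items from one end by calling popitem() n times.
    for _ in range(n):
        od.popitem(last)

od = OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
drop_items(od, 2, last=False)  # drop the first two items
print(list(od))  # ['c', 'd']
```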



From eliben at gmail.com  Wed Dec 26 19:28:10 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Wed, 26 Dec 2012 10:28:10 -0800
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
Message-ID: <CAF-Rda8b+7LtuymG4VmrhJEt6yEWGea9-TjoNbo9jznSxYn_5Q@mail.gmail.com>

On Tue, Dec 25, 2012 at 4:10 PM, anatoly techtonik <techtonik at gmail.com>wrote:

> I am thinking about [python-wart] on SO. There is currently no list of
> Python warts, and building a better language is impossible without clear
> visibility of the warts in current implementations.
>
> Why Roundup doesn't work ATM.
> - warts are lost among other "won't fix" and "works for me" issues
> - no way to edit description to make it more clear
> - no voting/stars to perceive how important this issue is
> - no comment/noise filtering
> and the most valuable
> - there is no query to list warts sorted by popularity to explore other
> time-consuming areas of Python you are not aware of, but which can pop up
> one day
>
> SO at least allows:
> + voting
> + community wiki edits
> + useful comment upvoting
> + sorted lists
> + user editable tags (adding new warts is easy)
>
> This post is a result of facing numerous locals/settrace/exec issues
> that are closed on the tracker. I also have my own list of other issues
> (logging/subprocess) at the GC project, which I might be unable to maintain
> in the future. There is also some undocumented stuff (subprocess deadlocks)
> that I'm investigating, but don't have time for a write-up. So I'd rather
> move this somewhere where it can be updated.
>  --
>

Is this a question or just a rant? If it's a question, I must have missed
what exactly you're asking.

The web is a pretty free place. Feel free to create such a tag on Stack
Overflow and maintain it, if the SO community agrees it has merit. Don't
expect the Python developers to officially endorse it, because "warts" is a
very subjective issue. A "wart" for one person is a reasonable behavior for
another.

Eli

From techtonik at gmail.com  Wed Dec 26 19:28:38 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 26 Dec 2012 21:28:38 +0300
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <50DAF9CC.3060208@nedbatchelder.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<50DAF9CC.3060208@nedbatchelder.com>
Message-ID: <CAPkN8xK2joFWV=QAyuqxTT_jrk8oCtW20RPiv0x31RM1w10u3Q@mail.gmail.com>

On Wed, Dec 26, 2012 at 4:21 PM, Ned Batchelder <ned at nedbatchelder.com>wrote:

>  On 12/25/2012 7:10 PM, anatoly techtonik wrote:
>
> I am thinking about [python-wart] on SO. There is currently no list of
> Python warts, and building a better language is impossible without clear
> visibility of the warts in current implementations.
>
>  Why Roundup doesn't work ATM.
> - warts are lost among other "won't fix" and "works for me" issues
> - no way to edit description to make it more clear
> - no voting/stars to perceive how important this issue is
> - no comment/noise filtering
> and the most valuable
> - there is no query to list warts sorted by popularity to explore other
> time-consuming areas of Python you are not aware of, but which can pop up
> one day
>
> SO at least allows:
> + voting
> + community wiki edits
> + useful comment upvoting
> + sorted lists
> + user editable tags (adding new warts is easy)
>
>
> 1) Stack Overflow probably won't accept this as a question.
>

That's why it is proposed as a community wiki.

> 2) a bunch of people answering "what is a wart" is not a way to get the
> Python community to agree on what needs to be changed in the language.
> People with ideas need to write them up thoughtfully with proposals for
> improvements, and then engage meaningfully in the discussion that follows.
>
> You seem to think that people just need to identify "warts" and then we
> can start changing the language to remove them.  What you consider a "wart"
> is probably the result of a complex balance of competing forces.  Changing
> Python is hard.  We take backward compatibility very seriously, and that
> sometimes makes it hard to "remove warts."
>

You've nailed it. The goal of listing warts on SO is not to prove that some
language suxx [1], but to provide answers to questions about *why* some
particular wart exists. "Wart" may not be the best word, because on the
other side of the rebalancing there is most likely some "feature", but
when people experience problems, they usually face only one side of the
story [2].

As I already said, it is impossible to fully master the language without
complete coverage of such things. These things are equally interesting for
users and for future contributors. They are the starting points for
making the next, better generation of dynamic language (if one is possible).

SO is a FAQ site, not a web page or a wiki, so I expect there to be answers
with research on the history of the design decisions behind the balancing of
the language, the sources of "warts", and the things that balance them on
the other side. I expect to find analysis there of what features would have
to be removed for some specific "wart" to be gone, and I see it as a
perfect entry point for learning high-level things about programming
languages.

Some people may get the feeling that an SO list like that would have a huge
negative impact on Python development. I don't know how to respond to these
concerns. =) In my life I haven't seen a person who abandoned Python
completely after picking it up. That should mean something. From my side
I'd like to thank all the core developers and say that you are doing the
right thing. Unicode and Python 3 were hard, but even grumpy trolls like me
are starting to like them. The next year will be the next exciting step in
Python development. My IMHO is that the language has become mature enough to
openly discuss its "bad childhood habits" in detail and make fun of them,
accepting them as they are.

Take it easy, and have a good year ahead! ;)

1. http://wiki.theory.org/YourLanguageSucks
2. http://adsoftheworld.com/media/ambient/bbc_world_soldier

From ram at rachum.com  Wed Dec 26 19:33:07 2012
From: ram at rachum.com (Ram Rachum)
Date: Wed, 26 Dec 2012 20:33:07 +0200
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
In-Reply-To: <CAP7+vJK03_Wr6wCpBEzqzH65ZC_KBbYv9G7aD5Y=FeJ47vWS2A@mail.gmail.com>
References: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
	<kbe9um$emo$1@ger.gmane.org>
	<CAP7+vJK03_Wr6wCpBEzqzH65ZC_KBbYv9G7aD5Y=FeJ47vWS2A@mail.gmail.com>
Message-ID: <CANXboVZhhzoMxdrgviiTSJrMxjtkgpYktHnA7R_C6nUBXM3gmA@mail.gmail.com>

I agree with Terry that doing `del ordered_dict[:2]` is problematic because
there might be confusion between index numbers and dictionary keys.

My new proposed API: Build on `ItemsView` so we could do this: `del
ordered_dict.items()[:2]` and have it delete the first 2 items from the
ordered dict.
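The index/key confusion is easy to demonstrate with integer keys (an illustrative snippet):

```python
from collections import OrderedDict

# With integer keys, od[2] must mean "the value under key 2",
# not "the third item" -- so slice syntax on the dict itself is ambiguous.
od = OrderedDict([(2, 'by key'), (0, 'first'), (1, 'second')])
print(od[2])        # key lookup: 'by key'
print(list(od)[2])  # positional access needs an explicit sequence: 1
```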


On Wed, Dec 26, 2012 at 6:58 PM, Guido van Rossum <guido at python.org> wrote:

> Perhaps the desired functionality can be spelled as a method? Would it be
> easy to implement?
>
> --Guido
>
>
> On Wednesday, December 26, 2012, Terry Reedy wrote:
>
>> On 12/25/2012 8:56 AM, Ram Rachum wrote:
>>
>>> When I have an OrderedDict, I want to be able to delete a slice of it. I
>>> want to be able to do:
>>>
>>>      del ordered_dict[:3]
>>>
>>> To delete the first 3 items, like I would do in a list.
>>>
>>> Is there any reason why this shouldn't be implemented?
>>>
>>
>> An OrderedDict is a mapping (has the mapping api) with a defined
>> iteration order (the order of entry). It is not a sequence and does not
>> have the sequence api. Indeed, a DictList is not possible because dl[2]
>> would look for the item associated with 2 as a key rather than 2 as a
>> position. So od[2:3] would *not* be the same as od[2], violating the
>> usual property of within-sequence length-1 slices.
>>
>> --
>> Terry Jan Reedy
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>

From eliben at gmail.com  Wed Dec 26 19:37:35 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Wed, 26 Dec 2012 10:37:35 -0800
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAPkN8xK2joFWV=QAyuqxTT_jrk8oCtW20RPiv0x31RM1w10u3Q@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<50DAF9CC.3060208@nedbatchelder.com>
	<CAPkN8xK2joFWV=QAyuqxTT_jrk8oCtW20RPiv0x31RM1w10u3Q@mail.gmail.com>
Message-ID: <CAF-Rda8f_oeabQ47jwmpFxHG1CMrbDw0q8XeuxN-nvohC-t2bg@mail.gmail.com>

>  2) a bunch of people answering "what is a wart" is not a way to get the
>> Python community to agree on what needs to be changed in the language.
>> People with ideas need to write them up thoughtfully with proposals for
>> improvements, and then engage meaningfully in the discussion that follows.
>>
>> You seem to think that people just need to identify "warts" and then we
>> can start changing the language to remove them.  What you consider a "wart"
>> is probably the result of a complex balance of competing forces.  Changing
>> Python is hard.  We take backward compatibility very seriously, and that
>> sometimes makes it hard to "remove warts."
>>
>
> You've nailed it. The goal of listing warts on SO is not to prove that
> some language suxx [1], but to provide answers to questions about *why* some
> particular wart exists. "Wart" may not be the best word, because on the
> other side of the rebalancing there is most likely some "feature", but
> when people experience problems, they usually face only one side of the
> story [2].
>
> As I already said, it is impossible to fully master the language without
> complete coverage of such things. These things are equally interesting for
> users and for future contributors. They are the starting points for
> making the next, better generation of dynamic language (if one is possible).
>
> SO is a FAQ site, not a web page or a wiki, so I expect there to be
> answers with research on the history of the design decisions behind the
> balancing of the language, the sources of "warts", and the things that
> balance them on the other side. I expect to find analysis there of what
> features would have to be removed for some specific "wart" to be
> gone, and I see it as a perfect entry point for learning high-level things
> about programming languages.
>
>
Yet again, while I don't speak for the whole Python dev community, I
predict this will not be officially endorsed. As for explaining why some
things are the way they are, there are plenty of blog articles on the web
trying to explain Python internals. Nick Coghlan has some very good ones
(with the benefit of his being actually in the position to say *why* things
are this way historically), Guido has articles on the history of Python,
and even my humble blog has some internals pieces (which are more focused
on the "how" instead of "why").

Consider directing your energies and obvious love for Python to
constructive channels like contributing similar articles of your own. I'm
sure that you'll be able to find core devs willing to review such articles
and discuss them prior to posting them. Also feel free to collect all
such articles in some central location and maintain the list - this
could actually be very helpful for a lot of Python fans and devs alike.

Eli

From wuwei23 at gmail.com  Wed Dec 26 23:08:18 2012
From: wuwei23 at gmail.com (alex23)
Date: Wed, 26 Dec 2012 14:08:18 -0800 (PST)
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
In-Reply-To: <CANXboVZhhzoMxdrgviiTSJrMxjtkgpYktHnA7R_C6nUBXM3gmA@mail.gmail.com>
References: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
	<kbe9um$emo$1@ger.gmane.org>
	<CAP7+vJK03_Wr6wCpBEzqzH65ZC_KBbYv9G7aD5Y=FeJ47vWS2A@mail.gmail.com>
	<CANXboVZhhzoMxdrgviiTSJrMxjtkgpYktHnA7R_C6nUBXM3gmA@mail.gmail.com>
Message-ID: <d9696496-8331-4b9a-a86f-bab1db714bf1@jl13g2000pbb.googlegroups.com>

On Dec 27, 4:33 am, Ram Rachum <r... at rachum.com> wrote:
> My new proposed API: Build on `ItemsView` so we could do this: `del
> ordered_dict.items()[:2]` and have it delete the first 2 items from the
> ordered dict.

Modifying the return value of an object's method and having the object
itself mutate feels like a side-effect to me.


From python at mrabarnett.plus.com  Thu Dec 27 00:47:49 2012
From: python at mrabarnett.plus.com (MRAB)
Date: Wed, 26 Dec 2012 23:47:49 +0000
Subject: [Python-ideas] Allow deleting slice in an OrderedDict
In-Reply-To: <d9696496-8331-4b9a-a86f-bab1db714bf1@jl13g2000pbb.googlegroups.com>
References: <d8efbec9-412c-492b-927b-5e4926a9a7f9@googlegroups.com>
	<kbe9um$emo$1@ger.gmane.org>
	<CAP7+vJK03_Wr6wCpBEzqzH65ZC_KBbYv9G7aD5Y=FeJ47vWS2A@mail.gmail.com>
	<CANXboVZhhzoMxdrgviiTSJrMxjtkgpYktHnA7R_C6nUBXM3gmA@mail.gmail.com>
	<d9696496-8331-4b9a-a86f-bab1db714bf1@jl13g2000pbb.googlegroups.com>
Message-ID: <50DB8CA5.3070200@mrabarnett.plus.com>

On 2012-12-26 22:08, alex23 wrote:
> On Dec 27, 4:33 am, Ram Rachum <r... at rachum.com> wrote:
>> My new proposed API: Build on `ItemsView` so we could do this: `del
>> ordered_dict.items()[:2]` and have it delete the first 2 items from the
>> ordered dict.
>
> Modifying the return value of an object's method and having the object
> itself mutate feels like a side-effect to me.
>
+1



From ncoghlan at gmail.com  Thu Dec 27 16:10:47 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Dec 2012 01:10:47 +1000
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
Message-ID: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>

After helping Brett with the migration to importlib in 3.3, and
looking at some of the ideas kicking around for additional CPython
features that would affect the startup sequence, I've come to the
conclusion that what we have now simply isn't sustainable long term.
It's already the case that if you use certain options (specifically -W
or -X), the interpreter will start accessing the C API before it has
called Py_Initialize(). It's not cool when other people do that (we'd
never accept code that behaved that way as a valid reproducer for a
bug report), and it's *definitely* not cool that we're doing it (even
though we seem to be getting away with it for the moment, and have
been for a long time).

The attached PEP is a first attempt at a plan for doing something
about it. (My notes at
http://wiki.python.org/moin/CPythonInterpreterInitialization provide
additional context - let me know if you think there's more material on
that page that should be in the PEP itself)

The PEP is also available online at http://www.python.org/dev/peps/pep-0432/

Cheers,
Nick.

PEP: 432
Title: Simplifying the CPython startup sequence
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Dec-2012
Python-Version: 3.4
Post-History: 28-Dec-2012


Abstract
========

This PEP proposes a mechanism for simplifying the startup sequence for
CPython, making it easier to modify the initialisation behaviour of the
reference interpreter executable, as well as making it easier to control
CPython's startup behaviour when creating an alternate executable or
embedding it as a Python execution engine inside a larger application.


Proposal Summary
================

This PEP proposes that CPython move to an explicit 2-phase initialisation
process, where a preliminary interpreter is put in place with limited OS
interaction capabilities early in the startup sequence. This essential core
remains in place while all of the configuration settings are determined,
until a final configuration call takes those settings and finishes
bootstrapping the interpreter immediately before executing the main module.

As a concrete use case to help guide any design changes, and to solve a known
problem where the appropriate defaults for system utilities differ from those
for running user scripts, this PEP also proposes the creation and
distribution of a separate system Python (``spython``) executable which, by
default, ignores user site directories and environment variables, and does
not implicitly set ``sys.path[0]`` based on the current directory or the
script being executed.


Background
==========

Over time, CPython's initialisation sequence has become progressively more
complicated, offering more options, as well as performing more complex tasks
(such as configuring the Unicode settings for OS interfaces in Python 3 as
well as bootstrapping a pure Python implementation of the import system).

Much of this complexity is accessible only through the ``Py_Main`` and
``Py_Initialize`` APIs, offering embedding applications little opportunity
for customisation. This creeping complexity also makes life difficult for
maintainers, as much of the configuration needs to take place prior to the
``Py_Initialize`` call, meaning much of the Python C API cannot be used
safely.

A number of proposals are on the table for even *more* sophisticated
startup behaviour, such as better control over ``sys.path`` initialisation
(easily adding additional directories on the command line in a cross-platform
fashion, as well as controlling the configuration of ``sys.path[0]``), easier
configuration of utilities like coverage tracing when launching Python
subprocesses, and easier control of the encoding used for the standard IO
streams when embedding CPython in a larger application.

Rather than attempting to bolt such behaviour onto an already complicated
system, this PEP proposes to instead simplify the status quo *first*, with
the aim of making these further feature requests easier to implement.


Key Concerns
============

There are a couple of key concerns that any change to the startup sequence
needs to take into account.


Maintainability
---------------

The current CPython startup sequence is difficult to understand, and even
more difficult to modify. It is not clear what state the interpreter is in
while much of the initialisation code executes, leading to behaviour such
as lists, dictionaries and Unicode values being created prior to the call
to ``Py_Initialize`` when the ``-X`` or ``-W`` options are used [1_].

By moving to a 2-phase startup sequence, developers should only need to
understand which features are not available in the core bootstrapping state,
as the vast majority of the configuration process will now take place in
that state.

By basing the new design on a combination of C structures and Python
dictionaries, it should also be easier to modify the system in the
future to add new configuration options.


Performance
-----------

CPython is used heavily to run short scripts where the runtime is dominated
by the interpreter initialisation time. Any changes to the startup sequence
should minimise their impact on the startup overhead. (Given that the
overhead is dominated by IO operations, this is not currently expected to
cause any significant problems).


The Status Quo
==============

Much of the configuration of CPython is currently handled through C level
global variables::

    Py_IgnoreEnvironmentFlag
    Py_HashRandomizationFlag
    _Py_HashSecretInitialized
    _Py_HashSecret
    Py_BytesWarningFlag
    Py_DebugFlag
    Py_InspectFlag
    Py_InteractiveFlag
    Py_OptimizeFlag
    Py_DontWriteBytecodeFlag
    Py_NoUserSiteDirectory
    Py_NoSiteFlag
    Py_UnbufferedStdioFlag
    Py_VerboseFlag

For the above variables, the conversion of command line options and
environment variables to C global variables is handled by ``Py_Main``,
so each embedding application must set those appropriately in order to
change them from their defaults.

Some configuration can only be provided as OS level environment variables::

    PYTHONHASHSEED
    PYTHONSTARTUP
    PYTHONPATH
    PYTHONHOME
    PYTHONCASEOK
    PYTHONIOENCODING

Additional configuration is handled via separate API calls::

    Py_SetProgramName() (call before Py_Initialize())
    Py_SetPath() (optional, call before Py_Initialize())
    Py_SetPythonHome() (optional, call before Py_Initialize()???)
    Py_SetArgv[Ex]() (call after Py_Initialize())

The ``Py_InitializeEx()`` API also accepts a boolean flag to indicate
whether or not CPython's signal handlers should be installed.

Finally, some interactive behaviour (such as printing the introductory
banner) is triggered only when standard input is reported as a terminal
connection by the operating system.

Also see the more detailed notes at [1_].
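For reference, a typical embedding sequence against the *current* API looks
something like the following (a simplified sketch only; error handling and
``wchar_t`` argument conversion are omitted)::

    #include <Python.h>

    int run_embedded(int argc, wchar_t **argv)
    {
        /* Must be called before Py_Initialize() */
        Py_SetProgramName(argv[0]);

        /* Reads the C globals (Py_OptimizeFlag etc.) and the
           PYTHON* environment variables as part of startup */
        Py_Initialize();

        /* Must be called after Py_Initialize() */
        PySys_SetArgvEx(argc, argv, 0);

        /* ... run application code here ... */

        Py_Finalize();
        return 0;
    }

Note how the configuration is split across C globals, environment variables
and ordering-sensitive API calls.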


Proposal
========

(Note: details here are still very much in flux, but preliminary feedback
is appreciated anyway)

Core Interpreter Initialisation
-------------------------------

The only configuration that currently absolutely needs to be in place
before even the interpreter core can be initialised is the seed for the
randomised hash algorithm. However, there are a couple of settings needed
there: whether or not hash randomisation is enabled at all, and if it's
enabled, whether or not to use a specific seed value.

The proposed API for this step in the startup sequence is::

    void Py_BeginInitialization(Py_CoreConfig *config);

Like Py_Initialize, this part of the new API treats initialisation failures
as fatal errors. While that's still not particularly embedding friendly,
the operations in this step *really* shouldn't be failing, and changing them
to return error codes instead of aborting would be an even larger task than
the one already being proposed.

The new Py_CoreConfig struct holds the settings required for preliminary
configuration::

    typedef struct {
        int use_hash_seed;
        size_t hash_seed;
    } Py_CoreConfig;

To "disable" hash randomisation, set "use_hash_seed" and pass a hash seed of
zero. (This seems reasonable to me, but there may be security implications
I'm overlooking. If so, adding a separate flag or switching to a 3-valued
"no randomisation", "fixed hash seed" and "randomised hash" option is easy.)

The core configuration settings pointer may be NULL, in which case the
default behaviour of randomised hashes with a random seed will be used.
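As a sketch of how an embedding application might start the interpreter core
with a fixed hash seed under this proposal (*proposed* API only; none of
these calls exist yet, and the details are subject to change)::

    #include <Python.h>

    void begin_core(void)
    {
        Py_CoreConfig config;
        config.use_hash_seed = 1;   /* use the explicit seed below    */
        config.hash_seed = 12345;   /* 0 would disable randomisation  */

        /* Fatal error on failure, as with Py_Initialize() */
        Py_BeginInitialization(&config);

        /* Passing NULL instead of &config requests the default
           behaviour: randomised hashes with a random seed */
    }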

A new query API will allow code to determine if the interpreter is in the
bootstrapping state between core initialisation and the completion of the
initialisation process::

    int Py_IsInitializing();

While in the initialising state, the interpreter should be fully functional
except that:

* Compilation is not allowed (as the parser is not yet configured properly)
* The following attributes in the ``sys`` module are all either missing or
  ``None``:

  * ``sys.path``
  * ``sys.argv``
  * ``sys.executable``
  * ``sys.base_exec_prefix``
  * ``sys.base_prefix``
  * ``sys.exec_prefix``
  * ``sys.prefix``
  * ``sys.warnoptions``
  * ``sys.flags``
  * ``sys.dont_write_bytecode``
  * ``sys.stdin``
  * ``sys.stdout``

* The filesystem encoding is not yet defined
* The IO encoding is not yet defined
* CPython signal handlers are not yet installed
* Only builtin and frozen modules may be imported (due to the above
  limitations)
* ``sys.stderr`` is set to a temporary IO object using unbuffered binary
  mode
* The ``warnings`` module is not yet initialised
* The ``__main__`` module does not yet exist

<TBD: identify any other notable missing functionality>

The main things made available by this step will be the core Python
datatypes, in particular dictionaries, lists and strings. This allows them
to be used safely for all of the remaining configuration steps (unlike the
status quo).
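Code that may execute during startup could use the proposed query API to
stay within those limits (again, a sketch of the proposed API only)::

    if (Py_IsInitializing()) {
        /* The core types (dict, list, str) are safe to use here,
           but stick to builtin and frozen imports and avoid
           compiling source code until initialisation completes */
    }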

In addition, the current thread will possess a valid Python thread state,
allowing any further configuration data to be stored on the interpreter
object rather than in C process globals.

Any call to Py_BeginInitialization() must have a matching call to
Py_Finalize(). It is acceptable to skip calling Py_EndInitialization() in
between (e.g. if attempting to read the configuration settings fails).


Determining the remaining configuration settings
------------------------------------------------

The next step in the initialisation sequence is to determine the full
settings needed to complete the process. No changes are made to the
interpreter state at this point. The core API for this step is::

    int Py_ReadConfiguration(PyObject *config);

The config argument should be a pointer to a Python dictionary. For any
supported configuration setting already in the dictionary, CPython will
sanity check the supplied value, but otherwise accept it as correct.

Unlike Py_Initialize and Py_BeginInitialization, this call will set an
exception and report an error return, rather than treating a problem with
the config data as a fatal error.

Any supported configuration setting which is not already set will be
populated appropriately. The default configuration can be overridden
entirely by setting the value *before* calling Py_ReadConfiguration. The
provided value will then also be used in calculating any settings derived
from that value.

Alternatively, settings may be overridden *after* the Py_ReadConfiguration
call (this can be useful if an embedding application wants to adjust
a setting rather than replace it completely, such as removing
``sys.path[0]``).
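Concretely, an embedding application might interact with this step along
the following lines (a sketch of the *proposed* API; ``Py_ReadConfiguration``
does not exist yet, and the ``"argv"`` key is taken from the supported
settings list in the next section)::

    PyObject *config = PyDict_New();

    /* Override a setting *before* the call: supplied values are
       sanity checked but otherwise accepted as correct */
    PyObject *argv = Py_BuildValue("[ss]", "myapp", "somearg");
    PyDict_SetItemString(config, "argv", argv);
    Py_DECREF(argv);

    if (Py_ReadConfiguration(config) < 0) {
        /* Exception set; an error return rather than a fatal error */
    }

    /* Settings may also be adjusted *after* the call, e.g. to
       remove sys.path[0] rather than replace the path entirely */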


Supported configuration settings
--------------------------------

At least the following configuration settings will be supported::

    raw_argv (list of str, default = retrieved from OS APIs)

    argv (list of str, default = derived from raw_argv)
    warnoptions (list of str, default = derived from raw_argv and environment)
    xoptions (list of str, default = derived from raw_argv and environment)

    program_name (str, default = retrieved from OS APIs)
    executable (str, default = derived from program_name)
    home (str, default = complicated!)
    prefix (str, default = complicated!)
    exec_prefix (str, default = complicated!)
    base_prefix (str, default = complicated!)
    base_exec_prefix (str, default = complicated!)
    path (list of str, default = complicated!)

    io_encoding (str, default = derived from environment or OS APIs)
    fs_encoding (str, default = derived from OS APIs)

    skip_signal_handlers (boolean, default = derived from environment or False)
    ignore_environment (boolean, default = derived from environment or False)
    dont_write_bytecode (boolean, default = derived from environment or False)
    no_site (boolean, default = derived from environment or False)
    no_user_site (boolean, default = derived from environment or False)
    <TBD: at least more from sys.flags need to go here>



Completing the interpreter initialisation
-----------------------------------------

The final step in the process is to actually put the configuration settings
into effect and finish bootstrapping the interpreter up to full operation::

    int Py_EndInitialization(PyObject *config);

Like Py_ReadConfiguration, this call will set an exception and report an
error return, rather than treating a problem with the config data as a
fatal error.

After a successful call, Py_IsInitializing() will be false, while
Py_IsInitialized() will become true. The caveats described above for the
interpreter during the initialisation phase will no longer hold.
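Putting the pieces together, the full proposed startup sequence for an
embedding application would look roughly like this (a sketch of the
*proposed* API only, subject to change as the PEP matures)::

    /* Phase 1: core bootstrap (fatal error on failure) */
    Py_BeginInitialization(NULL);

    /* Phase 2: determine the full configuration */
    PyObject *config = PyDict_New();
    if (Py_ReadConfiguration(config) < 0)
        goto error;

    /* ... inspect or adjust the config dict here ... */

    /* Phase 3: put the configuration into effect */
    if (Py_EndInitialization(config) < 0)
        goto error;

    /* Py_IsInitialized() is now true; run the application,
       then shut down as usual */
    Py_Finalize();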


Stable ABI
----------

All of the APIs proposed in this PEP are excluded from the stable ABI, as
embedding a Python interpreter involves a much higher degree of coupling
than merely writing an extension.


Backwards Compatibility
-----------------------

Backwards compatibility will be preserved primarily by ensuring that
Py_ReadConfiguration() interrogates all the previously defined configuration
settings stored in global variables and environment variables.

One acknowledged incompatibility is that some environment variables which
are currently read lazily may instead be read once during interpreter
initialisation. As the PEP matures, these will be discussed in more detail
on a case by case basis.

The Py_Initialize() style of initialisation will continue to be supported.
It will use the new API internally, but will continue to exhibit the same
behaviour as it does today, ensuring that sys.argv is not set until a
subsequent PySys_SetArgv call.


A System Python Executable
==========================

When executing system utilities with administrative access to a system, many
of the default behaviours of CPython are undesirable, as they may allow
untrusted code to execute with elevated privileges. The most problematic
aspects are the fact that user site directories are enabled,
environment variables are trusted and that the directory containing the
executed file is placed at the beginning of the import path.

Currently, providing a separate executable with different default behaviour
would be prohibitively hard to maintain. One of the goals of this PEP is to
make it possible to replace much of the hard to maintain bootstrapping code
with more normal CPython code, as well as making it easier for a separate
application to make use of key components of ``Py_Main``. Including this
change in the PEP is designed to help avoid acceptance of a design that
sounds good in theory but proves to be problematic in practice.

One final aspect not addressed by the general embedding changes above is
the current inaccessibility of the core logic for deciding between the
different execution modes supported by CPython:

* script execution
* directory/zipfile execution
* command execution ("-c" switch)
* module or package execution ("-m" switch)
* execution from stdin (non-interactive)
* interactive stdin

<TBD: concrete proposal for better exposing the __main__ execution step>

Implementation
==============

None as yet. Once I have a reasonably solid plan of attack, I intend to work
on a reference implementation as a feature branch in my BitBucket sandbox [2_].


References
==========

.. [1] CPython interpreter initialization notes
   (http://wiki.python.org/moin/CPythonInterpreterInitialization)

.. [2] BitBucket Sandbox
   (https://bitbucket.org/ncoghlan/cpython_sandbox)


Copyright
===========
This document has been placed in the public domain.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From benjamin at python.org  Thu Dec 27 17:29:54 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 27 Dec 2012 16:29:54 +0000 (UTC)
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
Message-ID: <loom.20121227T172719-408@post.gmane.org>

Nick Coghlan <ncoghlan at ...> writes:
> 
> PEP: 432
> Title: Simplifying the CPython startup sequence
In general, it looks quite nice. While you're creating new initialization APIs,
it would be nice if they could support (or at least be future compatible with)
an "interpreter context". If we ever get around to killing the C-level global
state in the interpreter, such a struct would hold that state. For example, it
would be nice if, instead of those Py_* option variables, members of a
structure on PyInterpreter were used.



From ubershmekel at gmail.com  Thu Dec 27 17:39:08 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Thu, 27 Dec 2012 18:39:08 +0200
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
Message-ID: <CANSw7KzaHFcmYuTC3X2pm7dH7tVgTF_wetbYDOtSebrnPTQd_A@mail.gmail.com>

On Thu, Dec 27, 2012 at 5:10 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> Performance
> -----------
>
> CPython is used heavily to run short scripts where the runtime is dominated
> by the interpreter initialisation time. Any changes to the startup sequence
> should minimise their impact on the startup overhead. (Given that the
> overhead is dominated by IO operations, this is not currently expected to
> cause any significant problems).
>
>
I'd like to just stress the performance issue. It seems Python 3.3 takes 30%
more time to start versus 2.7 on my Ubuntu machine.

Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121227/671b67e8/attachment.html>

From ubershmekel at gmail.com  Thu Dec 27 17:40:49 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Thu, 27 Dec 2012 18:40:49 +0200
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <CANSw7KzaHFcmYuTC3X2pm7dH7tVgTF_wetbYDOtSebrnPTQd_A@mail.gmail.com>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
	<CANSw7KzaHFcmYuTC3X2pm7dH7tVgTF_wetbYDOtSebrnPTQd_A@mail.gmail.com>
Message-ID: <CANSw7KzWt3O=Y3SJz=Gu_OTuE=04woFWAjcRkLPKmmg11sPzZQ@mail.gmail.com>

On Thu, Dec 27, 2012 at 6:39 PM, Yuval Greenfield <ubershmekel at gmail.com>wrote:

> On Thu, Dec 27, 2012 at 5:10 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
>> Performance
>> -----------
>>
>> CPython is used heavily to run short scripts where the runtime is
>> dominated
>> by the interpreter initialisation time. Any changes to the startup
>> sequence
>> should minimise their impact on the startup overhead. (Given that the
>> overhead is dominated by IO operations, this is not currently expected to
>> cause any significant problems).
>>
>>
> I'd like to just stress the performance issue. It seems python3.3 takes
> 30% more time to start vs 2.7 on my ubuntu.
>
> Yuval
>

Here's the test I used https://gist.github.com/4389657

From solipsis at pitrou.net  Thu Dec 27 17:42:52 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 27 Dec 2012 17:42:52 +0100
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
Message-ID: <20121227174252.562d604c@pitrou.net>

On Fri, 28 Dec 2012 01:10:47 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Performance
> -----------
> 
> CPython is used heavily to run short scripts where the runtime is dominated
> by the interpreter initialisation time. Any changes to the startup sequence
> should minimise their impact on the startup overhead. (Given that the
> overhead is dominated by IO operations, this is not currently expected to
> cause any significant problems).

Do you have any actual measurements to back this up?

Regards

Antoine.




From solipsis at pitrou.net  Thu Dec 27 17:43:59 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 27 Dec 2012 17:43:59 +0100
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
	<CANSw7KzaHFcmYuTC3X2pm7dH7tVgTF_wetbYDOtSebrnPTQd_A@mail.gmail.com>
	<CANSw7KzWt3O=Y3SJz=Gu_OTuE=04woFWAjcRkLPKmmg11sPzZQ@mail.gmail.com>
Message-ID: <20121227174359.2aa1b71a@pitrou.net>

On Thu, 27 Dec 2012 18:40:49 +0200
Yuval Greenfield <ubershmekel at gmail.com>
wrote:
> On Thu, Dec 27, 2012 at 6:39 PM, Yuval Greenfield <ubershmekel at gmail.com>wrote:
> 
> > On Thu, Dec 27, 2012 at 5:10 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> >
> >> Performance
> >> -----------
> >>
> >> CPython is used heavily to run short scripts where the runtime is
> >> dominated
> >> by the interpreter initialisation time. Any changes to the startup
> >> sequence
> >> should minimise their impact on the startup overhead. (Given that the
> >> overhead is dominated by IO operations, this is not currently expected to
> >> cause any significant problems).
> >>
> >>
> > I'd like to just stress the performance issue. It seems python3.3 takes
> > 30% more time to start vs 2.7 on my ubuntu.
> >
> > Yuval
> >
> 
> Here's the test I used https://gist.github.com/4389657

Python 3 simply has more modules to load at startup (for example
because of the IO stack).

Regards

Antoine.




From christian at python.org  Thu Dec 27 21:14:11 2012
From: christian at python.org (Christian Heimes)
Date: Thu, 27 Dec 2012 21:14:11 +0100
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
Message-ID: <50DCAC13.5040303@python.org>

Am 27.12.2012 16:10, schrieb Nick Coghlan:

> Additional configuration is handled via separate API calls::
> 
>     Py_SetProgramName() (call before Py_Initialize())
>     Py_SetPath() (optional, call before Py_Initialize())
>     Py_SetPythonHome() (optional, call before Py_Initialize()???)
>     Py_SetArgv[Ex]() (call after Py_Initialize())

[...]

> The only configuration that currently absolutely needs to be in place
> before even the interpreter core can be initialised is the seed for the
> randomised hash algorithm. However, there are a couple of settings needed
> there: whether or not hash randomisation is enabled at all, and if it's
> enabled, whether or not to use a specific seed value.
> 
> The proposed API for this step in the startup sequence is::
> 
>     void Py_BeginInitialization(Py_CoreConfig *config);
> 
> Like Py_Initialize, this part of the new API treats initialisation failures
> as fatal errors. While that's still not particularly embedding friendly,
> the operations in this step *really* shouldn't be failing, and changing them
> to return error codes instead of aborting would be an even larger task than
> the one already being proposed.
> 
> The new Py_CoreConfig struct holds the settings required for preliminary
> configuration::
> 
>     typedef struct {
>         int use_hash_seed;
>         size_t hash_seed;
>     } Py_CoreConfig;

Hello Nick,

we could use the opportunity and move more settings to Py_CoreConfig. At
the moment several settings are stored in static variables:

Python/pythonrun.c

static wchar_t *progname
static wchar_t *default_home
static wchar_t env_home[PATH_MAX+1]

Modules/getpath.c

static wchar_t prefix[MAXPATHLEN+1]
static wchar_t exec_prefix[MAXPATHLEN+1]
static wchar_t progpath[MAXPATHLEN+1]
static wchar_t *module_search_path
static int module_search_path_malloced
static wchar_t *lib_python = L"lib/python" VERSION;

PC/getpath.c

static wchar_t dllpath[MAXPATHLEN+1]


These settings could be added to the Py_CoreConfig struct and unify the
configuration schema for embedders. Functions like Py_SetProgramName()
would set the members of a global Py_CoreConfig struct.

Christian


From ncoghlan at gmail.com  Fri Dec 28 01:55:52 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Dec 2012 10:55:52 +1000
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <50DCAC13.5040303@python.org>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
	<50DCAC13.5040303@python.org>
Message-ID: <CADiSq7ehi1R38GuOiC3JjH=KAYktodfs5VXTxjx2UH2VaWzkCw@mail.gmail.com>

I was planning to move most of those settings into the config dict. Both
the core config struct and the config dict would then be stored in new
slots in the interpreter struct.

My preference is to push more settings into the config dictionary, since
those can use the C API and frozen bytecode to do their calculations.

--
Sent from my phone, thus the relative brevity :)
On Dec 28, 2012 6:14 AM, "Christian Heimes" <christian at python.org> wrote:

> Am 27.12.2012 16:10, schrieb Nick Coghlan:
>
> > Additional configuration is handled via separate API calls::
> >
> >     Py_SetProgramName() (call before Py_Initialize())
> >     Py_SetPath() (optional, call before Py_Initialize())
> >     Py_SetPythonHome() (optional, call before Py_Initialize()???)
> >     Py_SetArgv[Ex]() (call after Py_Initialize())
>
> [...]
>
> > The only configuration that currently absolutely needs to be in place
> > before even the interpreter core can be initialised is the seed for the
> > randomised hash algorithm. However, there are a couple of settings needed
> > there: whether or not hash randomisation is enabled at all, and if it's
> > enabled, whether or not to use a specific seed value.
> >
> > The proposed API for this step in the startup sequence is::
> >
> >     void Py_BeginInitialization(Py_CoreConfig *config);
> >
> > Like Py_Initialize, this part of the new API treats initialisation
> failures
> > as fatal errors. While that's still not particularly embedding friendly,
> > the operations in this step *really* shouldn't be failing, and changing
> them
> > to return error codes instead of aborting would be an even larger task
> than
> > the one already being proposed.
> >
> > The new Py_CoreConfig struct holds the settings required for preliminary
> > configuration::
> >
> >     typedef struct {
> >         int use_hash_seed;
> >         size_t hash_seed;
> >     } Py_CoreConfig;
>
> Hello Nick,
>
> we could use the opportunity and move more settings to Py_CoreConfig. At
> the moment several settings are stored in static variables:
>
> Python/pythonrun.c
>
> static wchar_t *progname
> static wchar_t *default_home
> static wchar_t env_home[PATH_MAX+1]
>
> Modules/getpath.c
>
> static wchar_t prefix[MAXPATHLEN+1]
> static wchar_t exec_prefix[MAXPATHLEN+1]
> static wchar_t progpath[MAXPATHLEN+1]
> static wchar_t *module_search_path
> static int module_search_path_malloced
> static wchar_t *lib_python = L"lib/python" VERSION;
>
> PC/getpath.c
>
> static wchar_t dllpath[MAXPATHLEN+1]
>
>
> These settings could be added to the Py_CoreConfig struct and unify the
> configuration schema for embedders. Functions like Py_SetProgramName()
> would set the members of a global Py_CoreConfig struct.
>
> Christian
>

From ericsnowcurrently at gmail.com  Fri Dec 28 06:50:28 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 27 Dec 2012 22:50:28 -0700
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <loom.20121227T172719-408@post.gmane.org>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
	<loom.20121227T172719-408@post.gmane.org>
Message-ID: <CALFfu7Cos=jHwSeVNjF3kifi3hvpgp37cidmSFZ70MwGSNNSqw@mail.gmail.com>

On Thu, Dec 27, 2012 at 9:29 AM, Benjamin Peterson <benjamin at python.org> wrote:
> Nick Coghlan <ncoghlan at ...> writes:
>>
>> PEP: 432
>> Title: Simplifying the CPython startup sequence
> b
> In general, it looks quite nice. While you're creating new initialization APIs,
> it would be nice if they could support (or at least be future compatible with) a
> "interpreter context". If we ever get around to killing at the c-level global
> state in the interpreter, such a struct would hold the state. For example, it
> would be nice if instead of those Py_* option variables, members of a structure
> on PyInterpreter were used.

This is exactly what I was wondering, a la subinterpreter support.

-eric


From solipsis at pitrou.net  Fri Dec 28 13:15:22 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 28 Dec 2012 13:15:22 +0100
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
	<50DCAC13.5040303@python.org>
	<CADiSq7ehi1R38GuOiC3JjH=KAYktodfs5VXTxjx2UH2VaWzkCw@mail.gmail.com>
Message-ID: <20121228131522.3c925d3e@pitrou.net>

On Fri, 28 Dec 2012 10:55:52 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:
> I was planning to move most of those settings into the config dict. Both
> the core config struct and the config dict would then be stored in new
> slots in the interpreter struct.
> 
> My preference is to push more settings into the config dictionary, since
> those can use the C API and frozen bytecode to do their calculations.

But dicts are also more annoying to use in C than plain structs.

Regards

Antoine.




From ncoghlan at gmail.com  Fri Dec 28 13:50:07 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Dec 2012 22:50:07 +1000
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <20121228131522.3c925d3e@pitrou.net>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
	<50DCAC13.5040303@python.org>
	<CADiSq7ehi1R38GuOiC3JjH=KAYktodfs5VXTxjx2UH2VaWzkCw@mail.gmail.com>
	<20121228131522.3c925d3e@pitrou.net>
Message-ID: <CADiSq7dNLcwVFuLU=zL5pjo_yMTC9Ex_=bArBFP=aZR7=VvxEQ@mail.gmail.com>

On Fri, Dec 28, 2012 at 10:15 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Fri, 28 Dec 2012 10:55:52 +1000
> Nick Coghlan <ncoghlan at gmail.com> wrote:
>> I was planning to move most of those settings into the config dict. Both
>> the core config struct and the config dict would then be stored in new
>> slots in the interpreter struct.
>>
>> My preference is to push more settings into the config dictionary, since
>> those can use the C API and frozen bytecode to do their calculations.
>
> But dicts are also more annoying to use in C than plain structs.

Yeah, you may be right. I'll add more on the internal storage of the
configuration data and include that as an open question.

I want the dict in the config API so we can distinguish between
"please fill in the default value" and "don't fill this in at all",
but there's nothing stopping us mapping that to a C struct internally.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From mark at hotpy.org  Fri Dec 28 16:45:47 2012
From: mark at hotpy.org (Mark Shannon)
Date: Fri, 28 Dec 2012 15:45:47 +0000
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
Message-ID: <50DDBEAB.6000906@hotpy.org>

On 27/12/12 15:10, Nick Coghlan wrote:

Hi,

> This PEP proposes that CPython move to an explicit 2-phase initialisation

Why only two phases? I was thinking about the initialisation sequence a 
while ago and thought that a three or four phase sequence might be 
appropriate. What matters is that the state in between phases is well 
defined and simple to understand.

You might want to take a look at rubinius which implements most of its 
core components in Ruby, so needs a clearly defined startup sequence.
http://rubini.us/doc/en/bootstrapping/
(Rubinius uses 7 phases, but that would be overkill for CPython)

Cheers,
Mark.




From ncoghlan at gmail.com  Fri Dec 28 19:07:45 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Dec 2012 04:07:45 +1000
Subject: [Python-ideas] PEP 432: Simplifying the CPython startup sequence
In-Reply-To: <50DDBEAB.6000906@hotpy.org>
References: <CADiSq7eMvB9mF8K0WYVhYDPw=gn49GgsZgXL4eSp6BhoAnbEBg@mail.gmail.com>
	<50DDBEAB.6000906@hotpy.org>
Message-ID: <CADiSq7ey+ZKfSUfcXvK9-vf=wOWoJt9OaT4nUJ8WKJdg1_oNdQ@mail.gmail.com>

On Sat, Dec 29, 2012 at 1:45 AM, Mark Shannon <mark at hotpy.org> wrote:
> On 27/12/12 15:10, Nick Coghlan wrote:
>
> Hi,
>
>
>> This PEP proposes that CPython move to an explicit 2-phase initialisation
>
>
> Why only two phases? I was thinking about the initialisation sequence a
> while ago and thought that a three or four phase sequence might be
> appropriate. What matters is that the state in between phases is well
> defined and simple to understand.

The "2-phase" term came from the fact that I'm trying to break
Py_Initialize() into two separate phase changes that roughly
correspond with the locations of the current calls to
_Py_Random_Init() and Py_Initialize() in Py_Main().

There's also at least a 3rd phase (even in the current design),
because there's a "get ready to start executing __main__" phase after
Py_Initialise finishes that changes various attributes on __main__ and
may also modify sys.path[0] and sys.argv[0]. This is the first phase
where user code may execute (Package __init__ modules may run in this
phase when the "-m" switch is used to execute a package or submodule)

So yeah, I need to lose the "2-phase" term, because it's simply wrong.
A more realistic description of the phases proposed in the PEP would
be:

* PreInit Phase - no CPython infrastructure configured; only pure C code
  allowed
* Initializing Phase - after Py_BeginInitialization() is called;
  limitations as described in the PEP
* PreMain Phase - after Py_EndInitialization() is called; __main__
  attributes, sys.path[0] and sys.argv[0] may still be inaccurate
* Main Execution - execution of the main module bytecode has started;
  the interpreter has been fully configured

> You might want to take a look at rubinius which implements most of its core
> components in Ruby, so needs a clearly defined startup sequence.
> http://rubini.us/doc/en/bootstrapping/
> (Rubinius using 7 phases, but that would be overkill for CPython)

Thanks for the reference. However, it looks like most of those seven
stages will still be handled in our preinit phase. It sounds like we
do a *lot* more in C than Rubinius does, so most of that code really
doesn't need much in the way of infrastructure. It's definitely not
*easy* to understand, but we also don't mess with it very often, and
it's the kind of code where having access to more of the Python C API
wouldn't really help all that much.

The key piece I think we're currently missing is the clear phase change
between "PreInit" (can't safely use the Python C API) and "Initializing"
(can use most of the C API, with some restrictions).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From michelelacchia at gmail.com  Sat Dec 29 08:39:05 2012
From: michelelacchia at gmail.com (Michele Lacchia)
Date: Sat, 29 Dec 2012 08:39:05 +0100
Subject: [Python-ideas] [Python-Dev] question about packaging
In-Reply-To: <50DE2762.90509@cavallinux.eu>
References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com>
	<50DE2762.90509@cavallinux.eu>
Message-ID: <CAFjP7=Wc4sRW_CcMRKXeNLigOc02BwQDV4-iduB0Pf-9npEK_Q@mail.gmail.com>

Sorry to interject, but now what should be supported, distlib or
packaging? It seems to me that the former was born to solve some problems
that packaging and distribute still had. In addition to that, packaging has
not been included in Python 3.3, as was first planned.

From ncoghlan at gmail.com  Sat Dec 29 09:25:40 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Dec 2012 18:25:40 +1000
Subject: [Python-ideas] [Python-Dev] question about packaging
In-Reply-To: <CAFjP7=Wc4sRW_CcMRKXeNLigOc02BwQDV4-iduB0Pf-9npEK_Q@mail.gmail.com>
References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com>
	<50DE2762.90509@cavallinux.eu>
	<CAFjP7=Wc4sRW_CcMRKXeNLigOc02BwQDV4-iduB0Pf-9npEK_Q@mail.gmail.com>
Message-ID: <CADiSq7eSu8AZmAUf5Ky-FcB41b4RU9Gp=FQahc=we4hEV4bVtA@mail.gmail.com>

On Sat, Dec 29, 2012 at 5:39 PM, Michele Lacchia
<michelelacchia at gmail.com> wrote:
> Sorry if I interfere, but now what should be supported, distlib or
> packaging? It seems to me that the former was born to solve some problems
> packaging and distribute still had. In addition to that, packaging has not
> been included in Python 3.3, as it was first planned.

Originally, the distutils2 project was going to be the basis of the new
packaging support in the stdlib. The critical problem identified in
the run up to 3.3 was that the level of maturity in distutils2 (and
hence packaging) was hugely variable - some parts were (almost) ready
for inclusion, but many were not. By building up distlib more
incrementally (rather than starting as a fork of distutils), it should
be easier to identify which parts are sufficiently mature for stdlib
inclusion.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From dkreuter at gmail.com  Sun Dec 30 04:25:38 2012
From: dkreuter at gmail.com (David Kreuter)
Date: Sun, 30 Dec 2012 04:25:38 +0100
Subject: [Python-ideas] proposed methods: list.replace / list.indices
Message-ID: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>

Hi python-ideas.

I think it would be nice to have a method in 'list' to replace certain
elements by others in-place. Like this:

    l = [x, a, y, a]
    l.replace(a, b)
    assert l == [x, b, y, b]

The alternatives are longer than they should be, imo. For example:

    for i, n in enumerate(l):
        if n == a:
            l[i] = b

Or:

    l = [b if n==a else n for n in l]

And this is what happens when someone tries to "optimize" this process.
It totally obscures the intention:

    try:
        i = 0
        while i < len(l):
            i = l.index(a, i)
            l[i] = b
            i += 1
    except ValueError:
        pass

If there is a reason not to add '.replace' as built-in method, it could be
implemented in pure python efficiently if python provided a version of
'.index' that returns the index of more than just the first occurrence of a
given item. Like this:

    l = [x, a, b, a]
    for i in l.indices(a):
        l[i] = b

So adding .replace and/or .indices? Good idea? Bad idea?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121230/1b67ec8b/attachment.html>

From python at mrabarnett.plus.com  Sun Dec 30 05:03:18 2012
From: python at mrabarnett.plus.com (MRAB)
Date: Sun, 30 Dec 2012 04:03:18 +0000
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
Message-ID: <50DFBD06.10304@mrabarnett.plus.com>

On 2012-12-30 03:25, David Kreuter wrote:
> Hi python-ideas.
>
> I think it would be nice to have a method in 'list' to replace certain
> elements by others in-place. Like this:
>
>      l = [x, a, y, a]
>      l.replace(a, b)
>      assert l == [x, b, y, b]
>
> The alternatives are longer than they should be, imo. For example:
>
>      for i, n in enumerate(l):
>          if n == a:
>              l[i] = b
>
> Or:
>
>      l = [b if n==a else n for n in l]
>
> And this is what happens when someone tries to "optimize" this process.
> It totally obscures the intention:
>
>      try:
>          i = 0
>          while i < len(l):
>              i = l.index(a, i)
>              l[i] = b
>              i += 1
>      except ValueError:
>          pass
>
> If there is a reason not to add '.replace' as built-in method, it could
> be implemented in pure python efficiently if python provided a version
> of '.index' that returns the index of more than just the first
> occurrence of a given item. Like this:
>
>      l = [x, a, b, a]
>      for i in l.indices(a):
>          l[i] = b
>
> So adding .replace and/or .indices? Good idea? Bad idea?
>
What's your use-case?

I personally can't remember ever needing to do this (or, if I have, it
was so long ago that I can't remember it!).

Features get added to Python only when someone can show a compelling
reason for it and sufficient other people agree.


From dkreuter at gmail.com  Sun Dec 30 05:59:28 2012
From: dkreuter at gmail.com (David Kreuter)
Date: Sun, 30 Dec 2012 05:59:28 +0100
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50DFBD06.10304@mrabarnett.plus.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<50DFBD06.10304@mrabarnett.plus.com>
Message-ID: <CAMiSff4cdLMemuv5tg36N+weZx=6t8rSVMO=+eNWVrRNEY9Qjg@mail.gmail.com>

On Sun, Dec 30, 2012 at 5:03 AM, MRAB <python at mrabarnett.plus.com> wrote:

> On 2012-12-30 03:25, David Kreuter wrote:
>
>> Hi python-ideas.
>>
>> I think it would be nice to have a method in 'list' to replace certain
>> elements by others in-place. Like this:
>>
>>      l = [x, a, y, a]
>>      l.replace(a, b)
>>      assert l == [x, b, y, b]
>>
>> The alternatives are longer than they should be, imo. For example:
>>
>>      for i, n in enumerate(l):
>>          if n == a:
>>              l[i] = b
>>
>> Or:
>>
>>      l = [b if n==a else n for n in l]
>>
>> And this is what happens when someone tries to "optimize" this process.
>> It totally obscures the intention:
>>
>>      try:
>>          i = 0
>>          while i < len(l):
>>              i = l.index(a, i)
>>              l[i] = b
>>              i += 1
>>      except ValueError:
>>          pass
>>
>> If there is a reason not to add '.replace' as built-in method, it could
>> be implemented in pure python efficiently if python provided a version
>> of '.index' that returns the index of more than just the first
>> occurrence of a given item. Like this:
>>
>>      l = [x, a, b, a]
>>      for i in l.indices(a):
>>          l[i] = b
>>
>> So adding .replace and/or .indices? Good idea? Bad idea?
>>
>>  What's your use-case?
>
> I personally can't remember ever needing to do this (or, if I have, it
> was so long ago that I can't remember it!).
>
> Features get added to Python only when someone can show a compelling
> reason for it and sufficient other people agree.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

When I write code for processing graphs it becomes very useful. For example:

    def collapse_edge_undirected_graph(a, b):
        n = Node()
        n.connected = a.connected + b.connected
        for x in a.connected:
            x.connected.replace(a, n)
        for x in b.connected:
            x.connected.replace(b, n)

In other cases one would probably just add another layer of indirection.

    x = Wrapper("x")
    y = Wrapper("y")
    a = Wrapper("a")

    l = [x, a, y, a]
    a.contents = "b" # instead of l.replace(a, b)

But having to add .contents everywhere makes it messy. Graph code is
complicated enough as it is.

And '.index' is basically a resumable search, but instead of using an
iterator interface it requires the user to call it repeatedly.
A method '.indices' returning a generator seems more like the Python way
to approach this.
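That generator idea can be sketched in a few lines (hypothetical
`indices` helper, not an existing list method):

```python
def indices(seq, value):
    # Generator equivalent of calling seq.index(value, i) repeatedly:
    # yields each position i where seq[i] == value.
    for i, x in enumerate(seq):
        if x == value:
            yield i

l = ['x', 'a', 'y', 'a']
for i in indices(l, 'a'):
    l[i] = 'b'   # safe here: slots are overwritten, the list never resizes
print(l)  # → ['x', 'b', 'y', 'b']
```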
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121230/5843b5af/attachment.html>

From p.f.moore at gmail.com  Sun Dec 30 10:04:32 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 30 Dec 2012 09:04:32 +0000
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CAMiSff4cdLMemuv5tg36N+weZx=6t8rSVMO=+eNWVrRNEY9Qjg@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<50DFBD06.10304@mrabarnett.plus.com>
	<CAMiSff4cdLMemuv5tg36N+weZx=6t8rSVMO=+eNWVrRNEY9Qjg@mail.gmail.com>
Message-ID: <CACac1F9EaNaaMgr6+q_peUuW-wW3pvsFjqpKas2eijRYRHqonQ@mail.gmail.com>

On 30 December 2012 04:59, David Kreuter <dkreuter at gmail.com> wrote:
> When I write code for processing graphs it becomes very useful. For example:
>
>     def collapse_edge_undirected_graph(a, b):
>         n = Node()
>         n.connected = a.connected + b.connected
>         for x in a.connected:
>             x.connected.replace(a, n)
>         for x in b.connected:
>             x.connected.replace(b, n)

Assuming n.connected is the set of nodes connected to n, why use a
list rather than a set? And if you need multi-edges, a dict mapping
node to count of edges (i.e. a multiset).
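A multiset version of that suggestion might look like this (hypothetical
Node class; note it deliberately discards edge order):

```python
from collections import Counter

class Node:
    def __init__(self):
        # neighbour -> number of parallel edges (a multiset of neighbours)
        self.connected = Counter()

def collapse_edge(a, b):
    # Merge nodes a and b (assumed not directly connected, for brevity)
    # into a fresh node n, rewiring every neighbour's counter; the
    # "replace" step is plain counter bookkeeping, no list method needed.
    n = Node()
    n.connected = a.connected + b.connected
    for x in list(n.connected):
        x.connected[n] = x.connected.pop(a, 0) + x.connected.pop(b, 0)
    return n
```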

Paul.


From dkreuter at gmail.com  Sun Dec 30 10:24:31 2012
From: dkreuter at gmail.com (David Kreuter)
Date: Sun, 30 Dec 2012 10:24:31 +0100
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CACac1F9EaNaaMgr6+q_peUuW-wW3pvsFjqpKas2eijRYRHqonQ@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<50DFBD06.10304@mrabarnett.plus.com>
	<CAMiSff4cdLMemuv5tg36N+weZx=6t8rSVMO=+eNWVrRNEY9Qjg@mail.gmail.com>
	<CACac1F9EaNaaMgr6+q_peUuW-wW3pvsFjqpKas2eijRYRHqonQ@mail.gmail.com>
Message-ID: <CAMiSff4HHXdvUD+aGfN9g7pitYOKm-KL1DyeOi435Lqp4RFXJQ@mail.gmail.com>

On Sun, Dec 30, 2012 at 10:04 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> On 30 December 2012 04:59, David Kreuter <dkreuter at gmail.com> wrote:
> > When I write code for processing graphs it becomes very useful. For
> example:
> >
> >     def collapse_edge_undirected_graph(a, b):
> >         n = Node()
> >         n.connected = a.connected + b.connected
> >         for x in a.connected:
> >             x.connected.replace(a, n)
> >         for x in b.connected:
> >             x.connected.replace(b, n)
>
> Assuming n.connected is the set of nodes connected to n, why use a
> list rather than a set? And if you need multi-edges, a dict mapping
> node to count of edges (i.e. a multiset).
>
> Paul.
>

Ah, that's because in that specific case I'm processing flow graphs. A node
with two outgoing edges represents an 'if'. The order does matter. [1] is
where the flow continues when the condition evaluates to true. [0] for
false.
Forgot to mention that.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121230/b1d13493/attachment.html>

From victor.stinner at gmail.com  Sun Dec 30 11:42:53 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 30 Dec 2012 11:42:53 +0100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
Message-ID: <CAMpsgwbK6k+UpO7Q2qiM+rhNOFhPp1OxQJDT89CP644kxbQuSg@mail.gmail.com>

My astoptimizer provides tools to really *remove* debug code at compile
time, so the overhead of the debug code is zero.

You can, for example, declare the variable project.config.DEBUG as a
constant with the value 0, where project.config is a module. The if
statement in "from project.config import DEBUG ... if DEBUG: ..." will
then be removed.
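The mechanism can be illustrated with the stdlib ast module (a
simplified sketch of the idea, not astoptimizer's actual
implementation): once DEBUG is treated as a false constant, the whole
`if` statement is dead code and can be dropped from the tree.

```python
import ast

src = "if DEBUG:\n    print('debugging')\nx = 1\n"

class StripDebug(ast.NodeTransformer):
    # Treat the name DEBUG as a compile-time false constant and delete
    # any `if DEBUG:` statement outright (else blocks ignored for brevity).
    def visit_If(self, node):
        if isinstance(node.test, ast.Name) and node.test.id == "DEBUG":
            return None  # returning None removes the statement
        return node

tree = ast.fix_missing_locations(StripDebug().visit(ast.parse(src)))
print(ast.unparse(tree))  # → x = 1
```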

See:
https://bitbucket.org/haypo/astoptimizer

Victor
On 25 Dec 2012 13:43, "Rene Nejsum" <rene at stranden.com> wrote:

> I understand and agree with all your arguments on debugging.
>
> At my company we typically make some kind of backend/server control
> software, with a LOT of debugging lines across many modules. We have 20+
> debugging flags and in different situations we enable a few of those, if we
> were to enable all at once it would definitely have an impact on production,
> but hopefully just a hotter CPU and a lot of disk space being used.
>
> debug statements in our code is probably one per 10-20 lines of code.
>
> I think my main issue (and what I therefore read into the original
> suggestion) was the extra "if" statement at every log statement
>
> So doing:
>
> if log.debug.enabled():
>         log.debug( bla. bla. )
>
> Adds 5-10% extra code lines, whereas if we could do:
>
> log.debug( bla. bla )
>
> at the same cost would save a lot of lines.
>
> And when you have 43 lines in your editor, it will give you 3-5 lines more
> of real code to look at  :-)
>
> /Rene
>
>
>
> On Dec 25, 2012, at 1:28 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
> > On Tue, Dec 25, 2012 at 9:11 PM, Rene Nejsum <rene at stranden.com> wrote:
> >> But if debug() was indeed NOP'able, maybe it could be done ?
> >
> > If someone *really* wants to do this, they can abuse assert statements
> > (which will be optimised out under "-O", just like code guarded by "if
> > __debug__"). That doesn't make it a good idea - you most need log
> > messages to investigate faults in production systems that you can't
> > (or are still trying to) reproduce in development and integration
> > environments. Compiling them out instead of deactivating them with
> > runtime configuration settings means you can't switch them on without
> > restarting the system with different options.
> >
> > This does mean that you have to factor in the cost of logging into
> > your performance targets and hardware requirements, but the payoff is
> > an increased ability to correctly diagnose system faults (as well as
> > improving your ability to extract interesting metrics from log
> > messages).
> >
> > Excessive logging calls certainly *can* cause performance problems due
> > to the function call overhead, as can careless calculation of
> > expensive values that aren't needed.  One alternatives occasional
> > noted is that you could design a logging API that can accept lazily
> > evaluated callables instead of ordinary parameters.
> >
> > However, one danger of such expensive logging is that enabling that
> > logging level becomes infeasible in practice, because the performance
> > hit is too significant. The typical aim for logging is that your
> > overhead should be such that enabling it in production means your
> > servers run a little hotter, or your task takes a little longer, not
> > that your application grinds to a halt. One good way to achieve this
> > is to decouple the expensive calculations from the main application -
> > you instead log the necessary pieces of information, which can be
> > picked up by an external service and the calculation performed in a
> > separate process (or even on a separate machine) where it won't affect
> > the main application, and where you only calculate it if you actually
> > need it for some reason.
> >
> > Cheers,
> > Nick.
> >
> > --
> > Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
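Worth noting alongside this thread: the stdlib logging module already
defers string formatting until a record is known to be emitted, and
`Logger.isEnabledFor()` covers the genuinely expensive cases — a small
sketch (logger name "app" is arbitrary):

```python
import logging

log = logging.getLogger("app")
log.setLevel(logging.INFO)

# The %-formatting below is deferred: it only happens if the record will
# actually be emitted, so a disabled debug call costs one cheap check.
log.debug("state dump: %s", "placeholder-for-a-big-object")

# For arguments that are expensive even to compute, guard explicitly:
if log.isEnabledFor(logging.DEBUG):
    log.debug("expensive: %s", sum(range(10**6)))
```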
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121230/06d9f83c/attachment.html>

From tjreedy at udel.edu  Sun Dec 30 11:46:45 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 30 Dec 2012 05:46:45 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
Message-ID: <kbp63c$tjg$1@ger.gmane.org>

On 12/29/2012 10:25 PM, David Kreuter wrote:

> I think it would be nice to have a method in 'list' to replace certain
> elements by others in-place. Like this:
>
>      l = [x, a, y, a]
>      l.replace(a, b)
>      assert l == [x, b, y, b]
>
> The alternatives are longer than they should be, imo. For example:
>
>      for i, n in enumerate(l):
>          if n == a:
>              l[i] = b

I don't see anything wrong with this. It is how I would do it in Python.
Wrap it in a function if you want. Or write it on two lines ;-).

> If there is a reason not to add '.replace' as built-in method,

There is a perfectly good python version above that does the necessary 
search and replace as efficiently as possible. Thank you for posting it.

-- 
Terry Jan Reedy



From stefan_ml at behnel.de  Sun Dec 30 11:58:21 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 30 Dec 2012 11:58:21 +0100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <CAMpsgwbK6k+UpO7Q2qiM+rhNOFhPp1OxQJDT89CP644kxbQuSg@mail.gmail.com>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<CAMpsgwbK6k+UpO7Q2qiM+rhNOFhPp1OxQJDT89CP644kxbQuSg@mail.gmail.com>
Message-ID: <kbp6oa$2ol$1@ger.gmane.org>

Victor Stinner, 30.12.2012 11:42:
> My astoptimizer provides tools to really *remove* debug at compilation, so
> the overhead of the debug code is just null.
> 
> You can for example declare your variable project.config.DEBUG as constant
> with the value 0, where project.config is a module. So the if statement in
> "from project.config import DEBUG ... if DEBUG: ..." will be removed.

How would you know at compile time that it can be removed? How do you
handle the example below?

Stefan


## constants.py

DEBUG = False


## enable_debug.py

import constants
constants.DEBUG = True


## test.py

import enable_debug
from constants import DEBUG

if DEBUG:
    print("DEBUGGING !")
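The three files above can be collapsed into one runnable script (using
types.ModuleType to stand in for the separate modules): DEBUG is True
by the time test.py binds the name, so folding it to False at compile
time would silently change behaviour.

```python
import sys
import types

# constants.py
constants = types.ModuleType("constants")
constants.DEBUG = False
sys.modules["constants"] = constants

# enable_debug.py: flips the flag at import time
constants.DEBUG = True

# test.py
from constants import DEBUG
print(DEBUG)  # → True: the "constant" was reassigned before being bound
```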




From ned at nedbatchelder.com  Sun Dec 30 15:10:01 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sun, 30 Dec 2012 09:10:01 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <kbp63c$tjg$1@ger.gmane.org>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org>
Message-ID: <50E04B39.2040508@nedbatchelder.com>

On 12/30/2012 5:46 AM, Terry Reedy wrote:
> On 12/29/2012 10:25 PM, David Kreuter wrote:
>
>> I think it would be nice to have a method in 'list' to replace certain
>> elements by others in-place. Like this:
>>
>>      l = [x, a, y, a]
>>      l.replace(a, b)
>>      assert l == [x, b, y, b]
>>
>> The alternatives are longer than they should be, imo. For example:
>>
>>      for i, n in enumerate(l):
>>          if n == a:
>>              l[i] = b
>
> I don't see anything wrong with this. It is how I would do it in
> Python. Wrap it in a function if you want. Or write it on two lines ;-).

I wonder at the underlying philosophy of things being accepted or 
rejected in this way.  For example, here's a thought experiment: if 
list.count() and list.index() didn't exist yet, would we accept them as 
additions to the list methods?  By Terry's reasoning, there's no need 
to, since I can implement those operations in a few lines of Python.  
Does that mean they persist only for backwards compatibility?  Was their 
initial inclusion a violation of some "list method philosophy"?  Or is 
there a good reason for them to exist, and if so, why shouldn't 
.replace() and .indexes() also exist?

The two sides (count/index and replace/indexes) seem about the same to me:

- They are unambiguous operations.  That is, no one has objected that 
reasonable people might disagree about how .replace() should behave, 
which is a common reason not to add things to the stdlib.
- They implement simple operations that are easy to explain and will 
find use.  In my experience, .indexes() is at least as useful as .count().
- All are based on element equality semantics.
- Any of them could be implemented in a few lines of Python.

What is the organizing principle for the methods list (or any other 
built-in data structure) should have? I would hate for the main 
criterion to be, "these are the methods that existed in Python 2.3," for 
example.   Why is .count() in and .replace() out?

>
>> If there is a reason not to add '.replace' as built-in method,
>
> There is a perfectly good python version above that does the necessary 
> search and replace as efficiently as possible. Thank you for posting it.
>

You say "as efficiently as possible," but you mean, "as algorithmically 
efficient as possible," which is true, they are linear, which is as good 
as it's going to get.  But surely if coded in C, these operations would 
be faster.

--Ned.


From ubershmekel at gmail.com  Sun Dec 30 15:51:23 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Sun, 30 Dec 2012 16:51:23 +0200
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E04B39.2040508@nedbatchelder.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
Message-ID: <CANSw7Kx0idQRgwxqP1FgXVrGXWa9eWEjkkz99h9ggbc9Us=F5g@mail.gmail.com>

On Sun, Dec 30, 2012 at 4:10 PM, Ned Batchelder <ned at nedbatchelder.com>wrote:

> I wonder at the underlying philosophy of things being accepted or rejected
> in this way.
>

I'm no expert on the subject but here are a few criteria for builtin method
inclusion:

* Useful - show many popular use cases, e.g. attach many links to various
lines on github/stackoverflow/bitbucket.
* Hard to get right, i.e. user implementations tend to have bugs.
* Would benefit greatly from C optimization
* Have a great, obvious, specific, readable name
* Don't overlap with anything else in the stdlib - TSBOAPOOOWTDI
* Consistent with the rest of python, e.g.
* Community approval
* BDFL approval

Brett wrote a bit on stdlib inclusion which may be relevant
http://mail.python.org/pipermail/python-3000/2006-June/002442.html

"that way may not be obvious at first unless you're Dutch."

Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121230/c707a29b/attachment.html>

From ncoghlan at gmail.com  Sun Dec 30 16:05:45 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 31 Dec 2012 01:05:45 +1000
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E04B39.2040508@nedbatchelder.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
Message-ID: <CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>

On Mon, Dec 31, 2012 at 12:10 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
> The two sides (count/index and replace/indexes) seem about the same to me:
>
> - They are unambiguous operations.  That is, no one has objected that
> reasonable people might disagree about how .replace() should behave, which
> is a common reason not to add things to the stdlib.
> - They implement simple operations that are easy to explain and will find
> use.  In my experience, .indexes() is at least as useful as .count().
> - All are based on element equality semantics.
> - Any of them could be implemented in a few lines of Python.
>
> What is the organizing principle for the methods list (or any other built-in
> data structure) should have? I would hate for the main criterion to be,
> "these are the methods that existed in Python 2.3," for example.   Why is
> .count() in and .replace() out?

The general problem with adding new methods to types rather than
adding new functions+protocols is that it breaks ducktyping. We can
mitigate that now by adding the new methods to
collections.abc.Sequence, but it remains the case that relying on
these methods being present rather than using the functional
equivalent will needlessly couple your code to the underlying sequence
implementation (since not all sequences inherit from the ABC, some are
just registered).
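The inherit-vs-register distinction can be shown concretely (minimal
sketch; both classes wrap the same three-element sequence):

```python
from collections.abc import Sequence

class Inherited(Sequence):
    def __getitem__(self, i):
        return "abc"[i]
    def __len__(self):
        return 3

class Registered:
    def __getitem__(self, i):
        return "abc"[i]
    def __len__(self):
        return 3

Sequence.register(Registered)

# Inheriting pulls in the mixin methods (index, count, __contains__, ...):
print(Inherited().count("a"))             # → 1
# Registration only affects isinstance/issubclass checks; no methods are
# added, so ducktyped code cannot rely on their presence:
print(isinstance(Registered(), Sequence))  # → True
print(hasattr(Registered(), "count"))      # → False
```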

We also have a problem with replace() specifically that it *does*
already exist in the standard library, as a non-mutating operation on
str, bytes and bytearray. Adding it as a mutating method on sequences
in general would create an immediate name conflict in the bytearray
method namespace. That alone is a dealbreaker for that part of the
idea.

The question of an "indices" builtin or itertools function is
potentially more interesting, but really, I don't think the algorithm
David noted in his original post rises to the level of needing
standardisation or acceleration:

    def indices(seq, val):
        for i, x in enumerate(seq):
            if x == val: yield i

    def map_assign(store, keys, val):
        for k in keys:
            store[k] = val

    def replace(seq, old, new):
        map_assign(seq, indices(seq, old), new)

    seq = [x, a, y, a]
    replace(seq, a, b)
    assert seq == [x, b, y, b]

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ned at nedbatchelder.com  Sun Dec 30 16:13:10 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sun, 30 Dec 2012 10:13:10 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CANSw7Kx0idQRgwxqP1FgXVrGXWa9eWEjkkz99h9ggbc9Us=F5g@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CANSw7Kx0idQRgwxqP1FgXVrGXWa9eWEjkkz99h9ggbc9Us=F5g@mail.gmail.com>
Message-ID: <50E05A06.8010308@nedbatchelder.com>

On 12/30/2012 9:51 AM, Yuval Greenfield wrote:
> On Sun, Dec 30, 2012 at 4:10 PM, Ned Batchelder <ned at nedbatchelder.com 
> <mailto:ned at nedbatchelder.com>> wrote:
>
>     I wonder at the underlying philosophy of things being accepted or
>     rejected in this way.
>
>
> I'm no expert on the subject but here are a few criteria for builtin 
> method inclusion:
>
> * Useful - show many popular use cases, e.g. attach many links to 
> various lines on github/stackoverflow/bitbucket.
> * Hard to get right, i.e. user implementations tend to have bugs.
> * Would benefit greatly from C optimization
> * Have a great, obvious, specific, readable name
> * Don't overlap with anything else in the stdlib - TSBOAPOOOWTDI
> * Consistent with the rest of python, e.g.
> * Community approval
> * BDFL approval
>

This is a good list.  To make this concrete: in your opinion, would 
list.replace() and list.indexes() pass these criteria, or not?

--Ned.

> Brett wrote a bit on stdlib inclusion which may be relevant 
> http://mail.python.org/pipermail/python-3000/2006-June/002442.html
>
> "that way may not be obvious at first unless you're Dutch."
>
> Yuval

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121230/2c52c281/attachment.html>

From ned at nedbatchelder.com  Sun Dec 30 17:00:31 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sun, 30 Dec 2012 11:00:31 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
Message-ID: <50E0651F.6000305@nedbatchelder.com>

On 12/30/2012 10:05 AM, Nick Coghlan wrote:
> On Mon, Dec 31, 2012 at 12:10 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
>> The two sides (count/index and replace/indexes) seem about the same to me:
>>
>> - They are unambiguous operations.  That is, no one has objected that
>> reasonable people might disagree about how .replace() should behave, which
>> is a common reason not to add things to the stdlib.
>> - They implement simple operations that are easy to explain and will find
>> use.  In my experience, .indexes() is at least as useful as .count().
>> - All are based on element equality semantics.
>> - Any of them could be implemented in a few lines of Python.
>>
>> What is the organizing principle for the methods list (or any other built-in
>> data structure) should have? I would hate for the main criterion to be,
>> "these are the methods that existed in Python 2.3," for example.   Why is
>> .count() in and .replace() out?
> The general problem with adding new methods to types rather than
> adding new functions+protocols is that it breaks ducktyping. We can
> mitigate that now by adding the new methods to
> collections.abc.Sequence, but it remains the case that relying on
> these methods being present rather than using the functional
> equivalent will needlessly couple your code to the underlying sequence
> implementation (since not all sequences inherit from the ABC, some are
> just registered).
>
> We also have a problem with replace() specifically that it *does*
> already exist in the standard library, as a non-mutating operation on
> str, bytes and bytearray. Adding it as a mutating method on sequences
> in general would create an immediate name conflict in the bytearray
> method namespace. That alone is a dealbreaker for that part of the
> idea.

I don't understand the conflict?  .replace() from sequence does 
precisely the same thing as .replace() from bytes if you limit the 
arguments to single-byte values.  It seems perfectly natural to me. I 
must be missing something.

>
> The question of an "indices" builtin or itertools function is
> potentially more interesting, but really, I don't think the algorithm
> David noted in his original post rises to the level of needing
> standardisation or acceleration:
>
>      def indices(seq, val):
>          for i, x in enumerate(seq):
>              if x == val: yield i
>
>      def map_assign(store, keys, val):
>          for k in keys:
>              store[k] = val
>
>      def replace(seq, old, new):
>          map_assign(seq, indices(seq, old), new)
>
>      seq = [x, a, y, a]
>      replace(seq, a, b)
>      assert seq == [x, b, y, b]

Does this mean that if .index() or .count() didn't already exist, you 
wouldn't add them to list?

> Cheers,
> Nick.
>



From python at mrabarnett.plus.com  Sun Dec 30 17:58:06 2012
From: python at mrabarnett.plus.com (MRAB)
Date: Sun, 30 Dec 2012 16:58:06 +0000
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E0651F.6000305@nedbatchelder.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
	<50E0651F.6000305@nedbatchelder.com>
Message-ID: <50E0729E.1040208@mrabarnett.plus.com>

On 2012-12-30 16:00, Ned Batchelder wrote:
> On 12/30/2012 10:05 AM, Nick Coghlan wrote:
>> On Mon, Dec 31, 2012 at 12:10 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
>>> The two sides (count/index and replace/indexes) seem about the same to me:
>>>
>>> - They are unambiguous operations.  That is, no one has objected that
>>> reasonable people might disagree about how .replace() should behave, which
>>> is a common reason not to add things to the stdlib.
>>> - They implement simple operations that are easy to explain and will find
>>> use.  In my experience, .indexes() is at least as useful as .count().
>>> - All are based on element equality semantics.
>>> - Any of them could be implemented in a few lines of Python.
>>>
>>> What is the organizing principle for the methods list (or any other built-in
>>> data structure) should have? I would hate for the main criterion to be,
>>> "these are the methods that existed in Python 2.3," for example.   Why is
>>> .count() in and .replace() out?
>> The general problem with adding new methods to types rather than
>> adding new functions+protocols is that it breaks ducktyping. We can
>> mitigate that now by adding the new methods to
>> collections.abc.Sequence, but it remains the case that relying on
>> these methods being present rather than using the functional
>> equivalent will needlessly couple your code to the underlying sequence
>> implementation (since not all sequences inherit from the ABC, some are
>> just registered).
>>
>> We also have a problem with replace() specifically that it *does*
>> already exist in the standard library, as a non-mutating operation on
>> str, bytes and bytearray. Adding it as a mutating method on sequences
>> in general would create an immediate name conflict in the bytearray
>> method namespace. That alone is a dealbreaker for that part of the
>> idea.
>
> I don't understand the conflict?  .replace() from sequence does
> precisely the same thing as .replace() from bytes if you limit the
> arguments to single-byte values.  It seems perfectly natural to me. I
> must be missing something.
>
[snip]
The difference is that for bytes and str it returns the result (they
are immutable, after all), but the suggested addition would mutate the
list in place. To be consistent, it would have to return the result
instead.
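A quick sketch of the mismatch (replace_in_place here is just a
stand-in for the proposed method, which doesn't exist):

```python
# bytes/bytearray/str .replace() returns a new object and leaves the
# original alone:
ba = bytearray(b"xaya")
new = ba.replace(b"a", b"b")
assert new == bytearray(b"xbyb")
assert ba == bytearray(b"xaya")  # original unchanged

# The proposed list method would instead mutate in place and return
# None, sketched here as a plain function:
def replace_in_place(seq, old, new):
    for i, item in enumerate(seq):
        if item == old:
            seq[i] = new

l = ["x", "a", "y", "a"]
replace_in_place(l, "a", "b")
assert l == ["x", "b", "y", "b"]  # original mutated
```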



From hernan.grecco at gmail.com  Sun Dec 30 18:54:43 2012
From: hernan.grecco at gmail.com (Hernan Grecco)
Date: Sun, 30 Dec 2012 18:54:43 +0100
Subject: [Python-ideas] Order in the documentation search results
Message-ID: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>

Hi,

I have seen many people new to Python stumbling while using the Python
docs due to the order of the search results.

For example, if somebody new to Python searches for `tuple`, the
actual section about `tuple` comes in 39th place. What is more confusing
for people starting with the language is that all the C functions come
first. I have seen people clicking on PyTupleObject just to be totally
disoriented.

Maybe `tuple` is a silly example. But if somebody wants to know how
`open` behaves and which arguments it takes, the result comes in
position 16. `property` does not appear in the list at all (but
built-in appears in position 31). This is true for most builtins.

Experienced people will have no trouble navigating through these
results, but new users do. It is not terrible, and in the end they get
it, but I think it would be nice to change it to a more (new-)user-friendly
order.

So my suggestion is to put the builtins first, the rest of the
standard lib later, including HowTos, FAQs, etc., and finally the
C modules. Additionally, a section with a title matching the search
query exactly should come first. (I am not sure if the last suggestion
belongs in python-ideas or on the sphinx mailing list; please advise.)

Thanks,

Hernan


From ned at nedbatchelder.com  Sun Dec 30 19:11:06 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sun, 30 Dec 2012 13:11:06 -0500
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
Message-ID: <50E083BA.7000603@nedbatchelder.com>

On 12/30/2012 12:54 PM, Hernan Grecco wrote:
> Hi,
>
> I have seen many people new to Python stumbling while using the Python
> docs due to the order of the search results.
> [snip]

While we're on the topic, why in this day and age do we have a custom 
search?  Using google site search would be faster for the user, and more 
accurate.

--Ned.




From ezio.melotti at gmail.com  Sun Dec 30 19:11:27 2012
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Sun, 30 Dec 2012 20:11:27 +0200
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
Message-ID: <CACBhJdGP4bbYqhU=Yg+u3LqdZ-Jmc65tL4c+atuD0m5Pbt4G2Q@mail.gmail.com>

Hi,

On Sun, Dec 30, 2012 at 7:54 PM, Hernan Grecco <hernan.grecco at gmail.com>wrote:

> Hi,
>
> I have seen many people new to Python stumbling while using the Python
> docs due to the order of the search results.
> [snip]

I experimented with this a bit a while ago.  See
http://bugs.python.org/issue15871#msg170048.

Best Regards,
Ezio Melotti
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121230/3ba2435a/attachment.html>

From hernan.grecco at gmail.com  Sun Dec 30 19:18:28 2012
From: hernan.grecco at gmail.com (Hernan Grecco)
Date: Sun, 30 Dec 2012 19:18:28 +0100
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <CAL6gwWXWdWQPwV_F__BJb=qrPs88P5dtyNBfjVvQARHK=jeMeA@mail.gmail.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
	<50E083BA.7000603@nedbatchelder.com>
	<CAL6gwWXWdWQPwV_F__BJb=qrPs88P5dtyNBfjVvQARHK=jeMeA@mail.gmail.com>
Message-ID: <CAL6gwWVN=dj2vSxX8Jj==zbCiJtPdEtu4oNunQBbq+Qger5zHw@mail.gmail.com>

Hi Ned,

On Sun, Dec 30, 2012 at 7:11 PM, Ned Batchelder <ned at nedbatchelder.com> wrote:
>
> While we're on the topic, why in this day and age do we have a custom
> search?  Using google site search would be faster for the user, and more
> accurate.
>
> --Ned.

In general I agree with you, but I find downloadable documentation
very useful (one of the many reasons that I like sphinx). Keeping the
search engine in that case is very convenient.

Hernan


From g.brandl at gmx.net  Sun Dec 30 20:45:53 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 30 Dec 2012 20:45:53 +0100
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <50E083BA.7000603@nedbatchelder.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
	<50E083BA.7000603@nedbatchelder.com>
Message-ID: <kbq5l9$g8o$1@ger.gmane.org>

On 12/30/2012 07:11 PM, Ned Batchelder wrote:
> On 12/30/2012 12:54 PM, Hernan Grecco wrote:
>> Hi,
>>
>> I have seen many people new to Python stumbling while using the Python
>> docs due to the order of the search results.
>>
>> For example, if somebody new to python searches for  `tuple`, the
>> actual section about `tuple` comes in place 39. What is more confusing
>> for people starting with the language is that all the C functions come
>> first. I have seen people clicking in PyTupleObject just to be totally
>> disoriented.
>>
>> Maybe `tuple` is a silly example. But if somebody wants to know how
>> does `open` behaves and which arguments it takes, the result comes in
>> position 16. `property` does not appear in the list at all (but
>> built-in appears in position 31). This is true for most builtins.
>>
>> Experienced people will have no trouble navigating through these
>> results, but new users do. It is not terrible and at the end they get
>> it, but I think it would be nice to change it to more (new) user
>> friendly order.
>>
>> So my suggestion is to put the builtins first, the rest of the
>> standard lib later including HowTos, FAQ, etc and finally the
>> c-modules. Additionally, a section with a title matching exactly the
>> search query should come first. (I am not sure if the last suggestion
>> belongs in python-ideas or in
>> the sphinx mailing list, please advice)
> 
> While we're on the topic, why in this day and age do we have a custom 
> search?  Using google site search would be faster for the user, and more 
> accurate.

I agree.  Someone needs to propose a patch though.

cheers,
Georg



From random832 at fastmail.us  Sun Dec 30 22:28:23 2012
From: random832 at fastmail.us (Random832)
Date: Sun, 30 Dec 2012 16:28:23 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
Message-ID: <50E0B1F7.1020000@fastmail.us>

On 12/30/2012 10:05 AM, Nick Coghlan wrote:
> The general problem with adding new methods to types rather than
> adding new functions+protocols is that it breaks ducktyping. We can
> mitigate that now by adding the new methods to
> collections.abc.Sequence, but it remains the case that relying on
> these methods being present rather than using the functional
> equivalent will needlessly couple your code to the underlying sequence
> implementation (since not all sequences inherit from the ABC, some are
> just registered).

You know what wouldn't break duck typing? Adding an 
extension-method-like (a la C#) mechanism to ABCs. Of course, the 
problem with that is, what if a sequence implements a method called 
replace that does something else?
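Roughly what I mean, sketched with an ordinary mixin (ReplaceMixin and
MyList are invented names for illustration, not a real mechanism):

```python
# A default implementation that could, in principle, be provided at the
# ABC level and picked up by every conforming mutable sequence:
class ReplaceMixin:
    def replace(self, old, new):
        for i, item in enumerate(self):
            if item == old:
                self[i] = new

class MyList(ReplaceMixin, list):
    pass

seq = MyList(["x", "a", "y", "a"])
seq.replace("a", "b")
assert list(seq) == ["x", "b", "y", "b"]

# The name-clash problem: a class that already defines its own replace()
# wins by normal attribute lookup, so the "extension method" vanishes.
class Weird(ReplaceMixin, list):
    def replace(self, *args):
        return "something else entirely"

assert Weird().replace("a", "b") == "something else entirely"
```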


From victor.stinner at gmail.com  Sun Dec 30 23:20:34 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 30 Dec 2012 23:20:34 +0100
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
Message-ID: <CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>

2012/12/26 anatoly techtonik <techtonik at gmail.com>:
> I am thinking about [python-wart] on SO.

I'm not sure that StackOverflow is the best place for such a project.
(Note: please avoid abbreviations; not all people know this website.)

> There is no currently a list of
> Python warts, and building a better language is impossible without a clear
> visibility of warts in current implementations.

Sorry, but what is a wart in Python?

> Why Roundup doesn't work ATM.
> - warts are lost among other "won't fix" and "works for me" issues

When an issue is closed with "won't fix", "works for me", "invalid" or
something like this, a comment always explains why. If you don't
understand, or such a comment is missing, you can ask for more
information.

If you don't agree, the bug tracker is maybe not the right place for
such a discussion. The python-ideas mailing list may be a better place
:-)

Sometimes, the best thing to do is to propose a patch to enhance the
documentation.

> - no way to edit description to make it more clear

You can add comments, it's almost the same.

> - no voting/stars to percieve how important is this issue

Votes are a trap. It's not how Python is developed. Python core
developers are not paid to work on Python, and so work only on issues
which interest them.

I don't think that votes would help to fix an issue.

If you want an issue to be fixed:
- ensure that someone else reproduced it: if not, provide more information
- help to analyze the issue and track the bug in the code
- propose a patch with tests and documentation

> - no comment/noise filtering

I don't have such a problem. Can you give an example of an issue which
contains many useless comments?

> and the most valuable
> - there is no query to list warts sorted by popularity to explore other
> time-consuming areas of Python you are not aware of, but which can popup one
> day

Sorry, I don't understand, maybe because I don't know what a wart is.

--

If I understood correctly, you would like to list some specific issues,
like print() not immediately flushing stdout when you ask it not to
write a newline (print "a", in Python 2, or print("a", end=" ") in
Python 3). If I understood correctly, and if you want to improve
Python, you should help the documentation project. Or you could build a
website listing such issues *and listing solutions*, like calling
sys.stdout.flush() or using print(flush=True) (Python 3.3+) for the
print issue.

A list of such issues without solutions doesn't help anyone.
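For the record, both workarounds are easy to demonstrate (io.StringIO
stands in for stdout here so the effect can be checked):

```python
import io
import sys

# Explicit flush works on any Python version:
print("a", end=" ")
sys.stdout.flush()

# Python 3.3+ can flush from print() itself; shown against a StringIO:
buf = io.StringIO()
print("a", end=" ", file=buf, flush=True)
assert buf.getvalue() == "a "
```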

Victor


From tjreedy at udel.edu  Sun Dec 30 23:59:53 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 30 Dec 2012 17:59:53 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E04B39.2040508@nedbatchelder.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
Message-ID: <kbqh1u$77q$1@ger.gmane.org>

On 12/30/2012 9:10 AM, Ned Batchelder wrote:
> On 12/30/2012 5:46 AM, Terry Reedy wrote:
>> On 12/29/2012 10:25 PM, David Kreuter wrote:
>>
>>> I think it would be nice to have a method in 'list' to replace certain
>>> elements by others in-place. Like this:
>>>
>>>      l = [x, a, y, a]
>>>      l.replace(a, b)
>>>      assert l == [x, b, y, b]
>>>
>>> The alternatives are longer than they should be, imo. For example:
>>>
>>>      for i, n in enumerate(l):

Note that enumerate is a generic function of iterables, not a specific 
list method.

>>>          if n == a:
>>>              l[i] = b
>>
>> I dont see anything wrong with this. It is how I would do it in
>> python. Wrap it in a function if you want. Or write it on two line ;-).

My deeper objection is that 'replace_all_in_place' is a generic mutable
collection function, not a specific list or even mutable sequence
function. Python 1 was more strongly list oriented. Python 3 is mostly
iterable oriented, with remnants of the Python 1 heritage.

> I wonder at the underlying philosophy of things being accepted or
> rejected in this way.  For example, here's a thought experiment: if
> list.count() and list.index() didn't exist yet, would we accept them as
> additions to the list methods?

I personally would have deleted list.find in 3.0. Count and index are 
not list methods but rather sequence methods, part of the sequence ABC. 
Tuples got them, as their only two public methods, in 3.0 to conform. 
This ties in to Nick's comment. (Actually, counting a particular item in
a collection is not specific to sequences, but having multiple items to
count tends to be.) It would be possible for count and index to be
functions instead. But their definition as methods goes back to Python 1.

Also note that .index has a start parameter, making it useful to get all 
indexes. See the code below.
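(As a sketch, here is a generator version of that idea; list_indices is
my name for it, not an existing function:)

```python
def list_indices(lis, value):
    # Let list.index do the scanning in C; restart just past each hit.
    i = -1
    while True:
        try:
            i = lis.index(value, i + 1)
        except ValueError:  # no further occurrences
            return
        yield i

assert list(list_indices([1, 0, 1, 1, 0], 1)) == [0, 2, 3]
```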

> By Terry's reasoning, there's no need
> to, since I can implement those operations in a few lines of Python.

We constantly get proposals to add new functions and methods that are 
easily written in a few lines. Everyone thinks their proposal is useful 
because it is useful in their work. If we accepted all such proposals, 
Python would have hundreds more.

> Does that mean they persist only for backwards compatibility?

Backwards compatibility is important. Changing them to functions would 
be disruptive without sufficient gain.

> Was their initial inclusion a violation of some "list method philosophy"?

No, it was part of the Python 1 philosophy of lists as the common data 
interchange type. As I said, this has changed in Python 3.

> Or is
> there a good reason for them to exist, and if so, why shouldn't
> .replace() and .indexes() also exist?

Neither are list methods. Nick gave a generic indexes generator. A
specific list indexes generator can use repeated applications of .index
with a start argument. I do that 'inline' below.

> I would hate for the main
> criterion to be, "these are the methods that existed in Python 2.3,"

Then you are hating reality ;-). The set of .method()s on the basic
builtin classes is close to frozen.

>> There is a perfectly good python version above that does the necessary
>> search and replace as efficiently as possible. Thank you for posting it.

> You say "as efficiently as possible," but you mean, "as algorithmically
> efficient as possible," which is true, they are linear, which is as good
> as it's going to get.  But surely if coded in C, these operations would
> be faster.

You are right. Let's do the next-item search in C with .index. If the
density of items to be replaced is low, as it would be for most
applications, this should dominate.

def enum(lis):
    for i, n in enumerate(lis):
        if n == 1:
            lis[i] = 2

a, b = 100, 10000  # started with 2,1 for initial tests
start = a*([1]+b*[0])
after = a*([2]+b*[0])

# test that correct before test speed!
# since the list is mutated, it must be reset for each test
lis = start.copy()
enum(lis)
print('enum: ', lis == after)

def repin(lis):
    i = -1
    try:
        while True:
            i = lis.index(1, i+1)
            lis[i] = 2
    except ValueError:  # index() raises when no match remains
        pass

lis = start.copy()
repin(lis)
print('repin: ', lis == after)

from timeit import timeit
# now for speed, remembering to reset for each test
# first measure the copy time to subtract from test times
print(timeit('lis = start.copy()',
              'from __main__ import start', number=10))
print(timeit('lis = start.copy(); enum(lis)',
              'from __main__ import start, enum', number=10))
print(timeit('lis = start.copy(); repin(lis)',
              'from __main__ import start, repin', number=10))
# measure scan without replace to give an upper limit to python-coded replace
# since lis is not mutated, it only needs to be defined once
print(timeit('repin(lis)',
              'from __main__ import a, b, repin; lis = a*(b+1)*[0]',
              number=10))

# prints
enum:  True
repin:  True
0.06801244890066886
0.849063227602523
0.2759397696510706
0.20790119084727898

After subtracting and dividing, enum takes .078 seconds for 100
replacements in 1000000 items, repin just .021, which is essentially the
time it takes just to scan 1000000 items. So doing the replacements also
in C would not be much faster. Rerunning with 10000 replacements (a, b =
10000, 100), the times are .080 and .024.

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Mon Dec 31 00:05:17 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 30 Dec 2012 18:05:17 -0500
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <CACBhJdGP4bbYqhU=Yg+u3LqdZ-Jmc65tL4c+atuD0m5Pbt4G2Q@mail.gmail.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
	<CACBhJdGP4bbYqhU=Yg+u3LqdZ-Jmc65tL4c+atuD0m5Pbt4G2Q@mail.gmail.com>
Message-ID: <kbqhc2$9fa$1@ger.gmane.org>

On 12/30/2012 1:11 PM, Ezio Melotti wrote:

> On Sun, Dec 30, 2012 at 7:54 PM, Hernan Grecco
>     I have seen many people new to Python stumbling while using the Python
>     docs due to the order of the search results.

People should use the index, both on- and offline. See the issue below.

> I experimented with this a bit a while ago.  See
> http://bugs.python.org/issue15871#msg170048.

-- 
Terry Jan Reedy



From g.rodola at gmail.com  Mon Dec 31 00:38:54 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Mon, 31 Dec 2012 00:38:54 +0100
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
Message-ID: <CAFYqXL9xp_5g15oj2Jfrv5nrBfuP9ymjpMW7njsQ4pghXqC5Qw@mail.gmail.com>

2012/12/30 Hernan Grecco <hernan.grecco at gmail.com>

> Hi,
>
> I have seen many people new to Python stumbling while using the Python
> docs due to the order of the search results.
> [snip]

+1
I agree it's sub-optimal.

From ned at nedbatchelder.com  Mon Dec 31 00:59:27 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sun, 30 Dec 2012 18:59:27 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <kbqh1u$77q$1@ger.gmane.org>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<kbqh1u$77q$1@ger.gmane.org>
Message-ID: <50E0D55F.7080108@nedbatchelder.com>

Thanks, these are very informative answers.

--Ned.


From ryan at hackery.io  Mon Dec 31 01:06:08 2012
From: ryan at hackery.io (Ryan Macy)
Date: Sun, 30 Dec 2012 18:06:08 -0600
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
Message-ID: <50E0D6F0.5090200@hackery.io>

I'm a young developer (22) who is aspiring to contribute to the Python
language, and I think another perspective could help the conversation.
What I believe Anatoly is asking for, albeit phrased in a different
manner, is the ability to clearly see what the core issues/needs are in
the language. I've been able to discern over time, through the Python
mailing lists, that packaging, multitasking, and timezone support are
areas that could use help. Sure, 'wart' is subjective, but I believe the
point made between the lines is valid.

Is there a place that holds the key improvements that the Python
language needs, so that we can work toward them better? If that's the
bug tracker, is there a method already in place that signals areas that
need improvements or fixes? [I know that there are severity levels,
etc. :)]

FWIW, I've joined the python-mentor list, have read most of the
devguide, and lurked on the bug tracker; I still feel like there is
tons of context that I'm missing, which has me chasing PEPs constantly -
so I'm definitely able to resonate with this thread.

I apologize if I'm way off target.

_Ryan

> Victor Stinner <mailto:victor.stinner at gmail.com>
> December 30, 2012 4:20 PM
> 2012/12/26 anatoly techtonik<techtonik at gmail.com>:
>> I am thinking about [python-wart] on SO.
> [snip]

From steve at pearwood.info  Mon Dec 31 01:12:15 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 31 Dec 2012 11:12:15 +1100
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
Message-ID: <50E0D85F.8070607@pearwood.info>

On 31/12/12 04:54, Hernan Grecco wrote:
> Hi,
>
> I have seen many people new to Python stumbling while using the Python
> docs due to the order of the search results.
[...]
> Experienced people will have no trouble navigating through these
> results, but new users do. It is not terrible and at the end they get
> it, but I think it would be nice to change it to more (new) user
> friendly order.


I'm an experienced person, and I have trouble navigating through the
search results. I usually use Google or DuckDuckGo to search, and
avoid the website's search functionality altogether.


-- 
Steven


From cs at zip.com.au  Mon Dec 31 01:22:16 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Mon, 31 Dec 2012 11:22:16 +1100
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <kbqhc2$9fa$1@ger.gmane.org>
References: <kbqhc2$9fa$1@ger.gmane.org>
Message-ID: <20121231002215.GA28101@cskk.homeip.net>

On 30Dec2012 18:05, Terry Reedy <tjreedy at udel.edu> wrote:
| On 12/30/2012 1:11 PM, Ezio Melotti wrote:
| > On Sun, Dec 30, 2012 at 7:54 PM, Hernan Grecco
| >     I have seen many people new to Python stumbling while using the Python
| >     docs due to the order of the search results.
| 
| People should use the index, both on and off line. See the issue below

Personally, I do. But even that is misleading, or at any rate often not
so useful. And since there is a search, its quality should be addressed.

IMO the index has similar issues to the search, though on a much smaller
scale. 

You'll see here I'm only offering criticism, no fixes.

Cheers,
-- 
Cameron Simpson <cs at zip.com.au>

'Soup: This is the one that Kawasaki sent out pictures, that looks so beautiful.
Yanagawa: Yes, everybody says it's beautiful - but many problems!
'Soup: But you are not part of the design team, you're just a test rider.
Yanagawa: Yes. I just complain.
- _Akira Yanagawa Sounds Off_ @ www.amasuperbike.com


From ryan at hackery.io  Mon Dec 31 01:15:58 2012
From: ryan at hackery.io (Ryan Macy)
Date: Sun, 30 Dec 2012 18:15:58 -0600
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <CAFYqXL9xp_5g15oj2Jfrv5nrBfuP9ymjpMW7njsQ4pghXqC5Qw@mail.gmail.com>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
	<CAFYqXL9xp_5g15oj2Jfrv5nrBfuP9ymjpMW7njsQ4pghXqC5Qw@mail.gmail.com>
Message-ID: <50E0D93E.7010904@hackery.io>

> Giampaolo Rodolà <mailto:g.rodola at gmail.com>
> December 30, 2012 5:38 PM
>
>
> +1
> I agree it's sub-optimal.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
> Hernan Grecco <mailto:hernan.grecco at gmail.com>
> December 30, 2012 11:54 AM
> Hi,
>
> I have seen many people new to Python stumbling while using the Python
> docs due to the order of the search results.
>
> For example, if somebody new to python searches for `tuple`, the
> actual section about `tuple` comes in place 39. What is more confusing
> for people starting with the language is that all the C functions come
> first. I have seen people clicking in PyTupleObject just to be totally
> disoriented.
>
> Maybe `tuple` is a silly example. But if somebody wants to know how
> `open` behaves and which arguments it takes, the result comes in
> position 16. `property` does not appear in the list at all (but
> built-in appears in position 31). This is true for most builtins.
>
> Experienced people will have no trouble navigating through these
> results, but new users do. It is not terrible, and in the end they get
> it, but I think it would be nice to change it to a more (new) user
> friendly order.
>
> So my suggestion is to put the builtins first, the rest of the
> standard lib later (including HOWTOs, FAQ, etc.) and finally the
> C modules. Additionally, a section with a title matching exactly the
> search query should come first. (I am not sure if the last suggestion
> belongs in python-ideas or in
> the sphinx mailing list, please advise)
>
> Thanks,
>
> Hernan
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

+1 as well, dash has come in handy!


From solipsis at pitrou.net  Mon Dec 31 01:48:21 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Dec 2012 01:48:21 +0100
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
	<50E0D6F0.5090200@hackery.io>
Message-ID: <20121231014821.2fb21be8@pitrou.net>

On Sun, 30 Dec 2012 18:06:08 -0600
Ryan Macy <ryan at hackery.io> wrote:
> I'm a young developer (22) that is aspiring to contribute to the python 
> language and I think another perspective could help the conversation. 
> What I believe Anataoly is asking for, albeit phrased in a different 
> manner, is the ability to clearly see what the core issues/needs are in 
> the language. I've been able to discern through time, and the python 
> mailing lists, that packaging, multitasking, and timezone support are 
> areas that could use help. Sure, 'wart' is subjective, but I believe the 
> point made in between the lines is valid.

I'm not sure Anatoly is talking about things that have to be improved
so much as things which are lacking (in his opinion, or in the general
opinion) and which nevertheless won't be fixed for various reasons.

These things would have a place in the FAQ, if Anatoly wants to
contribute documentation patches:
http://docs.python.org/dev/faq/index.html

Regards

Antoine.




From tjreedy at udel.edu  Mon Dec 31 02:17:14 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 30 Dec 2012 20:17:14 -0500
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
Message-ID: <kbqp3g$uln$1@ger.gmane.org>

I consider Anatoly's post to be off-topic, obnoxious, and best ignored.

On 12/30/2012 5:20 PM, Victor Stinner wrote:

I am only responding because Eli and then Victor responded, largely 
repeating things that have been said before (and ignored) on many of the 
same issues.

> 2012/12/26 anatoly techtonik <techtonik at gmail.com>:
>> I am thinking about [python-wart] on SO.

The purpose of python-ideas is to discuss possible ideas for improving 
future versions of Python and the reference CPython implementation, 
including its included documentation.

Announcements of independent personal activities are off topic. 
Announcements of thoughts about such activities are, to me, even more 
so. I have lots of thoughts about things I *might* do, and I am sure 
many others do too. Should we all post them here? I think not. I am 
actually working on, not just thinking about, a book that showcases many 
of the positive features of Python. But I do not think that an 
announcement post here is particularly on-topic.

As for 'obnoxious', this is not just a post about thoughts, but of 
thoughts to abuse another forum to trash python, and a trashy 
justification for doing so.

>> There is currently no list of
>> Python warts, and building a better language is impossible without a clear
>> visibility of warts in current implementations.

There is, of course, a tracker with, at the moment, 3771 open issues.
That is already too many. Repeatedly regurgitating closed issues is an
obnoxious distraction.

> Sorry, but what is a wart in Python?

A Python behavior that Anatoly does not like and that the CPython 
developers cannot, will not*, or have not yet# changed. By extension, 
our disliked-by-him actions are also warts. This ego-centric view is 
more of 'obnoxious'.

* Perhaps because we consider the whole community, not just one person.

# Perhaps because of ignorance or lack of interest.

Berating us for not doing something that he will also not do (write a 
patch) is more of 'obnoxious'.

>> Why Roundup doesn't work ATM.
>> - warts are lost among other "won't fix" and "works for me" issues

One can easily search the tracker for closed issues with any particular 
resolution. One can even limit the search for such issue with 
'techtonik' on the nosy list. Results:
'rejected' 17
'invalid' 17
'won't fix' 10
'works for me' 17
The numbers are smaller if 'techtonik' is entered instead in the creator 
box. This is the core list of issues Anatoly would consider 'lost 
warts'. They are not lost, just not prominently displayed to the world 
in the way he would like.

Spreading disinformation is more of 'obnoxious'.

>> - no way to edit description to make it more clear

There is no description field. The title of an issue and other 
descriptive headers can be edited and often are. There is an audit trail 
of changes. The description of an issue can be and sometimes is re-stated 
by the original author or others in successive messages. As a matter of 
audit trail policy, messages cannot be edited. They can be deleted from 
an issue (and that fact noted, and by whom) but not (normally, anyway) 
from the database.

So, more disinformation. Calling a disagreement over policy a 'wart' is 
disingenuous.

>> - no voting/stars to perceive how important this issue is

Proposed and rejected before. Again: the devs don't do what Anatoly 
wants, so it's a wart.

-- 
Terry Jan Reedy



From steve at pearwood.info  Mon Dec 31 02:39:18 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 31 Dec 2012 12:39:18 +1100
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E04B39.2040508@nedbatchelder.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
Message-ID: <50E0ECC6.6060004@pearwood.info>

On 31/12/12 01:10, Ned Batchelder wrote:

> What is the organizing principle for the methods that list (or any other
> built-in data structure) should have? I would hate for the main
> criterion to be, "these are the methods that existed in Python 2.3,"
> for example. Why is .count() in and .replace() out?

I fear that it is more likely to be "they existed in Python 1.5".

As far as I can tell, there have been very few new methods added to standard
types since Python 1.5, and possibly before that. Putting aside dunder methods,
the only public list methods in 3.3 that weren't in 1.5 are clear and copy.
Tuples also have two new methods, count and index. Dicts have seen a few more
changes:

- has_key is gone;
- fromkeys, pop, popitem, and setdefault are added.


So changes to builtin types have been very conservative.
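(As a quick sanity check of the claim above, the public method set of list is easy to inspect in whatever interpreter you happen to be running:)

```python
# Public (non-dunder) methods of list in the running interpreter.
public = sorted(name for name in dir(list) if not name.startswith('_'))
print(public)
# On 3.3 this shows the newcomers 'clear' and 'copy' alongside the
# 1.5-era methods: append, count, extend, index, insert, pop,
# remove, reverse, sort.
```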



-- 
Steven


From steve at pearwood.info  Mon Dec 31 02:40:02 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 31 Dec 2012 12:40:02 +1100
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E0729E.1040208@mrabarnett.plus.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
	<50E0651F.6000305@nedbatchelder.com>
	<50E0729E.1040208@mrabarnett.plus.com>
Message-ID: <50E0ECF2.7050107@pearwood.info>

On 31/12/12 03:58, MRAB wrote:
> On 2012-12-30 16:00, Ned Batchelder wrote:

>> I don't understand the conflict? .replace() from sequence does
>> precisely the same thing as .replace() from bytes if you limit the
>> arguments to single-byte values. It seems perfectly natural to me. I
>> must be missing something.
>>
> [snip]
> The difference is that for bytes and str it returns the result (they
> are immutable after all), but the suggested addition would mutate the
> list in-place. In order to be consistent it would have to return the
> result instead.

Are you seriously suggesting that because str has a replace method with
a specific API, no other type can have a replace method unless it has
the same API?

Why must list.replace and str.replace do exactly the same thing? Lists
and strings are not the same, and you cannot in general expect to
substitute lists with strings, or vice versa.

collections.abc.MutableSequence would seem to me to be the right place
for a mutator replace method.
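To make that concrete, the kind of mutator replace I have in mind might look
something like this (the name, signature, and count convention below are my
own sketch, not anything proposed on the tracker):

```python
def replace_inplace(seq, old, new, count=-1):
    """Replace up to `count` occurrences of `old` with `new`, in place.

    A negative count (the default) means "replace all", mirroring
    str.replace's convention. Works on any mutable sequence.
    """
    for i, item in enumerate(seq):
        if count == 0:
            break
        if item == old:
            seq[i] = new
            count -= 1

lst = [1, 2, 1, 3, 1]
replace_inplace(lst, 1, 9)
print(lst)  # [9, 2, 9, 3, 9]
```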


-- 
Steven


From victor.stinner at gmail.com  Mon Dec 31 03:00:08 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 31 Dec 2012 03:00:08 +0100
Subject: [Python-ideas] Dynamic code NOPing
In-Reply-To: <kbp6oa$2ol$1@ger.gmane.org>
References: <CAPkN8xKZCaeGxH-HHMKcnm74EOP=VsPmtPbcc5AFO7fJQdWkGA@mail.gmail.com>
	<CAA0H+QTTUudo6CWCvFT72dhn7ibK=+n+Vp3hCdWp3J40iEbtsg@mail.gmail.com>
	<CAA+RL7FckXhCzHOSXPtUrDrCNzEQewSc1tLQHgZv_xQdqc-J9Q@mail.gmail.com>
	<5DFDB30C-9A3D-4939-81D7-F34727284148@stranden.com>
	<CADiSq7fxJFGOnh8hhxYCHMwMJQCpSHa0q-UUnnsbF6rMBKvz_A@mail.gmail.com>
	<563458C3-9580-46AE-B343-6987116A3F08@stranden.com>
	<CAMpsgwbK6k+UpO7Q2qiM+rhNOFhPp1OxQJDT89CP644kxbQuSg@mail.gmail.com>
	<kbp6oa$2ol$1@ger.gmane.org>
Message-ID: <CAMpsgwZaLzpbp=Zr1obWnX+6dv4NUNcg8D6QAPBeFDRKGUC1pA@mail.gmail.com>

If you mark constant.DEBUG as constant and compile your project with
astoptimizer, enable_debug has no effect (if it was compiled with
DEBUG=False).

So only use it if DEBUG will not be changed at runtime. It cannot be used
if your users might run your application in debug mode.

To compare it to the C language, DEBUG would be a #define and astoptimizer
can be seen as a preprocessor.
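For illustration, the branch pruning described above can be sketched with the
stdlib ast module (this is a toy of mine, not astoptimizer's actual code or
API):

```python
import ast

class StripDebug(ast.NodeTransformer):
    """Toy version of the transform: delete `if DEBUG:` blocks outright,
    assuming DEBUG was declared constant and false at compile time."""
    def visit_If(self, node):
        self.generic_visit(node)
        if isinstance(node.test, ast.Name) and node.test.id == 'DEBUG':
            # Keep only the else branch (often empty), dropping the body.
            return node.orelse or None
        return node

src = "x = 1\nif DEBUG:\n    x = 2\n"
tree = ast.fix_missing_locations(StripDebug().visit(ast.parse(src)))
ns = {}
exec(compile(tree, '<optimized>', 'exec'), ns)
print(ns['x'])  # 1 -- the debug branch is gone; DEBUG need not even exist
```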

Victor
On 30 Dec 2012 11:59, "Stefan Behnel" <stefan_ml at behnel.de> wrote:

> Victor Stinner, 30.12.2012 11:42:
> > My astoptimizer provides tools to really *remove* debug at compilation,
> so
> > the overhead of the debug code is just null.
> >
> > You can for example declare your variable project.config.DEBUG as
> constant
> > with the value 0, where project.config is a module. So the if statement
> in
> > "from project.config import DEBUG ... if DEBUG: ..." will be removed.
>
> How would you know at compile time that it can be removed? How do you
> handle the example below?
>
> Stefan
>
>
> ## constants.py
>
> DEBUG = False
>
>
> ## enable_debug.py
>
> import constants
> constants.DEBUG = True
>
>
> ## test.py
>
> import enable_debug
> from constants import DEBUG
>
> if DEBUG:
>     print("DEBUGGING !")
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From guido at python.org  Mon Dec 31 03:48:10 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 30 Dec 2012 19:48:10 -0700
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E0ECF2.7050107@pearwood.info>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
	<50E0651F.6000305@nedbatchelder.com>
	<50E0729E.1040208@mrabarnett.plus.com>
	<50E0ECF2.7050107@pearwood.info>
Message-ID: <CAP7+vJJaU7qj5qgPxbZJPrzVwsf6cGGS-eWOPQ600ymK4y=hXw@mail.gmail.com>

I would be very conservative here, since they are both builtin types, both
sequences, and the reader may use the methods used as a hint about the type
(a form of type inference if you will). The use case for list.replace()
seems weak and we should beware of making standard interfaces too "thick"
lest implementing alternative versions become too burdensome.

--Guido

On Sunday, December 30, 2012, Steven D'Aprano wrote:

> On 31/12/12 03:58, MRAB wrote:
>
>> On 2012-12-30 16:00, Ned Batchelder wrote:
>>
>
>  I don't understand the conflict? .replace() from sequence does
>>> precisely the same thing as .replace() from bytes if you limit the
>>> arguments to single-byte values. It seems perfectly natural to me. I
>>> must be missing something.
>>>
>>>  [snip]
>> The difference is that for bytes and str it returns the result (they
>> are immutable after all), but the suggested addition would mutate the
>> list in-place. In order to be consistent it would have to return the
>> result instead.
>>
>
> Are you seriously suggesting that because str has a replace method with
> a specific API, no other type can have a replace method unless it has
> the same API?
>
> Why must list.replace and str.replace do exactly the same thing? Lists
> and strings are not the same, and you cannot in general expect to
> substitute lists with strings, or vice versa.
>
> collections.abc.MutableSequence would seem to me to be the right place
> for a mutator replace method.
>
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>


-- 
--Guido van Rossum (on iPad)

From stephen at xemacs.org  Mon Dec 31 04:10:57 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 31 Dec 2012 12:10:57 +0900
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <50E0D6F0.5090200@hackery.io>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
	<50E0D6F0.5090200@hackery.io>
Message-ID: <87fw2mor0e.fsf@uwakimon.sk.tsukuba.ac.jp>

Ryan Macy writes:

 > I'm a young developer (22) that is aspiring to contribute to the python 
 > language and I think another perspective could help the conversation. 
 > What I believe Anataoly is asking for, albeit phrased in a different 
 > manner, is the ability to clearly see what the core issues/needs are in 
 > the language.

Good luck on that.  Just as you write, it is in fact a human ability,
not a collection of facts that can be published.  AFAICS, the core
issues are what block core developers from getting applied work done.
(Or cause them to stumble in process, for that matter.)

The reason for this, based on introspection and watching a few Nobel
prizewinners work, is that people working at that level have an
uncanny ability to *ask* the right questions, and do so recursively.
Of course they're usually really fast and accurate at answering them,
too, but answering smallish research questions is an upperclass
undergrad student[1] skill.  The knack for filtering out inessential
questions and zeroing in on the bottleneck is what makes them great.

The flip side, of course, is that because a core developer is blocked,
he or she is working on it.  So maybe you won't get a chance to make a
big contribution there -- it will be solved by the time you figure out
what to do. ;-)

 > Is there a place that holds the key improvements that the python 
 > language needs, so that we can work to it better? If that's the bug 
 > tracker,

Bingo!

 > is there a method already in place that signals areas that need 
 > improvements or fixes? [I know that there are severity levels, etc :)]

The problem is that "need" is mostly subjective.  In Python there are
several objectifiable criteria, encoded in the venerable Zen of
Python, and more recently in the thread answering Ned Batchelder's
question on what makes a good change to the stdlib.  But if you look
at them, I suspect that you'll come to the same conclusion that I do:
need is defined by what at least some programmers often want to do and
are likely to do imperfectly, even if they do it repeatedly.  That's
"need", and it's dynamic, only imperfectly correlated with the state
of the language.

The only reliable measure of need is what somebody is willing to
provide a high-quality patch for.  Just Do It! :-)

 > [There is] context that I'm missing, which has me chasing PEPs
 > constantly

Well, when you catch one, take it out to lunch.  Spend some time in
conversation with it.  Figure out what the person who wrote it was
thinking, and why. :-)

 > - So I'm definitely able to resonate the with this thread.
 > 
 > I apologize if I'm waay off target.

Not at all.  I just don't think there's a royal road to core
contribution.

The flip side of that is that as far as defining "need" goes, what you
perceive as important is no less important than what Guido does.  It's
just that he has a proven knack for picking questions that others
value too, and for giving answers that untangle the language, as well
as solving a practical problem.  But that doesn't mean you should work
on what Guido thinks is important just because he thinks it's
important.  If you resonate with the need he feels, then you will find
ways to contribute to resolving it.

I haven't seen the word "channel" around here recently, but trying to
channel the core developers on problems you encounter is a good way to
get started.  Try to anticipate what they'll say if you post (or in
response to somebody else's post that interests you).  When conversing
with a PEP, try to figure out what it's going to propose as the
solution before you read it.  Try to figure out what problems it will
need to solve to achieve its goal, etc.

Steve


From stephen at xemacs.org  Mon Dec 31 04:42:09 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 31 Dec 2012 12:42:09 +0900
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <kbqp3g$uln$1@ger.gmane.org>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
	<kbqp3g$uln$1@ger.gmane.org>
Message-ID: <87ehi6opke.fsf@uwakimon.sk.tsukuba.ac.jp>

Terry Reedy writes:
 > I consider Anatoly's post to be off-topic, obnoxious, and best ignored.
 > 
 > On 12/30/2012 5:20 PM, Victor Stinner wrote:
 > 
 > I am only responding because Eli and then Victor responded, largely 
 > repeating things that have been said before (and ignored) on many of the 
 > same issues.

+1

 > > 2012/12/26 anatoly techtonik <techtonik at gmail.com>:
 > >> I am thinking about [python-wart] on SO.
 > 
 > The purpose of python-ideas is to discuss possible ideas for improving 
 > future versions of Python and the reference CPython implementation, 
 > including its included documentation.

I think it would be fair to s/included//.  See the doc site search
engine thread, which nobody (including you) seems to think off-topic.

 > Announcements of independent personal activities are off topic.

Not at all.  Announcing a PyPI project and requesting testing for
potential stdlib inclusion, for example.  Doesn't fit exactly, but
what's the preferred venue?

 > Announcements of thoughts about such activities are, to me, even more 
 > so.

It's the lack of any pre-posting filter whatsoever, combined with a
lack of patches, that leads me to ignore Anatoly.  This is more of the
same.  Nevertheless, a desire for a list of "important unsolved
problems" is common (cf Ryan's post).

 > > Sorry, but what is a wart in Python?
 > 
 > A Python behavior that Anatoly does not like and that the CPython 
 > developers cannot, will not*, or have not yet# changed. By extension, 
 > our disliked-by-him actions are also warts.

That is apparently Anatoly's operational definition, yes.  However,
it's easy to define conceptually.  A wart in Python is an un-Pythonic
functionality, or an un-Pythonic implementation of functionality.  The
print statement was a wart.  It was an interesting idea, like
syntactic indentation.  The former didn't work for Python, the latter
did and still does.[1]

That makes it clear to me why Anatoly's proposal is perverse.  The
word "Pythonic" itself cannot be defined by stars on a Roundup issue
or user posts to StackOverflow.  Ultimately it's defined by Guido, I
suppose, but by now many developers have been shown to have an
excellent sense, sufficient to get Guido to change his mind on
occasion.  It is not, however, a matter for democratic decision.

The word "wart" itself is useful when used by those who know what
"Pythonic" means.  It's a warning: you will break your teeth if you
just try to bite it off.  So it's not very useful in guiding the work
of new developers, because the bar is high and the benefits small.
Most warts in Python 3 (and there are far fewer than Anatoly seems to
think) will have to wait for Python 4, absent solutions of true
genius.


Footnotes: 
[1]  As a way of tweaking the nose of paren-lovers, if nothing else.




From ncoghlan at gmail.com  Mon Dec 31 04:47:02 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 31 Dec 2012 13:47:02 +1000
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E0ECF2.7050107@pearwood.info>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
	<50E0651F.6000305@nedbatchelder.com>
	<50E0729E.1040208@mrabarnett.plus.com>
	<50E0ECF2.7050107@pearwood.info>
Message-ID: <CADiSq7dMrcL+vJbSOBw7dVaaa6=qLEDuQ01OGwHFpmi4F-YKBQ@mail.gmail.com>

The problem is bytearray, not bytes and str.

bytearray is a builtin mutable sequence with a non-destructive replace()
method. It doesn't matter that this is almost certainly just a mistake due
to its immutable bytes heritage, the presence of that method is enough to
categorically rule out the idea of adding a destructive replace() method to
mutable sequences in general.
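The behaviour in question is easy to check interactively:

```python
ba = bytearray(b'spam spam')
result = ba.replace(b'spam', b'eggs')

print(result)        # bytearray(b'eggs eggs') -- a new object
print(ba)            # bytearray(b'spam spam') -- the original is untouched
print(result is ba)  # False
```

So even though bytearray is mutable, its replace() follows the immutable
bytes API, which is exactly the inconsistency that blocks a destructive
list.replace().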

--
Sent from my phone, thus the relative brevity :)
On Dec 31, 2012 11:41 AM, "Steven D'Aprano" <steve at pearwood.info> wrote:

> On 31/12/12 03:58, MRAB wrote:
>
>> On 2012-12-30 16:00, Ned Batchelder wrote:
>>
>
>  I don't understand the conflict? .replace() from sequence does
>>> precisely the same thing as .replace() from bytes if you limit the
>>> arguments to single-byte values. It seems perfectly natural to me. I
>>> must be missing something.
>>>
>>>  [snip]
>> The difference is that for bytes and str it returns the result (they
>> are immutable after all), but the suggested addition would mutate the
>> list in-place. In order to be consistent it would have to return the
>> result instead.
>>
>
> Are you seriously suggesting that because str has a replace method with
> a specific API, no other type can have a replace method unless it has
> the same API?
>
> Why must list.replace and str.replace do exactly the same thing? Lists
> and strings are not the same, and you cannot in general expect to
> substitute lists with strings, or vice versa.
>
> collections.abc.MutableSequence would seem to me to be the right place
> for a mutator replace method.
>
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From random832 at fastmail.us  Mon Dec 31 06:38:57 2012
From: random832 at fastmail.us (Random832)
Date: Mon, 31 Dec 2012 00:38:57 -0500
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <CADiSq7dMrcL+vJbSOBw7dVaaa6=qLEDuQ01OGwHFpmi4F-YKBQ@mail.gmail.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
	<50E0651F.6000305@nedbatchelder.com>
	<50E0729E.1040208@mrabarnett.plus.com>
	<50E0ECF2.7050107@pearwood.info>
	<CADiSq7dMrcL+vJbSOBw7dVaaa6=qLEDuQ01OGwHFpmi4F-YKBQ@mail.gmail.com>
Message-ID: <50E124F1.8000406@fastmail.us>

On 12/30/2012 10:47 PM, Nick Coghlan wrote:
>
> The problem is bytearray, not bytes and str.
>
> bytearray is a builtin mutable sequence with a non-destructive 
> replace() method. It doesn't matter that this is almost certainly just 
> a mistake due to its immutable bytes heritage, the presence of that 
> method is enough to categorically rule out the idea of adding a 
> destructive replace() method to mutable sequences in general.
>
All this discussion is, of course, before getting into the fact that 
string, bytes, and bytearray .replace() methods all work on subsequences 
rather than elements.
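That is, all three existing replace() methods search for a run of items, not
a single item:

```python
print('banana'.replace('an', '*'))                # b**a
print(b'banana'.replace(b'an', b'*'))             # b'b**a'
print(bytearray(b'banana').replace(b'an', b'*'))  # bytearray(b'b**a')
# A hypothetical list.replace(old_elem, new_elem) would instead compare
# one element at a time, so the APIs only look parallel on the surface.
```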


From dkreuter at gmail.com  Mon Dec 31 07:17:32 2012
From: dkreuter at gmail.com (David Kreuter)
Date: Mon, 31 Dec 2012 07:17:32 +0100
Subject: [Python-ideas] proposed methods: list.replace / list.indices
In-Reply-To: <50E0729E.1040208@mrabarnett.plus.com>
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
	<50E0651F.6000305@nedbatchelder.com>
	<50E0729E.1040208@mrabarnett.plus.com>
Message-ID: <CAMiSff6VETvokHVad0i66X0hLOa2DLNESKknJDq6gCr9CY61tQ@mail.gmail.com>

On Sun, Dec 30, 2012 at 5:58 PM, MRAB <python at mrabarnett.plus.com> wrote:

> On 2012-12-30 16:00, Ned Batchelder wrote:
>>
>> I don't understand the conflict?  .replace() from sequence does
>> precisely the same thing as .replace() from bytes if you limit the
>> arguments to single-byte values.  It seems perfectly natural to me. I
>> must be missing something.
>>
>>  [snip]
> The difference is that for bytes and str it returns the result (they
> are immutable after all), but the suggested addition would mutate the
> list in-place. In order to be consistent it would have to return the
> result instead.


I don't think that consistency between str and list is desirable. If .index
for example were consistent in str and list it would look like this:

    [9, 8, 7, 6, 5].index([8,7]) # = 1

Also,
    reversed, sorted (copy)
    list.reverse, list.sort (in-place)
From that perspective list.replace working in-place *is* consistent.

However, I can see that this '.replace' might cause more confusion than
future code clarity.

What about .indices though?
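(To spell out the consistency point above: a str-consistent list.index would
have to search for contiguous subsequences. A minimal sketch, with a name of
my own choosing:)

```python
def index_subseq(seq, sub):
    """Return the first index where `sub` occurs as a contiguous
    subsequence of `seq`, mimicking str.index semantics."""
    n, m = len(seq), len(sub)
    for i in range(n - m + 1):
        if seq[i:i + m] == sub:
            return i
    raise ValueError('subsequence not found')

print(index_subseq([9, 8, 7, 6, 5], [8, 7]))  # 1
```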

From ethan at stoneleaf.us  Mon Dec 31 16:00:41 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 31 Dec 2012 07:00:41 -0800
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
Message-ID: <50E1A899.3000602@stoneleaf.us>

Victor Stinner wrote:
> A list of such issue without solution doesn't help anyone.

I disagree:  knowledge of a problem is beneficial even when a workaround is not known.

~Ethan~



From jstpierre at mecheye.net  Mon Dec 31 08:31:48 2012
From: jstpierre at mecheye.net (Jasper St. Pierre)
Date: Mon, 31 Dec 2012 02:31:48 -0500
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
Message-ID: <CAA0H+QQ6=sTyELhWHgjkVXU=t+rvzn65WJF4bjoVKo7ckUKu1g@mail.gmail.com>

We already have a collection of "warts" or "gotchas":

http://docs.python.org/3/faq/design.html#why-must-dictionary-keys-be-immutable
http://docs.python.org/3/faq/design.html#why-doesn-t-list-sort-return-the-sorted-list
http://docs.python.org/3/faq/design.html#why-are-default-values-shared-between-objects
http://docs.python.org/3/faq/design.html#why-can-t-raw-strings-r-strings-end-with-a-backslash

and so on. Note that that document is probably extremely out of date, but
there is an existing place for them.
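The default-values entry in that FAQ, for instance, covers the classic gotcha:

```python
def append_to(item, bucket=[]):  # the default list is created once, at "def" time
    bucket.append(item)
    return bucket

first = append_to(1)
second = append_to(2)
first is second   # True -- both calls mutated the same shared list, now [1, 2]
```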


On Tue, Dec 25, 2012 at 7:10 PM, anatoly techtonik <techtonik at gmail.com>wrote:

> I am thinking about [python-wart] on SO. There is currently no list of
> Python warts, and building a better language is impossible without clear
> visibility of the warts in current implementations.
>
> Why Roundup doesn't work ATM:
> - warts are lost among other "won't fix" and "works for me" issues
> - no way to edit the description to make it clearer
> - no voting/stars to perceive how important an issue is
> - no comment/noise filtering
> and, most valuable of all,
> - there is no query to list warts sorted by popularity, to explore other
> time-consuming areas of Python you are not aware of but which can pop up
> one day
>
> SO at least allows:
> + voting
> + community wiki edits
> + useful comment upvoting
> + sorted lists
> + user editable tags (adding new warts is easy)
>
> This post is a result of facing numerous locals/settrace/exec issues
> that are closed on the tracker. I also have my own list of other issues
> (logging/subprocess) at a GC project, which I might be unable to maintain
> in the future. There is also some undocumented stuff (subprocess deadlocks)
> that I'm investigating but don't have time to write up. So I'd rather move
> this somewhere where it can be updated.
> --
> anatoly t.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>


-- 
  Jasper
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20121231/f7263dd0/attachment.html>

From pyideas at rebertia.com  Mon Dec 31 08:56:04 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 30 Dec 2012 23:56:04 -0800
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAA0H+QQ6=sTyELhWHgjkVXU=t+rvzn65WJF4bjoVKo7ckUKu1g@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAA0H+QQ6=sTyELhWHgjkVXU=t+rvzn65WJF4bjoVKo7ckUKu1g@mail.gmail.com>
Message-ID: <CAMZYqRTs7g3nwko4ZyjxkV5=a1QcXxeW3zoQWCeJoqb94JDxeg@mail.gmail.com>

> On Tue, Dec 25, 2012 at 7:10 PM, anatoly techtonik <techtonik at gmail.com>
> wrote:
>>
>> I am thinking about [python-wart] on SO. There is currently no list of
>> Python warts, and building a better language is impossible without clear
>> visibility of the warts in current implementations.
>>
>> Why Roundup doesn't work ATM:
>> - warts are lost among other "won't fix" and "works for me" issues
>> - no way to edit the description to make it clearer
>> - no voting/stars to perceive how important an issue is
>> - no comment/noise filtering
>> and, most valuable of all,
>> - there is no query to list warts sorted by popularity, to explore other
>> time-consuming areas of Python you are not aware of but which can pop up
>> one day
>>
>> SO at least allows:
>> + voting
>> + community wiki edits
>> + useful comment upvoting
>> + sorted lists
>> + user editable tags (adding new warts is easy)
>>
>> This post is a result of facing numerous locals/settrace/exec issues
>> that are closed on the tracker. I also have my own list of other issues
>> (logging/subprocess) at a GC project, which I might be unable to maintain
>> in the future. There is also some undocumented stuff (subprocess deadlocks)
>> that I'm investigating but don't have time to write up. So I'd rather move
>> this somewhere where it can be updated.

On Sun, Dec 30, 2012 at 11:31 PM, Jasper St. Pierre
<jstpierre at mecheye.net> wrote:
> We already have a collection of "warts" or "gotchas":
>
> http://docs.python.org/3/faq/design.html#why-must-dictionary-keys-be-immutable
> http://docs.python.org/3/faq/design.html#why-doesn-t-list-sort-return-the-sorted-list
> http://docs.python.org/3/faq/design.html#why-are-default-values-shared-between-objects
> http://docs.python.org/3/faq/design.html#why-can-t-raw-strings-r-strings-end-with-a-backslash
>
> and so on. Note that that document is probably extremely out of date, but
> there is an existing place for them.

When much older Python 2.x releases were still in their heyday, there were
some popular third-party lists:
http://lwn.net/Articles/43059/
http://zephyrfalcon.org/labs/python_pitfalls.html
http://www.ferg.org/projects/python_gotchas.html

(FWICT, Andrew Kuchling's article led to the "warts" terminology.)

Cheers,
Chris


From stefan at drees.name  Mon Dec 31 08:47:11 2012
From: stefan at drees.name (Stefan Drees)
Date: Mon, 31 Dec 2012 08:47:11 +0100
Subject: [Python-ideas] Order in the documentation search results
In-Reply-To: <kbq5l9$g8o$1@ger.gmane.org>
References: <CAL6gwWXikjrYG+f+sqnm3k2mtNXCasTD7Uj_ABY=JNLi4eBNhQ@mail.gmail.com>
	<50E083BA.7000603@nedbatchelder.com> <kbq5l9$g8o$1@ger.gmane.org>
Message-ID: <50E142FF.3070101@drees.name>

On 30.12.12 20:45, Georg Brandl wrote:
> On 12/30/2012 07:11 PM, Ned Batchelder wrote:
>> On 12/30/2012 12:54 PM, Hernan Grecco wrote:
>>> ...
>>> I have seen many people new to Python stumbling while using the Python
>>> docs due to the order of the search results.
>>> ...
>>> So my suggestion is to put the builtins first, the rest of the
>>> standard lib later including HowTos, FAQ, etc and finally the
>>> c-modules. Additionally, a section with a title matching exactly the
>>> search query should come first. (I am not sure if the last suggestion
>>> belongs in python-ideas or in
>>> the sphinx mailing list, please advice)
>>
>> While we're on the topic, why in this day and age do we have a custom
>> search?  Using google site search would be faster for the user, and more
>> accurate.
>
> I agree.  Someone needs to propose a patch though.
> ...

a custom search in itself is a wonderful thing. To me it also shows more 
appreciation of visitors' concerns than those sites that just _offer_ 
google site search (which is accessible anyway to every visitor capable 
of memorizing the google or bing or whatnot URL).

I second Hernan's suggestion about ordering, and also his question about 
where the request (and patches) should be directed.

All the best,
Stefan.



From solipsis at pitrou.net  Mon Dec 31 12:52:06 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Dec 2012 12:52:06 +0100
Subject: [Python-ideas] proposed methods: list.replace / list.indices
References: <CAMiSff4k3ShXCfdcfXH2wB=tnjFhN4QKGwET-789FRtK_0KjDg@mail.gmail.com>
	<kbp63c$tjg$1@ger.gmane.org> <50E04B39.2040508@nedbatchelder.com>
	<CADiSq7f8vGJU3UsjPAK7bY74b5cQPVOQL+3B-tm3nK8FUiR6ZA@mail.gmail.com>
	<50E0651F.6000305@nedbatchelder.com>
	<50E0729E.1040208@mrabarnett.plus.com>
	<CAMiSff6VETvokHVad0i66X0hLOa2DLNESKknJDq6gCr9CY61tQ@mail.gmail.com>
Message-ID: <20121231125206.64b17fce@pitrou.net>

On Mon, 31 Dec 2012 07:17:32 +0100
David Kreuter <dkreuter at gmail.com> wrote:
> 
> I don't think that consistency between str and list is desirable. If .index
> for example were consistent in str and list it would look like this:
> 
>     [9, 8, 7, 6, 5].index([8,7]) # = 1
> 
> Also,
>     reversed, sorted (copy)
>     list.reverse, list.sort (in-place)
> From that perspective list.replace working in-place *is* consistent.
> 
> However, I can see that this '.replace' might cause more confusion than
> future code clarity.

Another name could be found if necessary.

> What about .indices though?

I've never needed it myself. The fact that it's O(n) seems to hint that
a list is not the right data structure for the use cases you may be
thinking about :)
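(For repeated lookups the O(n) scans can be amortized by precomputing a
value-to-positions map once; a sketch, where `index_map` is an illustrative
helper rather than an existing API:)

```python
from collections import defaultdict

def index_map(seq):
    """Map each value in `seq` to the list of positions where it occurs."""
    mapping = defaultdict(list)
    for i, item in enumerate(seq):
        mapping[item].append(i)
    return dict(mapping)

positions = index_map([9, 8, 7, 8, 5])
positions[8]   # [1, 3] -- O(1) lookups after a single O(n) pass
```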

Regards

Antoine.




From maxmoroz at gmail.com  Mon Dec 31 23:16:55 2012
From: maxmoroz at gmail.com (Max Moroz)
Date: Mon, 31 Dec 2012 14:16:55 -0800
Subject: [Python-ideas] Preventing out of memory conditions
Message-ID: <CAOVPiMiNBwe_v95apXFVKwj5r5ipmz-ttGriVLV5YE-AXQAT0Q@mail.gmail.com>

Sometimes, I have the flexibility to reduce the memory used by my
program (e.g., by destroying large cached objects, etc.). It would be
great if I could ask Python interpreter to notify me when memory is
running out, so I can take such actions.

Of course, it's nearly impossible for Python to know in advance if the
OS would run out of memory with the next malloc call. Furthermore,
Python shouldn't guess which memory (physical, virtual, etc.) is
relevant in the particular situation (for instance, in my case, I only
care about physical memory, since swapping to disk makes my
application as good as frozen). So the problem as stated above is
unsolvable.

But let's say I am willing to do some work to estimate the maximum
amount of memory my application can be allowed to use. If I provide
that number to Python interpreter, it may be possible for it to notify
me when the next memory allocation would exceed this limit by calling
a function I provide it (hopefully passing as arguments the amount of
memory being requested, as well as the amount currently in use). My
callback function could then destroy some objects, and return True to
indicate that some objects were destroyed. At that point, the
interpreter could run its standard garbage collection routines to
release the memory that corresponded to those objects - before
proceeding with whatever it was trying to do originally. (If I
returned False, or if I didn't provide a callback function at all, the
interpreter would simply behave as it does today.) Any memory
allocations that happen while the callback function itself is
executing, would not trigger further calls to it. The whole mechanism
would be disabled for the rest of the session if the memory freed by
the callback function was insufficient to prevent going over the
memory limit.

Would this be worth considering for a future language extension? How
hard would it be to implement?
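Something weaker can already be approximated in user code today. The sketch
below polls peak RSS via the stdlib `resource` module (POSIX only) and fires a
user callback when an illustrative limit is crossed; unlike the proposal, it
polls after the fact rather than intercepting the interpreter's allocator,
and the names and limit are made up for illustration:

```python
import gc
import resource  # POSIX-only stdlib module; no direct Windows equivalent

LIMIT_KB = 512 * 1024  # illustrative limit: roughly 512 MiB of peak RSS

def low_memory_callback():
    """User-supplied hook: drop large caches, return True if anything was freed."""
    # e.g. my_cache.clear()
    return True

def check_memory():
    """Poll peak RSS and fire the callback when the limit is exceeded.

    Note: ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    Returns True if the callback ran and a collection was triggered."""
    used_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if used_kb > LIMIT_KB and low_memory_callback():
        gc.collect()  # reclaim the objects the callback released
        return True
    return False
```

A real implementation of the proposal would instead hook the allocator itself,
which is what makes it an interpreter-level feature rather than a library.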

Max


From phd at phdru.name  Mon Dec 31 01:00:12 2012
From: phd at phdru.name (Oleg Broytman)
Date: Mon, 31 Dec 2012 04:00:12 +0400
Subject: [Python-ideas] Documenting Python warts on Stack Overflow
In-Reply-To: <CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
References: <CAPkN8x+TF6znvM-Kd2Qad_ExjYf-hv41S3_Uk0U0OqAGO8HteQ@mail.gmail.com>
	<CAMpsgwYJiF5JFEf1jBmDoh9mDwLbkq7B6efAvq+56zbud8qHPg@mail.gmail.com>
Message-ID: <20121231000012.GA10426@iskra.aviel.ru>

Hello and happy New Year!

On Sun, Dec 30, 2012 at 11:20:34PM +0100, Victor Stinner <victor.stinner at gmail.com> wrote:
> If I understood correctly, you would like to list some specific issues,
> like print() not immediately flushing stdout if you ask it not to write
> a newline (print "a", in Python 2, or print("a", end=" ") in Python 3).
> If I understood correctly, and if you want to improve Python, you
> should help the documentation project. Or you could build a website
> listing such issues *and listing solutions*, like calling
> sys.stdout.flush() or using print(flush=True) (Python 3.3+) for the
> print issue.
> 
> A list of such issues without solutions doesn't help anyone.
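(For reference, the two fixes Victor mentions, demonstrated against an
in-memory stream standing in for sys.stdout; on a real console the unflushed
text can sit in the output buffer indefinitely:)

```python
import io

buf = io.StringIO()  # stands in for sys.stdout here

# Without a trailing newline the text may linger in the output buffer:
print("working...", end=" ", file=buf)
buf.flush()                                         # fix 1: explicit flush
print("working...", end=" ", file=buf, flush=True)  # fix 2: Python 3.3+
buf.getvalue()  # 'working... working... '
```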

   I cannot say for Anatoly but for me warts are:

-- things that don't exist where they should (but the core team objects,
   or they are hard to implement, or something);
-- things that exist where they shouldn't; they are hard to fix because
   removing them would break backward compatibility;
-- things that are implemented in strange, inconsistent ways.

   A few examples:

-- things that don't exist in the language where they should:
   anonymous code blocks (multiline lambdas);
   case (switch) statements;
   do/until loops;

-- things that exist in the language where they shouldn't:
   else clause in 'for' loops (documentation doesn't help);

-- things that don't exist in the stdlib where they should:
   asynchronous network libs (ftp/http/etc);
   GUI toolkit wrappers (GTK and/or Qt);
   SQL DB API drivers;
   SSL (key/certificate generation and parsing);
   restricted execution (remember rexec and Bastion?);

-- things that exist in the stdlib where they shouldn't:
   tkinter (Tk is the rarest GUI toolkit in use), turtle;
   smtpd.py (it's a program, not a library);

-- things that are implemented in strange, inconsistent ways:
   limited expression syntax in decorators (only attr access and calls);
   heapq (not object-oriented).

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.