libuv based eventloop for tulip experiment

Hi all!
I haven't been able to keep up with all the tulip development on the mailing list (hopefully I will now!), so please excuse me if something I mention has already been discussed.
For those who may not know it, libuv is the platform layer library for nodejs, which implements a uniform interface on top of epoll, kqueue, event ports and IOCP. I wrote Python bindings [1] for it a while ago, and I was very excited to see Tulip, so I thought I'd give this a try. Here [2] is the source code, along with some notes I took during the implementation.
I know that the idea is not to re-implement the PEP itself but for people to create different EventLoop implementations. In rose I bundled tulip just to make a single package I could play with easily; once tulip makes it to the stdlib only the EventLoop will remain.
Here are some thoughts (in no particular order):
- add_connector / remove_connector seem to be related to Windows, but being exposed like that feels a bit like leaking an implementation detail. I guess there was no way around it.
- libuv implements a type of handle (Poll) which provides level-triggered file descriptor polling which also works on Windows, while being highly performant. It apparently uses something called AFD Polling, which is only available on Windows >= Vista, and a select thread on XP. I'm no Windows expert, but thanks to this the API is consistent across all platforms, which is nice. Maybe it's worth investigating? [3]
- The transport abstraction seems quite tight to socket objects. pyuv provides TCP and UDP handles, which offer a completion-style API and use a better approach than Poll handles. They should give better performance, since EINTR is handled internally and there are fewer round trips between Python-land and C-land. Was it ever considered to provide some sort of abstraction so that transports can be used on top of something other than regular sockets? For example, I see no way to get the remote party from the transport without checking the underlying socket.
Thanks for reading this far and keep up the good work.
Regards,
[1]: https://github.com/saghul/pyuv
[2]: https://github.com/saghul/rose
[3]: https://github.com/joyent/libuv/blob/master/src/win/poll.c
-- Saúl Ibarra Corretgé http://saghul.net/blog | http://about.me/saghul

On Mon, Jan 28, 2013 at 2:48 PM, Saúl Ibarra Corretgé <saghul@gmail.com> wrote:
Me neither! :-) Libuv has been brought up before, though I haven't looked at it in detail. I think you're bringing up good stuff.
Great to hear!
Here [2] is the source code, along with some notes I took during the implementation.
Hm... I see you just copied all of tulip and then hacked on it for a while. :-) I wonder if you could refactor things so that an app would be able to dynamically choose between tulip's and rose's event loop using tulip's EventLoopPolicy machinery? The app could just instantiate tulip.unix_eventloop._UnixEventLoop() (yes, this should really be renamed!) or rose.uv.EventLoop, but all its imports should come from tulip. Also, there's a refactoring of the event loop classes underway in tulip's iocp branch -- this adds IOCP support on Windows.
It will be a long time before tulip makes it into the stdlib -- but for easy experimentation it should be possible for apps to choose between tulip and rose without having to change all their tulip imports to rose imports.
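A minimal sketch of the policy idea discussed above, assuming nothing about tulip's or rose's real classes: SelectorLoop, UVLoop, SwitchablePolicy and make_policy are all hypothetical names invented for illustration. It only shows how application code could stay unchanged while the loop class is chosen in one place.

```python
class SelectorLoop:
    """Hypothetical stand-in for tulip's default event loop."""
    name = "selector"


class UVLoop:
    """Hypothetical stand-in for a libuv-backed loop such as rose's."""
    name = "uv"


class SwitchablePolicy:
    """Hypothetical policy that hands out loops of the selected class."""

    def __init__(self, loop_factory):
        self._loop_factory = loop_factory
        self._loop = None

    def new_event_loop(self):
        return self._loop_factory()

    def get_event_loop(self):
        # Create the loop lazily, then keep returning the same instance.
        if self._loop is None:
            self._loop = self.new_event_loop()
        return self._loop


def make_policy(backend):
    """Pick a loop class by name; callers never import the loop directly."""
    factories = {"selector": SelectorLoop, "uv": UVLoop}
    return SwitchablePolicy(factories[backend])


policy = make_policy("uv")
loop = policy.get_event_loop()
```

Switching backends then means changing only the string passed to make_policy; everything else keeps asking the policy for its loop.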
They would only be needed if we ever were to support WSAPoll() on Windows, but I'm pretty much decided against that (need to check with Richard Oudkerk once more). Then we can kill add_connector and remove_connector.
Again that's probably for Richard to look into. I have no idea how it relates to IOCP.
- The transport abstraction seems quite tight to socket objects.
I'm confused to hear you say this, since the APIs for transports and protocols are one of the few places of PEP 3156 where sockets are *not* explicitly mentioned. (Though they are used in the implementations, but I am envisioning alternate implementations that don't use sockets.)
So it implements TCP and UDP without socket objects? I actually like this, because it validates my decision to keep socket objects out of the transport/protocol APIs. (Note that PEP 3156 and Tulip currently don't support UDP; it will require a somewhat different API between transports and protocols.)
Why would EINTR handling be important? That should occur almost never. Or did you mean EAGAIN?
This we are considering in another thread -- there are in fact two proposals on the table, one to add transport methods get_name() and get_peer(), which should return (host, port) pairs if possible, or None if the transport is not talking to an IP connection (or there are too many layers in between to dig out that information). The other proposal is a more generic API to get info out of the transport, e.g. get_extra_info("name") and get_extra_info("peer"), which can be more easily extended (without changing the PEP) to support other things, e.g. certificate info if the transport implements SSL.
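To make the second proposal concrete, here is a rough sketch of a generic lookup, assuming a toy SketchTransport class that is not part of tulip. The "name" and "peer" keys come from the proposal above; the "certificate" key and the addresses are made up for the example.

```python
class SketchTransport:
    """Hypothetical transport exposing optional details by key."""

    def __init__(self, extra=None):
        self._extra = dict(extra or {})

    def get_extra_info(self, name, default=None):
        # Unknown keys fall back to the default instead of raising, so new
        # keys can be added later without changing the method signature.
        return self._extra.get(name, default)


tcp = SketchTransport({
    "name": ("127.0.0.1", 8080),
    "peer": ("192.0.2.10", 51234),
})
pipe = SketchTransport()  # not an IP connection, so no address info

peer = tcp.get_extra_info("peer")
missing = pipe.get_extra_info("peer")
cert = tcp.get_extra_info("certificate", default=b"")
```

A transport that cannot answer simply returns the default, which covers the "None if the transport is not talking to an IP connection" case from the first proposal.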
Thanks for reading this far and keep up the good work.
Thanks for looking at this and reimplementing PEP 3156 on top of libuv! This is exactly the kind of thing I am hoping for.
-- --Guido van Rossum (python.org/~guido)

Hi! [snip]
Sure, that's the idea, I just put everything together so that it would still run even if some API changes :-) Anyway, since I plan to follow this more closely I'll definitely go for that and rose will just create a new EventLoopPolicy which uses the uv event loop.
Agreed.
Ok, good to hear :-)
I'm no windows expert either :-) AFAIS, IOCP provides a completion-based interface, but many people/libraries are used to level-triggered readiness notifications. It's apparently not easy to have unix style file descriptor polling in Windows, but that AFD Poll stuff (fairy dust to me, to be honest) does the trick. It only works for sockets, but I guess that's ok.
Indeed, I meant the implementation. For example, right now start_serving returns a Python socket object; maybe some sort of ServerHandler class could hide that and provide some convenience methods such as getsockname. If the event loop implementation uses Python sockets it could just call the function on the underlying socket, but other implementations may have other means to gather that information.
Yes, the TCP and UDP handles from pyuv are wrappers around their corresponding types in libuv. They exist because JS doesn't have sockets, so they had to create them for nodejs. The API, however, is completion style. Here is a simple example of how data is read from a TCP handle:
    def on_data_received(handle, data, error):
        if error == pyuv.error.UV_EOF:
            # Remote closed the connection
            handle.close()
            return
        print(data)
    tcp_handle.start_read(on_data_received)
This model actually fits pretty well in tulip's transport/protocol mechanism.
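One way to picture that fit is the sketch below, which forwards completion callbacks to protocol methods. It assumes a FakeHandle that delivers canned events and an EOF sentinel; both are hypothetical stand-ins and not the pyuv API, and HandleTransport is an invented name rather than anything in rose.

```python
EOF = object()  # hypothetical sentinel standing in for an end-of-stream code


class FakeHandle:
    """Hypothetical completion-style handle that replays canned events."""

    def __init__(self, events):
        self._events = list(events)
        self.closed = False

    def start_read(self, callback):
        # A real handle would call back from the event loop as data arrives;
        # here the canned (data, error) pairs are delivered immediately.
        for data, error in self._events:
            callback(self, data, error)

    def close(self):
        self.closed = True


class RecordingProtocol:
    """Protocol that records what the transport tells it."""

    def __init__(self):
        self.chunks = []
        self.lost = False
        self.exc = None

    def data_received(self, data):
        self.chunks.append(data)

    def connection_lost(self, exc):
        self.lost = True
        self.exc = exc


class HandleTransport:
    """Hypothetical transport bridging handle callbacks to a protocol."""

    def __init__(self, handle, protocol):
        self._protocol = protocol
        handle.start_read(self._on_read)

    def _on_read(self, handle, data, error):
        if error is None:
            self._protocol.data_received(data)
            return
        handle.close()
        # A clean EOF is reported as None; any other error is passed through.
        self._protocol.connection_lost(None if error is EOF else error)


protocol = RecordingProtocol()
handle = FakeHandle([(b"he", None), (b"llo", None), (None, EOF)])
HandleTransport(handle, protocol)
```

The protocol never sees the handle, only data_received and connection_lost calls, so no socket object is needed on either side.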
Actually, both. If the process receives a signal, epoll_wait would be interrupted, and libuv takes care of rearming the file descriptor, which happens in C without the GIL. The same goes for EAGAIN: basically libuv tries to read 64k chunks when start_read is called, and it automatically retries on EAGAIN. I don't have numbers to back this up (yet), but conceptually it sounds pretty plausible.
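For readers unfamiliar with the two error codes, the following is a Python-level sketch of the retry logic being described; it is not libuv's code. It assumes a Unix system and Python 3.5 or later (for os.set_blocking), and read_available is a hypothetical helper name. On those Python versions os.read already retries EINTR internally in most cases, so the InterruptedError branch is mainly illustrative.

```python
import os


def read_available(fd, chunk_size=65536):
    """Drain a non-blocking fd: retry on EINTR, stop on EAGAIN or EOF."""
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk_size)
        except InterruptedError:
            # EINTR: a signal interrupted the call, so simply try again.
            continue
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: nothing more to read right now.
            break
        if not data:
            break  # EOF: the writing side was closed
        chunks.append(data)
    return b"".join(chunks)


# Demonstrate with a pipe whose read end is non-blocking.
r, w = os.pipe()
os.set_blocking(r, False)
os.write(w, b"ping")
result = read_available(r)
os.close(r)
os.close(w)
```

Doing this loop in C, as described above, avoids one Python-level call per retry, which is where the claimed saving would come from.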
The second model seems more flexible indeed. I guess the SSL transport could be tricky: while Tulip currently uses the ssl module, I have no TLS handle in pyuv, so I'd have to build one on top of a TCP handle with pyOpenSSL (I have a prototype here [1]). Object types / APIs wouldn't match, unless Tulip provides some wrappers for SSL-related objects such as certificates...
I'll follow the discussion more closely now :-) [1]: https://gist.github.com/4599801#file-uvtls-py Regards, -- Saúl Ibarra Corretgé http://saghul.net/blog | http://about.me/saghul

On Tue, Jan 29, 2013 at 12:08 PM, Saúl Ibarra Corretgé <saghul@gmail.com> wrote:
Yeah, so do the other polling things on Windows. (Well, mostly sockets. There are some other things supported like named pipes.) I guess in order to support this we'd need some kind of abstraction away from socket objects and file descriptors, at least for event loop methods like sock_recv() and add_reader(). But those are mostly meant for transports to build upon, so I think that would be fine.
- The transport abstraction seems quite tight to socket objects.
Ah, yes, the start_serving() API. It is far from ready. :-(
Yeah, I see. If we squint and read "handle" instead of "socket" we could even make it so that loop.sock_recv() takes one of these -- it would return a Future and your callback would set the Future's result, or its exception if an error was set.
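A rough sketch of that Future-based bridge, written against today's asyncio (tulip's eventual stdlib form) rather than the tulip API of the time. FakeHandle and handle_recv are hypothetical names; the handle only mimics a completion-style start_read and is not pyuv.

```python
import asyncio


class FakeHandle:
    """Hypothetical handle that completes one read with data or an error."""

    def __init__(self, loop, data=None, error=None):
        self._loop = loop
        self._data = data
        self._error = error

    def start_read(self, callback):
        # Deliver the canned completion on the next loop iteration.
        self._loop.call_soon(callback, self, self._data, self._error)


def handle_recv(loop, handle):
    """Return a Future resolved by the handle's completion callback."""
    fut = loop.create_future()

    def on_read(handle, data, error):
        if fut.cancelled():
            return
        if error is not None:
            fut.set_exception(error)
        else:
            fut.set_result(data)

    handle.start_read(on_read)
    return fut


async def demo():
    loop = asyncio.get_running_loop()
    ok = await handle_recv(loop, FakeHandle(loop, data=b"hello"))
    try:
        await handle_recv(
            loop, FakeHandle(loop, error=ConnectionResetError("reset")))
    except ConnectionResetError as exc:
        failed = str(exc)
    else:
        failed = None
    return ok, failed


ok, failed = asyncio.run(demo())
```

The caller only awaits the Future, so it does not matter whether the object passed in is a socket or some other kind of handle.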
Why would EINTR handling be important? That should occur almost never. Or did you mean EAGAIN?
Hm. Anything that uses signals for its normal operation sounds highly suspect to me. But it probably doesn't matter either way.
Hm, I thought certificates were just blobs of data? We should probably come up with a standard way to represent these that isn't tied to the stdlib's ssl module. But I don't think this should be part of PEP 3156 -- it's too big already.
-- --Guido van Rossum (python.org/~guido)

Yeah, so do the other polling things on Windows. (Well, mostly sockets. There are some other things supported like named pipes.)
In pyuv there is a special handle for those (Pipe), which works on both Unix and Windows with the same interface.
I see, great! [snip]
Yeah, sounds like it could work :-) Anyway, I wouldn't be opposed to leaving those APIs just for Python sockets (which I can interact with using a Poll handle) if transports can be built on top of other entities such as TCP handles. [snip]
Yes, they are blobs; I meant the objects that wrap those blobs and provide verification functions and such. But that can indeed be left out, letting implementations deal with it and having tulip just hand over the blobs. Regards, -- Saúl Ibarra Corretgé http://saghul.net/blog | http://about.me/saghul

On Wed, Jan 30, 2013 at 1:55 AM, Saúl Ibarra Corretgé <saghul@gmail.com> wrote:
PEP 3156 should add a new API for adding a pipe (either the read or write end). Someone worked on that for a bit, search last week's python-ideas archives.
The iocp branch now has all these refactorings.
Do you know how to write code like that? It would be illustrative to take the curl.py and crawl.py examples and adjust them so that if the protocol is https, the server's authenticity is checked and reported. I've never dealt with this myself so I would probably do it wrong... :-( -- --Guido van Rossum (python.org/~guido)

On 30 January 2013 15:45, Guido van Rossum <guido@python.org> wrote:
That was me. There's a patched version of tulip with pipe connector methods and a subprocess transport using them in my bitbucket repository: https://bitbucket.org/pmoore/tulip Paul

Hi again, I just updated rose [0] to match the latest changes in the Tulip API and removed Tulip itself from the code. A proper EventLoopPolicy is now defined, which in turn uses the pyuv-based event loop. Regards, [0]: https://github.com/saghul/rose -- Saúl Ibarra Corretgé http://saghul.net/blog | http://about.me/saghul

On Feb 5, 2013, at 2:58 PM, Guido van Rossum <guido@python.org> wrote:
Two tests are failing. I think that's a very good result.
======================================================================
FAIL: test_start_serving_cant_bind (tulip.events_test.UVEventLoopTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/unittest/mock.py", line 1088, in patched
    return func(*args, **keywargs)
  File "/Users/nikolay/dev/tulip/src/tulip/tulip/events_test.py", line 497, in test_start_serving_cant_bind
    self.assertRaises(Err, self.event_loop.run_until_complete, fut)
AssertionError: Err not raised by run_until_complete
======================================================================
FAIL: test_baseexception_during_cancel (tulip.tasks_test.TaskTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/nikolay/dev/tulip/src/tulip/tulip/tasks_test.py", line 451, in test_baseexception_during_cancel
    self.assertRaises(BaseException, self.event_loop.run_once)
AssertionError: BaseException not raised by run_once

Nikolay Kim wrote:
Hum, they all pass for me on OSX. Did you use the runtests.py script from rose?
I overwrote this test in rose/events_test.py
Hum, I have not seen this one. -- Saúl Ibarra Corretgé http://saghul.net/blog | http://about.me/saghul

Nikolay Kim wrote:
I used uv_events as the default event loop and ran all tulip tests with it, but those failures are not related to uv_events. Anyway, I'd inherit uv_events.EventLoop from base_events.BaseEventLoop.
Why? There is not much code that can be reused, since timers are not needed and the _run_once function is also not applicable. -- Saúl Ibarra Corretgé http://saghul.net/blog | http://about.me/saghul

On Tue, Feb 5, 2013 at 4:45 PM, Saúl Ibarra Corretgé <saghul@gmail.com> wrote:
Because the test mocks base_events.socket. :-) That's not a great reason, I agree. Perhaps you and Nikolay can work on a better way to do the mocking so it doesn't rely on base_events? -- --Guido van Rossum (python.org/~guido)

Guido van Rossum wrote:
I see. I modified it to inherit from BaseEventLoop now. I found a bug in the way rose executes handler callbacks (https://github.com/saghul/rose/commit/94e984cf756c3d0730acf804c18cab682cffd6...). I need to think about how to implement this, probably by saving the first base exception, stopping the processing and re-raising it... -- Saúl Ibarra Corretgé http://saghul.net/blog | http://about.me/saghul
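The "save the first base exception, stop, re-raise" approach mentioned above could look roughly like this. It is only a sketch under stated assumptions and not rose's actual fix: run_callbacks and the log list are hypothetical, and ordinary exceptions are assumed to be logged and skipped.

```python
def run_callbacks(callbacks, log):
    """Run callbacks in order; stop at the first BaseException, then re-raise."""
    first_exc = None
    for name, callback in callbacks:
        try:
            callback()
        except Exception:
            # Ordinary errors are noted and processing continues.
            log.append(name + ": error")
        except BaseException as exc:
            # KeyboardInterrupt, SystemExit and similar: remember and stop.
            log.append(name + ": fatal")
            first_exc = exc
            break
        else:
            log.append(name + ": ok")
    if first_exc is not None:
        raise first_exc


def ok():
    pass


def bad():
    raise ValueError("boom")


def fatal():
    raise KeyboardInterrupt()


log = []
try:
    run_callbacks([("a", ok), ("b", bad), ("c", fatal), ("d", ok)], log)
except KeyboardInterrupt:
    log.append("re-raised")
```

Callback "d" never runs, and the KeyboardInterrupt reaches the caller only after the processing loop has been left.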

Participants (4): Guido van Rossum, Nikolay Kim, Paul Moore, Saúl Ibarra Corretgé