[IPython-dev] api between kernel, web server and frontend

Brian Granger ellisonbg at gmail.com
Thu May 13 17:25:01 EDT 2010


Ondrej,

I wish I could have joined you at Berkeley...

On Thu, May 13, 2010 at 12:04 PM, Ondrej Certik <ondrej at certik.cz> wrote:
> Hi,
>
> we were in Berkeley with Mateusz and we discussed the API between the
> computational kernel, web server and the javascript frontend (and
> other frontends), so I wanted to share our thoughts.

Great!

> The main workhorse is a kernel, that has two connections:
>
> 1) feed, that publishes all changes to a given session, anyone (who
> knows the session UID) can subscribe to it and observe the whole
> session (e.g. ipython session)
> 2) request/response channel, which is used for sending python code for
> evaluation and some other minor things. The results are published in
> the feed.
>
> The API is done using json, and it is described here:
>
> http://github.com/ellisonbg/pyzmq/blob/master/examples/kernel/message_spec.rst
>
> and demo implementation of this is here:
>
> http://github.com/ellisonbg/pyzmq/tree/master/examples/kernel/
>
> This demo implementation is using 0MQ (http://www.zeromq.org/) for the
> transport layer (i.e. for sending the json messages). 0MQ is probably
> the best library for sending/receiving messages over the net, but a
> big disadvantage is that it probably can't be used on the google app
> engine and it creates a dependency. But it's just the transport layer,
> in principle it can be replaced with anything else.

I am a little wary of calling 0MQ a "transport layer" as it does many
non-trivial things under the hood.  The reason I am wary is that the
term suggests that we could simply swap out 0MQ for a different
"transport layer" in other contexts.  After struggling with these
networking issues for years, I don't think there is anything else out
there that will do what 0MQ does.  The only thing that even comes
close would be Twisted, but even it doesn't really come close.  Thus,
I am not sure what you have in mind when you say "it can be replaced
with anything else."  What other things are you thinking of?  On GAE,
can you even do raw sockets?  As I understand it, the only
networking-related thing you can do on GAE is make an outbound HTTP
client request (query a non-Google HTTP server).

> So the above is the main logic for handling computational sessions,
> with many people (frontends) doing simultaneous calculations in the
> same session and one kernel (that however can dispatch the actual
> little calculations to many cores, e.g. do it in parallel). Note that
> this kernel has no database, no user accounts, nothing. All that it
> has is a session with UID (so you need to create the session somehow,
> but that's it) and it has a namespace for the running session, that's
> it.

Yep.
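
To make the shape of such a kernel concrete, here is a rough,
in-process sketch of the two channels: a request/response call that
runs code, and a feed that records everything observable.  It uses no
0MQ and no sockets, so it shows only the message flow.  The class
name, the method names and the message type strings ("input",
"stream", "error") are placeholders chosen for illustration; they are
not taken from message_spec.rst.

```python
import contextlib
import io
import uuid


class ToyKernel:
    """In-process stand-in for the kernel: one session, one namespace.

    All names here are hypothetical; a real kernel would publish the
    feed over the network instead of appending to a list.
    """

    def __init__(self):
        self.session = uuid.uuid4().hex  # session UID a subscriber would need
        self.namespace = {}              # the only state the kernel keeps
        self.feed = []                   # stand-in for the publish channel

    def publish(self, msg_type, content):
        # Everything observable goes to the feed, tagged with the session UID.
        self.feed.append(
            {"session": self.session, "msg_type": msg_type, "content": content}
        )

    def execute_request(self, code):
        """Request/response channel: run code, reply with a status only."""
        self.publish("input", {"code": code})
        buf = io.StringIO()
        try:
            with contextlib.redirect_stdout(buf):
                exec(code, self.namespace)
        except Exception as exc:
            self.publish(
                "error", {"ename": type(exc).__name__, "evalue": str(exc)}
            )
            return {"status": "error"}
        if buf.getvalue():
            # Results travel on the feed, not in the reply.
            self.publish("stream", {"name": "stdout", "data": buf.getvalue()})
        return {"status": "ok"}
```

The reply carries only a status; anything a frontend would display
arrives through the feed, which is what lets several frontends watch
the same session.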

> Now we need a web server, that would use the above API to communicate
> with the kernel, and this web server would have a database with user
> accounts, it would also have a database with user worksheets and cells
> and it would expose this over HTTP. The problem is that using HTTP and
> the current web technologies pose some limitations currently to what
> can be done.

Definitely.  A big issue that I see with GAE is that it doesn't allow
long-lived requests.

> We talked with Mateusz how to do that in the car to Reno
> and we agreed that the best way is to use a JSONRPC, so the web server
> would define a Python class with some functionality, and then the
> javascript frontend (running in the browser) would use methods of this
> class to do things (create worksheets, get users, create cells,
> evaluate cells, get the feed from the kernel, ...). The JS frontend
> would have to periodically call some method on the server to get the
> feed, as we don't know how to "subscribe" to it using AJAX. Maybe in
> couple years, this would be simple using web sockets.

Yes, the JS frontend would probably need to poll.
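
A minimal sketch of what that polling could look like on the server
side, assuming JSON-RPC 2.0 style messages (the thread does not say
which JSONRPC version is meant).  The class NotebookServer, its
get_feed method and the "since" cursor are hypothetical names used
only for illustration.

```python
import json


class NotebookServer:
    """Hypothetical server-side class whose methods are exposed over JSON-RPC."""

    def __init__(self):
        self.feed = []  # messages relayed from the kernel, oldest first

    def get_feed(self, since):
        # Return only what the client has not seen yet, plus the next cursor.
        return {"messages": self.feed[since:], "next": len(self.feed)}


def handle(server, raw):
    """Dispatch one JSON-RPC request string to a public method of `server`."""
    req = json.loads(raw)
    name = req["method"]
    method = None if name.startswith("_") else getattr(server, name, None)
    if not callable(method):
        # -32601 is the JSON-RPC 2.0 code for "Method not found".
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "error": error, "id": req.get("id")})
    result = method(*req.get("params", []))
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": req.get("id")})
```

A client would call get_feed repeatedly, each time passing the "next"
value from the previous reply, so that every poll returns only the
messages it has not yet seen.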

> I would be interested in any feedback and discussion.

While I like the idea of being able to do things on GAE, I do think
that its constraints are so severe that it probably doesn't make sense
to implement these ideas in that context.  We could use GAE to manage
user accounts and the notebook sessions but I think the only way to
use this architecture is to have the kernels run elsewhere and have
GAE make HTTP calls to the kernels (through an HTTP-0MQ bridge).

We have struggled for years with trying to implement this architecture
and have gotten basically nowhere.  Part of this is that we are very
busy, but a huge part of the issue is technical.  Without the unique
capabilities that 0MQ provides, I am convinced we will end up hacking
things together that only partially work and are much more complex
than they need to be.

Of course, I think you can probably get something working on GAE (you
already have), but I think the 0MQ based architecture is simply not
compatible with GAE.  But, I do think that we should think about how
to implement these things outside of GAE using 0MQ for real.

Some thoughts along those lines:

The core bit of technology that we need is an HTTP/JSONRPC to 0MQ
bridge.  This would make it possible for any HTTP client to interact
with a 0MQ based kernel:

HTTP client <--HTTP--> Bridge <---0MQ---> Kernel

The main thing that this bridge would do is to translate the more
complex message based communications to the simple request/reply style
of HTTP.  A single bridge could even manage multiple kernels.  But I
think that the bridge should be developed initially without any
knowledge of the notebook/user side of things.

This bridge could be written using any python webapp framework and
should be done in the "RESTful" manner
(http://en.wikipedia.org/wiki/Representational_State_Transfer) to
allow for JSONRPC calls to the bridge.  There are some issues related
to integrating the 0MQ client into a custom webapp, but these are
solvable problems.
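
A rough sketch of the bridge idea, under heavy assumptions: the
kernel side is replaced by a stub (StubKernel stands in for a real
0MQ connection), and the URL layout, class names and method names are
all invented for illustration.  The point is only to show how feed
messages can be buffered per kernel so that plain request/reply calls
can fetch them later, and how one bridge can hold several kernels.

```python
import json


class StubKernel:
    """Placeholder for a real 0MQ kernel connection (hypothetical)."""

    def __init__(self):
        self.pending = []  # feed messages not yet drained by the bridge

    def execute(self, code):
        self.pending.append({"msg_type": "input", "code": code})
        return {"status": "ok"}

    def drain_feed(self):
        msgs, self.pending = self.pending, []
        return msgs


class Bridge:
    """Turns asynchronous feed traffic into plain request/reply calls."""

    def __init__(self):
        self.kernels = {}  # kernel id -> kernel connection
        self.buffers = {}  # kernel id -> every feed message seen so far

    def add_kernel(self, kernel_id, kernel):
        self.kernels[kernel_id] = kernel
        self.buffers[kernel_id] = []

    def handle(self, verb, path, body=""):
        """Route POST /kernels/<id>/execute and GET /kernels/<id>/feed/<since>."""
        parts = path.strip("/").split("/")
        if len(parts) < 3 or parts[0] != "kernels" or parts[1] not in self.kernels:
            return 404, json.dumps({"error": "unknown resource"})
        kernel, buf = self.kernels[parts[1]], self.buffers[parts[1]]
        if verb == "POST" and parts[2] == "execute":
            reply = kernel.execute(json.loads(body)["code"])
            buf.extend(kernel.drain_feed())
            return 200, json.dumps(reply)
        if verb == "GET" and parts[2] == "feed":
            buf.extend(kernel.drain_feed())
            since = int(parts[3]) if len(parts) > 3 else 0
            return 200, json.dumps({"messages": buf[since:], "next": len(buf)})
        return 404, json.dumps({"error": "unknown resource"})
```

The handle method returns a (status, body) pair so that it could be
wrapped by whatever webapp framework is chosen; nothing here knows
about users or notebooks.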

> We now need to write a demo implementation of the above web server and
> a client that uses JSONRPC to do the above. The client can (should) be
> command line based and it will only use JSONRPC for all the
> communication. It will then be quite trivial (trivial in terms of the
> logic) to anyone to write a javascript client, as JSONRPC works nice
> with javascript, or adapt codenode or the sage notebook frontends to
> use this JSONRPC.

Yes, I agree, once the bridge exists.

>
> For testing purposes, I would really like to have a demo
> implementation that runs on the google app engine (both the web server
> and the kernel). Then anyone can run a simple client, that connects to
> it (and we will also have a simple javascript client, so you would
> just go to the demo page and it would work). For that to happen, we
> need to figure out some other transport layer than 0MQ, something that
> works on the app engine. This can also be the only option for people
> that don't want/can't to install 0MQ for some reason. Any ideas?

While IPython will always ship with a non-0MQ version, for things like
you are talking about, I don't see 0MQ as optional.  I simply don't
think there is another good transport layer (at least not one that is
pure python and more lightweight than 0MQ).

Cheers,

Brian

> Ondrej



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com


