<br><br><div class="gmail_quote">On Fri, Jun 1, 2012 at 2:07 PM, Jason Grout <span dir="ltr"><<a href="mailto:jason-sage@creativetrax.com" target="_blank">jason-sage@creativetrax.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi everyone,<br>
<br>
As mentioned yesterday, we've been exploring ways to implement the Sage<br>
cell server in a way that makes better use of IPython's capabilities. This message is a<br>
call for comments, as well as a few questions about the directions of<br>
things like ipcluster.<br>
<br>
Our goals are to:<br>
<br>
* start multiple kernels/engines very quickly (e.g., through forking a<br>
master kernel)<br>
* connect single kernels to single web sessions<br>
* be able to transfer files back and forth between the executing<br>
environment and the web browser<br>
* do things as securely as is reasonable (e.g., workers are running on a<br>
firewalled virtual instance, etc.)<br>
* be able to use an enterprise-level web server, such as Google App Engine.<br>
<br>
<br>
We've explored two approaches, and have somewhat-barely-working partial<br>
proofs of concept of each:<br>
<br>
IPCLUSTER<br>
---------<br>
<br>
We implemented a forking engine factory which is run on the worker<br>
computer. A cluster can start up new engines by sshing into the worker<br>
computer and sending a message to the forking engine factory. A new<br>
engine starts and then registers with the controller. We're thinking it<br>
might be best to implement a new scheduler that would take messages,<br>
check to see if the session id matches up with one of the<br>
currently-running engines, and send the message to the appropriate<br>
engine if so. If the session id does not match up to a currently<br>
running engine, then the scheduler would start up a new engine (can the<br>
scheduler do this?)<br></blockquote><div><br></div><div>Very little of the current scheduler's code would be useful for this. The Scheduler object is for load-balanced execution, and all of its logic is involved in deciding which engines should get tasks. You seem to be looking for a pattern more like MUX (direct routing). For that pattern, IPython just uses the pure-C MonitoredQueue function from pyzmq. It sounds like you want something that is initially identical to the zmq_queue function, but on an unroutable request it starts a new engine and then routes the request there.</div>
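<div><br></div><div>For concreteness, here is a rough pyzmq sketch of that idea (this is not IPython code; start_engine() is a hypothetical hook standing in for your forking factory, and the addresses are made up). It behaves like a plain two-ROUTER queue, except that a request addressed to an unknown engine identity triggers starting a new engine before the message is forwarded:</div>
<pre>
import zmq

ctx = zmq.Context.instance()
frontend = ctx.socket(zmq.ROUTER)   # web/session clients connect here
backend = ctx.socket(zmq.ROUTER)    # engines connect here
frontend.bind("tcp://127.0.0.1:5555")
backend.bind("tcp://127.0.0.1:5556")

engines = set()  # identities of engines we know are running

def start_engine(engine_id):
    """Hypothetical hook: ask the forking factory for a new engine
    that will connect to the backend with this identity."""
    raise NotImplementedError

poller = zmq.Poller()
poller.register(frontend, zmq.POLLIN)
poller.register(backend, zmq.POLLIN)

while True:
    for sock, _event in poller.poll():
        if sock is frontend:
            # requests arrive as [client_id, engine_id, payload...]
            frames = frontend.recv_multipart()
            client_id, engine_id, payload = frames[0], frames[1], frames[2:]
            if engine_id not in engines:
                # the "unroutable request" case: start an engine first
                # (in practice you would queue until it registers)
                start_engine(engine_id)
                engines.add(engine_id)
            backend.send_multipart([engine_id, client_id] + payload)
        else:
            # replies come back as [engine_id, client_id, payload...]
            frames = backend.recv_multipart()
            engine_id, client_id, payload = frames[0], frames[1], frames[2:]
            frontend.send_multipart([client_id, engine_id] + payload)
</pre>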
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
The client would basically be the zmq/<a href="http://socket.io" target="_blank">socket.io</a> bridge, translating<br>
messages to/from the controller and the browser. I guess we could<br>
implement one client per session, or we could implement one overall<br>
client that would route cluster output to the appropriate browser.<br></blockquote><div><br></div><div>Again, most of the Client code is involved in dealing with multiple engines, and interleaving asynchronous requests. So I would do one of:</div>
<div><br></div><div>* use one Client per server process, and one light View-like object per session</div><div>* build a much lighter Client, more akin to the KernelManager, that only talks to one engine. It's possible this should just be a KernelManager subclass.</div>
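<div><br></div><div>A rough sketch of the first option, just to make it concrete (the session-to-engine bookkeeping and the SessionView name are made up for illustration; only Client and DirectView are real IPython.parallel objects):</div>
<pre>
from IPython.parallel import Client

rc = Client()          # one Client for the whole server process
session_engines = {}   # session_id -> engine_id (our own bookkeeping)

class SessionView(object):
    """Light per-session handle that talks to exactly one engine."""
    def __init__(self, client, engine_id):
        self.view = client[engine_id]   # DirectView on a single engine

    def execute(self, code):
        # non-blocking execute; the AsyncResult (and the IOPub stream)
        # is what you would relay back to the browser
        return self.view.execute(code, block=False)

def view_for_session(session_id):
    # assumes an engine has already been assigned to this session
    return SessionView(rc, session_engines[session_id])
</pre>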
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
MULTI FORKING KERNEL MANAGER<br>
----------------------------<br>
<br>
This approach stays much closer to the current IPython notebook<br>
architecture. A MultiKernelManager starts up kernels using a<br>
ForkingKernelManager (based on Min's gist the other day). Each kernel<br>
sets up a connection to a specific set of websocket channels through a<br>
tornado-based bridge. We still have to implement things like separation<br>
of the forked kernels (on a separate ssh account somewhere) and the<br>
tornado handler, and so on.<br>
<br>
<br>
THOUGHTS<br>
--------<br>
It seems that the multikernel manager is much lighter-weight, but we'd<br>
have to implement a lot of the enterprise-level functionality that the<br>
cluster already has. On the other hand, the ipcluster approach really<br>
does more than we need, so, in a sense, we need to trim it back. We're<br>
not asking you to make a decision for us, obviously, but it would be<br>
valuable to hear any comments or suggestions.<br>
<br>
SPECIFIC QUESTIONS<br>
------------------<br>
<br>
1. Brian, why did you decide to make a new multikernel manager instead<br>
of trying to leverage the ipcluster functionality to execute multiple<br>
engines?<br>
<br>
2. Some time ago, on this list or on a pull request, there was some<br>
discussion about the difference between kernels and engines (or the lack<br>
of difference). Are there some plans to simplify the ipcluster<br>
architecture a bit to merge the concepts of engines and kernels?<br></blockquote><div><br></div><div>Engines and Kernels really are the exact same thing (as of a recent PR, consolidating the two Kernel classes that had diverged during simultaneous development).</div>
<div><br></div><div>The IPython Kernel is a single object defined in IPython.zmq.ipkernel</div><div><br></div><div>But there are two ways to start a Kernel:</div><div><br></div><div>1. IPKernelApp (used by KernelManagers, and `ipython kernel`).</div>
<div>2. IPEngineApp (used by `ipengine`)</div><div><br></div><div>At the simplest level, an Engine is a Kernel that connects instead of binds (for consolidation of connection points at the Controller).</div><div><br></div>
<div>The Kernel object itself does not know whether it is binding or connecting (in fact, it can already do both at the same time).</div><div><br></div><div>All of the practical differences at this point are due to the different Application classes, which have a different set of command-line options (IPKernelApp has all the expected shell options, like pylab, etc., while the EngineApp has some options general to the IPython.parallel apps, like forwarding logging over zmq).</div>
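<div><br></div><div>A toy pyzmq illustration of that point (addresses made up): the same socket that serves a kernel's shell channel can bind for local frontends and connect out to a controller at the same time, and the code servicing it doesn't change either way.</div>
<pre>
import zmq

ctx = zmq.Context.instance()
shell = ctx.socket(zmq.ROUTER)        # a kernel's shell channel, say

shell.bind("tcp://127.0.0.1:6001")    # "kernel" style: frontends connect to us
shell.connect("tcp://10.0.0.5:7001")  # "engine" style: we connect to the controller's queue

# the request/reply loop that services this socket is identical in both
# cases -- the socket neither knows nor cares how a peer was attached
</pre>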
<div><br></div><div>The remaining differences are configuration details. Ultimately the two entry points will probably be merged as well, and there will no longer be a distinction beyond:</div><div><br></div><div>Binding Kernel: `ipython kernel`</div>
<div>Engine: `ipython kernel --connect`</div><div>Both: `ipython kernel --connect --bind`</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
3. Any idea about how much overhead the cluster introduces for lots of<br>
short computations involving lots of output? We'll test this too, but<br>
I'm curious if there was thought towards this use-case in the design.<br></blockquote><div><br></div><div>I haven't done performance tests with output before, but I would expect it to do fairly well. It's a direct PUB-SUB channel up to the Clients (one pure-C zmq_device hop at the Controller).</div>
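<div><br></div><div>If it helps with your testing, here is a minimal model of that output path (ports are made up; IPython's real controller wires this up for you): engines publish on IOPub, a pure-C forwarder device at the controller relays, and clients subscribe.</div>
<pre>
import zmq
from zmq.devices import ThreadDevice

# SUB side faces the engines' IOPub PUB sockets, PUB side faces the clients
iopub_relay = ThreadDevice(zmq.FORWARDER, zmq.SUB, zmq.PUB)
iopub_relay.bind_in("tcp://127.0.0.1:7010")     # engines connect their PUB here
iopub_relay.setsockopt_in(zmq.SUBSCRIBE, b"")   # forward all output messages
iopub_relay.bind_out("tcp://127.0.0.1:7011")    # clients connect their SUB here
iopub_relay.start()                             # runs in a background thread
</pre>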
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Thanks,<br>
<br>
Jason<br>
</blockquote></div><br>