[IPython-dev] Starting to plan for 0.11 (this time for real)

MinRK benjaminrk at gmail.com
Sat Oct 30 03:10:44 EDT 2010

On Fri, Oct 29, 2010 at 23:25, Brian Granger <ellisonbg at gmail.com> wrote:

> Min,
> On Fri, Oct 29, 2010 at 10:55 PM, MinRK <benjaminrk at gmail.com> wrote:
> > This is more aggressive than I expected, I'll have to get the new parallel
> > stuff in gear.
> If you stopped writing great code, we wouldn't be tempted to do crazy
> things like this ;-)
> > The main roadblock for me is merging work into the Kernels.  I plan to
> > spend tomorrow working on getting the new parallel code ready for review,
> > and identifying what needs to happen with code in master in order for this
> > to go into 0.11.  The only work that needs merge rather than drop-in is in
> > Kernels and Session.  I expect that just using the new Session will be fine
> > after a review, but getting the existing Kernels to provide what is
> > necessary for the parallel code will be some work, and I'll try to
> > identify exactly what that will look like.
> Are you thinking of having only one top-level kernel script that
> handles both the parallel computing stuff and the interactive IPython?
>  I think the idea of that is fantastic, but I am not sure we need to
> have all of that working to merge your stuff.  I am not opposed to
> attempting this before/during the merge, but I don't view it as
> absolutely needed.  Also, it may make sense to review your code
> standalone first and then discuss merging the kernel and session stuff
> with what we already have.

I was thinking that we already have a remote execution object, and the only
difference between the two is the connection patterns. New features/bugfixes
will likely want to be shared by both.  My StreamKernel was derived from the
original pykernel, but I kept working on it while you were developing on it,
so they diverged.  I think they can be merged, as long as we do a few
things, mostly to do with abstracting the connections:

     * allow Kernels to connect, not just bind
     * use action-based, not socket-type names
     * allow execution requests to come from a *list* of connections, not
       just one
     * use sessions/ioloop instead of direct send/recv_json
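
A minimal sketch of the first three points, not the actual Kernel code.
Endpoint, open_connections, RecordingSocket and the URLs are hypothetical
names made up for illustration.  The only assumption about real sockets is
that they expose bind(url) and connect(url), which pyzmq sockets do.
Connections are keyed by action, each action takes a list of endpoints, and
every endpoint says whether to bind or connect:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Endpoint:
    """Where a socket attaches, and whether it binds or connects there."""
    url: str
    mode: str = "bind"  # either "bind" or "connect"

    def __post_init__(self) -> None:
        if self.mode not in ("bind", "connect"):
            raise ValueError("mode must be 'bind' or 'connect', got %r" % self.mode)


def open_connections(layout: Dict[str, List[Endpoint]],
                     make_socket: Callable[[str], object]) -> Dict[str, list]:
    """Open one socket per endpoint, grouped by action name.

    make_socket is a hypothetical factory: given an action name it returns
    an object with bind(url) and connect(url) methods.
    """
    opened: Dict[str, list] = {}
    for action, endpoints in layout.items():
        socks = []
        for ep in endpoints:
            sock = make_socket(action)
            # Call sock.bind(url) or sock.connect(url) depending on the mode.
            getattr(sock, ep.mode)(ep.url)
            socks.append(sock)
        opened[action] = socks
    return opened


class RecordingSocket:
    """Stand-in socket that only records the calls made on it."""

    def __init__(self, action: str) -> None:
        self.action = action
        self.calls = []

    def bind(self, url: str) -> None:
        self.calls.append(("bind", url))

    def connect(self, url: str) -> None:
        self.calls.append(("connect", url))


# Names describe the action ("execute", "publish"), not the socket type.
layout = {
    "execute": [Endpoint("tcp://127.0.0.1:5555", "connect"),
                Endpoint("tcp://127.0.0.1:5556", "connect")],
    "publish": [Endpoint("tcp://127.0.0.1:5557", "bind")],
}
opened = open_connections(layout, RecordingSocket)
```

The fourth point (sessions/ioloop) is not covered here; the sketch only
shows how the connection layout could be described independently of how
messages are sent.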

I also think using a KernelManager would be good, because it gets nice
process management (restart the kernel, etc.), and I can't really do that
without a Kernel, but I could subclass.

Related question:

Why is ipkernel not a subclass of pykernel?  There's lots of identical code.

> > The main things I know already:
> > * Names should change (GH-178). It's really a coincidence that we had
> > just one action per socket type, and the parallel code has several
> > sockets of the same type, and some actions that can be on different
> > socket types, depending on the scheduler.
> Yep.
> > * Use IOLoop/ZMQStream - this isn't necessarily critical, and I can
> > probably do it with a subclass if we don't want it in the main kernels.
> At this point I think that zmqstream has stabilized enough that we
> *should* be using it in the kernel and kernel manager code anyways.  I
> am completely fine with this.
> > * apply_request. This should be all new code, and shouldn't collide with
> > anything.
> Ok.
> One other point that Fernando and I talked about is actually shipping
> the rest of tornado with pyzmq.  I have been thinking more about the
> architecture of the html notebook that James has been working on and
> it is an absolutely perfect fit for implementing the server using our
> zmq enabled Tornado event loop with tornado's regular http handling.
> It would also give us ssl support, authentication and lots of other
> web server goodies like websockets.  If we did this, I think it would
> be possible to have a decent prototype of James' html notebook in
> 0.11.  What do you think about this, Min?  We are already shipping a
> good portion of tornado with pyzmq and the rest is just a
> dozen or so .py files (there is one .c file that we don't need for
> python 2.6 and up).
> Eventually I would like to contribute our ioloop.py and zmqstream to
> tornado itself, but I don't think we have to worry about that yet.

I'm not very familiar with Tornado other than our use in pyzmq.  If we can
use it for authentication without significant performance penalty, then
that's a pretty big deal, and well worth it.

It sounds like it would definitely provide a good toolkit for web backends,
so using it is probably a good idea.

I'm not sure that it should be *shipped* with pyzmq, though.  I think it
would be fine to ship with IPython if we use it there, but I don't see a
need to include it inside pyzmq.  If we depend on it, then depend on it in
PyPI, but if it's only for some extended functionality, I don't see any
problem with asking people to install it, since it is easy_installable (and
apt-installable on Ubuntu).  PyZMQ is a pretty low-level library - I don't
think shipping someone else's project inside it is a good idea unless there
are significant benefits.
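
As a rough sketch of the "only for some extended functionality" case, an
optional dependency can be guarded at import time.  The helper names
optional_import and require_web_backend are hypothetical, not anything in
pyzmq or IPython:

```python
import importlib


def optional_import(name):
    """Return the named module if it is installed, otherwise None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Tornado is treated as optional: core code still imports without it.
tornado = optional_import("tornado")


def require_web_backend():
    """Return the tornado module, or fail with an install hint."""
    if tornado is None:
        raise RuntimeError(
            "this feature needs tornado; install it with easy_install tornado"
        )
    return tornado
```

Only the code paths that call require_web_backend() would need Tornado
installed; everything else keeps working without it.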

> Also, moving tornado into pyzmq would allow us to do secure https
> connections for the parallel computing client - controller connection.

Secure connections would be *great* if the performance is good enough.

> Cheers,
> Brian
> > Let me know what I can do to help things along.
> > -MinRK
> >
> > On Fri, Oct 29, 2010 at 20:28, Fernando Perez <fperez.net at gmail.com> wrote:
> >>
> >> On Fri, Oct 29, 2010 at 11:23 AM, Brian Granger <ellisonbg at gmail.com>
> >> wrote:
> >> > Remove all of the twisted stuff from 0.11 and put the new zmq stuff in
> >> > place as a prototype.
> >> >
> >> > Here is my logic:
> >> >
> >> > * The Twisted parallel stuff is *already* broken in 0.11 and if anyone
> >> > has stable code running on it, they should be using 0.10.
> >> > * If someone is happy to run non-production ready code, there is no
> >> > reason they should be using the Twisted stuff, they should use the
> >> > pyzmq stuff.
> >> > * Twisted is a *massive* burden on our code base:
> >> >  - For package managers, it brings in Twisted, Foolscap and
> >> > zope.interface.
> >> >  - It makes our test suite unstable and fragile because we have to
> >> > run tests in subprocesses and use trial sometimes and nose other
> >> > times.
> >> >  - It is a huge # of LOC.
> >> >  - Removing it means that most of our codebase is Python 3 ready.
> >> >
> >> > There are lots of cons to this proposal:
> >> >
> >> > * That is really quick to drop support for the Twisted stuff.
> >> > * We may piss some people off.
> >> > * It possibly means maintaining the 0.10 series longer than we
> >> > imagined.
> >> > * We don't have a security story for the pyzmq parallel stuff yet.
> >>
> >> I have to say that I simply didn't have Brian's boldness to propose
> >> this, but I think it's the right thing to do, ultimately.  It *is*
> >> painful in the short term, but it's also the honest approach.  I keep
> >> forgetting but Brian reminded me that even the Twisted-based code in
> >> 0.11 has serious regressions re. the 0.10.x series, since in the big
> >> refactoring for 0.11 not quite everything made it through.
> >>
> >> The 0.10 maintenance doesn't worry me a whole lot: as long as we limit
> >> it to small changes, by now merging them as self-contained pull
> >> requests is really easy (as I just did recently with the ones Paul and
> >> Tom sent).  And rolling out a new release when the total delta is
> >> small is actually not that much work.
> >>
> >> So I'm totally +1 on this radical, but I think ultimately beneficial,
> >> approach.  It's important to keep in mind that doing this will lift a
> >> big load off our shoulders, and we're a small enough team that this
> >> benefit is significant.  It will let us concentrate on moving the new
> >> machinery forward quickly without having to worry about the large
> >> Twisted code.  It will also help Thomas with his py3 efforts, as it's
> >> one less thing he has to keep getting out of his way.
> >>
> >> Concrete plan:
> >>
> >> - Wait a week or two for feedback.
> >> - If we decide to move ahead, make a shared branch on the main repo
> >> where we can do this work and review it, with all having the chance to
> >> contribute while it happens.
> >> - Move all twisted-using code (IPython/kernel and some code in
> >> IPython/testing) into IPython/deathrow.  This will let anyone who
> >> really wants it find it easily, without having to dig through version
> >> control history.  Note that deathrow does *not* make it into official
> >> release tarballs.
> >>
> >> Cheers,
> >>
> >> f
> >> _______________________________________________
> >> IPython-dev mailing list
> >> IPython-dev at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/ipython-dev
> >
> >
> --
> Brian E. Granger, Ph.D.
> Assistant Professor of Physics
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu
> ellisonbg at gmail.com