[IPython-dev] roadmap for IPython.zmq.parallel

Mon Dec 20 20:44:10 EST 2010

I think targeting the current zmq.parallel code is worthwhile right now.  I
would call the current state 'alpha' level, as it has yet to be reviewed
and is largely untested in the wild, but getting it up to code, so to speak,
shouldn't be a huge project.

The primary shortcomings currently are:

* Configuration - Some very nice work has been done to add configurable
objects in IPython, and these tools are not yet used in the zmq parallel
code.
* Startup Scripts - Brian and others built some very nice tools for
deploying Twisted IPython on various clusters, and this work hasn't yet been
ported to use the existing ZeroMQ processes.
* Security - We do now allow for ssh tunnels, and they work with shell ssh
as well as Paramiko. This is the newest code, and it is largely untested
against the wide variety of key/password combinations used for ssh
authentication.
* Error handling - When code is going well, it's pretty solid, but there are
still decisions to be made on how to handle exceptions.  It survives errors
just fine, but exactly how we deal with the failures is likely to change.

The main pain you may see from it being alpha is that the API is not yet
frozen.  I wouldn't expect it to change much, but as we haven't had a
serious round of review yet, things are likely to change a little bit, so
you can expect your code to require small adjustments while we iron
things out. But the basics are there, and won't change significantly.

-MinRK

On Fri, Dec 17, 2010 at 07:45, Barry Wark <barrywark at gmail.com> wrote:

> Hi all,
>
> It's been too long since I've been able to hang out in IPython land.
> Given my previous interests, it's really exciting to see the work in
> frontends accelerating with the new refactoring.
>
> I'm very excited to have a new opportunity to get back to IPython work
> on a client project. The contract is to build a scientific data
> processing and analysis framework. The analyses are expressed as a
> DAG, with computation at the nodes done by executables that take a
> standardized set of arguments and return a contracted output format.
> Some of the executables are C, some Matlab, some Python, etc--standard
> fare in academia. Our job is to build the engine to execute these
> workflows, monitor results, etc. Jobs will initially execute on a
> single machine (so multiprocessing or a higher-level framework like
> Ruffus, http://www.ruffus.org.uk/, makes sense), but the user may
> eventually want to expand onto a local cluster.
>
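To make the workflow shape described above concrete, here is a minimal
single-machine sketch of running a DAG so that each node starts only once
its dependencies have finished. It assumes nothing about the
IPython.zmq.parallel API and uses only the Python standard library
(graphlib needs Python 3.9+). The names run_dag and run_node and the node
names are hypothetical, and run_node merely stands in for launching one of
the executables.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def run_dag(deps, run_node, max_workers=4):
    """Run each node once all of its dependencies have finished.

    deps: dict mapping node name -> set of node names it depends on.
    run_node: callable(name) -> result (hypothetical stand-in for
        launching an executable and collecting its output).
    Returns a dict mapping node name -> result.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if deps has a cycle
    results = {}
    pending = {}  # future -> node name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every node whose dependencies are all done.
            for name in sorter.get_ready():
                pending[pool.submit(run_node, name)] = name
            # Block until at least one running node finishes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = pending.pop(future)
                results[name] = future.result()  # re-raises node errors
                sorter.done(name)  # may unlock dependent nodes
    return results


if __name__ == "__main__":
    # Hypothetical analysis graph: node -> nodes it depends on.
    graph = {
        "load": set(),
        "filter": {"load"},
        "fit": {"filter"},
        "plot": {"filter"},
        "report": {"fit", "plot"},
    }
    print(run_dag(graph, lambda name: "ran " + name))
```

Independent nodes ("fit" and "plot" here) may run concurrently. An
exception in any node propagates out of run_dag, which is only one of
several possible failure policies.
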
> MinRK's IPython.zmq.parallel branch, with its support for DAG
> dependencies, looks like it might fit the bill as a base for our work.
> I'm curious what you think is the status and timeline of this branch.
> I am happy to dedicate time to improving and helping with the
> IPython.zmq.parallel branch; the contract includes 1/4 time for the
> duration of the project for work on project dependencies. The timeline
> for deploying our project is roughly Feb-March. Is it
> reasonable/advisable to build on IPython.zmq.parallel in that
> timeframe? It looks like ssh tunnels are the current basis for
> security in the zmq branch. Is that correct? Are there any plans to
> implement any sort of pluggable authentication/authorization?
>
> Thanks,
> Barry Wark