[IPython-dev] SciPy Sprint summary
Satrajit Ghosh
satra at mit.edu
Tue Jul 20 16:01:24 EDT 2010
hi brian,
i ran into a problem (my engines were not starting) and justin and i are
going to try and figure out what's causing it.
cheers,
satra
On Tue, Jul 20, 2010 at 3:19 PM, Brian Granger <ellisonbg at gmail.com> wrote:
> Satra,
>
> If you could test this as well, that would be great. Thanks. Justin,
> let us know when you think it is ready to go with the documentation
> and testing.
>
> Cheers,
>
> Brian
>
> On Tue, Jul 20, 2010 at 7:48 AM, Justin Riley <justin.t.riley at gmail.com> wrote:
> > On 07/19/2010 01:06 AM, Brian Granger wrote:
> >> * I like the design of the BatchEngineSet. This will be easy to port to
> >> 0.11.
> > Excellent :D
> >
> >> * I think if we are going to have default submission templates, we need to
> >> expose the queue name to the command line. This shouldn't be too tough.
> >
> > Added --queue option to my 0.10.1-sge branch and tested this with SGE
> > 62u3 and Torque 2.4.6. I don't have LSF to test but I added in the code
> > that *should* work with LSF.
> >
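For readers following along, here is a minimal sketch of what exposing the queue name to a default submission template could look like. It is not the code from the 0.10.1-sge branch: the function names, the default template layout, and the argparse parser are all hypothetical. The only things taken from the schedulers themselves are the directive prefixes (#$ for SGE, #PBS for PBS/Torque, #BSUB for LSF) and the -q queue flag.

```python
import argparse

# Directive prefixes used inside job scripts by each scheduler.
DIRECTIVE_PREFIX = {"sge": "#$", "pbs": "#PBS", "lsf": "#BSUB"}


def render_template(scheduler, queue=None, engine_cmd="ipengine"):
    """Return a default job script; hypothetical, for illustration only.

    The queue directive is only emitted when a queue name was given, so
    the scheduler's own default queue is used otherwise.
    """
    prefix = DIRECTIVE_PREFIX[scheduler]
    lines = ["#!/bin/sh"]
    if queue:
        lines.append("%s -q %s" % (prefix, queue))
    lines.append(engine_cmd)
    return "\n".join(lines) + "\n"


def parse_args(argv):
    """Hypothetical stand-in for the ipcluster option parser."""
    parser = argparse.ArgumentParser(prog="ipcluster-sketch")
    parser.add_argument("scheduler", choices=sorted(DIRECTIVE_PREFIX))
    parser.add_argument("-n", type=int, default=1, help="number of engines")
    parser.add_argument("--queue", default=None, help="queue to submit to")
    return parser.parse_args(argv)


if __name__ == "__main__":
    opts = parse_args(["sge", "-n", "4", "--queue", "all.q"])
    print(render_template(opts.scheduler, queue=opts.queue))
```

Leaving the directive out when no queue is given keeps the default template usable on clusters where users never name a queue explicitly.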
> >> * Have you tested this with Python 2.6? I saw that you mentioned that
> >> the engines were shutting down cleanly now. What did you do to fix that?
> >> I am even running into that in 0.11 so any info you can provide would
> >> be helpful.
> >
> > I've been testing the code with Python 2.6. I didn't do anything special
> > other than switch the BatchEngineSet to using job arrays (ie a single
> > qsub command instead of N qsubs). Now when I run "ipcluster sge -n 4"
> > the controller starts and the engines are launched and at that point the
> > ipcluster session is running indefinitely. If I then ctrl-c the
> > ipcluster session it catches the signal and calls kill() which
> > terminates the engines by canceling the job. Is this the same situation
> > you're trying to get working?
> >
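A rough sketch of the job-array idea described above, under stated assumptions: one submission covers all N engines, so a single cancel command is enough to stop them when Ctrl-C is caught. This is not the BatchEngineSet implementation; the helper names are hypothetical, and the array directives (-t for SGE and Torque, an indexed -J job name for LSF) and cancel commands (qdel, bkill) reflect my understanding of those schedulers rather than anything confirmed in this thread.

```python
# Job-array directive per scheduler: one script line requests n tasks.
ARRAY_DIRECTIVE = {
    "sge": "#$ -t 1-%d",
    "pbs": "#PBS -t 1-%d",
    "lsf": "#BSUB -J ipengine[1-%d]",
}

# Command that cancels a submitted job (and with it every array task).
CANCEL_COMMAND = {"sge": "qdel", "pbs": "qdel", "lsf": "bkill"}


def array_directive(scheduler, n):
    """Return the script line that turns one submission into n tasks."""
    return ARRAY_DIRECTIVE[scheduler] % n


def cancel_argv(scheduler, job_id):
    """Return the argv that would cancel the whole array job."""
    return [CANCEL_COMMAND[scheduler], str(job_id)]


def run_until_interrupted(wait, kill):
    """Block in wait(); on Ctrl-C call kill() and report that it ran.

    wait and kill are caller-supplied callables (hypothetical here), for
    example a loop that sleeps and a function that runs cancel_argv().
    """
    try:
        wait()
    except KeyboardInterrupt:
        kill()
        return True
    return False
```

With N separate qsub calls the session would have to track and cancel N job ids; with an array there is one id to remember, which may be why shutdown became cleaner after the switch.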
> >> * For now, let's stick with the assumption of a shared $HOME for the
> >> furl files.
> >> * The biggest thing is if people can test this thoroughly. I don't have
> >> SGE/PBS/LSF access right now, so it is a bit difficult for me to help. I
> >> have a cluster coming later in the summer, but it is not here yet. Once
> >> people have tested it well and are satisfied with it, let's merge it.
> >> * If we can update the documentation about how the PBS/SGE support works
> >> that would be great. The file is here:
> >
> > That sounds fine to me. I'm testing this stuff on my workstation's local
> > sge/torque queues and it works fine. I'll also test this with
> > StarCluster and make sure it works on a real cluster. If someone else
> > can test using LSF on a real cluster (with shared $HOME) that'd be
> > great. I'll try to update the docs some time this week.
> >
> >>
> >> Once these small changes have been made and everyone has tested, we
> >> can merge it for the 0.10.1 release.
> > Excellent :D
> >
> >> Thanks for doing this work Justin and Satra! It is fantastic! Just
> >> so you all know where this is going in 0.11:
> >>
> >> * We are going to get rid of using Twisted in ipcluster. This means we have
> >> to re-write the process management stuff to use things like popen.
> >> * We have a new configuration system in 0.11. This allows users to maintain
> >> cluster profiles that are a set of configuration files for a particular
> >> cluster setup. This makes it easy for a user to have multiple clusters
> >> configured, which they can then start by name. The logging, security, etc.
> >> is also different for each cluster profile.
> >> * It will be quite a bit of work to get everything working in 0.11, so I am
> >> glad we are getting good PBS/SGE support in 0.10.1.
> >
> > I'm willing to help out with the PBS/SGE/LSF portion of ipcluster in
> > 0.11, I guess just let me know when is appropriate to start hacking.
> >
> > Thanks!
> >
> > ~Justin
> >
>
>
>
> --
> Brian E. Granger, Ph.D.
> Assistant Professor of Physics
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu
> ellisonbg at gmail.com
>