[IPython-dev] SciPy Sprint summary
Justin Riley
justin.t.riley at gmail.com
Sun Jul 18 13:02:47 EDT 2010
Forgot to mention: in my fork, the PBS mode now also generates a launch
script automatically if one is not specified. So, assuming you have either
SGE or Torque/PBS working, it *should* be as simple as:
$ ipcluster sge -n 4
or
$ ipcluster pbs -n 4
You can of course still pass the --sge-script/--pbs-script options, but
users are no longer required to create a launch script themselves.
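
For illustration only, here is a rough sketch of how a default job-array
launch script could be generated when no script is passed. The template
contents and the make_launch_script() helper are hypothetical and are not
copied from the 0.10.1-sge branch. The "-t 1-N" array directive is the
SGE/Torque form; other PBS flavors may spell it differently.

```python
# Hypothetical sketch: default job-array launch scripts for SGE and Torque/PBS.
# The templates and make_launch_script() are illustrative assumptions, not the
# actual code in the 0.10.1-sge branch.

SGE_TEMPLATE = """#!/bin/sh
#$ -V
#$ -S /bin/sh
#$ -N ipengine
#$ -t 1-{n}
echo "starting engine for array task $SGE_TASK_ID"
ipengine
"""

PBS_TEMPLATE = """#!/bin/sh
#PBS -V
#PBS -N ipengine
#PBS -t 1-{n}
echo "starting engine for array task $PBS_ARRAYID"
ipengine
"""

TEMPLATES = {"sge": SGE_TEMPLATE, "pbs": PBS_TEMPLATE}


def make_launch_script(scheduler, n):
    """Return the text of a job-array script that starts n engines."""
    if scheduler not in TEMPLATES:
        raise ValueError("unknown scheduler: %r" % (scheduler,))
    if n < 1:
        raise ValueError("need at least one engine")
    # One array task per engine, so a single submission covers all n engines.
    return TEMPLATES[scheduler].format(n=n)


if __name__ == "__main__":
    print(make_launch_script("sge", 4))
```

With a script like this, "-n 4" would map to array tasks 1 through 4, each
starting one engine, all submitted as a single job.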
~Justin
On 07/18/2010 12:58 PM, Justin Riley wrote:
> Turns out that torque/pbs also support job arrays. I've updated my
> 0.10.1-sge branch with PBS job array support. Works well with torque
> 2.4.6. Also tested SGE support against 6.2u3.
>
> Since the code is extremely similar between PBS/SGE I decided to update
> the BatchEngineSet base class to handle the core job array logic. Given
> that PBS/SGE are the only subclasses I figured this was OK. If not,
> should be easy to break it out again.
>
> ~Justin
>
> On 07/18/2010 03:43 AM, Justin Riley wrote:
>> Hi Satra/Brian,
>>
>> I modified your code to use the job array feature of SGE. I've also made
>> it so that users don't need to specify --sge-script if they don't need a
>> custom SGE launch script. My guess is that most users will choose not to
>> specify --sge-script first and resort to using --sge-script when the
>> generated launch script no longer meets their needs. More details in the
>> git log here:
>>
>> http://github.com/jtriley/ipython/tree/0.10.1-sge
>>
>> Also, I need to test this, but I believe this code will fail if the
>> folder containing the furl file is not NFS-mounted on the SGE cluster.
>> Another option besides requiring NFS is to scp the furl file to each
>> host as is done in the ssh mode of ipcluster, however, this would
>> require password-less ssh to be configured properly (maybe not so bad).
>> Another option is to dump the generated furl file into the job script
>> itself. This has the advantage of only needing SGE installed but
>> certainly doesn't seem like the safest practice. Any thoughts on how to
>> approach this?
>>
>> Let me know what you think.
>>
>> ~Justin
>>
>> On 07/18/2010 12:05 AM, Brian Granger wrote:
>>> Is the array jobs feature what you want?
>>>
>>> http://wikis.sun.com/display/gridengine62u6/Submitting+Jobs
>>>
>>> Brian
>>>
>>> On Sat, Jul 17, 2010 at 9:00 PM, Brian Granger <ellisonbg at gmail.com> wrote:
>>>> On Sat, Jul 17, 2010 at 6:23 AM, Satrajit Ghosh<satra at mit.edu> wrote:
>>>>> hi ,
>>>>>
>>>>> i've pushed my changes to:
>>>>>
>>>>> http://github.com/satra/ipython/tree/0.10.1-sge
>>>>>
>>>>> notes:
>>>>>
>>>>> 1. it starts cleanly. i can connect and execute things. when i kill using
>>>>> ctrl-c, the messages appear to indicate that everything shut down well.
>>>>> however, the sge ipengine jobs are still running.
>>>>
>>>> What version of Python and Twisted are you running?
>>>>
>>>>> 2. the pbs option appears to require mpi to be present. i don't think one
>>>>> can launch multiple engines using pbs without mpi or without the workaround
>>>>> i've applied to the sge engine. basically it submits an sge job for each
>>>>> engine that i want to run. i would love to know if a single job can launch
>>>>> multiple engines on a sge/pbs cluster without mpi.
>>>>
>>>> I think you are right that pbs needs to use mpirun/mpiexec to start
>>>> multiple engines using a single PBS job. I am not that familiar with
>>>> SGE, can you start multiple processes without mpi and with just a
>>>> single SGE job? If so, let's try to get that working.
>>>>
>>>> Cheers,
>>>>
>>>> Brian
>>>>
>>>>> cheers,
>>>>>
>>>>> satra
>>>>>
>>>>> On Thu, Jul 15, 2010 at 8:55 PM, Satrajit Ghosh<satra at mit.edu> wrote:
>>>>>>
>>>>>> hi justin,
>>>>>>
>>>>>> i hope to test it out tonight. from what fernando and i discussed, this
>>>>>> should be relatively straightforward. once i'm done i'll push it to my fork
>>>>>> of ipython and announce it here for others to test.
>>>>>>
>>>>>> cheers,
>>>>>>
>>>>>> satra
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 15, 2010 at 4:33 PM, Justin Riley <justin.t.riley at gmail.com> wrote:
>>>>>>>
>>>>>>> This is great news. Right now StarCluster just takes advantage of
>>>>>>> password-less ssh already being installed and runs:
>>>>>>>
>>>>>>> $ ipcluster ssh --clusterfile /path/to/cluster_file.py
>>>>>>>
>>>>>>> This works fine for now, however, having SGE support would allow
>>>>>>> ipcluster's load to be accounted for by the queue.
>>>>>>>
>>>>>>> Is Satra on the list? I have experience with SGE and could help with the
>>>>>>> code if needed. I can also help test this functionality.
>>>>>>>
>>>>>>> ~Justin
>>>>>>>
>>>>>>> On 07/15/2010 03:34 PM, Fernando Perez wrote:
>>>>>>>> On Thu, Jul 15, 2010 at 10:34 AM, Brian Granger <ellisonbg at gmail.com> wrote:
>>>>>>>>> Thanks for the post. You should also know that it looks like someone
>>>>>>>>> is going to add native SGE support to ipcluster for 0.10.1.
>>>>>>>>
>>>>>>>> Yes, Satra and I went over this last night in detail (thanks to Brian
>>>>>>>> for the pointers), and he said he might actually already have some
>>>>>>>> code for it. I suspect we'll get this in soon.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> f
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> IPython-dev mailing list
>>>>>>> IPython-dev at scipy.org
>>>>>>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Brian E. Granger, Ph.D.
>>>> Assistant Professor of Physics
>>>> Cal Poly State University, San Luis Obispo
>>>> bgranger at calpoly.edu
>>>> ellisonbg at gmail.com
>>>>
>>>
>>>
>>>
>>
>