[IPython-dev] DAG Dependencies

MinRK benjaminrk at gmail.com
Thu Oct 28 15:16:42 EDT 2010


On Thu, Oct 28, 2010 at 11:30, Satrajit Ghosh <satra at mit.edu> wrote:

> hi brian,
>
> thanks for the responses. i'll touch on a few of them.
>
> > * optionally offload the dag directly to the underlying scheduler if it
>> has
>> > dependency support (i.e., SGE, Torque/PBS, LSF)
>>
>> While we could support this, I actually think it would be a step
>> backwards.
>>
> ...
>>
> All of this means lots and lots of latency for each task in the DAG.
>> For tasks that have lots of data or lots of Python modules to import,
>> that will simply kill the parallel speedup you will get (ala Amdahl's
>> law).
>>
>
> here is the scenario where this becomes a useful thing (and hence to
> optionally have it). let's say under sge usage you have started 10
> clients/ipengines. now at the time of creating the clients one machine with
> 10 allocations was free and sge routed all the 10 clients to that machine.
> now this will be the machine that will be used for all ipcluster processing.
> whereas if the node distribution and ipengine startup were to happen
> simultaneously at the level of the sge scheduler, processes would get routed
> to the best available slot at the time of execution.
>

You should get this for free if our ipcluster script on SGE/etc. can
grow/shrink to fit available resources.  Remember, jobs don't get submitted
to engines until their dependencies are met.  It handles new engines coming
just fine (still some work to do to handle engines disappearing gracefully).
Engines should correspond to actual available resources, and we should
probably have an ipcluster script that supports changing resources on a
grid, but I'm not so sure about the scheduler itself.


> i agree that in several other scenarios, the current mechanism works great.
> but this is a common scenario that we have run into in a heavily used
> cluster (limited nodes + lots of users).
>
>
>> > * something we currently do in nipype is that we provide a configurable
>> > option to continue processing if a given node fails. we simply remove
>> the
>> > dependencies of the node from further execution and generate a report at
>> the
>> > end saying which nodes crashed.
>>
>> I guess I don't see how it was a true dependency then.  Is this like
>> an optional dependency?  What are the usage cases for this.
>>
>
> perhaps i misunderstood what happens in the current implementation. if you
> have a DAG such as (A,B) (B,E) (A,C) (C,D) and let's say C fails, does the
> current dag controller continue executing B,E? or does it crash at the first
> failure. we have the option to go either way in nipype. if something
> crashes, stop or if something crashes, process all things that are not
> dependent on the crash.
>

The Twisted Scheduler considers a task dependency unmet if the task raised
an error, and currently the ZMQ scheduler has no sense of error/success, so
it works the other way, but can easily be changed if I add the ok/error
status to the msg header (I should probably do this).  I think this is a
fairly reasonable use case to have a switch on failure/success, and it would
actually get you the callbacks you mention:

msg_id = client.apply(job1)
client.apply(cleanup, follow_failure=msg_id)
client.apply(job2, after_success=msg_id)
client.apply(job3, after_all=msg_id)

With this:
    iff job1 fails: cleanup will be run *in the same place* as job1
    iff job1 succeeds: job2 will be run somewhere
    when job1 finishes: job3 will be run somewhere, regardless of job1's
status

Satrajit: Does that sound adequate?
Brian: Does that sound too complex?


>
>
>> > * callback support for node: node_started_cb, node_finished_cb
>>
>> I am not sure we could support this, because once you create the DAG
>> and send it to the scheduler, the tasks are out of your local Python
>> session.  IOW, there is really no place to call such callbacks.
>>
>
> i'll have to think about this one a little more. one use case for this is
> reporting where things stand within the  execution graph (perhaps the
> scheduler can report this, although now, i'm back to polling instead of
> being called back.)
>
>
>> > * support for nodes themselves being DAGs
>>
> ...
>>
> I think for the node is a DAG case, we would just flatten that at
>> submission time.  IOW, apply the transformation:
>>
>> A DAG of nodes, each of which may be a DAG => A DAG of node.
>>
>> Would this work?
>>
>
> this would work, i think we have a slightly more complicated case of this
> implemented in nipype, but perhaps i need to think about it again. our case
> is like a maptask, where the same thing operates on a list of inputs and
> then we collate the outputs back. but as a general purpose mechanism, you
> should not worry about this use case now.
>

A DAG-node is really the same has having the root(s) of a sub-DAG have the
dependencies of the DAG-node, and anything that would depend on the DAG-node
actually depends on any|all of the termini of the sub-DAG, no?


>
>
>> Yes, it does make sense to support DRMAA in ipcluster.  Once Min's
>> stuff has been merged into master, we will begin to get it working
>> with the batch systems again.
>>
>
> great.
>
> cheers,
>
> satra
>
>
> _______________________________________________
> IPython-dev mailing list
> IPython-dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/ipython-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/ipython-dev/attachments/20101028/cbe26e67/attachment.html>


More information about the IPython-dev mailing list