[IPython-dev] DAG Dependencies

Brian Granger ellisonbg at gmail.com
Thu Oct 28 13:46:19 EDT 2010


On Thu, Oct 28, 2010 at 12:57 AM, MinRK <benjaminrk at gmail.com> wrote:
> Hello,
> In order to test/demonstrate arbitrary DAG dependency support in the new ZMQ
> Python scheduler, I wrote an example using NetworkX, as Fernando suggested.
> It generates a random DAG with a given number of nodes and edges, runs a set
> of empty jobs (one for each node) using the DAG as a dependency graph, where
> each edge represents a job depending on another.
> It then validates the results, ensuring that no job ran before its
> dependencies, and draws the graph, with nodes arranged in X according to
> time, which means that all arrows must point to the right if the
> time-dependencies were met.

Very impressive demo and test.  Here is a very significant benchmark
we could do with this...

1. Make each node do a time.sleep(rand_time) where rand_time is a
random time interval over some range of times.
2. For a DAG of such tasks, you can calculate the fastest possible
parallel execution time by finding the shortest path through the DAG,
where, by shortest path, I mean the path where the sum of rand_time's
on that path is the smallest.  Call that time T_best.  By analyzing
the DAG, you can also tell the number of engines required to acheive
that T_best.  We can also calculate things like the parallel and
serial fraction of the DAG to find the max speedup.
3. Run that same DAG on 1, 2, 4, 8, ... engines to see how close we
can get to T_best and the max_speedup.

This would be a very rigorous way of testing the system over a variety
of different types of loads.

> It happily handles pretty elaborate (hundreds of edges) graphs.

That is quite impressive, but what is the limitation?  It should be
able to do 1000s or more of edges right?

> Too bad I didn't have this done for today's Py4Science talk.

Yes, defiinitely, that would have been "epic" as my teenage son would say.

> Script can be found here:
> http://github.com/minrk/ipython/blob/newparallel/examples/demo/dagdeps.py



> -MinRK

Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com

More information about the IPython-dev mailing list