Subject changed for tangent. On Sun, May 24, 2020 at 4:14 PM Dominik Vilsmeier <dominik.vilsmeier@gmx.de> wrote:
output = []
for x in data:
    a = delayed inc(x)
    b = delayed double(x)
    c = delayed add(a, b)
    output.append(c)
total = sum(output)  # concrete answer here
Obviously the simple example of adding scalars isn't worth the delay machinery. But if those were expensive operations that built up a call graph, the laziness could be useful.
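To make the idea concrete, here is a minimal pure-Python sketch of what the proposed `delayed` could do. `Delayed` and the `delayed()` wrapper are my hypothetical stand-ins (Python has no `delayed` keyword; this roughly mirrors how dask.delayed wraps calls), and `inc`, `double`, `add`, and `data` are assumed definitions:

```python
# Hypothetical sketch: calls build graph nodes instead of running.
class Delayed:
    def __init__(self, func, args):
        self.func = func
        self.args = args

    def compute(self):
        # Recursively evaluate the graph, concretizing child nodes first.
        args = [a.compute() if isinstance(a, Delayed) else a
                for a in self.args]
        return self.func(*args)

def delayed(func):
    # Wrap a function so calling it records a node rather than executing.
    def wrapper(*args):
        return Delayed(func, args)
    return wrapper

def inc(x):
    return x + 1

def double(x):
    return 2 * x

def add(a, b):
    return a + b

data = [1, 2, 3]
output = []
for x in data:
    a = delayed(inc)(x)       # nothing runs yet
    b = delayed(double)(x)
    c = delayed(add)(a, b)
    output.append(c)

# Only here does any of inc/double/add actually execute.
total = sum(node.compute() for node in output)
print(total)  # 21
```

Until `.compute()` is called, the loop only builds a call graph, which is the cheap part.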
Do you have an example that can't be solved by using generator expressions and itertools? As far as I understand the Dask docs, the purpose of this is to execute in parallel, which wouldn't be the case for pure Python, I suppose. The above example can be written as:
a = (inc(x) for x in data)
b = (double(x) for x in data)
c = (add(x, y) for x, y in zip(a, b))
total = sum(c)
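With concrete stand-ins for `inc`, `double`, `add`, and `data` (my additions, not from the original mail), the generator pipeline is fully lazy until `sum()` drives it:

```python
# Assumed definitions for the functions named in the thread.
def inc(x):
    return x + 1

def double(x):
    return 2 * x

def add(a, b):
    return a + b

# Note: data must be re-iterable (e.g. a list, not an iterator),
# since the a and b generators each traverse it independently.
data = [1, 2, 3]

a = (inc(x) for x in data)
b = (double(x) for x in data)
c = (add(x, y) for x, y in zip(a, b))
total = sum(c)  # iteration, and all the calls, happen here
print(total)  # 21
```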
Obviously delayed execution CAN be done in Python, since Dask is a pure-Python library that does it. For the narrow example I took from the start of the dask.delayed docs, your version looks equivalent. But there are many, not very complicated, cases where you cannot express the call graph as a simple sequence of generator comprehensions. I could make some contrived example, or with a little more work, an actually useful one. For instance, think of creating different delayed objects within conditional branches inside the loop. Yes, some could be expressed with an if in the comprehensions, but many cannot.

It's true that Dask is most useful for parallel execution, whether in multiple threads, multiple processes, or multiple worker nodes. That doesn't mean it would be a bad thing for language-level capabilities to make similar libraries easier to build. Kinda like the way we have asyncio, uvloop, and curio all built on the same primitives.

But another really nice thing about delayed execution is that we do not necessarily want the *final* computation. Indeed, the DAG might not have only one "final state." Building a DAG of delayed operations is almost free. We might build one with thousands or millions of different operations involved (and Dask users really do that). But imagine that different paths through the DAG lead to the states/values "final1", "final2", "final3" that share many, but not all, of the same computation steps. After building the DAG, we can make a decision which computations to perform:

    if some_condition():
        x = concretize final1
    elif other_condition():
        x = concretize final2
    else:
        x = concretize final3

If we avoid 2/3, or even 1/3, of the computation by having that approach, that is a nice win where we are compute bound.

--
The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber.
Ideas, once born, become abortifacients against new conceptions.
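The shared-DAG argument above can be sketched in plain Python. Everything here is a hypothetical stand-in (`Delayed`, `delayed`, `concretize`, and the `op` call counter are my inventions, not real Python or Dask syntax); the counter shows that concretizing one final state runs only the nodes on its path, skipping the other finals' private steps:

```python
# Hypothetical sketch of a DAG with several "final" nodes sharing work.
class Delayed:
    def __init__(self, func, args):
        self.func, self.args = func, args
        self._done = False
        self._value = None

    def compute(self):
        # Memoized evaluation: each node runs at most once, so finals
        # that share a subgraph share the cost of the common steps.
        if not self._done:
            args = [a.compute() if isinstance(a, Delayed) else a
                    for a in self.args]
            self._value = self.func(*args)
            self._done = True
        return self._value

def delayed(func):
    def wrapper(*args):
        return Delayed(func, args)
    return wrapper

def concretize(node):
    # Models the proposed `concretize` operation: force one final state.
    return node.compute()

calls = []  # record which operations actually ran

def op(name, f):
    def g(*args):
        calls.append(name)
        return f(*args)
    return g

# Build the graph: `shared` feeds all three finals, but each final
# also has its own private step.
shared = delayed(op("shared", lambda x: x * 10))(4)
final1 = delayed(op("step1", lambda s: s + 1))(shared)
final2 = delayed(op("step2", lambda s: s + 2))(shared)
final3 = delayed(op("step3", lambda s: s + 3))(shared)

# Decide *after* building the DAG which computation to perform.
x = concretize(final2)
print(x)      # 42
print(calls)  # ['shared', 'step2'] -- step1 and step3 never ran
```

Only two of the four operations execute; the work behind final1 and final3 is avoided entirely, which is the win in a compute-bound setting.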