
I had forgotten about Daisy! It's an interesting project too. The behavior of 'autodask()' is closer to what I'd want in new syntax than is plain dask.delayed(). I'm not sure of all the corners. But I'd definitely love to have it for expressions generally, not only pure functions.
On Feb 17, 2017 12:03 AM, "Joseph Jevnik" <joejev@gmail.com> wrote:
You can let dask "see" into the function by entering it and wrapping all of the operations in `delayed`; this is how daisy[0] builds up large compute graphs. In this case, you could "inline" the identity function and the delayed object would flow through the function and the call to identity never makes it into the task graph.
[0] http://daisy-python.readthedocs.io/en/latest/appendix.html#daisy.autodask
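To make the "inlining" idea concrete, here is a minimal pure-Python sketch. It does not use Dask or daisy: the `Lazy` class and the `lazy()` wrapper are hypothetical stand-ins for a delayed object and for `delayed(func)`, and `depth()` is an illustrative helper that counts recorded tasks. It only shows the difference between wrapping a call opaquely and entering the function so the lazy value flows straight through.

```python
class Lazy:
    """Toy stand-in for a delayed object: one recorded call, func(arg)."""

    def __init__(self, func, arg):
        self.func = func
        self.arg = arg

    def compute(self):
        # Evaluate the argument first if it is itself lazy, then apply func.
        arg = self.arg.compute() if isinstance(self.arg, Lazy) else self.arg
        return self.func(arg)

    def depth(self):
        # Number of recorded tasks in the chain ending at this object.
        inner = self.arg.depth() if isinstance(self.arg, Lazy) else 0
        return 1 + inner


def lazy(func):
    """Hypothetical analogue of delayed(func): calling it records a task."""
    def wrapper(arg):
        return Lazy(func, arg)
    return wrapper


def identity(x):
    return x


a = lazy(identity)(42)        # one recorded task
wrapped = lazy(identity)(a)   # opaque call: a second task is recorded
inlined = identity(a)         # entering the function: 'a' flows through

print(wrapped.depth(), inlined.depth(), inlined is a)
print(wrapped.compute(), inlined.compute())
```

In the opaque case the wrapper cannot tell that `identity` does nothing, so it records another task. In the inlined case the body runs on the lazy object itself and no new task appears.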
On Fri, Feb 17, 2017 at 2:26 AM, David Mertz <mertz@gnosis.cx> wrote:
On Thu, Feb 16, 2017 at 11:15 PM, David Mertz <mertz@gnosis.cx> wrote:
This also means that a 'delayed' object needs to be idempotent. So
x = delayed 2+2
y = delayed x
z = delayed delayed delayed y
Wrapping more delays around an existing delayed object should probably just keep the same object rather than "doubly delaying" it. If there is some reason to create separate delayed objects that isn't occurring to me, evaluating 'z' would still go through the multiple evaluation levels until it got to a non-delayed value.
This is sort of like how iterators "return self" and 'it = iter(it)'.
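As a rough illustration of that idempotence, here is a small sketch in today's Python. The `Delayed` class below is hypothetical (it is neither Dask's class nor the proposed syntax), and a zero-argument lambda stands in for the expression after the `delayed` keyword. Wrapping an existing `Delayed` hands back the same object, much as `iter(it)` returns `it`.

```python
class Delayed:
    """Hypothetical delayed value: holds a zero-argument callable (a thunk)."""

    def __new__(cls, thunk):
        # Idempotence: delaying an already-delayed object returns it unchanged.
        if isinstance(thunk, Delayed):
            return thunk
        self = super().__new__(cls)
        self._thunk = thunk
        return self

    def compute(self):
        # Run the stored thunk to obtain the concrete value.
        return self._thunk()


x = Delayed(lambda: 2 + 2)          # stands in for: x = delayed 2+2
y = Delayed(x)                      # stands in for: y = delayed x
z = Delayed(Delayed(Delayed(y)))    # stands in for: z = delayed delayed delayed y

print(y is x, z is x)
print(z.compute())
```

The check lives in `__new__` rather than `__init__` because only `__new__` can return an existing instance instead of creating a fresh one.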
In the case of Dask, wrapping more delayed objects creates layers of these lazy objects. But I think it has to because it's not part of the syntax. Actually, I guess Dask could do graph reduction without actual computation if it wanted to. But this is the current behavior:
>>> from dask import delayed
>>> def unchanged(x):
...     return x
>>> a = delayed(unchanged)(42)
>>> a
Delayed('unchanged-1780fed6-f835-4c31-a86d-50015ae1449a')
>>> b = delayed(unchanged)(a)
>>> c = delayed(unchanged)(b)
>>> c
Delayed('unchanged-adc5e307-6e33-45bf-ad73-150b906e921d')
>>> c.dask
{'unchanged-1780fed6-f835-4c31-a86d-50015ae1449a': (<function __main__.unchanged>, 42),
 'unchanged-adc5e307-6e33-45bf-ad73-150b906e921d': (<function __main__.unchanged>, 'unchanged-c3282bc4-bdaa-4148-8509-9155cac83ef0'),
 'unchanged-c3282bc4-bdaa-4148-8509-9155cac83ef0': (<function __main__.unchanged>, 'unchanged-1780fed6-f835-4c31-a86d-50015ae1449a')}
>>> c.compute()
42
Actually Dask *cannot* know that "unchanged()" is the function that makes no transformation on its one parameter. From what it can see, it's just a function that does *something*. And I guess similarly in the proposed syntax, anything other than a plain name after the 'delayed' would still need to create a new delayed object. So it's all an edge case that doesn't make much difference.
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.