There are tens of concrete examples at the link I gave, and hundreds more you can find easily by searching on Dask Delayed. This feels more like looking for something to dispute than seeking understanding.

Here's a concrete example that I wrote last summer. I wanted to write a similar program in a bunch of programming languages as a way to learn those languages. From long ago, I had a Python implementation (which I improved quite a lot through the exercise as well). https://github.com/DavidMertz/LanguagePractice

What the programs do is identify any duplicate files in a filesystem tree (i.e. perhaps among millions of files, often with different names but the same content). The basic idea is that a hash like SHA1 serves as a fingerprint of the contents. However, the main speedup potential is in NOT computing the hash when files are either hard links or soft links to the same underlying inode. Nowadays I/O is more of a hit than CPU cycles, but the concept applies either way. A rough sketch of that inode short-circuit appears just below.

Essentially the same technique is used in all the languages. But in the Haskell case, it is NECESSARY to express this as a deferred computation. I don't want Python to be like Haskell, which was in most ways the most difficult of the languages to work with. However, it would be interesting and expressive to write a Python version based around Dask Delayed... or around a generalized "deferred" construct in Python 3.13, maybe. I'm pretty sure it could be shorter and more readable thereby.

On Wed, Dec 8, 2021, 6:28 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
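A minimal sketch of the inode short-circuit described above (my own illustration for this thread, not the actual code in the LanguagePractice repo; find_duplicates and the dictionary names are hypothetical):

import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    by_inode = {}                  # (device, inode) -> digest already computed
    by_digest = defaultdict(list)  # digest -> every path with that content
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)  # follows symlinks to the target inode
            except OSError:
                continue            # broken link, permission error, etc.
            key = (st.st_dev, st.st_ino)
            if key in by_inode:
                # Hard link or symlink to an inode we already hashed:
                # skip the expensive read entirely.
                digest = by_inode[key]
            else:
                # Whole-file read keeps the sketch short; chunked reads
                # would be kinder to huge files.
                with open(path, "rb") as f:
                    digest = hashlib.sha1(f.read()).hexdigest()
                by_inode[key] = digest
            by_digest[digest].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}

A Dask Delayed variant could wrap the hashing step in delayed, so that digests are only materialized when a comparison actually demands them.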
On 08/12/2021 23:09, David Mertz, Ph.D. wrote:
On Wed, Dec 8, 2021, 5:55 PM Rob Cliffe via Python-ideas
But AIUI (i.e. practically not at all) Dask is about parallel computing, which is not the same thing as deferred evaluation, though doubtless they overlap. Again AIUI, parallel computing is mainly useful when you have multiple cores or multiple computers.
Much of Dask is about parallelism. But Dask Delayed really isn't. I mean, yes it's a good adjunct to actual parallelism, but much of the benefit is independent.
In particular, in Dask Delayed (much as in a thoroughly lazy language like Haskell) you can express a graph of interrelated computations that you might POTENTIALLY perform.
There are many times when expressing those dependencies is useful, even before you know which of them, if any, will actually need to be performed. The site I linked has many more fleshed-out examples, but suppose I have this dataflow relationship:
A -> B -> C -> D -> E
Each of those letters names some expensive computation (or maybe expensive I/O, or both).
In a particular run of our program, we might determine that we need the data created by B. But in that particular run, we never wind up using C, D or E. Of course, a different run, based on different conditions, will actually need E.
In this simplest possible DAG, I've deliberately avoided any possible parallelism. Every step depends entirely on the one before it. But delayed computation can still be useful. Of course, when the DAG has branches, operating on those branches can often be usefully parallelized (but even that's not required for laziness to remain useful).
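For what it's worth, here is how that chain might look with Dask Delayed. A minimal sketch, assuming dask is installed; the step_* functions and the need_only_b flag are hypothetical stand-ins for expensive work and for whatever runtime condition decides how much of the graph is needed:

from dask import delayed  # pip install dask

@delayed
def step_a():
    return "A"            # stand-in for expensive I/O or computation

@delayed
def step_b(a):
    return a + "B"

@delayed
def step_c(b):
    return b + "C"

@delayed
def step_d(c):
    return c + "D"

@delayed
def step_e(d):
    return d + "E"

# Building the graph is cheap; nothing has executed yet.
a = step_a()
b = step_b(a)
c = step_c(b)
d = step_d(c)
e = step_e(d)

need_only_b = True        # hypothetical runtime condition
if need_only_b:
    print(b.compute())    # runs only A and B; C, D, E never execute
else:
    print(e.compute())    # runs the whole chain

The point is that constructing the graph costs almost nothing; work happens only at .compute(), and only for the ancestors of the node actually requested.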
This is all abstract. You give no clue as to what your application is or what it is meant to do. Please, may I refer you to my previous post:
"*Can anyone give examples (in Python pseudo-code perhaps) showing how *deferred evaluation* would be useful for a concrete task? (Solving an equation. Drawing a graph. Analysing a document. Manufacturing a widget. Planning a journey. Firing a missile. Anything! You name it.* )"
David? Anybody??
Best wishes
Rob Cliffe