[Python-ideas] Re: Generalized deferred computation in Python

23 Jun 2022

      On Wed, Jun 22, 2022 at 11:22:05PM +0100, Paul Moore wrote:
...
Hang on, did the PEP change? The version I saw didn't have a compute()
method, deferred objects were just evaluated when they were
referenced.
You are right, the PEP does not mention a compute() method, but it uses
that term. I just used it to make explicit when the evaluation takes
place in the examples that I gave. My bad.
...
There's a *huge* difference (in my opinion) between auto-executing
deferred expressions, and a syntax for creating *objects* that can be
asked to calculate their value. And yes, the latter is extremely close
to being nothing more than "a shorter and more composable form of
zero-arg lambda", so it needs to be justifiable in comparison to
zero-arg lambda (which is why I'm more interested in the composability
aspect, building an AST by combining delayed expressions into larger
ones).
Agreed, the *huge* difference is what I tried to highlight, because that
is where I see holes in the PEP.

Building an AST as you mentioned could fill one of those holes, but how
the expressions are iterated and evaluated is still missing.
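
As a rough illustration of the "building an AST" half, here is a minimal
sketch in plain Python. The names Deferred and evaluate are hypothetical;
they come neither from the PEP nor from any library, and the recursive
walk is only one possible answer to the evaluation question.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Deferred:
    """Hypothetical node of a deferred-expression tree (not from the PEP)."""
    func: Callable[..., Any]
    args: Tuple[Any, ...] = ()

    def __add__(self, other):
        # Combining nodes only builds a bigger tree; nothing is computed.
        return Deferred(operator.add, (self, other))

    def __mul__(self, other):
        return Deferred(operator.mul, (self, other))


def evaluate(node):
    """One possible evaluator: a plain recursive walk over the tree."""
    if not isinstance(node, Deferred):
        return node  # a plain value is its own result
    return node.func(*(evaluate(arg) for arg in node.args))


calls = []  # records when the "expensive" leaf actually runs


def expensive():
    calls.append("expensive")
    return 20


a = Deferred(expensive)
b = Deferred(lambda: 1)

expr = (a + b) * 2          # builds the tree only
ran_before = list(calls)    # still empty: nothing has been evaluated

result = evaluate(expr)     # the explicit evaluation step
```

Composition is the easy part here; the sketch says nothing about who
should own evaluate() once the leaves come from different libraries.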

Of course, the exact details will depend on the library that could, in
theory, use deferred expressions (like PySpark), but I still see
non-trivial details to fill in.

  - what would be the API for the objects of the AST that represent the
    deferred expression(s)?
  - how would the "evaluator" of the expressions iterate over them? Will
    the "evaluator" have to check that each of the expressions is
    meaningful for it?
  - does the AST simplify the implementation of existing libs that
    implement deferred methods?
  - who is the "evaluator" in the case of expressions that don't share a
    common "implementation"?

Allow me to expand on the last item:

# some Dask code
df = later dask_df.filter(...)
s = later df.sum()

# some selectq code
d = later sQ.select("div")
c = later d.count()

# now, mix and compute!
(s + c).compute()

I can see how the deferred expressions are linked and how the AST is
built, but "who" knows how to execute it... I'm not sure. Will it be Dask
that knows how to plan, optimize and execute the sum() over the
partitions of the dataframe, or will it be selectq that knows how to
build an xpath and talk with Selenium? Maybe it will be the Python VM?
Maybe all three?
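
To make the question above concrete, here is a small sketch of one
possible answer, "all three". It assumes each node carries a tag naming
the library that owns it. Everything in it is hypothetical: Node, the
backend tags "dask_like" and "selectq_like", and the dispatch rule are
made up for illustration and use no real Dask or selectq API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical deferred node tagged with the library that owns it."""
    func: Callable[..., Any]
    args: Tuple[Any, ...] = ()
    backend: Optional[str] = None  # None means "no single owner"

    def __add__(self, other):
        # A node mixing two different libraries gets no owner at all.
        mine = self.backend
        theirs = other.backend if isinstance(other, Node) else mine
        owner = mine if mine == theirs else None
        return Node(lambda x, y: x + y, (self, other), owner)


log = []  # records which "evaluator" handled which subtree


def walk(node):
    """Stand-in for a library evaluating a subtree it fully owns."""
    if not isinstance(node, Node):
        return node
    return node.func(*(walk(arg) for arg in node.args))


def evaluate(node):
    if not isinstance(node, Node):
        return node
    if node.backend is not None:
        # Single owner: hand over the whole subtree in one piece, so
        # that library could plan and optimize it as it sees fit.
        log.append(node.backend)
        return walk(node)
    # Mixed subtree: plain Python acts as the glue between libraries.
    log.append("python")
    return node.func(*(evaluate(arg) for arg in node.args))


s = Node(lambda: 10, backend="dask_like") + 5     # stays "dask_like"
c = Node(lambda: 3, backend="selectq_like")

mixed = s + c               # no single owner
result = evaluate(mixed)
```

With this rule each library only ever sees subtrees it fully owns, but
no library gets to optimize across the boundary, which is the kind of
detail that is still open.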

I know that those questions have an answer, but I still feel that there
are more unknowns (especially about why the PEP would be useful for some
lib).

Thanks,
Martin.

Martin Di Paola