On Wed, Jun 22, 2022 at 11:22:05PM +0100, Paul Moore wrote:
Hang on, did the PEP change? The version I saw didn't have a compute() method, deferred objects were just evaluated when they were referenced.
You are right, the PEP does not mention a compute() method, but it does use that term. I only used it in my examples to make explicit when the evaluation takes place. My bad.
There's a *huge* difference (in my opinion) between auto-executing deferred expressions, and a syntax for creating *objects* that can be asked to calculate their value. And yes, the latter is extremely close to being nothing more than "a shorter and more composable form of zero-arg lambda", so it needs to be justifiable in comparison to zero-arg lambda (which is why I'm more interested in the composability aspect, building an AST by combining delayed expressions into larger ones).
Agreed, that *huge* difference is what I tried to highlight, because that is where I see holes in the PEP.
Building an AST as you mentioned could fill one of those holes, but how the expressions are iterated over and evaluated is still missing.
Of course, the exact details will depend on the library that could, in theory, use deferred expressions (like PySpark), but I still see non-trivial details to fill in.
- what would be the API of the AST objects that represent the deferred expression(s)?
- how would the "evaluator" iterate over the expressions? Will it have to check that each expression is meaningful to it?
- does the AST simplify the implementation of existing libs that implement deferred methods?
- who is the "evaluator" in the case of expressions that don't share a common "implementation"?
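To make the first two questions concrete, here is a minimal sketch of what such an AST could look like. Everything in it is an assumption for illustration: the `Deferred` class and the `evaluate` function are hypothetical names, not something the PEP defines, and a real evaluator would probably cache results and inspect nodes rather than just call them.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Deferred:
    """Hypothetical AST node: an operation plus its (possibly deferred) operands."""
    op: Callable[..., Any]
    args: Tuple[Any, ...] = ()

    def __add__(self, other):
        # Combining two deferred expressions builds a larger tree
        # instead of computing anything.
        return Deferred(operator.add, (self, other))


def evaluate(node):
    """Hypothetical evaluator: walk the tree depth-first and apply each op."""
    if not isinstance(node, Deferred):
        return node  # a plain value is a leaf
    return node.op(*(evaluate(arg) for arg in node.args))


a = Deferred(lambda: 2)               # nothing runs here
b = Deferred(operator.mul, (a, 10))   # nor here
total = a + b                         # still only a tree
result = evaluate(total)              # evaluation happens only now
```

Even in this toy version the evaluator has to know what a node is and how to visit its operands, which is exactly the API the list above asks about.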
Allow me to expand on the last item:
# some Dask code
df = later dask_df.filter(...)
s = later df.sum()

# some selectq code
d = later sQ.select("div")
c = later d.count()

# now, mix and compute!
(s + c).compute()
I can see how the deferred expressions are linked and how the AST is built, but "who" knows how to execute it... I'm not sure. Will it be Dask, which knows how to plan, optimize and execute the sum() over the partitions of the dataframe, or will it be selectq, which knows how to build an xpath and talk to Selenium? Maybe it will be the Python VM? Maybe all three?
I know that those questions have answers, but I still feel that there are more unknowns (especially about why the PEP would be useful for a given lib).
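One possible answer, sketched under heavy assumptions: each node carries a tag saying which library created it, a subtree with a single owner is handed whole to that library (so it can plan and optimize it), and only the mixed nodes are evaluated by plain Python. The `Node` class, the `owner` tag and the `backends` mapping are all hypothetical, and the "dask" and "selectq" backends below are stand-ins that do not call the real libraries.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Node:
    owner: str                    # hypothetical tag: which library built this node
    op: Callable[..., Any]
    args: Tuple[Any, ...] = ()

    def __add__(self, other):
        # The combining node belongs to no library in particular.
        return Node("python", operator.add, (self, other))


def owners(node):
    """Collect the owner tags that appear anywhere in the tree."""
    if not isinstance(node, Node):
        return set()
    found = {node.owner}
    for arg in node.args:
        found |= owners(arg)
    return found


def plain_eval(node):
    """Fallback: evaluate the tree node by node, with no planning."""
    if not isinstance(node, Node):
        return node
    return node.op(*(plain_eval(arg) for arg in node.args))


def evaluate(node, backends):
    """Give single-owner subtrees to their library; glue the rest in Python."""
    if not isinstance(node, Node):
        return node
    tags = owners(node)
    if len(tags) == 1:
        (tag,) = tags
        return backends.get(tag, plain_eval)(node)
    return node.op(*(evaluate(arg, backends) for arg in node.args))


log = []  # records which stand-in backend was asked to run something


def make_backend(name):
    def run(node):
        log.append(name)
        return plain_eval(node)  # a real library would plan/optimize here
    return run


backends = {"dask": make_backend("dask"), "selectq": make_backend("selectq")}

s = Node("dask", sum, ([1, 2, 3],))          # stand-in for df.sum()
c = Node("selectq", len, (["div", "div"],))  # stand-in for d.count()
result = evaluate(s + c, backends)
```

This only works because every node advertises its owner and someone maintains the `backends` registry, and neither of those is specified anywhere, which is the gap I am pointing at.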