[Python-ideas] Delayed Execution via Keyword

Abe Dillon abedillon at gmail.com
Thu Mar 2 21:24:15 EST 2017


Another problem I thought of was how this might complicate stack
tracebacks. If you execute the following code:

[1] a = ["hello", 1]
[2] b = "1" + 1
[3] a = "".join(a)
[4] print(a)

The interpreter would build a graph until it hit line 4 and was forced to
evaluate `a`. It would track `a` back to the branch:
[1]->[3]->

and raise an error from line [3] when you would expect line [2] to raise an
error first. I suppose it may be possible to catch any exceptions and force
full evaluation of nodes up to that point to find any preceding errors, but
that sounds like a hairy proposition...
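
A minimal sketch of that ordering problem, assuming a hypothetical `Thunk`
class (my own illustration, not anything proposed in this thread) that stores
each statement as a deferred node tagged with its source line. Forcing `a` for
the print on line 4 only walks the [1]->[3] branch, so the bad expression on
line 2 is never touched:

```python
class Thunk:
    """Hypothetical deferred computation tagged with the line that created it."""

    def __init__(self, line, func, *deps):
        self.line = line
        self.func = func
        self.deps = deps

    def force(self, log):
        # Force the dependencies first, then this node, recording the order.
        args = [dep.force(log) for dep in self.deps]
        log.append(self.line)
        return self.func(*args)


def run_lazily():
    """Build the graph for the four-line example and force only what print needs."""
    a1 = Thunk(1, lambda: ["hello", 1])
    b2 = Thunk(2, lambda: "1" + 1)           # never forced: nothing reads b
    a3 = Thunk(3, lambda a: "".join(a), a1)
    log = []
    try:
        print(a3.force(log))                 # line 4: I/O forces a
    except TypeError as exc:
        return log, str(exc)
    return log, None


forced_lines, message = run_lazily()
# forced_lines lists only lines 1 and 3; the TypeError comes from the join on
# line 3, and the error on line 2 is never raised.
```

Under eager evaluation the same four lines would stop at line 2 instead.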

On Thu, Mar 2, 2017 at 8:10 PM, Abe Dillon <abedillon at gmail.com> wrote:

>> without special casing iteration how do you know that `x1 = next(xs)`
>> depends on the value of `x0`?
>
> `x1 = next(xs)` doesn't depend on the value of `x0`, it depends on the
> state of xs. In order to evaluate `next(xs)` you have to jump into the
> function call and evaluate the relevant expressions within, which will,
> presumably, mean evaluating the value of some place-holder variable or
> something, which will trigger evaluation of preceding, pending expressions
> that modify the value of that place-holder variable, which includes `x0 =
> next(xs)`.
>
> You do have a point, though: if the pending-execution graph has to be
> fine-grained enough to capture all that, then it's doubtful that juggling
> such a construct would save any time over simply executing the code as you
> go. This, I believe, goes beyond iterators and gets at the heart of what
> Josh said:
>
>> What you really do want is functional purity, which is a different concept
>> and one that python as a language can't easily provide no matter what.
>
>
> `next` is not a pure function, because it has side-effects: it changes
> state variables. Even if those side-effects can be tracked by the
> interpreter, they present a challenge. In the example:
> >>> log.warning(expensive_function())
>
> where we want to avoid executing expensive_function(), it's likely that
> the function iterates over some large amount of data. According to the `x1
> = next(xs)` example, that means building a huge pending-execution graph in
> case that function does need to be evaluated, so you can track the iterator
> state changes all the way back to the first iteration before executing.
>
> Perhaps there's some clever trick I'm not thinking of to keep the graph
> small and only expand it as needed. I don't know. Maybe, like Joshua
> Morton's JIT example, you could automatically identify loop patterns and
> collapse them somehow. I guess special casing iteration would help with
> that, though it's difficult to see what that would look like.
>
>
>
> On Thu, Mar 2, 2017 at 7:30 PM, Joseph Jevnik <joejev at gmail.com> wrote:
>
>> without special casing iteration how do you know that `x1 = next(xs)`
>> depends on the value of `x0`? If you assume every operation depends on
>> every other operation then you have implemented an eager evaluation model.
>>
>> On Thu, Mar 2, 2017 at 8:26 PM, Abe Dillon <abedillon at gmail.com> wrote:
>>
>>> I don't think you have to make a special case for iteration.
>>>
>>> When the interpreter hits:
>>> >>> print(x1)
>>>
>>> print falls under I/O, so it forces evaluation of x1, so we back-track
>>> to where x1 is evaluated:
>>> >>> x1 = next(xs)
>>>
>>> And in the next call, we find that we must evaluate the state of the
>>> iterator, so we have to back-track to:
>>> >>> x0 = next(xs)
>>>
>>> Evaluate that, then move forward.
>>>
>>> You essentially keep a graph of pending/unevaluated expressions linked
>>> by their dependencies and evaluate branches of the graph as needed. You
>>> need to evaluate state to navigate conditional branches, and whenever state
>>> is passed outside of the interpreter's scope (like I/O or multi-threading).
>>> I think problems might crop up in parts of the language that are pure
>>> C code. For instance, I don't know if the state variables in a list
>>> iterator are actually visible to the interpreter or if it's implemented in
>>> C that is inscrutable to the interpreter.
>>>
>>>
>>> On Mar 2, 2017 5:54 PM, "Joseph Jevnik" <joejev at gmail.com> wrote:
>>>
>>> Other things that scrutinize an expression are iteration or branching
>>> (with the current evaluation model). If `xs` is a thunk, then `for x in xs`
>>> must scrutinize `xs`. At first this doesn't seem required; however, in
>>> general `next` imposes a data dependency on the next call to `next`. For
>>> example:
>>>
>>> x0 = next(xs)
>>> x1 = next(xs)
>>>
>>> print(x1)
>>> print(x0)
>>>
>>> If `next` doesn't force computation then evaluating `x1` before `x0`
>>> will bind `x1` to `xs[0]` which is not what the eager version of the code
>>> does.
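
A rough sketch of the reordering described above, assuming a hypothetical
`Lazy` wrapper (my own illustration) that runs its function only when forced
and knows nothing about the iterator's state. Forcing `x1` first hands it the
first element:

```python
class Lazy:
    """Hypothetical thunk: runs `func` the first time `force` is called."""

    def __init__(self, func):
        self.func = func
        self.done = False
        self.value = None

    def force(self):
        if not self.done:
            self.value = self.func()
            self.done = True
        return self.value


def eager():
    xs = iter([10, 20])
    x0 = next(xs)
    x1 = next(xs)
    return x1, x0            # the order in which the values are printed


def naive_lazy():
    xs = iter([10, 20])
    x0 = Lazy(lambda: next(xs))
    x1 = Lazy(lambda: next(xs))
    first = x1.force()       # print(x1) is reached first, so x1 is forced first
    second = x0.force()
    return first, second


eager_result = eager()       # (20, 10)
lazy_result = naive_lazy()   # (10, 20): x1 received the first element
```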
>>>
>>> To preserve the current semantics of the language you cannot defer
>>> arbitrary expressions because they may have observable side-effects.
>>> Automatically translating would require knowing ahead of time if a function
>>> can have observable side effects, but that is not possible in Python.
>>> Because it is impossible to tell in the general case, we must rely on the
>>> user to tell us when it is safe to defer an expression.
>>>
>>> On Thu, Mar 2, 2017 at 6:42 PM, Abe Dillon <abedillon at gmail.com> wrote:
>>>
>>>> I'm going to repeat here what I posted in the thread on lazy imports.
>>>> If it's possible for the interpreter to determine when it needs to
>>>> force evaluation of a lazy expression or statement, then why not use them
>>>> everywhere? If that's the case, then why not make everything lazy by
>>>> default? Why not make it a service of the language to lazify your code
>>>> (analogous to garbage collection) so a human doesn't have to worry about
>>>> screwing it up?
>>>>
>>>> There are, AFAIK, three things that *must* force evaluation of lazy
>>>> expressions or statements:
>>>>
>>>> 1) Before the GIL is released, all pending lazy code must be evaluated
>>>> since the current thread can't know what variables another thread will try
>>>> to access (unless there's a way to explicitly label variables as "shared",
>>>> then it will only force evaluation of those).
>>>>
>>>> 2) Branching statements force evaluation of anything required to
>>>> evaluate the conditional clause.
>>>>
>>>> 3) I/O forces evaluation of any involved lazy expressions.
>>>>
>>>>
>>>> On Mon, Feb 20, 2017 at 7:07 PM, Joshua Morton <
>>>> joshua.morton13 at gmail.com> wrote:
>>>>
>>>>> This comes from a bit of a misunderstanding of how an interpreter
>>>>> figures out what needs to be compiled. Most (all?) JIT compilers run code
>>>>> in an interpreted manner, and then compile subsections down to efficient
>>>>> machine code when they notice that the same code path is taken repeatedly,
>>>>> so in pypy something like
>>>>>
>>>>>     x = 0
>>>>>     for i in range(100000):
>>>>>         x += 1
>>>>>
>>>>> would, after 10-20 runs through the loop, get turned into assembly
>>>>> that looks like what you'd write in pure C, instead of the very
>>>>> indirection- and pointer-heavy code that CPython actually executes for
>>>>> such a loop. So the "hot" code is still run.
>>>>>
>>>>> All that said, this is a bit of an off topic discussion and probably
>>>>> shouldn't be on list.
>>>>>
>>>>> What you really do want is functional purity, which is a different
>>>>> concept and one that python as a language can't easily provide no matter
>>>>> what.
>>>>>
>>>>> --Josh
>>>>>
>>>>> On Mon, Feb 20, 2017 at 7:53 PM Abe Dillon <abedillon at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Feb 17, 2017, Steven D'Aprano wrote:
>>>>>>
>>>>>> JIT compilation delays *compiling* the code to run-time. This is a
>>>>>> proposal for delaying *running* the code until such time as some other
>>>>>> piece of code actually needs the result.
>>>>>>
>>>>>>
>>>>>> My thought was that if a compiler is capable of determining what
>>>>>> needs to be compiled just in time, then an interpreter might be able to
>>>>>> determine what expressions need to be evaluated just when their results are
>>>>>> actually used.
>>>>>>
>>>>>> So if you had code that looked like:
>>>>>>
>>>>>> >>> log.debug("data: %s", expensive())
>>>>>>
>>>>>> The interpreter could skip evaluating the expensive function if the
>>>>>> result is never used. It would only evaluate it "just in time". This would
>>>>>> almost certainly require just in time compilation as well, otherwise the
>>>>>> byte code that calls the "log.debug" function would be unaware of the byte
>>>>>> code that implements the function.
>>>>>>
>>>>>> This is probably a pipe dream, though, because the interpreter would
>>>>>> have to be aware of side effects.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 20, 2017 at 5:18 AM, <tritium-list at sdamon.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> > -----Original Message-----
>>>>>> > From: Python-ideas [mailto:python-ideas-bounces+tritium-
>>>>>> > list=sdamon.com at python.org] On Behalf Of Michel Desmoulin
>>>>>> > Sent: Monday, February 20, 2017 3:30 AM
>>>>>> > To: python-ideas at python.org
>>>>>> > Subject: Re: [Python-ideas] Delayed Execution via Keyword
>>>>>> >
>>>>>> > I wrote a blog post about this, and someone asked me if it meant
>>>>>> > allowing lazy imports to make optional imports easier.
>>>>>> >
>>>>>> > Something like:
>>>>>> >
>>>>>> > lazy import foo
>>>>>> > lazy from foo import bar
>>>>>> >
>>>>>> > So now if I don't use the imports, the module is not loaded, which
>>>>>> > could also significantly speed up application start-up time when
>>>>>> > there are a lot of imports.
>>>>>>
>>>>>> Would that not also make a failure to import an error at the time of
>>>>>> executing the imported piece of code rather than at the place of
>>>>>> import? And how would optional imports work if they are not loaded
>>>>>> until use? Right now, optional imports are done by wrapping the import
>>>>>> statement in a try/except; would you not need to do that handling
>>>>>> everywhere the imported object is used instead?
>>>>>>
>>>>>> (I haven't been following the entire thread, and I don't know if this
>>>>>> is a forest/trees argument)
>>>>>>
>>>>>> > _______________________________________________
>>>>>> > Python-ideas mailing list
>>>>>> > Python-ideas at python.org
>>>>>> > https://mail.python.org/mailman/listinfo/python-ideas
>>>>>> > Code of Conduct: http://python.org/psf/codeofconduct/