Delayed Execution via Keyword

Howdy All!

This suggestion is inspired by the question on "Efficient debug logging". I propose a keyword to mark an expression for delayed/lazy execution, for the purposes of standardizing such behavior across the language.

The proposed format is `delayed: <expr>`, i.e.:

    log.info("info is %s", delayed: expensiveFunction())

Unlike `lambda`, which returns a function (so the receiver must be lambda-aware), delayed execution blocks are for all purposes values. The first time the value (rather than the location) is read, or any method on the delayed object is called, the expression is executed and the delayed expression is replaced with the result. (Thus, the delayed expression is only ever evaluated once.)

Ideally:

    a = delayed: 1+2
    b = a
    print(a)  # adds 1 and 2, prints 3
    # a and b are now both just 3
    print(b)  # just prints 3

Mechanically, this would be similar to the following:

    class Delayed:
        def __init__(self, func):
            self.__func = func
            self.__executed = False
            self.__value = None

        def __str__(self):
            if not self.__executed:
                self.__value = self.__func()
                self.__executed = True
            return self.__value.__str__()

    def function_print(value):
        print('function_print')
        print(value)

    def function_return_stuff(value):
        print('function_return_stuff')
        return value

    function_print(function_return_stuff('no_delay'))

    function_print(Delayed(lambda: function_return_stuff('delayed')))

    delayed = Delayed(lambda: function_return_stuff('delayed_object'))
    function_print(delayed)
    function_print(delayed)

Unfortunately, due to https://docs.python.org/3/reference/datamodel.html#special-lookup , this magic Delayed class would need to implement many magic methods, as __getattribute__ is not _always_ called.
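The special-lookup caveat at the end is easy to demonstrate in today's Python. A minimal sketch (hypothetical `Lazy` class, just for illustration): attaching `__len__` to an instance is visible to explicit attribute access, but implicit invocations like `len()` look the method up on the *type*, bypassing `__getattribute__` and the instance dict entirely. That is why a proxy-style Delayed class cannot intercept everything in one place.

```python
# Special methods are looked up on the type, not the instance, for
# implicit invocations -- so a Delayed proxy can't hook them all via
# __getattribute__. (Lazy is a hypothetical stand-in class.)
class Lazy:
    pass

obj = Lazy()
obj.__len__ = lambda: 3       # attached to the instance, not the type

print(obj.__len__())          # 3 -- explicit attribute access works
try:
    len(obj)                  # implicit lookup skips the instance dict
except TypeError as e:
    print(e)                  # object of type 'Lazy' has no len()
```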

I rather like this at first brush!

On Feb 16, 2017 9:25 PM, "Joseph Hackman" <josephhackman@gmail.com> wrote:

You might be interested in https://github.com/llllllllll/lazy_python, which implements the features you describe, but instead of a keyword it uses a decorator.

On Fri, Feb 17, 2017 at 12:24 AM, Joseph Hackman <josephhackman@gmail.com> wrote:

Dask also has a function delayed() that may be used as a decorator and in other ways, like:

    >>> from dask import delayed
    >>> from operator import add, mul
    >>> a = delayed(add)(1, 2)
    >>> b = delayed(mul)(a, 3)
    >>> b
    Delayed('mul-1907f29b-60a4-48af-ba2a-938556555f9b')
    >>> c = b.compute()
    >>> c
    9
    >>> b.dask
    {'add-d49ba000-dd5d-4031-8c37-6514626a3d81': (<function _operator.add>, 1, 2),
     'mul-1907f29b-60a4-48af-ba2a-938556555f9b': (<function _operator.mul>,
      'add-d49ba000-dd5d-4031-8c37-6514626a3d81', 3)}
You *can* do pretty much anything you'd want to using this approach... including the real job of Dask, to identify latent parallelism and execute computations on many cores or many machines in a cluster. But actual syntax to do all of this would be really elegant.

I think for this to be as useful as I'd want, you'd sometimes need to be able to continue delaying computation rather than doing so on every access. I guess as syntax, the `delayed:` construct would work (though I think having no colon would more closely model `yield` and `yield from` and `await` and `async`, which this is kinda-sorta akin to). So for example, in a hypothetical Python 3.7+:

    >>> a = delayed 1 + 2
    >>> b = delayed a * 3
    >>> c = delayed 12/3
    >>> my_lazy_func(b, delayed c)  # evaluates b but not yet c
    >>> b
    9
    >>> delayed c
    <delayed object at 0x123456789>
If you want to do something like Dask... or for Dask itself to be able to use it in some eventual version, you'd need to be able to keep objects from evaluating even while you passed them around. The obvious use is for finding parallelism, but other things like passing callbacks might find this useful too.

Dask delayed objects stay lazy until you explicitly call `.compute()` on them (which does so recursively on every lazy object that might go into the computation). This hypothetical new keyword would have objects evaluate eagerly *unless* you explicitly kept them lazy. But the idea is that the programmer would still get to decide in their code.

On Feb 16, 2017 9:53 PM, "Joseph Jevnik" <joejev@gmail.com> wrote:
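The Dask-style pattern described above, where wrapping a call returns a lazy node and `.compute()` forces it recursively, can be sketched in a few lines of plain Python. This is a hypothetical stand-in, not Dask's actual implementation; real dask.delayed also records a task graph it can schedule across cores or machines.

```python
from operator import add, mul

# Minimal lazy-call node: holds a function and its (possibly lazy) args.
class Lazy:
    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def compute(self):
        # Recursively force any lazy arguments before calling.
        args = [a.compute() if isinstance(a, Lazy) else a
                for a in self.args]
        return self.func(*args)

def delayed(func):
    def wrapper(*args):
        return Lazy(func, *args)
    return wrapper

a = delayed(add)(1, 2)
b = delayed(mul)(a, 3)
print(isinstance(b, Lazy))   # True -- nothing has been computed yet
print(b.compute())           # 9
```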

David, can you elaborate on your example? If we replaced line four with

    >>> x = my_lazy_func(b, delayed c)

what would the value of `x` be, and how would this differ from either

    >>> x = delayed my_lazy_func(b, delayed c)

or

    >>> x = delayed my_lazy_func(b, c)

To put it another way, why does calling my_lazy_func not evaluate c? Or are you implying that it too is some kind of special delayed function that needs to be explicitly computed, like in Dask?

--Josh

On Fri, Feb 17, 2017 at 1:23 AM David Mertz <mertz@gnosis.cx> wrote:

On Thu, Feb 16, 2017 at 10:33 PM, Joshua Morton <joshua.morton13@gmail.com> wrote:
David, can you elaborate on your example?
if we replaced line four with
>>> x = my_lazy_func(b, delayed c)
what would the value of `x` be, and how would this differ from either
The value of the function would be whatever you want, including another delayed object, potentially. Here's a toy function just to show the pattern I have in mind:

    def my_lazy_func(x, y):
        if x == 0:
            # evaluate y if it was lazy, or just return plain value
            return y
        elif x > 0:
            # create a lazy object whether y was lazy or concrete
            # if y *was* lazy, it remains so in delayed expression
            return delayed y + 17
        elif x < -10:
            # evaluate y, if delayed, by virtue of being part of
            # plain non-delayed RHS expression
            z = y * 6
            return z
        else:
            # y never mentioned, so if it was lazy nothing changes
            return x - 22

Whether or not 'c' (called 'y' within the function scope) gets evaluated/computed depends on which branch was taken within the function. Once the delayed computation is done, the object at the delayed address becomes concrete thereafter, to avoid repeated (and potentially stateful or side-effect-causing) evaluation.

One question is whether you'd need to write:

    m = delayed 1 + 2
    n = delayed m * 3

    # This seems ugly and unnecessary (but should work)
    q = delayed m / delayed n

    # This seems better (does not evaluate m or n)
    # everything lazy within delayed expression stays unevaluated
    q2 = delayed m/n

If you ever need to explicitly evaluate a delayed object, it is as simple as putting a name pointing to it on a line by itself:

    f = delayed 1 + 2
    # We want to evaluate f before next op for some reason
    f
    # f is already a concrete value now, before calculating g
    g = f * 7
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

I think maybe the idiomatic pattern should be assignment rather than just a bare name. E.g.:

    f = delayed 1 + 2
    # We want to evaluate f before next op for some reason
    f = f
    # f is already a concrete value now, before calculating g
    g = f * 7

I think if we follow my rule that "everything lazy within a delayed expression stays unevaluated", that needs to apply to function calls too. So in:

    x = delayed my_lazy_func(b, c)

This would not affect the laziness of 'b' or 'c' with that line. It also would not run any flow control within the function, etc. Basically, it's almost like wrapping it in a lambda:

    x = lambda: my_lazy_func(b, c)

Except that when you finally *do* want the value out of 'x' you don't spell that as 'x() + 7' but simply as 'x + 7'. This also means that the above delayed function call would be functionally identical if you spelled it:

    x = delayed my_lazy_func(delayed b, delayed c)

This also means that a 'delayed' object needs to be idempotent. So:

    x = delayed 2+2
    y = delayed x
    z = delayed delayed delayed y

Wrapping more delays around an existing delayed object should probably just keep the same object rather than "doubly delaying" it. If there is some reason to create separate delayed objects that isn't occurring to me, evaluating 'z' would still go through the multiple evaluation levels until it got to a non-delayed value.

On Thu, Feb 16, 2017 at 10:55 PM, David Mertz <mertz@gnosis.cx> wrote:
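The idempotency rule (wrapping an already-delayed object yields the same object) is expressible in today's Python with a hypothetical wrapper class, sketched below. The `__new__` hook is where the "no double delaying" behavior lives; `compute()` also forces through any nested delayed results, matching the "evaluating 'z' would still go through the multiple evaluation levels" fallback.

```python
# Hypothetical idempotent lazy wrapper -- a sketch of the rule above,
# not the proposed syntax itself.
class Delayed:
    def __new__(cls, func):
        if isinstance(func, cls):
            return func              # already delayed: same object back
        self = super().__new__(cls)
        self._func = func
        self._computed = False
        self._value = None
        return self

    def compute(self):
        if not self._computed:
            value = self._func()
            # Force through any nested delayed results as well.
            while isinstance(value, Delayed):
                value = value.compute()
            self._value = value
            self._computed = True
        return self._value

x = Delayed(lambda: 2 + 2)
y = Delayed(x)                       # no "double delaying"
z = Delayed(Delayed(Delayed(y)))
print(z is x)                        # True
print(z.compute())                   # 4
```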

On Thu, Feb 16, 2017 at 11:15 PM, David Mertz <mertz@gnosis.cx> wrote:
This is sort of like how iterators "return self" with 'it = iter(it)'.

In the case of Dask, wrapping more delayed objects creates layers of these lazy objects. But I think it has to, because it's not part of the syntax. Actually, I guess Dask could do graph reduction without actual computation if it wanted to. But this is the current behavior:

    >>> def unchanged(x):
    ...     return x
    >>> a = delayed(unchanged)(42)
    >>> a
    Delayed('unchanged-1780fed6-f835-4c31-a86d-50015ae1449a')
    >>> b = delayed(unchanged)(a)
    >>> c = delayed(unchanged)(b)
    >>> c
    Delayed('unchanged-adc5e307-6e33-45bf-ad73-150b906e921d')
    >>> c.dask
    {'unchanged-1780fed6-f835-4c31-a86d-50015ae1449a':
      (<function __main__.unchanged>, 42),
     'unchanged-adc5e307-6e33-45bf-ad73-150b906e921d':
      (<function __main__.unchanged>, 'unchanged-c3282bc4-bdaa-4148-8509-9155cac83ef0'),
     'unchanged-c3282bc4-bdaa-4148-8509-9155cac83ef0':
      (<function __main__.unchanged>, 'unchanged-1780fed6-f835-4c31-a86d-50015ae1449a')}
    >>> c.compute()
    42

Actually Dask *cannot* know that "unchanged()" is a function that makes no transformation on its one parameter. From what it can see, it's just a function that does *something*. And I guess similarly in the proposed syntax, anything other than a plain name after the 'delayed' would still need to create a new delayed object. So it's all an edge case that doesn't make much difference.

You can let dask "see" into the function by entering it and wrapping all of the operations in `delayed`; this is how daisy[0] builds up large compute graphs. In this case, you could "inline" the identity function, and the delayed object would flow through the function, so the call to identity never makes it into the task graph.

[0] http://daisy-python.readthedocs.io/en/latest/appendix.html#daisy.autodask

On Fri, Feb 17, 2017 at 2:26 AM, David Mertz <mertz@gnosis.cx> wrote:
On Thu, Feb 16, 2017 at 11:15 PM, David Mertz <mertz@gnosis.cx> wrote:
This also means that a 'delayed' object needs to be idempotent. So
x = delayed 2+2
y = delayed x
z = delayed delayed delayed y
Wrapping more delays around an existing delayed object should probably just keep the same object rather than "doubly delaying" it. If there is some reason to create separate delayed objects that isn't occurring to me, evaluating 'z' would still go through the multiple evaluation levels until it got to a non-delayed value.
This is sort of like how iterators "return self" and 'it = iter(it)'.
In the case of Dask, wrapping more delayed objects creates layers of these lazy objects. But I think it has to because it's not part of the syntax. Actually, I guess Dask could do graph reduction without actual computation if it wanted to. But this is the current behavior:
>>> def unchanged(x):
...     return x
>>> a = delayed(unchanged)(42)
>>> a
Delayed('unchanged-1780fed6-f835-4c31-a86d-50015ae1449a')
>>> b = delayed(unchanged)(a)
>>> c = delayed(unchanged)(b)
>>> c
Delayed('unchanged-adc5e307-6e33-45bf-ad73-150b906e921d')
>>> c.dask
{'unchanged-1780fed6-f835-4c31-a86d-50015ae1449a': (<function __main__.unchanged>, 42),
 'unchanged-adc5e307-6e33-45bf-ad73-150b906e921d': (<function __main__.unchanged>, 'unchanged-c3282bc4-bdaa-4148-8509-9155cac83ef0'),
 'unchanged-c3282bc4-bdaa-4148-8509-9155cac83ef0': (<function __main__.unchanged>, 'unchanged-1780fed6-f835-4c31-a86d-50015ae1449a')}
>>> c.compute()
42

I had forgotten about Daisy! It's an interesting project too. The behavior of 'autodask()' is closer to what I'd want in new syntax than is plain dask.delayed(). I'm not sure of all the corners, but I'd definitely love to have it for expressions generally, not only pure functions. On Feb 17, 2017 12:03 AM, "Joseph Jevnik" <joejev@gmail.com> wrote:

Even with the new syntax I would highly discourage delaying a function with observable side effects. It would make reasoning about the behavior of the program very difficult and debugging becomes much harder. On Fri, Feb 17, 2017 at 3:31 AM, David Mertz <mertz@gnosis.cx> wrote:

Agreed. But there might be cases where something occurring at most once, at some unspecified time, is desirable behavior. In general though, I think avoiding side effects should be a programming recommendation, not anything enforced. This model isn't really so different from what we do with asyncio and its "call soon" indeterminate order. On Feb 17, 2017 1:07 AM, "Joseph Jevnik" <joejev@gmail.com> wrote:

Hi all,

If we want this it might be interesting to investigate what the Scheme community has been doing, since they have had this (under the name "promises") for many years. Basically:

    Scheme: (delay expr)  <=>  proposed Python: delayed: expr

The Scheme community has experimented with what they call "auto-forcing", i.e. a promise can be given to any primitive operation and is then forced. However this has not caught on. Possibly for a good reason ;-) (My gut feeling: too much magic. Explicit is better than implicit.)

Note that Racket/PLT Scheme also has "lazy" in addition to "delay". The rationale for this is given in:

"How to add laziness to a strict language without even being odd", Philip Wadler, Walid Taha, David MacQueen
https://www.researchgate.net/publication/2646969_How_to_Add_Laziness_to_a_St...

It would be good to read and consider this before we reinvent the square wheel ;-)

Stephan

2017-02-17 10:14 GMT+01:00 David Mertz <david.mertz@gmail.com>:
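Scheme's delay/force pairing is easy to sketch in plain Python. The names `delay` and `force` below are borrowed from Scheme for illustration, not an existing Python API; a promise is just a memoized zero-argument thunk:

```python
# Sketch of Scheme's (delay expr) / (force p) in plain Python.
# `delay` takes a zero-argument callable standing in for the
# unevaluated expression; `force` evaluates it at most once.
_UNSET = object()

def delay(thunk):
    state = {"value": _UNSET}
    def promise():
        if state["value"] is _UNSET:
            state["value"] = thunk()   # evaluated on first force only
        return state["value"]
    return promise

def force(promise):
    return promise()

calls = []
p = delay(lambda: calls.append(1) or 42)
assert calls == []        # nothing evaluated yet
assert force(p) == 42     # first force runs the thunk
assert force(p) == 42     # second force returns the cached value
assert calls == [1]       # the thunk ran exactly once
```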

On Fri, Feb 17, 2017 at 12:24:53AM -0500, Joseph Hackman wrote:
I propose a keyword to mark an expression for delayed/lazy execution, for the purposes of standardizing such behavior across the language.
The proposed format is: delayed: <expr> i.e. log.info("info is %s", delayed: expensiveFunction())
Keywords are difficult: since by definition they are not backwards compatible, they make it hard for people to write version-independent code, and they will break existing code. Especially something like "delayed": I expect there is lots of code that uses "delayed" as a regular name.

    if status.delayed: ...

A new keyword also means the feature can't be back-ported to older versions.
Unlike 'lambda' which returns a function (so the receiver must be lambda-aware), delayed execution blocks are for all purposes values. The first time the value (rather than location) is read,
What counts as "reading" a value? Based on your example below, I can't tell if passing the object to *any* function is enough to trigger evaluation, or specifically print is the magic that makes it happen.
or any method on the delayed object is called,
I don't think that can work -- it would have to be any attribute access, surely, because Python couldn't tell if the attribute was a method or not until it evaluated the lazy object. Consider:

    spam = delayed: complex_calculation()
    a = spam.thingy

What's `a` at this point? Is it still some sort of lazy object, waiting to be evaluated? If so, how is Python supposed to know if it's a method?

    result = a()
the expression is executed and the delayed expression is replaced with the result. (Thus, the delayed expression is only ever evaluated once).
That's easily done by having the "delayed" keyword cache each expression it sees, but that seems like a bad idea to me:

    spam = delayed: get_random_string()
    eggs = delayed: get_random_string()  # the same expression

    spam.upper()  # convert to a real value

    assert spam == eggs  # always true, as they are the same expression

Worse, suppose module a.py has:

    spam = delayed: calculate(1)

and module b.py has:

    eggs = delayed: calculate(1)

where a.calculate and b.calculate do completely different things. The result you get will depend on which happens to be evaluated first and cached, and would be a nightmare to debug. Truly spooky action-at-a-distance code.

I think it is better to stick to a more straightforward, easily understood and debugged system based on object identity rather than expressions.
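The identity-based alternative is simple to sketch: each delayed instance memoizes its own thunk, so two instances built from identical source text stay independent. Class and method names here are illustrative, not a proposed API:

```python
class Delayed:
    """Each instance caches its own result; identical expression
    text in two instances does not share a cache."""
    _unset = object()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = Delayed._unset

    def force(self):
        if self._value is Delayed._unset:
            self._value = self._thunk()   # runs at most once per instance
        return self._value

calls = []
spam = Delayed(lambda: calls.append("spam") or 1)
eggs = Delayed(lambda: calls.append("eggs") or 2)  # same shape, own cache

assert spam.force() == 1 and spam.force() == 1
assert calls == ["spam"]            # eggs untouched: no cross-expression cache
assert eggs.force() == 2
assert calls == ["spam", "eggs"]
```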
Ideally: a = delayed: 1+2 b = a print(a) #adds 1 and 2, prints 3 # a and b are now both just 3 print(b) #just prints 3
That would work based on object identity too. By the way, that's probably not the best example, because the peephole optimizer will likely have compiled 1+2 as just 3, so you're effectively writing:

    a = delayed: 3

At least, that's what I would want: I would argue strongly against lazy objects somehow defeating the peephole optimizer. If I write:

    a = delayed: complex_calculation(1+2+3, 4.5/3, 'abcd'*3)

what I hope will be compiled is:

    a = delayed: complex_calculation(6, 1.5, 'abcdabcdabcd')

same as it would be now (at least in CPython), apart from the "delayed:" keyword.
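Constant folding is easy to observe from the compiled code object. This check is CPython-specific, since co_consts is an implementation detail:

```python
# CPython folds constant expressions at compile time, so the folded
# results (not the operands) appear in the function's constants.
def f():
    return 1 + 2 + 3

def g():
    return 'abcd' * 3

assert 6 in f.__code__.co_consts               # 1+2+3 folded to 6
assert 'abcdabcdabcd' in g.__code__.co_consts  # 'abcd'*3 folded too
```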
Mechanically, this would be similar to the following:
class Delayed():
    def __init__(self, func):
        self.__func = func
        self.__executed = False
        self.__value = None

    def __str__(self):
        if self.__executed:
            return self.__value.__str__()
        self.__value = self.__func()
        self.__executed = True
        return self.__value.__str__()
So you're suggesting that calling str(delayed_object) is the one way to force evaluation? I have no idea what the following code is supposed to mean.
def function_print(value):
    print('function_print')
    print(value)

def function_return_stuff(value):
    print('function_return_stuff')
    return value

function_print(function_return_stuff('no_delay'))

function_print(Delayed(lambda: function_return_stuff('delayed')))

delayed = Delayed(lambda: function_return_stuff('delayed_object'))
function_print(delayed)
function_print(delayed)
-- Steve

On Fri, Feb 17, 2017 at 10:10 PM, Steven D'Aprano <steve@pearwood.info> wrote:
the expression is executed and the delayed expression is replaced with the result. (Thus, the delayed expression is only ever evaluated once).
That's easily done by having the "delayed" keyword cache each expression it sees, but that seems like a bad idea to me:
spam = delayed: get_random_string() eggs = delayed: get_random_string() # the same expression
spam.upper() # convert to a real value
assert spam == eggs # always true, as they are the same expression
Worse, suppose module a.py has:
spam = delayed: calculate(1)
and module b.py has:
eggs = delayed: calculate(1)
where a.calculate and b.calculate do completely different things. The result you get will depend on which happens to be evaluated first and cached, and would be a nightmare to debug. Truely spooky action-at-a- distance code.
My understanding is that a single 'delayed expression' will be evaluated at most once. It's more like this:

    spam = delayed: get_random_string()
    eggs = spam
    spam.upper()  # At this point, the object becomes a real value
    assert spam is eggs  # always true, as they are the same object

Two instances of "delayed:" will create two distinct delayed expressions.

The big question, though, is what triggers evaluation. AIUI, merely referencing the object doesn't (so "eggs = spam" won't force evaluation), but I'm not sure what does.

Do delayed-expressions have identities or only values? For example:

    rand = delayed: random.randrange(10)
    otherrand = rand
    assert rand is otherrand  # legal?
    randid = id(rand)  # legal?
    print(rand)  # force to concrete value
    assert any(rand is x for x in range(10))  # CPython int caching
    assert randid == id(rand)  # now what?

Alternatively, once the value becomes concrete, the delayed-expression becomes a trampoline/proxy to the actual value. Its identity remains unchanged, but all attribute lookups would be passed on to the other object. That does mean a permanent performance penalty though - particularly if it's doing it all in Python code rather than some sort of quick C bouncer.

ChrisA
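The trampoline/proxy idea can be sketched today with __getattr__: after the first access the wrapper keeps its own identity but forwards attribute lookups to the concrete value. This is a pure-Python sketch with illustrative names, and it shows the caveat from the original post: implicit special-method lookup bypasses __getattr__.

```python
class Proxy:
    """Forces the thunk on first ordinary attribute access, then
    forwards lookups to the concrete value. The Proxy's own identity
    never changes."""
    _unset = object()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = Proxy._unset

    def _force(self):
        if self._value is Proxy._unset:
            self._value = self._thunk()   # evaluated exactly once
        return self._value

    def __getattr__(self, name):
        # Called only for names not found normally, i.e. anything
        # that should be looked up on the concrete value.
        return getattr(self._force(), name)

p = Proxy(lambda: "hello world")
pid = id(p)
assert p.upper() == "HELLO WORLD"   # first access forces evaluation
assert id(p) == pid                 # identity unchanged
# But str() uses type-based special-method lookup, bypassing
# __getattr__, so each dunder would need explicit forwarding:
assert 'Proxy' in str(p)
```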

On 2/17/17, Chris Angelico <rosuav@gmail.com> wrote:
Do delayed-expressions have identities or only values? For example:
rand = delayed: random.randrange(10) otherrand = rand assert rand is otherrand # legal? randid = id(rand) # legal? print(rand) # force to concrete value assert any(rand is x for x in range(10)) # CPython int caching assert randid == id(rand) # now what?
Alternatively, once the value becomes concrete, the delayed-expression becomes a trampoline/proxy to the actual value. Its identity remains unchanged, but all attribute lookups would be passed on to the other object. That does mean a permanent performance penalty though - particularly if it's doing it all in Python code rather than some sort of quick C bouncer.
what about:

    lazy_string = delayed: f"{fnc()}"

could it be possible?

BTW I was also thinking about something like l-strings (aka lazy f-strings) to solve this problem:

    logger.debug("format {expensive()}")

Still not sure if it is a good idea, but with something like this:

    class Lazy_String:
        ''' this is only a quick and dirty test implementation! '''
        def __init__(self, string=None):
            if string is None:
                self.body = "'None'"
            else:
                self.body = compile('f' + repr(string), "<lazy_string>", 'eval')

        def __sub__(self, string):
            self.__init__(string)
            return self

        def __str__(self):
            return eval(self.body)

        def __repr__(self):
            return self.__str__()

    lazy = Lazy_String()

we could use this:

    from lazy_string import lazy as l

    logger.debug(l-"format {expensive()}")  # kind of l-string without new syntax

A few points for clarity:

Yes, I would expect each instance of delayed to result in a new delayed expression, without caching, except for multiple calls to that same delayed expression instance.

Also, I suggested the colon : because, unlike async/await, the following expression is NOT executed at the time the delayed statement is reached. The only other ways to do this in Python that I know of are def and lambda, both of which use colons to highlight this.

As for what triggers execution? I think everything except being on the right side of an assignment. Even identity: if a delayed expression would evaluate to None, then code that checks "is None" should return True. I think this is important to ensure that no code needs to be changed to support this feature.

So for Chris: yes, rand is otherrand, not because the delayed instances are the same, but because the value under them is the same, and the "is" should trigger evaluation. Otherwise code would need to be delayed-aware.

Finally, as for the word "delayed": yes, it could be a symbol or an existing keyword, but I have no suggestions on that front.

-Joseph
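This is the hard part of the proposal: equality can be intercepted with __eq__, but identity cannot be hooked by any magic method, so making "is" force evaluation would need interpreter support rather than a library class. A quick demonstration:

```python
class FakeNone:
    """Pretends to equal None, but cannot fool an identity check."""
    def __eq__(self, other):
        return other is None

    def __hash__(self):
        return hash(None)

x = FakeNone()
assert x == None      # __eq__ can be intercepted
assert x is not None  # `is` compares object identity; there is no hook
```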

On Sat, Feb 18, 2017 at 2:12 AM, Joseph Hackman <josephhackman@gmail.com> wrote:
As for what triggers execution? I think everything except being on the right side of an assignment. Even identity. So if a delayed expression would evaluate to None, then code that checks is None should return true. I think this is important to ensure that no code needs to be changed to support this feature.
So for Chris: yes, rand is otherrand not because the delayed instances are the same, but because the value under them is the same, and the is should trigger evaluation. Otherwise code would need to be delayed aware.
Interesting. Okay. So in effect, these things aren't objects, they're magic constructs that turn into objects the moment you do anything with them, even an identity check. That makes sense. Can you put deferred objects into collections, or will they instantly collapse into concrete ones? ChrisA

Pavol: I think that some sort of magic string that is not really a string but actually contains Python code could work, but it's less elegant.

ChrisA: I am not sure about collections. I think it may be fine not to special-case it: if the act of putting it in the collection reads anything, then it is evaluated, and if it doesn't, it isn't. The ideal design goal would be that all existing code continues to function as if the change wasn't made at all, except that the value is evaluated at a different time.

-Joseph

On Sat, Feb 18, 2017 at 3:29 AM, Joseph Hackman <josephhackman@gmail.com> wrote:
ChrisA: I am not sure about collections. I think it may be fine to not special case it: if the act of putting it in the collection reads anything, then it is evaluated, and if it doesn't it isn't. The ideal design goal for this would be that all existing code continues to function as if the change wasn't made at all, except that the value is evaluated at a different time.
Yeah, I'm just worried that it'll become useless without that. For instance, passing arguments to a function that uses *a,**kw is going to package your thunk into a collection, and that's how (eg) the logging module will process it. It's not going to be easy to have a simple AND useful definition of "this collapses the waveform, that keeps it in a quantum state", but sorting that out is fairly key to the proposal. ChrisA

Agreed. I think this may require some TLC to get right, but posting here for feedback on the idea overall seemed like a good start. As far as I know, the basic list and dict do not inspect what they contain. I.e.:

    d = {}
    d['a'] = delayed: stuff()
    b = d['a']

b would still end up as the thunk, and stuff wouldn't be executed until either d['a'] or b is actually read from.

-Joseph

I'd like to suggest a shorter keyword: `lazy` This isn't an endorsement. I haven't had time to digest how big this change would be. If this is implemented, I'd also like to suggest that perhaps packing and unpacking should be delayed by default and not evaluated until the contents are used. It might save on many pesky edge cases that would evaluate your expression unnecessarily. On Fri, Feb 17, 2017 at 10:43 AM, Joseph Hackman <josephhackman@gmail.com> wrote:

Actually, following from the idea that packing and unpacking variables should be delayed by default, it might make sense to use syntax like:
a = *(2+2)
b = a + 1
Instead of
a = lazy 2+2  # or whatever you want the keyword to be
b = a + 1
That syntax sort of resembles generator expressions; however, I usually like how Python favors actual words over obscure symbol combinations for readability's sake.

On Fri, Feb 17, 2017 at 10:57 AM, Abe Dillon <abedillon@gmail.com> wrote:

I did some quick thinking and a bit of research about some aspects of this proposal:

There are a number of keyword options (delay, defer, lazy, delayed, deferred, etc.). A quick look through GitHub says that of these, "deferred" seems to be the least used, but it still comes up quite a lot (350K times; lazy appears over 2 million). That's unfortunate, but I'm wondering how common async and await were when that was proposed and accepted?

Another potential pitfall I'm realizing now (and apologies if this is unnecessary bikeshedding) is that this may require another keyword. If I understand the proposal correctly, at this point the suggestion is that `delayed <EXPR>` delays the evaluation of EXPR until it is requested in a non-delayed context. That is:

>>> x = delayed 1 + 2
>>> y = delayed 3 + 4
>>> z = delayed x + y  # neither x nor y has been evaluated yet
>>> z
10

This works fine for simple cases, but there might be situations where someone wants to control evaluation a bit more. As soon as we add any kind of state, things get icky:

# assuming Foo is a class with an attribute `value` and a method
# `inc` that increments the value and returns the current value
>>> foo = Foo(value=0)
>>> x = delayed sum([foo.inc(), foo.inc()])
>>> foo.inc()
1
>>> x
5

However, assuming this is a real, more complex system, we might want a way to have one or both of those inc calls eagerly evaluated, which requires a way of signalling that in some way:

>>> foo = Foo(value=0)
>>> x = delayed sum([foo.inc(), eager foo.inc()])
>>> foo.inc()
2
>>> foo.inc()
3

Perhaps luckily, `eager` is even less commonly used than `delayed`, but still, two keywords is an even higher bar.
I guess another alternative would be to require annotating all subexpressions with `delayed`, but then that turns the above into:

>>> foo = Foo(value=0)
>>> x = delayed sum([delayed foo.inc(), delayed foo.inc()])
>>> foo.inc()
1
>>> x
5

At which point `delayed` would need to be a much shorter keyword (the heathen in me says overload `del`).

--Josh

On Fri, Feb 17, 2017 at 12:13 PM Abe Dillon <abedillon@gmail.com> wrote:

Hey! Excellent feedback! In my mind, which word is selected doesn't matter much to me. I think the technical term is 'thunk'? I think delayed is most clear.

I'm not sure eager execution is so common in this framework that it needs its own keyword. Notably, default Python will handle that case:

x = delayed: sum((foo.inc(), foo.inc()))

can become

y = foo.inc()
x = delayed: sum((foo.inc(), y))

It's less sugary, but seems OK to me. How does that seem?

-Joseph
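The workaround above can be emulated today with a plain lambda standing in for the delayed expression (a sketch only; Foo here is the hypothetical counter class from the earlier example, and the explicit call stands in for the implicit forcing the proposal would provide):

```python
class Foo:
    """Hypothetical counter from the earlier example."""
    def __init__(self, value=0):
        self.value = value

    def inc(self):
        # Increment the value and return it.
        self.value += 1
        return self.value

foo = Foo(value=0)
y = foo.inc()                    # eager part: runs now, y == 1
x = lambda: sum((foo.inc(), y))  # delayed part: nothing runs yet
assert foo.inc() == 2            # an interleaved call, as in the example
assert x() == 4                  # forcing now runs foo.inc() -> 3, plus y == 1
assert foo.value == 3
```

Hoisting the eager subexpression out as `y` pins down its evaluation order without needing a second keyword.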
On Feb 17, 2017, at 1:55 PM, Joshua Morton <joshua.morton13@gmail.com> wrote:

I think trying to eager-ify subexpressions is absurdly difficult to do right, and it's also a problem that occurs in other places in Python already, so solving it only for this new thing, which might very well go no further, is a bit odd.

I don't think versions that aren't transparent are much use.
Interesting. Okay. So in effect, these things aren't objects, they're magic constructs that turn into objects the moment you do anything with them, even an identity check. That makes sense.
This seems unfortunate. Why not make these things objects that replace themselves with the evaluated-to object when they're used?
"this collapses the waveform, that keeps it in a quantum state"
That's a bit of a false dichotomy ;)

I suggest that operators on delayed objects defer evaluation iff all of their operands are delayed, with some hopefully-obvious exceptions:

- Function call: delayed_thing() should evaluate delayed_thing.
- Attribute and item access should evaluate the container and key: even if both operands are delayed, in Python we have to assume things are mutable (especially if we don't know what they are yet), so we can't guarantee that delaying the lookup is valid.

Just passing something to a function shouldn't collapse it. That'd make this completely useless.
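A rough sketch of these rules (all names illustrative, not the proposal's API): arithmetic between two delayed operands builds a new delayed expression, while calls and item access force evaluation first. No caching here, to keep the sketch short:

```python
class Delayed:
    """Illustrative delayed expression; _force is explicit here."""
    def __init__(self, func):
        self._func = func

    def _force(self):
        return self._func()

    def __add__(self, other):
        if isinstance(other, Delayed):
            # both operands delayed: stay delayed
            return Delayed(lambda: self._force() + other._force())
        return self._force() + other  # mixed operands: collapse now

    def __call__(self, *args, **kwargs):
        # calling always forces the delayed thing first
        return self._force()(*args, **kwargs)

    def __getitem__(self, key):
        # item access forces the container (and the key, if delayed)
        k = key._force() if isinstance(key, Delayed) else key
        return self._force()[k]

a = Delayed(lambda: 1 + 2)
b = Delayed(lambda: 3 + 4)
c = a + b                 # still Delayed; nothing evaluated yet
assert isinstance(c, Delayed)
assert c._force() == 10
d = Delayed(lambda: {'a': 1})
k = Delayed(lambda: 'a')
assert d[k] == 1          # subscript collapses both operands
f = Delayed(lambda: abs)
assert f(-5) == 5         # a call collapses the callee
```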

Delayed execution and respecting mutable semantics seems like a nightmare. For most indexers we assume hashability, which implies immutability; why can't we also do that here? Also, why do we need to evaluate callables eagerly?

Re the thunk replacing itself with the result instead of memoizing the result and living as an indirection: this is most likely impossible with the current memory model in CPython. Not all objects occupy the same space in memory, so you wouldn't know how much space to allocate for the thunk, and the interpreter has no way to find all the pointers in use, so it cannot just rewrite them to point to the newly allocated result.

On Fri, Feb 17, 2017 at 2:26 PM, Ed Kellett <edk141@gmail.com> wrote:
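The memoizing indirection mentioned above is feasible in today's Python, as a sketch (LazyProxy is an illustrative name; true transparency would need many more special methods on the type, per the special-lookup caveat in the original post):

```python
class LazyProxy:
    """Memoizing indirection: forwards attribute access to the cached result."""
    def __init__(self, factory):
        self._factory = factory
        self._obj = None
        self._done = False

    def _force(self):
        # Build the real object once, then keep serving the cached copy.
        if not self._done:
            self._obj = self._factory()
            self._done = True
        return self._obj

    def __getattr__(self, name):
        # Only called for attributes not found normally, so _factory,
        # _obj and _done resolve without recursion.
        return getattr(self._force(), name)

    def __str__(self):
        # Special methods are looked up on the type, so each one that
        # should be transparent must be defined explicitly, like this.
        return str(self._force())

p = LazyProxy(lambda: 'hello world')
assert p.upper() == 'HELLO WORLD'   # first attribute access forces it
assert str(p) == 'hello world'
```

The proxy never replaces itself in memory; it just keeps forwarding to the cached object, which sidesteps the pointer-rewriting problem at the cost of a permanent level of indirection (and `isinstance`, operators, etc. still see the proxy).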

On Fri, 17 Feb 2017 at 19:38 Joseph Jevnik <joejev@gmail.com> wrote:
Delayed execution and respecting mutable semantics seems like a nightmare. For most indexers we assume hashability which implies immutability, why can't we also do that here? Also, why do we need to evaluate callables eagerly?
Respecting mutability: we just have to, always; we don't know if a delayed thing is hashable until we evaluate it. This thing has implications for existing code (since delayed objects can get anywhere), so it should be careful not to do anything too unpredictable, and I think d[k] meaning "whatever is in d[k] in five minutes' time" is unpredictable. One can always delay: d[k] if it's wanted.

Evaluate calls: because if you don't, there's no way to say "strictly evaluate x() for its side effects".

There is no existing code that uses delayed execution, so we don't need to worry about breaking it. I think it would be much easier to reason about if forcing an expression were always explicit. I am not sure what you mean by the second case; why are you delaying a function if you care about the observable side effect?

On Fri, Feb 17, 2017 at 4:14 PM, Ed Kellett <edk141@gmail.com> wrote:

About the "whatever is in d[k] in five minutes" comment: if I created an explicit closure like `thunk = lambda: d[k]` and then mutated `d` before evaluating the closure, you would have the same issue. I don't think it is that confusing. If you need to know what `d[k]` evaluates to right now, then the order of evaluation is part of the correctness of your program, and you need to sequence execution such that `d` is evaluated before creating that closure.

On Fri, Feb 17, 2017 at 4:17 PM, Joseph Jevnik <joejev@gmail.com> wrote:
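The closure analogy above is runnable as-is: the thunk observes d as it is when forced, not as it was when the closure was created.

```python
d = {'k': 'old'}
k = 'k'
thunk = lambda: d[k]     # explicit thunk over d[k]
d[k] = 'new'             # mutate before forcing
assert thunk() == 'new'  # the thunk observes the mutation

# To pin down "d[k] right now", evaluate eagerly and close over the result:
snapshot = d[k]
thunk2 = lambda: snapshot
d[k] = 'newer'
assert thunk2() == 'new'  # unaffected by the later mutation
```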

On Fri, 17 Feb 2017 at 21:21 Joseph Jevnik <joejev@gmail.com> wrote:
About the "whatever is in d[k] in five minutes" comment: If I created an explicit closure like `thunk = lambda: d[k]` and then mutated `d` before evaluating the closure you would have the same issue. I don't think it is that confusing. If you need to know what `d[k]` evaluates to right now then the order of evaluation is part of the correctness of your program and you need to sequence execution such that `d` is evaluated before creating that closure.
If you create an explicit closure, sure. With delayed expressions, you could explicitly delay d[k], too. If you have an existing d and k, potentially passed to you by somebody else's code, the delayedness of d and k should not inflict arbitrarily-delayed sequencing on your attempt to find out what d[k] is now.

Ed, I'm not seeing this perceived problem either. If we have:

>>> d = delayed {'a': 1, 'b': 2}  # I'm not sure how this is delayed exactly, but sure
>>> k = delayed string.ascii_lowercase[0]
>>> d[k]
1

I'm not sure how the delayedness of any of the subexpressions matters, since evaluating the parent expression will evaluate all the way down.

--Josh

On Fri, Feb 17, 2017 at 4:53 PM Ed Kellett <edk141@gmail.com> wrote:

On Fri, 17 Feb 2017 at 21:58 Joshua Morton <joshua.morton13@gmail.com> wrote:
Ed, I'm not seeing this perceived problem either.
if we have
>>> d = delayed {'a': 1, 'b': 2}  # I'm not sure how this is delayed exactly, but sure
>>> k = delayed string.ascii_lowercase[0]
>>> d[k]
1
My problem with this doesn't have to do with subexpressions. If d[k] for delayed d and k yields a delayed d[k], and then someone mutates d, you get an unexpected result. So I'm suggesting that d[k] for delayed d and k should evaluate d and k instead.

On Fri, 17 Feb 2017 at 21:18 Joseph Jevnik <joejev@gmail.com> wrote:
There is no existing code that uses delayed execution so we don't need to worry about breaking it.
I think you're missing the point here. This thing is transparent—that's sort of the entire point—so you can pass delayed expressions to other things, and it would be better if they didn't have insane behaviour.
I think it would be much easier to reason about if forcing an expression was always explicit. I am not sure what you mean with the second case; why are you delaying a function if you care about the observable side-effect?
You don't delay the function, you delay an expression that evaluates to it. You should be able to pass the result to *any* existing code that expects a function and sometimes calls it, and the function should be called when that happens, rather than evaluated to a delayed object and then discarded.

You should be able to pass the result to *any* existing code that expects a function and sometimes calls it, and the function should be called when that happens, rather than evaluated to a delayed object and then discarded.
I disagree with this claim, because I do not think that you should have side effects and delayed execution anywhere near each other. You only open yourself up to a long list of special cases for when and where things get evaluated.

On Fri, Feb 17, 2017 at 4:49 PM, Ed Kellett <edk141@gmail.com> wrote:

On Fri, 17 Feb 2017 at 21:57 Joseph Jevnik <joejev@gmail.com> wrote:
You should be able to pass the result to *any* existing code that expects a function and sometimes calls it, and the function should be called when that happens, rather than evaluated to a delayed object and then discarded.
I disagree with this claim because I do not think that you should have side effects and delayed execution anywhere near each other.
If Python gets delayed execution, it's going to be near side effects. That's just the reality we live in.
You only open yourself up to a long list of special cases for when and where things get evaluated.
Not really. With the function call example, as long as x() always evaluates x (rather than becoming a delayed call to x), we're all good. Remember that this has nothing to do with the contents of x, which indeed shouldn't use delays if it cares about side effects—what might be delayed here is the expression that finds x.

On Fri, Feb 17, 2017 at 1:55 PM, Joshua Morton <joshua.morton13@gmail.com> wrote:
but I'm wondering how common async and await were when that was proposed and accepted?
Actually, "async" and "await" are backwards compatible due to a clever tokenizer hack. The "async" keyword may only appear in a few places (e.g. async def), and it is treated as a name anywhere else. The "await" keyword may only appear inside an "async def" and is treated as a name everywhere else. Therefore...

>>> async = 1
>>> await = 1

...these are both valid in Python 3.5. This example is helpful when proposing new keywords.

More info: https://www.python.org/dev/peps/pep-0492/#transition-plan
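(With hindsight: async and await became full keywords in Python 3.7, ending that transition.) The stdlib keyword module shows the current reservation status of any candidate word, which is a quick way to check what a new `delayed` keyword would collide with:

```python
# Check which candidate words are already reserved. As of Python 3.7,
# async/await are hard keywords; 'delayed' and 'lazy' are ordinary names.
import keyword

assert keyword.iskeyword('async') and keyword.iskeyword('await')
assert not keyword.iskeyword('delayed')
assert not keyword.iskeyword('lazy')

# Python 3.9+ also tracks soft keywords, reserved only in specific
# grammatical positions -- the same idea as the PEP 492 tokenizer hack:
print(getattr(keyword, 'softkwlist', []))  # e.g. ['_', 'case', 'match']
```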

Couldn't the same thing be true of delayed, if it is always followed by a colon? I.e.:

delayed = 1
x = delayed: slow_function()
print(delayed)  # prints 1

-Joseph
On Feb 17, 2017, at 2:39 PM, Mark E. Haase <mehaase@gmail.com> wrote:

I think it could even be true without the colon, but the colon may cause ambiguity problems with function annotations:

def foo(delayed: delayed: 1 + 2)

is a bit odd, especially if `delayed` is chainable.

--Josh

On Fri, Feb 17, 2017 at 3:32 PM Joseph Hackman <josephhackman@gmail.com> wrote:

That was a problem with the colon that occurred to me: I think it can't be tokenized in function annotations. Plus, I still think the no-colon form looks better. But that's bikeshedding. Also, other words are plausible; I like lazy even more than delayed, I think. Still, I'd love the construct whatever the exact spelling.

On Feb 17, 2017 12:41 PM, "Joshua Morton" <joshua.morton13@gmail.com> wrote:

I think we should use the colon to make the delayed word (or whatever word is selected), unambiguously used in this way (and to prevent any existing code from breaking). On 17 February 2017 at 17:09, David Mertz <mertz@gnosis.cx> wrote:
That was a problem with the colon that occurred to me. I think it can't be tokenized in function annotations.
I don't see any reason for delayed execution and function annotations to mix. i.e. def foo(delayed: bar): pass would define a function that takes one argument, named delayed, of type bar.
Plus I still think the no-colon looks better. But that's bikeshedding. Also other words are plausible. I like lazy even more than delayed, I think. Still, I'd love the construct whatever the exact spelling.
I'm not particularly married to delayed, but I don't know how to properly vet this inside the community. I'm glad you like the proposal! -Joseph

@ Joseph

Function annotations can be arbitrary python expressions; it is completely legal to have something like

    >>> def foo(bar: lambda x: x + 1):
    ...     pass

Why you would want that I can't say, but it is legal. In the same way, `def foo(bar: delayed 1 + 1)` should probably be legal syntax, even if the use is inexplicable. (Also note that the `:` works with lambda because lambda cannot be used as an identifier.) In any case, as David said, bikeshedding.

@ Ed

It's my understanding that d[k] is always d[k], even if d or k or both are delayed. On the other hand, `delayed d[k]` would not be, but you would need to explicitly state that. I think it's worth expanding on the example Joseph made.

I think it makes sense to consider this proposal to be: `x = delayed <EXPR>` is essentially equivalent to `x = lambda: <EXPR>`, except that there will be no need to explicitly call `x()` to get the delayed value; instead it will be evaluated the first time it's needed, transparently. This, for the moment, assumes that this doesn't cause enormous interpreter issues, but I don't think it will. That is, there is no "delayed" object that is created and called, and so as a user you really won't care if an object is "delayed" or not; you'll just use it and it will be there.

Do you understand this proposal differently?

--Josh

On Fri, Feb 17, 2017 at 5:35 PM Joseph Hackman <josephhackman@gmail.com> wrote:
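[Editor's sketch] Josh's reading of the proposal (`x = delayed <EXPR>` behaves like `x = lambda: <EXPR>`, but is evaluated transparently on first use) can be partially approximated in today's Python, for ordinary attribute access at least. The `Delayed` wrapper below is purely illustrative and not part of the proposal; as noted elsewhere in the thread, full transparency would need interpreter support because special-method lookup bypasses `__getattr__`.

```python
# Illustrative sketch only: a wrapper that evaluates its thunk on the
# first attribute lookup and caches the result.  Special methods
# (e.g. __add__, __eq__) bypass __getattr__, so this is not fully
# transparent the way the proposed keyword would be.
class Delayed:
    def __init__(self, thunk):
        self._thunk = thunk
        self._evaluated = False
        self._value = None

    def force(self):
        # evaluate at most once, then cache
        if not self._evaluated:
            self._value = self._thunk()
            self._evaluated = True
        return self._value

    def __getattr__(self, name):
        # called only for attributes not found on the wrapper itself,
        # i.e. anything belonging to the wrapped value
        return getattr(self.force(), name)

calls = []
def expensive():
    calls.append(1)
    return "hello"

x = Delayed(expensive)       # nothing evaluated yet
assert calls == []
assert x.upper() == "HELLO"  # first attribute access forces evaluation
assert x.upper() == "HELLO"  # cached: expensive() ran only once
assert calls == [1]
```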

On 17 February 2017 at 18:13, Joshua Morton <joshua.morton13@gmail.com> wrote:
@ Joseph
Function annotations can be arbitrary python expressions, it is completely legal to have something like
>>> def foo(bar: lambda x: x + 1): ... pass
Why you would want that I can't say, but it is legal. In the same way, `def foo(bar: delayed 1 + 1)` should probably be legal syntax, even if the use is inexplicable. (also note that the `:` works with lambda because lambda cannot be used as an identifier). In any case, as David said, bikeshedding.
Sorry for lack of clarity: I see that it is legal for lambda; I suggest that the value of extending this to delayed: is not worth the cost of potentially being backwards-incompatible. I say that because if delayed: were to work in function definitions, whenever the definition was evaluated, the delayed expression would be as well. (The same is true for if, for, and while.) This would be different if the delayed was inside a function call inside a function definition (but in that case there would be no collision).

On Fri, 17 Feb 2017 at 23:14 Joshua Morton <joshua.morton13@gmail.com> wrote:
@ Ed
Its my understanding that d[k] is always d[k], even if d or k or both are delayed. On the other hand, `delayed d[k]` would not be, but you would need to explicitly state that. I think its worth expanding on the example Joseph made.
I think it makes sense to consider this proposal to be `x = delayed <EXPR>` is essentially equivalent to `x = lambda: <EXPR>`, except that there will be no need to explicitly call `x()` to get the delayed value, instead it will be evaluated the first time its needed, transparently. This, for the moment, assumes that this doesn't cause enormous interpreter issues, but I don't think it will. That is, there is no "delayed" object that is created and called, and so as a user you really won't care if an object is "delayed" or not, you'll just use it and it will be there.
Do you understand this proposal differently?
--Josh
Chris mentioned something about it being difficult to decide what evaluates a delayed thing, and what maintains it. This tangent started with my suggesting that operators should maintain delayed-ness iff all their operands are delayed, with ., [] and () as exceptions. That is, I'm suggesting that d[k] always evaluates d and k, but a + b might defer evaluation if a and b are both delayed already. Roughly, I guess my rationale is something like "let operators combine multiple delayed objects into one, unless that would break things": at least by convention, operators that aren't ., [] or () don't have behaviour that would be broken.

Ed
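[Editor's sketch] Ed's rule (operators keep delayed-ness when all operands are delayed, with `.`, `[]` and `()` as forcing exceptions) can be sketched with a hypothetical wrapper class; `Delayed` and `force` are made-up names for illustration, not anything proposed in the thread.

```python
# Hypothetical sketch of Ed's rule: '+' on two delayed operands builds a
# new delayed expression, while '[]' (an exception) forces evaluation.
class Delayed:
    def __init__(self, thunk):
        self.thunk = thunk
        self.done = False
        self.value = None

    def force(self):
        if not self.done:
            self.value = self.thunk()
            self.done = True
        return self.value

    def __add__(self, other):
        if isinstance(other, Delayed):
            # both operands delayed: the sum stays delayed
            return Delayed(lambda: self.force() + other.force())
        return self.force() + other  # mixed operands: evaluate now

    def __getitem__(self, key):
        # '[]' always evaluates, per the proposed exception list
        return self.force()[key]

log = []
a = Delayed(lambda: (log.append('a'), 10)[1])
b = Delayed(lambda: (log.append('b'), 32)[1])
c = a + b                # still delayed: nothing has run yet
assert log == []
assert c.force() == 42   # forcing c runs both operands
assert log == ['a', 'b']

d = Delayed(lambda: {'k': 99})
assert d['k'] == 99      # indexing forces evaluation immediately
```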

On Fri, Feb 17, 2017 at 2:35 PM, Joseph Hackman <josephhackman@gmail.com> wrote:
I think we should use the colon to make the delayed word (or whatever word is selected), unambiguously used in this way (and to prevent any existing code from breaking).
On 17 February 2017 at 17:09, David Mertz <mertz@gnosis.cx> wrote:
That was a problem with the colon that occurred to me. I think it can't be tokenized in function annotations.
I don't see any reason for delayed execution and function annotations to mix. i.e. def foo(delayed: bar): pass would define a function that takes one argument, named delayed, of type bar.
I still think the colon is ugly and inconsistent with other Python uses. I know you are trying for analogy with lambda (which is related, yes). But the analogies with yield, yield from, async, and await feel much stronger to me. Also, 'lambda' *requires* the colon since it might take arguments, and that is necessary to tell when they end.[*] 'delayed', like those other words I mention, has no such need.

That said, I think you are right that it makes no sense to declare a function signature with 'delayed' (or 'lazy', 'deferred', whatever word). Calling it definitely! This feels important:

    x = foo(delayed very_complex_computation())

But in the definition signature it feels nonsensical, I agree. However, that still doesn't answer other uses of the colon:

    {delayed: 17}  # No idea if this is a set of one delayed object or a dictionary
    lambda delayed: delayed: 17  # Just don't know where to start with this

All these problems simply go away if we drop the colon.

[*] i.e. what would this colon-free lambda mean: 'lambda a, b, c'? A function of no arguments returning a tuple? A function of three arguments?

-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

I'm not married to the colon. Does anyone else see any issue with dropping it? On 18 February 2017 at 00:27, David Mertz <mertz@gnosis.cx> wrote:

Using "delayed" in function signatures: On Fri, Feb 17, 2017 at 09:27:35PM -0800, David Mertz wrote:
That said, I think you are right that it makes no sense to declare a function signature with 'delayed' (or 'lazy', 'deferred', whatever word).
It makes perfect sense! That gives us function defaults which are evaluated only the first time they are needed, instead of when the function is defined.

    def function(spam, eggs=delayed nth_prime(10**9)):
        ...

would be roughly equivalent to:

    _CACHED_DEFAULT = None

    def function(spam, eggs=None):
        global _CACHED_DEFAULT
        if eggs is None:
            if _CACHED_DEFAULT is None:
                _CACHED_DEFAULT = nth_prime(10**9)
            eggs = _CACHED_DEFAULT
        ...

I've done this in real life, e.g. to set up an expensive lookup table. The caller might provide their own, but if not, the function has its own default, where I want to delay generating the default until it is actually needed.

-- Steve
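[Editor's sketch] Steven's cached-default pattern can be made runnable with a cheap stand-in for the expensive call. `nth_prime` below is a dummy, and `_MISSING` is a sentinel added for this sketch (an assumption, not from the thread) so the cache state is explicit.

```python
# Runnable sketch of the cached-default pattern above.
calls = []

def nth_prime(n):
    calls.append(n)          # record that the expensive call ran
    return 15485863          # dummy value; the real call would be slow

_MISSING = object()
_cached_default = _MISSING

def function(spam, eggs=None):
    global _cached_default
    if eggs is None:
        if _cached_default is _MISSING:
            _cached_default = nth_prime(10**9)
        eggs = _cached_default
    return spam, eggs

assert function(1, 2) == (1, 2)
assert calls == []                     # default never computed
assert function(1) == (1, 15485863)    # computed on first use
assert function(1) == (1, 15485863)    # reused from the cache
assert calls == [10**9]                # ...so it ran exactly once
```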

Couldn't the same thing be true of delayed if it is always followed by a colon?
No. Because there are other reasons you'd follow the variable `delayed` with a colon:
    delayed = 1
    d = {delayed: "oops!"}
My earlier proposal (using unpacking syntax) doesn't work for the same reason. On Fri, Feb 17, 2017 at 2:31 PM, Joseph Hackman <josephhackman@gmail.com> wrote:

Abe-

You are correct. However I think it may still be salvageable.

In your code example, you could be either making a dict with a key of 1, or a set of a delayed object. But there's no reason to build a set of a delayed object, because hashing it would immediately un-delay it. Similarly, I am not sure delayed should be allowed inside of function headers. So we could say that dictionary keys and sets shouldn't be allowed to use the delayed keyword. Same with function headers. Are there any other collisions?

-Joseph

On Feb 17, 2017, at 3:41 PM, Abe Dillon <abedillon@gmail.com> wrote:

I'm fairly novice, so I could be way off base here, but it seems like the inevitable conclusion to this problem is something like JIT compilation, right? (Admittedly, I know very little about JIT compilation.)

Python seems to be accumulating a lot of different approaches to achieving very similar things: asynchronous and/or lazy execution. We have generators, futures, asyncio, async/await, and probably more that I'm not thinking of. It seems like it should be possible for the interpreter to determine when an expression absolutely *must* be evaluated in many cases. If I write:
log.debug("data = %s", some_expensive_function())
Then it should be possible for Python to put off evaluating that function, or even building a full argument tuple, if the log level is higher than debug. I know code with side-effects, especially I/O-related side-effects, would be difficult or impossible to manage within that context (the interpreter wouldn't be able to know that a write to a file has to occur before a read from that file, for instance). Maybe it would make more sense to mark such side-effects, to tell the interpreter "you must evaluate all deferred expressions before executing this because it changes stuff that Python can't keep track of", instead of marking code that the compiler can optimize. My gut reaction says this is not the answer because it emphasizes optimization over correct behavior, but I wanted to share the thought.

On Fri, Feb 17, 2017 at 3:30 PM, Joseph Hackman <josephhackman@gmail.com> wrote:

On Fri, Feb 17, 2017 at 04:02:01PM -0600, Abe Dillon wrote:
I'm fairly novice, so I could be way off base here, but it seems like the inevitable conclusion to this problem is something like JIT compilation, right? (admittedly, I know very little about JIT compilation)
No. JIT compilation delays *compiling* the code to run-time. This is a proposal for delaying *running* the code until such time as some other piece of code actually needs the result.

An example might help. Suppose we want to get the one millionth prime number, a task which is of moderate difficulty and may take a while:

    print("Start")
    result = get_nth_prime(10**6)
    print("Done")
    print(result)

On my computer, using a pure-Python implementation, it takes about 11 seconds to find the millionth prime 15485863, so there'll be a delay of 11 seconds between printing Start and Done, but printing the result is instantaneous. That's true regardless of when and how the code is compiled. (Where a JIT compiler is useful is that it may be possible to use runtime information available to the interpreter to compile all or some of the Python code to efficient machine code, allowing the function to run faster. That's how PyPy works.)

If we make the code *delayed* then the situation is different:

    print("Start")
    result = delayed: get_nth_prime(10**6)  # I dislike this syntax
    print("Done")
    print(result)

Now Start and Done are printed virtually instantaneously, but there is an 11 second delay *after* Done is printed, when the result is reified (made real; the calculation is actually performed) and printed.
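[Editor's sketch] The ordering Steven describes can be reproduced today by spelling the delayed expression as an explicit thunk. `get_nth_prime` below is a fast stand-in (an assumption of this sketch), so only the order of events, not the 11-second pause, is demonstrated.

```python
# Demonstrates that no work happens until the thunk is called,
# i.e. the computation is "reified" after "Done" is printed.
events = []

def get_nth_prime(n):
    events.append('computed')
    return 15485863          # stand-in; the real call would be slow

print("Start")
result = lambda: get_nth_prime(10**6)  # 'delayed:' spelled as a thunk
print("Done")                          # printed before any real work
assert events == []
value = result()                       # reified here, after "Done"
assert events == ['computed']
assert value == 15485863
```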
Python seems to be accumulating a lot of different approaches to achieving very similar things: asynchronous and/or lazy execution. We have generators, futures, asyncio, async/await, and probably more that I'm not thinking of. It seems like it should be possible for the interpreter to determine when an expression absolutely *must* be evaluated in many cases.
If the people debating this proposal cannot even agree on when the expression must be evaluated, how could the interpreter do it?
I know code with side-effects, especially I/O related side-effects would be difficult or impossible to manage within that context (the interpreter wouldn't be able to know that a write to a file has to occur before a read from that file for instance.
I think side-effects is a red herring. The obvious rule is: side-effects occur when the delayed thunk is reified. If you care about the actual timing of the side-effects, then don't use delayed evaluation. If you don't care, then who cares if the side-effect is delayed? -- Steve

On 2/18/17, Steven D'Aprano <steve@pearwood.info> wrote: Sorry, Steve, that I am probably using your words too much out of context! I just want to reuse your examples to analyze whether the proposed "delayed execution" is really necessary. Thanks for them! :)
    print("Start")
    result = delayed: get_nth_prime(10**6)  # I dislike this syntax
    print("Done")
    print(result)
If we change the "delayed" keyword to the "lambda" keyword then we just need to explicitly say when to evaluate (which I am not sure is a bad thing). On 2/18/17, Steven D'Aprano <steve@pearwood.info> wrote:
The caching means that:

    spam = delayed: calculate(1)
    eggs = spam
eggs == spam would be true, and calculate would have only been called once, not twice.
If we really need something like this then we could use the "def" keyword (for delayed execution) and explicit caching (for example with functools.lru_cache):

    print("Start")

    @functools.lru_cache(1)
    def result():
        return get_nth_prime(10**6)  # this code is delayed

    print("Done")
    print(result())

    print("Start 2")
    print(result())
    result2 = result
    print(result() == result2())
    print("Done 2")
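[Editor's sketch] A runnable version of this lru_cache alternative, with a stand-in `get_nth_prime` (an assumption of this sketch, so it finishes instantly) showing that the body runs once, on first call, not at definition time.

```python
import functools

calls = []

def get_nth_prime(n):
    calls.append(n)
    return 15485863          # stand-in for the slow computation

@functools.lru_cache(1)
def result():
    return get_nth_prime(10**6)  # delayed until result() is called

assert calls == []               # nothing computed at definition time
assert result() == 15485863      # computed on first call
result2 = result
assert result() == result2()     # cached: still only one computation
assert calls == [10**6]
```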

On 17 February 2017 at 06:10, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Feb 17, 2017 at 12:24:53AM -0500, Joseph Hackman wrote:
I propose a keyword to mark an expression for delayed/lazy execution, for the purposes of standardizing such behavior across the language.
The proposed format is:

    delayed: <expr>

i.e.

    log.info("info is %s", delayed: expensiveFunction())
Keywords are difficult: since by definition they are not backwards compatible, they make it hard for people to write version independent code, and will break people's code. Especially something like "delayed", I expect that there is lots of code that used "delayed" as a regular name.
if status.delayed: ...
I think it would be key, like async/await, to narrowly define the scope in which the word delayed functions as a keyword. There is enough information for the compiler to know that you don't mean the delayed keyword there, because: 1. it's immediately after a dot, but more importantly, 2. it's in a bare 'if', so there's no way the execution could be delayed.

    delayed = False
    if delayed:

is still protected by #2. In a case where delayed would make sense, it is also unambiguous:

    if either_or(True, delayed: expensive_function()):

is clearly using the delayed keyword, rather than the delayed defined as False above. (Notably, the built-in 'and' and 'or' shouldn't use delayed:, as their short-circuiting logic is already well defined.) So, in short, in an if, for or while, the delayed keyword is only used if it is inside a function call (or something like that).
A new keyword means it can't be back-ported to older versions, and will break code.
async and await both work fine, for the reasons listed above. I'll admit there may be more nuance required here, but it should be both possible, and fairly intuitive based on when people would be using delayed execution.
Unlike 'lambda' which returns a function (so the receiver must be lambda-aware), delayed execution blocks are for all purposes values. The first time the value (rather than location) is read,
What counts as "reading" a value? Based on your example below, I can't tell if passing the object to *any* function is enough to trigger evaluation, or specifically print is the magic that makes it happen.
So far I'm going with pretty much anything that isn't being the right-hand side of an assignment. So coercion to different types, hashing (for use as a key in a dict or set), __repr__, etc. would all be covered, as well as identity and comparisons. I.e.:

    def expensive_function(x, y):
        if x and y is not None:
            print('yippie skippy')

    expensive_function(True, delayed: evaluates_to_none())

The idea put forth here would cover this, by evaluating the delayed expression in order to perform the 'is' comparison.
or any method on the delayed object is called,
I don't think that can work -- it would have to be any attribute access, surely, because Python couldn't tell if the attribute was a method or not until it evaluated the lazy object. Consider:
    spam = delayed: complex_calculation()
    a = spam.thingy
Since spam.thingy is an access on spam, it would have been evaluated before 'thingy' was read.
What's `a` at this point? Is is still some sort of lazy object, waiting to be evaluated? If so, how is Python supposed to know if its a method?
result = a()
the expression is executed and the delayed expression is replaced with the result. (Thus, the delayed expression is only every evaluated once).
That's easily done by having the "delayed" keyword cache each expression it sees, but that seems like a bad idea to me:
    spam = delayed: get_random_string()
    eggs = delayed: get_random_string()  # the same expression
spam.upper() # convert to a real value
assert spam == eggs # always true, as they are the same expression
Since spam and eggs are two different instances of delayed expression, each one would be evaluated separately when they are read from (as operands for the equals operator). So no, even without the spam.upper(), they would not match.
Worse, suppose module a.py has:
spam = delayed: calculate(1)
and module b.py has:
eggs = delayed: calculate(1)
where a.calculate and b.calculate do completely different things. The result you get will depend on which happens to be evaluated first and cached, and would be a nightmare to debug. Truely spooky action-at-a- distance code.
I think it is better to stick to a more straight-forward, easily understood and debugged system based on object identity rather than expressions.
The caching means that:

    spam = delayed: calculate(1)
    eggs = spam

eggs == spam would be true, and calculate would have only been called once, not twice.
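[Editor's sketch] The identity-based semantics can be sketched with an explicit memoised thunk (`Thunk` and `calculate` are made-up names for illustration): aliases share one cached evaluation, while two thunks built from the same expression text evaluate independently.

```python
calls = []

def calculate(n):
    calls.append(n)
    return n * 10

class Thunk:
    def __init__(self, fn):
        self.fn = fn
        self.done = False
        self.value = None

    def force(self):
        # evaluate once per *object*, not per expression text
        if not self.done:
            self.value = self.fn()
            self.done = True
        return self.value

spam = Thunk(lambda: calculate(1))
eggs = spam                            # same object, shared cache
assert eggs.force() == spam.force() == 10
assert calls == [1]                    # one evaluation despite two reads

other = Thunk(lambda: calculate(1))    # same expression, new instance
assert other.force() == 10
assert calls == [1, 1]                 # evaluated again: no text-based cache
```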
Ideally: a = delayed: 1+2 b = a print(a) #adds 1 and 2, prints 3 # a and b are now both just 3 print(b) #just prints 3
That would work based on object identity too.
By the way, that's probably not the best example, because the peephole optimizer will likely have compiled 1+2 as just 3, so you're effectively writing:
a = delayed: 3
At least, that's what I would want: I would argue strongly against lazy objects somehow defeating the peephole optimizer. If I write:
a = delayed: complex_calculation(1+2+3, 4.5/3, 'abcd'*3)
what I hope will be compiled is:
    a = delayed: complex_calculation(6, 1.5, 'abcdabcdabcd')

same as it would be now (at least in CPython), apart from the "delayed:" keyword.
I'm not sure what this is trying to do, so it's hard for me to weigh in. It's totally fine if delayed:1+2 is exactly the same as delayed:3 and/or 3 itself.
Mechanically, this would be similar to the following:
    class Delayed():
        def __init__(self, func):
            self.__func = func
            self.__executed = False
            self.__value = None

        def __str__(self):
            if self.__executed:
                return self.__value.__str__()
            self.__value = self.__func()
            self.__executed = True
            return self.__value.__str__()
So you're suggesting that calling str(delayed_object) is the one way to force evaluation?
Not at all, please see above.
I have no idea what the following code is supposed to mean.
    def function_print(value):
        print('function_print')
        print(value)

    def function_return_stuff(value):
        print('function_return_stuff')
        return value

    function_print(function_return_stuff('no_delay'))

    function_print(Delayed(lambda: function_return_stuff('delayed')))

    delayed = Delayed(lambda: function_return_stuff('delayed_object'))
    function_print(delayed)
    function_print(delayed)
If you run this code block, it will demonstrate a number of different orders of execution, indicating that the delayed execution does function as expected.
-- Steve

On Fri, Feb 17, 2017 at 06:06:26PM -0500, Joseph Hackman wrote: [...]
I think it would be key, like async/await, to narrowly define the scope in which the word delayed functions as a keyword.
The PEP makes it clear that's just a transition phase: they will be turned into proper keywords in Python 3.7. https://www.python.org/dev/peps/pep-0492/#id80

Python has had "pseudo-keywords" in the past, like "as":

    [steve@ando ~]$ python2.5 -c "import math as as; print as"
    <module 'math' from '/usr/local/lib/python2.5/lib-dynload/math.so'>

and it is my understanding that the core developers dislike this sort of thing. As do I. You shouldn't count on getting the same special treatment as async/await. Maybe you will, maybe you won't.
A new keyword means it can't be back-ported to older versions, and will break code.
async and await both work fine, for the reasons listed above.
You're missing the point: code that uses async and await, whether as pseudo-keywords or actual keywords, cannot easily be backported to Python 3.4 or older. If Python introduces a new built-in, say Aardvark, then it can be back-ported:

    try:
        Aardvark
    except NameError:
        from backport import Aardvark

No such thing is possible for new syntax. So that counts as a disadvantage of new syntax. Are we positive that there *must* be new syntax to solve this problem? (I think probably so, but it still counts as a disadvantage: that means that the usefulness is reduced.)
Unlike 'lambda' which returns a function (so the receiver must be lambda-aware), delayed execution blocks are for all purposes values. The first time the value (rather than location) is read,
What counts as "reading" a value? Based on your example below, I can't tell if passing the object to *any* function is enough to trigger evaluation, or specifically print is the magic that makes it happen.
So far I'm going with pretty much anything other than being the right-hand side of an assignment. So coercion to different types, hashing (for use as a key in a dict or set), __repr__, etc. would all be covered, as well as identity and comparisons. i.e.: [...]
That will make it pretty much impossible to tell whether something is a delayed "thunk" or not, since *any* attempt to inspect it in any way will cause it to reify. Maybe that's what we want.
That's easily done by having the "delayed" keyword cache each expression it sees, but that seems like a bad idea to me:
spam = delayed: get_random_string()
eggs = delayed: get_random_string()  # the same expression
spam.upper() # convert to a real value
assert spam == eggs # always true, as they are the same expression
Since spam and eggs are two different instances of delayed expression, each one would be evaluated separately when they are read from (as operands for the equals operator). So no, even without the spam.upper(), they would not match.
Earlier we talked about delayed *expressions* always generating the same value, now you're talking about *instances* rather than expressions. It makes sense to keep the standard Python object semantics, rather than have the value of a delayed thunk cached by the textual expression that generated it.
I think it is better to stick to a more straight-forward, easily understood and debugged system based on object identity rather than expressions.
The caching means that:

spam = delayed: calculate(1)
eggs = spam
eggs == spam would be true, and calculate would have only been called once, not twice.
That's not caching, that's simple identity. That's how assignment works in Python, delayed calculation or not. -- Steve
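Steve's distinction between per-expression caching and plain object identity can be sketched in today's Python with a hypothetical Delayed thunk class (all names here are illustrative, not part of any proposal):

```python
import itertools

counter = itertools.count()

class Delayed:
    """Hypothetical per-instance thunk: evaluated at most once, on first use."""
    _UNSET = object()

    def __init__(self, func):
        self._func = func
        self._value = Delayed._UNSET

    def force(self):
        # Evaluate the expression the first time; cache on the instance.
        if self._value is Delayed._UNSET:
            self._value = self._func()
        return self._value

# Two instances of the textually identical expression evaluate independently:
spam = Delayed(lambda: next(counter))
eggs = Delayed(lambda: next(counter))
assert spam.force() != eggs.force()  # no caching keyed on expression text

# Plain assignment shares one instance, so the thunk runs only once:
ham = Delayed(lambda: next(counter))
alias = ham
assert ham.force() == alias.force()
```

This matches Steve's point: `eggs = spam` sharing a value is just ordinary assignment semantics, not caching.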

On 17 February 2017 at 20:23, Steven D'Aprano <steve@pearwood.info> wrote:
I think it would be key, like async/await, to narrowly define the scope in which the word delayed functions as a keyword.
The PEP makes it clear that's just a transition phase: they will be turned into proper keywords in Python 3.7.
https://www.python.org/dev/peps/pep-0492/#id80
Python has had "pseudo-keywords" in the past, like "as":
[steve@ando ~]$ python2.5 -c "import math as as; print as"
<module 'math' from '/usr/local/lib/python2.5/lib-dynload/math.so'>
and it is my understanding that the core developers dislike this sort of thing. As do I. You shouldn't count on getting the same special treatment as async/await. Maybe you will, maybe you won't.
Very well put! Do you have any suggestions for doing something in the same vein? I think there's been a [...]
That will make it pretty much impossible to tell whether something is a delayed "thunk" or not, since *any* attempt to inspect it in any way will cause it to reify.
Maybe that's what we want.
In my mind, this is a plus. The only way to determine if something is delayed would be something that doesn't apply to anything else, so code never needs to be aware of delayed.
Earlier we talked about delayed *expressions* always generating the same value, now you're talking about *instances* rather than expressions. It makes sense to keep the standard Python object semantics, rather than have the value of a delayed thunk cached by the textual expression that generated it.
You are totally right. I agree that the nomenclature is important, and I think we're on the same page. [...] Steve- I really appreciate the thoughtful feedback! Please let me know if you have suggestions; I don't expect the idea to be acceptable out-of-the-gate. :) -Joseph

On Fri, Feb 17, 2017 at 5:23 PM, Steven D'Aprano <steve@pearwood.info> wrote:
try:
    Aardvark
except NameError:
    from backport import Aardvark
No such thing is possible for new syntax. So that counts as a disadvantage of new syntax. Are we positive that there *must* be new syntax to solve this problem?
I agree it counts as a disadvantage. But Dask and lazy, and even just using lambdas as "thunks" push what we can do as far as we can without syntax. Those will always require a `obj.compute()` or `obj()` or `eval(obj)` or something else like that to force the "thunk" to concretize.
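The "explicit concretization" limitation David describes can be seen in a minimal sketch (the `expensive_lookup` function is hypothetical, standing in for any costly computation):

```python
# Today's best approximation without new syntax: a zero-argument
# lambda as a "thunk". The receiver must know to call it --
# nothing about the deferral is transparent.
def expensive_lookup():
    print("computing...")
    return 42

thunk = lambda: expensive_lookup()  # nothing computed yet

# ... later, the consumer must explicitly force the thunk:
value = thunk()  # prints "computing...", returns 42
assert value == 42
```

The proposal is essentially about removing that explicit `thunk()` call from the consumer's side.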
I think side-effects is a red herring. The obvious rule is: side-effects occur when the delayed thunk is reified. If you care about the actual timing of the side-effects, then don't use delayed evaluation. If you don't care, then who cares if the side-effect is delayed?
Exactly! The same rule applies when writing any computational function, too. If you worry about whether they are pure, don't have side effects. Or, if you don't care too much about the side effects (for example, when or whether they happen specifically), that's fine: accept it.
So far I'm going with pretty much anything other than being the right-hand side of an assignment. So coercion to different types, hashing (for use as a key in a dict or set), __repr__, etc. would all be covered, as well as identity and comparisons. i.e.: [...]
That will make it pretty much impossible to tell whether something is a delayed "thunk" or not, since *any* attempt to inspect it in any way will cause it to reify. Maybe that's what we want.
This feels like a disadvantage, and an important one. Most "normal" programmers should never have to care whether something is delayed or has been concretized already. But people writing debuggers, profilers, etc. really do want to know. There should be some way of poking at an object if you really want to without concretizing it. I wouldn't care if this was some ugly and obscure device like 'inspect._is_delayed(my_obj._co_delayed)' that has different semantics than other function calls. Maybe "the uglier the better" in this case, since it *should* be reserved for special purposes only. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Fri, Feb 17, 2017 at 9:45 PM, David Mertz <mertz@gnosis.cx> wrote:
That will make it pretty much impossible to tell whether something is a
delayed "thunk" or not, since *any* attempt to inspect it in any way
will cause it to reify. Maybe that's what we want.
This feels like a disadvantage, and an important one. Most "normal" programmers should never have to care whether something is delayed or has been concretized already. But people writing debuggers, profilers, etc. really do want to know.
There should be some way of poking at an object if you really want to without concretizing it. I wouldn't care if this was some ugly and obscure device like 'inspect._is_delayed(my_obj._co_delayed)' that has different semantics than other function calls. Maybe "the uglier the better" in this case, since it *should* be reserved for special purposes only.
If we assume idempotency (which I'm not certain whether we can/should) then we could spell the check like this:

if delayed delayed my_obj is delayed my_obj:
    print("Yep, it's delayed and I haven't concretized it")

That has the dual advantages of being both ugly and obvious.
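One device that already works in today's pure-Python emulations, and sidesteps the reification problem: `type()` reads the object's class slot directly and never goes through `__getattribute__`, so a debugger-style check need not touch the thunk. A sketch with a hypothetical Delayed class:

```python
class Delayed:
    """Hypothetical thunk: any ordinary attribute access reifies it."""

    def __init__(self, func):
        object.__setattr__(self, '_func', func)

    def __getattribute__(self, name):
        # Simplified: evaluate the expression, then delegate to the result.
        value = object.__getattribute__(self, '_func')()
        return getattr(value, name)

d = Delayed(lambda: "hello")

# type() bypasses __getattribute__, so this check does NOT
# trigger evaluation of the delayed expression:
assert type(d) is Delayed

# Ordinary attribute access does reify:
assert d.upper() == "HELLO"
```

A real proposal would presumably bake something like this into `inspect`, but the mechanism is the same: inspect the object's identity/class without invoking its attribute machinery.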

On Thu, Feb 16, 2017 at 9:24 PM, Joseph Hackman <josephhackman@gmail.com> wrote:
Howdy All!
This suggestion is inspired by the question on "Efficient debug logging".
I propose a keyword to mark an expression for delayed/lazy execution, for the purposes of standardizing such behavior across the language.
The proposed format is: delayed: <expr> i.e. log.info("info is %s", delayed: expensiveFunction())
People seem very excited about this as an idea, but I don't understand how it can be implemented. For example, how do you propose to handle code like this?

value = delayed: some_dict.get("whatever")
if value is None:
    ...

I.e., the question is, how does 'is' work on delayed objects? I guess it has to force the promise and walk the proxy chain in each input and then do an 'is' on the base objects? This seems like a really deep and confusing change to Python's object model for a pretty marginal feature. (This is a special case of the general observation that it's just not possible to implement fully-transparent proxy objects in Python.) -n -- Nathaniel J. Smith -- https://vorpus.org
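Nathaniel's objection can be demonstrated with a minimal pure-Python proxy (the `Proxy` class is a sketch, not anyone's actual proposal): `is` compares object identities at the interpreter level and never consults the operands, so no userland proxy can make it "see through" to the wrapped value.

```python
class Proxy:
    """Minimal lazy proxy: delegates attribute lookups to the result."""

    def __init__(self, func):
        object.__setattr__(self, '_func', func)

    def __getattr__(self, name):
        # Evaluate the thunk and delegate; 'is' never reaches this code.
        return getattr(object.__getattribute__(self, '_func')(), name)

some_dict = {}
value = Proxy(lambda: some_dict.get("whatever"))  # wrapped result is None

# The *wrapped* value is None, but the proxy object is not.
# 'is' cannot be intercepted, so the idiom silently misbehaves:
assert value is not None
```

Making `value is None` behave "transparently" is exactly what would require the change to Python's object model that Nathaniel is questioning.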

On Fri, Feb 17, 2017 at 6:20 PM, Nathaniel Smith <njs@pobox.com> wrote:
value = delayed: some_dict.get("whatever")
if value is None:
    ...
I.e., the question is, how does 'is' work on delayed objects? I guess it has to force the promise and walk the proxy chain in each input and then do an 'is' on the base objects?
You've explained the semantics exactly. That's not confusing at all. If the next line after creating delayed 'value' is to check it against something (whether for equality or identity) then obviously it was pointless to make it delayed. But that's not the only pattern:

value = delayed some_dict.get(expensive_key_lookup(),
                              expensive_default_calculation())
... lots more code ...
# OK, there might come a time to concretize
if some_unrelated_thing and value is None:
    # do this, but only concretize 'value' if
    # the first part of conjunction was truthy

On Fri, Feb 17, 2017 at 9:34 PM, David Mertz <mertz@gnosis.cx> wrote:
On Fri, Feb 17, 2017 at 6:20 PM, Nathaniel Smith <njs@pobox.com> wrote:
value = delayed: some_dict.get("whatever")
if value is None:
    ...
I.e., the question is, how does 'is' work on delayed objects? I guess it has to force the promise and walk the proxy chain in each input and then do an 'is' on the base objects?
You've explained the semantics exactly. That's not confusing at all.
Okay... so what if I want to check if two objects refer to the same delayed computation? I guess you can say that's just not supported, but that's *extraordinarily weird* for Python. And at the implementation level... so you just added two type checks and two branches to every 'is' call; this seems concerning. And now 'is' can raise an error, which it never could before. You also AFAICT have to modify every single C extension function to check for and handle these things, which is probably impossible even if the overhead of all the checks is acceptable, which isn't obvious. I'm just not seeing how this could be implemented. -n -- Nathaniel J. Smith -- https://vorpus.org

I would like to add another view: this feature might be very useful for cleaning up existing code bases. Just have a look at https://pypi.org/project/xfork/ and specifically I would like to point you to the following lines https://github.com/srkunze/fork/blob/afecde0/fork.py#L216 till #419. As you can see, these whopping 203 lines are for the sake of implementing a delaying proxy class (in a 520-line project). If this feature gets added, almost the whole implementation could *shrink to a mere line* of (I hope):

delayed: future.result()

That would be awesome! Btw. adding parameters to 'delayed' like lambda would also be useful. [I gather that delayed is a mere placeholder for now.] On 18.02.2017 12:10, Nathaniel Smith wrote:
On Fri, Feb 17, 2017 at 9:34 PM, David Mertz <mertz@gnosis.cx> wrote:
On Fri, Feb 17, 2017 at 6:20 PM, Nathaniel Smith <njs@pobox.com> wrote:
value = delayed: some_dict.get("whatever")
if value is None:
    ...
I.e., the question is, how does 'is' work on delayed objects? I guess it has to force the promise and walk the proxy chain in each input and then do an 'is' on the base objects?
You've explained the semantics exactly. That's not confusing at all. Okay... so what if I want to check if two objects refer to the same delayed computation? I guess you can say that's just not supported, but that's *extraordinarily weird* for Python.
It's new to Python yes, but it's not weird. You already can implement such proxies today as I've demonstrated.
And at the implementation level... so you just added two type checks and two branches to every 'is' call; this seems concerning. And now 'is' can raise an error, which it never could before. You also AFAICT have to modify every single C extension function to check for and handle these things, which is probably impossible even if the overhead of all the checks is acceptable, which isn't obvious. I'm just not seeing how this could be implemented.
I don't share your concerns here. What you describe is the nature of *delayed*. It's not only applicable to 'is' but to all operations which evaluate delayed objects. My point of view from the other side: me and other people need delayed proxies, so they implement them, and all of us make the same mistakes over and over again. Cheers, Sven

Well, yes. I think the 'is' operator is where other attempts fall short, and why it would require a change to Python. But yes, it would need to force the promise. On 17 February 2017 at 21:20, Nathaniel Smith <njs@pobox.com> wrote:
On Thu, Feb 16, 2017 at 9:24 PM, Joseph Hackman <josephhackman@gmail.com> wrote:
Howdy All!
This suggestion is inspired by the question on "Efficient debug logging".
I propose a keyword to mark an expression for delayed/lazy execution, for the purposes of standardizing such behavior across the language.
The proposed format is: delayed: <expr> i.e. log.info("info is %s", delayed: expensiveFunction())
People seem very excited about this as an idea, but I don't understand how it can be implemented.
For example, how do you propose to handle code like this?
value = delayed: some_dict.get("whatever")
if value is None:
    ...
I.e., the question is, how does 'is' work on delayed objects? I guess it has to force the promise and walk the proxy chain in each input and then do an 'is' on the base objects? This seems like a really deep and confusing change to Python's object model for a pretty marginal feature. (This is a special case of the general observation that it's just not possible to implement fully-transparent proxy objects in Python.)
-n
-- Nathaniel J. Smith -- https://vorpus.org

A great proposal, although now I would have to explain to my students the subtle difference between:

res = (print(i * i) for i in range(x))
print('foo')
print(res)

And

res = delayed [print(i * i) for i in range(x)]
print('foo')
all(res)

They seem to do something similar, but they really don't.

Overall, I still love it. When I read about it, I immediately thought about how Django handles translation in models:

- define your string in English
- mark it with ugettext_lazy and NOT ugettext
- the framework delays the translation until a request comes around with data about the user lang

The proposed feature would solve the problem nicely. Although I'm not clear on the result of:

def stuff(arg=delayed []):

Does this mean we create a NEW list every time in the function body? Or just a new one the first time, after which the reference stays in arg? Because the first behavior would solve a problem Python has had with mutable default arguments since the beginning. But that would mean the result of "delayed []" is a concrete thing we store in arg.

The "delayed" keyword sounds a lot like something used in async IO, so I like "lazy" much more. Not only is it shorter, but it conveys the meaning of what we are doing better.

Talking about async, we need to be clear on what those do:

a = (await|yield) lazy stuff
a = lazy (await|yield) stuff  (should it even be allowed?)
a = (lazy stuff(x) for x in stuff)

a = None
with open(x) as f:
    a = lazy stuff()  # raise IOError
print(a)

try:
    a = lazy stuff()  # raise
except Exception:
    pass

a = lazy f'{name}' + stuff(age)  # is there a closure where we store "name" and 'age'?

I can see a reasonable outcome for most of this, but it must be very clear. However, I can see several very important things we need to take into consideration debugging-wise. First, if there is an exception in the lazy expression, Python must indicate in the stack trace where this expression has been defined and where it's evaluated.
Pdb must also make it easy to step into those in a coherent manner. Eventually we also may need to allow this:

a = lazy stuff
if a is not lazy:
    print(a)

But then lazy can't be used as a var name to help with the transition. One last thing: my vote is not dropping the ":" in front of the keyword.
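Michel's generator-vs-delayed contrast can be made concrete with what runs today. The generator half is real Python; the `delayed [...]` half can only be approximated with a zero-argument lambda (a sketch, not the proposed syntax):

```python
# Runnable today: a generator expression delays each item until iterated.
res = (i * i for i in range(3))
print('foo')          # prints before any squaring happens
print(list(res))      # [0, 1, 4] -- items computed here, one at a time

# A hypothetical `delayed [...]` would instead delay the *whole list*
# as a single value; the closest emulation today is a thunk:
res2 = lambda: [i * i for i in range(3)]
print('foo')
print(res2())         # [0, 1, 4] -- the entire list built at this one call
```

The observable difference: the generator is consumed incrementally and only once, while the delayed list would concretize all at once into an ordinary, reusable list.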

On Sun, Feb 19, 2017 at 8:24 AM, Michel Desmoulin <desmoulinmichel@gmail.com
wrote:
A great proposal, although now I would have to explain to my students the subtle difference between:
res = (print(i * i) for i in range(x))
res = delayed [print(i * i) for i in range(x)]
They seem to do something similar, but they really don't.
Well, at the introductory level they are kinda similar. I know the mechanism would have to be different. But at a first brush it's the difference between delaying the whole concrete collection and delaying one item at a time. That wouldn't be terrible for a first Compsci lesson.
def stuff(arg=delayed []):
Does this mean we create a NEW list every time in the function body? Or just a new one the first time, after which the reference stays in arg?
I think this cannot make a new list each time. Of course, I'm one of those people who have used the mutable default deliberately, albeit now it's mostly superseded by functools.lru_cache(). But the idea of a "delayed object" is one that transforms into a concrete (or at least *different*) value on first access. In a sense, the transformation from a delayed object to an iterator is still keeping it lazy; and clearly `x = delayed my_gen()` is a possible pattern. The pattern of `def stuff(arg=delayed expensive_computation()): ...` is important to have. But as in my longer example, `arg` might or might not be accessed in the function body depending on conditional execution paths. Still, once `expensive_computation()` happens one time, that should be it, we have a result. Obviously `list()` is not an expensive operation, but the syntax cannot make a boundary for "how costly."
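The "computed at most once, on the first call that needs it" semantics David describes can be emulated today with a sentinel and a module-level cache. This is a sketch; `expensive_computation` and the `stuff` signature are hypothetical stand-ins:

```python
_MISSING = object()
_default_cache = _MISSING

calls = []

def expensive_computation():
    calls.append(1)  # record each evaluation so we can verify "once"
    return 99

def stuff(arg=_MISSING):
    # Emulates `def stuff(arg=lazy expensive_computation())`:
    # the default is computed at most once, on the first call
    # that actually needs it, and reused afterwards.
    global _default_cache
    if arg is _MISSING:
        if _default_cache is _MISSING:
            _default_cache = expensive_computation()
        arg = _default_cache
    return arg

assert stuff() == 99
assert stuff() == 99
assert len(calls) == 1   # computed only once, despite two default calls
assert stuff(7) == 7     # never computed when a value is passed
```

The proposed `lazy`/`delayed` default would collapse this boilerplate into the parameter declaration itself.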
The "delayed" keyword sounds a lot like something used in async IO, so I like "lazy" much more. Not only is it shorter, but it conveys the meaning of what we are doing better.
I like `lazy` too.
a = (await|yield) lazy stuff
a = lazy (await|yield) stuff  (should it even be allowed?)
a = (lazy stuff(x) for x in stuff)
a = lazy f'{name}' + stuff(age)  # is there a closure where we store "name" and 'age'?
I don't quite have a clear intuition about how lazy/delayed and await/yield/async should interact. I think it would be perfectly consistent with other Python patterns if we decided some combinations cannot be used together. Likewise you can't write `x = await yield from foo`, and that's fine, even though `yield from` is an expression.
First, if there is an exception in the lazy expression, Python must indicate in the stack trace where this expression has been defined and where it's evaluated.
Yes. I mentioned that there needs to be *some* way, even if it's an ugly construct, to find out that something is delayed without concretizing it. I think the best idea is hinted at in my preliminary thought. I.e. we can have a special member of a delayed object that does not concretize the object on access. So maybe `object._delayed_code` or something similar. Since it's the interpreter itself, we can say that accessing that member of the object is not a concretization, unlike accessing any other member. Every object that is *not* a delayed/lazy one should probably have None for that value. But delayed ones should have, I guess, the closure that would get executed on access (then once accessed, the object becomes whatever the result of the expression is, with `._delayed_code` then set to None on that transformed object).
a = lazy stuff
if a is not lazy:
    print(a)

So my spelling would be:

a = lazy stuff
if a._delayed_code is not None:
    print(a)
One last thing: my vote is not dropping the ":" in front of they keyword.
I think the colon has parser problems, as I showed in some examples. Plus I don't like how it looks. But I'd much rather have `a = lazy: stuff` than not have the construct at all, nonetheless.
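David's `_delayed_code` idea can be sketched in today's Python. The class and the special-cased attribute are both hypothetical; the point is only that one named hook can be exempted from reification:

```python
class Delayed:
    """Hypothetical thunk exposing the proposed `_delayed_code` hook."""

    def __init__(self, func):
        object.__setattr__(self, '_delayed_code', func)

    def __getattribute__(self, name):
        if name == '_delayed_code':
            # Special-cased: inspecting the hook does NOT concretize.
            return object.__getattribute__(self, name)
        # Any other access evaluates the expression and delegates.
        value = object.__getattribute__(self, '_delayed_code')()
        return getattr(value, name)

a = Delayed(lambda: 2 + 2)

# Debugger-style inspection, no evaluation triggered:
assert a._delayed_code is not None

# Ordinary access forces evaluation (int exposes `.real`):
assert a.real == 4
```

In David's full proposal, ordinary objects would report `_delayed_code` as None, so the same check works uniformly on delayed and concrete values.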

One last thing: my vote is not dropping the ":" in front of they keyword.
I think the colon has parser problems, as I showed in some examples. Plus I don't like how it looks. But I'd much rather have `a = lazy: stuff` than not have the construct at all, nonetheless.
This was a typo on my part. I prefer to AVOID the ":" in front of the keyword.

Michel- Thanks for the feedback! On 19 February 2017 at 11:24, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
A great proposal, although now I would have to explain to my students the subtle difference between:
res = (print(i * i) for i in range(x)) print('foo') print(res)
And
res = delayed [print(i * i) for i in range(x)] print('foo') all(res)
They seem to do something similar, but they really don't.
Overall, I still love it.
When I read about it, I immediately thought about how Django handles translation in models:

- define your string in English
- mark it with ugettext_lazy and NOT ugettext
- the framework delays the translation until a request comes around with data about the user lang

The proposed feature would solve the problem nicely.
Although I'm not clear on the result of:
def stuff(arg=delayed []):
Does this mean we create a NEW list every time in the function body? Or just a new one the first time, after which the reference stays in arg?
Because the first behavior would solve a problem Python has had with mutable default arguments since the beginning. But that would mean the result of "delayed []" is a concrete thing we store in arg.
My honest preference would be that the [] is evaluated fresh each time the function is called. def stuff(arg=delayed f()): would result in f() being called every time stuff() is. This seems more valuable to me than just doing it once when the function is first called.
The "delayed" keyword sounds a lot like something used in async IO, so I like "lazy" much more. Not only is it shorter, but it conveys the meaning of what we are doing better.
I'm fine with either delayed or lazy.
Talking about async, we need to be clear on what those do:
a = (await|yield) lazy stuff
Suggestion:

a = await lazy stuff  # same as await stuff; the await forces the lazy to be evaluated.
a = yield lazy stuff  # yields a lazy expression that will be evaluated when read; a is still set on the push as usual.
a = lazy (await|yield) stuff (should it even allowed ?)
a = lazy await stuff  # returns a lazy expression that, when evaluated, will await stuff.

I know this is dangerous, but I think it fits the pattern and Python is a 'consenting adults' language. If it attempts to evaluate outside a coroutine, I'm fine with it raising an exception. I'm also totally cool with this not being allowed.

a = lazy yield stuff  # the generator doesn't yield/pause until a is read from. See above.

a = (lazy stuff(x) for x in stuff)
a generator that returns lazy expressions that are not executed unless read.
a = None
?
with open(x) as f:
    a = lazy stuff()  # raise IOError
print(a)

try:
    a = lazy stuff()  # raise
except Exception:
    pass
I think this is one of the best points. My guess is that the exception should be raised where the expression is evaluated. We're all consenting adults here, and if you want to cause an uncaught exception somewhere, who am I to stop you?

a = lazy f'{name}' + stuff(age)  # is there a closure where we store "name" and 'age'?

I suggest yes, where possible/reasonable.

I can see a reasonable outcome for most of this, but it must be very clear.
However, I can see several very important things we need to take into consideration debugging-wise.
First, if there is an exception in the lazy expression, Python must indicate in the stack trace where this expression has been defined and where it's evaluated.
Pdb must also make it easy to step into those in a coherent manner.

Eventually we also may need to allow this:

a = lazy stuff
if a is not lazy:
    print(a)
I do think that this is probably the best greenfield solution for the problem. My only strong feeling is that it should be VERY difficult to 'accidentally' inspect a lazy, rather than evaluating it.
But then lazy can't be used as a var name to help with the transition.
Yeah. :(
One last thing: my vote is not dropping the ":" in front of they keyword.
I don't understand what your meaning is here.

On Sun, Feb 19, 2017 at 10:13 AM, Joseph Hackman <josephhackman@gmail.com> wrote:
My honest preference would be that the [] is evaluated fresh each time the function is called. def stuff(arg=delayed f()): would result in f() being called every time stuff() is. This seems more valuable to me than just doing it once when the function is first called.
This doesn't make sense. Function definition time is very different than function execution time. Changing that distinction is a WAY bigger change than I think we should contemplate. Moreover, there is a completely obvious way to spell the behavior you want:

def stuff():
    arg = f()
    # ... whatever ...

This is exactly the obvious way to spell "f() is called every time stuff() is".

This doesn't make sense. Function definition time is very different than function execution time. Changing that distinction is a WAY bigger change than I think we should contemplate. Moreover, there is a completely obvious way to spell the behavior you want:

def stuff():
    arg = f()
    # ... whatever ...

This is exactly the obvious way to spell "f() is called every time stuff() is".
I think it would be useful, but yeah, it really doesn't fit in with the rest of lazy/delayed. The present format of defaulting to None and then doing `if arg is None:` is totally functional. On the flip side, doing a lazy in the function definition would save time evaluating defaults while also fitting in. Your argument has convinced me, and I now take (what I believe to be) your position:

def stuff(arg = lazy f()):

should result in a function where the default value of arg is not evaluated until first function call, and then the value of the expression is used as the default.

Now, back to what Michel was probably actually asking. In the case of:

def stuff(arg = lazy []):

is the default value of arg a new list with each execution? (i.e. the resulting value of the expression is `make a new list`) I would say that for consistency's sake, no, which I believe would be consistent with the logic behind why [] default values are kept between calls.

a = lazy []
b = a
a.append('a')
print(b)  # expected behavior is ['a']

I maintain that it would be nice for there to be a way to say "the default value of this argument is to run some expression *every time*", but delayed/lazy probably isn't that. On 19 February 2017 at 13:33, David Mertz <mertz@gnosis.cx> wrote:
On Sun, Feb 19, 2017 at 10:13 AM, Joseph Hackman <josephhackman@gmail.com> wrote:
My honest preference would be that the [] is evaluated fresh each time the function is called. def stuff(arg=delayed f()): would result in f() being called every time stuff() is. This seems more valuable to me than just doing it once when the function is first called.
This doesn't make sense. Function definition time is very different than function execution time. Changing that distinction is a WAY bigger change than I think we should contemplate.
Moreover, there is a completely obvious way to spell the behavior you want:
def stuff():
arg = f()
# ... whatever ...
This is exactly the obvious way to spell "f() is called every time stuff() is".
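Joseph's `a = lazy []; b = a` example can be emulated today with a hypothetical thunk class, showing why "one list, shared by both names" is the consistent outcome:

```python
class Delayed:
    """Hypothetical thunk: the underlying value is created once, on first use."""
    _UNSET = object()

    def __init__(self, func):
        self._func = func
        self._value = Delayed._UNSET

    def __getattr__(self, name):
        # Called only for attributes not found on the thunk itself,
        # i.e. anything belonging to the wrapped value.
        if self._value is Delayed._UNSET:
            self._value = self._func()
        return getattr(self._value, name)

a = Delayed(lambda: [])  # a = lazy []
b = a                    # both names refer to the same thunk
a.append('a')            # first use: the one list is created, then appended to
assert b.copy() == ['a']  # b sees the same list, as Joseph expects
```

Per-call fresh lists would require re-running the expression on every access, which contradicts the "evaluated once" semantics the thread has settled on.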

On Sun, Feb 19, 2017 at 10:47 AM, Joseph Hackman <josephhackman@gmail.com> wrote:
Your argument has convinced me, and I now take (what i believe to be) your
position:
def stuff(arg = lazy f()):
should result in a function where the default value of arg is not evaluated until first function call, and then the value of the expression is used as the default.
Indeed. And in particular, f() *might not* be executed even during that first (or any) function call, depending on what conditional paths are taken within the function body. That's the crucial part. The function may have perfectly good uses where you don't want to take the computational time, or have the side-effects, but other uses where you need that deferred value or action.
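The "might never execute at all, depending on the path taken" behavior David highlights can be approximated today by making the default a thunk that only one branch forces. All names here are hypothetical:

```python
calls = []

def expensive_default():
    calls.append(1)  # record evaluations
    return "computed"

def process(flag, fallback=lambda: expensive_default()):
    # Emulates `def process(flag, fallback=lazy expensive_default())`:
    # the default is only concretized on the path that needs it.
    if flag:
        return "fast path"  # expensive_default() never runs
    return fallback()

assert process(True) == "fast path"
assert calls == []                    # nothing was evaluated
assert process(False) == "computed"   # evaluated only on this path
assert len(calls) == 1
```

The difference under the proposal is that the *caller* of `fallback` would not need to know it is a thunk; reading the value would force it.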

On 2/19/17, David Mertz <mertz@gnosis.cx> wrote:
On Sun, Feb 19, 2017 at 10:13 AM, Joseph Hackman <josephhackman@gmail.com> wrote:
My honest preference would be that the [] is evaluated fresh each time the function is called. def stuff(arg=delayed f()): would result in f() being called every time stuff() is. This seems more valuable to me than just doing it once when the function is first called.
This doesn't make sense. Function definition time is very different than function execution time. Changing that distinction is a WAY bigger change than I think we should contemplate.
Moreover, there is a completely obvious way to spell the behavior you want:
def stuff():
arg = f()
# ... whatever ...
This is exactly the obvious way to spell "f() is called every time stuff() is".
A slightly more complicated case is passing a non-default value for the argument. But how would you spell the same behaviour then? -> def stuff(arg=delayed locals())

On 2/19/17, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
Eventually we also may need to allow this:

a = lazy stuff
if a is not lazy:
    print(a)
But then lazy can't be used a var name to help with the transition.
What about this?

if not inspect.islazy(a):
    print(a)

The next idea is probably obvious:

class Busy_Beaver:
    '''we want to be sure that beaver is disturbed
    only if it is really necessary'''
    def __call_me_later__(self, n):
        return too_expensive(n)

I wrote a blog post about this, and someone asked me if it meant allowing lazy imports to make optional imports easier. Something like:

lazy import foo
lazy from foo import bar

So now if I don't use the imports, the module is not loaded, which could also significantly speed up the startup time of applications with a lot of imports.
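Worth noting: the standard library already supports a form of this via `importlib.util.LazyLoader` (Python 3.5+). This sketch follows the pattern in the importlib documentation; the module is registered immediately but its code only runs on first attribute access:

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose actual loading is deferred to first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the lazy module; does not run its code
    return module

json = lazy_import("json")  # module body not executed yet
assert json.dumps([1, 2]) == "[1, 2]"  # first access triggers the real import
```

A `lazy import foo` statement would essentially be sugar for this, with the added benefit of working for `lazy from foo import bar` too, which the LazyLoader recipe cannot express.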

-----Original Message----- From: Python-ideas [mailto:python-ideas-bounces+tritium- list=sdamon.com@python.org] On Behalf Of Michel Desmoulin Sent: Monday, February 20, 2017 3:30 AM To: python-ideas@python.org Subject: Re: [Python-ideas] Delayed Execution via Keyword
Would that not also turn a failure to import into an error at the time of executing the imported piece of code rather than at the place of import? And how would optional imports work if they are not loaded until use? Right now, optional imports are done by wrapping the import statement in a try/except; would you not need to do that handling everywhere the imported object is used instead? (I haven't been following the entire thread, and I don't know if this is a forest/trees argument)
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

On Fri, Feb 17, 2017, Steven D'Aprano wrote:
JIT compilation delays *compiling* the code to run-time. This is a proposal for delaying *running* the code until such time as some other piece of code actually needs the result.
My thought was that if a compiler is capable of determining what needs to be compiled just in time, then an interpreter might be able to determine what expressions need to be evaluated just when their results are actually used. So if you had code that looked like:
log.debug("data: %s", expensive())
The interpreter could skip evaluating the expensive function if the result is never used. It would only evaluate it "just in time". This would almost certainly require just-in-time compilation as well, otherwise the byte code that calls the "log.debug" function would be unaware of the byte code that implements the function.

This is probably a pipe dream, though, because the interpreter would have to be aware of side effects.
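As an aside, the stdlib logging module already defers the *formatting* (the %s substitution) until a record is actually emitted, but the arguments themselves are still evaluated eagerly. The usual workaround is a small wrapper whose __str__ does the expensive work, so it only runs if the log level is enabled. A sketch (`Lazy` and `expensive` are illustrative names):

```python
import logging

calls = []

class Lazy:
    """Defer a computation until the log record is actually formatted."""
    def __init__(self, func):
        self.func = func
    def __str__(self):
        return str(self.func())

def expensive():
    calls.append(1)               # record that the expensive work happened
    return "payload"

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("demo")
log.debug("data: %s", Lazy(expensive))    # DEBUG disabled: expensive() never runs
log.warning("data: %s", Lazy(expensive))  # enabled: expensive() runs at format time
```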

This comes from a bit of a misunderstanding of how an interpreter figures out what needs to be compiled. Most (all?) JIT compilers run code in an interpreted manner, and then compile subsections down to efficient machine code when they notice that the same code path is taken repeatedly. So in PyPy, something like

x = 0
for i in range(100000):
    x += 1

would get, after 10-20 runs through the loop, turned into assembly that looks like what you'd write in pure C, instead of the very indirection- and pointer-heavy code that CPython actually executes for such a loop. So the "hot" code is still run.

All that said, this is a bit of an off-topic discussion and probably shouldn't be on list. What you really do want is functional purity, which is a different concept and one that Python as a language can't easily provide no matter what.

--Josh

I'm going to repeat here what I posted in the thread on lazy imports. If it's possible for the interpreter to determine when it needs to force evaluation of a lazy expression or statement, then why not use them everywhere? If that's the case, then why not make everything lazy by default? Why not make it a service of the language to lazify your code (analogous to garbage collection) so a human doesn't have to worry about screwing it up?

There are, AFAIK, three things that *must* force evaluation of lazy expressions or statements:

1) Before the GIL is released, all pending lazy code must be evaluated, since the current thread can't know what variables another thread will try to access (unless there's a way to explicitly label variables as "shared", in which case it would only force evaluation of those).
2) Branching statements force evaluation of anything required to evaluate the conditional clause.
3) I/O forces evaluation of any involved lazy expressions.
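A minimal sketch of the forcing behaviour these rules assume (`Thunk` and `force` are hypothetical names, not an existing API):

```python
class Thunk:
    """Sketch of a once-only deferred expression."""
    _UNSET = object()

    def __init__(self, func):
        self._func = func
        self._value = self._UNSET

    def force(self):
        if self._value is self._UNSET:   # evaluate at most once, then cache
            self._value = self._func()
        return self._value

evaluations = []
t = Thunk(lambda: (evaluations.append(1), 42)[1])

# rule 2: a branch must force any thunk in its conditional clause
if t.force() > 0:
    # rule 3: I/O forces it too -- here force() just returns the cached value
    print(t.force())
```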

Other things that scrutinize an expression are iteration or branching (with the current evaluation model). If `xs` is a thunk, then `for x in xs` must scrutinize `xs`. At first this doesn't seem required; however, in general `next` imposes a data dependency on the next call to `next`. For example:

x0 = next(xs)
x1 = next(xs)
print(x1)
print(x0)

If `next` doesn't force computation, then evaluating `x1` before `x0` will bind `x1` to `xs[0]`, which is not what the eager version of the code does.

To preserve the current semantics of the language, you cannot defer arbitrary expressions, because they may have observable side effects. Automatically translating would require knowing ahead of time whether a function can have observable side effects, but that is not possible in Python. Because it is impossible to tell in the general case, we must rely on the user to tell us when it is safe to defer an expression.
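The dependency is easy to demonstrate by simulating deferral with plain callables (an illustrative sketch, not a proposed mechanism):

```python
xs = iter([10, 20])

x0 = lambda: next(xs)      # "deferred": nothing evaluated yet
x1 = lambda: next(xs)

b = x1()                   # forcing x1 first consumes the FIRST element...
a = x0()                   # ...so x0 now observes the second

print(a, b)                # 20 10 -- the eager program would have produced 10 20
```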

I don't think you have to make a special case for iteration. When the interpreter hits:
print(x1)
print falls under I/O, so it forces evaluation of x1, so we back-track to where x1 is evaluated:
x1 = next(xs)
And in the next call, we find that we must evaluate the state of the iterator, so we have to back-track to:
x0 = next(xs)
Evaluate that, then move forward. You essentially keep a graph of pending/unevaluated expressions linked by their dependencies, and evaluate branches of the graph as needed. You need to evaluate state to navigate conditional branches, and whenever state is passed outside of the interpreter's scope (like I/O or multi-threading).

I think problems might crop up in parts of the language that are pure C code. For instance, I don't know if the state variables in a list iterator are actually visible to the interpreter, or if the iterator is implemented in C that is inscrutable to the interpreter.
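A toy sketch of such a pending-execution graph, with the iterator-state dependency recorded explicitly (`Node`, `deps`, and `force` are hypothetical names):

```python
class Node:
    """Sketch of a deferred expression with explicit dependencies."""
    def __init__(self, func, deps=()):
        self.func, self.deps = func, deps
        self.done, self.value = False, None

    def force(self):
        if not self.done:
            for dep in self.deps:      # back-track: evaluate dependencies first
                dep.force()
            self.value = self.func()
            self.done = True
        return self.value

xs = iter([10, 20])
x0 = Node(lambda: next(xs))
x1 = Node(lambda: next(xs), deps=(x0,))   # the iterator-state dependency

print(x1.force())   # 20 -- forcing x1 first still runs x0 first
print(x0.force())   # 10 -- cached; the iterator is not advanced again
```

The catch, as discussed below, is that building this graph requires knowing the dependency edges ahead of time, which Python cannot determine automatically.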

Without special-casing iteration, how do you know that `x1 = next(xs)` depends on the value of `x0`? If you assume every operation depends on every other operation, then you have implemented an eager evaluation model.

without special casing iteration how do you know that `x1 = next(xs)` depends on the value of `x0`?
`x1 = next(xs)` doesn't depend on the value of `x0`; it depends on the state of `xs`. In order to evaluate `next(xs)` you have to jump into the function call and evaluate the relevant expressions within, which will, presumably, mean evaluating the value of some place-holder variable or something, which will trigger evaluation of preceding, pending expressions that modify the value of that place-holder variable, which includes `x0 = next(xs)`.

You do have a point, though: if the pending-execution graph has to be fine-grained enough to capture all that, then it's a dubious claim that juggling such a construct would save any time over simply executing the code as you go. This, I believe, goes beyond iterators and gets at the heart of what Josh said:

What you really do want is functional purity, which is a different concept and one that Python as a language can't easily provide no matter what.
`next` is not a pure function, because it has side-effects: it changes state variables. Even if those side-effects can be tracked by the interpreter, they present a challenge. In the example:
log.warning(expensive_function())
This is the case where we want to avoid executing expensive_function(). It's likely that the function iterates over some large amount of data, so according to the `x1 = next(xs)` example, deferring it means building a huge pending-execution graph in case the function does need to be evaluated, so you can track the iterator state changes all the way back to the first iteration before executing.

Perhaps there's some clever trick I'm not thinking of to keep the graph small and only expand it as needed. I don't know. Maybe, like Joshua Morton's JIT example, you could automatically identify loop patterns and collapse them somehow. I guess special-casing iteration would help with that, though it's difficult to see what that would look like.

On Thu, Mar 2, 2017 at 7:30 PM, Joseph Jevnik <joejev@gmail.com> wrote:
without special casing iteration how do you know that `x1 = next(xs)` depends on the value of `x0`? If you assume every operation depends on every other operation then you have implemented an eager evaluation model.
On Thu, Mar 2, 2017 at 8:26 PM, Abe Dillon <abedillon@gmail.com> wrote:
I don't think you have to make a special case for iteration.
When the interpreter hits:
print(x1)
print falls under I/O, so it forces evaluation of x1, so we back-track to where x1 is evaluated:
x1 = next(xs)
And in the next call, we find that we must evaluate the state of the iterator, so we have to back-track to:
x0 = next(xs)
Evaluate that, then move forward.
You essentially keep a graph of pending/unevaluated expressions linked by their dependencies and evaluate branches of the graph as needed. You need to evaluate state to navigate conditional branches, and whenever state is passed outside of the interpreter's scope (like I/O or multi-threading).

I think problems might crop up in parts of the language that are pure C code. For instance, I don't know if the state variables in a list iterator are actually visible to the interpreter, or if they're implemented in C that is inscrutable to the interpreter.
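A minimal sketch of that pending-expression graph, just to pin the idea down (the `Thunk` class and `force` method are my own names, not anything proposed here): each unevaluated expression is a node holding a callable plus the nodes it depends on, and forcing a node forces its dependencies first, memoizing results along the way.

```python
class Thunk:
    """One node in the pending-execution graph: a callable plus
    the thunks whose values it depends on."""
    def __init__(self, fn, *deps):
        self.fn = fn
        self.deps = deps
        self.done = False
        self.value = None

    def force(self):
        # Evaluate dependencies first, then this node; memoize the result
        # so the expression only ever runs once.
        if not self.done:
            args = [d.force() for d in self.deps]
            self.value = self.fn(*args)
            self.done = True
        return self.value

# Build a graph without running anything...
a = Thunk(lambda: 1 + 2)
b = Thunk(lambda x: x * 10, a)

# ...until something like I/O forces the tip of the graph.
print(b.force())  # forces a (3), then b (30)
```

The hard part, as the discussion above notes, is that a real interpreter would have to thread every state mutation (like an iterator advancing) through this graph as a dependency edge.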
On Mar 2, 2017 5:54 PM, "Joseph Jevnik" <joejev@gmail.com> wrote:
Other things that scrutinize an expression are iteration or branching (with the current evaluation model). If `xs` is a thunk, then `for x in xs` must scrutinize `xs`. At first this doesn't seem required; however, in general `next` imposes a data dependency on the next call to `next`. For example:
x0 = next(xs)
x1 = next(xs)

print(x1)
print(x0)
If `next` doesn't force computation then evaluating `x1` before `x0` will bind `x1` to `xs[0]` which is not what the eager version of the code does.
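To make that hazard concrete, here is a sketch (the `defer` helper is hypothetical scaffolding, not the proposed syntax) of what goes wrong if `next` is deferred and the thunks are then forced out of order:

```python
def defer(fn):
    """Wrap a zero-argument callable as a memoized thunk; calling the
    result forces it."""
    result = []
    def force():
        if not result:
            result.append(fn())
        return result[0]
    return force

xs = iter(["first", "second"])
x0 = defer(lambda: next(xs))   # nothing consumed from xs yet
x1 = defer(lambda: next(xs))

# Forcing x1 before x0 silently swaps the values relative to eager code:
print(x1())  # prints 'first'  -- eager evaluation would have bound 'second'
print(x0())  # prints 'second'
```

Because the side-effect on `xs` happens at force time rather than at binding time, the order of forcing, not the order of the source lines, decides which value each name gets.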
To preserve the current semantics of the language you cannot defer arbitrary expressions because they may have observable side-effects. Automatically translating would require knowing ahead of time if a function can have observable side effects, but that is not possible in Python. Because it is impossible to tell in the general case, we must rely on the user to tell us when it is safe to defer an expression.
On Thu, Mar 2, 2017 at 6:42 PM, Abe Dillon <abedillon@gmail.com> wrote:
I'm going to repeat here what I posted in the thread on lazy imports. If it's possible for the interpreter to determine when it needs to force evaluation of a lazy expression or statement, then why not use them everywhere? If that's the case, then why not make everything lazy by default? Why not make it a service of the language to lazify your code (analogous to garbage collection) so a human doesn't have to worry about screwing it up?
There are, AFAIK, three things that *must* force evaluation of lazy expressions or statements:
1) Before the GIL is released, all pending lazy code must be evaluated since the current thread can't know what variables another thread will try to access (unless there's a way to explicitly label variables as "shared", then it will only force evaluation of those).
2) Branching statements force evaluation of anything required to evaluate the conditional clause.
3) I/O forces evaluation of any involved lazy expressions.
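Rules 2 and 3 can be sketched today with a wrapper whose special methods force evaluation (this `Lazy` class is my illustration, not part of any proposal): an `if` statement calls `__bool__`, and `print` calls `__str__`, so both act as forcing points.

```python
class Lazy:
    def __init__(self, fn):
        self._fn = fn
        self._forced = False
        self._value = None

    def force(self):
        if not self._forced:
            self._value = self._fn()
            self._forced = True
        return self._value

    # An `if` statement calls __bool__, so branching forces evaluation (rule 2).
    def __bool__(self):
        return bool(self.force())

    # print() calls __str__, so I/O forces evaluation (rule 3).
    def __str__(self):
        return str(self.force())

cond = Lazy(lambda: 2 + 2 == 4)   # nothing runs yet
if cond:                          # __bool__ fires here, forcing the expression
    print("branch taken")
```

Rule 1 (forcing before the GIL is released) has no such user-level hook, which is part of why this would need interpreter support.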
On Mon, Feb 20, 2017 at 7:07 PM, Joshua Morton < joshua.morton13@gmail.com> wrote:
This comes from a bit of a misunderstanding of how an interpreter figures out what needs to be compiled. Most (all?) JIT compilers run code in an interpreted manner, and then compile subsections down to efficient machine code when they notice that the same code path is taken repeatedly, so in pypy something like
x = 0
for i in range(100000):
    x += 1
would get, after 10-20 runs through the loop, turned into assembly that looks like what you'd write in pure C, instead of the very indirection- and pointer-heavy code that CPython actually executes for such a loop. So the "hot" code is still run.
All that said, this is a bit of an off topic discussion and probably shouldn't be on list.
What you really do want is functional purity, which is a different concept and one that python as a language can't easily provide no matter what.
--Josh
On Mon, Feb 20, 2017 at 7:53 PM Abe Dillon <abedillon@gmail.com> wrote:
On Fri, Feb 17, 2017, Steven D'Aprano wrote:
JIT compilation delays *compiling* the code to run-time. This is a proposal for delaying *running* the code until such time as some other piece of code actually needs the result.
My thought was that if a compiler is capable of determining what needs to be compiled just in time, then an interpreter might be able to determine what expressions need to be evaluated just when their results are actually used.
So if you had code that looked like:
>>> log.debug("data: %s", expensive())
The interpreter could skip evaluating the expensive function if the result is never used. It would only evaluate it "just in time". This would almost certainly require just in time compilation as well, otherwise the byte code that calls the "log.debug" function would be unaware of the byte code that implements the function.
This is probably a pipe-dream, though; because the interpreter would have to be aware of side effects.
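For what it's worth, the logging case specifically can be handled today without interpreter support, because `logging` only does the `msg % args` formatting when a record is actually emitted. Wrapping the expensive call in an object with a lazy `__str__` defers it entirely (a sketch; `LazyStr` is my name for the wrapper, not a stdlib class):

```python
import logging

class LazyStr:
    """Defers calling `fn` until the object is actually formatted."""
    def __init__(self, fn):
        self.fn = fn
    def __str__(self):
        return str(self.fn())

def expensive():
    print("expensive() ran")
    return "result"

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

# DEBUG is below the INFO threshold: the record is never formatted,
# so __str__ is never called and expensive() never runs.
log.debug("data: %s", LazyStr(expensive))

# At INFO the record is emitted; formatting calls __str__, which runs expensive().
log.info("data: %s", LazyStr(expensive))
```

This is essentially the `Delayed` class from the original proposal, narrowed to the one protocol (`__str__`) that logging needs, which sidesteps the special-lookup problem mentioned at the top of the thread.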
On Mon, Feb 20, 2017 at 5:18 AM, <tritium-list@sdamon.com> wrote:
-----Original Message-----
From: Python-ideas [mailto:python-ideas-bounces+tritium-list=sdamon.com@python.org] On Behalf Of Michel Desmoulin
Sent: Monday, February 20, 2017 3:30 AM
To: python-ideas@python.org
Subject: Re: [Python-ideas] Delayed Execution via Keyword
I wrote a blog post about this, and someone asked me if it meant allowing lazy imports to make optional imports easier.
Something like:
lazy import foo
lazy from foo import bar
So now if I don't use the imports, the module is not loaded, which could also significantly speed up the startup time of applications with a lot of imports.
Would that not also make a failure to import an error at the time of executing the imported piece of code rather than at the place of import? And how would optional imports work if they are not loaded until use? Right now, optional imports are done by wrapping the import statement in a try/except, would you not need to do that handling everywhere the imported object is used instead?
(I haven't been following the entire thread, and I don't know if this is a forest/trees argument)
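As an aside on the import case: the stdlib already offers an opt-in version of lazy imports via `importlib.util.LazyLoader`, which defers executing a module's code until an attribute is first accessed (a sketch adapted from the importlib documentation). It also exhibits exactly the failure mode raised above: an ImportError surfaces at first use rather than at the import statement.

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module object whose code only runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)   # registers the lazy module; does not run its code
    return module

json = lazy_import("json")       # module code has not executed yet
print(json.dumps({"a": 1}))      # first attribute access triggers the real import
```

So the try/except around an optional import would indeed have to move to the first point of use, which is the trade-off being discussed.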

Another problem I thought of was how this might complicate stack tracebacks. If you execute the following code:

[1] a = ["hello", 1]
[2] b = "1" + 1
[3] a = "".join(a)
[4] print(a)

The interpreter would build a graph until it hit line [4] and was forced to evaluate `a`. It would track `a` back through the branch [1]->[3] and raise an error from line [3], when you would expect line [2] to raise an error first. I suppose it may be possible to catch any exceptions and force full evaluation of all pending nodes up to that point to find any preceding errors, but that sounds like a hairy proposition...
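The reordering can be sketched with a hypothetical `Lazy` wrapper standing in for real interpreter support: if line [2]'s error is only raised when its result is forced, an error from a later line surfaces first.

```python
class Lazy:
    def __init__(self, fn):
        self.fn = fn
    def force(self):
        return self.fn()

# [1] a = ["hello", 1]
a = Lazy(lambda: ["hello", 1])
# [2] b = "1" + 1   -- raises TypeError eagerly, but here it just stays pending
b = Lazy(lambda: "1" + 1)
# [3] a = "".join(a)  -- also a TypeError, but only when forced
a2 = Lazy(lambda: "".join(a.force()))

# [4] print(a) forces only the a-branch [1]->[3], so the TypeError from
# line [3] surfaces while line [2]'s error has not yet been seen at all.
try:
    print(a2.force())
except TypeError as e:
    print("error from line [3]:", e)
```

The traceback for the forced error would also point at the forcing site (line [4]) rather than the source line of the failing expression, unless the interpreter recorded the origin of every thunk.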
participants (16)
- Abe Dillon
- Chris Angelico
- David Mertz
- David Mertz
- Ed Kellett
- Joseph Hackman
- Joseph Jevnik
- Joshua Morton
- Mark E. Haase
- Michel Desmoulin
- Nathaniel Smith
- Pavol Lisy
- Stephan Houben
- Steven D'Aprano
- Sven R. Kunze
- tritium-list@sdamon.com