One more time... lambda function <--- from *** signature def.

Starting a new thread because this bike has a different shape and color.

Yesterday I was thinking that just making the keyword lambda assignable like True, False, and None would be enough. But the issue with that is that lambda isn't a name for an actual object or type. That was the seed for this idea: how to get lambda-like functionality into some sort of object that would be easy to use and explain.

This morning I thought we could have something in a function's definition, like "*" and "**", to take an expression. Similar to Nick's idea with =:, but more general.

The idea is to have "***" used in a def mean: take "any" call expression and do not evaluate it until *** is used on it. I.e., the same rules as *: when used in a def it packs a tuple, and when used outside a def it unpacks it. So "***" used in a def stores the call expression at call time, and when used later, expresses it.

A function call that captures an expression may be tricky to do. Here's one approach that requires sugar when a function defined with "***" is called.

    class TriStar:
        def __init__(self, expr):
            """ expr is a callable that takes no arguments. """
            self.expr = expr
        def __tristar__(self):
            """ ***obj --> result """
            return self.expr()

    def fn(***expr): ...

(Any other suggestions for how to do this would be good.)

And at call time....

    fn(...)  -->  fn(TriStar(expr=lambda: ...))

So presuming we can do something like the above, the first case is ...

    def star_fn(***expr):
        return ***expr

    ... = star_fn(...)

Which is a function that just returns whatever its input is, and is even more general than using *args, **kwds. The call signature stored in expr isn't evaluated until it's returned with ***expr. So the evaluation is delayed, or lazy, but it's still explicit and very easy to read.

This returns a lambda-like function.

    def star_lambda(***expr): return expr

And is used this way...

    result = star_lambda(a * b + c)   # captures expression.
    actual_result = ***result         # *** resolves "result" here!

The resolution is done with ***name, rather than name(). That's actually very good because it can pass through callable tests. So you can safely pass callable objects around without them getting called at the wrong time or place.

We can shorten the name because star_lambda is just a function.

    L = star_lambda

To me this is an exceptionally clean solution. Easy to use, and not too hard to explain. Seems a lot more like a Python solution to me as well.

Hoping it doesn't get shot down too quickly,
Ron ;-)
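For readers who want to try the idea in today's Python, here is a rough emulation. It is only a sketch under stated assumptions: the "***" syntax does not exist, so the caller has to write the lambda by hand, and the helper name express() is made up here to stand in for "***obj".

```python
class TriStar:
    """Holds a zero-argument callable; stands in for a captured expression."""

    def __init__(self, expr):
        self.expr = expr

    def __tristar__(self):
        # Hypothetical hook: what `***obj` would invoke under the proposal.
        return self.expr()


def express(obj):
    """Made-up stand-in for the proposed `***obj` syntax."""
    return obj.__tristar__()


def star_lambda(expr):
    # The proposal would wrap the argument automatically at the call site;
    # in current Python the caller must pass a lambda explicitly.
    return TriStar(expr)


a, b, c = 2, 3, 4
result = star_lambda(lambda: a * b + c)   # captures the expression
actual_result = express(result)           # evaluates 2 * 3 + 4
print(actual_result)                      # 10
print(callable(result))                   # False: TriStar defines no __call__
```

Because TriStar has no __call__, the captured expression passes a callable() test as "not callable", which is the property described above.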

On Sat, Mar 1, 2014 at 4:17 AM, Ron Adam <ron3200@gmail.com> wrote:
Interesting, but I don't like the way the interpretation of a function call depends on the target function. With both * and ** notations, there's absolutely no difference: the function is called with these positional and those keyword arguments, whether they came from actual args or from * or ** unpack/repacks; and there's no difference between a function that collects args with *args,**kwargs and one that collects them with individual names (or a C-level function that might do something altogether different). With this proposal, your star_lambda function's declaration changes the call site - instead of evaluating a*b+c, it has to construct an anonymous function and pass it along. ChrisA

On 02/28/2014 11:54 AM, Chris Angelico wrote:
It's not clear what differences you mean here... can you show some examples?

I think we are just used to not thinking about it, but it's not really that different.

    def fn(*args, **kwds): ...

This wraps args in a tuple, and kwds in a dict. It's up to the *function called* to do what is intended by the syntax.

    def fn(*args): ...

    fn(a, b, c)  -->  fn((a, b, c))    # depends on function called.

    def fn(**kwds): ...

    fn(a=1, b=2, c=3)  -->  fn(dict(a=1, b=2, c=3))    # here too.

    def fn(***expr): ...

    fn(expr)  -->  fn(TriStar(lambda: (expr)))
        # A bit more complex, but also the same.
        # Parentheses need to capture tuple packing
        # due to ',' having a lower precedence than lambda.

The mechanism behind each of these may be somewhat different, but there are also similarities.

    def fn(***expr): return ***expr

With these, it forwards these existing cases nicely.

    a, b, c = fn(a, b, c)
    args = fn(*args)
    kwds = fn(**kwds)
    args, kwds = fn(*args, **kwds)

And just like '**' can't be used to pack a dictionary directly, we can't use '***' to pack an expression directly. Using "**" in a function unpacks the dictionary. Using "***" in a function call expresses the TriStar object.

(* any name for the TriStar object would work. (small detail))

NOW here is the main limitation... :-/

    a, b, c = fn(a, b, c=1)

Which is because (a, b, c=1) isn't a valid expression outside of a function call. Or should this be captured as (a, b, {"c":1})?

Sigh... darn edge cases. A bit more than an edge case I think. Any ideas?

Cheers,
Ron

On 1 Mar 2014 05:43, "Ron Adam" <ron3200@gmail.com> wrote:
Remember that at compile time, Python has *no idea* what the actual signature of the target function is. Thus, all Python function calls use the following sequence (ignoring optimisations of special cases):

1. At the call site, the arguments are collected into a tuple of positional arguments and a dict of keyword arguments.
2. The interpreter hands that tuple and dict over to the target callable.
3. The *target callable* then maps the supplied arguments to the defined parameters, including filling in any default values.

Any function related proposals need to account for the fact that from the compiler's point of view *every* function signature looks like "(*args, **kwds)" (although it may have optimised paths for the no-args case and the positional-args-only case), and that the target callable may not even be written in Python.

Cheers,
Nick.
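The three steps can be imitated from pure Python with the standard inspect module. This is only an illustration of the idea, not how the interpreter implements calls; the function target and its parameters are invented for the example.

```python
import inspect


def target(a, b=2, *rest, key="default"):
    # Invented example function; any signature would do.
    return a, b, rest, key


# Step 1: the call site only knows a tuple and a dict.
args = (1,)
kwds = {"key": "value"}

# Steps 2 and 3: the callable's own signature maps them onto parameters
# and fills in defaults.
bound = inspect.signature(target).bind(*args, **kwds)
bound.apply_defaults()
print(dict(bound.arguments))

# The real call performs the same mapping internally.
print(target(*args, **kwds))
```

Nothing on the caller's side depends on what target's parameter list looks like, which is the point made above.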

On 02/28/2014 04:33 PM, Nick Coghlan wrote:
So functions can't be extended to take a triplet instead of a pair...

    (*args, **kwds, ***expr)

Looking up... I think this is what you're referring to in ceval.c.

    #-------------------
    /* External interface to call any callable object.
       The arg must be a tuple or NULL.  The kw must be a dict or NULL. */

    PyObject *
    PyEval_CallObjectWithKeywords(PyObject *func, PyObject *arg, PyObject *kw)
    {
    #--------------------

And it wouldn't work in the normal case anyway, as expressions are evaluated as they are put on the stack before calling the function.

Somehow I was thinking this morning the code inside the function call parentheses f(...) could be parsed later than it actually is. And in the context of the function definition, possibly similar to how a comprehension is evaluated. But it would take some pretty big changes to do that I suppose.

Cheers,
Ron

On Sat, Mar 1, 2014 at 1:13 PM, Ron Adam <ron3200@gmail.com> wrote:
The best way to do it would be to adorn the call site. We can currently do that with lambda:

    f(a * b + c)
    f(lambda: a * b + c)

Those are completely different from each other, but completely consistent with themselves - it doesn't matter what f is, one of them passes the sum of the product and the other passes a callable.

ChrisA
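A small runnable illustration of that difference follows. The function f here is just an identity function chosen for the demonstration.

```python
def f(arg):
    # f does nothing special; it hands back whatever object it was given.
    return arg


a, b, c = 2, 3, 4

eager = f(a * b + c)             # the value is computed before f is called
deferred = f(lambda: a * b + c)  # f receives a callable instead

a = 10                           # rebinding a only affects the deferred form

print(eager)        # 10
print(deferred())   # 34
```

The choice between the two is visible at the call site, regardless of how f is defined.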

On Feb 28, 2014, at 18:13, Ron Adam <ron3200@gmail.com> wrote:
So functions can't be extended to take a triplet instead of a pair...
(*args, **kwds, ***expr)
The more you elaborate this, the more this looks like Ruby procs: You get any number of normal arguments, then at most one special kind of callback thing which must come at the end.

I'm not sure which use cases this solves. It doesn't work for expressions that need an argument, like a sorting key function. It also doesn't work for pre-existing functions that weren't designed to take ***expr, like Button or takewhile. And wrapping the expression in a call to a forwarding function doesn't seem any less verbose or more readable than just using lambda.

Anyway, there are two fundamental problems with anything that doesn't require any syntax at the call site. First, it's not obvious, or even easily discoverable, to the reader when you're passing a quoted expression and when you're passing a value. Second, it's not discoverable to the compiler.

Going back to the analogies from other languages again, Lisp gets around this by making functions and macros different things. Among other differences, a macro always gets expressions instead of values, without any need to quote them at the call site. And, as Haoyi Li has pointed out multiple times on the other threads, you can already do the same thing in Python (with MacroPy).

On Sat, Mar 1, 2014 at 6:42 AM, Ron Adam <ron3200@gmail.com> wrote:
It's not clear what differences you mean here... can you show some examples?
These are exactly the same:

    x = (1,2)
    f(*x)
    f(1,2)

I played with dis.dis() and it seems there are special-case opcodes for calling a function with a variable number of arguments and/or with variable keyword args; but once it arrives at the other side, the two are identical:
The runtime fetches some callable, gives it some args, and says "Go do your stuff!". It doesn't care what that callable is - it could be a classic function defined with 'def' or 'lambda', it could be a type, it could be an object with __call__, it could be a built-in that's backed by a C function, anything. All that ends up arriving on the other side is: You have these positional args and these keyword args. Adding the tri-star to the mix suddenly changes that. A function is now capable of taking an expression, rather than an object. That's completely different, and it depends on the called function to distinguish one from the other. ChrisA
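This can be checked directly. The sketch below uses two invented callables, a plain function and an object with __call__, and confirms that neither can tell unpacked arguments from literal ones.

```python
class Recorder:
    """An object with __call__; it just reports what arrived."""

    def __call__(self, *args, **kwds):
        return args, kwds


def show(*args, **kwds):
    return args, kwds


x = (1, 2)
opts = {"sep": "-"}

for target in (show, Recorder()):
    # Unpacked and literal arguments arrive as the same tuple and dict.
    print(target(*x, **opts) == target(1, 2, sep="-"))   # True for both targets
```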

On 02/28/2014 08:14 PM, Chris Angelico wrote:
Right.. To make it work in the way I was thinking would require each function to consist of two parts. One part to build a name-space, and the other for executing the code. Then at call time, the first part could evaluate the expressions inside the function parentheses. This part would be new, and possibly work like a dict comprehension, except more specific to call signatures. Then the constructed name space would be passed directly to the code part for the actual call. Which would be quite different than what happens presently.

I've refactored dis once for fun, also helped test some patches to dis, and played around with ceval enough to know how this stuff works, but when I haven't looked at the code recently (like now), I tend to think more in abstract terms, which helps with creativity and seeing new ways to do things. (And it is why I like reading this list.) But I sometimes miss on the objectivity side... [Note to self... look at the code more.]

It comes down to this... "Creativity and Objectivity often don't want to occupy the same space at the same time."

Cheers,
Ron

Chris Angelico wrote:
And when you consider that this could happen with any argument to any function, depending on what the function turns out to be like at run time, it means that *all* function arguments would need to be passed as anonymous functions. That would be very awkward and inefficient.

-- Greg

On Fri, Feb 28, 2014 at 11:17:18AM -0600, Ron Adam wrote:
You can't assign to True, False or None. (You can assign to True and False in Python 2, but shouldn't.) [...]
I think it would be useful to have a way to delay execution of an expression, that is to say, have a way to capture an expression for later evaluation, something more lightweight than a function, but I don't think that limiting it to inside function calls is the right approach.

Something perhaps like a thunk might be appropriate? We can *almost* do that now, since Python has a compile function:

    thunk = compile("x + 3", "", "eval")
    # much later
    eval(thunk)

Some problems with that approach:

- You have to write the expression as a string, which means you lose any possibility of syntax highlighting.
- The temptation is to pass some arbitrary untrusted string, which leads to serious security implications. The argument here should be limited to an actual expression, not a string containing an expression which might have come from who knows where.
- It should be as lightweight as possible. The actual compilation of the expression should occur at compile-time, not run-time. That implies some sort of syntax for making thunks, rather than a function call.
- Likewise actually evaluating the thunk should be really lightweight, which may rule out a function call to eval.
- How should scoping work? I can see use-cases for flat scoping, static scoping, dynamic scoping, and the ability to optionally provide custom globals and locals, but I have no idea how practical any of them would be or what syntax they should use.

All of this is pie-in-the-sky at the moment, and not a serious proposal for Python 3.5.

-- Steven
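The compile/eval approach can be tried as-is. The snippet below is a minimal sketch; the "<thunk>" filename and the namespace contents are arbitrary choices for the demonstration.

```python
# Compile once; the expression is not evaluated yet.
thunk = compile("x + 3", "<thunk>", "eval")

# Much later: evaluate it against whatever namespace is supplied.
print(eval(thunk, {"x": 2}))        # 5
print(eval(thunk, {}, {"x": 40}))   # 43, this time x comes from the locals mapping
```

The same code object gives different results depending on the namespace handed to eval, which is why the scoping question in the list above matters.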

On 02/28/2014 09:46 PM, Steven D'Aprano wrote:
I meant it the other way around. For example... def is not assignable to anything..

    D = def      # won't work

    T = True     # works
    F = False
    N = None

    L = lambda   # won't work

lambda isn't an object like True, False, and None.
The expressing part isn't limited to inside function calls.. Or wouldn't be if it was a doable idea.
    def thunk(***expr): return expr

    def do_thunks(*args):
        for expr in args:
            ***expr

    start = t = time()
    update_timer = thunk(t = time())
    show_timer = thunk(print(t-start))
    show_status = thunk(do_thunks(update_timer, show_timer))

Then as you do things, possibly in different functions as well.

    ...
    ***show_status    # update t, and prints elapsed time.
    ...
    ***show_status
    ...
    ***show_status
    ...

Yes, it could be done with lambda too. Here's where it differs... So if we have this, where obj is nested expressions.

    while 1:
        try:
            obj = ***obj
        except TypeError:
            break

Then the result of the obj expression could be a callable without any conflict. Only delayed expressions would be expressed. This was one of the properties I was looking for.
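The "only delayed expressions get expressed" property can be approximated today with a wrapper class. This is a sketch only: the names Thunk, force and resolve are invented here, and an isinstance check replaces the TypeError that the proposed "***" would raise.

```python
class Thunk:
    """Invented wrapper marking a zero-argument callable as 'delayed'."""

    def __init__(self, fn):
        self.fn = fn

    def force(self):
        return self.fn()


def resolve(obj):
    # Keep expressing while the result is itself a delayed expression.
    while isinstance(obj, Thunk):
        obj = obj.force()
    return obj


# Nested delayed expressions whose final value is an ordinary callable.
nested = Thunk(lambda: Thunk(lambda: len))
result = resolve(nested)

print(result is len)        # True: len was returned, not called
print(result([1, 2, 3]))    # 3
```

With plain lambdas there would be no way to tell the final callable apart from one more layer of delay.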
That's where the idea I mentioned in another response to this thread comes in. Separating the namespace constructing from the code part of a function in a well defined way. If that can be done... also pie in the sky. Then we could also evaluate an expression in a previously defined namespace. It's hypothetical, so supply your own syntax for it. The compiler will still compile the code, so most of the work is still done at compile time. But we have these useful pieces we can take apart and put back together again. (without calling eval, or exec.) BTW, you can do that now, but it's very hard to get right. help(type(lambda:1))
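As a hint of what "taking the pieces apart" looks like now, a function's code object can be paired with a different globals namespace through types.FunctionType. This is a sketch and not a recommendation; the names template and rebound are made up for the example.

```python
import types

# An ordinary lambda; x is resolved as a global when the body runs.
template = lambda: x + 3

# Reuse the same code object with a separately built namespace.
namespace = {"x": 2}
rebound = types.FunctionType(template.__code__, namespace)

print(rebound())   # 5

namespace["x"] = 10
print(rebound())   # 13
```

It works here because x compiles to a global lookup; names that are locals or closure variables need considerably more care, which is the "very hard to get right" part.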
All of this is pie-in-the-sky at the moment, and not a serious proposal for Python 3.5.
Agree. Cheers, Ron

On 1 March 2014 13:46, Steven D'Aprano <steve@pearwood.info> wrote:
FWIW, I think this is why PEP 312 (simple implicit lambda) garnered any interest at all: sure, you can't pass such a lambda any arguments, but it works fine as a *closure*. It's also potentially worth trawling the python-ideas archives for the various discussions about deferred and delayed expressions - they're generally all variants of the same basic idea, a less esoteric, easier to learn way to handle one-shot callbacks and other forms of lazy expression evaluation. The "lambda" keyword in the current syntax ends up being a distraction rather than an aid to understanding. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Feb 28, 2014, at 19:46, Steven D'Aprano <steve@pearwood.info> wrote:
Something perhaps like a thunk might be appropriate? We can *almost* do that now, since Python has a compile function:
To save Haoyi Li the trouble of saying it: we _can_ do this now, since Python has import hooks, so you can use MacroPy to quote an expression, giving you an AST to evaluate later, or automatically wrapping it up in a function (like quote and function, respectively, in Lisp). There's also an in-between option of compiling it to a code object but not wrapping that in a function object.

There's one other crazy option for representing quoted expressions/thunks: as lazy futures. This is how Alice does it, and it's also implicitly what dataflow languages like Oz are doing (every variable is a lazy future). In Python, this would really just be a wrapper around a function call, so I don't think it would buy anything.

Which one of those do people want? I'm not sure. Since you did a great job listing the problems of the string-quoting solution, we can compare the options for each one.
- You have to write the expression as a string, which means you lose any possibility of syntax highlighting.
Obviously not a problem here.
Again, not a problem.
Not a problem for the function or code versions. For the AST version, you're doing part of the compilation at compile-time, and the rest at runtime.
- Likewise actually evaluating the thunk should be really lightweight, which may rule out a function call to eval.
A function is obviously no better or worse than using lambda today. A code object has to be evaluated by passing it to eval, or wrapping it in a function and calling it. I suspect the former may be lighter weight than calling a function, but I really don't know. The latter, on the other hand, is obviously heavier than calling a function, but not by that much. An AST has to be compiled to a code object, after which you do the same as the above. Obviously this is heavier than not having to compile.
This, I think, is the big question. A function is clearly a normal, static-scoped closure. An AST or code object, you could do almost any form of scoping you want _except_ static, either by building a function around it or by calling eval on it. (You can do additional tricks with an AST, but I don't think most programs would want to. For statements, this could be useful for optional hygienic variables, but for expressions that isn't an issue.) If we want static scoping, we really need functions. At least we need some kind of object that has some form of code plus a closure mechanism--and that's pretty much all functions are. On the other hand, if we want dynamic, flat, or customizable scoping, we need something we can either build a function object out of dynamically, or evaluate dynamically--and that's pretty much what code objects are.
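The contrast between the two kinds of object can be seen in a few lines. This is an illustrative sketch; the function names and the "<thunk>" filename are invented.

```python
def make_closure():
    x = 1
    return lambda: x + 3        # static scoping: x is this enclosing x


closure = make_closure()
code = compile("x + 3", "<thunk>", "eval")   # no scope attached yet


def caller():
    x = 100
    # The closure ignores the caller's x; the code object uses whatever
    # namespace the evaluator chooses to hand it.
    return closure(), eval(code, globals(), locals())


print(caller())   # (4, 103)
```

The function carries its scope with it, while the code object leaves the choice of scope to whoever evaluates it.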

On 1 Mar 2014 05:46, "Steven D'Aprano" <steve@pearwood.info> wrote:
something,
What about a new code literal object, e.g.

    thunk = c'x + 3'
    thunk(x=2)

The "c" prefix identifies the literal as code so highlighters can recognise it and it can be parsed. Triple quotes could also be used to support multiline code blocks. It would also be possible to prohibit any direct conversion from a string to avoid the security problems. I'm not sure how positional arguments would be handled though, and scoping still needs to be thought through.

On Mar 3, 2014, at 22:02, David Townshend <aquavitae69@gmail.com> wrote:
Why does it need to be built from/look like a string? I think it would be just as simple for the parser, and simpler for editors, and less misleading for readers, if it used a different marker. Not that I'm seriously suggesting backticks here, but... thunk = `x + 3` Most existing editors already treat that as an expression (since in 2.x it means repr(x + 3)). No human reader is going to be misled into thinking this means there's a way to create code objects from string objects without eval (c.f. all those questions on StackOverflow about how to make a raw string from a string...). And as far as the parser is concerned, it's trivial: '`' expr '`' is a Code whose value is whatever expr parses to. Meanwhile, if this gives you a code object, what do you do with it? Using eval on something that looks like a string but isn't may be correct to anyone who understands Python deeply, but I think it could be very misleading to anyone else. And creating a FunctionType from a code object is pretty ugly for something that deserved literal support...
Triple quotes could also be used to support multiline code blocks.
That is one advantage of reusing strings. In fact, that gives us a way to embed statements in the middle of expressions. Which I'm not sure is a good thing.
I'm not sure how positional arguments would be handled though, and scoping still needs to be thought through.
If you have a code object, scoping is up to whoever evaluates it or builds a function out of it.

On Mar 3, 2014, at 22:02, David Townshend <aquavitae69@gmail.com> wrote:
thunk = lambda: x + 3 thunk(x=2) Your original post in this thread wanted to allow functions to magically capture their parameters as code (~ call by name) and it's been pointed out that that is incompatible with the way the language is designed and function calls are implemented. Yes, there are languages where the called function gets to tell the caller how to package arguments but Python isn't one of them. Now you seem to have veered in a different direction. I have no idea what problem you are trying to solve. --- Bruce Learn how hackers think: http://j.mp/gruyere-security https://www.linkedin.com/in/bruceleban

On Tue, Mar 4, 2014 at 9:03 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
The only real reason for it looking like a string is a shortage of symbols. Backticks are off limits, and most other symbols are just plain ugly (e.g. @x + 3@ or $x + 3$) or already in use in a way that could lead to ambiguity (e.g. |x + 3|). String-like quotes seem like a better option than the alternatives.
My suggestion was to use c'x + 3' as an effective replacement for compile('x + 3', '', 'eval'). The use of a literal rather than compile would solve a few of the issues raised in Steven's original post. I don't see why it would be any more misleading to eval a code literal than a compiled string. I did suggest making it callable (i.e. thunk(x=2)), but on second thoughts I'm not sure that's a good idea. It adds complexity without any real benefit over eval.

On Tue, Mar 04, 2014 at 11:03:35AM +0200, David Townshend wrote:
Agreed.
I don't think calling the thunk is the right API. If we have this sort of functionality, I want calling a thunk to be treated like any other expression, thunk[0] or thunk+1. I don't want it to mean "evaluate this thunk". An example:

    thunk = `lambda x: x + y`

    def func(y, obj):
        return obj + 1

    result = func(50, thunk(100))

(This is not a use-case, just an illustration.)

In this example, the expression "thunk(100)" is delayed until inside the func scope. Inside that scope, we evaluate this expression:

    (lambda x: x + y)(100)

where y has the value 50, and bind the result of this to the name obj. The lambda part returns a closure, and calling that closure with argument 100 returns 150, so this is equivalent to binding obj=150. Then func continues, evaluates "obj + 1", and returns 151.

-- Steven
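The arithmetic in that illustration can be reproduced with compile and eval standing in for the backtick syntax. This is an approximation under stated assumptions: the delayed expression is passed as a code object and evaluated explicitly inside func, whereas the proposal would do that implicitly.

```python
# Stand-in for the delayed expression `thunk(100)` with thunk = `lambda x: x + y`.
delayed = compile("(lambda x: x + y)(100)", "<thunk>", "eval")


def func(y, code):
    # y is supplied as the globals of the evaluation so that the lambda
    # body, which looks y up as a global, can see it.
    obj = eval(code, {"y": y})
    return obj + 1


result = func(50, delayed)
print(result)   # 151
```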

On Mon, Mar 03, 2014 at 11:03:03PM -0800, Andrew Barnert wrote:
Off the top of my head, I think that looks alright.
thunk(x=2)
I'm not entirely sure about that. That basically makes these thunks just a function. I'm still not quite sure how this should actually be used in practice, so as far as I'm concerned this is just pie in the sky thinking aloud. If thunks are just functions, why not make them functions? There needs to be something extra (different scoping rules, faster/more lightweight, *something*) to make the idea worthwhile. I think I need to learn more about Algol and other languages that use call-by-name and thunks. I may be completely on a wild-goose chase here, but I'm surely not the only person who has needed to delay evaluation of an expression. http://en.wikipedia.org/wiki/Thunk
Actually I'd be happy with backticks, but Guido has said No Backticks Ever Again. So until the Glorious Revolution, we're stuck.
Agreed. What I have in my head is some vague concept that the Python evaluation rules will somehow know when to evaluate the thunk and when to treat it as an object, which is (as I understand it) what happens in Algol. Again, just thinking aloud, perhaps we do this:

    thunk = `some_expression`  # delays evaluation
    a = [0, 1 + thunk]         # evaluates thunk in the current scope
    b = [0, `1 + thunk`]       # delays evaluation and creates a thunk object
                               # equivalent to `1 + some_expression`
    c = b[1]                   # now evaluates the thunk object
    d = f(2, thunk)            # evaluates thunk in f's scope
    e = g(3, `thunk`)          # passes the un-evaluated thunk object to g

Consider this just a sketch, and in no way fully thought out. (This will most definitely need a PEP.)
I have little interest in allowing thunks to be statements. If you want a delayed statement, use compile and eval. Or def. (But maybe that applies to expressions too?) Did I mention this needs a PEP? -- Steven

On Tue, Mar 4, 2014 at 10:09 PM, Steven D'Aprano <steve@pearwood.info> wrote:
PEP can come later. First, let's get some solid use-cases, and start looking at implications. The way it's described here, there's effectively magic when you try to look at an object of this type, which will break a lot of assumptions. Most people expect that:

    foo = bar
    assert foo is bar

to be a safe assumption, but if bar is a thunk, then it's getting evaluated separately in each of those, and that's potentially going to create different objects and/or even have side effects. That's going to surprise people.

On the flip side, that's something that could be dealt with with a naming convention for thunks. We have _private, __mangled, __magic__, anticollision_, CONSTANT, Class... maybe we could have thunk__ or something. It's most confusable with magic and anticollision, but since both of those are connected with specific keywords, it's reasonably likely there'll be no actual confusion.

Specific downside: There's no way to actually pass an unevaluated thunk around. Technically, `thunk__` will create a new wrapper thunk and pass that along. That'll often have the same effect, but it won't be quite identical (same as there's a difference between f(g) and f(lambda x: g(x)) in that the second one lazily looks up g), so that might cause confusion. But you could always special-case it: writing `thunk__` will be guaranteed to transmit the thunk unchanged, and if you actually mean to add another wrapper layer, use `(thunk__)` or something instead.

The biggest thing to figure out is scoping. Does a thunk take a snapshot of its enclosing scope (a closure), or is it an expression that gets evaluated in the target namespace? The latter could be extremely confusing, but the former's just what a nested function does, so this'd just be a new lambda syntax.

Python's existing lambda syntax has its flaws and its detractors, but it has one huge benefit over thunking: It exists. :) Thunking has to get over that difference. I'd like to see some proposals.

ChrisA

On 2014-03-04, at 13:07 , Chris Angelico <rosuav@gmail.com> wrote:
Why? Either it's forced during assignment or both names map to the same thunk, and are both forced when any of them is. That could be during the identity check, but since both names refer to the same thunk they can only yield the same value, so the identity check need not force the thunk. An equality test would likely force the thunk.
Is there a use case for actually thunking a thunk?
That is essentially what a thunk is, at least in my experience: it is conceptually a nullary memoized function, forced (evaluated/called) if an actual value/object is ever needed but potentially thrown out during reduction. The difference, under the proposed semantics, is that the forcing of the thunk would be implicit where that of a function is explicit (not sure that's a good idea in a strict language).

On Tue, Mar 4, 2014 at 11:50 PM, Masklinn <masklinn@masklinn.net> wrote:
Most certainly not. Try this:

    bar = `[1,2,3]`
    foo = bar
    spam = bar
    assert foo is spam

Here's the evaluated version:

    foo = [1,2,3]
    spam = [1,2,3]
    assert foo is spam

You can try this one out directly. They're not going to be the same object - they'll be two separate lists. They will be equal, in this case, but there's no guarantee of that either:

    bar = `random.random()`

Separate evaluation of the same expression isn't guaranteed to have the same result, AND it might have unexpected side effects:

    bar = `whiskey.pop().drink()`

and you might find yourself underneath the bar before you know it.

ChrisA
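The same effect is easy to observe with a plain lambda standing in for the backtick thunk, evaluated once per use as described above:

```python
bar = lambda: [1, 2, 3]   # stand-in for the thunk; re-evaluated on every use

foo = bar()
spam = bar()

print(foo == spam)   # True: equal contents
print(foo is spam)   # False: two separate list objects
```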

On 2014-03-04, at 15:14 , Chris Angelico <rosuav@gmail.com> wrote:
I don't agree with this, again why would the thunk be evaluated twice? If thunks are added to *delay* expression evaluation (which is what I understood from Steven's messages) surely something akin to Haskell's semantics is simpler to understand and implement. That is, instead of thunks being sugar for:

    bar = lambda: expr

they're sugar for

    bar = memoize(lambda: expr)
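One possible reading of memoize(lambda: expr) is sketched below. The class name MemoThunk and its force() method are assumptions made for this example; forcing is explicit here, while the proposal would make it implicit.

```python
class MemoThunk:
    """Evaluates a zero-argument callable at most once and caches the result."""

    _UNSET = object()   # sentinel, so that None can be a cached value

    def __init__(self, fn):
        self._fn = fn
        self._value = MemoThunk._UNSET

    def force(self):
        if self._value is MemoThunk._UNSET:
            self._value = self._fn()
            self._fn = None          # drop the callable once it has run
        return self._value


evaluations = []


def make_list():
    evaluations.append("evaluated")
    return [1, 2, 3]


bar = MemoThunk(make_list)
foo = bar.force()
spam = bar.force()

print(foo is spam)        # True: both names see the single cached list
print(len(evaluations))   # 1
```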
Obviously, but why would repeated evaluation of the expression be desirable?

On Wed, Mar 5, 2014 at 1:55 AM, Masklinn <masklinn@masklinn.net> wrote:
Okay. That's an interesting point, and a distinction from lambda. Effectively, once a thunk is evaluated once, it devolves to its value. That would be *extremely* interesting in the case of lazy evaluation - you could actually make a ternary-if function:

    value = true_expr if cond else false_expr

    def if_(cond, true_thunk, false_thunk):
        if cond: return true_thunk
        return false_thunk

    value = if_(cond, `true_expr`, `false_expr`)

There's still the questions of scoping, but now you have the distinction between a thunk and a function. And if it's at all possible, the thunk could even replace itself in memory with its result - that would be massively implementation-dependent, but since you can't look at the identity of the thunk itself anyway, it wouldn't break anything. Effectively, as soon as you evaluate bar, it assigns to bar whatever the expression evaluates to. (Which is simpler and cleaner to explain than the memoization.)

ChrisA
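With today's lambdas the same if_ function works, as long as the chosen branch is called explicitly. This is a sketch; the noisy() helper exists only to show which branch actually ran.

```python
def if_(cond, true_thunk, false_thunk):
    # Only the selected branch is ever called.
    if cond:
        return true_thunk()
    return false_thunk()


log = []


def noisy(tag, value):
    log.append(tag)
    return value


value = if_(True, lambda: noisy("true", 1), lambda: noisy("false", 2))

print(value)   # 1
print(log)     # ['true']
```

The explicit calls inside if_ are exactly what the implicit-forcing proposal would remove.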

Steven D'Aprano wrote:
But Algol has the benefit of static typing -- the procedure being called explicitly declares whether the argument is to be passed by name or value. Python has no idea about that at compile time.
When exactly does implicit evaluation of a thunk object occur? Does `b[1]` give you an unevaluated thunk object? What if b is a custom sequence type implemented in Python -- how does its __getitem__ method avoid evaluating the thunk object prematurely? None of these problems occur in Algol, because its thunks are not first-class values (you can't store them in arrays, etc.) and it uses static type information to tell when to create and evaluate them. -- Greg

On Wed, Mar 5, 2014 at 9:31 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Does `b[1]` give you an unevaluated thunk object?
I would say that `b[1]` should create a brand new thunk object of that expression. On evaluation of that, it'll look up b, subscript it, and touch the thunk. When it does, it Schrodingers itself into a value and that's that. In my opinion, __getitem__ isn't a problem, but __setitem__ is. Somewhere along the way, you have to be able to "hold" a thunk as a thunk. I've no idea how that works when you pass it around from function to function. ChrisA

On Mar 4, 2014, at 14:31, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
This is the main reason I think it's more productive to think of this in terms of Lisp-style quoting than Algol-style thunks. The fact that quoting/thunking gives you a code object instead of an AST (sexpr) is not significant here* (unless we're also considering adding macros). The fact that it gives you a first-class value, and that we can't use the "implicit casting" syntax that comes with static typing to magically evaluate the object at the right time, is critical. This is basically the same problem I described trying to implement Boost-style auto lambdas in Python without C++-style implicit cast from lambda to function. * It is significant _elsewhere_. In an AST, "b" is just a reference to a name; in a code object, we have to have already determined whether it's a reference to a fast, closure, or global name--unless you want to do something like automatically compile `b[1]` as if it were something like vars('b')[1], or to do something equivalent like add a LOAD_DYNAMIC opcode.
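The point in the footnote, that a code object has already classified every name, can be inspected directly through the code object's attributes. The functions below are invented for the demonstration.

```python
g = 10   # a module-level (global) name


def outer():
    c = 1                     # becomes a closure variable of inner

    def inner(a):             # a is a fast local
        return a + c + g

    return inner


code = outer().__code__
print(code.co_varnames)   # ('a',)  fast locals
print(code.co_freevars)   # ('c',)  closure names
print(code.co_names)      # ('g',)  global or builtin names
```

An AST for the same expression would only record that a, c and g are names, leaving the classification open.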
There's really no way to cleanly distinguish the c, d, and e cases. There are really only two possibilities: Magic evaluation everywhere (so d is passing in eval(thunk) to f), or explicit evaluation (whether via eval or otherwise) only (so c is just binding the unevaluated thunk). I think either of those is potentially viable, but I don't think there's anything in between. Note that you can always quote the implicit evaluation (if you go the first way) or explicitly eval anywhere you want (the second), so neither one really limits what you can do.
You're creating a new thunk/quoting a new expression inside the backticks, so the issue doesn't arise. You're creating a new code object that looks up b and indexes it. The __getitem__ call doesn't happen until that new code object is evaluated.

On 03/04/2014 07:05 PM, Andrew Barnert wrote:
On Mar 4, 2014, at 14:31, Greg Ewing<greg.ewing@canterbury.ac.nz> wrote:
Steven D'Aprano wrote:
In an experimental language I'm writing, I use a concept I call "Context Resolution" to resolve objects to expected kinds of objects. The expected type/kind is determined in the context of how things are used together rather than by how they are defined. (That allows everything to be objects: Keywords, Names, Expressions, CodeBlocks, etc.) In Python it would probably depend on AttributeError instead of the type. If an object doesn't have the needed attribute, then it could try calling a different method, possibly __resolve__, and then retry the attribute lookup on the result. If there's no __resolve__ attribute, then the AttributeError would be raised as usual. The chained __resolve__ resolution attempts would also give a useful exception backtrace. Cheers, Ron
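The lookup-then-resolve idea can be approximated in today's Python with an ordinary helper function. This is only a sketch: `Deferred` and `resolve_getattr` are hypothetical names introduced here, and `__resolve__` is the hook name from the message above, which Python itself never calls.

```python
class Deferred:
    """Hypothetical wrapper holding a zero-argument callable."""

    def __init__(self, fn):
        self._fn = fn

    def __resolve__(self):
        # Hook name taken from the message; the interpreter does not know it.
        return self._fn()


def resolve_getattr(obj, name):
    """Look up `name` on obj, retrying through __resolve__ on AttributeError."""
    while True:
        try:
            return getattr(obj, name)
        except AttributeError:
            resolver = getattr(type(obj), "__resolve__", None)
            if resolver is None:
                raise  # no __resolve__: the AttributeError propagates as usual
            obj = resolver(obj)


# Two levels of deferral are resolved before str.upper is found.
nested = Deferred(lambda: Deferred(lambda: "hello"))
shout = resolve_getattr(nested, "upper")()
```

An object without `__resolve__` simply raises the usual AttributeError, which matches the fallback described in the message.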

On Tue, Mar 04, 2014 at 05:05:39PM -0800, Andrew Barnert wrote:
I'm not convinced that this really matters, but for the sake of the argument let's say it does.
This is the main reason I think it's more productive to think of this in terms of Lisp-style quoting than Algol-style thunks.
Can you give more detail please? -- Steven

From: Steven D'Aprano <steve@pearwood.info> Sent: Wednesday, March 5, 2014 3:25 AM
It's not really static typing that's key here; it's that, unless you have explicit syntax for all uses of thunks, you need implicit type casts, and static typing is necessary for implicit type casts. (However, as I said before, this doesn't have to be a deal-breaker. You can probably get away with making all references either immediate or delayed, so long as the syntax for doing the other explicitly isn't too obtrusive and the performance cost isn't too high. For example, if referencing a thunk always evaluates, `thunk` just creates a new thunk that will evaluate the old one, and you could optimize that into not much heavier than just passing the old one around; with a JIT, like PyPy, you could even optimize it into just passing the old one around.)
I think I've already explained it, but let me try again in more detail. A quoted expression and a thunk are similar things: ways to turn a language expression into something that can be evaluated or executed later. But there are two major differences. First, quoted expressions are first-class values, while thunks are not. Second, quoted expressions have an inspectable (or pattern-matchable) structure, while thunks do not. You could relate that second difference to ASTs vs. code objects in Python—but since idiomatic Python does not use macros or any other generation or parsing of ASTs, it really isn't a visible difference, so the first one is more important.* And then there's the historical difference: thunks come from static languages, quoting from dynamic languages. Together with being first-class values, this means quoted expressions need explicit but not-too-intrusive syntax not just for creating them, but also either for evaluating them, or for continuing to delay them. So, that's why I think quoting is a more apt model for what we're trying to accomplish in this thread. * As I mentioned before, it does actually matter for _implementation_ whether we use ASTs or code objects, at least if we want dynamic or flat in-place scoping, because the former would allow us to look up names at execution time, while the latter would not, unless we either add a LOAD_DYNAMIC opcode or compile name lookup into explicit dynamic lookup code. But as long as we define what the scoping rules are, users won't care how that's implemented.
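The AST versus code object distinction can be seen with the standard `ast` module and the built-in `compile`. This is a minimal sketch, not part of the original message; the expression `b[1]` is borrowed from earlier in the thread and the file name `<quoted>` is arbitrary.

```python
import ast

source = "b[1]"

# "Quoted expression": an AST whose structure can be inspected before use.
tree = ast.parse(source, mode="eval")
is_subscript = isinstance(tree.body, ast.Subscript)
subscripted_name = tree.body.value.id  # the name being indexed

# "Thunk"-like form: a compiled code object, which can be evaluated later
# but is not conveniently pattern-matched.
code = compile(tree, "<quoted>", "eval")

# Name lookup happens only when the code object is evaluated.
result = eval(code, {"b": [10, 20, 30]})
```

Here the namespace is supplied at evaluation time, which is the "explicit dynamic lookup" end of the design space mentioned in the footnote.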

On 2014-03-04, at 23:31 , Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That's not really a blocker though, Haskell thunks are implicit and not type-encoded. A name may correspond to a (unforced) thunk or to a strict value (an already forced thunk, whether through a previous implicit forcing, through an explicit forcing — a strict annotation — or through a decision of the strictness analyser).
There are definitely difficulties in deciding how the decision to force a thunk comes about.

Masklinn wrote:
That comes at the expense of making *everything* a thunk until its value is needed. This would be a big, far-reaching change in the way Python works. Also, Haskell has the advantage of knowing that the value won't change, so it can replace the thunk with its value once it's been calculated. Python would have to keep it as a thunk and reevaluate it every time. And Haskell doesn't have to provide a way of passing the thunk around *without* evaluating it. -- Greg

On 2014-03-05, at 22:51 , Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Of course not, Haskell allows explicitly forcing a thunk and the compiler/runtime pair can use strictness analysis and not create thunks in the first place.
This would be a big, far-reaching change in the way Python works.
Having Haskell-type thunks does not mean using the same default as Haskell. Python would have the opposite one, instead of creating thunks by default it would create them only when requested.
Also, Haskell has the advantage of knowing that the value won't change
I do not see why that would be relevant to thunks. The result of the thunk may be mutable, so what?
See above, I do not see why that would happen. Consider that a thunk is a deferral of an expression's evaluation, why would said expression's evaluation happen more than once? The only thing which changes is *when* the actual evaluation happens.
And Haskell doesn't have to provide a way of passing the thunk around *without* evaluating it.
Well no, instead it takes the sensible option to force thunks when they need to be forced (or when their forcing is explicitly requested e.g. using ($!) or "bang pattern" extensions, but the need for those is mostly because haskell defaults to creating thunks I would say). Why would passing a reference around force a thunk in the first place?

Masklinn wrote:
On 2014-03-05, at 22:51 , Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That's not what I mean. The value of the thunk can depend on other things that change. In general you can't get away with evaluating it once and caching the result. -- Greg

Masklinn writes:
That doesn't seem fair to me. "Explicitly forcing" by definition means the programmer has decided the value is needed, and "strictness analysis" is an optimization, not part of the language definition. Am I missing something?
Are thunks guaranteed not to leak out of the scope of the containing expression, so assignment is impossible? If they can leak, the thunk would be an object, and in Python "assigning" it to a "variable" actually just creates a reference. Evaluation could memoize the thunk's value ("evaluation only happens once") or reevaluate the thunk each time its value is requested. Either seems potentially surprising to me. Memoization makes sense if you think of the thunk as the closure of a computation that conceptually takes place at definition time. I'm not sure if in the use cases for thunks it really makes sense to "close" the thunk at evaluation time, but it seems like a plausible interpretation to me.

On 2014-03-06, at 08:29 , Stephen J. Turnbull <stephen@xemacs.org> wrote:
No, it means the programmer wants the value to be strictly evaluated (this is generally a space optimisation). It has no impact on program behavior (assuming infinite resources, one issue of accumulating thunks is the risk of OOM if they remain live but unevaluated).
No? It just demonstrates that not everything is a thunk in haskell, there are both thunks and strict values living around in the runtime, and that's not visible in the type system.
I'm not sure I understand the question.
If they can leak, the thunk would be an object
I think that's just one possible implementation, a thunk could also be some sort of reference indirection, a tagged pointer, or a deeply integrated proxy-ish object.
Absolutely.
I find it the most *useful* definition: for the other one we already have functions and I'm not sure thunks would have much use as shorter arguments-less functions. It also has the property that "client" code (code receiving thunks without necessarily being aware of it) remains valid and unsurprising in that a = foo b = foo assert a is b remains true regardless of `foo` being a thunk (the only difference being that if `foo` is a thunk it may remain unforced throughout).

On Sat, Mar 1, 2014 at 4:17 AM, Ron Adam <ron3200@gmail.com> wrote:
Interesting, but I don't like the way the interpretation of a function call depends on the target function. With both * and ** notations, there's absolutely no difference: the function is called with these positional and those keyword arguments, whether they came from actual args or from * or ** unpack/repacks; and there's no difference between a function that collects args with *args,**kwargs and one that collects them with individual names (or a C-level function that might do something altogether different). With this proposal, your star_lambda function's declaration changes the call site - instead of evaluating a*b+c, it has to construct an anonymous function and pass it along. ChrisA

On 02/28/2014 11:54 AM, Chris Angelico wrote:
It's not clear what differences you mean here... can you show some examples? I think we are just used to not thinking about it, but it's not really that different.

    def fn(*args, **kwds): ...

This wraps args in a list, and kwds in a dict. It's up to the *function called* to do what is intended by the syntax.

    def fn(*args): ...
    fn(a, b, c) --> fn(list(a, b, c))    # depends on function called.

    def fn(**kwds): ...
    fn(a=1, b=2, c=3) --> fn(dict(a=1, b=2, c=3))    # here too.

    def fn(***expr): ...
    fn(expr) --> fn(TriStar(lambda:(expr)))    # A bit more complex, but also the same.
                                               # Parentheses need to capture tuple packing
                                               # due to ',' having a higher precedence.

The mechanism behind each of these may be somewhat different, but there are also similarities.

    def fn(***expr):
        return ***expr

With these, it forwards these existing cases nicely.

    a, b, c = fn(a, b, c)
    args = fn(*args)
    kwds = fn(**kwds)
    args, kwds = fn(*args, **kwds)

And just like '**' can't be used to pack a dictionary directly, we can't use '***' to pack an expression directly. Using "**" in a function unpacks the dictionary. Using "***" in a function call expresses the TriStar object. (* any name for the TriStar object would work. (small detail))

NOW here is the main limitation... :-/

    a, b, c = fn(a, b, c=1)

Which is because (a, b, c=1) isn't a valid expression outside of a function call. Or should this be captured as (a, b, {"c":1})? Sigh... darn edge cases. A bit more than an edge case I think. Any ideas? Cheers, Ron

On 1 Mar 2014 05:43, "Ron Adam" <ron3200@gmail.com> wrote:
Remember that at compile time, Python has *no idea* what the actual signature of the target function is. Thus, all Python function calls use the following sequence (ignoring optimisations of special cases):

1. At the call site, the arguments are collected into a tuple of positional arguments and a dict of keyword arguments.
2. The interpreter hands that tuple and dict over to the target callable.
3. The *target callable* then maps the supplied arguments to the defined parameters, including filling in any default values.

Any function related proposals need to account for the fact that from the compiler's point of view *every* function signature looks like "(*args, **kwds)" (although it may have optimised paths for the no-args case and the positional-args-only case), and that the target callable may not even be written in Python. Cheers, Nick.
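The three steps above can be observed from pure Python. The sketch below uses two made-up functions, `collect` and `named`, to show that the callee sees the same positional tuple and keyword dict whether the caller spelled the arguments out or unpacked them.

```python
def collect(*args, **kwargs):
    """Return exactly what the caller handed over."""
    return args, kwargs


def named(a, b, sep):
    """Same data, but mapped onto named parameters by the callee."""
    return (a, b), {"sep": sep}


x = (1, 2)
opts = {"sep": "-"}

direct = collect(1, 2, sep="-")     # arguments written out at the call site
unpacked = collect(*x, **opts)      # arguments supplied via * and **
mapped = named(*x, **opts)          # the callee does the name mapping
```

In all three calls the call site only supplies positional and keyword arguments; nothing about the target's signature changes how the arguments are evaluated.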

On 02/28/2014 04:33 PM, Nick Coghlan wrote:
So functions can't be extended to take a triplet instead of a pair... (*args, **kwds, ***expr) Looking up... I think this is what you're referring to in ceval.c.

    #-------------------
    /* External interface to call any callable object.
       The arg must be a tuple or NULL.  The kw must be a dict or NULL. */

    PyObject *
    PyEval_CallObjectWithKeywords(PyObject *func, PyObject *arg, PyObject *kw)
    {
    #--------------------

And it wouldn't work in the normal case anyway, as expressions are evaluated as they are put on the stack before calling the function. Somehow I was thinking this morning the code inside the function call parentheses f(...), could be parsed later than it actually is. And in the context of the function definition, possibly similar to how a comprehension is evaluated. But it would take some pretty big changes to do that I suppose. Cheers, Ron

On Sat, Mar 1, 2014 at 1:13 PM, Ron Adam <ron3200@gmail.com> wrote:
The best way to do it would be to adorn the call site. We can currently do that with lambda: f(a * b + c) f(lambda: a * b + c) Those are completely different from each other, but completely consistent with themselves - it doesn't matter what f is, one of them passes the sum of the product and the other passes a callable. ChrisA
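A small runnable version of the two call sites, as a sketch; the pass-through function `f` and the values of `a`, `b`, `c` are made up for illustration.

```python
def f(arg):
    """Pass-through: returns whatever object the call site supplied."""
    return arg


a, b, c = 2, 3, 4

value = f(a * b + c)             # f receives the already-computed int
delayed = f(lambda: a * b + c)   # f receives a callable; nothing computed yet

a = 5                            # only the delayed form sees this change
later = delayed()
```

What `f` receives is decided entirely by how the call site is written: `value` was computed before the call, while `later` reflects the rebinding of `a`.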

On Feb 28, 2014, at 18:13, Ron Adam <ron3200@gmail.com> wrote:
So functions can't be extended to take a triplet instead of a pair...
(*args, **kwds, ***expr)
The more you elaborate this, the more this looks like Ruby procs: You get any number of normal arguments, then at most one special kind of callback thing which must come at the end. I'm not sure which use cases this solves. It doesn't work for expressions that need an argument, like a sorting key function. It also doesn't work for pre-existing functions that weren't designed to take ***expr, like Button or takewhile. And wrapping the expression in a call to a forwarding function doesn't seem any less verbose or more readable than just using lambda. Anyway, there are two fundamental problems with anything that doesn't require any syntax at the call site. First, it's not obvious, or even easily discoverable, to the reader when you're passing a quoted expression and when you're passing a value. Second, it's not discoverable to the compiler. Going back to the analogies from other languages again, Lisp gets around this by making functions and macros different things. Among other differences, a macro always gets expressions instead of values, without any need to quote them at the call site. And, as Haoyi Li has pointed out multiple times on the other threads, you can already do the same thing in Python (with MacroPy).

On Sat, Mar 1, 2014 at 6:42 AM, Ron Adam <ron3200@gmail.com> wrote:
It's not clear what differences you mean here... can you show some examples?
These are exactly the same: x = (1,2) f(*x) f(1,2) I played with dis.dis() and it seems there are special-case opcodes for calling a function with a variable number of arguments and/or with variable keyword args; but once it arrives at the other side, the two are identical:
The runtime fetches some callable, gives it some args, and says "Go do your stuff!". It doesn't care what that callable is - it could be a classic function defined with 'def' or 'lambda', it could be a type, it could be an object with __call__, it could be a built-in that's backed by a C function, anything. All that ends up arriving on the other side is: You have these positional args and these keyword args. Adding the tri-star to the mix suddenly changes that. A function is now capable of taking an expression, rather than an object. That's completely different, and it depends on the called function to distinguish one from the other. ChrisA

On 02/28/2014 08:14 PM, Chris Angelico wrote:
Right.. To make it work in the way I was thinking would require each function consisting of two parts. One part to build a name-space, and the other for executing the code. Then at call time, the first part could evaluate the expressions inside the function parentheses. This part would be new, and possibly work like a dict comprehension, except more specific to call signatures. Then the constructed name space would be passed directly to the code part for the actual call. Which would be quite different than what happens presently. I've refactored dis once for fun, also helped test some patches to dis, and played around with ceval enough to know how this stuff works, but when I haven't looked at the code recently (like now), I tend to think more in abstract terms, which helps with creativity and seeing new ways to do things. (And it is why I like reading this list.) But I sometimes miss on the objectivity side... [Note to self... look at the code more.] It comes down to this... "Creativity and Objectivity often don't want to occupy the same space at the same time." Cheers, Ron

Chris Angelico wrote:
And when you consider that this could happen with any argument to any function, depending on what the function turns out to be like at run time, it means that *all* function arguments would need to be passed as anonymous functions. That would be very awkward and inefficient. -- Greg

On Fri, Feb 28, 2014 at 11:17:18AM -0600, Ron Adam wrote:
You can't assign to True, False or None. (You can assign to True and False in Python 2, but shouldn't.) [...]
I think it would be useful to have a way to delay execution of an expression, that is to say, have a way to capture an expression for later evaluation, something more lightweight than a function, but I don't think that limiting it to inside function calls is the right approach. Something perhaps like a thunk might be appropriate? We can *almost* do that now, since Python has a compile function: thunk = compile("x + 3", "", "eval") # much later eval(thunk) Some problems with that approach: - You have to write the expression as a string, which means you lose any possibility of syntax highlighting. - The temptation is to pass some arbitrary untrusted string, which leads to serious security implications. The argument here should be limited to an actual expression, not a string containing an expression which might have come from who knows where. - It should be as lightweight as possible. The actual compilation of the expression should occur at compile-time, not run-time. That implies some sort of syntax for making thunks, rather than a function call. - Likewise actually evaluating the thunk should be really lightweight, which may rule out a function call to eval. - How should scoping work? I can see use-cases for flat scoping, static scoping, dynamic scoping, and the ability to optionally provide custom globals and locals, but I have no idea how practical any of them would be or what syntax they should use. All of this is pie-in-the-sky at the moment, and not a serious proposal for Python 3.5. -- Steven
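The compile/eval approach mentioned above already runs today. A minimal sketch, with an explicit namespace dict (an addition here, so the example does not depend on the surrounding scope) and an arbitrary file name `<thunk>`:

```python
namespace = {"x": 1}

# Compiling does not evaluate anything; x is not looked up yet.
thunk = compile("x + 3", "<thunk>", "eval")

# Much later: the name is resolved only at evaluation time.
namespace["x"] = 39
result = eval(thunk, namespace)
```

This shows the delayed lookup, but it also shows the drawbacks listed in the message: the expression is a string, and compilation happens at run time.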

On 02/28/2014 09:46 PM, Steven D'Aprano wrote:
I meant it the other way around. For example... def is not assignable to anything.. D = def # won't work T = True # works F = False N = None L = lambda # won't work lambda isn't an object like True, False, and None.
The expressing part isn't limited to inside function calls.. Or wouldn't be if it was a doable idea.
    def thunk(***expr):
        return expr

    def do_thunks(*args):
        for expr in args:
            ***expr

    start = t = time()
    update_timer = thunk(t = time())
    show_timer = thunk(print(t-start))
    show_status = thunk(do_thunks(update_timer, show_timer))

Then as you do things, possibly in different functions as well.

    ...
    ***show_status    # updates t, and prints elapsed time.
    ...
    ***show_status
    ...
    ***show_status
    ...

Yes, it could be done with lambda too. Here's where it differs... So if we have this, where obj is nested expressions.

    while 1:
        try:
            obj = ***obj
        except TypeError:
            break

Then the result of the obj expression could be a callable without any conflict. Only delayed expressions would be expressed. This was one of the properties I was looking for.
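The "only delayed expressions are expressed" property can be emulated without new syntax by giving delayed expressions their own type. This is a sketch under that assumption: `Thunk`, `express`, and `express_all` are hypothetical names, and an `isinstance` check stands in for the TypeError used in the loop above.

```python
class Thunk:
    """Hypothetical stand-in for an expression captured with ***."""

    def __init__(self, fn):
        self.fn = fn

    def express(self):
        return self.fn()


def express_all(obj):
    """Keep expressing while obj is a Thunk; stop at the first ordinary object."""
    while isinstance(obj, Thunk):
        obj = obj.express()
    return obj


# The innermost result is itself a callable (the builtin len); it is
# returned uncalled because it is not a Thunk.
nested = Thunk(lambda: Thunk(lambda: len))
result = express_all(nested)
```

With plain lambdas, the loop could not tell a delayed expression from a callable result; a distinct wrapper type removes that ambiguity.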
That's where the idea I mentioned in another response to this thread comes in. Separating the namespace constructing from the code part of a function in a well defined way. If that can be done... also pie in the sky. Then we could also evaluate an expression in a previously defined namespace. It's hypothetical, so supply your own syntax for it. The compiler will still compile the code, so most of the work is still done at compile time. But we have these useful pieces we can take apart and put back together again. (without calling eval, or exec.) BTW, you can do that now, but it's very hard to get right. help(type(lambda:1))
All of this is pie-in-the-sky at the moment, and not a serious proposal for Python 3.5.
Agree. Cheers, Ron

On 1 March 2014 13:46, Steven D'Aprano <steve@pearwood.info> wrote:
FWIW, I think this is why PEP 312 (simple implicit lambda) garnered any interest at all: sure, you can't pass such a lambda any arguments, but it works fine as a *closure*. It's also potentially worth trawling the python-ideas archives for the various discussions about deferred and delayed expressions - they're generally all variants of the same basic idea, a less esoteric, easier to learn way to handle one-shot callbacks and other forms of lazy expression evaluation. The "lambda" keyword in the current syntax ends up being a distraction rather than an aid to understanding. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Feb 28, 2014, at 19:46, Steven D'Aprano <steve@pearwood.info> wrote:
Something perhaps like a thunk might be appropriate? We can *almost* do that now, since Python has a compile function:
To save Haoyi Li the trouble of saying it: we _can_ do this now, since Python has import hooks, so you can use MacroPy to quote an expression, giving you an AST to evaluate later, or automatically wrapping it up in a function (like quote and function, respectively, in Lisp). There's also an in-between option of compiling it to a code object but not wrapping that in a function object. There's one other crazy option for representing quoted expressions/thunks: as lazy futures. This is how Alice does it, and it's also implicitly what dataflow languages like Oz are doing (every variable is a lazy future). In Python, this would really just be a wrapper around a function call, so I don't think it would buy anything. Which one of those do people want? I'm not sure. Since you did a great job listing the problems of the string-quoting solution, we can compare the options for each one.
- You have to write the expression as a string, which means you lose any possibility of syntax highlighting.
Obviously not a problem here.
Again, not a problem.
Not a problem for the function or code versions. For the AST version, you're doing part of the compilation at compile-time, and the rest at runtime.
- Likewise actually evaluating the thunk should be really lightweight, which may rule out a function call to eval.
A function is obviously no better or worse than using lambda today. A code object has to be evaluated by passing it to eval, or wrapping it in a function and calling it. I suspect the former may be lighter weight than calling a function, but I really don't know. The latter, on the other hand, is obviously heavier than calling a function, but not by that much. An AST has to be compiled to a code object, after which you do the same as the above. Obviously this is heavier than not having to compile.
This, I think, is the big question. A function is clearly a normal, static-scoped closure. An AST or code object, you could do almost any form of scoping you want _except_ static, either by building a function around it or by calling eval on it. (You can do additional tricks with an AST, but I don't think most programs would want to. For statements, this could be useful for optional hygienic variables, but for expressions that isn't an issue.) If we want static scoping, we really need functions. At least we need some kind of object that has some form of code plus a closure mechanism--and that's pretty much all functions are. On the other hand, if we want dynamic, flat, or customizable scoping, we need something we can either build a function object out of dynamically, or evaluate dynamically--and that's pretty much what code objects are.
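The scoping contrast can be sketched with existing tools: a lambda for the static-scoped closure, and a compiled code object evaluated against whatever namespace the evaluator picks. The names `make_closure` and `<thunk>` and the sample values are illustrative only.

```python
def make_closure():
    x = 1
    # Static scoping: this x always refers to make_closure's local x.
    return lambda: x + 3


closure = make_closure()
code = compile("x + 3", "<thunk>", "eval")

x = 100  # a different x; the closure never sees it

static_result = closure()
in_one_scope = eval(code, {"x": 10})      # evaluator chooses the namespace
in_another_scope = eval(code, {"x": 20})  # same code, different binding
```

The closure is tied to the scope where it was written, while the code object has no scope of its own until someone evaluates it.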

On 1 Mar 2014 05:46, "Steven D'Aprano" <steve@pearwood.info> wrote:
What about a new code literal object, e.g. thunk = c'x + 3' thunk(x=2) The "c" prefix identifies the literal as code so highlighters can recognise it and it can be parsed. Triple quotes could also be used to support multiline code blocks. It would also be possible to prohibit any direct conversion from a string to avoid the security problems. I'm not sure how positional arguments would be handled though, and scoping still needs to be thought through.

On Mar 3, 2014, at 22:02, David Townshend <aquavitae69@gmail.com> wrote:
Why does it need to be built from/look like a string? I think it would be just as simple for the parser, and simpler for editors, and less misleading for readers, if it used a different marker. Not that I'm seriously suggesting backticks here, but... thunk = `x + 3` Most existing editors already treat that as an expression (since in 2.x it means repr(x + 3)). No human reader is going to be misled into thinking this means there's a way to create code objects from string objects without eval (c.f. all those questions on StackOverflow about how to make a raw string from a string...). And as far as the parser is concerned, it's trivial: '`' expr '`' is a Code whose value is whatever expr parses to. Meanwhile, if this gives you a code object, what do you do with it? Using eval on something that looks like a string but isn't may be correct to anyone who understands Python deeply, but I think it could be very misleading to anyone else. And creating a FunctionType from a code object is pretty ugly for something that deserved literal support...
Triple quotes could also be used to support multiline code blocks.
That is one advantage of reusing strings. In fact, that gives us a way to embed statements in the middle of expressions. Which I'm not sure is a good thing.
I'm not sure how positional arguments would be handled though, and scoping still needs to be thought through.
If you have a code object, scoping is up to whoever evaluates it or builds a function out of it.

On Mar 3, 2014, at 22:02, David Townshend <aquavitae69@gmail.com> wrote:
thunk = lambda: x + 3 thunk(x=2) Your original post in this thread wanted to allow functions to magically capture their parameters as code (~ call by name) and it's been pointed out that that is incompatible with the way the language is designed and function calls are implemented. Yes, there are languages where the called function gets to tell the caller how to package arguments but Python isn't one of them. Now you seem to have veered in a different direction. I have no idea what problem you are trying to solve. --- Bruce

On Tue, Mar 4, 2014 at 9:03 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
The only real reason for it looking like a string is a shortage of symbols. Backticks are off limits, and most other symbols are just plain ugly (e.g. @x + 3@ or $x + 3$) or already in use in a way that could lead to ambiguity (e.g. |x + 3|). String-like quotes seem like a better option than the alternatives.
My suggestion was to use c'x + 3' as an effective replacement for compile('x + 3', '', 'eval'). The use of a literal rather than compile would solve a few of the issues raised in Steven's original post. I don't see why it would be any more misleading to eval a code literal than a compiled string. I did suggest making it callable (i.e. thunk(x=3)), but on second thoughts I'm not sure that's a good idea. It adds complexity without any real benefit over eval.

On Tue, Mar 04, 2014 at 11:03:35AM +0200, David Townshend wrote:
Agreed.
I don't think calling the thunk is the right API. If we have this sort of functionality, I want calling a thunk to be treated like any other expression, thunk[0] or thunk+1. I don't want it to mean "evaluate this thunk". An example:

    thunk = `lambda x: x + y`

    def func(y, obj):
        return obj + 1

    result = func(50, thunk(100))

(This is not a use-case, just an illustration.) In this example, the expression "thunk(100)" is delayed until inside the func scope. Inside that scope, we evaluate this expression: (lambda x: x + y)(100) where y has the value 50, and bind the result of this to the name obj. The lambda part returns a closure, and calling that closure with argument 100 returns 150, so this is equivalent to binding obj=150. Then func continues, evaluates "obj + 1", and returns 151. -- Steven
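The walk-through above can be approximated with compile and eval. This is an emulation under stated assumptions, not the proposed semantics: the delayed expression is written out as a source string, and `func` evaluates it explicitly in a namespace holding its own `y`.

```python
# The delayed expression thunk(100), with thunk standing for
# `lambda x: x + y`, written out in full as source text.
delayed = compile("(lambda x: x + y)(100)", "<thunk>", "eval")


def func(y, obj_code):
    # Evaluate inside a namespace that contains func's y. The binding is
    # passed as *globals* because the lambda body looks y up as a global
    # and cannot see a separate locals mapping given to eval.
    obj = eval(obj_code, {"y": y})
    return obj + 1


result = func(50, delayed)
```

The need to choose between globals and locals here is a small taste of the scoping questions raised elsewhere in the thread.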

On Mon, Mar 03, 2014 at 11:03:03PM -0800, Andrew Barnert wrote:
Off the top of my head, I think that looks alright.
thunk(x=2)
I'm not entirely sure about that. That basically makes these thunks just a function. I'm still not quite sure how this should actually be used in practice, so as far as I'm concerned this is just pie in the sky thinking aloud. If thunks are just functions, why not make them functions? There needs to be something extra (different scoping rules, faster/more lightweight, *something*) to make the idea worthwhile. I think I need to learn more about Algol and other languages that use call-by-name and thunks. I may be completely on a wild-goose chase here, but I'm surely not the only person who has needed to delay evaluation of an expression. http://en.wikipedia.org/wiki/Thunk
Actually I'd be happy with backticks, but Guido has said No Backticks Ever Again. So until the Glorious Revolution, we're stuck.
Agreed. What I have in my head is some vague concept that the Python evaluation rules will somehow know when to evaluate the thunk and when to treat it as an object, which is (as I understand it) what happens in Algol. Again, just thinking aloud, perhaps we do this: thunk = `some_expression` # delays evaluation a = [0, 1 + thunk] # evaluates thunk in the current scope b = [0, `1 + thunk`] # delays evaluation and creates a thunk object # equivalent to `1 + some_expression` c = b[1] # now evaluates the thunk object d = f(2, thunk) # evaluates thunk in f's scope e = g(3, `thunk`) # passes the un-evaluated thunk object to g Consider this just a sketch, and in no way fully thought out. (This will most definitely need a PEP.)
I have little interest in allowing thunks to be statements. If you want a delayed statement, use compile and eval. Or def. (But maybe that applies to expressions too?) Did I mention this needs a PEP? -- Steven

On Tue, Mar 4, 2014 at 10:09 PM, Steven D'Aprano <steve@pearwood.info> wrote:
PEP can come later. First, let's get some solid use-cases, and start looking at implications. The way it's described here, there's effectively magic when you try to look at an object of this type, which will break a lot of assumptions. Most people expect that: foo = bar assert foo is bar to be a safe assumption, but if bar is a thunk, then it's getting evaluated separately in each of those, and that's potentially going to create different objects and/or even have side effects. That's going to surprise people. On the flip side, that's something that could be dealt with with a naming convention for thunks. We have _private, __mangled, __magic__, anticollision_, CONSTANT, Class... maybe we could have thunk__ or something. It's most confusable with magic and anticollision, but since both of those are connected with specific keywords, it's reasonably likely there'll be no actual confusion. Specific downside: There's no way to actually pass an unevaluated thunk around. Technically, `thunk__` will create a new wrapper thunk and pass that along. That'll often have the same effect, but it won't be quite identical (same as there's a difference between f(g) and f(lambda x: g(x)) in that the second one lazily looks up g), so that might cause confusion. But you could always special-case it: writing `thunk__` will be guaranteed to transmit the thunk unchanged, and if you actually mean to add another wrapper layer, use `(thunk__)` or something instead. The biggest thing to figure out is scoping. Does a thunk take a snapshot of its enclosing scope (a closure), or is it an expression that gets evaluated in the target namespace? The latter could be extremely confusing, but the former's just what a nested function does, so this'd just be a new lambda syntax. Python's existing lambda syntax has its flaws and its detractors, but it has one huge benefit over thunking: It exists. :) Thunking has to get over that difference. I'd like to see some proposals. ChrisA
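The difference between f(g) and f(lambda x: g(x)) mentioned above is easy to demonstrate. In this sketch `f` and both versions of `g` are invented for illustration.

```python
def g(x):
    return x + 1


def f(callback):
    """Pass-through, so the result is exactly what the call site supplied."""
    return callback


direct = f(g)                  # captures the function object g names right now
wrapped = f(lambda x: g(x))    # looks the name g up again on every call


def g(x):                      # rebind the name g to a different function
    return x * 100


direct_result = direct(2)      # still uses the original g
wrapped_result = wrapped(2)    # sees the rebound g
```

A wrapper thunk around a thunk would differ from the original in the same way: the extra layer defers the name lookup as well as the evaluation.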

On 2014-03-04, at 13:07 , Chris Angelico <rosuav@gmail.com> wrote:
Why? Either it's forced during assignment or both names map to the same thunk, and are both forced when any of them is. That could be during the identity check, but since both names refer to the same thunk they can only yield the same value, so the identity check need not force the thunk. An equality test would likely force the thunk.
Is there a use case for actually thunking a thunk?
That is essentially what a thunk is, at least in my experience: it is conceptually a nullary memoized function, forced (evaluated/called) if an actual value/object is ever needed but potentially thrown out during reduction. The difference, under the proposed semantics, is that the forcing of the thunk would be implicit where that of a function is explicit (not sure that's a good idea in a strict language).

On Tue, Mar 4, 2014 at 11:50 PM, Masklinn <masklinn@masklinn.net> wrote:
Most certainly not. Try this:

    bar = `[1,2,3]`
    foo = bar
    spam = bar
    assert foo is spam

Here's the evaluated version:

    foo = [1,2,3]
    spam = [1,2,3]
    assert foo is spam

You can try this one out directly. They're not going to be the same object - they'll be two separate lists. They will be equal, in this case, but there's no guarantee of that either:

    bar = `random.random()`

Separate evaluation of the same expression isn't guaranteed to have the same result, AND it might have unexpected side effects:

    bar = `whiskey.pop().drink()`

and you might find yourself underneath the bar before you know it.

ChrisA
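Since the backtick syntax does not exist, the re-evaluation reading can be approximated today with an ordinary zero-argument function standing in for the thunk. A minimal sketch, assuming every use of the name forces a fresh evaluation:

```python
def bar():
    # Stand-in for the hypothetical thunk `[1,2,3]`: each call re-evaluates the expression.
    return [1, 2, 3]

foo = bar()   # first evaluation
spam = bar()  # second evaluation builds a separate list

same_object = foo is spam  # two distinct list objects
equal_value = foo == spam  # equal contents in this particular case
```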

On 2014-03-04, at 15:14 , Chris Angelico <rosuav@gmail.com> wrote:
I don't agree with this, again why would the thunk be evaluated twice? If thunks are added to *delay* expression evaluation (which is what I understood from Steven's messages) surely something akin to Haskell's semantics is simpler to understand and implement. That is, instead of thunks being sugar for:

    bar = lambda: expr

they're sugar for

    bar = memoize(lambda: expr)
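To make the memoized reading concrete, here is a rough sketch of what `memoize(lambda: expr)` could look like. The class name `MemoThunk` and its `force` method are assumptions for illustration; the proposal discussed here would force implicitly, whereas this sketch needs an explicit call.

```python
class MemoThunk:
    """Hypothetical helper: wraps a zero-argument callable and caches its result."""

    _UNSET = object()  # sentinel so that None can be a legitimate cached value

    def __init__(self, func):
        self._func = func
        self._value = MemoThunk._UNSET

    def force(self):
        # Evaluate the wrapped expression at most once, then reuse the result.
        if self._value is MemoThunk._UNSET:
            self._value = self._func()
            self._func = None  # the expression is no longer needed
        return self._value

calls = []

def make_list():
    calls.append("evaluated")
    return [1, 2, 3]

bar = MemoThunk(make_list)
foo = bar.force()
spam = bar.force()  # returns the cached list, no second evaluation
```

Under this reading `foo is spam` holds, because both names end up bound to the single object produced by the one evaluation.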
Obviously, but why would repeated evaluation of the expression be desirable?

On Wed, Mar 5, 2014 at 1:55 AM, Masklinn <masklinn@masklinn.net> wrote:
Okay. That's an interesting point, and a distinction from lambda. Effectively, once a thunk has been evaluated, it devolves to its value. That would be *extremely* interesting in the case of lazy evaluation - you could actually make a ternary-if function:

    value = true_expr if cond else false_expr

    def if_(cond, true_thunk, false_thunk):
        if cond: return true_thunk
        return false_thunk

    value = if_(cond, `true_expr`, `false_expr`)

There are still the questions of scoping, but now you have the distinction between a thunk and a function. And if it's at all possible, the thunk could even replace itself in memory with its result - that would be massively implementation-dependent, but since you can't look at the identity of the thunk itself anyway, it wouldn't break anything. Effectively, as soon as you evaluate bar, it assigns to bar whatever the expression evaluates to. (Which is simpler and cleaner to explain than the memoization.)

ChrisA
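The same ternary-if function can be written in current Python if the caller wraps each branch in a lambda and the function forces the chosen one explicitly. A sketch under that assumption; `branch` is a made-up helper that only records which branch ran.

```python
def if_(cond, true_thunk, false_thunk):
    # Only the selected branch is called; the other is never evaluated.
    if cond:
        return true_thunk()
    return false_thunk()

evaluated = []

def branch(name, value):
    # Hypothetical helper: records that this branch was evaluated.
    evaluated.append(name)
    return value

value = if_(True, lambda: branch("true", 1), lambda: branch("false", 2))
```

The difference from the proposal is purely syntactic: the lambdas and the explicit calls are what the backticks and implicit forcing would hide.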

Steven D'Aprano wrote:
But Algol has the benefit of static typing -- the procedure being called explicitly declares whether the argument is to be passed by name or value. Python has no idea about that at compile time.
When exactly does implicit evaluation of a thunk object occur? Does `b[1]` give you an unevaluated thunk object? What if b is a custom sequence type implemented in Python -- how does its __getitem__ method avoid evaluating the thunk object prematurely? None of these problems occur in Algol, because its thunks are not first-class values (you can't store them in arrays, etc.) and it uses static type information to tell when to create and evaluate them. -- Greg

On Wed, Mar 5, 2014 at 9:31 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Does `b[1]` give you an unevaluated thunk object?
I would say that `b[1]` should create a brand new thunk object of that expression. On evaluation of that, it'll look up b, subscript it, and touch the thunk. When it does, it Schrodingers itself into a value and that's that. In my opinion, __getitem__ isn't a problem, but __setitem__ is. Somewhere along the way, you have to be able to "hold" a thunk as a thunk. I've no idea how that works when you pass it around from function to function. ChrisA

On Mar 4, 2014, at 14:31, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
This is the main reason I think it's more productive to think of this in terms of Lisp-style quoting than Algol-style thunks. The fact that quoting/thunking gives you a code object instead of an AST (sexpr) is not significant here* (unless we're also considering adding macros). The fact that it gives you a first-class value, and that we can't use the "implicit casting" syntax that comes with static typing to magically evaluate the object at the right time, is critical. This is basically the same problem I described trying to implement Boost-style auto lambdas in Python without C++-style implicit cast from lambda to function.

* It is significant _elsewhere_. In an AST, "b" is just a reference to a name; in a code object, we have to have already determined whether it's a reference to a fast, closure, or global name--unless you want to do something like automatically compile `b[1]` as if it were something like vars('b')[1], or to do something equivalent like add a LOAD_DYNAMIC opcode.
There's really no way to cleanly distinguish the c, d, and e cases. There are really only two possibilities: Magic evaluation everywhere (so d is passing in eval(thunk) to f), or explicit evaluation (whether via eval or otherwise) only (so c is just binding the unevaluated thunk). I think either of those is potentially viable, but I don't think there's anything in between. Note that you can always quote the implicit evaluation (if you go the first way) or explicitly eval anywhere you want (the second), so neither one really limits what you can do.
You're creating a new thunk/quoting a new expression inside the backticks, so the issue doesn't arise. You're creating a new code object that looks up b and indexes it. The __getitem__ call doesn't happen until that new code object is evaluated.
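This can be approximated with the built-in `compile` and `eval`, using a code object as a stand-in for the quoted expression. The `Recording` class is a made-up sequence type that counts `__getitem__` calls; the sketch assumes explicit evaluation via `eval`.

```python
class Recording:
    """Hypothetical sequence type that counts how often it is indexed."""

    def __init__(self, items):
        self.items = items
        self.calls = 0

    def __getitem__(self, index):
        self.calls += 1
        return self.items[index]

b = Recording(["x", "y"])

# "Quoting" b[1]: this builds a code object but does not look up or index b.
quoted = compile("b[1]", "<quoted>", "eval")
calls_before = b.calls

# Evaluating the code object is what finally triggers __getitem__.
result = eval(quoted, {"b": b})
calls_after = b.calls
```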

On 03/04/2014 07:05 PM, Andrew Barnert wrote:
On Mar 4, 2014, at 14:31, Greg Ewing<greg.ewing@canterbury.ac.nz> wrote:
Steven D'Aprano wrote:
In an experimental language I'm writing, I use a concept I call "Context Resolution" to resolve objects to expected kinds of objects. The expected type/kind is determined in the context of how things are used together rather than by how they are defined. (That allows everything to be objects). Keywords, Names, Expressions, and CodeBlocks, etc... In Python it would probably depend on AttributeError instead of the type. If an object doesn't have the needed attribute, then it could try calling a different method, possibly __resolve__. Then retry the attribute lookup again on the result. If there's no __resolve__ attribute, then the AttributeError would be raised as usual. The chained __resolve__ resolution attempts would also give a useful exception backtrace. Cheers, Ron
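A rough sketch of how that lookup-then-resolve fallback might behave, written as an ordinary function rather than as interpreter behaviour. The names `resolving_getattr`, `Deferred`, and the `max_depth` limit are assumptions for illustration; only `__resolve__` comes from the message above.

```python
def resolving_getattr(obj, name, max_depth=10):
    """Hypothetical helper: look up name on obj, falling back to __resolve__."""
    for _ in range(max_depth):
        try:
            return getattr(obj, name)
        except AttributeError:
            resolve = getattr(type(obj), "__resolve__", None)
            if resolve is None:
                raise  # no __resolve__: the AttributeError propagates as usual
            obj = resolve(obj)  # retry the lookup on the resolved object
    raise AttributeError(name)

class Deferred:
    """Hypothetical deferred value that resolves by calling a stored function."""

    def __init__(self, func):
        self._func = func

    def __resolve__(self):
        return self._func()

d = Deferred(lambda: "hello")
upper = resolving_getattr(d, "upper")()  # Deferred has no .upper, so it resolves first
```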

On Tue, Mar 04, 2014 at 05:05:39PM -0800, Andrew Barnert wrote:
I'm not convinced that this really matters, but for the sake of the argument let's say it does.
This is the main reason I think it's more productive to think of this in terms of Lisp-style quoting than Algol-style thunks.
Can you give more detail please? -- Steven

From: Steven D'Aprano <steve@pearwood.info> Sent: Wednesday, March 5, 2014 3:25 AM
It's not really static typing that's key here; it's that, unless you have explicit syntax for all uses of thunks, you need implicit type casts, and static typing is necessary for implicit type casts. (However, as I said before, this doesn't have to be a deal-breaker. You can probably get away with making all references either immediate or delayed, so long as the syntax for doing the other explicitly isn't too obtrusive and the performance cost isn't too high. For example, if referencing a thunk always evaluates, `thunk` just creates a new thunk that will evaluate the old one, and you could optimize that into not much heavier than just passing the old one around; with a JIT, like PyPy, you could even optimize it into just passing the old one around.)
I think I've already explained it, but let me try again in more detail. A quoted expression and a thunk are similar things: ways to turn a language expression into something that can be evaluated or executed later. But there are two major differences.

First, quoted expressions are first-class values, while thunks are not.

Second, quoted expressions have an inspectable (or pattern-matchable) structure, while thunks do not. You could relate that second difference to ASTs vs. code objects in Python, but since idiomatic Python does not use macros or any other generation or parsing of ASTs, it really isn't a visible difference, so the first one is more important.*

And then there's the historical difference: thunks come from static languages, quoting from dynamic languages. Together with being first-class values, this means quoted expressions need explicit but not-too-intrusive syntax not just for creating them, but also either for evaluating them, or for continuing to delay them. So, that's why I think quoting is a more apt model for what we're trying to accomplish in this thread.

* As I mentioned before, it does actually matter for _implementation_ whether we use ASTs or code objects, at least if we want dynamic or flat in-place scoping, because the former would allow us to look up names at execution time, while the latter would not, unless we either add a LOAD_DYNAMIC opcode or compile name lookup into explicit dynamic lookup code. But as long as we define what the scoping rules are, users won't care how that's implemented.
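The AST versus code object distinction can be seen with the standard `ast` module. This is only an illustration of the two representations, not a proposed implementation; the expression `b[1]` is the one used earlier in the thread.

```python
import ast

source = "b[1]"

# The AST form is inspectable: we can see it is a subscript applied to the name b.
tree = ast.parse(source, mode="eval")
is_subscript = isinstance(tree.body, ast.Subscript)
target_name = tree.body.value.id

# Compiling yields a code object that can be evaluated later, in a namespace
# chosen at evaluation time.
code = compile(tree, "<quoted>", "eval")
result = eval(code, {"b": [10, 20, 30]})
```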

On 2014-03-04, at 23:31 , Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That's not really a blocker though, Haskell thunks are implicit and not type-encoded. A name may correspond to an (unforced) thunk or to a strict value (an already forced thunk, whether through a previous implicit forcing, through an explicit forcing such as a strict annotation, or through a decision of the strictness analyser).
There are definitely difficulties in deciding how the decision to force a thunk comes about.

Masklinn wrote:
That comes at the expense of making *everything* a thunk until its value is needed. This would be a big, far-reaching change in the way Python works. Also, Haskell has the advantage of knowing that the value won't change, so it can replace the thunk with its value once it's been calculated. Python would have to keep it as a thunk and reevaluate it every time. And Haskell doesn't have to provide a way of passing the thunk around *without* evaluating it. -- Greg

On 2014-03-05, at 22:51 , Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Of course not, Haskell allows explicitly forcing a thunk and the compiler/runtime pair can use strictness analysis and not create thunks in the first place.
This would be a big, far-reaching change in the way Python works.
Having Haskell-type thunks does not mean using the same default as Haskell. Python would have the opposite one, instead of creating thunks by default it would create them only when requested.
Also, Haskell has the advantage of knowing that the value won't change
I do not see why that would be relevant to thunks. The result of the thunk may be mutable, so what?
See above, I do not see why that would happen. Consider that a thunk is a deferral of an expression's evaluation, why would said expression's evaluation happen more than once? The only thing which changes is *when* the actual evaluation happens.
And Haskell doesn't have to provide a way of passing the thunk around *without* evaluating it.
Well no, instead it takes the sensible option to force thunks when they need to be forced (or when their forcing is explicitly requested e.g. using ($!) or "bang pattern" extensions, but the need for those is mostly because Haskell defaults to creating thunks I would say). Why would passing a reference around force a thunk in the first place?

Masklinn wrote:
On 2014-03-05, at 22:51 , Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That's not what I mean. The value of the thunk can depend on other things that change. In general you can't get away with evaluating it once and caching the result. -- Greg
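Greg's concern can be illustrated by comparing a cached and an uncached evaluation of an expression that depends on mutable state. A small sketch; `state` and `expr` are made-up names, and `functools.lru_cache` stands in for "evaluate once and cache".

```python
import functools

state = {"x": 1}

def expr():
    # The deferred expression: its value depends on state that can change later.
    return state["x"] + 1

# Memoized version: evaluated once, result reused afterwards.
cached = functools.lru_cache(maxsize=None)(expr)

first_cached = cached()
first_fresh = expr()

state["x"] = 10  # something the expression depends on changes

second_cached = cached()  # still the old result
second_fresh = expr()     # reflects the new state
```

Which of the two behaviours is wanted is exactly the point in dispute: memoization fixes the value at first use, re-evaluation tracks later changes.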

Masklinn writes:
That doesn't seem fair to me. "Explicitly forcing" by definition means the programmer has decided the value is needed, and "strictness analysis" is an optimization, not part of the language definition. Am I missing something?
Are thunks guaranteed not to leak out of the scope of the containing expression, so assignment is impossible? If they can leak, the thunk would be an object, and in Python "assigning" it to a "variable" actually just creates a reference. Evaluation could memoize the thunk's value ("evaluation only happens once") or reevaluate the thunk each time its value is requested. Either seems potentially surprising to me. Memoization makes sense if you think of the thunk as the closure of a computation that conceptually takes place at definition time. I'm not sure if in the use cases for thunks it really makes sense to "close" the thunk at evaluation time, but it seems like a plausible interpretation to me.

On 2014-03-06, at 08:29 , Stephen J. Turnbull <stephen@xemacs.org> wrote:
No, it means the programmer wants the value to be strictly evaluated (this is generally a space optimisation). It has no impact on program behavior (assuming infinite resources, one issue of accumulating thunks is the risk of OOM if they remain live but unevaluated).
No? It just demonstrates that not everything is a thunk in Haskell, there are both thunks and strict values living around in the runtime, and that's not visible in the type system.
I'm not sure I understand the question.
If they can leak, the thunk would be an object
I think that's just one possible implementation, a thunk could also be some sort of reference indirection, a tagged pointer, or a deeply integrated proxy-ish object.
Absolutely.
I find it the most *useful* definition: for the other one we already have functions and I'm not sure thunks would have much use as shorter arguments-less functions. It also has the property that "client" code (code receiving thunks without necessarily being aware of it) remains valid and unsurprising in that

    a = foo
    b = foo
    assert a is b

remains true regardless of `foo` being a thunk (the only difference being that if `foo` is a thunk it may remain unforced throughout).
participants (11): Andrew Barnert, Bruce Leban, Chris Angelico, David Townshend, Greg Ewing, Masklinn, MRAB, Nick Coghlan, Ron Adam, Stephen J. Turnbull, Steven D'Aprano