Syntax for late-bound arguments
Spun off from PEP 505 thread on python-dev.

One of the weaker use-cases for None-coalescing is function default arguments, where the technical default is None but the function wants to use some other value instead. This is a weak argument in favour of None-coalescing because only a small set of such situations are only slightly improved by a "??" operator, and the problem is far larger.

Consider the bisect.bisect() function [1]:

    def bisect(a, x, lo=0, hi=None):
        if lo < 0:
            raise ValueError('lo must be non-negative')
        if hi is None:
            hi = len(a)

It's clear what value lo gets if you omit it. It's less clear what hi gets.

And the situation only gets uglier if None is a valid argument, and a unique sentinel is needed; this standard idiom makes help() rather unhelpful:

    _missing = object()
    def spaminate(thing, count=_missing):
        if count is _missing:
            count = thing.getdefault()

Proposal: Proper syntax and support for late-bound argument defaults.

    def spaminate(thing, count=:thing.getdefault()):
        ...

    def bisect(a, x, lo=0, hi=:len(a)):
        if lo < 0:
            raise ValueError('lo must be non-negative')

This would be approximately equivalent to the _missing idiom, with the following exceptions:

1) Inspecting the function would reveal the source code for the late-bound value
2) There is no value which can be passed as an argument to achieve the same effect
3) All late-bound defaults would be evaluated before any other part of the function executes (ie the "if hi is None" check would now be done prior to "if lo < 0").

The syntax I've chosen is deliberately subtle, since - in many many cases - it won't make any difference whether the argument is early or late bound, so they should look similar. While it is visually similar to the inline assignment operator, neither form is currently valid in a function's parameter list, and thus should not introduce ambiguity.

The expression would be evaluated in the function's context, having available to it everything that the function has. Notably, this is NOT the same as the context of the function definition, but this is only rarely going to be significant (eg class methods where a bare name in an early-bound argument default would come from class scope, but the same bare name would come from local scope if late-bound).

The purpose of this change is to have the function header define, as fully as possible, the function's arguments. Burying part of that definition inside the function is arbitrary and unnecessary.

ChrisA

[1] Technically it's bisect_right but I'm using the simpler name here. Also, it recently grew a key parameter, which would only get in the way here.
https://docs.python.org/3/library/bisect.html#bisect.bisect_right
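For readers who want to try the status-quo idiom described above, here is a minimal runnable sketch. The `Thing` class, the return value of `spaminate`, and the `bisect_hi` helper are assumptions made up for illustration; only the sentinel pattern itself comes from the message. It also shows why introspection is unhelpful: the signature displays the sentinel object, not the real default.

```python
import inspect

_missing = object()  # unique sentinel: no caller can pass this by accident


class Thing:
    """Hypothetical stand-in for whatever object spaminate() receives."""

    def getdefault(self):
        return 3


def spaminate(thing, count=_missing):
    # Late binding done by hand: the real default is computed on each call.
    if count is _missing:
        count = thing.getdefault()
    return ["spam"] * count


def bisect_hi(a, hi=None):
    # Same idea with None as the sentinel (only safe when None is not a valid value).
    if hi is None:
        hi = len(a)
    return hi


# The signature shows the sentinel's repr, not "thing.getdefault()".
signature_text = str(inspect.signature(spaminate))
print(signature_text)
```

The printed signature contains something like `count=<object object at 0x...>`, which is the help() problem the proposal is trying to address.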
On Sun, Oct 24, 2021 at 3:26 AM Ethan Furman <ethan@stoneleaf.us> wrote:
(Truncated email, I presume)

Yeah, I'm not wedded to the precise syntax. But it needs to be simple and easy to read, it needs to not be ugly, and it needs to not be valid syntax already. There are a few alternatives, but I like them even less:

    def bisect(a, x, lo=0, hi=@len(a)):
    def bisect(a, x, lo=0, hi=?len(a)):
    def bisect(a, x, lo=0, hi=>len(a)):
    def bisect(a, x, lo=0, hi=\len(a)):
    def bisect(a, x, lo=0, hi=`len(a)`):
    def bisect(a, x, lo=0, hi!=len(a)):

Feel free to propose an improvement to the syntax. Whatever spelling is ultimately used, this would still be of value.

ChrisA
On 2021-10-24 at 03:07:45 +1100, Chris Angelico <rosuav@gmail.com> wrote:
[...]
Those two paragraphs contradict each other. If the expression is evaluated in the function's context, then said evaluation is (by definition?) part of the function and not part of its arguments.

As a separate matter, are the following (admittedly toy) functions (a) an infinite confusion factory, or (b) a teaching moment?

    def f1(l=[]):
        l.append(4)
        return l

    def f2(l=:[]):
        l.append(4)
        return l
On Sun, Oct 24, 2021 at 6:18 AM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
The function header is a syntactic construct - the "def" line, any decorators, annotations, etc. But for the late-binding expressions to be useful, they MUST be evaluated in the context of the function body, not its definition. That's the only way that expressions like len(a) can be of value. (Admittedly, this feature would have some value even without that, but it would be extremely surprising and restrictive.)
Teaching moment. Currently, the equivalent second function would be this:

    def f2(l=None):
        if l is None:
            l = []
        l.append(4)
        return l

And the whole "early bind or late bind" question is there just the same; the only difference is that the late binding happens somewhere inside the function body, instead of being visible as part of the function's header. (In this toy example, it's the very next line, which isn't a major problem; but in real-world examples, it's often buried deeper in the function, and it's not obvious that passing None really is the same as passing the array's length, or using a system random number generator, or constructing a new list, or whatever it is.)

This is also why the evaluation has to happen in the function's context: the two forms should be broadly equivalent. You should be able to explain a late-bound function default argument by saying "it's like using =None and then checking for None in the function body, only it doesn't use None like that".

This is, ultimately, the same teaching moment that you can get in classes:

    class X:
        items = []
        def add_item(self, item):
            self.items.append(item)

    class Y:
        def __init__(self):
            self.items = []
        def add_item(self, item):
            self.items.append(item)

Understanding these distinctions is crucial to understanding what your code is doing. There's no getting away from that.

I'm aware that blessing this with nice syntax will likely lead to a lot of people (a) using late-binding everywhere, even if it's unnecessary; or (b) using early-binding, but then treating late-binding as a magic bandaid that fixes problems if you apply it in the right places. Programmers are lazy. We don't always go to the effort of understanding what things truly do. But we can't shackle ourselves just because some people will misuse a feature - we have plenty of footguns in every language, and it's understood that programmers should be allowed to use them if they choose.

ChrisA
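The difference between the two toy functions can be observed directly in today's Python. This is a small sketch using the hand-written `=None` form for `f2`, since the proposed `=:` spelling is not valid syntax; the variable names after the functions are only there for the demonstration.

```python
def f1(l=[]):
    # Early bound: the list is created once, when the def statement runs.
    l.append(4)
    return l


def f2(l=None):
    # Hand-written late binding: a fresh list on every call that omits l.
    if l is None:
        l = []
    l.append(4)
    return l


first, second = f1(), f1()      # both names refer to the one shared list
fresh_a, fresh_b = f2(), f2()   # two independent lists

print(first, second)
print(fresh_a, fresh_b)
```

Calling `f1()` twice keeps appending to the same list, while each `f2()` call starts from an empty one.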
On 2021-10-24 at 06:54:36 +1100, Chris Angelico <rosuav@gmail.com> wrote:
If you mean that def statements and decorators run at compile time, then I agree. If you mean something else, then I don't understand.
I think we're saying the same thing, but drawing different conclusions. I agree with everything in the first paragraph I quoted above, but I can't make the leap to claiming that late binding is part of defining the function's arguments. You say "late binding of function arguments"; I say "the part of the function that translates the arguments into something useful for the algorithm the function encapsulates."
It's only not obvious if the documentation is lacking, or the tools are lacking, or the programmer is lacking. The deeper "it" is in the function, the more you make my point that it's part of the function itself and not part of setting up the arguments.
Understanding the difference between defining a class and instantiating that class is crucial, as is noticing the very different source code contexts in which X.items and self.items are created. I agree.

Stuff in class definitions (X.items, X.add_item, Y.__init__, Y.add_item) happens when X is created, arguably at compile time. The code inside the function suites (looking up and otherwise manipulating self.items) happens later, arguably at run-time.

In f1, everything in the "def" statement happens when f1 is defined. In f2, part of the "def" statement (i.e., defining f2) happens when f2 is defined (at compile-time), but the other part (the logic surrounding l and its default value) happens when f2 is called (at run-time).
I won't disagree. Maybe it's just that I am the opposite of sympathetic to the itches (and those itches' underlying causes) that this particular potential footgun scratches.

Curiously, for many of the same reasons, I think I'm with you that:

    def get_expensive(self):
        if not self.expensive:
            self.expensive = expensive()
        return self.expensive

is better (or at least not worse) than:

    def get_expensive(self):
        return self.expensive or (self.expensive := expensive())
On Sun, Oct 24, 2021 at 8:56 AM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
"Function header" is not about time, it's about place. def spam(x: int, y: int) -> Thingy: ... The annotations might be evaluated at function definition time (although some proposals are looking at changing that), but they may also be evaluated long before that, during a static analysis phase. We don't need a separate place in the code for "stuff that runs during static analysis", because logically, it's all about those same function parameters. Late-bound argument defaults are still argument defaults. When you're thinking about how you call the function, what matters is "this argument is optional, and if you don't specify it, this is what happens". Sometimes the definition of 'this' is a specific value (calculated by evaluating an expression at definition time). Sometimes, it's some other behaviour, defined by the function itself. All this proposal does is make the most common of those into a new option: defining it as an expression.
It currently is, due to a technical limitation. There's no particular reason that it HAS to be. For instance, consider these two:

    def popitem(items, which=-1): ...
    def popitem(items, which=len(items) - 1): ...

Both of them allow you to omit the argument and get the last one. The first one defines it with a simple value and relies on the fact that you can subscript lists with -1 to get the last element; the second doesn't currently work. Is there a fundamental difference between them, or only a technical one?
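To make "the second doesn't currently work" concrete, here is a hedged sketch that runs the second definition through `exec()` in an empty namespace, so the result does not depend on whatever names happen to exist in the surrounding module. The function bodies are assumptions added for illustration; the message only gives the headers.

```python
source = """
def popitem(items, which=len(items) - 1):
    return items.pop(which)
"""

namespace = {}
try:
    # Running the def statement evaluates the default expression immediately,
    # before any argument called items exists.
    exec(source, namespace)
    outcome = "defined"
except NameError as exc:
    outcome = f"NameError: {exc}"


def popitem(items, which=-1):
    # The form that works today: a constant default.
    return items.pop(which)


print(outcome)
print(popitem([1, 2, 3]))
```

The failure happens at definition time, not at call time, which is why the function never comes into existence.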
It's mainly about what [] means and when it's evaluated. Either way, self.items is a list. The only difference is whether instantiating the class creates a new list, or you keep referring to the same one every time.
Yes. It is the exact same distinction as early-bound or late-bound arguments.
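The class version of the same distinction can be checked the same way. This sketch reuses the X and Y classes from earlier in the thread; the instance names are only for the demonstration.

```python
class X:
    items = []            # one list, created once when the class body runs

    def add_item(self, item):
        self.items.append(item)


class Y:
    def __init__(self):
        self.items = []   # a new list for every instance

    def add_item(self, item):
        self.items.append(item)


x1, x2 = X(), X()
x1.add_item("a")          # visible through x2 as well

y1, y2 = Y(), Y()
y1.add_item("a")          # y2 is unaffected

print(x2.items, y2.items)
```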
I don't have any good examples where this happens, so it's hard to argue, but I definitely don't see any advantage in the second one. It's almost identically as repetitive, and offers very little advantage. If expensive() is a method call, I'd just tack an lru_cache onto it and be done with it. Having a wrapper like this just looks like a toy, and one that's hard to argue on the basis of. Would need a real example. ChrisA
On Sat, Oct 23, 2021 at 02:54:54PM -0700, 2QdxY4RzWzUUiLuE@potatochowder.com wrote: [...]
Pedantic note: def statements and decorators run at runtime, the same as all other statements. (Including class statements.)

Broadly speaking (I may have some of the fine details wrong) there are three stages in executing a function:

- compile-time, when the interpreter statically analyses the function source code and compiles the body of the function, plus a set of byte-codes (or equivalent) to assemble the function object;

- runtime, when the def *statement* is executed by running the second lot of byte-code, and the function object is assembled -- we often call that "function definition time"; and

- runtime, when the function object is actually called, and the first set of byte-code, the one that represents the body of the function, gets executed.

There is no Python code executed at compile-time, and the def statement is not executed until runtime.

[...]
Consider what happens when you call a function now:

    bisect(alist, obj)

At runtime, the interpreter has to bind arguments to parameters (and handle any errors, such as too few or too many arguments). The bisect function takes five parameters, but only two arguments are given. So the interpreter currently has to look up default values for the missing three parameters (which it gets from the function object, or possibly the code object, I forget which).

Those values are static references to objects which were evaluated at function definition time, so that the process of fetching the default is nothing more than grabbing the object from the function object.

That process of the interpreter matching up arguments to parameters, filling in missing arguments with defaults, and checking for error conditions, is not normally considered to be part of the function execution itself. It is part of the interpreter, not part of the function.

Now consider what would happen if we used late binding. Everything would be the same, *except* that instead of fetching a static reference to a pre-existing object, the interpreter would have to fetch a reference to some code, evaluate the code, and use *that* object as the default.

There is no reason to consider that part of the function body, it is still performed by the interpreter.

It is only habit from 30 years of working around the lack of late-binding defaults by putting the code inside the function body that leads us to think that late-binding is necessarily part of the body.

In Python today, of course late-binding is part of the body, because that's the only way we have to delay the evaluation of an expression. But consider languages which have late-binding, I think Smalltalk and Lisp are the two major examples. I'm not an expert on either, but I can read StackOverflow and extrapolate a meaning to code :-)

    (defun test1 (&optional (x 0))
      (+ x x))

is a function that takes one argument with a default value of 0. Lisp uses late-binding: the default value is an expression (an S-expression?) that is evaluated when the code is called, not at compile time, but it is not part of the body of the function (the `(+ x x)` expression).

-- Steve
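The claim that early-bound defaults are evaluated once, when the def statement runs, and then simply fetched from the function object can be checked in current CPython. In this sketch `make_default` and `collect` are made-up names for illustration; `__defaults__` is the real attribute where positional defaults are stored.

```python
calls = []


def make_default():
    # Records every time the default expression is evaluated.
    calls.append("evaluated")
    return []


def collect(item, bucket=make_default()):   # evaluated here, once, at definition time
    bucket.append(item)
    return bucket


evaluations_after_def = len(calls)   # already 1, before any call to collect()

collect(1)
collect(2)
evaluations_after_calls = len(calls)  # still 1: calls only fetch the stored object

stored_default = collect.__defaults__[0]
print(evaluations_after_def, evaluations_after_calls, stored_default)
```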
On 2021-10-24 at 13:23:51 +1100, Steven D'Aprano <steve@pearwood.info> wrote:
Yep. My mistake.
[good explanation of what happens with default parameters snipped]
Aha. I see it now (and it's not just those 30 years of Python, it's the previous decade of Basic, FORTRAN, Pascal, C, etc.). My first impulse remains that all those things "the interpreter" does with default values are still part of the function, and that the shorthand declarative syntax is still just sugar for the explicit logic.
Yep. I've written some non-toy Lisp code, and perhaps because of the language in Lisp documentation ("initial forms" rather than "default values"), I definitely see all of that binding as part of the function, whether I write it in the body or in the lambda list. (FWIW, there's also a 3-tuple form of optional parameter, where the third element is a predicate that is bound to a boolean value that indicates whether the value of the argument came from the caller or the default value (thus eliminating the need for unique sentinels). If you're so inclined, find "supplied-p-parameter" on <http://www.lispworks.com/documentation/HyperSpec/Body/03_da.htm>.) As I said before, pushing this sort of logic into "late binding" scratches an itch I don't have. I gladly write two (or more) [public] functions that call a common [possibly private] function rather than one [public] function with optional arguments and default values. Easy to write; easy to read; and easy to test, maintain, and extend. Yes, theoretically, the number of functions grows exponentially, but rarely do I need more than a few such glue functions to cover most (if not all) of the real use cases.
On Sun, Oct 24, 2021 at 06:54:36AM +1100, Chris Angelico wrote: [...]
I challenge that assertion. I've never knowingly seen a function where the late binding is "buried deeper in the function", certainly not deep enough that it is not obvious. It is a very strong convention that such late binding operations occur early in the function body. You know, before you use the parameter, not afterwards *wink*

But then I mostly look at well-written functions that are usually less than two, maybe three, dozen lines long, with a manageable number of parameters. If you are regularly reading badly-written functions that are four pages long, with fifty parameters, your experience may differ :-)

The bisect function you gave earlier is a real-world example of a non-toy function. You will notice that the body of bisect_right:

- does the late binding early in the body, immediately after checking for an error condition;

- and is a manageable size (19 LOC).

https://github.com/python/cpython/blob/3.10/Lib/bisect.py

The bisect module is also good evidence that this proposal may not be as useful as we hope. We have:

    def insort_right(a, x, lo=0, hi=None, *, key=None):

which just passes the None on to bisect_right. So if we introduced optional late-binding, the bisect module has two choices:

- keep the status quo (don't use the new functionality);

- or violate DRY (Don't Repeat Yourself) by having both functions duplicate the same late-binding.

It's only a minor DRY violation, but still, if the bisect module was mine, I wouldn't use the new late-binding proposal. So I think that your case is undermined a little by your own example.

-- Steve
On Sun, Oct 24, 2021 at 1:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
What I'm more often seeing is cases that are less obviously a late-binding, but where the sentinel is replaced with the "real" value at the point where it's used, rather than up the top of the function.
The truth is, though, that the default for hi is not None - it's really "length of the given list". Python allows us to have real defaults for parameters, rather than simply leaving trailing parameters undefined as JavaScript does; this means that you can read off the function header and see what the meaning of parameter omission is. Late-binding semantics allow this to apply even if the default isn't a constant.

If this proposal is accepted, I would adopt the DRY violation, since it would literally look like this:

    def insort_right(a, x, lo=0, hi=>len(a), *, key=None):
    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
    def insort_left(a, x, lo=0, hi=>len(a), *, key=None):
    def bisect_left(a, x, lo=0, hi=>len(a), *, key=None):

That's not really a lot of repetition, and now everyone can see that the default lo is 0 and the default hi is the length of a. Four "hi=>len(a)" isn't really different from four "hi=None" when it comes down to it.

ChrisA
On Sun, Oct 24, 2021 at 01:16:02PM +1100, Chris Angelico wrote:
Got any examples you can share?

And is it really a problem if we delay the late-binding to the point where the value is actually needed? Here's a toy example:

    # Using only early binding.
    def function(spam, eggs=None, cheese=None):
        if eggs is None:
            eggs = cheap_default()
        # do stuff using eggs
        ...
        if condition:
            return result
        if cheese is None:
            cheese = expensive_default()
        # do stuff using cheese
        ...
        return result

The cheese parameter only gets used if the processing of spam with eggs fails to give a result. But if cheese is used, the default is expensive. Is it really a problem if we delay evaluating that default to the point where it is needed?

So this would be another example where automatic late-binding wouldn't be used. If the default is very expensive, I would stick to manual late-binding using None, and only evaluate it as needed.

Maybe this is an argument for some sort of thunk, as in Algol, which is only evaluated at need. Then we could just write:

    # Assume late-binding with thunks.
    def function(spam, eggs=cheap_default(), cheese=expensive_default()):
        # do stuff using eggs
        ...
        if condition:
            return result
        # do stuff using cheese
        ...
        return result

and the thunks `cheap_default()` and `expensive_default()` will only be evaluated *if they are actually needed*, rather than automatically when the function is called.

To be clear about the semantics, let me illustrate. I am deliberately not using any extra syntax for late-binding.

    # early binding (the status quo)
    def func(arg=expression): ...

The expression is evaluated when the def statement is run and the func object is created.

    # late binding (minus any extra syntax)
    def func(arg=expression): ...

The expression is evaluated eagerly when the function is called, if and only if the parameter arg has not been given a value by the caller.

    # late binding with thunk
    def func(arg=expression): ...

The expression is evaluated only if and when the body of the function attempts to use the value of arg, if the caller has not provided a value. So if the function looks like this:

    # late binding with a thunk that delays execution until needed
    def func(flag, arg=1/0):
        if flag:
            print("Boom!")
            return arg
        return None

then func(True) will print Boom! and then raise ZeroDivisionError, and func(False) will happily return None.

I have no idea whether thunk-like functionality is workable in Python's execution model without slowing down every object reference, but if it is possible, there could be other really nice use-cases beyond just function defaults.

-- Steve
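The difference between eager late binding and evaluation at need can be emulated in today's Python with the None idiom. This is only a sketch: the arithmetic in the function bodies, the `evaluated` log, and the names `eager` and `at_need` are assumptions invented so the example runs; neither function uses real thunks.

```python
evaluated = []   # log of which default expressions actually ran


def cheap_default():
    evaluated.append("cheap")
    return 1


def expensive_default():
    evaluated.append("expensive")
    return 100


def eager(spam, eggs=None, cheese=None):
    # Emulates eager late binding: every omitted default is computed up front.
    if eggs is None:
        eggs = cheap_default()
    if cheese is None:
        cheese = expensive_default()
    if spam + eggs > 0:
        return spam + eggs
    return spam + cheese


def at_need(spam, eggs=None, cheese=None):
    # Manual late binding, delayed until the value is actually needed.
    if eggs is None:
        eggs = cheap_default()
    if spam + eggs > 0:
        return spam + eggs
    if cheese is None:
        cheese = expensive_default()
    return spam + cheese


eager_result = eager(1)
eager_log = list(evaluated)

evaluated.clear()
lazy_result = at_need(1)
lazy_log = list(evaluated)

print(eager_log, lazy_log)
```

Both calls return the same value, but only the eager version pays for `expensive_default()` when the early return is taken.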
On Sat, Oct 23, 2021 at 7:55 PM Steven D'Aprano <steve@pearwood.info> wrote:
And is it really a problem if we delay the late-binding to the point where the value is actually needed? ...
<snip> [in that case] I would stick to manual late-
I have no idea whether thunk-like functionality is workable in Python's
Your message here and my message on this passed in the mail. Yes, this is a really good point and would apply to the cases I've seen where the evaluation was in the middle. Thanks for raising it. I also don't know if it's workable but it should be considered. --- Bruce
On Sun, Oct 24, 2021 at 1:53 PM Steven D'Aprano <steve@pearwood.info> wrote:
Not from the standard library, but try something like this:

    def generate_code(secret, timestamp=None):
        ...
        ...
        ...
        ts = (timestamp or now()).to_bytes(8, "big")
        ...

The variable "timestamp" isn't ever actually set to its true default value, but logically, omitting that parameter means "use the current time", so this would be better written as:

    def generate_code(secret, timestamp=>now()):

That's what I mean by "at the point where it's used" - it's embedded into the expression that uses it. But the body of a function is its own business; what matters is the header, which is stating that the default is None, where the default is really now().
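As a side note on the `timestamp or now()` spelling, here is a small sketch comparing it with an explicit None check. The `now()` helper and both function names are hypothetical, and the timestamp is assumed to be an integer number of seconds, as the `.to_bytes(8, "big")` call suggests. The two hand-written forms are not quite equivalent: `or` treats every falsy value, including 0, as "omitted".

```python
import time


def now():
    # Hypothetical helper: the current time as whole seconds since the epoch.
    return int(time.time())


def code_with_or(timestamp=None):
    # Default applied at the point of use; any falsy timestamp triggers it.
    return (timestamp or now()).to_bytes(8, "big")


def code_with_check(timestamp=None):
    # Default applied only when the argument was really left as None.
    if timestamp is None:
        timestamp = now()
    return timestamp.to_bytes(8, "big")


epoch_or = code_with_or(0)        # silently replaced by the current time
epoch_check = code_with_check(0)  # eight zero bytes, as requested

print(epoch_or, epoch_check)
```

Neither header tells the reader that the effective default is the current time, which is the point being made in the message.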
Delaying evaluation isn't a problem, though it also isn't usually an advantage. (If the default is truly expensive, then you wouldn't want to use this, but now we're looking at optimizations, where the decision is made to favour performance over clarity.)
The biggest problem with thunks is knowing when to trigger evaluation. We already have functions if you want to be explicit about that:

    def func(flag, arg=lambda:1/0):
        ...
        return arg()

so any thunk feature would need some way to trigger its full evaluation. Should that happen when it gets touched in any way? Or leave it until some attribute is looked up? What are the consequences of leaving it unevaluated for too long?

Thunks would be another great feature, but I think they're orthogonal to this.

ChrisA
On Sun, Oct 24, 2021 at 02:09:59PM +1100, Chris Angelico wrote:
The biggest problem with thunks is knowing when to trigger evaluation.
I think Algol solved that problem by having thunks a purely internal mechanism, not a first-class value that users could store or pass around.
We already have functions if you want to be explicit about that:
Yes, if we are satisfied with purely manually evaluating thunks. The point of a thunk though is that the interpreter knows when to evaluate it, you don't have to think about it.
Thunks would be another great feature, but I think they're orthogonal to this.
If we had thunks, that would give us late binding for free:

    def bisect(a, x, lo=0, hi=thunk len(a), *, key=None)

Aaaand we're done. So thunks would make this PEP obsolete. But if thunks are implausible, too hard, or would have too high a cost, then this PEP remains less ambitious and therefore easier.

-- Steve
On Sun, Oct 24, 2021 at 5:53 PM Steven D'Aprano <steve@pearwood.info> wrote:
If you can't pass them around, then how are they different from what's proposed here? They are simply expressions that get evaluated a bit later. You potentially win on performance if it's expensive, but you lose on debuggability when errors happen further down and less consistently, and otherwise, it's exactly the same thing.
Where else would you use thunks? I think it's exactly as ambitious, if indeed they can't be stored or passed around. ChrisA
On Sat, Oct 23, 2021 at 11:53 PM Steven D'Aprano <steve@pearwood.info> wrote:
If we had thunks, that would give us late binding for free:
def bisect(a, x, lo=0, hi=thunk len(a), *, key=None)
I'm unclear on exactly what the semantics of a thunk would be, but I don't see how it could do what you want here. In an ordinary default value, the "a" in "len(a)" refers to a variable of that name in the enclosing scope, not the argument of bisect. A generic delayed-evaluation mechanism wouldn't (shouldn't) change that.
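The scoping point can be seen with an ordinary default today. In this sketch the module-level list and the `upper_bound` function are made up for illustration; the default expression `len(a)` is resolved against the enclosing scope when the def statement runs, and has nothing to do with the parameter of the same name.

```python
a = [10, 20, 30, 40, 50]          # a module-level name that happens to be called a


def upper_bound(a, hi=len(a)):    # len(a) sees the module-level a, evaluated once
    return hi


result = upper_bound([1, 2])      # the argument has 2 items, but hi is still 5
print(result, upper_bound.__defaults__)
```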
I like that you're trying to fix this wart! I think that using a different syntax may be the only way out. My own bikeshed color to try would be `=>`, assuming we'll introduce `(x) => x+1` as the new lambda syntax, but I can see problems with both as well :-). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Sun, Oct 24, 2021 at 6:55 AM Guido van Rossum <guido@python.org> wrote:
I like that you're trying to fix this wart! I think that using a different syntax may be the only way out. My own bikeshed color to try would be `=>`, assuming we'll introduce `(x) => x+1` as the new lambda syntax, but I can see problems with both as well :-).
Sounds good. I can definitely get behind this as the preferred syntax, until such time as we find a serious problem. ChrisA
El sáb, 23 oct 2021 a las 12:57, Guido van Rossum (<guido@python.org>) escribió:
    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):

This reads to me like we're putting "hi" into "len(a)", when it's in fact the reverse. What about:

    def bisect_right(a, x, lo=0, hi<=len(a), *, key=None):

Another option (going back to Chris's original suggestion) could be:

    def bisect_right(a, x, lo=0, hi:=len(a), *, key=None):

Which is the same as the walrus operator, leaning on the idea that this is kind of like the walrus: a name gets assigned based on something evaluated right here.

Bikeshedding aside, thanks Chris for the initiative here! This is a tricky corner of the language and a promising improvement.
On Sat, Oct 23, 2021 at 6:23 PM Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote:
I think in most cases what's on the right side will be something that's not assignable. Likewise with the proposal to use => for lambda, someone could read (a => a + 1) as putting a into a + 1. I think they're going to get over that.

Every language I am aware of that has adopted a short hand lambda notation (without a keyword) has used => or -> except APL, Ruby, SmallTalk. See https://en.wikipedia.org/wiki/Anonymous_function

APL uses a tacit syntax while Ruby and SmallTalk use explicit syntaxes. The equivalent of x => x + 1 in each of these is

    APL        ⍺+1 (I think)
    Ruby       |x| x + 1
    SmallTalk  [ :x | x + 1 ]

--- Bruce
On Sun, Oct 24, 2021 at 1:48 PM Bruce Leban <bruce@leban.us> wrote:
Anonymous functions are an awkward parallel here. The notation you're describing will create a function which accepts one argument, and then returns a value calculated from that argument. We're actually doing the opposite: hi is being set to len(a), it's not that len(a) is being calculated from hi. That said, though, I still count "=>" among my top three preferences (along with "=:" and "?="), and flipping the arrow to "<=" is too confusable with the less-eq operator. ChrisA
On Sat, Oct 23, 2021 at 7:55 PM Chris Angelico <rosuav@gmail.com> wrote:
Sorry I was less than clear. The syllogism here is

(1) late-evaluated argument default should use => because that's the proposal for shorthand lambda

(2) shorthand lambda should use => because that's what other languages use.

I was talking about (2) but I should have been explicit. And yes, you highlight a potential source of confusion.

    def f(x=>x + 1): ...

means that x is 1 more than the value of x from the enclosing global scope (at function call time) while

    g = x => x + 1

sets g to a single-argument function that adds 1 to its argument value.

--- Bruce
On Sat, 23 Oct 2021 at 17:09, Chris Angelico <rosuav@gmail.com> wrote:
+1 from me. I agree that getting a good syntax will be tricky, but I like the functionality. I do quite like Guido's "hi=>len(a)" syntax, but I admit I'm not seeing the potential issues he alludes to, so maybe I'm missing something :-) Paul
On Sat, Oct 23, 2021 at 12:56 PM Guido van Rossum <guido@python.org> wrote:
+1 to this spelling. I started writing a message arguing that this should be spelled with lambda because the fact that you're (effectively) writing a function should be explicit (see below). But the syntax is ugly needing both a lambda and a variant = operator. This solves that elegantly. On Sat, Oct 23, 2021 at 9:10 AM Chris Angelico <rosuav@gmail.com> wrote:
The syntax I've chosen is deliberately subtle, since - in many many
cases - it won't make any difference whether the argument is early or late bound, so they should look similar.
I think a subtle difference in syntax is a bad idea since this is not a subtle difference in behavior. If it makes no difference whether the argument is early or late bound then you wouldn't be using it.

Here's one way you could imagine writing this today:

    def bisect(a, x, lo=0, hi=lambda: len(a)):
        hi = hi() if callable(hi) else hi
        ...

which is clumsy and more importantly doesn't work because the binding of the lambda occurs in the function definition context which doesn't have access to the other parameters.

<deleted some discussion about alternative syntaxes because Guido's suggestion solves it elegantly>

--- Bruce
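Why the lambda workaround fails can be shown with a short sketch. The module-level `a` below is deliberately added so the wrong result is visible rather than an exception; without any such name in the enclosing scope, the call would instead raise NameError when the lambda runs. The function name `bisect_hi` and its return value are assumptions for illustration.

```python
a = [0] * 10   # deliberately unrelated module-level name, to expose the lookup


def bisect_hi(a, x, lo=0, hi=lambda: len(a)):
    # The lambda was created in the scope enclosing the def, so its a is the
    # module-level one, not this function's parameter.
    hi = hi() if callable(hi) else hi
    return hi


wrong = bisect_hi([1, 2, 3], 2)         # 10, the length of the module-level list
explicit = bisect_hi([1, 2, 3], 2, hi=3)

print(wrong, explicit)
```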
On Sun, Oct 24, 2021 at 7:58 AM Bruce Leban <bruce@leban.us> wrote:
Agreed; if it needs to remain self-contained, then yes, it would have to be a function, and there'd be good reason for making it look like one. As it is, it's more just "code that happens at the start of the function", but I think there's still value in using a syntax that people understand as a late binding system (as lambda functions will be).
There IS a difference, but the difference should be subtle. Consider:

    x = 5
    x = 5.

These are deliberately similar, but they are quite definitely different, and they behave differently. (You can't use the second one to index a list, for instance.) Does the difference matter? Absolutely. Does the similarity matter? Yep. Similar things should look similar. The current front-runner syntax is:

    def bisect(a, x, lo=0, hi=>len(a)):

This is only slightly less subtle. It's still a one-character difference which means that instead of being evaluated at definition time, it's evaluated at call time. This is deliberate; it should still look like (a) a parameter named "hi", (b) which is optional, and (c) which will default to the result of evaluating "len(a)". That's a good thing.
Right. It also doesn't solve the help() problem, since it's just going to show the (not-very-helpful) repr of a lambda function. It's not really much better than using object() as a sentinel, although it does at least avoid the global-pollution problem. It seems like there's broad interest in this, but a lot of details to nut out. I think it may be time for me to write up a full PEP. Guido, if I'm understanding recent SC decisions correctly, a PEP editor can self-sponsor, correct? ChrisA
+1 on the idea. Sometimes early binding is needed, sometimes late binding is needed. So Python should provide both. QED 😁

I'm not keen on the var = > expr syntax. IMO the arrow is pointing the wrong way. expr is assigned to var. Some possible alternatives, if there is no technical reason they wouldn't work (as far as I know they are currently a syntax error, and thus unambiguous, inside a function header, unless annotations change things. I know nothing about annotations):

    var <= exp      # Might be confused with less-than-or-equals
    var := expr     # Reminiscent of the walrus operator in other contexts.
                    # This might be considered a good thing or a bad thing.
                    # Possibly too similar to `var = expr'
    var : expr      # Less evocative. Looks like part of a dict display.
    var <- expr     # Trailing `-` is confusing: is it part of (-expr) ?
    var = <expr>    # Hm. At least it's clearly distinct from `var=expr`.
    var << expr     # Ditto.
    (var = expr)
    [var = expr]    # Ditto

And as I'm criticising my own suggestions, I'll do the same for the ones in the PEP:

    def bisect(a, hi=:len(a)):    # Just looks weird, doesn't suggest anything to me
    def bisect(a, hi?=len(a)):    # Hate question marks (turning Python into Perl?)
                                  # Ditto for using $ or % or ^ or &
    def bisect(a, hi!=len(a)):    # Might be confused with not-equals
    def bisect(a, hi=\len(a)):    # `\` looks like escape sequence or end of line
                                  # What if you wanted to break a long line at that point?
    def bisect(a, hi=`len(a)`):   # Backquotes are fiddly to type. Is ` len(a) ` allowed?
    def bisect(a, hi=@len(a)):    # Just looks weird, doesn't suggest anything to me

My personal current (subject to change) preference would be `var := expr`. All this of course is my own off-the-cuff subjective reaction. Y M M (probably will) V.

Best wishes
Rob Cliffe

On 23/10/2021 23:08, Chris Angelico wrote:
On Sun, Oct 24, 2021 at 12:44 PM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
That's definitely open to discussion.
Annotations come before the default, so the question is mainly whether it's possible for an annotation to end with the symbol in question.
var <= exp # Might be confused with less-than-or-equals
Not a fan, for that exact reason.
This is less concerning than <=, since... ahem. I spy a missed opportunity here. Must correct. This is less confusing or equally confusing as <=, since ":=" is a form of name binding, which function parameters are doing. Not a huge fan but I'd be more open to this than the leftward-arrow.
var : expr # Less evocative. Looks like part of a dict display.
(That is precisely the syntax for annotations, so that one's out)
var <- expr # Trailing `-` is confusing: is it part of (-expr) ?
Not a fan. I don't like syntaxes where a space makes a distinct difference to the meaning.
var = <expr> # Hm. At least it's clearly distinct from `var=expr`.
It took me a couple of readings to understand this, since <expr> is a common way of indicating a placeholder. This might be a possibility, but it would be very difficult to discuss it.
var << expr # Ditto.
Like <=, this is drawing an analogy with an operator that has nothing to do with name binding, so I think it'll just be confusing.
(var = expr) [var = expr] # Ditto
Bracketing the argument seems like a weird way to indicate late-binding, but if anything, I'd go with the round ones.
Yeah, I don't like != for the same reason that I don't like <= or <<.
def bisect(a, hi=\len(a)): # `\` looks like escape sequence or end of line # What if you wanted to break a long line at that point?
No worse than the reuse of backslashes in regexes and string literals.
def bisect(a, hi=`len(a)`): # Backquotes are fiddly to type. Is ` len(a) ` allowed?
Not sure why it wouldn't - the spaces don't change the expression. Bear in mind that the raw source code of the expression would be saved for inspection, so there'd be a difference with help(), but other than that, it would have the same effect.
def bisect(a, hi=@len(a)): # Just looks weird, doesn't suggest anything to me
Agreed, don't like that one.
Subjective reactions are extremely important. If the syntax is ugly, the proposal is weak. But I can add := to the list of alternates, for the sake of having it listed. ChrisA
+1 to this idea -- thanks Chris A! On Sat, Oct 23, 2021 at 2:01 PM Bruce Leban <bruce@leban.us> wrote:
However, it will work if there happens to be an 'a' defined in the scope where the function is created. So that means that it could be VERY confusing if the syntax for an anonymous function is the same as (or very similar to) the syntax for delayed evaluation of parameter defaults. I think Steven may have posted an example of what it would look like. I'll also add that looking again, an "arrow"-like symbol really is very odd in this context; as others have pointed out, it's pointing in the wrong direction. Not that I have a better idea. -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
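The confusing case mentioned above can be reproduced in today's Python. This is only an illustrative sketch: bisect_sketch and the module-level a are made-up names, chosen to show the lambda default silently picking up an unrelated global.

```python
# An unrelated module-level name that happens to be called `a`.
a = "global!"


def bisect_sketch(a, x, lo=0, hi=lambda: len(a)):
    # The lambda closes over the *defining* scope, so it sees the global
    # string `a`, not the list passed in as the parameter `a`.
    hi = hi() if callable(hi) else hi
    return hi


surprising = bisect_sketch([1, 2, 3], 2)  # uses len("global!"), not len([1, 2, 3])
```

No exception is raised here; the default is simply computed from the wrong object, which is arguably worse than failing outright.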
On Sun, Oct 24, 2021 at 1:12 PM Ricky Teachey <ricky@teachey.org> wrote:
That's something where we may need to get a reference implementation before deciding, but I am open to either of two possibilities: 1) Keyword args are resolved before late-bound defaults, so your example would technically work, despite being confusing 2) Late-bound defaults explicitly reject (with SyntaxError) any references to arguments to their right. Python already enforces this kind of check:
Even if it IS legal, I would say that this would be something to avoid. Same with using assignment expressions to mutate other arguments - I suspect that this won't cause technical problems, but it would certainly hurt the brains of people who read it:

    def f(a, b=>c:=a, c=>b:=len(a)):
        print("wat")

Yeah, just don't :)

Not a dumb question. It's a little underspecified in the PEP at the moment (I only mention that they may refer to previous values, but not whether subsequently-named arguments are fair game), and am open to discussion about whether this should be locked down. Currently, I'm inclined to be permissive, and let people put anything they like there, just like you can put crazy expressions into other places where they probably wouldn't improve your code :)

ChrisA
On 2021-10-23 09:07, Chris Angelico wrote:
I'm -1 on it. For me the biggest problem with this idea is that it only handles a subset of cases, namely those that can be expressed as an expression inlined into the function definition. This subset is too small, because we'll still have to write code in the function body for cases where the default depends on more complex logic. But it is also too large, because it will encourage people to cram complex expressions into the function definition. To me, this is definitely not worth adding special syntax for. I seem to be the only person around here who detests "ASCII art" "arrow" operators but, well, I do, and I'd hate to see them used for this. The colon or alternatives like ? or @ are less offensive but still too inscrutable to be used for something that can already be handled in a more explicit way. I do have one other specific objection to the rationale:
Not really. help() shows the *documentation* of the function. A person calling it should *read the documentation*, not just glance at the function signature. I don't see any compelling benefit to having a mini-lambda be retrievable via introspection tools. There is simply no substitute for actually reading (and writing) the documentation. Also, insofar as glancing at the function signature is useful, I suspect that putting this change in will *also* lead to help() being unhelpful, because, as I mentioned above, if the default uses anything but the most trivial logic, the signature will become cluttered with stuff that ought to be separated out as actual logic. I would prefer to see this situation handled as part of a larger-scale change of adding some kind of "inline lambda" which executes directly in the calling scope. (I think this is similar to the "deferred computation" idea mentioned by David Mertz elsewhere in the thread.) This would also allow extracting the logic out of the function definition into a separate variable (holding the "inline lambda"), which could help with cases similar to the bisect examples discussed elsewhere in the thread, where multiple functions share late-binding logic. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 24, 2021 at 3:51 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
These two considerations, together, are the exact push that programmers need: keep the expression short, don't cram everything into the function definition. It's like writing a list comprehension; technically you can put any expression into the body of it, but it's normally going to be short enough to not get unwieldy. ChrisA
On Sat, Oct 23, 2021 at 08:29:54PM -0700, Brendan Barnwell wrote:
True. But that's equally true for default values under early binding. Would you say that existing syntax for default values is problematic because it only covers the 90% of cases where the default value can be computed from an expression? If push comes to shove, we can always write a helper function.
Just like they do now? I think people can write bad code regardless of the features available, but I don't think that adding late binding will particularly make that tendency worse. Most uses of late binding will be pretty simple:

- mutable literals like [] and {};
- length of another argument (like the bisect example);
- references to an attribute of self;
- call out to another function.

I don't think that there is good reason to believe that adding late binding will cause a large increase in the amount of overly complex default values. As you say, they are limited to a single expression, so the really complex blocks will have to stay inside the body of the function where they belong. And we have thirty years of default values, and no sign that people abuse them by writing lots of horrific defaults:

    def func(arg=sum(min(seq) for seq in [[LOOKUP[key]]+list(elements)
             for key, elements in zip(map(Mapper(spam, eggs), keys),
             iterable_of_sets) if condition(key)] if len(seq) > 5)):
        ...

People just don't do that sort of thing in anywhere near enough numbers to worry about them doing it just because we have late binding. And if they do? "People will write crappy code" is a social problem, not a technology problem, which is best solved socially, using code reviews, linters, a quick application of the Clue Bat to the offending coder's head, etc.
That would be what I called a "thunk" in two posts now, stealing the term from Algol. It would be nice if one of the core devs who understand the deep internals of the interpreter could comment on whether that sort of delayed evaluation of an expression is even plausible for Python. If it is, then I agree: we should focus on a general thunk mechanism, which would then give us late binding defaults for free, plus many more interesting use-cases. (Somewhere in my hard drive I have a draft proto-PEP regarding this.) But if it is not plausible, then a more limited mechanism for late bound defaults will, I think, be a useful feature that improves the experience of writing functions. We already document functions like this: def bisect(a, x, lo=0, hi=len(a)) It would be an improvement if we could write them like that too. -- Steve
On 2021-10-23 23:33, Steven D'Aprano wrote:
I understand your point, but there is an important difference between the current situation and the proposal. Right now, the function definition executes in the enclosing scope. That means that there is nothing you can do in a default-argument expression that you can't do by assigning to a variable in a separate line (or lines) before defining the function. But with this proposal, the function definition will gain the new ability to express logic that executes in the function body environment, with the restriction that it be only a single expression. It's true that any feature can be abused, but I think this temptation to shoehorn function logic into the argument list will be more likely to result in unwieldy signatures than the current situation. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 24, 2021 at 10:18:52PM +1100, Steven D'Aprano wrote:
Find some languages (Lisp, Smalltalk, any others?) and show that they are abused by people in that community.
Oops, sorry I left out a clause. Find some language **with late-binding defaults** and show that they are abused by people in that community. -- Steve
More about thunks. I've resisted the temptation to look at the Wikipedia page -- this is from memory. On Sat, Oct 23, 2021 at 11:39 PM Steven D'Aprano <steve@pearwood.info> wrote:
IIRC, a thunk in Algol was more or less defined as "substitute the argument in each use". There was no expression in the function definition, just an argument. Translating to Python, we'd have something like this (all arguments were thunks by default):

    def foo(arg):
        print(arg)
        arg = arg + 1

    x = 42
    foo(x)
    print(x)

This would print 42 and 43. Writing foo(42) would produce an error. But if you had another function

    def bar(arg):
        print(arg)

it would be okay to write bar(42).

A key property (IIRC) was that the thunk would be evaluated each time it was used. This led to "Jensen's device" where you could write

    def foo(a, b, n):
        total = 0
        for a in range(1, n+1):  # IIRC Algol spelled this 'for a := 1 to n'
            total += b
        return total

and you'd call it like this:

    foo(x, x**2, 10)

which would compute the sum of the squares of i for i in range(1, 11). It went out of style because it was complex to implement and expensive to execute -- for example, the 'n' argument would be a thunk too.

You can't easily introduce this in Python because Python is built on not knowing the signature of foo when the code for the call foo(x, x**2, 10) is compiled. So it wouldn't be sufficient to mark 'a' and 'b' as thunks in the 'def foo(a, b, n)' line. You'd also have to mark the call site. Suppose we introduce a \ to indicate thunks in both places. (This is the worst character but enough to explain the mechanism.) You'd write

    def foo(\a, \b, n):
        total = 0
        for a in range(1, n+1):
            total += b
        return total

and you'd call it like

    x = None  # dummy
    foo(\x, \x**2, 10)

Now the compiler has enough information to compile code for the thunks. Since thunks can be used as l-values as well as r-values, there would be two hidden functions, one to get the value, another to set it. The call would pass an object containing those two functions and the code in the callee would translate each use of a thunk argument into a call to either the getter or the setter.
Above, the getter functions for x and x**2 are simple:

    get_a = lambda: x
    get_b = lambda: x**2

(Defined in the caller's scope.) The setter function for the first argument would be a bit more complex:

    def set_a(val):
        nonlocal x
        x = val

The second argument's setter function is missing, since 'x**2' is not an l-value. (We'd get a runtime error if foo() assigned to b.)

If we wanted thunks without assignment and without Jensen's device, we would only need a getter function. Then \x in the caller would just pass lambda: x, and an argument marked with \a in a function definition would cause code to be generated that calls it each time it is used.

Getting rid of the need to mark all thunk arguments in the caller would require the compiler to have knowledge of which function is being called. That would require an amount of static analysis beyond what even mypy and friends can do, so I don't think we should try to pursue that.

The key property of a thunk, IMO, is that it is evaluated in the caller's scope. It's no different than a function defined in the caller. I don't think it would be a good substitute for late-binding default arguments. (You could make something up that uses dynamic scoping, but that's a whole different can of worms.)

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
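The getter/setter mechanism described above can be emulated by hand in current Python. The following is a rough sketch, not anything the interpreter does: the Thunk class and its get/set method names are invented for illustration, and the explicit a.set(i) and b.get() calls stand in for what a compiler would generate for a thunk parameter.

```python
class Thunk:
    """Hypothetical holder for the two hidden functions of a thunk."""

    def __init__(self, getter, setter=None):
        self._getter = getter
        self._setter = setter

    def get(self):
        # Re-evaluated on every use, in the caller's scope.
        return self._getter()

    def set(self, value):
        if self._setter is None:
            raise TypeError("thunk is not an l-value")
        self._setter(value)


def foo(a, b, n):
    # Jensen's device: assigning to `a` changes what `b` evaluates to.
    total = 0
    for i in range(1, n + 1):
        a.set(i)
        total += b.get()
    return total


def caller():
    x = None  # dummy, as in the call-site example above

    def set_x(val):
        nonlocal x
        x = val

    # Corresponds to the hypothetical foo(\x, \x**2, 10).
    return foo(Thunk(lambda: x, set_x), Thunk(lambda: x ** 2), 10)


result = caller()  # sum of i**2 for i in 1..10
```

Both closures live in caller(), which matches the point that a thunk is evaluated in the caller's scope rather than in the callee's.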
23.10.21 19:07, Chris Angelico пише:
A few years ago I proposed a syntax for optional arguments without default value:

    def spaminate(thing, count=?):
        try:
            count
        except UnboundLocalError:
            count = thing.getdefault()
        ...

It would help in cases in which we now use None or special singleton value. It is more general than late-bound arguments, because it can be used in cases in which the default argument cannot be expressed, like in getattr() and dict.pop(). The code for initialization of the default value is somewhat complicated, but we can introduce special syntax for it:

    if unset count:
        count = thing.getdefault()

or even

    count ?= thing.getdefault()
This is the largest problem of both ideas. The inspect module has no way to represent optional arguments without default value, and any solution will break user code which is not ready for this feature.
On Sun, Oct 24, 2021 at 9:52 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
Ah yes, I'd forgotten about this proposal.
True, but in the example you give here, I would definitely prefer "count=>thing.getdefault()".
This is the perfect way to stir the pot on the None-coalescing discussion, because now you're looking at a very real concept of null value :) (Although there's no way to return this null value from a function or pass it to another, so the state of being unbound should be local to a single code unit.)
There kinda sorta is a way to do it, at least for keyword arguments:

    def func_left(x, y=>len(x)):
        ...

    def func_right(x, y=?):
        if unset y:
            y = {}
        else:
            y = {"y": y}
        return func_left(-x, **y)

But if there were a value that could be passed as an argument to indicate "no value", then we'd come across the same problem of using None as a default: sometimes, literally any value could be valid, and we still need to signal the absence of a value.

ChrisA
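For comparison, the same forwarding trick can be written in current Python with a private sentinel. This is only a sketch under assumptions: _MISSING is a hypothetical module-level sentinel, func_left returns a tuple purely so the result can be inspected, and the negation of x from the example above is dropped because it would not work on a list.

```python
_MISSING = object()  # hypothetical sentinel standing in for "argument not given"


def func_left(x, y=_MISSING):
    # Emulates y=>len(x): the default is computed at call time from x.
    if y is _MISSING:
        y = len(x)
    return x, y


def func_right(x, y=_MISSING):
    # Forward y only if the caller actually supplied it, so that
    # func_left's own default logic still applies otherwise.
    kwargs = {} if y is _MISSING else {"y": y}
    return func_left(x, **kwargs)


defaulted = func_right([1, 2, 3])
explicit = func_right([1, 2, 3], y=None)  # None is an ordinary value here
```

The kwargs dictionary is what lets "no value" propagate without inventing a value that means "no value".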
It seems to me that the syntax for late binding could be chosen so as to leave the possibility of expansion open in the future, and defer (har har) the entire generalized thunk discussion? So why not go with syntax like this, where before the ? just represents a keyword to be bike shedded ("defer", "thunk", "later", "...."): def func(a, b=? a): ... I kind of like using the ellipses btw; it looks sort of like "later..." to me: def func(a, b = ... a): ...
On Sun, Oct 24, 2021 at 11:45 PM Ricky Teachey <ricky@teachey.org> wrote:
I'm not a fan of a keyword to cause late evaluation, for several reasons:

* If it's a hard keyword, then it can't be used anywhere, despite having meaning only in one specific place.
* If it's a soft keyword, there is confusion based on what's otherwise a perfectly legal identifier name.
* It's extremely long. At very best, it'll be five letters, and maybe longer, just to change when something is evaluated. That could easily be longer than the expression that it's governing.

Keywords are awesome for some situations, but not for this one IMO. Using ellipsis is kinda cute, but I don't think it'll really help here, especially as it's a perfectly valid token at that point. Consider:

    def func(a, b=...+a): ...

Is this going to attempt to add Ellipsis and whatever is in the global a, or is it going to do a unary plus on the first argument?

If thunks are introduced, they would be actual values, which should mean they can simply use the normal argument defaulting mechanism. There should be no conflict.

ChrisA
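The ambiguity can be checked against the current grammar using the standard ast module. This sketch only parses the example and never executes it; the variable names are arbitrary.

```python
import ast

source = "def func(a, b=...+a): ..."

# The header already parses today, so `...` cannot be claimed as a marker.
tree = ast.parse(source)
default = tree.body[0].args.defaults[0]

kind = type(default).__name__  # the default is a binary operation
left_is_ellipsis = (
    isinstance(default.left, ast.Constant) and default.left.value is Ellipsis
)
right_name = default.right.id  # the name looked up at definition time
```

Under the existing grammar the default is read as Ellipsis + a, evaluated at definition time, which is the first of the two interpretations in the question above.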
How about this syntax: def insort_right(a, x, lo=0, hi={len(a)}, *, key=None): … Similar to the expression curly brackets in f-string. If the user didn’t specify a value for hi, the expression between the curly brackets will be evaluated and assigned to hi. Abdulla Sent from my iPhone
On Sun, Oct 24, 2021 at 05:24:46PM +0400, Abdulla Al Kathiri wrote:
It’s not a good idea to use a mutable object as a default value anyway.
Unless you intend to use a mutable object as a default value, and have it persist from one call to the next. Then it is absolutely fine. One of the use-cases for late-binding is that it will allow a safe and obvious way to get a *new* mutable default each time you call the function: def func(arg, @more=[]) would give you a new empty list each time you call func(x), rather than the same list each time. -- Steve
On Sun, Oct 24, 2021 at 3:26 AM Ethan Furman <ethan@stoneleaf.us> wrote:
(Truncated email, I presume)

Yeah, I'm not wedded to the precise syntax. But it needs to be simple and easy to read, it needs to not be ugly, and it needs to not be valid syntax already. There are a few alternatives, but I like them even less:

    def bisect(a, x, lo=0, hi=@len(a)):
    def bisect(a, x, lo=0, hi=?len(a)):
    def bisect(a, x, lo=0, hi=>len(a)):
    def bisect(a, x, lo=0, hi=\len(a)):
    def bisect(a, x, lo=0, hi=`len(a)`):
    def bisect(a, x, lo=0, hi!=len(a)):

Feel free to propose an improvement to the syntax. Whatever spelling is ultimately used, this would still be of value.

ChrisA
On 2021-10-24 at 03:07:45 +1100, Chris Angelico <rosuav@gmail.com> wrote:
[...]
Those two paragraphs contradict each other. If the expression is evaluated in the function's context, then said evaluation is (by definition?) part of the function and not part of its arguments. As a separate matter, are the following (admittedly toy) functions (a) an infinite confusion factory, or (b) a teaching moment?

    def f1(l=[]):
        l.append(4)
        return l

    def f2(l=:[]):
        l.append(4)
        return l
On Sun, Oct 24, 2021 at 6:18 AM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
The function header is a syntactic construct - the "def" line, any decorators, annotations, etc. But for the late-binding expressions to be useful, they MUST be evaluated in the context of the function body, not its definition. That's the only way that expressions like len(a) can be of value. (Admittedly, this feature would have some value even without that, but it would be extremely surprising and restrictive.)
Teaching moment. Currently, the equivalent second function would be this:

    def f2(l=None):
        if l is None:
            l = []
        l.append(4)
        return l

And the whole "early bind or late bind" question is there just the same; the only difference is that the late binding happens somewhere inside the function body, instead of being visible as part of the function's header. (In this toy example, it's the very next line, which isn't a major problem; but in real-world examples, it's often buried deeper in the function, and it's not obvious that passing None really is the same as passing the array's length, or using a system random number generator, or constructing a new list, or whatever it is.)

This is also why the evaluation has to happen in the function's context: the two forms should be broadly equivalent. You should be able to explain a late-bound function default argument by saying "it's like using =None and then checking for None in the function body, only it doesn't use None like that".

This is, ultimately, the same teaching moment that you can get in classes:

    class X:
        items = []
        def add_item(self, item):
            self.items.append(item)

    class Y:
        def __init__(self):
            self.items = []
        def add_item(self, item):
            self.items.append(item)

Understanding these distinctions is crucial to understanding what your code is doing. There's no getting away from that.

I'm aware that blessing this with nice syntax will likely lead to a lot of people (a) using late-binding everywhere, even if it's unnecessary; or (b) using early-binding, but then treating late-binding as a magic bandaid that fixes problems if you apply it in the right places. Programmers are lazy. We don't always go to the effort of understanding what things truly do. But we can't shackle ourselves just because some people will misuse a feature - we have plenty of footguns in every language, and it's understood that programmers should be allowed to use them if they choose.

ChrisA
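Both halves of that teaching moment can be run in current Python. The sketch below uses the =None spelling of f2 (the late-bound syntax does not exist), and the variable names holding the results are made up for the demonstration.

```python
def f1(l=[]):
    # Early-bound: the single list created at definition time is reused.
    l.append(4)
    return l


def f2(l=None):
    # Late-bound by hand: a fresh list is created on every call.
    if l is None:
        l = []
    l.append(4)
    return l


class X:
    items = []  # one list, created when the class body runs

    def add_item(self, item):
        self.items.append(item)


class Y:
    def __init__(self):
        self.items = []  # a new list per instance

    def add_item(self, item):
        self.items.append(item)


f1_first = list(f1())
f1_second = list(f1())
f2_results = [f2(), f2()]

x1, x2 = X(), X()
x1.add_item("spam")
y1, y2 = Y(), Y()
y1.add_item("spam")
```

After this runs, the second f1() call has accumulated two items and x2 sees the item added through x1, while f2 and Y start fresh each time.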
On 2021-10-24 at 06:54:36 +1100, Chris Angelico <rosuav@gmail.com> wrote:
If you mean that def statements and decorators run at compile time, then I agree. If you mean something else, then I don't understand.
I think we're saying the same thing, but drawing different conclusions. I agree with everything in the first paragraph I quoted above, but I can't make the leap to claiming that late binding is part of defining the function's arguments. You say "late binding of function arguments"; I say "the part of the function that translates the arguments into something useful for the algorithm the function encapsulates."
It's only not obvious if the documentation is lacking, or the tools are lacking, or the programmer is lacking. The deeper "it" is in the function, the more you make my point that it's part of the function itself and not part of setting up the arguments.
Understanding the difference between defining a class and instantiating that class is crucial, as is noticing the very different source code contexts in which X.items and self.items are created. I agree. Stuff in class definitions (X.items, X.add_item, Y.__init__, Y.add_item) happens when X is created, arguably at compile time. The code inside the function suites (looking up and otherwise manipulating self.items) happens later, arguably at run-time. In f1, everything in the "def" statement happens when f1 is defined. In f2, part of the "def" statement (i.e., defining f2) happens when f2 is defined (at compile-time), but the other part (the logic surrounding l and its default value) happens when f2 is called (at run-time).
I won't disagree. Maybe it's just that I am the opposite of sympathetic to the itches (and those itches' underlying causes) that this particular potential footgun scratches. Curiously, for many of the same reasons, I think I'm with you that:

    def get_expensive(self):
        if not self.expensive:
            self.expensive = expensive()
        return self.expensive

is better (or at least not worse) than:

    def get_expensive(self):
        return self.expensive or (self.expensive := expensive())
On Sun, Oct 24, 2021 at 8:56 AM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
"Function header" is not about time, it's about place. def spam(x: int, y: int) -> Thingy: ... The annotations might be evaluated at function definition time (although some proposals are looking at changing that), but they may also be evaluated long before that, during a static analysis phase. We don't need a separate place in the code for "stuff that runs during static analysis", because logically, it's all about those same function parameters. Late-bound argument defaults are still argument defaults. When you're thinking about how you call the function, what matters is "this argument is optional, and if you don't specify it, this is what happens". Sometimes the definition of 'this' is a specific value (calculated by evaluating an expression at definition time). Sometimes, it's some other behaviour, defined by the function itself. All this proposal does is make the most common of those into a new option: defining it as an expression.
It currently is, due to a technical limitation. There's no particular reason that it HAS to be. For instance, consider these two:

    def popitem(items, which=-1): ...
    def popitem(items, which=len(items) - 1): ...

Both of them allow you to omit the argument and get the last one. The first one defines it with a simple value and relies on the fact that you can subscript lists with -1 to get the last element; the second doesn't currently work. Is there a fundamental difference between them, or only a technical one?
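To get the behaviour of the second popitem in current Python, the default has to move into the body. A minimal sketch, assuming a hypothetical private sentinel _MISSING and assuming the function is meant to remove and return the chosen element:

```python
_MISSING = object()  # hypothetical sentinel; not part of any real API


def popitem(items, which=_MISSING):
    # The signature no longer shows that the default is len(items) - 1;
    # that information now lives only in the body (and the docs).
    if which is _MISSING:
        which = len(items) - 1
    return items.pop(which)


data = ["a", "b", "c"]
last = popitem(data)       # defaults to the final index
first = popitem(data, 0)   # explicit index
```

The result is the same as the which=-1 version for non-empty lists, but help() would show only the sentinel's repr as the default.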
It's mainly about what [] means and when it's evaluated. Either way, self.items is a list. The only difference is whether instantiating the class creates a new list, or you keep referring to the same one every time.
Yes. It is the exact same distinction as early-bound or late-bound arguments.
I don't have any good examples where this happens, so it's hard to argue, but I definitely don't see any advantage in the second one. It's almost identically as repetitive, and offers very little advantage. If expensive() is a method call, I'd just tack an lru_cache onto it and be done with it. Having a wrapper like this just looks like a toy, and one that's hard to argue on the basis of. Would need a real example. ChrisA
On Sat, Oct 23, 2021 at 02:54:54PM -0700, 2QdxY4RzWzUUiLuE@potatochowder.com wrote: [...]
Pedantic note: def statements and decorators run at runtime, the same as all other statements. (Including class statements.) Broadly speaking (I may have some of the fine details wrong) there are three stages in executing a function: - compile-time, when the interpreter statically analyses the function source code and compiles the body of the function, plus a set of byte-codes (or equivalent) to assemble the function object; - runtime, when the def *statement* is executed by running the second lot of byte-code, and the function object is assembled -- we often call that "function definition time"; and - runtime, when the function object is actually called, and the first set of byte-code, the one that represents the body of the function, gets executed. There is no Python code executed at compile-time, and the def statement is not executed until runtime. [...]
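A small sketch of the second and third stages, using a made-up `make_default` helper that records when it runs:

```python
calls = []

def make_default():
    calls.append("evaluated")
    return 0

# The def statement below executes at runtime, and that is the moment
# the default expression is evaluated.
def spam(x=make_default()):
    return x

after_def = list(calls)     # one entry: evaluated by the def statement

spam()
spam()
after_calls = list(calls)   # still one entry: calls reuse the stored object
```

Calling `spam()` never re-runs `make_default()`; that is the early binding the thread is discussing.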
Consider what happens when you call a function now:

    bisect(alist, obj)

At runtime, the interpreter has to bind arguments to parameters (and handle any errors, such as too few or too many arguments). The bisect function takes five parameters, but only two arguments are given. So the interpreter currently has to look up default values for the missing three parameters (which it gets from the function object, or possibly the code object, I forget which). Those values are static references to objects which were evaluated at function definition time, so that the process of fetching the default is nothing more than grabbing the object from the function object.

That process of the interpreter matching up arguments to parameters, filling in missing arguments with defaults, and checking for error conditions, is not normally considered to be part of the function execution itself. It is part of the interpreter, not part of the function.

Now consider what would happen if we used late binding. Everything would be the same, *except* that instead of fetching a static reference to a pre-existing object, the interpreter would have to fetch a reference to some code, evaluate the code, and use *that* object as the default. There is no reason to consider that part of the function body, it is still performed by the interpreter.

It is only habit from 30 years of working around the lack of late-binding defaults by putting the code inside the function body that leads us to think that late-binding is necessarily part of the body.

In Python today, of course late-binding is part of the body, because that's the only way we have to delay the evaluation of an expression. But consider languages which have late-binding, I think Smalltalk and Lisp are the two major examples. I'm not an expert on either, but I can read StackOverflow and extrapolate a meaning to code :-)

    (defun test1 (&optional (x 0)) (+ x x))

is a function that takes one argument with a default value of 0.
Lisp uses late-binding: the default value is an expression (an S-expression?) that is evaluated when the code is called, not at compile time, but it is not part of the body of the function (the `(+ x x)` expression).

-- Steve
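On the "function object, or possibly the code object" question: in current CPython the early-bound defaults are stored on the function object, in `__defaults__` and `__kwdefaults__`. A minimal sketch with a stand-in function (`bisect_like` is a made-up name that only copies the five-parameter shape):

```python
def bisect_like(a, x, lo=0, hi=None, *, key=None):
    ...

# Defaults for positional-or-keyword parameters, in order.
positional_defaults = bisect_like.__defaults__

# Defaults for keyword-only parameters, by name.
keyword_only_defaults = bisect_like.__kwdefaults__
```

Filling in a missing argument is then just a lookup in these containers, which matches the "grabbing the object from the function object" description.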
On 2021-10-24 at 13:23:51 +1100, Steven D'Aprano <steve@pearwood.info> wrote:
Yep. My mistake.
[good explanation of what happens with default parameters snipped]
Aha. I see it now (and it's not just those 30 years of Python, it's the previous decade of Basic, FORTRAN, Pascal, C, etc.). My first impulse remains that all those things "the interpreter" does with default values are still part of the function, and that the shorthand declarative syntax is still just sugar for the explicit logic.
Yep. I've written some non-toy Lisp code, and perhaps because of the language in Lisp documentation ("initial forms" rather than "default values"), I definitely see all of that binding as part of the function, whether I write it in the body or in the lambda list. (FWIW, there's also a 3-tuple form of optional parameter, where the third element is a predicate that is bound to a boolean value that indicates whether the value of the argument came from the caller or the default value (thus eliminating the need for unique sentinels). If you're so inclined, find "supplied-p-parameter" on <http://www.lispworks.com/documentation/HyperSpec/Body/03_da.htm>.) As I said before, pushing this sort of logic into "late binding" scratches an itch I don't have. I gladly write two (or more) [public] functions that call a common [possibly private] function rather than one [public] function with optional arguments and default values. Easy to write; easy to read; and easy to test, maintain, and extend. Yes, theoretically, the number of functions grows exponentially, but rarely do I need more than a few such glue functions to cover most (if not all) of the real use cases.
On Sun, Oct 24, 2021 at 06:54:36AM +1100, Chris Angelico wrote: [...]
I challenge that assertion. I've never knowingly seen a function where the late binding is "buried deeper in the function", certainly not deep enough that it is not obvious. It is a very strong convention that such late binding operations occur early in the function body. You know, before you use the parameter, not afterwards *wink*

But then I mostly look at well-written functions that are usually less than two, maybe three, dozen lines long, with a manageable number of parameters. If you are regularly reading badly-written functions that are four pages long, with fifty parameters, your experience may differ :-)

The bisect function you gave earlier is a real-world example of a non-toy function. You will notice that the body of bisect_right:

- does the late binding early in the body, immediately after checking for an error condition;
- and is a manageable size (19 LOC).

https://github.com/python/cpython/blob/3.10/Lib/bisect.py

The bisect module is also good evidence that this proposal may not be as useful as we hope. We have:

    def insort_right(a, x, lo=0, hi=None, *, key=None):

which just passes the None on to bisect_right. So if we introduced optional late-binding, the bisect module has two choices:

- keep the status quo (don't use the new functionality);
- or violate DRY (Don't Repeat Yourself) by having both functions duplicate the same late-binding.

It's only a minor DRY violation, but still, if the bisect module was mine, I wouldn't use the new late-binding proposal. So I think that your case is undermined a little by your own example.

-- Steve
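A simplified sketch of the delegation described above. It is modelled on the linked module rather than copied from it: the `key` parameter is omitted and the search loop is written out so that the example runs on its own.

```python
def bisect_right(a, x, lo=0, hi=None):
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:            # manual late binding, resolved here only
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

def insort_right(a, x, lo=0, hi=None):
    # hi=None is forwarded untouched; insort_right never computes len(a).
    lo = bisect_right(a, x, lo, hi)
    a.insert(lo, x)

data = [1, 3, 5]
insort_right(data, 4)
```

With the None idiom, only one function needs to know what the real default is, which is the DRY point being made.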
On Sun, Oct 24, 2021 at 1:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
What I'm more often seeing is cases that are less obviously a late-binding, but where the sentinel is replaced with the "real" value at the point where it's used, rather than up the top of the function.
The truth is, though, that the default for hi is not None - it's really "length of the given list". Python allows us to have real defaults for parameters, rather than simply leaving trailing parameters undefined as JavaScript does; this means that you can read off the function header and see what the meaning of parameter omission is. Late-binding semantics allow this to apply even if the default isn't a constant.

If this proposal is accepted, I would adopt the DRY violation, since it would literally look like this:

    def insort_right(a, x, lo=0, hi=>len(a), *, key=None):
    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
    def insort_left(a, x, lo=0, hi=>len(a), *, key=None):
    def bisect_left(a, x, lo=0, hi=>len(a), *, key=None):

That's not really a lot of repetition, and now everyone can see that the default lo is 0 and the default hi is the length of a. Four "hi=>len(a)" isn't really different from four "hi=None" when it comes down to it.

ChrisA
On Sun, Oct 24, 2021 at 01:16:02PM +1100, Chris Angelico wrote:
Got any examples you can share? And is it really a problem if we delay the late-binding to the point where the value is actually needed? Here's a toy example:

    # Using only early binding.
    def function(spam, eggs=None, cheese=None):
        if eggs is None:
            eggs = cheap_default()
        # do stuff using eggs
        ...
        if condition:
            return result
        if cheese is None:
            cheese = expensive_default()
        # do stuff using cheese
        ...
        return result

The cheese parameter only gets used if the processing of spam with eggs fails to give a result. But if cheese is used, the default is expensive. Is it really a problem if we delay evaluating that default to the point where it is needed?

So this would be another example where automatic late-binding wouldn't be used. If the default is very expensive, I would stick to manual late-binding using None, and only evaluate it as needed.

Maybe this is an argument for some sort of thunk, as in Algol, which is only evaluated at need. Then we could just write:

    # Assume late-binding with thunks.
    def function(spam, eggs=cheap_default(), cheese=expensive_default()):
        # do stuff using eggs
        ...
        if condition:
            return result
        # do stuff using cheese
        ...
        return result

and the thunks `cheap_default()` and `expensive_default()` will only be evaluated *if they are actually needed*, rather than automatically when the function is called.

To be clear about the semantics, let me illustrate. I am deliberately not using any extra syntax for late-binding.

    # early binding (the status quo)
    def func(arg=expression):
        ...

The expression is evaluated when the def statement is run and the func object is created.

    # late binding (minus any extra syntax)
    def func(arg=expression):
        ...

The expression is evaluated eagerly when the function is called, if and only if the parameter arg has not been given a value by the caller.

    # late binding with thunk
    def func(arg=expression):
        ...

The expression is evaluated only if and when the body of the function attempts to use the value of arg, if the caller has not provided a value. So if the function looks like this:

    # late binding with a thunk that delays execution until needed
    def func(flag, arg=1/0):
        if flag:
            print("Boom!")
            return arg
        return None

then func(True) will print Boom! and then raise ZeroDivisionError, and func(False) will happily return None.

I have no idea whether thunk-like functionality is workable in Python's execution model without slowing down every object reference, but if it is possible, there could be other really nice use-cases beyond just function defaults.

-- Steve
On Sat, Oct 23, 2021 at 7:55 PM Steven D'Aprano <steve@pearwood.info> wrote:
And is it really a problem if we delay the late-binding to the point where the value is actually needed? ...
<snip> [in that case] I would stick to manual late-
I have no idea whether thunk-like functionality is workable in Python's
Your message here and my message on this passed in the mail. Yes, this is a really good point and would apply to the cases I've seen where the evaluation was in the middle. Thanks for raising it. I also don't know if it's workable but it should be considered. --- Bruce
On Sun, Oct 24, 2021 at 1:53 PM Steven D'Aprano <steve@pearwood.info> wrote:
Not from the standard library, but try something like this:

    def generate_code(secret, timestamp=None):
        ...
        ...
        ...
        ts = (timestamp or now()).to_bytes(8, "big")
        ...

The variable "timestamp" isn't ever actually set to its true default value, but logically, omitting that parameter means "use the current time", so this would be better written as:

    def generate_code(secret, timestamp=>now()):

That's what I mean by "at the point where it's used" - it's embedded into the expression that uses it. But the body of a function is its own business; what matters is the header, which is stating that the default is None, where the default is really now().
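For comparison, a runnable sketch of the same function with the default resolved at the top of the body. The `now()` helper and the return value are placeholders invented for illustration; the original elides what the function actually computes.

```python
import time

def now():
    """Hypothetical helper: current Unix time as an int."""
    return int(time.time())

def generate_code(secret, timestamp=None):
    # 'is None' rather than 'or', so an explicit timestamp of 0 is kept.
    if timestamp is None:
        timestamp = now()
    ts = timestamp.to_bytes(8, "big")
    return secret + ts    # placeholder for the real computation

fixed = generate_code(b"k", 0)
current = generate_code(b"k")
```

One side effect of the `timestamp or now()` spelling is that any falsy argument, including 0, is treated as "not given". Either way, the header still advertises None rather than "the current time".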
Delaying evaluation isn't a problem, though it also isn't usually an advantage. (If the default is truly expensive, then you wouldn't want to use this, but now we're looking at optimizations, where the decision is made to favour performance over clarity.)
The biggest problem with thunks is knowing when to trigger evaluation. We already have functions if you want to be explicit about that:

    def func(flag, arg=lambda:1/0):
        ...
        return arg()

so any thunk feature would need some way to trigger its full evaluation. Should that happen when it gets touched in any way? Or leave it until some attribute is looked up? What are the consequences of leaving it unevaluated for too long?

Thunks would be another great feature, but I think they're orthogonal to this.

ChrisA
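Filled out so that it runs, the explicit version behaves the way the earlier thunk example was described: the division only happens on the path that asks for the value. This is a sketch of the manual approach, not of any proposed thunk feature.

```python
def func(flag, arg=lambda: 1 / 0):
    if flag:
        print("Boom!")
        return arg()      # explicit trigger: the division happens here
    return None

quiet = func(False)       # the lambda is never called

boom_raised = False
try:
    func(True)            # prints "Boom!" and then divides by zero
except ZeroDivisionError:
    boom_raised = True
```

The cost of being explicit is that callers must pass a callable too, and the body has to remember to call it.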
On Sun, Oct 24, 2021 at 02:09:59PM +1100, Chris Angelico wrote:
The biggest problem with thunks is knowing when to trigger evaluation.
I think Algol solved that problem by having thunks a purely internal mechanism, not a first-class value that users could store or pass around.
We already have functions if you want to be explicit about that:
Yes, if we are satisfied with purely manually evaluating thunks. The point of a thunk though is that the interpreter knows when to evaluate it, you don't have to think about it.
Thunks would be another great feature, but I think they're orthogonal to this.
If we had thunks, that would give us late binding for free:

    def bisect(a, x, lo=0, hi=thunk len(a), *, key=None)

Aaaand we're done. So thunks would make this PEP obsolete. But if thunks are implausible, too hard, or would have too high a cost, then this PEP remains less ambitious and therefore easier.

-- Steve
On Sun, Oct 24, 2021 at 5:53 PM Steven D'Aprano <steve@pearwood.info> wrote:
If you can't pass them around, then how are they different from what's proposed here? They are simply expressions that get evaluated a bit later. You potentially win on performance if it's expensive, but you lose on debuggability when errors happen further down and less consistently, and otherwise, it's exactly the same thing.
Where else would you use thunks? I think it's exactly as ambitious, if indeed they can't be stored or passed around. ChrisA
On Sat, Oct 23, 2021 at 11:53 PM Steven D'Aprano <steve@pearwood.info> wrote:
If we had thunks, that would give us late binding for free:
def bisect(a, x, lo=0, hi=thunk len(a), *, key=None)
I'm unclear on exactly what the semantics of a thunk would be, but I don't see how it could do what you want here. In an ordinary default value, the "a" in "len(a)" refers to a variable of that name in the enclosing scope, not the argument of bisect. A generic delayed-evaluation mechanism wouldn't (shouldn't) change that.
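The scoping point can be checked with an ordinary early-bound default today. A minimal sketch; the function name `show` and the global value are made up:

```python
a = "global value"

def show(a, n=len(a)):    # this 'a' is the module-level name, read once
    return n

result = show([1, 2, 3])  # the list argument plays no part in the default
```

Here `result` is the length of the global string, not of the list that was passed in.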
I like that you're trying to fix this wart! I think that using a different syntax may be the only way out. My own bikeshed color to try would be `=>`, assuming we'll introduce `(x) => x+1` as the new lambda syntax, but I can see problems with both as well :-). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Sun, Oct 24, 2021 at 6:55 AM Guido van Rossum <guido@python.org> wrote:
I like that you're trying to fix this wart! I think that using a different syntax may be the only way out. My own bikeshed color to try would be `=>`, assuming we'll introduce `(x) => x+1` as the new lambda syntax, but I can see problems with both as well :-).
Sounds good. I can definitely get behind this as the preferred syntax, until such time as we find a serious problem. ChrisA
El sáb, 23 oct 2021 a las 12:57, Guido van Rossum (<guido@python.org>) escribió:
    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):

This reads to me like we're putting "hi" into "len(a)", when it's in fact the reverse. What about:

    def bisect_right(a, x, lo=0, hi<=len(a), *, key=None):

Another option (going back to Chris's original suggestion) could be:

    def bisect_right(a, x, lo=0, hi:=len(a), *, key=None):

Which is the same as the walrus operator, leaning on the idea that this is kind of like the walrus: a name gets assigned based on something evaluated right here.

Bikeshedding aside, thanks Chris for the initiative here! This is a tricky corner of the language and a promising improvement.
On Sat, Oct 23, 2021 at 6:23 PM Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote:
I think in most cases what's on the right side will be something that's not assignable. Likewise with the proposal to use => for lambda, someone could read (a => a + 1) as putting a into a + 1. I think they're going to get over that.

Every language I am aware of that has adopted a shorthand lambda notation (without a keyword) has used => or -> except APL, Ruby, Smalltalk. See https://en.wikipedia.org/wiki/Anonymous_function

APL uses a tacit syntax while Ruby and Smalltalk use explicit syntaxes. The equivalent of x => x + 1 in each of these is

    APL        ⍺+1 (I think)
    Ruby       |x| x + 1
    Smalltalk  [ :x | x + 1 ]

--- Bruce
On Sun, Oct 24, 2021 at 1:48 PM Bruce Leban <bruce@leban.us> wrote:
Anonymous functions are an awkward parallel here. The notation you're describing will create a function which accepts one argument, and then returns a value calculated from that argument. We're actually doing the opposite: hi is being set to len(a), it's not that len(a) is being calculated from hi. That said, though, I still count "=>" among my top three preferences (along with "=:" and "?="), and flipping the arrow to "<=" is too confusable with the less-eq operator. ChrisA
On Sat, Oct 23, 2021 at 7:55 PM Chris Angelico <rosuav@gmail.com> wrote:
Sorry I was less than clear. The syllogism here is

(1) late-evaluated argument default should use => because that's the proposal for shorthand lambda
(2) shorthand lambda should use => because that's what other languages use.

I was talking about (2) but I should have been explicit. And yes, you highlight a potential source of confusion.

    def f(x=>x + 1): ...

means that x is 1 more than the value of x from the enclosing global scope (at function call time) while

    g = x => x + 1

sets g to a single-argument function that adds 1 to its argument value.

--- Bruce
On Sat, 23 Oct 2021 at 17:09, Chris Angelico <rosuav@gmail.com> wrote:
+1 from me. I agree that getting a good syntax will be tricky, but I like the functionality. I do quite like Guido's "hi=>len(a)" syntax, but I admit I'm not seeing the potential issues he alludes to, so maybe I'm missing something :-) Paul
On Sat, Oct 23, 2021 at 12:56 PM Guido van Rossum <guido@python.org> wrote:
+1 to this spelling. I started writing a message arguing that this should be spelled with lambda because the fact that you're (effectively) writing a function should be explicit (see below). But the syntax is ugly needing both a lambda and a variant = operator. This solves that elegantly. On Sat, Oct 23, 2021 at 9:10 AM Chris Angelico <rosuav@gmail.com> wrote:
The syntax I've chosen is deliberately subtle, since - in many many
cases - it won't make any difference whether the argument is early or late bound, so they should look similar.
I think a subtle difference in syntax is a bad idea since this is not a subtle difference in behavior. If it makes no difference whether the argument is early or late bound then you wouldn't be using it.

Here's one way you could imagine writing this today:

    def bisect(a, x, lo=0, hi=lambda: len(a)):
        hi = hi() if callable(hi) else hi
        ...

which is clumsy and more importantly doesn't work because the binding of the lambda occurs in the function definition context which doesn't have access to the other parameters.

<deleted some discussion about alternative syntaxes because Guido's suggestion solves it elegantly>

--- Bruce
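The "doesn't work" part can be demonstrated directly. In this sketch the body is cut down to return `hi` so that the effect is visible; the global assigned afterwards is an arbitrary stand-in.

```python
def bisect(a, x, lo=0, hi=lambda: len(a)):
    hi = hi() if callable(hi) else hi
    return hi

# With no outer 'a', the lambda cannot find the parameter of that name.
outcome = None
try:
    bisect([1, 2, 3], 2)
except NameError:
    outcome = "NameError"

# With an unrelated outer 'a', it silently measures the wrong object.
a = "unrelated global"
with_global = bisect([1, 2, 3], 2)
```

The second case is arguably the worse one: no error is raised, and the default is simply computed from the wrong `a`.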
On Sun, Oct 24, 2021 at 7:58 AM Bruce Leban <bruce@leban.us> wrote:
Agreed; if it needs to remain self-contained, then yes, it would have to be a function, and there'd be good reason for making it look like one. As it is, it's more just "code that happens at the start of the function", but I think there's still value in using a syntax that people understand as a late binding system (as lambda functions will be).
There IS a difference, but the difference should be subtle. Consider:

    x = 5
    x = 5.

These are deliberately similar, but they are quite definitely different, and they behave differently. (You can't use the second one to index a list, for instance.) Does the difference matter? Absolutely. Does the similarity matter? Yep. Similar things should look similar.

The current front-runner syntax is:

    def bisect(a, x, lo=0, hi=>len(a)):

This is only slightly less subtle. It's still a one-character difference which means that instead of being evaluated at definition time, it's evaluated at call time. This is deliberate; it should still look like (a) a parameter named "hi", (b) which is optional, and (c) which will default to the result of evaluating "len(a)". That's a good thing.
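The parenthetical about indexing is easy to confirm. A tiny sketch, with a second name `y` standing in for the `5.` spelling:

```python
items = ["a", "b", "c", "d", "e", "f"]

x = 5     # int
y = 5.    # float: one extra character

equal = (x == y)          # the two values compare equal
by_int = items[x]         # indexing with the int works

float_rejected = False
try:
    items[y]              # indexing with the float does not
except TypeError:
    float_rejected = True
```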
Right. It also doesn't solve the help() problem, since it's just going to show the (not-very-helpful) repr of a lambda function. It's not really much better than using object() as a sentinel, although it does at least avoid the global-pollution problem. It seems like there's broad interest in this, but a lot of details to nut out. I think it may be time for me to write up a full PEP. Guido, if I'm understanding recent SC decisions correctly, a PEP editor can self-sponsor, correct? ChrisA
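What introspection reports for the two workarounds can be checked with `inspect.signature`. A minimal sketch reusing the function shapes from earlier in the thread, with empty bodies:

```python
import inspect

_missing = object()

def spaminate(thing, count=_missing):
    ...

def bisect(a, x, lo=0, hi=lambda: len(a)):
    ...

# Both defaults appear as opaque reprs rather than as the intended meaning.
sig_sentinel = str(inspect.signature(spaminate))
sig_lambda = str(inspect.signature(bisect))
```

Neither string tells a reader that the effective default is `thing.getdefault()` or `len(a)`; the memory addresses in the reprs also change from run to run.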
+1 on the idea. Sometimes early binding is needed, sometimes late binding is needed. So Python should provide both. QED 😁

I'm not keen on the var => expr syntax. IMO the arrow is pointing the wrong way. expr is assigned to var.

Some possible alternatives, if there is no technical reason they wouldn't work (as far as I know they are currently a syntax error, and thus unambiguous, inside a function header, unless annotations change things. I know nothing about annotations):

    var <= exp      # Might be confused with less-than-or-equals
    var := expr     # Reminiscent of the walrus operator in other contexts.
                    # This might be considered a good thing or a bad thing.
                    # Possibly too similar to `var = expr`
    var : expr      # Less evocative. Looks like part of a dict display.
    var <- expr     # Trailing `-` is confusing: is it part of (-expr) ?
    var = <expr>    # Hm. At least it's clearly distinct from `var=expr`.
    var << expr     # Ditto.
    (var = expr)
    [var = expr]    # Ditto

And as I'm criticising my own suggestions, I'll do the same for the ones in the PEP:

    def bisect(a, hi=:len(a)):    # Just looks weird, doesn't suggest anything to me
    def bisect(a, hi?=len(a)):    # Hate question marks (turning Python into Perl?)
                                  # Ditto for using $ or % or ^ or &
    def bisect(a, hi!=len(a)):    # Might be confused with not-equals
    def bisect(a, hi=\len(a)):    # `\` looks like escape sequence or end of line
                                  # What if you wanted to break a long line at that point?
    def bisect(a, hi=`len(a)`):   # Backquotes are fiddly to type. Is ` len(a) ` allowed?
    def bisect(a, hi=@len(a)):    # Just looks weird, doesn't suggest anything to me

My personal current (subject to change) preference would be `var := expr`.

All this of course is my own off-the-cuff subjective reaction. Y M M (probably will) V.

Best wishes
Rob Cliffe

On 23/10/2021 23:08, Chris Angelico wrote:
On Sun, Oct 24, 2021 at 12:44 PM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
That's definitely open to discussion.
Annotations come before the default, so the question is mainly whether it's possible for an annotation to end with the symbol in question.
var <= exp # Might be confused with less-than-or-equals
Not a fan, for that exact reason.
This is less concerning than <=, since... ahem. I spy a missed opportunity here. Must correct. This is less confusing or equally confusing as <=, since ":=" is a form of name binding, which function parameters are doing. Not a huge fan but I'd be more open to this than the leftward-arrow.
var : expr # Less evocative. Looks like part of a dict display.
(That is precisely the syntax for annotations, so that one's out)
var <- expr # Trailing `-` is confusing: is it part of (-expr) ?
Not a fan. I don't like syntaxes where a space makes a distinct difference to the meaning.
var = <expr> # Hm. At least it's clearly distinct from `var=expr`.
It took me a couple of readings to understand this, since <expr> is a common way of indicating a placeholder. This might be a possibility, but it would be very difficult to discuss it.
var << expr # Ditto.
Like <=, this is drawing an analogy with an operator that has nothing to do with name binding, so I think it'll just be confusing.
(var = expr) [var = expr] # Ditto
Bracketing the argument seems like a weird way to indicate late-binding, but if anything, I'd go with the round ones.
Yeah, I don't like != for the same reason that I don't like <= or <<.
def bisect(a, hi=\len(a)): # `\` looks like escape sequence or end of line # What if you wanted to break a long line at that point?
No worse than the reuse of backslashes in regexes and string literals.
def bisect(a, hi=`len(a)`): # Backquotes are fiddly to type. Is ` len(a) ` allowed?
Not sure why it wouldn't - the spaces don't change the expression. Bear in mind that the raw source code of the expression would be saved for inspection, so there'd be a difference with help(), but other than that, it would have the same effect.
def bisect(a, hi=@len(a)): # Just looks weird, doesn't suggest anything to me
Agreed, don't like that one.
Subjective reactions are extremely important. If the syntax is ugly, the proposal is weak. But I can add := to the list of alternates, for the sake of having it listed. ChrisA
+1 to this idea -- thanks Chris A! On Sat, Oct 23, 2021 at 2:01 PM Bruce Leban <bruce@leban.us> wrote:
However, it will work if there happens to be an 'a' defined in the scope where the function is created. So that means that it could be VERY confusing if the syntax for an anonymous function is the same as (or very similar to) the syntax for delayed evaluation of parameter defaults. I think Steven may have posted an example of what it would look like.

I'll also add that, looking again, an "arrow"-like symbol really is very odd in this context: as others have pointed out, it's pointing in the wrong direction. Not that I have a better idea.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Sun, Oct 24, 2021 at 1:12 PM Ricky Teachey <ricky@teachey.org> wrote:
That's something where we may need to get a reference implementation before deciding, but I am open to either of two possibilities: 1) Keyword args are resolved before late-bound defaults, so your example would technically work, despite being confusing 2) Late-bound defaults explicitly reject (with SyntaxError) any references to arguments to their right. Python already enforces this kind of check:
Even if it IS legal, I would say that this would be something to avoid. Same with using assignment expressions to mutate other arguments - I suspect that this won't cause technical problems, but it would certainly hurt the brains of people who read it:

    def f(a, b=>c:=a, c=>b:=len(a)):
        print("wat")

Yeah, just don't :)

Not a dumb question. It's a little underspecified in the PEP at the moment (I only mention that they may refer to previous values, but not whether subsequently-named arguments are fair game), and am open to discussion about whether this should be locked down. Currently, I'm inclined to be permissive, and let people put anything they like there, just like you can put crazy expressions into other places where they probably wouldn't improve your code :)

ChrisA
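For orientation, this is roughly how left-to-right resolution of defaults looks when emulated by hand today. It is a sketch of one possible reading, not the PEP's specified behaviour; the `_UNSET` sentinel and the doubling rule for `c` are invented.

```python
_UNSET = object()

def f(a, b=_UNSET, c=_UNSET):
    # All passed arguments are already bound by the time the body runs,
    # so each default may refer to anything on its left.
    if b is _UNSET:
        b = len(a)
    if c is _UNSET:
        c = b * 2         # b is resolved by now, whether passed or defaulted
    return a, b, c
```

A default for `b` that referred to `c` would only see a real value when the caller passed `c` explicitly, which is the confusing case being discussed.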
On 2021-10-23 09:07, Chris Angelico wrote:
I'm -1 on it. For me the biggest problem with this idea is that it only handles a subset of cases, namely those that can be expressed as an expression inlined into the function definition. This subset is too small, because we'll still have to write code in the function body for cases where the default depends on more complex logic. But it is also too large, because it will encourage people to cram complex expressions into the function definition. To me, this is definitely not worth adding special syntax for. I seem to be the only person around here who detests "ASCII art" "arrow" operators but, well, I do, and I'd hate to see them used for this. The colon or alternatives like ? or @ are less offensive but still too inscrutable to be used for something that can already be handled in a more explicit way. I do have one other specific objection to the rationale:
Not really. help() shows the *documentation* of the function. A person calling it should *read the documentation*, not just glance at the function signature. I don't see any compelling benefit to having a mini-lambda be retrievable via introspection tools. There is simply no substitute for actually reading (and writing) the documentation. Also, insofar as glancing at the function signature is useful, I suspect that putting this change in will *also* lead to help() being unhelpful, because, as I mentioned above, if the default uses anything but the most trivial logic, the signature will become cluttered with stuff that ought to be separated out as actual logic. I would prefer to see this situation handled as part of a larger-scale change of adding some kind of "inline lambda" which executes directly in the calling scope. (I think this is similar to the "deferred computation" idea mentioned by David Mertz elsewhere in the thread.) This would also allow extracting the logic out of the function definition into a separate variable (holding the "inline lambda"), which could help with cases similar to the bisect examples discussed elsewhere in the thread, where multiple functions share late-binding logic. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 24, 2021 at 3:51 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
These two considerations, together, are the exact push that programmers need: keep the expression short, don't cram everything into the function definition. It's like writing a list comprehension; technically you can put any expression into the body of it, but it's normally going to be short enough to not get unwieldy. ChrisA
On Sat, Oct 23, 2021 at 08:29:54PM -0700, Brendan Barnwell wrote:
True. But that's equally true for default values under early binding. Would you say that existing syntax for default values is problematic because it only covers the 90% of cases where the default value can be computed from an expression? If push comes to shove, we can always write a helper function.
Just like they do now?

I think people can write bad code regardless of the features available, but I don't think that adding late binding will particularly make that tendency worse. Most uses of late binding will be pretty simple:

- mutable literals like [] and {};
- length of another argument (like the bisect example);
- references to an attribute of self;
- call out to another function.

I don't think that there is good reason to believe that adding late binding will cause a large increase in the amount of overly complex default values. As you say, they are limited to a single expression, so the really complex blocks will have to stay inside the body of the function where they belong.

And we have thirty years of default values, and no sign that people abuse them by writing lots of horrific defaults:

    def func(arg=sum(min(seq) for seq in [[LOOKUP[key]]+list(elements)
                 for key, elements in zip(map(Mapper(spam, eggs), keys),
                 iterable_of_sets) if condition(key)] if len(seq) > 5)):
        ...

People just don't do that sort of thing in anywhere near enough numbers to worry about them doing it just because we have late binding.

And if they do? "People will write crappy code" is a social problem, not a technology problem, which is best solved socially, using code reviews, linters, a quick application of the Clue Bat to the offending coder's head, etc.
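For reference, the four simple cases listed above can be sketched with the None-sentinel idiom that works in Python today. Every function and class name here is hypothetical and chosen only for illustration; none of them comes from the thread or the standard library.

```python
def append_item(item, bucket=None):
    # Mutable literal: build a fresh list on each call rather than
    # sharing one list object across calls.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def count_below(a, x, hi=None):
    # Length of another argument, in the spirit of the bisect example.
    if hi is None:
        hi = len(a)
    return sum(1 for value in a[:hi] if value < x)


class Spammer:
    default_count = 3

    def spam(self, count=None):
        # Reference to an attribute of self.
        if count is None:
            count = self.default_count
        return ["spam"] * count


def default_suffix():
    return "v1"


def make_label(prefix, suffix=None):
    # Call out to another function, evaluated at call time.
    if suffix is None:
        suffix = default_suffix()
    return f"{prefix}-{suffix}"
```

In each case the "real" default is a single short expression, which is the kind of default a late-bound syntax would move from the body into the signature.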
That would be what I called a "thunk" in two posts now, stealing the term from Algol.

It would be nice if one of the core devs who understand the deep internals of the interpreter could comment on whether that sort of delayed evaluation of an expression is even plausible for Python.

If it is, then I agree: we should focus on a general thunk mechanism, which would then give us late binding defaults for free, plus many more interesting use-cases.

(Somewhere in my hard drive I have a draft proto-PEP regarding this.)

But if it is not plausible, then a more limited mechanism for late bound defaults will, I think, be a useful feature that improves the experience of writing functions.

We already document functions like this:

    def bisect(a, x, lo=0, hi=len(a))

It would be an improvement if we could write them like that too.

--
Steve
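As a rough illustration of the delayed-evaluation idea, here is a sketch of an explicit thunk built from an ordinary callable. This is not the mechanism discussed in the thread: a language-level thunk would presumably be evaluated implicitly on use, whereas this hypothetical Thunk class needs an explicit force() call. The names Thunk, force, expensive and calls are all assumptions made for the example.

```python
class Thunk:
    """Hypothetical helper: hold an expression and evaluate it on first use."""

    _UNSET = object()

    def __init__(self, func):
        self._func = func            # zero-argument callable wrapping the expression
        self._value = Thunk._UNSET   # cached result, filled in by force()

    def force(self):
        # Evaluate at most once, then reuse the cached value.
        if self._value is Thunk._UNSET:
            self._value = self._func()
        return self._value


calls = []


def expensive():
    calls.append("evaluated")
    return 42


t = Thunk(expensive)
evaluated_before = len(calls)  # creating the thunk has not run expensive()
first = t.force()              # first use triggers evaluation
second = t.force()             # second use returns the cached value
```

The explicit force() call is exactly the part a built-in mechanism would have to hide, which is why the question about interpreter internals matters.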
On 2021-10-23 23:33, Steven D'Aprano wrote:
I understand your point, but there is an important difference between the current situation and the proposal. Right now, the function definition executes in the enclosing scope. That means that there is nothing you can do in a default-argument expression that you can't do by assigning to a variable in a separate line (or lines) before defining the function. But with this proposal, the function definition will gain the new ability to express logic that executes in the function body environment, with the restriction that it be only a single expression.

It's true that any feature can be abused, but I think this temptation to shoehorn function logic into the argument list will be more likely to result in unwieldy signatures than the current situation.

--
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."
   --author unknown
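The scoping distinction drawn above can be checked with current Python. In this sketch (all names hypothetical), an early-bound default is evaluated once in the enclosing scope when the def statement runs, so it is interchangeable with a variable assigned beforehand; a default that depends on another parameter can only be computed inside the body.

```python
import inspect

limit = 10


def early(n=limit * 2):
    # The default expression was evaluated when this def statement ran.
    return n


# Equivalent spelling: compute the value on a separate line first.
_default_n = limit * 2


def early_equivalent(n=_default_n):
    return n


limit = 100  # rebinding the name later changes neither default


def tail(seq, start=None):
    # Depends on another parameter, so today it has to live in the body.
    if start is None:
        start = len(seq) - 1
    return seq[start:]


stored_default = inspect.signature(early).parameters["n"].default
print(early(), early_equivalent(), stored_default, tail([1, 2, 3]))
```

The first two functions behave identically, which is the "nothing you can't do on a separate line" point; tail() is the kind of logic the proposal would allow in the parameter list.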
participants (15)
- 2QdxY4RzWzUUiLuE@potatochowder.com
- Abdulla Al Kathiri
- Ben Rudiak-Gould
- Brendan Barnwell
- Bruce Leban
- Chris Angelico
- Christopher Barker
- Ethan Furman
- Guido van Rossum
- Jelle Zijlstra
- Paul Moore
- Ricky Teachey
- Rob Cliffe
- Serhiy Storchaka
- Steven D'Aprano