PEP 671: Syntax for late-bound function argument defaults
Incorporates comments from the thread we just had.

Is anyone interested in coauthoring this with me? Anyone who has strong interest in seeing this happen - whether you've been around the Python lists for years, or you're new and interested in getting involved for the first time, or anywhere in between!

https://www.python.org/dev/peps/pep-0671/

PEP: 671
Title: Syntax for late-bound function argument defaults
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Oct-2021
Python-Version: 3.11
Post-History: 24-Oct-2021

Abstract
========

Function parameters can have default values which are calculated during function definition and saved. This proposal introduces a new form of argument default, defined by an expression to be evaluated at function call time.

Motivation
==========

Optional function arguments, if omitted, often have some sort of logical default value. When this value depends on other arguments, or needs to be reevaluated each function call, there is currently no clean way to state this in the function header.

Currently-legal idioms for this include::

    # Very common: Use None and replace it in the function
    def bisect_right(a, x, lo=0, hi=None, *, key=None):
        if hi is None:
            hi = len(a)

    # Also well known: Use a unique custom sentinel object
    _USE_GLOBAL_DEFAULT = object()
    def connect(timeout=_USE_GLOBAL_DEFAULT):
        if timeout is _USE_GLOBAL_DEFAULT:
            timeout = default_timeout

    # Unusual: Accept star-args and then validate
    def add_item(item, *optional_target):
        if not optional_target:
            target = []
        else:
            target = optional_target[0]

In each form, ``help(function)`` fails to show the true default value. Each one has additional problems, too: using ``None`` is only valid if None is not itself a plausible function parameter; the custom sentinel requires a global constant; and use of star-args implies that more than one argument could be given.

Specification
=============

Function default arguments can be defined using the new ``=>`` notation::

    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
    def connect(timeout=>default_timeout):
    def add_item(item, target=>[]):

The expression is saved in its source code form for the purpose of inspection, and bytecode to evaluate it is prepended to the function's body.

Notably, the expression is evaluated in the function's run-time scope, NOT the scope in which the function was defined (which is where early-bound defaults are evaluated). This allows the expression to refer to other arguments.

Self-referential expressions will result in UnboundLocalError::

    def spam(eggs=>eggs):  # Nope

Multiple late-bound arguments are evaluated from left to right, and can refer to previously-calculated values. Order is defined by the function, regardless of the order in which keyword arguments may be passed.

Choice of spelling
------------------

Our chief syntax proposal is ``name=>expression`` -- our two syntax proposals ... ahem. Amongst our potential syntaxes are::

    def bisect(a, hi=>len(a)):
    def bisect(a, hi=:len(a)):
    def bisect(a, hi?=len(a)):
    def bisect(a, hi!=len(a)):
    def bisect(a, hi=\len(a)):
    def bisect(a, hi=`len(a)`):
    def bisect(a, hi=@len(a)):

Since default arguments behave largely the same whether they're early or late bound, the preferred syntax is very similar to the existing early-bind syntax. The alternatives offer little advantage over the preferred one.

How to Teach This
=================

Early-bound default arguments should always be taught first, as they are the simpler and more efficient way to evaluate arguments. Building on them, late-bound arguments are broadly equivalent to code at the top of the function::

    def add_item(item, target=>[]):

    # Equivalent pseudocode:
    def add_item(item, target=<OPTIONAL>):
        if target was omitted:
            target = []

Open Issues
===========

- yield/await? Will they cause problems? Might end up being a non-issue.

- annotations? They go before the default, so is there any way an anno could want to end with ``=>``?

References
==========

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
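As a concrete, runnable companion to the "How to Teach This" section above, here is the `target=>[]` example spelled with today's sentinel idiom (a sketch only; the `=>` form itself is proposed syntax and does not run anywhere yet):

```python
_MISSING = object()  # unique sentinel: distinguishes "omitted" from any real value

def add_item(item, target=_MISSING):
    # today's emulation of the proposed: def add_item(item, target=>[])
    if target is _MISSING:
        target = []  # evaluated afresh on every call that omits target
    target.append(item)
    return target

print(add_item(1))  # [1] -- a fresh list each call, unlike an early-bound target=[]
print(add_item(2))  # [2], not [1, 2]
```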
What would the syntax for the default be? I would presume just a single expression, like a lambda, but what about something like an async function? For example:
```py
async def foo(arg=>await get_arg()):
```
Would this work?
On Mon, Oct 25, 2021 at 1:16 AM Zomatree . <angelokontaxis@hotmail.com> wrote:
What would the syntax for the default be? I would presume just a single expression, like a lambda, but what about something like an async function? For example:
```py
async def foo(arg=>await get_arg()):
```
Would this work?
That's an open question at the moment, but I suspect that it will be perfectly acceptable. Same with a yield expression in a generator. ChrisA
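For what it's worth, the await case already has a natural sentinel spelling today, since an async function body may await freely; a sketch of the equivalence (get_arg here is a stand-in):

```python
import asyncio

_MISSING = object()

async def get_arg():
    return 42

async def foo(arg=_MISSING):
    # today's spelling of the proposed: async def foo(arg=>await get_arg())
    if arg is _MISSING:
        arg = await get_arg()  # the await runs inside the coroutine, at call time
    return arg

print(asyncio.run(foo()))   # 42
print(asyncio.run(foo(7)))  # 7
```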
Eight hours from the initial post on Python-Ideas, to a PEP, with just eight responses from six people. Is that some sort of a record? And in the wee hours of the morning too (3am to 11am local time). I thought my sleep habits were bad. Do you not sleep any more? :-) -- Steve
On Sun, Oct 24, 2021 at 12:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Eight hours from the initial post on Python-Ideas, to a PEP, with just eight responses from six people. Is that some sort of a record?
And in the wee hours of the morning too (3am to 11am local time). I thought my sleep habits were bad. Do you not sleep any more? :-)
Fair point, but this is something that just keeps on coming up in some form or another. Anyway, if it ends up going nowhere, it's still not wasted time. Sleep? What is sleep? https://docs.python.org/3/library/time.html#time.sleep Ah yes. That. :) ChrisA
From what I've seen so far, I'm -0 on this. I understand the pattern it addresses, but it doesn't feel all that common, nor that hard to address with the existing sentinel-check pattern shown in the PEP draft. This just doesn't feel big enough to merit its own syntax.

... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details). However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases.

On Sat, Oct 23, 2021, 8:15 PM Chris Angelico <rosuav@gmail.com> wrote:
Incorporates comments from the thread we just had.
Is anyone interested in coauthoring this with me? Anyone who has strong interest in seeing this happen - whether you've been around the Python lists for years, or you're new and interested in getting involved for the first time, or anywhere in between!
https://www.python.org/dev/peps/pep-0671/
PEP: 671
Title: Syntax for late-bound function argument defaults
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Oct-2021
Python-Version: 3.11
Post-History: 24-Oct-2021

Abstract
========
Function parameters can have default values which are calculated during function definition and saved. This proposal introduces a new form of argument default, defined by an expression to be evaluated at function call time.
Motivation
==========
Optional function arguments, if omitted, often have some sort of logical default value. When this value depends on other arguments, or needs to be reevaluated each function call, there is currently no clean way to state this in the function header.
Currently-legal idioms for this include::
    # Very common: Use None and replace it in the function
    def bisect_right(a, x, lo=0, hi=None, *, key=None):
        if hi is None:
            hi = len(a)

    # Also well known: Use a unique custom sentinel object
    _USE_GLOBAL_DEFAULT = object()
    def connect(timeout=_USE_GLOBAL_DEFAULT):
        if timeout is _USE_GLOBAL_DEFAULT:
            timeout = default_timeout

    # Unusual: Accept star-args and then validate
    def add_item(item, *optional_target):
        if not optional_target:
            target = []
        else:
            target = optional_target[0]
In each form, ``help(function)`` fails to show the true default value. Each one has additional problems, too: using ``None`` is only valid if None is not itself a plausible function parameter; the custom sentinel requires a global constant; and use of star-args implies that more than one argument could be given.
Specification
=============
Function default arguments can be defined using the new ``=>`` notation::
    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
    def connect(timeout=>default_timeout):
    def add_item(item, target=>[]):
The expression is saved in its source code form for the purpose of inspection, and bytecode to evaluate it is prepended to the function's body.
Notably, the expression is evaluated in the function's run-time scope, NOT the scope in which the function was defined (which is where early-bound defaults are evaluated). This allows the expression to refer to other arguments.
Self-referential expressions will result in UnboundLocalError::
    def spam(eggs=>eggs):  # Nope
Multiple late-bound arguments are evaluated from left to right, and can refer to previously-calculated values. Order is defined by the function, regardless of the order in which keyword arguments may be passed.
Choice of spelling
------------------
Our chief syntax proposal is ``name=>expression`` -- our two syntax proposals ... ahem. Amongst our potential syntaxes are::
    def bisect(a, hi=>len(a)):
    def bisect(a, hi=:len(a)):
    def bisect(a, hi?=len(a)):
    def bisect(a, hi!=len(a)):
    def bisect(a, hi=\len(a)):
    def bisect(a, hi=`len(a)`):
    def bisect(a, hi=@len(a)):
Since default arguments behave largely the same whether they're early or late bound, the preferred syntax is very similar to the existing early-bind syntax. The alternatives offer little advantage over the preferred one.
How to Teach This
=================

Early-bound default arguments should always be taught first, as they are the simpler and more efficient way to evaluate arguments. Building on them, late-bound arguments are broadly equivalent to code at the top of the function::
    def add_item(item, target=>[]):

    # Equivalent pseudocode:
    def add_item(item, target=<OPTIONAL>):
        if target was omitted:
            target = []
Open Issues
===========
- yield/await? Will they cause problems? Might end up being a non-issue.
- annotations? They go before the default, so is there any way an anno could want to end with ``=>``?
References
==========

Copyright
=========
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
On Sun, Oct 24, 2021 at 05:49:50AM +0400, David Mertz, Ph.D. wrote:
... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details).
However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases.
Dynamic scoping is not the same as deferred computation. -- Steve
On Sat, Oct 23, 2021, 10:58 PM Steven D'Aprano
... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details).
However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases.
Dynamic scoping is not the same as deferred computation.
Of course not generally. But a dynamic deferred could cover this specific desire of the proposal.

So strawman proposal:

    def fun(seq, low=0, high=defer: len(seq)):
        assert low < high
        # other stuff...

Where the implication here is that the "defer expression" creates a dynamic scope. The reason this could appeal to me is that it wouldn't be limited to function signatures, nor even necessarily most useful there. Instead, a deferred object would be a completely general thing that could be bound anywhere any object can. Such a capability would allow passing around potential computational "code blocks", but only perform the work if or when a value is required.
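The lazy half of this strawman (without the dynamic scoping) can be sketched in today's Python; Deferred and force are illustrative names, not a proposed API. Note that the lambda closes over its defining scope lexically, which is exactly the scoping question raised next:

```python
class Deferred:
    """Wrap a zero-argument callable; evaluate at most once, on demand."""
    _UNSET = object()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = Deferred._UNSET

    def force(self):
        if self._value is Deferred._UNSET:
            self._value = self._thunk()  # the work happens only here, only once
        return self._value

expensive = Deferred(lambda: sum(range(10**6)))  # nothing computed yet
# ... pass `expensive` around freely ...
print(expensive.force())  # computed now: 499999500000
print(expensive.force())  # cached; no recomputation
```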
On Sun, Oct 24, 2021 at 2:21 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 23, 2021, 10:58 PM Steven D'Aprano
... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details).
However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases.
Dynamic scoping is not the same as deferred computation.
Of course not generally. But a dynamic deferred could cover this specific desire of the proposal.
So strawman proposal:
def fun(seq, low=0, high=defer: len(seq)):
    assert low < high
    # other stuff...
Where the implication here is that the "defer expression" creates a dynamic scope.
At what point is this defer-expression to be evaluated? For instance:

    def f(x=defer: a + b):
        a, b = 3, 5
        return x

Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.

ChrisA
On Sat, Oct 23, 2021, 11:28 PM Chris Angelico <rosuav@gmail.com> wrote:
So strawman proposal:

def fun(seq, low=0, high=defer: len(seq)):
    assert low < high
    # other stuff...
Where the implication here is that the "defer expression" creates a dynamic scope.
At what point is this defer-expression to be evaluated? For instance:
def f(x=defer: a + b):
    a, b = 3, 5
    return x
Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
This would return 8. Basically as if the expression were passed as a string, and `eval(x)` were run when the name "x" was looked up within a scope.

An elaboration of this strawman could allow partial binding at the static scope as well. E.g.

    cursor = db_connection.cursor()
    table = "employees"
    expensive_query = defer c=cursor, t=table: c.execute(
        f"SELECT * FROM {t} WHERE name={name}")

    def employee_check(q=expensive_query):
        if random.random() > 0.5:
            name = "Smith"
            return q

So 'c' and 't' would be closed over when the deferred is defined, but 'name' would utilize the dynamic scope. In particular, the expensive operation of resolving the deferred would only occur if it was needed.
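David's "as if `eval(x)` were run" description can be illustrated with plain eval today; this is purely an illustration of the intended semantics, not a recommended technique:

```python
def f(x="a + b"):
    # stand-in for the proposed: def f(x=defer: a + b)
    a, b = 3, 5
    return eval(x)  # evaluated in f's scope at lookup time, so it sees a and b

print(f())         # 8
print(f("a * b"))  # 15 -- the expression travels as data until evaluated
```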
On Sat, Oct 23, 2021, 11:46 PM David Mertz, Ph.D.
def f(x=defer: a + b):
    a, b = 3, 5
    return x
Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
Oh... Keep in mind I'm proposing a strawman deliberately, but the most natural approach to keeping an object deferred rather than evaluated is simply to say so:

    def f(x=defer: a + b):
        a, b = 3, 5
        fn2(defer: x)  # look for local a, b within fn2() if needed
        # ... other stuff
        return x  # return 8 here
On Sun, Oct 24, 2021 at 2:52 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 23, 2021, 11:46 PM David Mertz, Ph.D.
def f(x=defer: a + b):
    a, b = 3, 5
    return x
Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
Oh... Keep in mind I'm proposing a strawman deliberately, but the most natural approach to keeping an object deferred rather than evaluated is simply to say so:
def f(x=defer: a + b):
    a, b = 3, 5
    fn2(defer: x)  # look for local a, b within fn2() if needed
    # ... other stuff
    return x  # return 8 here
How would it know to look for a and b inside fn2's scope, instead of looking for x inside fn2's scope? ChrisA
On Sat, Oct 23, 2021 at 9:21 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Oct 24, 2021 at 2:52 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 23, 2021, 11:46 PM David Mertz, Ph.D.
def f(x=defer: a + b):
    a, b = 3, 5
    return x
Would this return 8, or a defer-expression? If 8, then the scope isn't
truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
Oh... Keep in mind I'm proposing a strawman deliberately, but the most
natural approach to keeping an object deferred rather than evaluated is simply to say so:
def f(x=defer: a + b):
    a, b = 3, 5
    fn2(defer: x)  # look for local a, b within fn2() if needed
    # ... other stuff
    return x  # return 8 here
How would it know to look for a and b inside fn2's scope, instead of looking for x inside fn2's scope?
I am worried that this side-thread about dynamic scopes (which are a ridiculous idea IMO) will derail the decent proposal of the PEP.

--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Sun, Oct 24, 2021, 12:25 AM Guido van Rossum
I am worried that this side-thread about dynamic scopes (which are a ridiculous idea IMO) will derail the decent proposal of the PEP.
It's really not a suggestion about dynamic scoping but about more generalized deferred computation. This has been a topic of other threads over the years, and something I've wanted at least since I first worked with Dask's 'delayed()' function.
On Sun, Oct 24, 2021 at 09:39:27AM -0400, David Mertz, Ph.D. wrote:
This has been a topic of other threads over the years, and something I've wanted at least since I first worked with Dask's 'delayed()' function.
You mean this? https://docs.dask.org/en/latest/delayed.html -- Steve
Yes! Exactly that. I believe (and have believed, in some discussions here going back maybe 4-5 years) that having something close to the dask.delayed() function baked into the language would be a good thing. And as a narrow point, it could address the late-bound function argument matter as one use. On Sun, Oct 24, 2021, 9:59 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 24, 2021 at 09:39:27AM -0400, David Mertz, Ph.D. wrote:
This has been a topic of other threads over the years, and something I've wanted at least since I first worked with Dask's 'delayed()' function.
You mean this?
https://docs.dask.org/en/latest/delayed.html
-- Steve
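For readers who haven't used it: dask.delayed wraps calls so that they build a task graph instead of executing immediately, and nothing runs until .compute() is called. A minimal usage sketch (requires the third-party dask package):

```python
from dask import delayed

@delayed
def inc(x):
    return x + 1

@delayed
def add(x, y):
    return x + y

total = add(inc(1), inc(2))  # no work done yet; this only builds a task graph
print(total.compute())       # 5 -- evaluation happens here, on demand
```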
Hi

Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:

    def puzzle(*, a=>b+1, b=>a+1):
        return a, b

Aside: In a functional programming language

    a = b + 1
    b = a + 1

would be a syntax (or at least compile time) error.

--
Jonathan
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
def puzzle(*, a=>b+1, b=>a+1):
    return a, b
Aside: In a functional programming language

    a = b + 1
    b = a + 1

would be a syntax (or at least compile time) error.
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them). I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this. ChrisA
On Mon, Oct 25, 2021 at 3:47 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
def puzzle(*, a=>b+1, b=>a+1):
    return a, b
Aside: In a functional programming language

    a = b + 1
    b = a + 1

would be a syntax (or at least compile time) error.
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
In fact, on subsequent consideration, I'm inclining more strongly towards SyntaxError, due to the difficulty of explaining the actual semantics. Changing the PEP accordingly. ChrisA
Hi Chris You wrote: In fact, on subsequent consideration, I'm inclining more strongly
towards SyntaxError, due to the difficulty of explaining the actual semantics. Changing the PEP accordingly.
Your PEP, so your choice.

I now think that if implemented, your PEP adds to the Python compiler (and also runtime?) tools for detecting and well-ordering Directed Acyclic Graphs (DAGs).

Here's another problem. Suppose

    def puzzle(*, a=>..., z=>...)

gives rise to a directed acyclic graph, and all the initialisation functions consume and use a value from a counter. The semantics of puzzle will now depend on the linearization you choose for the DAG. (This consumption and use of the value from a counter could be internal to the initialisation function.)

--
Jonathan
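Jonathan's counter scenario can be made concrete with today's sentinel idiom (make_default and the parameter names are illustrative). Under a left-to-right rule the result is deterministic; any other linearization of the DAG would be observably different:

```python
import itertools

counter = itertools.count()  # shared state consumed by the default expressions

def make_default():
    return next(counter)

# Emulation of: def puzzle(*, a=>make_default(), z=>make_default())
def puzzle(*, a=None, z=None):
    if a is None:
        a = make_default()  # evaluated first under a left-to-right rule
    if z is None:
        z = make_default()
    return a, z

print(puzzle())  # (0, 1); a different linearization could give (1, 0)
```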
On Mon, 25 Oct 2021, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 3:47 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
def puzzle(*, a=>b+1, b=>a+1): return a, b
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
In fact, on subsequent consideration, I'm inclining more strongly towards SyntaxError, due to the difficulty of explaining the actual semantics. Changing the PEP accordingly.
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual. Erik
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be: I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
On Mon, Oct 25, 2021 at 4:29 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind. ChrisA
On Mon, Oct 25, 2021 at 05:23:38AM +1100, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 4:29 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
The rules for applying parameter defaults are well-defined. I would have to look it up to be sure, but by memory the rules are:

1. apply positional arguments from left to right;
   - if there are more positional arguments than parameters, raise;

2. apply named keyword arguments to parameters:
   - if the parameter already has a value, raise;
   - if the keyword parameter doesn't exist, raise;

3. for any parameter still without a value, fetch its default;
   - if there is no default, then raise.

I would say that it makes most sense to assign early-bound defaults first, then late-bound defaults, specifically so that late-bound defaults can refer to early-bound ones:

    def func(x=0, @y=x+1)

So step 3 above should become:

3. for any parameters still without a value, skip those which are late-bound, and fetch the default of the others;
   - if there is no default, then raise;

4. for any parameters still without a value, which will all be late-bound, run from left to right and evaluate the default.

This will be consistent and understandable, and if you get an UnboundLocalError, the cause should be no more confusing than any other UnboundLocalError.

Note that step 4 (evaluating the late-bound defaults) can raise *any* exception at all (it's an arbitrary expression, so it can fail in arbitrary ways). I see no good reason for trying to single out UnboundLocalError for extra protection by turning it into a syntax error.

-- Steve
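A sketch of steps 3 and 4 using today's sentinel idiom, for the `def func(x=0, @y=x+1)` example above (the `@` marking is the placeholder spelling for a late-bound default):

```python
_MISSING = object()

def func(x=0, y=_MISSING):
    # pass 1: Python has already applied the early-bound default x=0
    # pass 2: late-bound defaults, evaluated left to right in the call's scope
    if y is _MISSING:
        y = x + 1
    return x, y

print(func())      # (0, 1) -- y's default sees the early-bound x
print(func(10))    # (10, 11)
print(func(y=99))  # (0, 99) -- values actually passed are never overwritten
```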
On Mon, Oct 25, 2021 at 6:13 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Oct 25, 2021 at 05:23:38AM +1100, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 4:29 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
The rules for applying parameter defaults are well-defined. I would have to look it up to be sure...
And that right there is all the evidence I need. If you, an experienced Python programmer, can be unsure, then there's a strong indication that novice programmers will have far more trouble. Why permit bad code at the price of hard-to-explain complexity? Offer me a real use-case where this would matter. So far, we had better use-cases for arbitrary assignment expression targets than for back-to-front argument default references, and those were excluded. ChrisA
On Mon, 25 Oct 2021, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 6:13 PM Steven D'Aprano <steve@pearwood.info> wrote:
The rules for applying parameter defaults are well-defined. I would have to look it up to be sure...
And that right there is all the evidence I need. If you, an experienced Python programmer, can be unsure, then there's a strong indication that novice programmers will have far more trouble. Why permit bad code at the price of hard-to-explain complexity?
I'm not sure how this helps; the rules are already a bit complicated. Steven's proposed rules are a natural way to extend the existing rules; I don't see the new rules as (much) more complicated.
Offer me a real use-case where this would matter. So far, we had better use-cases for arbitrary assignment expression targets than for back-to-front argument default references, and those were excluded.
I can think of a few examples, though they are a bit artificial:

```
def search_listdir(path = None, files := os.listdir(path),
                   start = 0, end = len(files)):
    '''specify path or files'''

# variation of the LocaleTextCalendar from stdlib (in a message of Steven's)
class Calendar:
    default_firstweekday = 0

    def __init__(self, firstweekday := Calendar.default_firstweekday,
                 locale := find_default_locale(),
                 firstweekdayname := locale.lookup_day_name(firstweekday)):
        ...

Calendar.default_firstweekday = 1
```

But I think the main advantage of the left-to-right semantics is simplicity and predictability. I don't think the following functions should evaluate the default values in different orders.

```
def f(a := side_effect1(), b := side_effect2()): ...
def g(a := side_effect1(), b := side_effect2() + a): ...
def h(a := side_effect1() + b, b := side_effect2()): ...
```

I expect left-to-right semantics of the side effects (so function h will probably raise an error), just like I get from the corresponding tuple expressions:

```
(a := side_effect1(), b := side_effect2())
(a := side_effect1(), b := side_effect2() + a)
(a := side_effect1() + b, b := side_effect2())
```

As Jonathan Fine mentioned, if you defined the order to be a linearization of the partial order on arguments, (a) this would be complicated and (b) it would be ambiguous. I think, if you're going to forbid `def f(a := b, b := a)` at the compiler level, you would need to forbid using late-bound arguments (at least) in late-bound argument expressions. But I don't see a reason to forbid this. It's rare that order would matter, and if it did, a quick experiment or learning "left to right" is really easy.

The tuple expression equivalence leads me to think that `:=` is decent notation. As a result, I would expect:

```
def f(a := expr1, b := expr2, c := expr3):
    pass
```

to behave the same as:

```
_no_a = object()
_no_b = object()
_no_c = object()
def f(a = _no_a, b = _no_b, c = _no_c):
    (a := expr1 if a is _no_a else a,
     b := expr2 if b is _no_b else b,
     c := expr3 if c is _no_c else c)
```

Given that `=` assignments within a function's parameter spec already only mean "assign when another value isn't specified", this is pretty similar.

On Mon, 25 Oct 2021, Chris Angelico wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
I admit I missed this subtlety, though again I don't think it would often make a difference. But working out subtleties is what PEPs and discussion are for. :-)

I'd be inclined to assign the early-bound argument defaults before the late-bound arguments, because their values are already known (they're stored right in the function object) so they can't cause side effects, and it could offer slight incremental benefits, like being able to write the following (again, somewhat artificial):

```
def manipulate(top_list):
    def recurse(start=0, end := len(rec_list), rec_list=top_list):
        ...
```

But I don't feel strongly either way about either interpretation. Mixing both types of default arguments breaks the analogy to tuple expressions above, alas. The corresponding tuple expression with `=` is just invalid.

Personally, I'd expect to use late-bound defaults almost all or all the time; they behave more how I expect and how I usually need them (I use a fair amount of `[]` and `{}` and `set()` as default values). The only context I'd use/want the current default behavior is to hack closures, as in:

```
for thing in things:
    thing.callback = lambda thing=thing: print(thing.name)
```

I believe the general preference for late-bound defaults is why Guido called this a "wart" in https://mail.python.org/archives/list/python-ideas@python.org/message/T4VPHD...

By the way, JavaScript's semantics for default arguments are just like what I'm describing: they are evaluated at call time, in the function scope, and in left-to-right order.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...

A key difference from the PEP is that JavaScript doesn't have the notion of "omitted arguments"; any omitted arguments are just passed in as `undefined`, so `f()` and `f(undefined)` always behave the same (triggering default argument behavior).

There is a subtlety mentioned in the case of JavaScript, which is that the default value expressions are evaluated in their own scope:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...

This is perhaps worth considering for the Python context. I'm not sure this is as important in Python, because UnboundLocalError exists (so attempts to access things in the function's scope will fail), but perhaps I'm missing a ramification...

Erik

--
Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Tue, Oct 26, 2021 at 3:32 AM Erik Demaine <edemaine@mit.edu> wrote:
As Jonathan Fine mentioned, if you defined the order to be a linearization of the partial order on arguments, (a) this would be complicated and (b) it would be ambiguous. I think, if you're going to forbid `def f(a := b, b := a)` at the compiler level, you would need to forbid using late-bound arguments (at least) in late-bound argument expressions. But I don't see a reason to forbid this. It's rare that order would matter, and if it did, a quick experiment or learning "left to right" is really easy.
Oh yes, absolutely. I have never at any point considered any sort of linearization or reordering of evaluation, and it would be a nightmare. They'll always be evaluated left-to-right. The two options on the table are:

1) Allow references to any value that has been provided in any way
2) Allow references only to parameters to the left

Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.

The permissive option allows mutual references as long as one of the arguments is provided, but will give a peculiar error if you pass neither. I think this is bad API design. If you have a function for which one or other of two arguments must be provided, it should raise TypeError when you fail to do so, not UnboundLocalError.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
I admit I missed this subtlety, though again I don't think it would often make a difference. But working out subtleties is what PEPs and discussion are for. :-)
Yeah. I have plans to try this out on someone who knows some Python but has no familiarity with this proposal, and see how he finds it.
I'd be inclined to assign the early-bound argument defaults before the late-bound arguments, because their values are already known (they're stored right in the function argument) so they can't cause side effects, and it could offer slight incremental benefits, like being able to write the following (again, somewhat artificial):
```
def manipulate(top_list):
    def recurse(start=0, end := len(rec_list), rec_list=top_list):
        ...
```
That would be the most logical semantics, if it's permitted at all.
Personally, I'd expect to use late-bound defaults almost all or all the time; they behave more how I expect and how I usually need them (I use a fair amount of `[]` and `{}` and `set()` as default values).
Interesting. In many cases, the choice will be irrelevant, and early-bound is more efficient. There aren't many situations where early-bind semantics are going to be essential, but there will be huge numbers where late-bind semantics will be unnecessary.
A key difference from the PEP is that JavaScript doesn't have the notion of "omitted arguments"; any omitted arguments are just passed in as `undefined`; so `f()` and `f(undefined)` always behave the same (triggering default argument behavior).
Except when it doesn't, and you have to use null instead... I have never understood those weird inconsistencies!
There is a subtlety mentioned in the case of JavaScript, which is that the default value expressions are evaluated in their own scope:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...
Yeah, well, JS scope is a weird mess of historical artifacts. Fortunately, we don't have to be compatible with it :)
This is perhaps worth considering for the Python context. I'm not sure this is as important in Python, because UnboundLocalError exists (so attempts to access things in the function's scope will fail), but perhaps I'm missing a ramification...
Hmm. I think the only way it could possibly matter would be something like this:

    def f(x=>spam):
        global spam
        spam += 1

Unsure what this should do. A naive interpretation would be this:

    def f(x=None):
        if x is None:
            x = spam
        global spam
        spam += 1

and would bomb with SyntaxError. But perhaps it's better to permit this, on the understanding that a global statement anywhere in a function will apply to late-bound defaults; or alternatively, to evaluate the arguments in a separate scope. Or, which would be a simpler way of achieving the same thing: all name lookups inside function defaults come from the enclosing scope unless they are other arguments. But maybe that's unnecessarily complicated.

ChrisA
On Mon, Oct 25, 2021 at 10:28 AM Chris Angelico <rosuav@gmail.com> wrote:
[...] The two options on the table are:
1) Allow references to any value that has been provided in any way
2) Allow references only to parameters to the left
Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError).

--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Option 2 is a simple SyntaxError on compilation (you won't even get as
far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError).
That's why I said earlier that this is not technically a SyntaxError. Would it be possible to raise an UnboundLocalError at function definition time if any deferred parameters refer to any others? Functionality similar to a SyntaxError, but more in line with present behavior.

-CHB
--
Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Tue, Oct 26, 2021 at 4:36 AM Guido van Rossum <guido@python.org> wrote:
On Mon, Oct 25, 2021 at 10:28 AM Chris Angelico <rosuav@gmail.com> wrote:
[...] The two options on the table are:
1) Allow references to any value that has been provided in any way 2) Allow references only to parameters to the left
Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError).
I'm considering this to be more similar to mismatching local and global usage, or messing up nonlocal names:
>>> def spam():
...     ham
...     global ham
...
  File "<stdin>", line 3
SyntaxError: name 'ham' is used prior to global declaration

>>> def spam():
...     def ham():
...         nonlocal eggs
...
  File "<stdin>", line 3
SyntaxError: no binding for nonlocal 'eggs' found
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?

    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...
    def f3(x=>y + 1, *, y): ...

    def f4(x=>y + 1):
        y = 2

    def f5(x=>y + 1):
        global y
        y = 2

And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?

If this should be permitted, there are two plausible semantic meanings for these kinds of constructs:

1) Arguments are defined left-to-right, each one independently of each other
2) Early-bound arguments and those given values are defined first, then late-bound arguments

The first option is much easier to explain, but will never give useful results for out-of-order references (unless it's allowed to refer to the containing scope or something). The second is closer to the "if x is None: x = y + 1" equivalent, but is harder to explain.

Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.

ChrisA
There are other options. Maybe you can't combine early and late binding defaults in the same signature. Or maybe all early binding defaults must precede all late binding defaults. FWIW have you started an implementation yet? "If the implementation is easy to explain, ..." On Mon, Oct 25, 2021 at 10:49 AM Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Oct 26, 2021 at 4:36 AM Guido van Rossum <guido@python.org> wrote:
On Mon, Oct 25, 2021 at 10:28 AM Chris Angelico <rosuav@gmail.com>
wrote:
[...] The two options on the table are:
1) Allow references to any value that has been provided in any way 2) Allow references only to parameters to the left
Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError).
I'm considering this to be more similar to mismatching local and global usage, or messing up nonlocal names:
>>> def spam():
...     ham
...     global ham
...
  File "<stdin>", line 3
SyntaxError: name 'ham' is used prior to global declaration

>>> def spam():
...     def ham():
...         nonlocal eggs
...
  File "<stdin>", line 3
SyntaxError: no binding for nonlocal 'eggs' found
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
def f1(x=>y + 1, y=2): ...
def f2(x=>y + 1, y=>2): ...
def f3(x=>y + 1, *, y): ...

def f4(x=>y + 1):
    y = 2

def f5(x=>y + 1):
    global y
    y = 2
And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?
If this should be permitted, there are two plausible semantic meanings for these kinds of constructs:
1) Arguments are defined left-to-right, each one independently of each other
2) Early-bound arguments and those given values are defined first, then late-bound arguments
The first option is much easier to explain, but will never give useful results for out-of-order references (unless it's allowed to refer to the containing scope or something). The second is closer to the "if x is None: x = y + 1" equivalent, but is harder to explain.
Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.
ChrisA
--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Tue, Oct 26, 2021 at 4:54 AM Guido van Rossum <guido@python.org> wrote:
There are other options. Maybe you can't combine early and late binding defaults in the same signature. Or maybe all early binding defaults must precede all late binding defaults.
All early must precede all late would make a decent option. Will keep that in mind.
FWIW have you started an implementation yet? "If the implementation is easy to explain, ..."
Not yet. Juggling a lot of things; will get to that Real Soon™, unless someone else offers to help out, which I would definitely welcome. ChrisA
On Tue, Oct 26, 2021 at 04:48:17AM +1100, Chris Angelico wrote:
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
They should all be legal. Legal doesn't mean "works". Code that raises an exception is still legal code.
def f1(x=>y + 1, y=2): ...
def f2(x=>y + 1, y=>2): ...
def f3(x=>y + 1, *, y): ...

def f4(x=>y + 1):
    y = 2

def f5(x=>y + 1):
    global y
    y = 2
What "bizarre inconsistencies" do you think they have? Each example is different so it is hardly shocking if they behave different too. f1() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (y=2), then late-bound defaults left to right (x=y+1). That is, I argue, the most useful behaviour. But if you insist on a strict left-to-right single pass to assign defaults, then instead it will raise UnboundLocalError because y doesn't have a value. Just like the next case: f2() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which raises UnboundLocalError because y is a local but doesn't have a value yet. f3() assigns positional arguments first (there are none), then keyword arguments (still none), at which point it raises TypeError because you have a mandatory keyword-only argument with no default. f4() is just like f2(). And lastly, f5() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which might raise NameError if global y doesn't exist, otherwise it will succeed. Each of those cases is easily understandable. There is no reason to expect the behaviour in all four cases to be the same, so we can hardly complain that they are "inconsistent" let alone that they are "bizarrely inconsistent". The only novelty here is that functions with late-binding can raise arbitrary exceptions, including UnboundLocalError, before the body of the function is entered. If you don't like that, then you don't like late-bound defaults at all and you should be arguing in favour of rejecting the PEP :-( If we consider code that already exists today, with the None sentinel trick, each of those cases have equivalent errors today, even if some of the fine detail is different (e.g. getting TypeError because we attempt to add 1 to None instead of an unbound local). However there is a real, and necessary, difference in behaviour which I think you missed: def func(x=x, y=>x) # or func(x=x, @y=x) The x=x parameter uses global x as the default. The y=x parameter uses the local x as the default. We can live with that difference. We *need* that difference in behaviour, otherwise these examples won't work: def method(self, x=>self.attr) # @x=self.attr def bisect(a, x, lo=0, hi=>len(a)) # @hi=len(a) Without that difference in behaviour, probably fifty or eighty percent of the use-cases are lost. (And the ones that remain are mostly trivial ones of the form arg=[].) So we need this genuine inconsistency. If you can live with that actual inconsistency, why are you losing sleep over behaviour (functions f1 through f4) which isn't actually inconsistent? * Code that does different things is supposed to behave differently; * The differences in behaviour are easy to understand; * You can't prevent the late-bound defaults from raising UnboundLocalError, so why are you trying to turn a tiny subset of such errors into SyntaxError? * The genuine inconsistency is *necessary*: late-bound expressions should be evaluated in the function's namespace, not the surrounding (global) namespace.
And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?
We should write a list of the things that Python wouldn't have if the intuitions of "less-skilled Python programmers" were a necessary condition:

- no metaclasses, descriptors or decorators;
- no classes, inheritance (multiple or single);
- no slices or zero-based indexing;
- no mutable objects;
- no immutable objects;
- no floats or Unicode strings;

etc. I think that, *maybe*, we could have `print("Hello world")`, so long as the programmer's intuition is that print needs parentheses.
If this should be permitted, there are two plausible semantic meanings for these kinds of constructs:
1) Arguments are defined left-to-right, each one independently of each other
2) Early-bound arguments and those given values are defined first, then late-bound arguments
The first option is much easier to explain, but will never give useful results for out-of-order references (unless it's allowed to refer to the containing scope or something). The second is closer to the "if x is None: x = y + 1" equivalent, but is harder to explain.
You just explained it perfectly in one sentence. The two options are equally easy to explain. The second takes a few more words, but the concepts are no harder. And the second is much more useful. In comparison, think about how hard it is to explain your preferred behaviour, a SyntaxError. Think about how many posts you have written, and how many examples you have given, hundreds maybe thousands of words, dozens or hundreds of sentences, and you have still not convinced everyone that "raise SyntaxError" is the right thing to do. "Why does this simple function definition raise SyntaxError?" is MUCH harder to explain than "Why does a default value that tries to access an unbound local variable raise UnboundLocalError?".
Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.
Being cautious about new syntax is often worthy, but here you are being overcautious. You are trying to prohibit something as a syntax error because it *might* fail at runtime. We don't even protect against things that we know *will* fail!

    x = 1 + 'a'   # Not a syntax error.

In this case, two-pass defaults is clearly superior because it would allow everything that the one-pass behaviour would allow, *plus more* applications that we haven't even thought of yet (but others will).

Analogy: When Python 1 was first evolving, nobody said that we ought to be cautious about parallel assignment:

    a, b, c = ...

just because the user might misuse it.

    a = 1
    if False:
        b = 1   # oops I forgot to define b
    a, b = b, a   # SyntaxError just in case

Nor did we lose sleep over which parallel assignment model is better, and avoid making a decision:

    a, b = b, a

    # Model 1:
    push b
    push a
    swap
    a = pop stack
    b = pop stack

versus:

    # Model 2:
    push b
    a = pop stack
    push a
    b = pop stack

The two models are identical if the expressions on the right are all distinct from the targets on the left, e.g. `a, b = x, y`, but the first model allows us to do much more useful things that the second doesn't, such as the "swap two variables" idiom.

Be bold! The "two pass" model is clearly better than the "one pass" model. You don't need to prevaricate just in case. Worst case, the Steering Council will say "Chris we love everything about the PEP except this..." and you will have to change it. But they won't because the two pass model is clearly the best *wink*

-- Steve
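The two models can be emulated with sentinels for the earlier `def f1(x=>y + 1, y=2)` example; this sketch shows why the two-pass order succeeds where a strict single pass fails when both arguments are omitted:

```python
_MISSING = object()

def f1_two_pass(x=_MISSING, y=2):
    # early-bound default y=2 is bound first, then late-bound x=y+1
    if x is _MISSING:
        x = y + 1
    return x, y

print(f1_two_pass())  # (3, 2)

def f1_one_pass(x=_MISSING, y=_MISSING):
    # strict left-to-right: x's default runs while y is still unbound
    if x is _MISSING:
        x = y + 1  # fails if y was omitted
    if y is _MISSING:
        y = 2
    return x, y

try:
    f1_one_pass()
except TypeError:  # sentinel + 1 here; the real proposal would raise UnboundLocalError
    print("one-pass order fails when neither argument is given")
```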
On Tue, Oct 26, 2021 at 3:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Oct 26, 2021 at 04:48:17AM +1100, Chris Angelico wrote:
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
They should all be legal. Legal doesn't mean "works". Code that raises an exception is still legal code.
Then there's no such thing as illegal code, and my entire basis for explanation is bunk. Come on, you know what I mean. If it causes SyntaxError, it's not legal code. Just because that's a catchable exception doesn't change anything. Example:
def f5(x=>y + 1):
    global y
    y = 2
According to the previously-defined equivalencies, this would mean:

    def f5(x=None):
        if x is None:
            x = y + 1
        global y
        y = 2

And that's a SyntaxError. Do you see what I mean now? Either these things are not consistent with existing idioms, or they're not consistent with each other. Since writing that previous post, I have come to the view that "consistency with existing idioms" is the one that gets sacrificed to resolve this.

I haven't yet gotten started on implementation (definitely gonna get to that Real Soon Now™), but one possible interpretation of f5, once disconnected from the None parallel, is that omitting x would use one more than the module-level y. That implies that a global statement *anywhere* in a function will also apply to the function header, despite it not otherwise being legal to refer to a name earlier in the function than the global statement.
And lastly, f5() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which might raise NameError if global y doesn't exist, otherwise it will succeed.
It's interesting that you assume this. By any definition, the header is a reference prior to the global statement, which means the global statement would have to be hoisted. I think that's probably the correct behaviour, but it is a distinct change from the current situation.
However there is a real, and necessary, difference in behaviour which I think you missed:
def func(x=x, y=>x) # or func(x=x, @y=x)
The x=x parameter uses global x as the default. The y=>x parameter uses the local x as the default. We can live with that difference. We *need* that difference in behaviour, otherwise these examples won't work:
def method(self, x=>self.attr) # @x=self.attr
def bisect(a, x, lo=0, hi=>len(a)) # @hi=len(a)
Without that difference in behaviour, probably fifty or eighty percent of the use-cases are lost. (And the ones that remain are mostly trivial ones of the form arg=[].) So we need this genuine inconsistency.
I agree, we do need that particular inconsistency. I want to avoid others where possible.
If you can live with that actual inconsistency, why are you losing sleep over behaviour (functions f1 through f4) which isn't actually inconsistent?
(Sleep? What is sleep? I don't lose what I don't have!)

Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:

def f1(x=>y + 1, y=2): ...
def f2(x=>y + 1, y=>2): ...

in that it changes what's viable and what's not. (Since you don't like the term "legal" here, I'll go with "viable", since a runtime exception isn't terribly useful.) Changing the default from y=2 to y=>2 would actually stop the example from working.

Multi-pass initialization makes sense where it's necessary. Is it really necessary here?
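To make the difference concrete, here is a rough sentinel-based emulation of the two-pass model for those two functions (a sketch, with TypeError standing in for the UnboundLocalError the proposal would raise):

UNSET = object()   # stand-in for "no argument passed"

def f1(x=UNSET, y=2):          # y=2 is early-bound, so it is filled first
    if x is UNSET:
        x = y + 1              # works: y already holds 2
    return x, y

def f2(x=UNSET, y=UNSET):      # both late-bound: filled left to right
    if x is UNSET:
        x = y + 1              # y is still the sentinel, so this raises
    if y is UNSET:
        y = 2
    return x, y

f1()   # (3, 2)
f2()   # raises: merely changing y=2 to y=>2 broke the no-argument call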
And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?
We should write a list of the things that Python wouldn't have if the intuitions of "less-skilled Python programmers" were a necessary condition.
- no metaclasses, descriptors or decorators;
- no classes, inheritance (multiple or single);
- no slices or zero-based indexing;
- no mutable objects;
- no immutable objects;
- no floats or Unicode strings;
etc. I think that, *maybe*, we could have `print("Hello world")`, so long as the programmer's intuition is that print needs parentheses.
No, you misunderstand. I am not saying that less-skilled programmers have to intuit things perfectly; I am saying that, when there are drastic differences of expectation, there is probably a problem. I can easily explain "arguments are assigned left to right". It is much harder to explain multi-stage initialization and why different things can be referenced.
Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.
Being cautious about new syntax is often worthy, but here you are being overcautious. You are trying to prohibit something as a syntax error because it *might* fail at runtime. We don't even protect against things that we know *will* fail!
x = 1 + 'a' # Not a syntax error.
But this is an error:

x = 1
def f():
    print(x)
    x = 2

And so is this:

def f(x):
    global x

As is this:

def f():
    x = 1
    global x
    x = 2

You could easily give these functions meaning using any of a variety of rules, like "the global statement applies to what's after it" or "the global statement applies to the whole function regardless of placement". Why are they SyntaxErrors? Is that being overcautious, or is it blocking code that makes no sense?

The two-pass model is closer to existing idioms. That's of value, but it isn't the greatest justification. And given that there is no idiom that perfectly matches the semantics, I don't consider that to be strong enough to justify the increase in complexity.

ChrisA
On Tue, Oct 26, 2021 at 05:27:49PM +1100, Chris Angelico wrote:
On Tue, Oct 26, 2021 at 3:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Oct 26, 2021 at 04:48:17AM +1100, Chris Angelico wrote:
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
They should all be legal. Legal doesn't mean "works". Code that raises an exception is still legal code.
Then there's no such thing as illegal code,
I mean that code that compiles and runs is legal, even if it raises a runtime error. Code that cannot compile due to syntax errors is "illegal", we often talk about "illegal syntax":

None[0]  # Legal syntax, still raises

import() = while x or and else  # Illegal syntax

Sorry if I wasn't clear.
and my entire basis for explanation is bunk. Come on, you know what I mean. If it causes SyntaxError, it's not legal code.
Sorry Chris, I don't know what you mean. It only causes syntax error because you are forcing it to cause syntax error, not because it cannot be interpreted under the existing (proposed or actual) semantics. You are (were?) arguing that something that is otherwise meaningful should be a syntax error because there are some circumstances that it could fail. That's not "illegal code" in the sense I mean, and I don't know why you want it to be a syntax error (unless you've changed your mind). We don't do this:

y = x+1  # Syntax error, because x might be undefined

and we shouldn't make this a syntax error:

def func(@spam=eggs+1, @eggs=spam-1):

either, just because `func()` with no arguments raises. So long as you pass at least one argument, it works fine, and that may be perfectly suitable for some uses. Let linters worry about flagging that as a violation. The interpreter should be for consenting adults. There is plenty of code that we can already write that might raise a NameError or UnboundLocalError. This is not special enough to promote it to a syntax error.
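Spelled with today's None idiom, the function described is roughly this (a sketch using the names from the example):

def func(spam=None, eggs=None):
    if spam is None:
        spam = eggs + 1   # only fails if eggs was also omitted
    if eggs is None:
        eggs = spam - 1
    return spam, eggs

func(spam=10)   # (10, 9)
func(eggs=9)    # (10, 9)
func()          # TypeError from None + 1: only the no-argument call fails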
def f5(x=>y + 1):
    global y
    y = 2
According to the previously-defined equivalencies, this would mean:
def f5(x=None):
    if x is None:
        x = y + 1
    global y
    y = 2
Of course it would not mean that. That's a straw-man. You have deliberately written code which you know is illegal (now; it wasn't illegal just a few releases back). Remember that "global y" is not an executable statement, it is a declaration; we can move the declaration anywhere we want to make the code legal. So it would be equivalent to:

def f5(x=None):
    global y
    if x is None:
        x = y + 1
    y = 2

And it can still raise NameError if y is not defined. Caveat utilitor (let the user beware). Parameters (and their defaults) are not written inside the function body, they are written in the function header, and the function header by definition must precede the body and any declarations inside it. We should not allow such an unimportant technicality to prevent late bound defaults from using globals.

Remember that for two decades or so, global declarations could be placed anywhere in the function body. It is only recently that we have tightened that up with a rule that the declaration must occur before any use of a name inside the function body. We created that more restrictive rule by fiat, we can loosen it *for late-bound expressions* by fiat too: morally, global declarations inside the body are deemed to occur before the parameter defaults. Done and solved.

(I don't know why we decided on this odd rule that the global declaration has to occur before the usage of the variable, instead of just insisting that any globals be declared immediately after the function header and docstring. Oh well.)
That implies that a global statement *anywhere* in a function will also apply to the function header, despite it not otherwise being legal to refer to a name earlier in the function than the global statement.
Great minds think alike :-)

If it makes you happy, you could enforce a rule that the global has to occur after the docstring and before the function body, but honestly I'm not sure why we would bother.

Some more comments, which hopefully match your vision of the feature: if a late bound default refers to a name -- and most of them will -- we should follow the same rules as we otherwise would, to the extent that makes sense. For example:

* If the name in the default expression matches a parameter, then it refers to the parameter, not the same name in the surrounding scope; parameters are always local to the function, so the name should be local to the function inside the default expression too.

* If the name in the default expression matches a local name in the body of the function, that is, one that we assign to and haven't declared as global or nonlocal, then the default expression should likewise treat it as local.

* If the name in the default matches a name in the function body that has been declared global or nonlocal, then treat it the same way in the default expression.

* Otherwise treat it as global/nonlocal/builtin.

(I think that covers all the cases.)

Do these scoping rules mean it is possible to write defaults that will fail at run time? Yes. So does the code we can write today. Don't worry about it. It is the coder's responsibility, not the interpreter's and not yours, to ensure that the code they write works.
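The second rule above, for what it's worth, mirrors how function bodies already behave today; a quick illustration of that existing rule (a sketch, with made-up names):

x = "global"

def f(a=None):
    if a is None:
        a = x        # UnboundLocalError when reached: the assignment
    x = "local"      # below makes x local throughout the body
    return a

f("passed")   # fine: returns "passed"
f()           # UnboundLocalError, even though a global x exists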
And lastly, f5() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which might raise NameError if global y doesn't exist, otherwise it will succeed.
It's interesting that you assume this. By any definition, the header is a reference prior to the global statement, which means the global statement would have to be hoisted. I think that's probably the correct behaviour, but it is a distinct change from the current situation.
See my comments above. What other possible meaning would make sense? We can write a language with whatever restrictions we like:

# Binding operations must follow
# the I before E rule, unless after C
self.increment = 1  # Syntax error because E occurs before I
container = []      # Syntax error because I before E after C

but such a language would be no fun to use. Or possibly lots of fun, if you had a twisted mind *wink*

So yes, I looked at what the clear and obvious intention of the code was, and assumed that it should work. Executable pseudocode, remember? There's no need to restrict things just for the sake of restricting them.
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
def f1(x=>y + 1, y=2): ...
def f2(x=>y + 1, y=>2): ...
Sure. They behave differently because they are different. These are different too:

# Block 1
y = 2
x = y + 1

# Block 2
x = y + 1
y = 2
in that it changes what's viable and what's not. (Since you don't like the term "legal" here, I'll go with "viable", since a runtime exception isn't terribly useful.) Changing the default from y=2 to y=>2 would actually stop the example from working.
Um, yes? Changing the default from y=2 to y="two" will also stop it from working. Even if you swap the order of the parameters.
Multi-pass initialization makes sense where it's necessary. Is it really necessary here?
We already have multi-pass initialisation:

1. positional arguments are applied, left to right;
2. then keyword arguments;
3. then defaults are applied.

(It is, I think, an implementation detail whether 2 and 3 are literally two separate passes or whether they can be rolled into a single pass. There are probably many good ways to actually implement binding of arguments to parameters. But semantically, argument binding to parameters behaves as if it were multiple passes.)

Since the number of parameters is likely to be small (more likely 6 parameters than 6000), we shouldn't care about the cost of a second pass to fill in the late-bound defaults after all the early-bound defaults are done.
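That "as if" behaviour is visible only through its results; a small illustration of the existing passes (not from the thread):

def g(a, b=10, *, c=20):
    return a, b, c

g(1)         # (1, 10, 20) -- positional first, then defaults fill b and c
g(1, c=99)   # (1, 10, 99) -- the keyword argument beats c's default
g(1, a=2)    # TypeError: g() got multiple values for argument 'a'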
No, you misunderstand. I am not saying that less-skilled programmers have to intuit things perfectly; I am saying that, when there are drastic differences of expectation, there is probably a problem.
I can easily explain "arguments are assigned left to right". It is much harder to explain multi-stage initialization and why different things can be referenced.
I disagree that it is much harder.

In any case, my fundamental model here is that if we can do something using pseudo-late binding (the "if arg is None" idiom), then it should (more or less) be possible using late-binding. We should be able to just move the expression from the body of the function to the parameter and in most cases it should work. Obviously some conditions apply:

- single expressions only, not a full block;
- exceptions may change (e.g. a TypeError from `None + 1` may turn into an UnboundLocalError, etc.);
- not all cases will work, due to order of operations, but we should be able to get most cases to work.

Inside the body of a function, we can apply pseudo-late binding using the None idiom in any order we like. As late-binding parameters, we are limited to left-to-right. But we can get close to the (existing) status quo by ensuring that all early-bound defaults are applied before we start the late-bound defaults.

# Status quo
def function(arg, spam=None, eggs="something useful"):
    if spam is None:
        spam = process(eggs)

eggs is guaranteed to have a result here because the early-bound defaults are all assigned before the body of the function is entered. So in the new regime of late-binding, I want to write:

def function(arg, @spam=process(eggs), eggs="something useful"):

and the call to process(eggs) should occur after the early bound default is assigned. The easiest way to get that is to say that early bound defaults are assigned in one pass, and late bound in a second pass. Without that, many use cases for late-binding (I won't try to guess a proportion) are not going to translate to the new idiom.

--
Steve
On Tue, Oct 26, 2021 at 11:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
def f1(x=>y + 1, y=2): ... def f2(x=>y + 1, y=>2): ...
Sure. They behave differently because they are different.
These are different too:
# Block 1
y = 2
x = y + 1
# Block 2
x = y + 1
y = 2
Yes, those ARE different. Those are more equivalent to changing the order of the parameters in the function signature, and I think we all agree that that DOES make a difference. The question is whether these could change meaning if you used a different type of assignment, such as:

y := 2
x = y + 1

Does that suddenly make it legal? I think you'll find that this sort of thing is rather surprising. And that's what we have here: changing from one form of argument default to another changes whether left-to-right applies or not.

I don't want that. And based on an experiment with a less-experienced Python programmer (admittedly only a single data point), neither do other people. Left-to-right makes sense; multi-pass does not.
Multi-pass initialization makes sense where it's necessary. Is it really necessary here?
We already have multi-pass initialisation.
1. positional arguments are applied, left to right;
2. then keyword arguments;
3. then defaults are applied.
(It is, I think, an implementation detail whether 2 and 3 are literally two separate passes or whether they can be rolled into a single pass. There are probably many good ways to actually implement binding of arguments to parameters. But semantically, argument binding to parameters behaves as if it were multiple passes.)
Those aren't really multi-pass assignment though, because they could just as easily be assigned simultaneously. You can't, in Python code, determine which order the parameters were assigned. There are rules about how to map positional and keyword arguments to the names, but it would be just as logical to say:

1. Assign all defaults
2. Assign all keyword args, overwriting defaults
3. Assign positional args, overwriting defaults but not kwargs

and the net result would be exactly the same. But with anything that executes arbitrary Python code, it matters, and it matters what state the other values are in. So we have a few options:

a) Assign all early-evaluated defaults and explicitly-passed arguments, leaving others unbound; then process late-evaluated defaults one by one
b) Assign parameters one by one, left to right
Since the number of parameters is likely to be small (more likely 6 parameters than 6000), we shouldn't care about the cost of a second pass to fill in the late-bound defaults after all the early-bound defaults are done.
I'm not concerned with performance, I'm concerned with semantics.
No, you misunderstand. I am not saying that less-skilled programmers have to intuit things perfectly; I am saying that, when there are drastic differences of expectation, there is probably a problem.
I can easily explain "arguments are assigned left to right". It is much harder to explain multi-stage initialization and why different things can be referenced.
I disagree that it is much harder.
In any case, my fundamental model here is that if we can do something using pseudo-late binding (the "if arg is None" idiom), then it should (more or less) be possible using late-binding.
We should be able to just move the expression from the body of the function to the parameter and in most cases it should work.
There are enough exceptions that this parallel won't really work, so I'd rather leave aside the parallel and just describe how argument defaults work. Yes, you can achieve the same effect in other ways, but you can't do a mechanical transformation and expect it to behave identically.
Inside the body of a function, we can apply pseudo-late binding using the None idiom in any order we like. As late-binding parameters, we are limited to left-to-right. But we can get close to the (existing) status quo by ensuring that all early-bound defaults are applied before we start the late-bound defaults.
# Status quo
def function(arg, spam=None, eggs="something useful"):
    if spam is None:
        spam = process(eggs)
eggs is guaranteed to have a result here because the early-bound defaults are all assigned before the body of the function is entered. So in the new regime of late-binding, I want to write:
def function(arg, @spam=process(eggs), eggs="something useful"):
and the call to process(eggs) should occur after the early bound default is assigned. The easiest way to get that is to say that early bound defaults are assigned in one pass, and late bound in a second pass.
Without that, many use cases for late-binding (I won't try to guess a proportion) are not going to translate to the new idiom.
Can you find some actual real-world cases where this is true? I was unable to find any examples where I didn't have to apologize for the contrivedness of them. Having an argument default depend on arguments that come after it seems very surprising, especially since they can't be passed positionally anyway; so it would only be a very narrow set of circumstances where this is a problem - if they're keyword-only args, they can be reordered into something more logical, thus solving the problem. I think you're far too caught up on equivalences that don't exist. ChrisA
On Tue, Oct 26, 2021 at 11:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
def f1(x=>y + 1, y=2): ...
def f2(x=>y + 1, y=>2): ...

Sure. They behave differently because they are different.
These are different too:
# Block 1
y = 2
x = y + 1
# Block 2
x = y + 1
y = 2

Yes, those ARE different. Those are more equivalent to changing the order of the parameters in the function signature, and I think we all agree that that DOES make a difference. The question is whether these could change meaning if you used a different type of assignment, such as:
y := 2
x = y + 1
Does that suddenly make it legal? I think you'll find that this sort of thing is rather surprising. And that's what we have here: changing from one form of argument default to another changes whether left-to-right applies or not.
I don't want that. And based on an experiment with a less-experienced Python programmer (admittedly only a single data point), neither do other people. Left-to-right makes sense; multi-pass does not.

As I may be the data point in question: one of my posts seems to have got lost again, so I reproduce some of it (reworked). What I DON'T want to see is allowing something like this being legal:

def f(a := b+1, b := e+1, c := a+1, d := 42, e := d+1):

If no arguments are passed, the interpreter has to work out to evaluate first d, then e, then b, then a, then finally c. If some arguments are passed, I guess the same order would work. But it feels ... messy. And obfuscated. And if this is legal (note: it IS a legitimate use case):

def DrawCircle(centre=(0,0), radius := circumference / TWO_PI, circumference := radius * TWO_PI):

the interpreter has to work out whether to evaluate the 2nd or 3rd arg first, depending on which is passed. AFAICS all this may need multiple passes through the args at runtime. Complicated, and inefficient. *If* it could all be sorted out at compile time, my objection would become weaker.

There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

Best wishes
Rob

PS Can I echo Guido's plea that people don't derail this PEP by trying to shoehorn deferred-evaluation-objects (or whatever you want to call them) into it? As Chris A says, that's a separate idea and should go into a separate PEP. If I need a screwdriver, I buy a screwdriver, not an expensive Swiss Army knife.
On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
On 26/10/2021 18:25, Chris Angelico wrote:
On Tue, Oct 26, 2021 at 11:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
def f1(x=>y + 1, y=2): ...
def f2(x=>y + 1, y=>2): ...

Sure. They behave differently because they are different.
These are different too:
# Block 1
y = 2
x = y + 1
# Block 2
x = y + 1
y = 2

Yes, those ARE different. Those are more equivalent to changing the order of the parameters in the function signature, and I think we all agree that that DOES make a difference. The question is whether these could change meaning if you used a different type of assignment, such as:
y := 2
x = y + 1
Does that suddenly make it legal? I think you'll find that this sort of thing is rather surprising. And that's what we have here: changing from one form of argument default to another changes whether left-to-right applies or not.
I don't want that. And based on an experiment with a less-experienced Python programmer (admittedly only a single data point), neither do other people. Left-to-right makes sense; multi-pass does not.

As I may be the data point in question:
You're not - I sat down with one of my brothers and led him towards the problem in question, watching his attempts to solve it. Then showed him what we were discussing, and asked his interpretations of it.
One of my posts seems to have got lost again, so I reproduce some of it (reworked): What I DON'T want to see is allowing something like this being legal:

def f(a := b+1, b := e+1, c := a+1, d := 42, e := d+1):

If no arguments are passed, the interpreter has to work out to evaluate first d, then e, then b, then a, then finally c. If some arguments are passed, I guess the same order would work. But it feels ... messy. And obfuscated.
At no point will the interpreter reorder things. There have only ever been two (or three) options seriously considered:

1) Assign all parameters from left to right, giving them either a passed-in value or a default
2) First assign all parameters that were passed in values, and those with early-bound defaults; then, in a separate left-to-right pass, assign all late-bound defaults
3) Same as one of the other two, but validated at compilation time and raising SyntaxError for out-of-order references
And if this is legal (note: it IS a legitimate use case):

def DrawCircle(centre=(0,0), radius := circumference / TWO_PI, circumference := radius * TWO_PI):

the interpreter has to work out whether to evaluate the 2nd or 3rd arg first, depending on which is passed. AFAICS all this may need multiple passes through the args at runtime. Complicated, and inefficient. *If* it could all be sorted out at compile time, my objection would become weaker.
This is not as legit as you might think, since it runs into other problems. Having codependent arguments is not going to be solved by this proposal (for instance, what happens if you pass a radius of 3 and a circumference of 12?), so I'm not going to try to support narrow subsets of this that just happen to "work" (like the case where you only ever pass one of them).
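To illustrate: even today, resolving those codependent defaults needs logic along these lines (hypothetical code; only the names come from Rob's example):

TWO_PI = 6.283185307179586

def draw_circle(centre=(0, 0), radius=None, circumference=None):
    if radius is None and circumference is None:
        radius = 1.0                      # some arbitrary fallback is needed
    if radius is None:
        radius = circumference / TWO_PI
    if circumference is None:
        circumference = radius * TWO_PI
    # If both are passed and they disagree, no default-evaluation
    # order can reconcile them -- which is the conflict noted above.
    return centre, radius, circumference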
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.
If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython). ChrisA
On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython).
ChrisA

I guess I wasn't clear. I'm not sure if you understood what I meant to say, or not. First I should have said "binding" rather than "evaluating". What I meant was that in cases like

def f(a=earlydefault1, b := latedefault2, c=earlydefault3, d := latedefault4):

it makes no real difference if the bindings are done in the order a, c, b, d (early ones before late ones) or a, b, c, d (strict left-to-right), since binding an early default should have no side effects, so (I thought, wrongly) it could be an implementation detail. Of course there IS a difference: it allows late default values to refer to subsequent early default values, e.g. in the example above `latedefault2` could refer to `c`. So yes, then that code would be legal on some interpreters and not others, as you said. If you understood exactly what I meant, I apologise.
Rob Cliffe
On Wed, Oct 27, 2021 at 12:50 PM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython).
ChrisA

I guess I wasn't clear. I'm not sure if you understood what I meant to say, or not. First I should have said "binding" rather than "evaluating". What I meant was that in cases like

def f(a=earlydefault1, b := latedefault2, c=earlydefault3, d := latedefault4):

it makes no real difference if the bindings are done in the order a, c, b, d (early ones before late ones) or a, b, c, d (strict left-to-right), since binding an early default should have no side effects, so (I thought, wrongly) it could be an implementation detail. Of course there IS a difference: it allows late default values to refer to subsequent early default values, e.g. in the example above `latedefault2` could refer to `c`. So yes, then that code would be legal on some interpreters and not others, as you said. If you understood exactly what I meant, I apologise.
Yep, that's precisely the distinction that matters: whether it's legal to refer to parameters further to the right. If we consider tuple unpacking as an approximate parallel:

def f():
    a = [10,20,30]
    i, a[i] = 1, "Hi"
On Wed, Oct 27, 2021 at 12:59 PM Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 27, 2021 at 12:50 PM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython).
ChrisA

I guess I wasn't clear. I'm not sure if you understood what I meant to say, or not. First I should have said "binding" rather than "evaluating". What I meant was that in cases like

def f(a=earlydefault1, b := latedefault2, c=earlydefault3, d := latedefault4):

it makes no real difference if the bindings are done in the order a, c, b, d (early ones before late ones) or a, b, c, d (strict left-to-right), since binding an early default should have no side effects, so (I thought, wrongly) it could be an implementation detail. Of course there IS a difference: it allows late default values to refer to subsequent early default values, e.g. in the example above `latedefault2` could refer to `c`. So yes, then that code would be legal on some interpreters and not others, as you said. If you understood exactly what I meant, I apologise.
Yep, that's precisely the distinction that matters: whether it's legal to refer to parameters further to the right. If we consider tuple unpacking as an approximate parallel:
def f():
    a = [10,20,30]
    i, a[i] = 1, "Hi"
Premature send, oops...

def f():
    a = [10, 20, 30]
    i, a[i] = 1, "Hi"
    print(a)

It's perfectly valid to refer to something from earlier in the multiple assignment, because they're assigned left to right. Python doesn't start to look up the name 'a' until it's finished assigning to 'i'.

Since Python doesn't really have a concept of statics or calculated constants, we don't really have any parallel, but imagine that we could do this:

def f():
    # Calculate this at function definition time and then save it
    # as a constant
    const pos = random.randrange(3)
    a = [10, 20, 30]
    i, a[i] = pos, "Hi"

This is something like what I'm describing. The exact value of an early-bound argument default gets calculated at definition time and saved; then it gets assigned to its corresponding parameter if one wasn't given.

(Actually, I'd really like Python to get something like this, as it'd completely replace the "random=random" optimization - there'd be no need to pollute the function's signature with something that exists solely for the optimization. It'd also make some of these kinds of things a bit easier to explain, since there would be a concept of def-time evaluation separate from the argument list. But we have what we have.)

So I think that we did indeed understand one another.

ChrisA
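For anyone who hasn't met it, the "random=random" trick looks something like this (a sketch):

import random

# The extra parameter exists only to capture the global lookup once,
# at definition time, as a micro-optimization; callers never pass it.
def roll(sides=6, randrange=random.randrange):
    return randrange(1, sides + 1)

The cost is exactly the one noted above: the speed hack leaks into the function's public signature.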
On Tue, Oct 26, 2021 at 4:46 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option.
How could that be avoided? By definition, early bound is evaluated "earlier" than late-bound :-)

Early-bound (i.e. regular) parameters are evaluated at function definition time. By the time we get to the late-bound ones, those are actual values, not expressions. The interpreter could notice that early bound names are used in late-bound expressions and raise an error, but if not, there'd be no issue with when they were evaluated.

This could cause a bit of confusion with "getting" that it's not a simple left-to-right rule, but that's the same potential confusion with early vs late bound parameters anyway.

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
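A quick illustration of "actual values, not expressions" (standard behaviour today):

import time

def stamp(t=time.time()):   # time.time() runs once, when the def executes
    return t

stamp() == stamp()          # True: every call sees the same saved value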
On Wed, Oct 27, 2021 at 11:17 AM Christopher Barker <pythonchb@gmail.com> wrote:
On Tue, Oct 26, 2021 at 4:46 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option.
How could that be avoided? By definition, early bound is evaluated "earlier" than late-bound :-)
early-bound (i.e. regular) parameters are evaluated at function definition time. By the time we get to the late-bound ones, those are actual values, not expressions.
The interpreter could notice that early bound names are used in late-bound expressions and raise an error, but if not, there'd be no issue with when they were evaluated.
This could cause a bit of confusion with "getting" that it's not a simple left-to-right rule, but that's the same potential confusion with early vs late bound parameters anyway.
The question is whether code like this should work:

def f(a=>b + 1, b=2): ...

f()
f(b=4)

Pure left-to-right assignment would raise UnboundLocalError in both cases. Tiered evaluation wouldn't.

Are there any other places in Python where assignments aren't done left to right, but are done in two distinct phases?

ChrisA
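Neither semantics can be written directly today, but a hand-rolled binding harness for that exact function shows the observable difference (a sketch; KeyError stands in for the proposal's UnboundLocalError):

def f_left_to_right(**passed):
    bound = {}
    # a is finalized first: a passed value, else its late-bound default
    bound['a'] = passed['a'] if 'a' in passed else bound['b'] + 1
    bound['b'] = passed['b'] if 'b' in passed else 2
    return bound['a'], bound['b']

def f_tiered(**passed):
    bound = dict(passed)
    if 'b' not in bound:        # pass 1: early-bound defaults
        bound['b'] = 2
    if 'a' not in bound:        # pass 2: late-bound defaults, left to right
        bound['a'] = bound['b'] + 1
    return bound['a'], bound['b']

f_tiered()             # (3, 2)
f_tiered(b=4)          # (5, 4)
f_left_to_right()      # KeyError: 'b'
f_left_to_right(b=4)   # KeyError: 'b' -- b is not yet bound when a needs it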
On Tue, Oct 26, 2021 at 5:24 PM Chris Angelico <rosuav@gmail.com> wrote:
How could that be avoided? by definition, early bound is evaluated "earlier" than late-bound :-)
This could cause a bit of confusion with "getting" that it's not a simple left-to-right rule, but that's the same potential confusion with early vs late bound parameters anyway.
sorry, got tangled up between "evaluating" and "name binding"
The question is whether code like this should work:
def f(a=>b + 1, b=2): ...
f()
f(b=4)
Pure left-to-right assignment would raise UnboundLocalError in both cases. Tiered evaluation wouldn't.
Nice, simple example. I'm not a newbie, but my students are, and I think they'd find "tiered" evaluation really confusing.

Are there any other places in Python where assignments aren't done left to right, but are done in two distinct phases?

I sure can't think of one.

I've been thinking about this from the perspective of a teacher of Python. I'm not looking forward to having one more thing to teach about function definitions -- I struggle enough with covering all of the *args, **kwargs, keyword-only, positional-only options. Python used to be such a simple language, not so much anymore :-(

That being said, we currently have to teach, fairly early on, the consequences of using a mutable as a default value. And this PEP would make that easier to cover. But I think it's really important to keep the semantics as simple as possible, and left-to-right name binding is the way to do that.

(all this complicated by the fact that there is a LOT of code and advice in the wild about the None idiom, but what can you do?)

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
On Wed, Oct 27, 2021 at 11:41 AM Christopher Barker <pythonchb@gmail.com> wrote:
I've been thinking about this from the perspective of a teacher or Python. I"m not looking forward to having one more thing to teach about function definitions -- I struggle enough with cover all of the *args, **kwargs, keyword-only, positional-only options.
Python used to be such a simple language, not so much anymore :-(
Yeah. My intention here is that this should be completely orthogonal to argument types (positional-only, positional-or-keyword, keyword-only), completely orthogonal to type hints (or, as I'm discovering as I read through the grammar, type comments), and as much as possible else. The new way will look very similar to the existing way of writing defaults, because it's still defining the default.
That being said, we currently have to teach, fairly early on, the consequences of using a mutable as a default value. And this PEP would make that easier to cover. But I think it's really important to keep the semantics as simple as possible, and left-to-right name binding is the way to do that.
Yes. It will now be possible to say "to construct a new list every time, write it like this" instead of drastically changing the style of code.

def add_item(thing, target=[]):
    print("Oops, that uses a single shared default target")

def add_item(thing, target=>[]):
    print("If you don't specify a target, it makes a new list")
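And for completeness, the failure mode being designed away (standard behaviour today; these hypothetical bodies append rather than print):

def add_item(thing, target=[]):
    target.append(thing)
    return target

add_item(1)   # [1]
add_item(2)   # [1, 2] -- the single default list is shared across calls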
(all this complicated by the fact that there is a LOT of code and advice in the wild about the None idiom, but what can you do?)
And that's not all going away. There will always be some situations where you can't define the default with an expression. The idea that a parameter is optional, but doesn't have a value, may itself be worth exploring (maybe some way to say "arg=pass" and then have an operator "if unset arg:" that returns True if it's unbound), but that's for another day. ChrisA
On 27/10/2021 01:47, Chris Angelico wrote:
The idea that a parameter is optional, but doesn't have a value, may itself be worth exploring (maybe some way to say "arg=pass" and then have an operator "if unset arg:" that returns True if it's unbound), but that's for another day.
ChrisA

Indeed. And it could be useful to know if a parameter was passed a value or given the default value. Python has very comprehensive introspection abilities, but this is a (small) gap.
Rob Cliffe
On Tue, Oct 26, 2021, 9:54 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
On 27/10/2021 01:47, Chris Angelico wrote:
The idea that a parameter is optional, but doesn't have a value, may itself be worth exploring (maybe some way to say "arg=pass" and then have an operator "if unset arg:" that returns True if it's unbound), but that's for another day.
ChrisA

Indeed. And it could be useful to know if a parameter was passed a value or given the default value. Python has very comprehensive introspection abilities, but this is a (small) gap.
Rob Cliffe
I'll try to summarize why I still have pause even though, after thinking about it, I still can't really think of a solid example showing that this "give me the default" issue is a concrete problem:

Until this point, exactly how to provide a default argument value has been the Wild West in python, and as Paul Moore said, there's really not a way to introspect whether a parameter was "defaulted". The result is that a cornucopia of APIs have flourished. By necessity, all these previous APIs provided ways to ask for the default through passing some special value, and this has felt like "the pythonic way" for a long time. We are all used to it (but perhaps only tolerated it in many cases).

The proposal blesses a new API with language support, and it will suddenly become THE pythonic approach. But this newly blessed, pythonic API is a radical departure from years - decades - of coding habits. And so even though I like the proposal, I'm just concerned it could be a little bit more painful than at first glance. So it just seems like some version of these concerns belongs in the PEP. Thanks Chris A for putting up with what isn't much more than a hunch (at least on my part) and I'll say nothing more about it. Carry on.
On Wed, Oct 27, 2021 at 1:13 PM Ricky Teachey <ricky@teachey.org> wrote:
I'll try to summarize why I still have pause even though after thinking about it I still can't really think of a solid example showing that this "give me the default" issue is a concrete problem:
Until this point, exactly how to provide a default argument value has been the Wild West in python, and as Paul Moore said, there's really not a way to introspect whether a parameter was "defaulted". The result is that a cornucopia of APIs have flourished. By necessity, all these previous APIs provided ways to ask for the default through passing some special value, and this has felt like "the pythonic way" for a long time. We are all used to it (but perhaps only tolerated it in many cases).
The proposal blesses a new API with language support, and it will suddenly become THE pythonic approach. But this newly blessed, pythonic API is a radical departure from years - decades - of coding habits.
That's a very good point, but I'd like to turn it on its head and explain the situation from the opposite perspective. For years - decades - Python has lacked a way to express the concept that a default argument is something more than a simple value. Coders have used a variety of workarounds. In the future, most or all of those workarounds will no longer be necessary, and we will have a single obvious way to do things. Suppose that, up until today, Python had not had support for big integers - that the only numbers representable were those that fit inside a 32-bit (for compatibility, of course) two's complement integer. People who need to work with larger numbers would use a variety of tricks: storing digits in strings and performing arithmetic manually, or using a tuple of integers to represent a larger number, or using floats and accepting a loss of precision. Then Python gains native support for big integers. Yes, it would be a radical departure from years of workarounds. Yes, some people would continue to use the other methods, because there would be enough differences to warrant it. And yes, there would be backward-incompatible API changes as edge cases get cleaned up. Is it worth it? For integers, I can respond with a resounding YES, because we have plenty of evidence that they are immensely valuable! With default argument expressions, it's less of an obvious must-have, but I do believe that the benefits outweigh the costs. Will the standard library immediately remove all "=None" workaround defaults? No. Probably some of them, but not all. Will there be breakage as a result of something passing None where it wanted the default? Probably - if not in the stdlib, then I am sure it'll happen in third-party code. Will future code be more readable as a result? Absolutely. ChrisA
On Tue, Oct 26, 2021 at 10:44 PM Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 27, 2021 at 1:13 PM Ricky Teachey <ricky@teachey.org> wrote:
I'll try to summarize why I still have pause even though after thinking
about it I still can't really think of a solid example showing that this "give me the default" issue is a concrete problem:
Until this point, exactly how to provide a default argument value has
been the Wild West in python, and as Paul Moore said, there's really not a way to introspect whether a parameter was "defaulted". The result is that a cornucopia of APIs have flourished. By necessity, all these previous APIs provided ways to ask for the default through passing some special value, and this has felt like "the pythonic way" for a long time. We are all used to it (but perhaps only tolerated it in many cases).
The proposal blesses a new API with language support, and it will
suddenly become THE pythonic approach. But this newly blessed, pythonic API is a radical departure from years - decades - of coding habits.
That's a very good point, but I'd like to turn it on its head and explain the situation from the opposite perspective.
For years - decades - Python has lacked a way to express the concept that a default argument is something more than a simple value. Coders have used a variety of workarounds. In the future, most or all of those workarounds will no longer be necessary, and we will have a single obvious way to do things.
Suppose that, up until today, Python had not had support for big integers - that the only numbers representable were those that fit inside a 32-bit (for compatibility, of course) two's complement integer. People who need to work with larger numbers would use a variety of tricks: storing digits in strings and performing arithmetic manually, or using a tuple of integers to represent a larger number, or using floats and accepting a loss of precision. Then Python gains native support for big integers. Yes, it would be a radical departure from years of workarounds. Yes, some people would continue to use the other methods, because there would be enough differences to warrant it. And yes, there would be backward-incompatible API changes as edge cases get cleaned up. Is it worth it? For integers, I can respond with a resounding YES, because we have plenty of evidence that they are immensely valuable! With default argument expressions, it's less of an obvious must-have, but I do believe that the benefits outweigh the costs.
Will the standard library immediately remove all "=None" workaround defaults? No. Probably some of them, but not all. Will there be breakage as a result of something passing None where it wanted the default? Probably - if not in the stdlib, then I am sure it'll happen in third-party code. Will future code be more readable as a result? Absolutely.
ChrisA
If I might paraphrase Agrippa from the book of Acts: "Almost thou persuadest me..." ;) It's a fine answer.

Thanks for the attentiveness to my concern, Chris. Very much appreciated!

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Wed, Oct 27, 2021 at 1:52 PM Ricky Teachey <ricky@teachey.org> wrote:
If I might paraphrase Agrippa from the book of Acts: "Almost thou persuadest me..." ;) It's a fine answer.
Thanks for the attentiveness to my concern, Chris. Very much appreciated!
My pleasure. This has been a fairly productive discussion thread, and highly informative; thank you for being a part of that! Now, if I could just figure out what's going on in the grammar... (Actually, the PEG parser is a lot easier to understand than Python's older grammar was. But this is my first time messing with it, so I'm going through all the stages of brand-new discovery.) ChrisA
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Wed, Oct 27, 2021 at 1:15 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
help(bisect.bisect)
bisect_right(a, x, lo=0, hi=None, *, key=None)
    ...
    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.

In the docstring, both lo and hi are given useful, meaningful defaults. In the machine-readable signature, which is also what would be used for tab completion or other tools, lo gets a very useful default, but hi gets a default of None.

For the key argument, None makes a very meaningful default; it means that no transformation is done. But in the case of hi, the default really and truly is len(a), but because of a technical limitation, that can't be written that way.

Suppose this function were written as:

bisect_right(a, x, lo=None, hi=None, *, key=None)

Do we really need the ability to specify actual function defaults? I mean, you can just have a single line of code "if lo is None: lo = 0" and it'd be fine. Why do we need the ability to put "lo=0" in the signature? I put it to you that the same justification should be valid for hi.

ChrisA
On 2021-10-27 at 13:47:31 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 27, 2021 at 1:15 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
help(bisect.bisect)
bisect_right(a, x, lo=0, hi=None, *, key=None)
    ...
    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
[...]
Suppose this function were written as:
bisect_right(a, x, lo=None, hi=None, *, key=None)
Do we really need the ability to specify actual function defaults? I mean, you can just have a single line of code "if lo is None: lo = 0" and it'd be fine. Why do we need the ability to put "lo=0" in the signature? I put it to you that the same justification should be valid for hi.
I agree. Why do we need the ability to put "lo=0" in the signature? If we have to rewrite parts of bisect_right for late binding anyway, then why not go all out? If you want to bisect a slice, then bisect a slice:

result = bisect_right(a[lo:hi], x)

Oh, wait, what if a has billions of elements and creating a slice containing a million or two out of the middle is too expensive? Then provide two functions:

def bisect_right(a, x, key=None):
    return bisect_slice_right(a, x, 0, len(a), key)

def bisect_slice_right(a, x, lo, hi, key=None):
    "actual guts of bisect function go here"
    "don't actually slice a; that might be really expensive"

No extraneous logic to (re)compute default values at all. Probably fewer/simpler unit tests, too, but that might depend on the programmer or other organizational considerations. On the other side, parts of doc strings may end up being duplicated, and flat is better than nested.

(No, I don't know the backwards compatibility issues that might arise from re-writing bisect_right in this manner, nor do I know what the options are to satisfy IDEs and/or users thereof.)

Running with Brendan Barnwell's point, is there a lot of code that gets a lot simpler (and who gets to define "simpler"?) with late binding default values? Can that same code achieve the same outcome by being refactored the same way as I did bisect_right? I'm willing to accept that my revised bisect_right is horrible by some reasonably objective standard, too.
a bit OT: If you want to bisect a slice, then bisect a slice:
result = bisect_right(a[lo:hi], x)
Oh, wait, what if a has billions of elements and creating a slice containing a million or two out of the middle is too expensive?
Yup, that's why I proposed about a year ago on this list that there should be a way to get a slice view (or slice iterator, AKA islice) easily :-)

You can use itertools.islice for this though -- oops, no you can't:

TypeError                                 Traceback (most recent call last)
<ipython-input-36-9efe49979a8e> in <module>
----> 1 result = bisect.bisect_right(islice(a, lo, hi), x)

TypeError: object of type 'itertools.islice' has no len()

We really do need a slice view :-)

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
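Such a view is only a few lines (a hypothetical sketch; bisect needs just __len__ and __getitem__ with non-negative indices):

from bisect import bisect_right

class SliceView:
    """Read-only view of seq[lo:hi] that copies nothing."""
    def __init__(self, seq, lo, hi):
        self.seq, self.lo, self.hi = seq, lo, hi
    def __len__(self):
        return self.hi - self.lo
    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.seq[self.lo + i]

a = list(range(1_000_000))
hit = bisect_right(SliceView(a, 1000, 2000), 1500)   # index within the view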
On 2021-10-26 19:47, Chris Angelico wrote:
help(bisect.bisect)
bisect_right(a, x, lo=0, hi=None, *, key=None)
    ...
    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
In the docstring, both lo and hi are given useful, meaningful defaults. In the machine-readable signature, which is also what would be used for tab completion or other tools, lo gets a very useful default, but hi gets a default of None.
How would tab completion work with this new feature? How could a late-bound default be tab-completed? Usually when I've used tab-completion with function arguments it's only completing the argument name, not the default value.
For the key argument, None makes a very meaningful default; it means that no transformation is done. But in the case of hi, the default really and truly is len(a), but because of a technical limitation, that can't be written that way.
Here is some code:

def foo():
    for a in x:
        print(a)
    for b in x:
        print(b)

other_func(foo)

Due to a technical limitation, I cannot write foo as a lambda to pass it inline as an argument to other_func. Is this also a problem? If so, should we develop the ever-whispered-about multiline lambda syntax? Yes, there are some kinds of default behaviors that cannot be syntactically written inline in the function signature. I don't see that as a huge problem. The fact that you want to write `len(a)` in such a case is not, for me, sufficient justification for all the other cans of worms opened by this proposal (as seen in various subthreads on this list), to say nothing of the additional burden on learners or on readers of code who now must add this to their list of things they have to grok when reading code.
Suppose this function were written as:
bisect_right(a, x, lo=None, hi=None, *, key=None)
Do we really need the ability to specify actual function defaults? I mean, you can just have a single line of code "if lo is None: lo = 0" and it'd be fine. Why do we need the ability to put "lo=0" in the signature? I put it to you that the same justification should be valid for hi.
I understand what you're saying, but I just don't agree. For one thing, functions are written once and called many times. Normal default arguments provide a major leverage effect: a single default argument value can simplify many calls to the function from many call sites, because you can (potentially) omit certain arguments at every call. Late-bound defaults provide no additional simplicity at the call sites; all they do is permit a more terse spelling at the function definition site. Since the definition only has to be written once, a bit more verbosity there is not so much of a burden. Also, the attempt to squeeze both kinds of defaults into the signature breaks the clear and simple rule we have, which is that the signature line is completely evaluated right away (i.e., it is part of the enclosing scope). Right now we have a clean separation between the function definition and the body, which this proposal will muddy quite profoundly. The problem is exacerbated by the proposed syntaxes, nearly all of which I find ugly to varying degrees. But I think even with the best syntax, the underlying problem remains that switching back and forth between definition-scope and body-scope within the signature is confusing. Finally, I think a big problem with the proposal for me is that it really only targets what I see as a quite special case, which is late-bound expressions that are small enough to readably fit in the argument list. All of the arguments (har har) about readability go out the window if people start putting anything complex in there. Similarly if there is any more complex logic required (such as combinations of arguments whose defaults are interdependent in nontrivial ways) that cannot easily be expressed in separable argument defaults, we're still going to have to do it in the function body anyway. So we are adding an entirely new complication to the basic argument syntax just to handle (what I see as) a quite narrow range of expressions. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On 27/10/2021 03:12, Brendan Barnwell wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
While I disagree with you re this particular feature, I sympathise with the general sentiment. Perhaps the first objective of Python 4 should be to get rid of as many features as possible. Just like when you're forced to throw out all your precious old junk (school reports, prizes, the present from Aunt Maud, theatre programmes, books you never read, clothes you never wear .....).

Nah, who am I kidding? Each feature will have its band of devotees that will defend it TO THE DEATH! Of course, what it should REALLY have are all MY favorite features, including some that haven't been added yet.😁
Rob Cliffe
On 2021-10-26 19:50, Rob Cliffe via Python-ideas wrote:
On 27/10/2021 03:12, Brendan Barnwell wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
While I disagree with you re this particular feature, I sympathise with the general sentiment. Perhaps the first objective of Python 4 should be to get rid of as many features as possible. Just like when you're forced to throw out all your precious old junk (school reports, prizes, the present from Aunt Maud, theatre programmes, books you never read, clothes you never wear .....).
Nah, who am I kidding? Each feature will have its band of devotees that will defend it TO THE DEATH! Of course, what it should REALLY have are all MY favorite features, including some that haven't been added yet.😁 Rob Cliffe
Now you're talking! 100% agree! Assuming of course that by "MY favorite features" you mean, well, MY favorite features. . . :-) -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Wed, Oct 27, 2021 at 1:52 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
On 27/10/2021 03:12, Brendan Barnwell wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
While I disagree with you re this particular feature, I sympathise with the general sentiment. Perhaps the first objective of Python 4 should be to get rid of as many features as possible. Just like when you're forced to throw out all your precious old junk (school reports, prizes, the present from Aunt Maud, theatre programmes, books you never read, clothes you never wear .....).
Nah, who am I kidding? Each feature will have its band of devotees that will defend it TO THE DEATH! Of course, what it should REALLY have are all MY favorite features, including some that haven't been added yet.😁
One truism of language design is that the simpler the language is (and the easier to explain to a novice), the harder it is to actually use. For instance, we don't *need* async/await, or generators, or list comprehensions, or for loops, or any of those other tools for processing partial data; all we really need is a class with the appropriate state management. And there are people who genuinely prefer coding a state machine to writing a generator function. No problem! You're welcome to. But the language is richer for having these tools, and we can more easily express our logic using them.

Each feature adds to the complexity of the language, but if they are made as orthogonal as possible, they generally add linearly. But they add exponentially to the expressiveness. Which ultimately means that orthogonality is the greatest feature in language design; it allows you to comprehend features one by one, and build up your mental picture of the code in simple ways, while still having the full power that it offers.

As an example of orthogonality, Python's current argument defaults don't care whether you're working with integers, strings, lists, or anything else. They always behave the same way: using one specific value (object) to be the value given if one is not provided. They're also completely orthogonal with argument passing styles (positional and keyword), and which ones are valid for which parameters. And orthogonal again with type annotations and type comments. All these features go in the 'def' statement - or 'lambda', which has access to nearly all the same features (it can't have type comments, but everything else works) - but you don't have to worry about exponential complexity, because there's no conflicts between them.

One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.

(It's actually quite fascinating how language design and game design work in parallel. The challenges in making a game fair, balanced, and fun are very similar to the challenges in making a language usable, elegant, and clean. I guess it's because, ultimately, both design challenges are about the humans who'll use the thing, and humans are still humans.)

ChrisA
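As a concrete illustration of the orthogonality claim: today, switching a default to the sentinel idiom forces the annotation to change as well, whereas under the proposal the annotation could stay put. This is only a sketch; the `=>` lines show the PEP's proposed syntax and do not run in current Python, and append_to is a made-up name:

    from typing import Optional

    # Status quo: the sentinel leaks into the annotation.
    def append_to(item: int, target: Optional[list] = None) -> list:
        if target is None:
            target = []
        target.append(item)
        return target

    # Proposed (not valid today): the annotation is undisturbed.
    # def append_to(item: int, target: list => []) -> list:
    #     target.append(item)
    #     return target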
On 2021-10-26 20:15, Chris Angelico wrote:
One truism of language design is that the simpler the language is (and the easier to explain to a novice), the harder it is to actually use. For instance, we don't *need* async/await, or generators, or list comprehensions, or for loops, or any of those other tools for processing partial data; all we really need is a class with the appropriate state management. And there are people who genuinely prefer coding a state machine to writing a generator function. No problem! You're welcome to. But the language is richer for having these tools, and we can more easily express our logic using them.
Each feature adds to the complexity of the language, but if they are made as orthogonal as possible, they generally add linearly. But they add exponentially to the expressiveness. Which ultimately means that orthogonality is the greatest feature in language design; it allows you to comprehend features one by one, and build up your mental picture of the code in simple ways, while still having the full power that it offers.
These are fascinating and great points, but again I see the issues slightly differently. I wouldn't agree that "the simpler the language the harder it is to use". That to me implies an equivalence with "the more complex the language the easier it is to use", which hopefully we agree is untrue. Rather, both extremely simple AND extremely complex languages are difficult to use. The goal is a "sweet spot" in which you add complexity in just the right areas to achieve maximum expressibility with minimum cognitive load.

And I think Python overall has done a good job of this, better than pretty much any other language I know of. And I think part of that good design has involved not "sweating the small stuff" in the sense of trying to add lots of special syntax to handle every mildly annoying pain point, but instead focusing on a relatively small number of well-chosen building blocks (iteration, for instance) and providing a clean combinatoric framework to facilitate the exponential expressiveness you describe.
As an example of orthogonality, Python's current argument defaults don't care whether you're working with integers, strings, lists, or anything else. They always behave the same way: using one specific value (object) to be the value given if one is not provided. They're also completely orthogonal with argument passing styles (positional and keyword), and which ones are valid for which parameters. And orthogonal again with type annotations and type comments. All these features go in the 'def' statement - or 'lambda', which has access to nearly all the same features (it can't have type comments, but everything else works) - but you don't have to worry about exponential complexity, because there's no conflicts between them.
One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.
So what I would say here is that argument-passing styles are in the sweet spot and argument binding time is not. One reason (as I mentioned in another post on this thread) is that argument-passing occurs at every call site, so simplifying it has the kind of exponential benefit you describe. But argument binding only happens once, when the function is defined, so its benefits scale with how many functions you write, not how many calls you make. Nonetheless two different argument-binding syntaxes still impose a cognitive burden, since now to be able to read Python code a person has to be able to understand both syntaxes instead of just one. The combination of "benefit scales only with the number of definitions" and "have to know two syntaxes all the time", for me, makes the cost/benefit ratio not pencil out.

There are also other issues, such as the difficulty of finding a good syntax. I think it's typical of Pythonic style to avoid cramming too much into too small a space. If we have a long expression, we can break it up into pieces assigned to separate variables. If we have a long function, we can break it up into separate functions. But there is no straightforward way to "break up" a single function's signature into multiple signatures (while still having just one function). This means that trying to introduce late-binding logic into the signature requires us to cram it into this uniquely restricted space. And this in turn means that a lot is riding on the choice of a syntax that is visually optimal, because there isn't going to be any way to "factor it out" into something more readable. (Other than, of course, the solution we currently have, which is to put the logic in the function body.)

Also there is the question of how this new orthogonal dimension relates to existing dimensions (i.e., is it "linearly independent"). Right now I think we have a pretty simple rule: If you need complicated logic to express as an early-bound default, you move that logic before the function, assign whatever you need to a variable, and then use that variable as the default. If on the other hand you need complicated logic to express a late-bound default, you pick some sentinel (often None) to act as an early-bound placeholder, and move the logic into the function body.

Now it's true that we have asymmetry, in that SIMPLE logic can be readably inlined as an early-bound default, whereas even simple logic cannot be inlined as a late-bound default because there is no inline way to express late-bound defaults. But I still think it's worth noticing that the new syntax is not going to add a "full dimension". It is only going to be useful for VERY SIMPLE late-bound default values. If the late-bound default is too complicated, inlining it will backfire and make things LESS readable. So although the syntax is orthogonal, it is not really separating out early and late default binding; it is only separating "simple late default binding".

Finally, even if the change is orthogonal to existing dimensions of function-calling, there is still a cost to adding this new dimension in the first place (i.e., one more thing to know). And there still needs to be a decision made about whether the cognitive burden of adding that dimension is worth the gain in expressiveness. Of course there is a lot of subjectivity here and it seems I am in the minority, but to me the ability to concisely express short, simple expressions for late-binding defaults doesn't fall in that sweet spot.
-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
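For reference, a small sketch of the "pretty simple rule" described above, in current Python. The names (DEFAULT_ROOT, scan, clamp, _MISSING) are illustrative only:

    import os

    # Early-bound default too complex for the signature? Hoist it out:
    DEFAULT_ROOT = os.environ.get("APP_ROOT", "/tmp")

    def scan(root=DEFAULT_ROOT):
        ...

    # Late-bound default? Sentinel in the signature, logic in the body:
    _MISSING = object()

    def clamp(items, hi=_MISSING):
        if hi is _MISSING:
            hi = len(items)
        return hi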
On Wed, Oct 27, 2021 at 5:14 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
Now it's true that we have asymmetry, in that SIMPLE logic can be readably inlined as an early-bound default, whereas even simple logic cannot be inlined as a late-bound default because there is no inline way to express late-bound defaults. But I still think it's worth noticing that the new syntax is not going to add a "full dimension". It is only going to be useful for VERY SIMPLE late-bound default values. If the late-bound default is too complicated, inlining it will backfire and make things LESS readable. So although the syntax is orthogonal, it is not really separating out early and late default binding; it is only separating "simple late default binding".
And that right there is the crux of it. The simple cases ARE quite common. Of course there will always be cases that don't work that way, so yes, there will always be the need to use sentinels and put the logic inside the function; but there are plenty where code will benefit from putting the default into the signature, just as we already have with early-bound defaults.

It's easy to argue against a feature by showing that it can be abused. For instance, I could rewrite your def function

    def foo():
        for a in x: print(a)
        for b in x: print(b)

    other_func(foo)

thus:

    other_func(lambda: [print(a) for lst in (x, x) for a in lst].append(0))

Tada! I've worked around a technical limitation. Is this good code? No. Would some code benefit from a multi-line lambda function? Definitely.

Workarounds can be horrifically clunky, and then they provide a strong incentive to do things better. Or they can be fairly insignificant, which provides a much weaker incentive. (In this case, it's probably fine to just use def!) But they're still workarounds, and there is always benefit to being able to express your logic without having to fight the language's limitations.

ChrisA
On 27/10/2021 08:56, Chris Angelico wrote:
On Wed, Oct 27, 2021 at 5:14 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
But I still think it's worth noticing that the new syntax is not going to add a "full dimension". It is only going to be useful for VERY SIMPLE late-bound default values. If the late-bound default is too complicated, inlining it will backfire and make things LESS readable. So although the syntax is orthogonal, it is not really separating out early and late default binding; it is only separating "simple late default binding".

And that right there is the crux of it. The simple cases ARE quite common. Of course there will always be cases that don't work that way, so yes, there will always be the need to use sentinels and put the logic inside the function; but there are plenty where code will benefit from putting the default into the signature, just as we already have with early-bound defaults.
It's easy to argue against a feature by showing that it can be abused.
+1. The same argument (Brendan's) could be used against having e.g. list comprehensions. Rob Cliffe
On 2021-10-26 20:15, Chris Angelico wrote:
One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.
Another point that I forgot to mention when replying to this before:

You are phrasing this in terms of orthogonality in argument-passing. But why think of it that way? If we think of it in terms of expression evaluation, your proposal is quite non-orthogonal, because you're essentially creating a very limited form of deferred evaluation that works only in function arguments. In a function argument, people will be able to do `x=>[]`, but they won't be able to do that anywhere else. So you're creating a "mode" for deferred evaluation.

This is why I don't get why you seem so resistant to the idea of a more general deferred evaluation approach to this problem. Generalizing deferred evaluation somehow would make the proposal MORE orthogonal to other features, because it would mean you could use a deferred expression as an argument in the same way you could use it in other places.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 9:17 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-26 20:15, Chris Angelico wrote:
One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.
Another point that I forgot to mention when replying to this before:
You are phrasing this in terms of orthogonality in argument-passing. But why think of it that way? If we think of it in terms of expression evaluation, your proposal is quite non-orthogonal, because you're essentially creating a very limited form of deferred evaluation that works only in function arguments. In a function argument, people will be able to do `x=>[]`, but they won't be able to do that anywhere else. So you're creating a "mode" for deferred evaluation.
This is why I don't get why you seem so resistant to the idea of a more general deferred evaluation approach to this problem. Generalizing deferred evaluation somehow would make the proposal MORE orthogonal to other features, because it would mean you could use a deferred expression as an argument in the same way you could use it in other places.
Please expand on this. How would you provide an expression that gets evaluated in *someone else's context*? The way I've built it, the expression is written and compiled in the context that it will run in. The code for the default expression is part of the function that it serves. If I were to generalize this in any way, it would be to separate two parts: "optional parameter, if omitted, leave unbound" and "if local is unbound: do something". Not to "here's an expression, go evaluate it later", which requires a lot more compiler help. ChrisA
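A rough emulation of that two-part decomposition in current Python, using a sentinel to stand in for "left unbound" (the _UNBOUND name is made up):

    _UNBOUND = object()   # hypothetical stand-in for "left unbound"

    # Part one: an optional parameter which, if omitted, is left "unbound".
    def bisect(a, hi=_UNBOUND):
        # Part two: "if local is unbound: do something".
        if hi is _UNBOUND:
            hi = len(a)
        return hi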
On 10/26/2021 7:38 PM, Rob Cliffe via Python-ideas wrote:
PS Can I echo Guido's plea that people don't derail this PEP by trying to shoehorn deferred-evaluation-objects (or whatever you want to call them) into it? As Chris A says, that's a separate idea and should go into a separate PEP. If I need a screwdriver, I buy a screwdriver, not an expensive Swiss Army knife.
As I've said before, I disagree with this. If you're going to introduce this feature, you need some way of building an inspect.Signature object that refers to the code to be executed. My concern is that if we add something that has deferred evaluation of code, but we don't think of how it might interact with other future uses of deferred evaluation, we might not be able to merge the two ideas.

Maybe there's something that could be factored out of PEP 649 (Deferred Evaluation Of Annotations Using Descriptors) that could be used with PEP 671?

That said, I'm still -1 on PEP 671.

Eric
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations." At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features. On Sat, Oct 30, 2021, 3:57 PM Eric V. Smith <eric@trueblade.com> wrote:
On 10/26/2021 7:38 PM, Rob Cliffe via Python-ideas wrote:
PS Can I echo Guido's plea that people don't derail this PEP by trying to shoehorn deferred-evaluation-objects (or whatever you want to call them) into it? As Chris A says, that's a separate idea and should go into a separate PEP. If I need a screwdriver, I buy a screwdriver, not an expensive Swiss Army knife.
As I've said before, I disagree with this. If you're going to introduce this feature, you need some way of building an inspect.Signature object that refers to the code to be executed. My concern is that if we add something that has deferred evaluation of code, but we don't think of how it might interact with other future uses of deferred evaluation, we might not be able to merge the two ideas.
Maybe there's something that could be factored out of PEP 649 (Deferred Evaluation Of Annotations Using Descriptors) that could be used with PEP 671?
That said, I'm still -1 on PEP 671.
Eric
On 2021-10-30 15:07, David Mertz, Ph.D. wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
I'm not sure I'm -100, but still a hard -1, maybe -10. I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions. That doesn't make any sense. If we have a way to create deferred expressions we should try to make them more generally usable. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sat, 30 Oct 2021 at 23:13, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 15:07, David Mertz, Ph.D. wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
I'm not sure I'm -100, but still a hard -1, maybe -10.
I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions. That doesn't make any sense. If we have a way to create deferred expressions we should try to make them more generally usable.
I was in favour of the idea, but having seen the implications I'm now -0.5, moving towards -1. I'm uncomfortable with *not* having a "proper" mechanism for building signature objects and other introspection (I don't consider having the expression as a string and requiring consumers to eval it, to be "proper"). And so, I think the implication is that this feature would need some sort of real deferred expression to work properly - and I'd rather deferred expressions were defined as a standalone mechanism, where the full range of use cases (including, but not limited to, late-bound defaults!) can be considered. Paul
On Sun, Oct 31, 2021 at 9:20 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Sat, 30 Oct 2021 at 23:13, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 15:07, David Mertz, Ph.D. wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
I'm not sure I'm -100, but still a hard -1, maybe -10.
I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions. That doesn't make any sense. If we have a way to create deferred expressions we should try to make them more generally usable.
I was in favour of the idea, but having seen the implications I'm now -0.5, moving towards -1. I'm uncomfortable with *not* having a "proper" mechanism for building signature objects and other introspection (I don't consider having the expression as a string and requiring consumers to eval it, to be "proper"). And so, I think the implication is that this feature would need some sort of real deferred expression to work properly - and I'd rather deferred expressions were defined as a standalone mechanism, where the full range of use cases (including, but not limited to, late-bound defaults!) can be considered.
Bear in mind that the status quo is, quite honestly, a form of white lie. In the example of bisect:

    def bisect(a, hi=None): ...
    def bisect(a, hi=>len(a)): ...

neither form of the signature will actually say that the default value is the length of a. In fact, I have never said that the consumer should eval it. There is fundamentally no way to determine the true default value for hi without first figuring out what a is.

So which is better: to have the value None, or to have a marker saying "this will be calculated later, and here's a human-readable description: len(a)"? I know that status quo wins a stalemate, but you're holding the new feature to a FAR higher bar than current idioms.

ChrisA
On 2021-10-30 15:35, Chris Angelico wrote:
Bear in mind that the status quo is, quite honestly, a form of white lie. In the example of bisect:
    def bisect(a, hi=None): ...
    def bisect(a, hi=>len(a)): ...
neither form of the signature will actually say that the default value is the length of a. In fact, I have never said that the consumer should eval it. There is fundamentally no way to determine the true default value for hi without first figuring out what a is.
The way you use the term "default value" doesn't quite work for me. More and more I think that part of the issue here is that with "hi=>len(a)" we aren't providing a default value at all. What we're providing is default *code*. To me a "value" is something that you can assign to a variable, and current (aka "early bound") argument defaults are that. But these new late-bound arguments aren't really default "values"; they're code that is run under certain circumstances. If we want to make them values, we need to define "evaluate len(a) later" as some kind of first-class value.
So which is better: to have the value None, or to have a marker saying "this will be calculated later, and here's a human-readable description: len(a)"?
Let me say that in a different way. . . :-)

Which is better, to have a marker saying "this will be calculated later", or to have a marker saying "this will be calculated later, and here's a human-readable description"?

My point is that None is already a marker. It's true it's not a special-purpose marker meaning "this will be calculated later", but I think in practice it is a marker that says "be sure to read the documentation to understand what passing None will do here".

Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.

So really the status quo is "you can already have the human-readable description but you have to type it in the docstring yourself". I don't see that as a big deal. So yes, the status quo is better, because it is not really any worse, and it avoids the complications that are arising in this thread (i.e., what order are the arguments evaluated in, can they reference each other, what symbol do we use, how do we implement it without affecting existing introspection, etc.).

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 9:54 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 15:35, Chris Angelico wrote:
Bear in mind that the status quo is, quite honestly, a form of white lie. In the example of bisect:
    def bisect(a, hi=None): ...
    def bisect(a, hi=>len(a)): ...
neither form of the signature will actually say that the default value is the length of a. In fact, I have never said that the consumer should eval it. There is fundamentally no way to determine the true default value for hi without first figuring out what a is.
The way you use the term "default value" doesn't quite work for me. More and more I think that part of the issue here is that with "hi=>len(a)" we aren't providing a default value at all. What we're providing is default *code*. To me a "value" is something that you can assign to a variable, and current (aka "early bound") argument defaults are that. But these new late-bound arguments aren't really default "values"; they're code that is run under certain circumstances. If we want to make them values, we need to define "evaluate len(a) later" as some kind of first-class value.
This is correct. In fact, as the syntax has it, the second one is a "default expression". But the Signature object can't actually hold the default expression, so it would have to have either a default value, or some sort of marker. Argument defaults, up to Python 3.11, are always default values. If PEP 671 is accepted, it will make sense to have default values AND default expressions.
So which is better: to have the value None, or to have a marker saying "this will be calculated later, and here's a human-readable description: len(a)"?
Let me say that in a different way. . . :-)
Which is better, to have a marker saying "this will be calculated later", or to have a marker saying "this will be calculated later, and here's a human-readable description"?
My point is that None is already a marker. It's true it's not a special-purpose marker meaning "this will be calculated later", but I think in practice it is a marker that says "be sure to read the documentation to understand what passing None will do here".
That's true. The trouble is that it isn't uniquely such a marker, and in fact is very often the actual default value. When you call dict.get(), the second argument has a default of None, and if you omit it, you really truly do get None as a result.

Technically, for tools that look at func.__defaults__, I have Ellipsis doing that kind of job. But (a) that's far less common than None, and (b) there's a way to figure out whether it's a real default value or a marker. Of course, there will still be functions that have pseudo-defaults, so you can never be truly 100% sure, but at least you get an indication for those functions that actually use default expressions.

So what you have is a marker saying "this is either the value None, or something that will be calculated later". Actually there are multiple markers; None might mean that, but so might "<object object at 0x7f27e77f0570>", which is more likely to mean that it'll be calculated later, but harder to recognize reliably. And in all cases, it might not be a value that's calculated later, but it might be a change in effect (maybe causing an exception to be raised rather than a value being returned).

The markers currently are very ad-hoc and can't be depended on by tools. There is fundamentally no way to do better than a marker, but we can at least have more useful markers.
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.
Interesting. Then why have default arguments at all? What's the point of saying "if omitted, the default is zero" in a machine-readable way? After all, you could just have it in the docstring. And there are plenty of languages where that's the case.

I'm of the opinion that having more information machine-readable is always better. Are you saying that it isn't? Or alternatively, that it's only useful when it fits in a strict subset of constant values (values that don't depend on anything else, and can be calculated at function definition time)?
So really the status quo is "you can already have the human-readable description but you have to type it in the docstring yourself". I don't see that as a big deal. So yes, the status quo is better, because it is not really any worse, and it avoids the complications that are arising in this thread (i.e., what order are the arguments evaluated in, can they reference each other, what symbol do we use, how do we implement it without affecting existing introspection, etc.).
And it also allows an easy transformation for functions that currently are experiencing issues from things being too constant. Consider:

    default_timeout = 500

    def connect(timeout=default_timeout):
        ...

If that default can be changed at run time, how do you fix the function? By my proposal, you just mark the default as being late-bound. With the status quo, now you need to bury the real default in the body of the function, and make a public statement that the default is None (or something else). That's not a true statement, since None doesn't really make sense as a default, but that's what you have to do to work around a technical limitation.

ChrisA
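The status-quo workaround being described, spelled out in current Python (the None default here is exactly the "public statement that isn't really true"):

    default_timeout = 500

    def connect(timeout=None):
        if timeout is None:
            timeout = default_timeout   # the real default, buried in the body
        return timeout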
On 2021-10-30 16:12, Chris Angelico wrote:
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.
Interesting. Then why have default arguments at all? What's the point of saying "if omitted, the default is zero" in a machine-readable way? After all, you could just have it in the docstring. And there are plenty of languages where that's the case.
The point of default arguments is to allow users of the function to omit arguments at the call site. It doesn't have anything to do with docstrings.

Or do you mean why not just have all omitted arguments set to some kind of "undefined" value and then check each one in the body of the function and replace it with a default if you want to? Well, for one thing it makes for cleaner error handling, since Python can tell at the time of the call that a required argument wasn't supplied and raise that error right away.

It's sort of like a halfway type check, where you're not actually checking that the correct types of arguments were passed, but at least you know that the arguments that need to be passed were passed and not left out entirely.

For another thing, it does mean that if you know the default at the time you're defining the function, you can specify it then. What you can't do is specify the default if you don't know the default at function definition time, but only know "how you're going to decide what value to use" (which is a process, not a value).
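A tiny example of that "cleaner error handling" point, in current Python (greet is a made-up name): the error is raised at the call, before any of the body runs:

    def greet(name, greeting="hello"):
        return f"{greeting}, {name}"

    greet("world")   # fine
    # greet()        # TypeError: greet() missing 1 required positional argument: 'name'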
I'm of the opinion that having more information machine-readable is always better. Are you saying that it isn't? Or alternatively, that it's only useful when it fits in a strict subset of constant values (values that don't depend on anything else, and can be calculated at function definition time)?
Now wait a minute, before you said the goal was for it to be human readable, but now you're saying it's about being machine readable! :-)

What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:

    # this
    if argument is undefined:
        argument = some_constant_value

    # and this
    if argument is undefined:
        # arbitrary code here

I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.

Based on some of your other posts, I'm guessing that what you mean about machine readability is that you appreciate certain kinds of labor-saving "self-documentation" techniques, whereby when we write the machine-readable code, the interpreter automatically derives some human-readable descriptions for stuff. For instance when we write `def foo` we're just defining an arbitrary symbol to be used elsewhere in the code, but if we get an exception Python doesn't just tell us "exception in function number 1234" or the line number, but also tells us the function name.

And yeah, I agree that can be useful. And I agree that it would be "nice" if we could write "len(a)" without quotes as machine-readable code, and then have that stored as some human-readable thing that could be shown when appropriate. But if that's nice, why is it only nice in function arguments? Why is it only nice to be able to associate the code `len(a)` with the human-readable string "len(a)" just when that string happens to occur in a function signature?

On top of that, even if I agree that that is useful, I see the benefit of that in this case (generating docstrings based on default arguments) as very marginal. I think I agree with the spirit of what you mean by "having more information machine-readable is always good", but of course I don't agree that that's literally true --- because you have to balance that good against other goods. In this case, perhaps most notably, we have to balance it against the cognitive load of having two different ways to write arguments, which will have quite different semantics, which based on current proposals are going to differ in a single character, and both of which can be interleaved arbitrarily in the function signature. That's leaving aside all the other questions about the more subtle details of the proposal (like mutual references between defaults), which will only increase the potential cognitive burden for code readers.

So yes, it's true that adding convenience functions to derive human-readable forms from machine-readable code is handy, but it's not ALWAYS automatically good regardless of other considerations, and I don't see that it outweighs the costs here. The benefit of autogenerating the string "len(a)" from the argument spec isn't quite zero but it's pretty tiny.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 12:17 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 16:12, Chris Angelico wrote:
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.
Interesting. Then why have default arguments at all? What's the point of saying "if omitted, the default is zero" in a machine-readable way? After all, you could just have it in the docstring. And there are plenty of languages where that's the case.
The point of default arguments is to allow users of the function to omit arguments at the call site. It doesn't have anything to do with docstrings.
That's optional parameters. It's entirely possible to have optional parameters with no concept of defaults; it means the function can be called with or without those arguments, and the corresponding parameters will either be set to some standard "undefined" value, or in some other way marked as not given.
Or do you mean why not just have all omitted arguments set to some kind of "undefined" value and then check each one in the body of the function and replace it with a default if you want to? Well, for one thing it makes for cleaner error handling, since Python can tell at the time of the call that a required argument wasn't supplied and raise that error right away.
That's the JavaScript way - every parameter is optional. But it's entirely possible to have some mandatory and some optional, while still not having a concept of defaults. Python, so far, has completely conflated those two concepts: if a parameter is optional, it always has a default argument value. (The converse is implied - mandatory args can't have defaults.)
It's sort of like a halfway type check, where you're not actually checking that the correct types of arguments were passed, but at least you know that the arguments that need to be passed were passed and not left out entirely.
Given that Python doesn't generally have any sort of argument type checking, that's exactly what we'll get by default.
For another thing, it does mean that if you know the default at the time you're defining the function, you can specify it then. What you can't do is specify the default if you don't know the default at function definition time, but only know "how you're going to decide what value to use" (which is a process, not a value).
Right. That's the current situation.
I'm of the opinion that having more information machine-readable is always better. Are you saying that it isn't? Or alternatively, that it's only useful when it fits in a strict subset of constant values (values that don't depend on anything else, and can be calculated at function definition time)?
Now wait a minute, before you said the goal was for it to be human readable, but now you're saying it's about being machine readable! :-)
Truly machine readable is the best: any tool can know exactly what will happen. That is fundamentally not possible when the default value is calculated. Mostly machine readable means that the machine can figure out what the default is, even if it doesn't know what that means. My proposal (not in the reference implementation as yet) is to have late-bound defaults contain a marker saying "the default will be len(a)", even though the "len(a)" part would be just a text string.
What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:
    # this
    if argument is undefined:
        argument = some_constant_value

    # and this
    if argument is undefined:
        # arbitrary code here
I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
Based on some of your other posts, I'm guessing that what you mean about machine readability is that you appreciate certain kinds of labor-saving "self-documentation" techniques, whereby when we write the machine-readable code, the interpreter automatically derives some human-readable descriptions for stuff. For instance when we write `def foo` we're just defining an arbitrary symbol to be used elsewhere in the code, but if we get an exception Python doesn't just tell us "exception in function number 1234" or the line number, but also tells us the function name.
And yeah, I agree that can be useful. And I agree that it would be "nice" if we could write "len(a)" without quotes as machine-readable code, and then have that stored as some human-readable thing that could be shown when appropriate. But if that's nice, why is it only nice in function arguments? Why is it only nice to be able to associate the code `len(a)` with the human-readable string "len(a)" just when that string happens to occur in a function signature?
It would be very nice to have that feature for a number of places. It's been requested for assertions, for instance. If that subfeature becomes more generally available, the language will be the richer for it.
So yes, it's true that adding convenience functions to derive human-readable forms from machine-readable code is handy, but it's not ALWAYS automatically good regardless of other considerations, and I don't see that it outweighs the costs here. The benefit of autogenerating the string "len(a)" from the argument spec isn't quite zero but it's pretty tiny.
It's mainly about writing expressive code, which can then be interpreted by humans AND machines. It's about writing function defaults as function defaults, not working around a technical limitation. It's about writing function headers such that they have what function headers should have, allowing the function body to contain only the function body.

We could just write every function with *args, **kwargs, and then do all argument checking inside the function. We don't do this, because it's the job of the function header to manage this. It's not the function body's job to replace placeholders with actual values when arguments are omitted.

ChrisA
On Sat, Oct 30, 2021 at 6:32 PM Chris Angelico <rosuav@gmail.com> wrote:
We could just write every function with *args, **kwargs, and then do all argument checking inside the function. We don't do this, because it's the job of the function header to manage this. It's not the function body's job to replace placeholders with actual values when arguments are omitted.
This is actually a great point, and I don't think it's a straw man -- a major project I'm working on is highly dynamic, with a lot of subclassing with complex __init__ signatures. And due to laziness or, to be generous, attempts to be DRY, we have a LOT of mostly *args, **kwargs parameterizations, and the result is that, e.g., a typo in a parameter name doesn't get caught till the very top of the class hierarchy, and it's REALLY hard to tell where the issue is. (And there is literally code manipulating kwargs, like: thing = kwargs.get('something').)

I'm pushing to refactor that code somewhat to have more clearly laid out function definitions, even if that means some repetition -- after all, we need to document it anyway, so why not have the interpreter do some of the checking for us?

The point is: clearly specifying what's required, what's optional, and what the defaults are if optional, is really, really useful -- and this PEP will add another very handy feature to that.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
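A made-up miniature of the failure mode described here (Base and Widget are illustrative names): the typo'd keyword sails through every **kwargs layer and only surfaces, if at all, far from the call site:

    class Base:
        def __init__(self, timeout=10, **kwargs):
            if kwargs:
                # Only at the very top of the hierarchy does the typo
                # surface - and nothing here says which call site sent it.
                raise TypeError(f"unexpected arguments: {kwargs}")
            self.timeout = timeout

    class Widget(Base):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)

    Widget(timeoutt=5)   # typo: the error is raised in Base, not here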
On Sat, Oct 30, 2021 at 06:52:33PM -0700, Christopher Barker wrote:
The point is: clearly specifying what's required, what's optional, and what the defaults are if optional, is really, really useful -- and this PEP will add another very handy feature to that.
+1

Earlier I said that better help() is "icing on the cake". I like icing on my cake, and I think that having late-bound defaults clearly readable without any extra effort is a good thing. The status quo is that if you introspect a function's parameters, you will see the sentinel, not the actual default value, or the expression that gives the default value, that the body of the function actually uses.

    >>> inspect.signature(bisect.bisect_left)
    <Signature (a, x, lo=0, hi=None, *, key=None)>

I challenge anyone to honestly say that if that signature read:

    <Signature (a, x, lo=0, hi=len(a), *, key=None)>

they would not be able to infer the meaning, or that Python would be a worse language if the interpreter managed the evaluation of that default so you didn't have to. And if you really want to manage the late evaluation of defaults yourself, you will still be able to.

-- Steve
On Sun, Oct 31, 2021 at 3:31 PM Steven D'Aprano <steve@pearwood.info> wrote:
>>> inspect.signature(bisect.bisect_left)
<Signature (a, x, lo=0, hi=None, *, key=None)>
I challenge anyone to honestly say that if that signature read:
<Signature (a, x, lo=0, hi=len(a), *, key=None)>
they would not be able to infer the meaning, or that Python would be a worse language if the interpreter managed the evaluation of that default so you didn't have to.
There is a downside: it is possible to flat-out lie to the interpreter, by mutating bisect_left.__defaults__, so that help() will give a completely false signature. But if you want to shoot yourself in the foot, there are already plenty of gorgeous guns available. Behold, the G3SG1 "High Seas" of footguns:
>>> def spam(x, y, z="foo", *, count=4): ...
...
>>> def ham(a, *, n): ...
...
>>> spam.__wrapped__ = ham
>>> inspect.signature(spam)
<Signature (a, *, n)>
Ahhhhh whoops. We just managed to lie to ourselves. Good job, us. ChrisA
On Sun, Oct 31, 2021 at 03:43:25PM +1100, Chris Angelico wrote:
There is a downside: it is possible to flat-out lie to the interpreter, by mutating bisect_left.__defaults__, so that help() will give a completely false signature.
>>> def func(arg="finest green eggs and ham"):
...     pass
...
>>> inspect.signature(func)
<Signature (arg='finest green eggs and ham')>
>>>
>>> func.__defaults__ = ("yucky crap",)
>>> inspect.signature(func)
<Signature (arg='yucky crap')>

If help, or some other tool, is caching the function signature, perhaps it shouldn't :-)
But if you want to shoot yourself in the foot, there are already plenty of gorgeous guns available.
Indeed. Beyond avoiding segmentation faults, I don't think we need to care about people who mess about with the public attributes of functions. You can touch, you can even change them, but keeping the function working is no longer our responsibility at that point. If you change the defaults, you shouldn't get a seg fault when you call the function, but you might get an exception. -- Steve
On Sun, Oct 31, 2021 at 4:15 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 03:43:25PM +1100, Chris Angelico wrote:
There is a downside: it is possible to flat-out lie to the interpreter, by mutating bisect_left.__defaults__, so that help() will give a completely false signature.
>>> def func(arg="finest green eggs and ham"):
...     pass
...
>>> inspect.signature(func)
<Signature (arg='finest green eggs and ham')>
>>>
>>> func.__defaults__ = ("yucky crap",)
>>> inspect.signature(func)
<Signature (arg='yucky crap')>
If help, or some other tool is caching the function signature, perhaps it shouldn't :-)
Yep, but with late-bound defaults, there is a slight difference. With early-bound ones, you do have a guarantee that the signature and the behaviour are synchronized; with late-bound, the behaviour is encoded in the function, and the signature has (or will have, once I write that part) some sort of snapshot, either the AST or a source code snippet. (At the moment, they all just show Ellipsis.) So you could reach in and replace the __defaults_extra__ and change how the signature looks:
>>> def foo(a=[], b=>[]): ...
...
>>> dis.dis(foo)
  1           0 QUERY_FAST               1 (b)
              2 POP_JUMP_IF_TRUE         4 (to 8)
              4 BUILD_LIST               0
              6 STORE_FAST               1 (b)
        >>    8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> foo.__defaults__
([], Ellipsis)
>>> foo.__defaults_extra__
(None, '')
The first slot of __defaults_extra__ indicates that the first default is early-bound, but the second one will be the description - "[]" - that would get used in inspect/help. Replacing that would let you change what the default appears to be.

I don't think this is a major problem. It's no worse than other things you can mess with, and if you do that sort of thing, you get only what you asked for; there's no way you can get a segfault or even an exception, as long as you use either None or a string.

(I should probably have some validation to make sure that those are the only two types in the tuple. Will jot that down as a TODO.)

Changing whether the extra slot is None or not is amusing, though still not particularly significant. If you have an early-bound default of Ellipsis, changing extra from None to a string will pretend that it's a late-bound with that value. (If the default value isn't Ellipsis, then according to the spec, the extra should be ignored; but it's possible that some tools will end up using extra first, which would mean they'd be deceived regardless of the actual value.) The behaviour will actually be UnboundLocalError, but inspecting the signature would show the claimed value. On the flip side, if you have a late-bound default and change the extra from a string to None, it will turn it into an early default of Ellipsis, with a small amount of dead code at the start of the function.

All of this is implementation details though. What I'll document is that changing __defaults_extra__ requires a tuple of Nones and/or strings (and __kwdefaults_extra__ requires a dict mapping strings to None and/or strings), and I won't recommend actually changing it.
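For what it's worth, a sketch of how a tool might consume these dunders, assuming the semantics described above for the reference implementation. __defaults_extra__ does not exist in any released Python, so the sketch falls back gracefully:

    def describe_defaults(func):
        # Per the reference implementation discussed here: None in the
        # extra tuple means "early-bound"; anything else marks late-bound,
        # with the string (when present) being the human-readable form.
        defaults = func.__defaults__ or ()
        extra = getattr(func, "__defaults_extra__", None) or (None,) * len(defaults)
        for value, desc in zip(defaults, extra):
            if desc is None:
                print(f"early-bound default: {value!r}")
            else:
                print(f"late-bound default, shown as: {desc or '<expression>'}")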
But if you want to shoot yourself in the foot, there are already plenty of gorgeous guns available.
Indeed. Beyond avoiding segmentation faults, I don't think we need to care about people who mess about with the public attributes of functions. You can touch, you can even change them, but keeping the function working is no longer our responsibility at that point.
If you change the defaults, you shouldn't get a seg fault when you call the function, but you might get an exception.
Exactly, and that's what happens. I suppose in some cases it might be nice to get the exception when you assign to __defaults_extra__, but it's not that big a deal if it results in an exception when you call the function. Generally, I would expect that most uses of these dunders will be read-only, or copying them from some other function. Not a lot else. ChrisA
On Sun, Oct 31, 2021 at 5:25 PM Chris Angelico <rosuav@gmail.com> wrote:
I don't think this is a major problem. It's no worse than other things you can mess with, and if you do that sort of thing, you get only what you asked for; there's no way you can get a segfault or even an exception, as long as you use either None or a string.
(I should probably have some validation to make sure that those are the only two types in the tuple. Will jot that down as a TODO.)
Since __kwdefaults__ and __kwdefaults_extra__ are mutable dicts, there's not a lot of point trying to validate them on attribute assignment. Oh well. Would have been nicer to catch errors earlier. Not that it makes a huge difference. None is the most significant value here (it means "that's an early-bound default, even though it's Ellipsis"), and everything else currently just means "that's a late-bound default". ChrisA
On 2021-10-30 18:29, Chris Angelico wrote:
What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:
# this
if argument is undefined:
    argument = some_constant_value
# and this
if argument is undefined:
    # arbitrary code here
I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.

I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):

    def foo(a=1, b="two", c@=len(b), d@=a+c):

You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments? The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?

Currently, every argument default is a first-class value. As I understand it, your proposal breaks that assumption, and now argument defaults can be some kind of "construct" that is not a first-class value, not any kind of object, just some sort of internal entity that no one can see or manipulate in any way; it just gets automatically evaluated later.

I really don't like that. One of the things I like about Python is the "everything is an object" approach under which most of the things that programmers work with, apart from a very few base syntactic constructs, are objects. Many previous expansions to the language, like decorators, context managers, the iteration protocol, etc., worked by building on this object model. But this proposal seems to diverge quite markedly from that.

If the "late-bound default" is not an object of some kind just like the early-bound ones are, then I can't agree that both "are argument defaults".

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 1:03 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:
# this
if argument is undefined:
    argument = some_constant_value
# and this
if argument is undefined:
    # arbitrary code here
I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
The default for a is the integer 1. The default for b is the string "two". The default for c is the length of b. The default for d is the sum of a and c.
The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
You're assuming that every default is a *default value*. That is the current situation, but it is by no means the only logical way a default can be defined. See above: c's default is the length of b, which is presumably an integer, and d's default is the sum of that and a, which is probably also an integer (unless a is passed as a float or something).
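In today's Python, the nearest spelling of that would be something like this (a sketch, with the usual caveat that it breaks down if None is ever a meaningful value for c or d):

    def foo(a=1, b="two", c=None, d=None):
        if c is None:
            c = len(b)   # "the length of b"
        if d is None:
            d = a + c    # "the sum of a and c"
        return a, b, c, d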
Currently, every argument default is a first-class value. As I understand it, your proposal breaks that assumption, and now argument defaults can be some kind of "construct" that is not a first class value, not any kind of object, just some sort of internal entity that no one can see or manipulate in any way, it just gets automatically evaluated later.
I really don't like that. One of the things I like about Python is the "everything is an object" approach under which most of the things that programmers work with, apart from a very few base syntactic constructs, are objects. Many previous expansions to the language, like decorators, context managers, the iteration protocol, etc., worked by building on this object model. But this proposal seems to diverge quite markedly from that.
What object is this?

    a if random.randrange(2) else b

Is that an object? No; it's an expression. It's a rule. Not EVERYTHING is an object. Every *value* is an object, and that isn't changing.

Or here, what value does a have?

    def f():
        if 0:
            a = 1
        print(a)

Does it have a value? No. Is it an object? No. The lack of value is not itself a value, and there is no object that represents it.
If the "late-bound default" is not an object of some kind just like the early-bound ones are, then I can't agree that both "are argument defaults".
Then we disagree. I see them both as perfectly valid defaults - one is a default value, the other is a default expression. ChrisA
On 2021-10-30 19:11, Chris Angelico wrote:
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
The default for a is the integer 1. The default for b is the string "two". The default for c is the length of b. The default for d is the sum of a and c.
The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
You're assuming that every default is a *default value*. That is the current situation, but it is by no means the only logical way a default can be defined. See above: c's default is the length of b, which is presumably an integer, and d's default is the sum of that and a, which is probably also an integer (unless a is passed as a float or something).
Well, at least that clarifies matters. :-)

I was already -1 on this but this moves me to a firm -100.

"The length of b" is a description in English, not any kind of programming construct. "The length of b" has no meaning in Python. What we store for the default (even if we don't want to call it a value) has to be a Python construct, not a human-language description. I could say "the default of b is the notion of human frailty poured into a golden goblet", and that would be just as valid as "the length of b" as a description and just as meaningless in terms of Python's data model.

The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".

Apart from all the other things about this proposal I don't support, I don't support the creation of a mysterious "expression" which is not a first-class value and cannot be used or evaluated in any way except automatically in the context of calling a function.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 2:23 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 19:11, Chris Angelico wrote:
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
The default for a is the integer 1. The default for b is the string "two". The default for c is the length of b. The default for d is the sum of a and c.
The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
You're assuming that every default is a *default value*. That is the current situation, but it is by no means the only logical way a default can be defined. See above: c's default is the length of b, which is presumably an integer, and d's default is the sum of that and a, which is probably also an integer (unless a is passed as a float or something).
Well, at least that clarifies matters. :-)
I was already -1 on this but this moves me to a firm -100.
"The length of b" is a description in English, not any kind of programming construct. "The length of b" has no meaning in Python. What we store for the default (even if we don't want to call it a value) has to be a Python construct, not a human-language description. I could say "the default of b is the notion of human frailty poured into golden goblet", and that would be just as valid as "the length of b" as a description and just as meaningless in terms of Python's data model.
We have a very good construct to mean "the length of b". It looks like this:

    len(b)

In CPython bytecode, it looks something like this:

    LOAD_GLOBAL "len"
    LOAD_FAST "b"
    CALL_FUNCTION with 1 argument

(Very approximately, anyway.)

And that's exactly what PEP 671 uses: in the source code, it uses len(b), and in the compiled executable, the corresponding bytecode sequence. Since we don't have a way to define human frailty and golden goblets, we don't have a good way to encode that in either source code or bytecode.
The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".
How important is this? Yes, it's the status quo, but if we evaluate every proposal by how closely it retains the status quo, nothing would ever be changed. Let's look at type checking for a moment, and compare a few similar functions:

    def spam(stuff: list, n: int):
        ...

Takes a list and an integer. Easy. The integer is required.

    def spam(stuff: list, n: int = None):
        if n is None:
            n = len(stuff)
        ...

Takes a list, and either an integer or None. What does None mean? Am I allowed to pass None, or is it just a construct that makes the parameter optional?

    _sentinel = object()
    def spam(stuff: list, n: int = _sentinel):
        if n is _sentinel:
            n = len(stuff)
        ...

Takes a list, and either an integer or.... some random object. Can anyone with type checking experience (eg MyPy etc) say how this would best be annotated? Putting it like this doesn't work, nor does Optional[int].

    def spam(stuff: list, n: int => len(stuff)):
        ...

MyPy (obviously) doesn't yet support this syntax, so I can't test this, but I would assume that it would recognize that len() returns an integer, and accept this. (That's what it does if I use "= len(stuff)" and a global named stuff.)

Currently, argument defaults have to be able to be evaluated at function definition time. It's fine if they can't be precomputed at compile time. Why do we have this exact restriction, neither more nor less? Is that really as important an invariant as you say?
Apart from all the other things about this proposal I don't support, I don't support the creation of a mysterious "expression" which is not a first class value and cannot be used or evaluated in any way except automatically in the context of calling a function.
You use similarly mysterious "expressions" all the time. They can't be treated as first-class values, but they can be used in the contexts they are in. This is exactly the same. You've probably even used expressions that are only evaluated conditionally:

    recip = 1 / x if x else 0

Is the "1 / x" a first-class value here? No - and that's why if/else is a compiler construct rather than a function.

If you wouldn't use this feature, that's fine, but it shouldn't stand or fall on something that the rest of the language doesn't follow.
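To make the conditional-evaluation point concrete, here's a tiny runnable demo:

    def recip(x):
        # The 1/x sub-expression is only evaluated when x is truthy,
        # so recip(0) returns 0 instead of raising ZeroDivisionError.
        return 1 / x if x else 0

    assert recip(4) == 0.25
    assert recip(0) == 0

ChrisA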
On 2021-10-30 20:44, Chris Angelico wrote:
The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".
How important is this? Yes, it's the status quo, but if we evaluate every proposal by how closely it retains the status quo, nothing would ever be changed. Let's look at type checking for a moment, and compare a few similar functions: <snip>
As I've said before, the problem is that the benefit of this feature is too small to justify this large change to the status quo.
You use similarly mysterious "expressions" all the time. They can't be treated as first-class values, but they can be used in the contexts they are in. This is exactly the same. You've probably even used expressions that are only evaluated conditionally:
recip = 1 / x if x else 0
Is the "1 / x" a first-class value here? No - and that's why if/else is a compiler construct rather than a function.
No, but that example illustrates why: the expression there is not a first-class value, but that's fine because it also has no independent status. The expression as a whole is evaluated and the expression as a whole has a value and that is what you work with. You don't get to somehow stash away just the 1/x part and use it later, but without evaluating it yet. That is not even close to the situation envisioned in the proposal under discussion, because in the proposal this mysterious expression (the argument default), although not a first-class value, is going to be stored and used independently, yet without evaluating it and without "reifying" it into a function or other such object.

It's obvious that there are tons of things that aren't first-class values in Python (and in any language). The plus sign isn't a first-class value; the sequence of characters `a = "this".cou` in source code isn't a first-class value. But these usages aren't the same as what's in the proposal under discussion here. I'm not sure if you actually disagree with this or if you're just being disingenuous with your examples.

Anyway, I appreciate your engaging with me in this discussion, especially since I'm just some random guy whose opinion is not of major importance. :-) But I think we're both kind of just repeating ourselves at this point. I acknowledge that your proposal is essentially clear and is aimed at serving a genuine use case, but I still see it as involving too much of a departure from existing conventions, too much hairiness in the details, and too little real benefit, to justify the change.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 3:08 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 20:44, Chris Angelico wrote:
The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".
How important is this? Yes, it's the status quo, but if we evaluate every proposal by how closely it retains the status quo, nothing would ever be changed. Let's look at type checking for a moment, and compare a few similar functions: <snip>
As I've said before, the problem is that the benefit of this feature is too small to justify this large change to the status quo.
You use similarly mysterious "expressions" all the time. They can't be treated as first-class values, but they can be used in the contexts they are in. This is exactly the same. You've probably even used expressions that are only evaluated conditionally:
recip = 1 / x if x else 0
Is the "1 / x" a first-class value here? No - and that's why if/else is a compiler construct rather than a function.
No, but that example illustrates why: the expression there is not a first-class value, but that's fine because it also has no independent status. The expression as a whole is evaluated and the expression as a whole has a value and that is what you work with. You don't get to somehow stash away just the 1/x part and use it later, but without evaluating it yet. That is not even close to the situation envisioned in the proposal under discussion, because in the proposal this mysterious expression (the argument default), although not a first-class value, is going to be stored and used independently, yet without evaluating it and without "reifying" it into a function or other such object.
It's actually the same. There's no "stashing away" of the expression part; it's just part of the function, same as everything else is. (I intend to have a string representation of it stashed away, but that IS a first-class value - a str object, or a PyUnicode if you look in the C API.) You can't use that later in any way other than by calling the function while not passing it the corresponding argument. You cannot use it independently, and you cannot retrieve an unevaluated version of it.
It's obvious that there are tons of things that aren't first-class values in Python (and in any language). The plus sign isn't a first class value, the sequence of characters `a = "this".cou` in source code isn't a first class value. But these usages aren't the same as what's in the proposal under discussion here. I'm not sure if you actually disagree with this or if you're just being disingenuous with your examples.
Right. Some things are simply invalid, others are compiler constructs. I'm pointing out that compiler constructs are very real things, despite not being values. I'm not sure about your ".cou" example; the only reason that isn't a first-class value is that, when you try to look it up, you'll get an error. Or are you saying that an assignment statement isn't a value? If so, then it's the same thing again: a compiler construct, a syntactic feature.

In general, syntactic features fall into three broad categories:

1) Literals, which have clear values
2) Expressions, which will yield values when evaluated
3) Everything else. Mostly statements. No value ever.

For instance, the notation >> 42j << is a complex literal (an imaginary number), and >> a+b << is an expression representing a sum. Thanks to constant folding, we can pretend that >> 3+4j << is a literal too, although technically it's an expression.

Argument defaults are *always* expressions. They can be evaluated at function definition time, or - if PEP 671 is accepted - at function call time. The expression itself is nothing more than a compiler construct, and isn't a first-class value. If it's early-bound, then it'll be evaluated at def time to yield a value which then gets saved on the function object and then gets assigned any time the parameter has no value; if it's late-bound, then any time the parameter has no value, it'll be evaluated at call time, to yield a value which then gets assigned to the parameter. Either way, the same steps happen, just in a different order.
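If it helps, here's the order-of-operations difference sketched in today's Python (a sketch only - the real proposal compiles the late-bound expression into the function body rather than using a sentinel; expensive_computation is a hypothetical stand-in expression):

    def expensive_computation():  # hypothetical stand-in expression
        return 42

    # Early-bound: the expression is evaluated once, at def time.
    def f(x=expensive_computation()):
        return x

    # Late-bound: roughly what "def g(x=>expensive_computation())" would mean.
    _missing = object()
    def g(x=_missing):
        if x is _missing:
            x = expensive_computation()  # evaluated on each call that omits x
        return x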
Anyway, I appreciate your engaging with me in this discussion, especially since I'm just some random guy whose opinion is not of major importance. :-) But I think we're both kind of just repeating ourselves at this point. I acknowledge that your proposal is essentially clear and is aimed at serving a genuine use case, but I still see it as involving too much of a departure from existing conventions, too much hairiness in the details, and too little real benefit, to justify the change.
I think everyone's opinion is of major importance here, because you're taking the time to discuss the feature. I could go on a long rant about this, but the short version is: neither representative democracy nor pure democracy is nearly as good as a system where those in charge (here, the PSF) can get informed debate from that specific subset of people who actually care about something :) And it's fine for you to believe that this shouldn't happen. As long as the debate is respectful, courteous, professional, and based on facts, not people ("this idea sucks because YOU thought of it"), it's worth having! I'm happy to continue answering questions or countering arguments, because I believe that this IS of more value than its costs; but I'm also open to being proven wrong on that point (and believe you me, the work of implementing it showed me that the cost isn't all where I thought it would be). ChrisA
Brendan Barnwell writes:
As I've said before, the problem is that the benefit of this feature is too small to justify this large change to the status quo.
I don't disagree with this (but the whole thing is a YAGNI, for values of "you" = "me", so I'm uncomfortable agreeing as long as "x = x or default" keeps working ;-).
That is not even close to the situation envisioned in the proposal under discussion, because in the proposal this mysterious expression (the argument default), although not a first-class value, is going to be stored and used independently,
As far as I can see, this is exactly similar to the Lisp &aux, in the sense that there's no let form that a macro can grab in the body of the definition.[1] But the defaulted argument (see what I did there?) in both cases *is* a first-class value, stored in the usual place. In the case of Lisp &aux, as the current binding of the symbol on entry to the body of the defun, and in the case of Chris's proposal, as the current binding of the name on entry to the function body.

It's always true in the case of Python that introspecting code is a fraught enterprise. Even in the case of

    def foo(x, y):
        return x + y

    foo(1, 1)

it's not possible to introspect *how* foo produced "2" without messing with the byte code. I don't see this as very different: as has been pointed out several times,

    def bar(x = None):
        x = [] if x is None else x
        return x

cannot be introspected without messing with the byte code, either. In both cases, the expression that produces the first-class object that x is bound to is explicit in the source code, just in different places, and invisible in the compiled code (unless you're willing to mess with the byte code, in which case you know where to find it after all).

Footnotes:
[1] True, said macro *can* see the &aux clause in the lambda form, reconstruct the let form, and do its evil work on the lambda, but such omniscience is possible because Lisp is the Language of the Gods, and rarely used by frail mortals.
On Sat, Oct 30, 2021 at 7:03 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
I'm trying to figure out if this is a (English-language) semantics issue or a real conceptual issue. Yes, we don't know what the default "value" is (by either the general English definition or the Python definition), but we do know that the default will be set to the result of the expression when evaluated in the context of the function, which is very clear to me, at least.

The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
yes: the default for c is the result of evaluating `len(b)`, and the default for d is the result of evaluating `a+c`.

In contrast, what are the defaults in this case?

    def foo(a=1, b="two", c=None, d=None):

obviously, they are None -- but how useful is that? How about:

    def foo(a=1, b="two", c=None, d=None):
        """
        ...
        if None, c is computed as the length of the value of b,
        and d is computed as that length plus a
        """
        if c is None:
            c = len(b)
        if d is None:
            d = a + c

Is that really somehow more clear? And you'd better hope that the docstring matches the code!

I'm having a really hard time seeing how this PEP would make anything less clear or confusing.

-CHB
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion. This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.

As prior art, consider Common Lisp's lambda expressions, which are effectively anonymous functions (such expressions are often bound to names, which is how Common Lisp creates named functions); see https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node64.html for reference. The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.

(OTOH, Common Lisp's lambda expressions take one more step and include so-called "aux variables," which aren't parameters at all, but variables local to the function itself. I don't have enough background or context to know why these are included in a lambda expression.)
On Sun, Oct 31, 2021 at 2:43 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion. This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).

And yes; there will always be cases where you can't define the default with a simple expression. For instance, a one-arg lookup might raise an exception where a two-arg one could return a default value. That's currently best written with a dedicated object:

    _sentinel = object()
    def fetch(thing, default=_sentinel):
        ... attempt to get stuff
        if thing in stuff:
            return stuff[thing]
        if default is _sentinel:
            raise ThingNotFoundError
        return default

In theory, optional arguments without defaults could be written something like this:

    def fetch(thing, default=pass):
        ... as above
        if not exists default:
            raise ThingNotFoundError
        return default

But otherwise, there has to be some sort of value for every parameter. (I say this as a theory, but actually, the reference implementation of PEP 671 has code very similar to this. There's a bytecode QUERY_FAST which yields True if a local has a value, False if not. It's more efficient than "try: default; True; except UnboundLocalError: False" but will have the same effect.)
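For the curious, here's that try/except equivalence as runnable code today (just a demo of the semantics QUERY_FAST checks, not the implementation):

    def has_value(flag):
        if flag:
            x = 1
        # Roughly what "QUERY_FAST x" asks: does the local x have a value?
        try:
            x
            return True
        except UnboundLocalError:
            return False

    assert has_value(True) is True
    assert has_value(False) is False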
The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
Lisp's execution model is quite different from Python's, but I'd be curious to hear more about this. Can you elaborate? ChrisA
On Sun, Oct 31, 2021 at 02:56:36PM +1100, Chris Angelico wrote:
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I was just thinking of suggesting that to you, so I'm glad to see you're much faster on the uptake than I am!

Of course all parameters are syntactically an expression, including now:

    # the status quo
    def func(arg=CONFIG.get('key', NULL)):

The default expression is evaluated at function definition time, and the result of that (an object, a.k.a. a value) is cached in the function object for later use. With late-binding:

    def func(@arg=CONFIG.get('key', NULL)):

the expression is stashed away somewhere (implementation details), in some form (source code? byte-code? an AST?) rather than immediately evaluated. At function call time, the expression is evaluated, and the result (an object, a.k.a. a value) is bound to the parameter.

In neither case is it correct to say that the default value of arg is the *expression* `CONFIG.get('key', NULL)`; it is, in both the early and late bound cases, the *result* of *using* (evaluating) the expression to generate a value.

https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction

I'm fairly confident that everyone understands that:

    "the default value is CONFIG.get('key', NULL)"

is shorthand for the tediously long and pedantic explanation that it's not the expression itself that is the default value, but the result of evaluating the expression. Just like we understand it here:

    if arg is None:
        arg = CONFIG.get('key', NULL)

The only difference is when the expression is evaluated. If we can understand that arg gets set to the result of the expression in the second case (the None sentinel), we should be able to understand it if the syntax changes to late-bound parameters.

-- Steve
On Sun, Oct 31, 2021 at 4:37 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 02:56:36PM +1100, Chris Angelico wrote:
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I was just thinking of suggesting that to you, so I'm glad to see you're much faster on the uptake than I am!
Of course all parameters are syntactically an expression, including now:
# the status quo
def func(arg=CONFIG.get('key', NULL)):
The default expression is evaluated at function definition time, and the result of that (an object, a.k.a. a value) is cached in the function object for later use. With late-binding:
def func(@arg=CONFIG.get('key', NULL)):
the expression is stashed away somewhere (implementation details), in some form (source code? byte-code? an AST?) rather than immediately evaluated. At function call time, the expression is evaluated, and the result (an object, a.k.a. a value) is bound to the parameter.
The code for it is part of the byte-code, and I'm planning to have either the source code or the AST (or a reconstituted source code) stored for documentation purposes. This, in fact, is true at compilation time regardless of whether it's early-bound or late-bound. Consider:

    def make_func():
        def func(arg=CONFIG.get('key', NULL)):
            ...

Is the expression for this default argument value "stashed away" somewhere? Well, kinda, I guess. It's part of the code that the 'def' statement produces, and will be run when make_func() runs. The difference is that here:

    def make_func():
        def func(arg=>CONFIG.get('key', NULL)):
            ...

the code is part of func, rather than make_func.

So you're absolutely right: either way, the default is an expression. I'm using the term "default expression" to mean that the expression is evaluated at call time rather than def time, but I'm open to other terminology.
In neither case is it correct to say that the default value of arg is the *expression* `CONFIG.get('key', NULL)`, it is in both the early and late bound cases the *result* of *using* (evaluating) the expression to generate a value.
https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction
I'm fairly confident that everyone understands that:
"the default value is CONFIG.get('key', NULL)"
is shorthand for the tediously long and pedantic explanation that it's not the expression itself that is the default value, but the result of evaluating the expression. Just like we understand it here:
if arg is None:
    arg = CONFIG.get('key', NULL)
The only difference is when the expression is evaluated.
Exactly. ChrisA
On 2021-10-31 at 14:56:36 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Oct 31, 2021 at 2:43 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion. This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I still see anything more complicated than a constant or an extremely simple expression (len(a)? well, ok, maybe; (len(a) if is_prime(len(a)) else next_larger_prime(len(a)))? don't push it) as no longer being a default, but something more serious, but I don't have a better name for it than "computation" or "part of the function" or even "business logic" or "a bad API."
And yes; there will always be cases where you can't define the default with a simple expression. For instance, a one-arg lookup might raise an exception where a two-arg one could return a default value ...
Am I getting ahead of myself, or veering into the weeds, if I ask whether you can catch the exception or what the stacktrace might show? (At this point, that's probably more of a rhetorical question. Again, this is an itch I don't have, so I probably won't use it much.)
The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
Lisp's execution model is quite different from Python's, but I'd be curious to hear more about this. Can you elaborate?
https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node64.html explains it in great detail with examples; you're interested in &optional, initform, and possibly supplied-p. Summarizing what I think is relevant:

(lambda (&optional x)) is a function with an optional parameter called x. Calling that function without a parameter results in x being bound to nil (Lisp's [overloaded] canonical "false"/undefined/null value) in the function body.

(lambda (&optional (x 4))) is a function with an optional parameter called x with an initform. Calling that function without a parameter results in x being bound to the value 4 in the function body. Calling that function with a parameter results in x being bound to the value of that parameter.

(lambda (&optional (x 4 p))) is a function with an optional parameter called x with an initform and a supplied-p parameter called p. Calling that function without a parameter results in p being bound to nil and x being bound to the value 4 in the function body. Calling that function with a parameter results in p being bound to t (Lisp's canonical "true" value) and x being bound to the value of that parameter.

By default (no pun intended), all of that happens at function call time, before the function body begins, but Lisp has ways of forcing evaluation to take place at other times.

(lambda (a &optional (hi (length a)))) works as expected; it's a function that takes one parameter called a (presumably a sequence), and an optional parameter called hi (which defaults to the length of a, but can be overridden by the calling code).

I am by no means an expert, and again, I tend not to use optional parameters and default values (aka initforms) except in the simplest ways.
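For what it's worth, the closest Python spelling of the supplied-p pattern that I know of is something like this (a sketch using the usual sentinel idiom):

    _unsupplied = object()

    def f(x=_unsupplied):
        p = x is not _unsupplied  # plays the role of Lisp's supplied-p
        if not p:
            x = 4                 # plays the role of the initform
        return x, p

    assert f() == (4, False)
    assert f(10) == (10, True)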
On Sat, Oct 30, 2021 at 10:51:42PM -0700, 2QdxY4RzWzUUiLuE@potatochowder.com wrote:
I still see anything more complicated than a constant or an extremely simple expression (len(a)? well, ok, maybe; (len(a) if is_prime(len(a)) else next_larger_prime(len(a)))? don't push it) as no longer being a default, but something more serious, but I don't have a better name for it than "computation" or "part of the function" or even "business logic" or "a bad API."
There is no benefit to using an actual constant as a late-bound default. If the value is constant, then why delay evaluation? You're going to get the same constant one way or another. So linters should flag misuse like:

    func(@arg=0)

List and dict displays ("literals") like [] and {} are a different story, but they aren't constants.

I agree that extremely complex expressions fall under the category of "Don't Do That". But that's a code review and/or linter problem to solve. Most uses of late-binding defaults are going to be short and simple, such as:

* an empty list or dict display;
* call a function;
* access an attribute of self;
* len of another argument.

Right now, defaults can be set to arbitrarily complex expressions. When is the last time you saw a default that was uncomfortably complex?
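Spelled with the PEP's preferred => syntax (not valid Python today, of course), those four look something like this - the names here are made up for illustration, apart from the bisect signature, which is straight from the PEP:

    def spam(items=>[]):                # empty list display
    def ham(when=>time.time()):         # call a function
    def eggs(self, n=>self.count):      # attribute of self
    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):  # len of another argument

-- Steve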
On Sun, Oct 31, 2021 at 4:55 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-31 at 14:56:36 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Oct 31, 2021 at 2:43 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion. This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I still see anything more complicated than a constant or an extremely simple expression (len(a)? well, ok, maybe; (len(a) if is_prime(len(a)) else next_larger_prime(len(a)))? don't push it) as no longer being a default, but something more serious, but I don't have a better name for it than "computation" or "part of the function" or even "business logic" or "a bad API."
If your default doesn't fit in your function header, then you're probably doing things that are too complicated :) Nothing's changing there.
And yes; there will always be cases where you can't define the default with a simple expression. For instance, a one-arg lookup might raise an exception where a two-arg one could return a default value ...
Am I getting ahead of myself, or veering into the weeds, if I ask whether you can catch the exception or what the stacktrace might show?
(At this point, that's probably more of a rhetorical question. Again, this is an itch I don't have, so I probably won't use it much.)
Ah, sorry, I wasn't too clear here. Consider this API:

    _sentinel = object()
    def get_thing(name, default=_sentinel):
        populate_thing_cache(name)
        if name in thing_cache:
            return thing_cache[name]
        if default is not _sentinel:
            return default
        raise ThingNotFoundError

In this case, there's no late-bound default value that could be used here. The code to test whether we got one argument or two has to happen down below. So this sort of code wouldn't change as a result of PEP 671; it still needs to be able to be called with either one argument or two, and the distinction can't be written as "if default is _sentinel: default = ..." at the top of the function.

But in terms of catching exceptions from default expressions: No, you can't, unless you wrap it in a function or something. I don't expect this sort of thing to be a common need, and if it is, the exception-catching part probably needs its own descriptive name.
The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
Lisp's execution model is quite different from Python's, but I'd be curious to hear more about this. Can you elaborate?
https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node64.html explains it in great detail with examples; you're interested in &optional, initform, and possibly supplied-p. Summarizing what I think is relevant:
(lambda (&optional x)) is a function with an optional parameter called x. Calling that function without a parameter results in x being bound to nil (Lisp's [overloaded] canonical "false"/undefined/null value) in the function body.
(lambda (&optional (x 4))) is a function with an optional parameter called x with an initform. Calling that function without a parameter results in x being bound to the value 4 in the function body. Calling that function with a parameter results in x being bound to the value of that parameter.
(lambda (&optional (x 4 p))) is a function with an optional parameter called x with an initform and a supplied-p parameter called p. Calling that function without a parameter results in p being bound to nil and x being bound to the value 4 in the function body. Calling that function with a parameter results in p being bound to t (Lisp's canonical "true" value) and x being bound to the value of that parameter.
Ah okay. So what this gives you is a very clear indication of whether the default value was used or not. I'm not officially recommending this.... actually, let's make that stronger: This is bad code, but it's possible...
>>> def foo(n=>(n_unset := True) and 123):
...     try: n_unset
...     except UnboundLocalError: print("n was set to", n)
...     print("The value of n is:", n)
...
>>> foo()
The value of n is: 123
>>> foo(123)
n was set to 123
The value of n is: 123
Don't. Just don't. :) ChrisA
I'm not sure this answers Chris's question about "technical and emotional baggage," but I hope to clarify the Lisp model a bit. 2QdxY4RzWzUUiLuE@potatochowder.com writes:
As prior art, consider Common Lisp's lambda expressions[...]. The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
No, they're equally flexible about when. The difference is that Lisp *never* evaluates the initform at definition time, while Python *always* evaluates defaults at definition time. Lisp's approach is more consistent conceptually than Chris's proposal.[1] That is, in Lisp the initform is conceptually always a form to be evaluated[2], while in Chris's approach there are default values and default expressions. In practice, Lisp marks default values by wrapping them in a quote form[3]. Thus where Lisp always has an object that is introspectable in the usual way, Chris's proposal has an invisible thunk that can't be introspected that way because it's inlined into the compiled function. From the naive programmer's point of view, Chris vs. Lisp just flips the polarity of marked vs. unmarked. The difference only becomes apparent if you want to do a sort of metaprogramming by manipulating the default in some way. This matters in Lisp because of macros, which can and do cut deeply into list structure of code, and revise it there. I guess it might matter in MacroPy, but I don't see a huge loss in Python 3.
(OTOH, Common Lisp's lambda expressions take one more step and include so-called "aux variables," which aren't parameters at all, but variables local to the function itself. I don't have enough background or context to know why these are included in a lambda expression.)
It's syntactic sugar that does nothing more nor less than wrap a let form binding the aux variables around the body of the definition. Footnotes: [1] But then everything in Lisp is more consistent because the List is the One DataType to Rule Them All and in the darkness bind them. [2] Even the default default of nil is conceptually evaluated: (eval nil) => nil. The compiler is allowed to optimize the evaluation away if it can determine the result, as in the case of nil or a lambda form (both of which eval to themselves). [3] (quote x) or 'x is a special form that does not evaluate its argument, and returns it as-is when the quote form is evaluated.
On Sat, Oct 30, 2021 at 06:54:51PM -0700, Brendan Barnwell wrote:
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
Sure. If you fail to provide an argument for c, the value that is bound to c by default (i.e. the default argument) is len(b), whatever that happens to be. If you fail to provide a value for d, the value that is bound to d by default is a+c.

There is nothing in the concept of "default argument" that requires it to be known at function-definition time, or compile time, or when the function is typed into the editor.

Do you have a problem understanding me if I say that strftime defaults to the current time? Surely you don't imagine that I mean that it defaults to 15:34:03, which was the time a few seconds ago when I wrote the words "current time". And you probably will understand me if I say that on POSIX systems such as Linux, the default permissions on newly created files are (indirectly) set by the umask.
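That's not hypothetical, either; the stdlib already works this way:

    import time

    # strftime's second argument defaults to "the current time" - a value
    # that cannot be known until the call actually happens.
    time.strftime("%H:%M:%S")                     # now
    time.strftime("%H:%M:%S", time.localtime(0))  # explicitly at the epoch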
Currently, every argument default is a first-class value.
And will remain so. This proposal does not add second-class values to the language. By the time the body of the function is entered, the late-bound parameters will have had their defaults evaluated, and the result bound to the parameter. Inside the function object itself, there will be some kind of opaque blob (possibly a function?) that holds the default's expression for later evaluation. That blob itself will be a first class object, like every other object in Python, even if its internal structure is partially or fully opaque.
As I understand it, your proposal breaks that assumption, and now argument defaults can be some kind of "construct" that is not a first class value, not any kind of object, just some sort of internal entity that no one can see or manipulate in any way, it just gets automatically evaluated later.
Kind of like functions themselves :-)

    >>> (lambda x, y: 2*x + 3**y).__code__.co_code
    b'd\x01|\x00\x14\x00d\x02|\x01\x13\x00\x17\x00S\x00'

The internal structure of that co_code object is a mystery, it is not part of the Python language, only of the implementation, but it remains a first-class value. (My *guess* is that this may be the raw byte-code of the function body.)

One difference will be, regardless of how the expression for the late-bound default is stored, there will be at the very least an API to extract a human-readable string representing the expression.

-- Steve
On Sun, Oct 31, 2021 at 3:57 PM Steven D'Aprano <steve@pearwood.info> wrote:
Kind of like functions themselves :-)
>>> (lambda x, y: 2*x + 3**y).__code__.co_code
b'd\x01|\x00\x14\x00d\x02|\x01\x13\x00\x17\x00S\x00'
The internal structure of that co_code object is a mystery, it is not part of the Python language, only of the implementation, but it remains a first-class value.
(My *guess* is that this may be the raw byte-code of the function body.)
It is; and fortunately, we have a handy tool for examining it:
>>> dis.dis(b'd\x01|\x00\x14\x00d\x02|\x01\x13\x00\x17\x00S\x00')
          0 LOAD_CONST               1
          2 LOAD_FAST                0
          4 BINARY_MULTIPLY
          6 LOAD_CONST               2
          8 LOAD_FAST                1
         10 BINARY_POWER
         12 BINARY_ADD
         14 RETURN_VALUE
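And here's the context those indices point into, reconstructed from the bytecode above (exact contents can vary between Python versions):

    >>> f = lambda x, y: 2*x + 3**y
    >>> f.__code__.co_consts     # LOAD_CONST 1 -> 2, LOAD_CONST 2 -> 3
    (None, 2, 3)
    >>> f.__code__.co_varnames   # LOAD_FAST 0 -> x, LOAD_FAST 1 -> y
    ('x', 'y')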
The exact meaning of this string depends on Python implementation and version. Importantly, though, this bytecode must be interpreted within a particular context (here, the context is the lambda function's code and the function itself), which provides meanings for consts and name lookups. There's no sensible way to inject this into some other function and expect it to mean "2*x + 3**y"; at best, it would actually take co_consts[1] * <co_varnames[0]> + co_consts[2] ** <co_varnames[1]>, but that might not mean anything either. ChrisA
On Sun, Oct 31, 2021 at 12:29:18PM +1100, Chris Angelico wrote:
That's optional parameters. It's entirely possible to have optional parameters with no concept of defaults; it means the function can be called with or without those arguments, and the corresponding parameters will either be set to some standard "undefined" value, or in some other way marked as not given.
That's certainly possible with languages like bash where there are typically (never?) no explicit parameters; the function is expected to pop arguments off the argument list and deal with them as it sees fit. Then a missing argument is literally missing from the argument list, and all the parameter binding logic that the Python interpreter handles for you has to be handled manually by the programmer. Bleh. We could emulate that in Python by having the interpreter flag "optional without a default" parameters in such a way that the parameter remains unbound when called without an argument, but why would we want such a thing? That's truly a YAGNI anti-feature. If you really want to emulate bash, we can just declare the function to take `*args` and manage it ourselves, like bash.
Python, so far, has completely conflated those two concepts: if a parameter is optional, it always has a default argument value. (The converse is implied - mandatory args can't have defaults.)
If a mandatory parameter has a default argument, it would never be used, because the function is always called with an argument for that parameter. So we have a four-way table:

                 Mandatory    Optional
                 Parameters   Parameters
    -----------  -----------  ------------
    No default:  okay         YAGNI [1]
    Default:     pointless    okay
    -----------  -----------  ------------

[1] And if you do need it, it is easy to emulate with a sentinel:

    def func(arg=None):
        if arg is None:
            del arg
        process(arg)  # May raise UnboundLocalError

None of this is relevant to the question of when the default values should be evaluated.

-- Steve
On Sun, Oct 31, 2021 at 3:16 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 12:29:18PM +1100, Chris Angelico wrote:
That's optional parameters. It's entirely possible to have optional parameters with no concept of defaults; it means the function can be called with or without those arguments, and the corresponding parameters will either be set to some standard "undefined" value, or in some other way marked as not given.
That's certainly possible with languages like bash where there are typically (never?) no explicit parameters, the function is expected to pop arguments off the argument list and deal with them as it sees fit. Then a missing argument is literally missing from the argument list, and all the parameter binding logic that the Python interpreter handles for you has to be handled manually by the programmer. Bleh.
We could emulate that in Python by having the interpreter flag "optional without a default" parameters in such a way that the parameter remains unbound when called without an argument, but why would we want such a thing? That's truly a YAGNI anti-feature.
If you really want to emulate bash, we can just declare the function to take `*args` and manage it ourselves, like bash.
Python, so far, has completely conflated those two concepts: if a parameter is optional, it always has a default argument value. (The converse is implied - mandatory args can't have defaults.)
If a mandatory parameter has a default argument, it would never be used, because the function is always called with an argument for that parameter. So we have a four-way table:
                 Mandatory     Optional
                 Parameters    Parameters
    -----------  ------------  ------------
    No default:  okay          YAGNI [1]
    Default:     pointless     okay
    -----------  ------------  ------------
[1] And if you do need it, it is easy to emulate with a sentinel:
def func(arg=None):
    if arg is None:
        del arg
    process(arg)  # May raise UnboundLocalError
None of this is relevant to the question of when the default values should be evaluated.
Agreed on all points, but I think the YAGNIness of it is less strong than you imply. Rather than a sentinel object, this would be a sentinel lack-of-object. The point of avoiding sentinels is, like everywhere else, that the sentinel isn't really a value - it's just a marker saying "this arg wasn't passed". So I agree with you that this isn't a feature of huge value, and it's not one that I'm pushing for, but it is at least internally consistent and well-defined. Usually, what ends up happening is that there's code somewhere saying "if arg is _sentinel:" (or "if arg is None:"), which would have exactly the same meaning without the sentinel, if we had a way to say "if arg isn't set". ChrisA
On Sat, Oct 30, 2021 at 03:52:14PM -0700, Brendan Barnwell wrote:
The way you use the term "default value" doesn't quite work for me. More and more I think that part of the issue here is that with "hi=>len(a)" we aren't providing a default value at all. What we're providing is default *code*. To me a "value" is something that you can assign to a variable, and current (aka "early bound") argument defaults are that.
I think you are twisting the ordinary meaning of "default value" to breaking point in order to argue against this proposal.

When we talk about "default values" for function parameters, we always mean something very simple: when you call the function, you can leave out the argument for some parameter from your call, and it will be automatically be assigned a default value by the time the code in the body of the function runs.

It says nothing about how that default value is computed, or where it comes from; it says nothing about whether it is evaluated at compile time, or function creation time, or when the function is called. In this case, there are at least three models for providing that default value:

1. The value must be something which can be computed by the compiler, at compile time, without access to the runtime environment. Nothing that cannot be evaluated by static analysis can be used as the default. (In practice, that may limit defaults to literals.)

2. The value must be something which can be computed by the interpreter at function definition time. (Early binding.) This is the status quo for Python, but not for other languages like Smalltalk.

3. The value can be computed at function call time. (Late binding.) That's what Lisp (Smalltalk? others?) does.

They are still default values, and it is specious to call them "code". From the perspective of the programmer, the parameter is bound to a value at call time, just like early binding.

Recall that the interpreter still has to execute code to get the early bound value. It doesn't happen by magic: the interpreter needs to run code to fetch the precomputed value and bind it to the parameter. Late binding is *exactly* the same except we leave out the "pre". The thing you get bound to the parameter is still a value, not the code used to generate that value.

The Python status quo (early binding, #2 above) is that if you want to delay the computation until call time, you have to use a sentinel, then calculate the default value yourself inside the body of the function, then bind it to the parameter manually. But that's just a work-around for lack of interpreter support for late binding.
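(The practical difference between models 2 and 3 shows up most clearly with a mutable default -- the familiar demonstration:)

```
# Model 2 (status quo): the default is computed once, at def time.
def append_early(x, target=[]):
    target.append(x)
    return target

print(append_early(1))  # [1]
print(append_early(2))  # [1, 2] -- the same list object, reused across calls

# Model 3 (late binding) would evaluate [] afresh on each call,
# so the second call would print [2].
```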
But these new late-bound arguments aren't really default "values",
Of course they are, in the only sense that matters: when the body of the function runs, the parameter is bound to an object. That object is a value. Where the interpreter got that value from, and when it was evaluated, is neither here nor there. It is still a value one way or another.
they're code that is run under certain circumstances. If we want to make them values, we need to define "evaluate len(a) later" as some kind of first-class value.
No we don't. Evaluating the default value later from outside the function's local scope will usually fail, and even if it doesn't fail, there are no use-cases for it. (Yet. If a good one comes up, we can add an API for it later.) And evaluating it from inside the function's local scope is unnecessary, as the interpreter has already done it for you.

I believe that the PEP should declare that how the unevaluated defaults are stored prior to evaluation is a private implementation detail. We need an API (in the inspect module? as a method on function objects?) to allow consumers to query those defaults for human-readable text, e.g. needed by help(). But beyond that, I think they should be an opaque blob.

Consider the way we implement comprehensions as functions. But we don't make that a language rule. It's just an implementation detail, and may change. Likewise unevaluated defaults may be functions, or ASTs, or blobs of compiled byte-code, or code objects, or even just stored as source code. Whatever the implementors choose.
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that.
Better help() is just the icing on the cake. The bigger advantages include that we can reduce the need for special sentinels, and that default values no longer need to be evaluated by hand in the body of the function, as the interpreter handles them. And we write the default value in the function signature, where it belongs, instead of in the body.

Don't discount the value of having the interpreter take on the grunt-work of evaluating defaults. Whatever extra complexity goes into the interpreter will be outweighed by the reduced complexity of a million functions no longer having to manually test for a sentinel and evaluate a default. Reducing grunt work is a good thing.

Remember, there are languages (Perl? bash?) where there aren't even parameters to functions: the interpreter merely provides you with an argument list and you are responsible for popping the values out of the list and binding them to the variables you want. If we think that having to pop arguments from an argument list is unbelievably primitive, but are happy with having to check for a sentinel value and then evaluate the actual desired default value yourself, then I think that we are falling for the Blub paradox.

"Late bound defaults? Who needs them? It's bloat and needless frippery."

"Default values? Who needs them? It's bloat and needless frippery."

"Named parameters? Who needs them? It's bloat and needless frippery."

"Functions? Who needs them? It's bloat and needless frippery."

I have a mate who worked for a boss who was still arguing in favour of unstructured code without functions or named subroutines in the late 1990s. At the same time that they were working to fix the Y2K problem, his boss was telling them that GOTO was better than named functions, because it was no problem whatsoever to GOTO the line you wanted to jump to and then jump back again with another GOTO when you were finished. Anything more than that was just unnecessary bloat and frippery.

-- Steve
On Sun, Oct 31, 2021 at 1:52 PM Steven D'Aprano <steve@pearwood.info> wrote:
I believe that the PEP should declare that how the unevaluated defaults are stored prior to evaluation is a private implementation detail. We need an API (in the inspect module? as a method on function objects?) to allow consumers to query those defaults for human-readable text, e.g. needed by help(). But beyond that, I think they should be an opaque blob.
I haven't re-posted, but while writing up the implementation, I did add a section on implementation details to the PEP. https://www.python.org/dev/peps/pep-0671/#implementation-details (FWIW, it's very similar to what you were saying in the other thread, only I chose to use Ellipsis, since it's commonly understood as a placeholder, rather than NotImplemented, which is a special signal for binary operators.) ChrisA
On Sat, 30 Oct 2021, Brendan Barnwell wrote:
I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions.
Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*. This doesn't seem weird to me.
If we have a way to create deferred expressions we should try to make them more generally usable.
Does anyone have a proposal for deferred expressions that could match the ease of use of PEP 671 in assigning a default argument of, say, `[]`? The proposals I've seen so far in this thread involve checking `isdeferred` and then resolving that deferred. This doesn't seem any easier than the existing sentinel approach for default arguments, whereas PEP 671 significantly simplifies this use-case.

I also don't see how a function could distinguish a deferred default argument from a deferred argument passed in from another function. In my opinion, the latter would be really messy/dangerous to work with, because it could arbitrarily pollute your scope. Whereas late-bound default arguments make a lot of sense: they're written in the function itself (just in the signature instead of the body), so we can see by looking at the code what happens.

I've written code in dynamically scoped languages before. I don't recall enjoying it.

But maybe I missed a proposal, or someone has an idea for how to fix these issues.

Erik

--
Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Sat, 30 Oct 2021, Erik Demaine wrote:
Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*.
I was thinking about what other forms of deferred evaluation Python has, and ran into descriptors [https://docs.python.org/3/howto/descriptor.html]. Classes support this mechanism for calling arbitrary code when accessing the attribute, instead of when calling the class:

```
class CallMeLater:
    '''Descriptor for calling a specified function with no arguments.'''
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        return self.func()

class Foo:
    early_list = []
    late_list = CallMeLater(lambda: [])

foo1 = Foo()
foo2 = Foo()
foo1.early_list == foo2.early_list == foo1.late_list == foo2.late_list
foo1.early_list is foo2.early_list      # the same []
foo1.late_list is not foo2.late_list    # two different []s
```

Written this way, it feels quite a bit like early and late arguments to me. So this got me thinking:

What if parameter defaults supported descriptors? Specifically, something like the following:

If a parameter (passed or defaulted) has a __get__ method, call it with one argument (beyond self), namely, the function scope's locals(). Parameters are so processed in order from left to right.

(PEPs 549 and 649 are somewhat related in that they also propose extending descriptors.)

This would enable the following hand-rolled late-bound defaults (using two early-bound defaults):

```
def foo(early_list = [], late_list = CallMeLater(lambda: [])):
    ...
```

Or we could write a decorator to make this somewhat cleaner:

```
def late_defaults(func):
    '''Convert callable defaults into late-bound defaults'''
    func.__defaults__ = tuple(
        CallMeLater(default) if callable(default) else default
        for default in func.__defaults__
    )
    return func

@late_defaults
def foo(early_list = [], late_list = lambda: []):
    ...
```

It's also possible, but difficult, to write `end := len(a)` defaults:

```
class LateLength:
    '''Descriptor for calling len(specified name)'''
    def __init__(self, name):
        self.name = name
    def __get__(self, locals):
        return len(locals[self.name])
    def __repr__(self):
        # This is bad form for repr, but it makes help(bisect)
        # output the "right" thing: end=len(a)
        return f'len({self.name})'

def bisect(a, start=0, end=LateLength('a')):
    ...
```

One feature/bug of this approach is that someone calling the function could pass in a descriptor, and its __get__ method will get called by the function (immediately at the start of the call). Personally I find this dangerous, but those excited about general deferreds might like it? At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument.

Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`). In addition to feeling safer (to me), this would enable a lot of optimization:

* Parameters without defaults don't need any __get__ checking.

* Default values could be checked for the presence of a __get__ method at function definition time (or when setting func.__defaults__), and that flag could get checked at function call time, and __get__ semantics occur only when that flag is set. (I'm not sure whether this would actually save time, though. Maybe if it were a global flag for the function, "any late-bound arguments here?". If not, old behavior and performance.)

This proposal could be compatible with PEP 671. What I find nice about this proposal is that it's valid Python syntax today, just an extension of the data model.
But I wouldn't necessarily want to use the ugly incantations above, and rather use some syntactic sugar on top of it -- and that's where PEP 671 could come in. What this proposal might offer is a *meaning* for that syntactic sugar, which is more general and perhaps more Pythonic (building on the existing Python data model). It provides another way to think about what the notation in PEP 671 means, and suggests a (different) mechanism to implement it.

Some nice features:

* __defaults__ naturally generalizes here; no need for auxiliary structures or different signatures for __defaults__. A tool looking at __defaults__ could either be aware of descriptors in this context or not. All other introspection should be the same.

* It becomes possible to skip a positional argument again: pass in the value in __defaults__ and it will behave as if that argument wasn't passed.

* The syntactic sugar could build a __repr__ (or some new dunder like __help__) that makes help() output the right thing, as in the example above.

The use of locals() (as an argument to __get__) is rather ugly, and probably prevents name lookup optimization. Perhaps there's a better way, at least with the syntactic sugar. For example, in CPython, late-bound defaults using the syntactic sugar could compile the function to include some bytecode that sets the __get__ function's frame to be the function's frame before it gets called. Hmm, but then the function needs to know whether it's the default or something else that got passed in...

What do people think? I'm still thinking about possible repercussions, but it seems like a promising direction to explore...

Erik

--
Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Mon, Nov 1, 2021 at 2:39 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sat, 30 Oct 2021, Erik Demaine wrote:
Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*.
I was thinking about what other forms of deferred evaluation Python has, and ran into descriptors [https://docs.python.org/3/howto/descriptor.html]. Classes support this mechanism for calling arbitrary code when accessing the attribute, instead of when calling the class:
```
class CallMeLater:
    '''Descriptor for calling a specified function with no arguments.'''
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        return self.func()

class Foo:
    early_list = []
    late_list = CallMeLater(lambda: [])

foo1 = Foo()
foo2 = Foo()
foo1.early_list == foo2.early_list == foo1.late_list == foo2.late_list
foo1.early_list is foo2.early_list      # the same []
foo1.late_list is not foo2.late_list    # two different []s
```
Written this way, it feels quite a bit like early and late arguments to me. So this got me thinking:
What if parameter defaults supported descriptors? Specifically, something like the following:
If a parameter (passed or defaulted) has a __get__ method, call it with one argument (beyond self), namely, the function scope's locals(). Parameters are so processed in order from left to right.
(PEPs 549 and 649 are somewhat related in that they also propose extending descriptors.)
This is incompatible with the existing __get__ method, so it should get a different name. Also, functions have a __get__ method, so you definitely don't want to have everything that takes a callback run into this. Let's say it's __delayed__ instead.
This would enable the following hand-rolled late-bound defaults (using two early-bound defaults):
```
def foo(early_list = [], late_list = CallMeLater(lambda: [])):
    ...
```
Or we could write a decorator to make this somewhat cleaner:
```
def late_defaults(func):
    '''Convert callable defaults into late-bound defaults'''
    func.__defaults__ = tuple(
        CallMeLater(default) if callable(default) else default
        for default in func.__defaults__
    )
    return func

@late_defaults
def foo(early_list = [], late_list = lambda: []):
    ...
```
It's also possible, but difficult, to write `end := len(a)` defaults:
```
class LateLength:
    '''Descriptor for calling len(specified name)'''
    def __init__(self, name):
        self.name = name
    def __get__(self, locals):
        return len(locals[self.name])
    def __repr__(self):
        # This is bad form for repr, but it makes help(bisect)
        # output the "right" thing: end=len(a)
        return f'len({self.name})'

def bisect(a, start=0, end=LateLength('a')):
    ...
```
I'm having a LOT of trouble seeing this as an improvement.
One feature/bug of this approach is that someone calling the function could pass in a descriptor, and its __get__ method will get called by the function (immediately at the start of the call). Personally I find this dangerous, but those excited about general deferreds might like it? At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument.
Yes, which means you can't access nonlocals or globals, only locals. So it has a subset of functionality in an awkward way.
Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`).
That part's not a problem; if this has language support, it could be much more explicit: "if the end parameter was not set".
This proposal could be compatible with PEP 671. What I find nice about this proposal is that it's valid Python syntax today, just an extension of the data model. But I wouldn't necessarily want to use the ugly incantations above, and rather use some syntactic sugar on top of it -- and that's where PEP 671 could come in. What this proposal might offer is a *meaning* for that syntactic sugar, which is more general and perhaps more Pythonic (building on the existing Python data model). It provides another way to think about what the notation in PEP 671 means, and suggests a (different) mechanism to implement it.
I'm not seeing this as less ugly. You have the exact same problems, plus some more, AND it becomes impossible to have an object with this method as an early default - that's the sentinel problem.
Some nice features:
* __defaults__ naturally generalizes here; no need for auxiliary structures or different signatures for __defaults__. A tool looking at __defaults__ could either be aware of descriptors in this context or not. All other introspection should be the same.
You've just highlighted the sentinel problem: there is no value which can be used in __defaults__ that couldn't have been a viable early-bound default.
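(To spell out the sentinel problem with a hypothetical marker class -- the names here are illustrative only:)

```
class LateMarker:          # hypothetical marker meaning "late-bound default"
    pass

marker = LateMarker()

def f(x=marker):           # is this an ordinary early-bound default, or a
    return x               # late-bound marker? Inspecting f.__defaults__
                           # cannot tell the two intents apart.
```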
* It becomes possible to skip a positional argument again: pass in the value in __defaults__ and it will behave as if that argument wasn't passed.
That's not as valuable as you might think. Faking that an argument wasn't passed - that is, passing an argument that pretends that an argument wasn't passed - is already dubious, and it doesn't work with *args. It would also prevent the safety check that I used above; you have to completely conflate "passed this value" and "didn't pass any value".
The use of locals() (as an argument to __get__) is rather ugly, and probably prevents name lookup optimization.
Yes. It also prevents use of anything other than locals. For instance, you can't have global helper functions, or anything like that; you could use something like len() from the builtins, but you couldn't use a function defined in the same module. Passing both globals and locals would be better, but still imperfect; and it incurs double lookups every time.
Perhaps there's a better way, at least with the syntactic sugar. For example, in CPython, late-bound defaults using the syntactic sugar could compile the function to include some bytecode that sets the __get__ function's frame to be the function's frame before it gets called. Hmm, but then the function needs to know whether it's the default or something else that got passed in...
Yes, it does. Which doesn't work if you want to be able to pass the default to pretend that nothing was passed.
What do people think? I'm still thinking about possible repercussions, but it seems like a promising direction to explore...
Sure. Explore anything you like! But I don't think that this is any less ugly than either the status quo or PEP 671, both of which involve actual real code being parsed by the compiler. ChrisA
On Mon, 1 Nov 2021, Chris Angelico wrote:
This is incompatible with the existing __get__ method, so it should get a different name. Also, functions have a __get__ method, so you definitely don't want to have everything that takes a callback run into this. Let's say it's __delayed__ instead.
Right, good point. I'm clearly still learning about descriptors. :-)
I'm having a LOT of trouble seeing this as an improvement.
It's not meant to be an improvement exactly, more of a compatible explanation of how PEP 671 works -- in the same way that `instance.method` doesn't "magically" make a bound method, but rather checks whether `instance.method` has a `__get__` attribute, and if so, calls it with `instance` as an argument, instead of returning `instance.method` directly. This mechanism makes the whole `instance.method` less magic, more introspectable, more overridable, etc., e.g. making classmethod and similar decorators possible. I'm trying to do the same thing with PEP 671 (though possibly failing :-)).
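(For reference, the mechanism Erik is alluding to -- roughly what `instance.method` does today, since plain functions are descriptors:)

```
class Demo:
    def method(self):
        return 42

d = Demo()
# d.method is (approximately) sugar for invoking the function's __get__:
bound = Demo.__dict__['method'].__get__(d, Demo)
print(bound())      # 42
print(d.method())   # 42 -- same result via ordinary attribute access
```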
At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument.
Yes, which means you can't access nonlocals or globals, only locals. So it has a subset of functionality in an awkward way.
My actual intent was to just be able to access the arguments, which are all locals to the function. [Conceptually, I was thinking of the arguments being in their own object, and then getting accessed once like attributes, which triggered __get__ if defined -- but this view isn't very good, in particular because we don't want to redefine what it means to pass functions as arguments!] But the __delayed__ method is already a function, so it has its own locals, nonlocals, and globals. The difference is that those are in the frame of __delayed__, which is outside the function with the defaults, and I wanted to access that function's arguments -- hence passing in the function's locals().
Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`).
That part's not a problem; if this has language support, it could be much more explicit: "if the end parameter was not set".
True. I was trying to preserve the "skip this argument" property, but it might make more sense to call __delayed__ only when the argument is omitted. This might make it possible for defaults with __delayed__ methods to actually be evaluated in the function's scope, which would make it more compatible with the current PEP 671.
AND it becomes impossible to have an object with this method as an early default - that's the sentinel problem.
That's true. I guess my point is that these *are* early defaults, but act very much like late defaults. Functions or function calls just treat these early defaults specially because they have a __delayed__ method. I agree it's not perfect, but is there a context where you'd actually want to have an early default that is one of these objects? The whole point of adding such a method to an early default is to make it behave like a late default. So this feels like expected behavior...?
The use of locals() (as an argument to __get__) is rather ugly, and probably prevents name lookup optimization.
Yes. It also prevents use of anything other than locals. For instance, you can't have global helper functions, or anything like that; you could use something like len() from the builtins, but you couldn't use a function defined in the same module. Passing both globals and locals would be better, but still imperfect; and it incurs double lookups every time.
That wasn't my intent. The __delayed__ method is still a function, and has its own locals, nonlocals, and globals. It can still call len (as my example code did) -- it's just the len visible from the __delayed__ function, not the len visible from the function with the default parameter.

It's true that this approach would prevent implementing something like this:

```
def foo(a => (b := 5)):
    nonlocal b
```

I'm not sure that that is particularly important: I just wanted the default expression to be able to access the arguments and the surrounding scopes.
Sure. Explore anything you like! But I don't think that this is any less ugly than either the status quo or PEP 671, both of which involve actual real code being parsed by the compiler.
This proposal was meant to help define what the compiler with PEP 671 parsed code *into*. Erik -- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Sun, Oct 31, 2021 at 9:09 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
It's nothing more than an implementation detail. If you want to suggest - or better yet, write - an alternative implementation, I would welcome it. Can you explain how this is "actively harmful"? ChrisA
On Sat, Oct 30, 2021, 6:29 PM Chris Angelico <rosuav@gmail.com> wrote:
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
It's nothing more than an implementation detail. If you want to suggest - or better yet, write - an alternative implementation, I would welcome it. Can you explain how this is "actively harmful"?
Both the choice of syntax and the discussion of proposed implementation (both yours and Steven's) would make it more difficult later to advocate and implement a more general "deferred" mechanism in the future.

If you were proposing the form that MAL and I proposed (and a few others basically agreed) of having a keyword like 'defer' that could, in concept, only be initially available in function signatures but later be extended to other contexts, I wouldn't see a harm. Maybe Steven's `@name=` could accommodate that too.

I'm not sure what I think of a general statement like:

    @do_later = fun1(data) + fun2(data)

I.e. we expect to evaluate the first class object `do_later` in some other context, but only if requested within a program branch where `data` is in scope. The similarity to decorators feels wrong, even though I think it's probably not ambiguous syntactically.

In a sense, the implementation doesn't matter as much if the syntax is something that could be used more widely. Clearly, adding something to the dunders of a function object isn't a general mechanism, but if behavior was kept consistent, the underlying implementation could change in principle.

Still, the check for a sentinel in the first few lines of a function body is easy and fairly obvious, as well as long-standing. New syntax for a trivial use is just clutter and cognitive burden for learners and users.
On Sun, Oct 31, 2021 at 12:31 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 30, 2021, 6:29 PM Chris Angelico <rosuav@gmail.com> wrote:
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
It's nothing more than an implementation detail. If you want to suggest - or better yet, write - an alternative implementation, I would welcome it. Can you explain how this is "actively harmful"?
Both the choice of syntax and the discussion of proposed implementation (both yours and Steven's) would make it more difficult later to advocate and implement a more general "deferred" mechanism in the future.
If you were proposing the form that MAL and I proposed (and a few others basically agreed) of having a keyword like 'defer' that could, in concept, only be initially available in function signatures but later be extended to other contexts, I wouldn't see a harm. Maybe Steven's `@name=` could accommodate that too.
I'm not sure what I think of a general statement like:
@do_later = fun1(data) + fun2(data)
I.e. we expect to evaluate the first class object `do_later` in some other context, but only if requested within a program branch where `data` is in scope.
The problem here is that you're creating an object that can be evaluated in someone else's scope. I'm not creating that. I'm creating something that gets evaluated in its own scope - the function currently being defined. If you want to create a "deferred" type, go ahead, but it won't conflict with this. There wouldn't be much to gain by restricting it to function arguments. Go ahead and write up a competing proposal - it's much more general than this.
The similarity to decorators feels wrong, even though I think it's probably not ambiguous syntactically.
The way you've written it, it's bound to an assignment, which seems very odd. Are you creating an arbitrary object which can be evaluated in some other context? Wouldn't that be some sort of constructor call?
In a sense, the implementation doesn't matter as much if the syntax is something that could be used more widely. Clearly, adding something to the dunders of a function object isn't a general mechanism, but if behavior was kept consistent, the underlying implementation could change in principle.
Still, the check for a sentinel in the first few lines of a function body is easy and fairly obvious, as well as long-standing. New syntax for a trivial use is just clutter and cognitive burden for learners and users.
A trivial use? But a very common one. The more general case would be of value, but would also be much more cognitive burden. But feel free to write something up and see how that goes. Maybe it will make a competing proposal to PEP 671; or maybe it'll end up being completely independent. ChrisA
I'm a bit confused as to why folks are making pronouncements about their support for this PEP before it's even finished, but, oh well.

As for what seems like one major issue:

Yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal: it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple.

In contrast, a general deferred object would, to me, be really confusing about what scope it would get evaluated in -- I can't even imagine how I would do that -- how the heck am I supposed to know what names will be available in some function scope I pass this thing into??? Also, this would only allow a single expression, not an arbitrary amount of code -- if we're going to have some sort of "deferred object" -- folks will very soon want more than that, and want full deferred function evaluation. So that really is a whole other kettle of fish, and should be considered entirely separately.

As for inspect -- yes, it would be great for these late-evaluated defaults to have a good representation there, but I can only see that as opening the door to a more featureful deferred object, certainly not closing it.

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
Warning: Bear of Very Little Brain talking.

Aren't we in danger of missing the wood for the (deferred-evaluation) trees here?

Late-bound-argument defaults are USEFUL. Being able to specify that a default value is a new empty list or a new empty set is USEFUL. Plenty of people have asked on Stack Overflow why it doesn't "work" already. Yes, lots of those people will still ask, but with late bound defaults they can get a completely satisfactory answer.

Using None or another sentinel to mean "no value was passed" is basically a hack. A kludge. It is MUCH CLEANER to get a default value when no value is passed without resorting to such a hack. Which is fine if the default value is constant (please don't waste your time quibbling with me if I get the technicalities wrong) and can be evaluated early, but needs a late-bound default if ... well, if it needs to be evaluated late.

Furthermore, having a late-bound default which refers to preceding parameters, such as ..., hi:=len(a) ..., is totally clear to anyone who understands function parameter default values (i.e. pretty much anybody who can read or write functions). And again VERY convenient. As Chris A has already said, the idiom

    if param is None:
        param = []  # or whatever

is VERY frequent. And saving 2 lines of code in a particular scenario may be no great virtue, but when it happens so VERY frequently (with no loss of clarity) it certainly is.

PEP 671, if accepted, will undoubtedly be USEFUL to Python programmers.

As for deferred evaluation objects: First, Python already has various ways of doing deferred evaluation:

- Lambdas
- Strings, which can be passed to eval / exec / compile
- You can write decorator(s) for functions to make them defer their evaluation
- You can write a class of deferred-evaluation objects

None of these ways is perfect. Each has its pros and cons. The bottom line, if I understand correctly (maybe I don't), is that there has to be a way of specifying (implicitly or explicitly) when the (deferred) evaluation occurs, and also what the evaluation context is (e.g. for eval, locals and globals must be specified, either explicitly or implicitly; see the illustration after this message).

So maybe it would be nice if Python had its own "proper" deferred-evaluation model. But it doesn't. And as far as I know, there isn't a PEP, either completed, or at least well on the way, which proposes such a model. (Perhaps it's a really difficult problem. Perhaps there is not enough interest for someone to have already tried it. Perhaps there are so many ways of doing it, as CHB says below, that it's hard to decide which. I don't know.) If there were, it would be perfectly reasonable to ask how it would interact with PEP 671. But as it's just vapourware, it seems wrong to stall PEP 671 (how long? indefinitely?) because of the POSSIBILITY that it MIGHT turn out to be more convenient for some future model (IF that ever happens) if PEP 671 had been implemented in a different way.

[I just saw a post from David Mertz giving an idea, using a "delay" keyword, before I finished composing this email. But that is just a sketch, miles away from a fully-fledged proposal or PEP. And probably just one of very many that have appeared over the years on envelope backs before vanishing into the sunset ...]

Best wishes
Rob Cliffe

On 31/10/2021 02:08, Christopher Barker wrote:
I'm a bit confused as to why folks are making pronouncements about their support for this PEP before it's even finished, but, oh well.
As for what seems like one major issue:
Yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal, it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple. In contrast, a general deferred object would, to me, be really confusing about what scope it would get evaluated in -- I can't even imagine how I would do that -- how the heck am I supposed to know what names will be available in some function scope I pass this thing into??? Also, this would only allow a single expression, not an arbitrary amount of code -- if we're going to have some sort of "deferred object" -- folks will very soon want more than that, and want full deferred function evaluation. So that really is a whole other kettle of fish, and should be considered entirely separately.
As for inspect -- yes, it would be great for these late-evaluated defaults to have a good representation there, but I can only see that as opening the door to more featureful deferred object, certainly not closing it.
-CHB
-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
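(A minimal illustration of Rob's point above that eval must have its evaluation context supplied, explicitly or implicitly:)

```
expr = "len(a)"                        # deferred as a string; nothing evaluated yet
print(eval(expr, {"a": [1, 2, 3]}))    # 3 -- the context is handed in by us
print(eval(expr, {"a": "xy"}))         # 2 -- same expression, different context
```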
On Sun, Oct 31, 2021 at 2:20 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
As for deferred evaluation objects: First, Python already has various ways of doing deferred evaluation:

- Lambdas
- Strings, which can be passed to eval / exec / compile
- You can write decorator(s) for functions to make them defer their evaluation
- You can write a class of deferred-evaluation objects

None of these ways is perfect. Each has its pros and cons. The bottom line, if I understand correctly (maybe I don't), is that there has to be a way of specifying (implicitly or explicitly) when the (deferred) evaluation occurs, and also what the evaluation context is (e.g. for eval, locals and globals must be specified, either explicitly or implicitly).
That last part is the most important here: it has to be evaluated *in the context of the function*. That's the only way for things like "def f(a, n=len(a)):" to be possible. Every piece of code in Python is executed, if it is ever executed, in the context that it was written in. Before f-strings were implemented, this was debated in some detail, and there is no easy way to transfer a context around usefully. ChrisA
On Sun, Oct 31, 2021 at 02:24:10PM +1100, Chris Angelico wrote:
That last part is the most important here: it has to be evaluated *in the context of the function*. That's the only way for things like "def f(a, n=len(a)):" to be possible.
Agreed so far.
Every piece of code in Python is executed, if it is ever executed, in the context that it was written in.
I don't think that's quite right. We can eval() and exec() source code, ASTs and code objects in any namespace we have access to, including plain old dicts, with some limitations. (E.g. we can't get access to another function's namespace, not even if we have its locals() dict. At least not in CPython.)

In the case of default expressions:

    def func(spam=early_expression, @eggs=late_expression):

early_expression is evaluated in the scope surrounding func (it has to be, since func doesn't exist yet!) and late_expression needs to be evaluated inside func's scope, rather than the scope it was written in.

-- Steve
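(A minimal demonstration of that limitation: in CPython, eval() cannot reach an enclosing function's locals unless the name is also referenced by compiled code in the inner function, because no closure cell is created for it:)

```
def outer():
    x = 10
    def inner():
        return eval("x")   # x is not a local or global of inner,
    return inner           # and no closure cell exists for it

outer()()   # raises NameError in CPython: name 'x' is not defined
```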
On Sun, Oct 31, 2021 at 6:36 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 02:24:10PM +1100, Chris Angelico wrote:
That last part is the most important here: it has to be evaluated *in the context of the function*. That's the only way for things like "def f(a, n=len(a)):" to be possible.
Agreed so far.
Every piece of code in Python is executed, if it is ever executed, in the context that it was written in.
I don't think that's quite right. We can eval() and exec() source code, ASTs and code objects in any namespace we have access to, including plain old dicts, with some limitations. (E.g. we can't get access to other function's namespace, not even if we have their locals() dict. At least not in CPython.)
True, I was a bit sloppy with my definitions there; let me try that again: every piece of compiled Python code is executed, if it is ever executed, in a context defined by the location where it was compiled.

With eval/exec, they're compiled in their own dedicated context (at least, as of Py3 - I don't think I fully understand what Py2 did there). You can provide a couple of dictionaries to help define that context, but it's still its own dedicated context. ASTs don't have contexts yet, but at the point where you compile it the rest of the way, it gets one. Code objects have their contexts fully defined.

To my knowledge, there is no way to run code in any context other than the one it was compiled in, although you can come close by updating a globals dictionary. You can't get closure references (nonlocals) for eval/exec, and I don't think it's possible to finish compiling an AST to bytecode in any way that allows you to access more nonlocals.
In the case of default expressions:
def func(spam=early_expression, @eggs=late_expression):
early_expression is evaluated in the scope surrounding func (it has to be since func doesn't exist yet!) and late_expression needs to be evaluated inside func's scope, rather than the scope it was written in.
Actually, by the time you're compiling that line of code, func DOES exist, to some extent. You can't compile a def statement without simultaneously compiling both its body (the scope of func) and its surrounding context (whatever that was written in).

It's a little messier now since you can have each of those contexts getting code added to it, but that's still a limited number of options - for instance, you can't have a default expression that gets evaluated in the context of an imported module, nor one that's evaluated in the *caller's* context.

(I'm aware that I'm using the word "context" here to mean something that exists at compilation time, and elsewhere I've used the same word to mean something that only exists at run time. Unfortunately, English has only so many ways to express the same sorts of concepts, so we end up reusing. Sorry.)

ChrisA
On 10/30/2021 10:08 PM, Christopher Barker wrote:
I'm a bit confused as to why folks are making pronouncements about their support for this PEP before it's even finished, but, oh well.

I think it's safe to say people are opposed to the PEP as it currently stands, not in its final, as yet unseen, shape. But I'm willing to use other words than "I'm -1 on PEP 671". You can read my opposition as "as it currently stands, I'm -1 on PEP 671".

As for what seems like one major issue:
Yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal, it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple.
And to me and others, what you see as a strength, and seem opposed to changing, we see as a fatal flaw. What if the walrus operator could only be used in "for" loops? What if f-strings were only available in function parameters? What if decorators could only be used on free-standing functions, but not on object methods? In all of these cases, what could be a general-purpose tool would have been restricted to one specific context. That would make the language more confusing to learn. I feel you're proposing the same sort of thing with late-bound function argument defaults. And I think it's a mistake. If these features had been added in their limited form above, would it be possible to extend them in the future? As they were ultimately implemented, yes, of course. But it's entirely possible that if we were proposing the limited version above we could make a design decision that would prevent them from being more widely used in the future. The most obvious being the syntax used to specify them, but I think that's not the only consideration.
In contrast, a general deferred object would, to me, be really confusing about what scope it would get evaluated in -- I can't even imagine how I would do that -- how the heck am I supposed to know what names will be available in some function scope I pass this thing into??? Also, this would only allow a single expression, not an arbitrary amount of code -- if we're going to have some sort of "deferred object" -- folks will very soon want more than that, and want full deferred function evaluation. So that really is a whole other kettle of fish, and should be considered entirely separately.
And again, this is where we disagree. I think it should be considered in the full context of places it might be useful. I (and I think others) are concerned that we'd be painting ourselves into a corner with this proposal. For example, if the delayed evaluation were available as text via inspect.Signature, we'd be stuck with supporting that forever, even if we later added delayed evaluation objects to the language.

I also have other problems with the PEP, not specifically about restricting the scope of where deferred evaluations are allowed. Most importantly, that it doesn't add enough expressiveness to the language to justify its existence as a new feature that everyone would have to learn. But also things such as: Where do exceptions get caught and handled (only by the caller)? How would you pass in "just use the default" from a wrapper function? And others. But even if they were all addressed except for the restricted nature of the feature, I'd still be -1 on the PEP.

Eric
On Sun, Oct 31, 2021 at 11:47 PM Eric V. Smith <eric@trueblade.com> wrote:
On 10/30/2021 10:08 PM, Christopher Barker wrote:
I'm a bit confused as to why folks are making pronouncements about their support for this PEP before it's even finished, but, oh well. I think it's safe to say people are opposed to the PEP as it currently stands, not in its final, as yet unseen, shape. But I'm willing to use other words than "I'm -1 on PEP 671". You can read my opposition as "as it currently stands, I'm -1 on PEP 671". As for what seems like one major issue:
Yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal, it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple.
And to me and others, what you see as a strength, and seem opposed to changing, we see as a fatal flaw.
What if the walrus operator could only be used in "for" loops? What if f-strings were only available in function parameters? What if decorators could only be used on free-standing functions, but not on object methods?
In all of these cases, what could be a general-purpose tool would have been restricted to one specific context. That would make the language more confusing to learn. I feel you're proposing the same sort of thing with late-bound function argument defaults. And I think it's a mistake.
Deferred expressions are not the same as late-bound argument defaults. What is the correct behaviour here?

    def foo(a=>[1,2,3], b=>len(a)):
        a.append(4)
        print(b)

And what is the correct behaviour here?

    def foo(a=defer [1,2,3], b=defer len(a)):
        a.append(4)
        print(b)

When is 'a' evaluated and the list constructed? When is the length calculated and stored in 'b'?

With argument defaults, it's clear: this happens as the function is called. (See other thread for a subtlety about whether this happens during frame construction or as the function-proper begins execution, but that is a minor distinction that doesn't affect non-generators very much.) With deferreds, the usual expectation is that they are evaluated on usage, which could be a very different point.

Late-bound defaults are NOT "deferreds but limited to function headers". They are quite different. You can think of them as a sort of deferred expression if that helps, but they're not a specialization of a more general feature.

ChrisA
On Sun, Oct 31, 2021, 8:59 AM Chris Angelico
def foo1(a=>[1,2,3], b=>len(a)):
    a.append(4)
    print(b)
And what is the correct behaviour here?
def foo2(a=defer [1,2,3], b=defer len(a)):
    a.append(4)
    print(b)
This is a nice example. I agree they are different in the natural reading of each.

Specifically, suppose both features had been added to the language. I would expect foo1() to print "3" and foo2() to print "4".

This is also a good example of why the more general feature is BETTER. It is easy to emulate the foo1() behavior with 'defer', but impossible to emulate foo2() using '=>'. E.g.

    def foo3(a=defer [1,2,3], b=defer len(a)):  # behaves like foo1()
        b = b  # or eval_b = b and use new name in body
        a.append(4)
        print(b)

Note this:

    def foo4(a=defer [1,2,3], b=defer len(a)):
        print(b)  # prints 3

In order to print we actually need to walk a DAG. 'b' is an "unevaluated" object, but the interpreter would need to recognize that it depends on unevaluated 'a' ... and so on, however far up the tree it needed to walk to have only regular values (or raise a NameError maybe).

This is all precisely prior art, and is what is done by Dask Delayed: https://docs.dask.org/en/stable/delayed.html

I think it would be better as actual syntax, but generally Dask already does what I want. The amazingly powerful thing about constructing a DAG of deferred computation is that you can find only intermediate results in a complex tree if that is all you concretely need.

I recognize that this is more complex than the niche case of late evaluation of formal parameters. But I consider that niche case trivial, and certainly not worth special syntax. In contrast, the niche case falls out seamlessly from the more general idea.

In terms of other prior art, deferred evaluation is the default behavior in Haskell. I admit that I find a strictly functional language with no mutability a PITA. But inasmuch as Haskell has some elegance, and sometimes reasonably fast performance, it is largely because delayed evaluation is baked in.
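(For concreteness, a minimal sketch of the Dask idiom referenced above -- this assumes dask is installed:)

```
from dask import delayed

@delayed
def inc(x):
    return x + 1

total = inc(1) + inc(2)   # builds a task graph; nothing has run yet
print(total.compute())    # 5 -- the DAG is walked only on demand
```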
On Mon, Nov 1, 2021 at 2:59 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sun, Oct 31, 2021, 8:59 AM Chris Angelico
def foo1(a=>[1,2,3], b=>len(a)):
    a.append(4)
    print(b)
And what is the correct behaviour here?
def foo2(a=defer [1,2,3], b=defer len(a)):
    a.append(4)
    print(b)
This is a nice example. I agree they are different in the natural reading of each.
Specifically, suppose both features had been added to the language. I would expect foo1() to print "3" and foo2() to print "4".
This is also a good example of why the more general feature is BETTER. It is easy to emulate the foo1() behavior with 'defer', but impossible to emulate foo2() using '=>'.
I'd actually say that this is a good example of why the more general feature is DIFFERENT. The emulation argument is good, but we can already emulate late-binding behaviour using early-binding, and we can emulate deferred evaluation using functions, and so on; the fact that you can emulate one thing with another does not mean that it's of no value to have it. Deferred evaluation has its own set of problems, its own set of features, its own set of edge cases. I strongly encourage you to write up a detailed specification as a completely separate proposal.
E.g.
def foo3(a=defer [1,2,3], b=defer len(a)):  # behaves like foo1()
    b = b  # or eval_b = b and use new name in body
    a.append(4)
    print(b)
Note this:
def foo4(a=defer [1,2,3], b=defer len(a)):
    print(b)  # prints 3
In order to print we actually need to walk a DAG. 'b' is an "unevaluated" object, but the interpreter would need to recognize that it depends on unevaluated 'a' ... and so on, however far up the tree it needed to walk to have only regular values (or raise a NameError maybe).
This is all precisely prior art, and is what is done by Dask Delayed: https://docs.dask.org/en/stable/delayed.html
I think it would be better as actual syntax, but generally Dask already does what I want.
The amazingly powerful thing about constructing a DAG of deferred computation is that you can find only intermediate results in a complex tree if that is all you concretely need.
I recognize that this is more complex than the niche case of late evaluation of formal parameters. But I consider that niche case trivial, and certainly not worth special syntax.
In contrast, the niche case falls out seamlessly from the more general idea.
All this is excellent and very useful, but I don't think it's the same thing as function defaults.
In terms of other prior art, deferred evaluation is the default behavior in Haskell. I admit that I find strictly functional language with no mutability a PITA. But inasmuch as Haskell has some elegance, and sometimes reasonably fast performance, it is largely because delayed evaluation is baked in.
When mutability does not exist, deferred evaluation becomes more a matter of optimization. Plus, people code to the language they're working in - if, for example, it's normal to write deeply recursive algorithms in a language that heavily optimizes recursion, that doesn't necessarily mean that it's better to write those same algorithms the same way in other languages.

If PEP 671 and some form of deferred expression were both accepted, I think their best interaction would be in introspection (help(), inspect, etc) - the descriptive part of a late-evaluated default could be reconstructed from the AST on demand.

ChrisA
On Mon, Nov 1, 2021 at 5:15 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 1/11/21 4:59 am, David Mertz, Ph.D. wrote:
b = b
I don't want to live in a universe where this could be anything other than a no-op in Python.
Be careful what you say: there are some technicalities. If you mean that it won't change the behaviour of the object referred to by b, then I absolutely agree, but there are ways that this can be more than a no-op. Notably, it has very good meaning as a keyword argument (it means "pass b along, named b"), and as a function parameter (meaning "accept b, defaulting to b from the outer scope"); and even as a stand-alone statement, it isn't technically meaningless (it'll force b to be a local).

But yes, I agree that I don't want this to force the evaluation of something, which continues to be called b. Even though that's technically possible already if you have a weird namespace, I wouldn't call that a good way to write code.

ChrisA
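(The three non-trivial readings of "b = b" that Chris lists, spelled out:)

```
b = 1

def takes(b):
    return b

takes(b=b)         # keyword argument: "pass b along, named b"

def shadows(b=b):  # parameter default: accept b, defaulting to outer b
    return b

def local_only():
    b = b          # as a statement, the assignment forces b to be local,
                   # so calling local_only() raises UnboundLocalError
```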
Agreed, class namespaces are weird. :-)

On Sun, Oct 31, 2021 at 23:38 Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Nov 1, 2021 at 5:15 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 1/11/21 4:59 am, David Mertz, Ph.D. wrote:
b = b
I don't want to live in a universe where this could be anything other than a no-op in Python.
Be careful what you say: there are some technicalities. If you mean that it won't change the behaviour of the object referred to by b, then I absolutely agree, but there are ways that this can be more than a no-op. Notably, it has very good meaning as a keyword argument (it means "pass b along, named b"), and as a function parameter (meaning "accept b, defaulting to b from the outer scope"); and even as a stand-alone statement, it isn't technically meaningless (it'll force b to be a local).
But yes, I agree that I don't want this to force the evaluation of something, which continues to be called b. Even though that's technically possible already if you have a weird namespace, I wouldn't call that a good way to write code.
ChrisA _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/HQI3UL... Code of Conduct: http://python.org/psf/codeofconduct/
-- --Guido (mobile)
On Mon, Nov 1, 2021 at 5:57 PM Guido van Rossum <guido@python.org> wrote:
Agreed, class namespaces are weird. :-)
Ah yes, I forgot about class namespaces. I was thinking about deliberately wonky namespaces where the ns dict has a __missing__ method or something, but inside a class, "b = b" actually has a very useful meaning :) It still doesn't change the behaviour of the object b though.
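A sketch of that class-namespace meaning:

    b = 10

    class C:
        b = b  # the lookup falls through to the global b, which is then
               # bound in the class namespace

    print(C.b)  # 10

ChrisA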
On 2021-10-31 05:57, Chris Angelico wrote:
Deferred expressions are not the same as late-bound argument defaults.
You keep saying this, but I still don't get the point. Descriptors are not the same as properties, but we can implement properties with descriptors. Decorators are not the same as callables, but we can implement decorators with callables. The plus sign is not the same as an __add__ method, but we can define the implementation of + in terms of __add__ (and __radd__) methods on objects.

As far as I can tell the only real difference is that you seem very intent on the idea that the default argument is ONLY evaluated right at the beginning of the function and not later. But so what? A general deferred expression would still allow you to do that, it would just also allow you to evaluate the deferred expression at some other point. That might be useful or it might not, but I don't see how it prevents a deferred object from serving the function of a late-bound default. The question isn't whether deferred objects and late-bound defaults "are the same", but whether one can provide a more general framework within which the other can be expressed.
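For instance, a bare-bones read-only property can be built out of the descriptor protocol (an illustrative sketch, not the stdlib implementation):

    class MyProperty:
        """A minimal read-only property, implemented as a descriptor."""
        def __init__(self, fget):
            self.fget = fget

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            return self.fget(obj)

    class Circle:
        def __init__(self, radius):
            self.radius = radius

        @MyProperty
        def diameter(self):
            return self.radius * 2

    print(Circle(3).diameter)  # 6 - the general mechanism expresses the specific feature

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown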
On Mon, Nov 1, 2021 at 5:03 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-31 05:57, Chris Angelico wrote:
Deferred expressions are not the same as late-bound argument defaults.
[snip]
The whole point of deferred expressions is the time delay, but function defaults have to be evaluated immediately upon entering the function. You're contorting all manner of things to try to come up with something that would work, and the result is, in my opinion, quite inelegant; it doesn't do everything that function defaults need to, requires inordinate amounts of code to accomplish simple tasks, and makes implications that are entirely unnecessary. Deferred expressions of various kinds are certainly possible, but they are not a useful implementation of argument defaults, due to those exact contortions. ChrisA
On 31/10/2021 18:00, Brendan Barnwell wrote:

As far as I can tell the only real difference is that you seem very intent on the idea that the default argument is ONLY evaluated right at the beginning of the function and not later. But so what? A general deferred expression would still allow you to do that, it would just also allow you to evaluate the deferred expression at some other point.

Wonderful! +1,000,000. So are you going to write a PEP explaining exactly how you propose to implement deferred expressions and how they would be used? And perhaps provide a reference implementation? Then we can see how it works, and how it can be used to replace or improve PEP 671? Say, next month? I really don't want to be rude, but I can't think of anything more appropriate to express my frustration/impatience (and I'm not the one doing the work! I can only guess how Chris A feels) with this foot-dragging, than the adage (possibly from poker): "Put up or shut up". The first car would never have been built (in 1885 or thereabouts) if the investors had insisted it wasn't worth doing unless it had air bags, satellite GPS, in-car radio and cruise control. PEP 671 will be USEFUL to Python programmers. We want it! (When do we want it? Now!) Best wishes Rob Cliffe
On Sun, Oct 31, 2021, 5:39 PM Rob Cliffe via Python-ideas
PEP 671 will be USEFUL to Python programmers. We want it! (When do we want it? Now!)
This feels dishonest. I believe I qualify as a Python programmer. I started using Python 1.4 in 1998. The large majority of my work life since then has been programming Python. I've written a bunch of books on Python. I was a director of the PSF. I DO NOT want PEP 671, and do not feel it would be a net benefit to Python programmers.

I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold. I was the first in this thread to propose a far more general capability that I think WOULD meet the cost/benefit balance... and I proposed it because I think there is a way to meet the niche need that also has wide enough application to warrant new syntax.

Python isn't Perl. Not every piece of syntax that *someone* might occasionally use for a narrow need should be included in the language.
On Mon, Nov 1, 2021 at 8:56 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold.
How do you currently teach about mutable argument defaults? With this proposal, it will become trivially easy to have an argument default that's evaluated fresh every time. Does that not count as beneficial? ChrisA
On Sun, Oct 31, 2021, 6:11 PM Chris Angelico
On Mon, Nov 1, 2021 at 8:56 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold.
How do you currently teach about mutable argument defaults? With this proposal, it will become trivially easy to have an argument default that's evaluated fresh every time. Does that not count as beneficial?
I teach folks to use a sentinel. Yes, it is genuinely a thing to learn, but it takes far less mental effort than special syntax and a different evaluation model. At least 99% of the time, the None sentinel is fine and the best choice... Yes, I know there are RARE cases where None isn't a good sentinel, but I can't recall the last time I encountered that situation.

I myself have made errors with mutable defaults. Perhaps not for a few years, but certainly years after I should have known better. Yes, there genuinely is a possible bug with using defaults. However, I believe that having two different kinds of default parameter bindings would lead to a much larger number of bugs than the status quo. I think this would be true even if the syntax made the distinction obvious... and the '=>' syntax is far from intuitive to start with. It *could* be memorized, but it's definitely not intuitive.

This isn't just beginners either. For example, in a few days, I'm giving a short talk about the pitfalls of using lru_cache with mutable arguments to folks at my work. They have, typically, 5-10 years of Python experience, yet that's an error I've found in production code. This really isn't the same issue as this PEP, but it's an example of where just a little extra complexity gets experienced developers confused.
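For the record, the pattern I teach is the standard sentinel idiom (a small sketch; the names are illustrative):

    _MISSING = object()  # unique sentinel, for when None is itself a valid value

    def add_item(item, target=_MISSING):
        if target is _MISSING:
            target = []  # a fresh list on every call that omits the argument
        target.append(item)
        return target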
On 2021-10-31 15:08, Chris Angelico wrote:
On Mon, Nov 1, 2021 at 8:56 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold.
How do you currently teach about mutable argument defaults? With this proposal, it will become trivially easy to have an argument default that's evaluated fresh every time. Does that not count as beneficial?
This is a good question. When I've taught Python, I teach about mutable argument defaults by building on what's been taught about mutable objects in general. As we know, there are many things about mutable objects that can confuse newbies (or even people with some experience), and many of them have nothing to do with default arguments. For instance, people are sometimes surprised if they do

    x = [1, 2, 3]
    some_function(x)

. . . and then find that x has changed in the calling environment because `some_function` mutated it. Or sometimes they're surprised "from the inside" because they're the one writing `some_function` and they mutate the argument and didn't realize that could disrupt other code that calls their function. And so on.

Once people understand the general need to be careful about mutable objects and where they're mutated, using them as argument defaults is not really a huge additional obstacle. Basically you just have to make clear that defaults are evaluated only once, when the function is defined, and not again and again each time it is called. If people understand that, they will basically understand mutable argument defaults, because it is essentially the same situation as in the example above, and various other cases. It just means "be careful when you mutate a mutable object - someone else might be using that object too".

Of course, they will forget this and make mistakes. So having late-bound defaults would be a bit of a help. But my point is that many of the problems learners have with mutable default arguments are really problems with mutable objects in general, and having late-bound defaults won't help with those more general confusions.

So the situation is the same as before: yes, there will be a bit of a benefit, but the benefit is limited. Also, from a teaching perspective, there is an important cost as well: you have to teach students about the new syntax and how it differs from early-bound defaults, and students have to develop the skill of reading function signatures that are now (potentially) more complex than they were before.

Based on my own (admittedly limited) experience, I'm not sure late-bound defaults would be a net win in terms of teaching ease. Right now the situation is actually not too crazy to explain, because the handling of mutable defaults (i.e., the "if arg is None" stuff) goes into the function body and can be described as part of other general "prep work" that functions may need to do at the beginning (like, say, converting inputs to lowercase or doing some sanity checks on arguments). But the new proposal is something that would actually have to be learned as a separate thing because of its in-between nature (i.e., now the argument list becomes a mix of things, some of which execute in the defining context and some in the calling context).
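To flesh out the surprise described above (some_function here is hypothetical):

    def some_function(seq):
        seq.append(99)  # mutates the caller's object in place

    x = [1, 2, 3]
    some_function(x)
    print(x)  # [1, 2, 3, 99] - changed in the calling environment

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown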
I definitely agree with that sentiment. With beginners I don't even talk about function defaults at first, and when I do, it's after we have already had a talk about mutables, so I can just say that you almost never want a mutable default but rather use None as a sentinel. It's not that hard, and it serves as a reminder of how mutables work, so it's actually good for teaching! I don't look forward to having to add yet another side note about syntactic sugar that does not really add much value (it saves a few characters, but it's less clear, and relying on code to document the parameters is a bit meh imo). Because I won't burden beginners who are already having to ingest a lot of things with a new model of evaluation. I guess an alternative could be to only teach late binding, but since all the code written so far is early bound, it's not practical. Cheers, E
On Mon, Nov 01, 2021 at 09:39:01AM +0100, Evpok Padding wrote:
I don't look forward to having to add yet another side note about syntactic sugar that does not really add much value (it saves a few characters but it's less clear
This proposal is not about saving a few characters. We could keep the PEP and change the syntax to use a long keyword:

    def func(arg=late_binding_through_delayed_evaluation expression)

(where "expression" is an actual expression) and it would still have the same benefits. Just more awkward to type :-)
and relying on code to document the parameters is a bit meh imo).
Do you think it is a problem that help() can introspect function signatures and report what the actual defaults are, rather than whatever lies are put in the docstring? I think that is a fantastic feature for Python, but it only applies to early defaults.

    def func(arg=''):
        """Return a thing.

        Arguments:
            arg is a string, defaults to space.
        """

When possible, the single source of truth for a function's defaults should be the actual parameter defaults, regardless of when the default is evaluated.
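A quick demonstration of that introspection, runnable today for early defaults:

    def func(arg=''):
        """Return a thing.

        Arguments: arg is a string, defaults to space.
        """

    help(func)  # the signature line reads func(arg=''), exposing the docstring's lie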
Because I won't burden beginners who are already having to ingest a lot of things with a new model of evaluation.
We're not proposing this feature for the benefit of newbies and beginners. Python is remarkably beginner friendly, but it's not Scratch. The first edition of Learning Python (Mark Lutz and David Ascher) didn't introduce function defaults until page 122, right at the end of Chapter Four. And of course they had to mention the mutable object gotcha.

We do people a disservice if we don't introduce at least the concept of when the default is evaluated. If we don't, they will invariably trip into the "mutable default" gotcha on their own and confuse themselves. I don't think the distinction between early and late binding is a hard concept to teach. What does this do?

    def func(arg=print("Hello")):
        return arg

    func()
    func()

That's all you need to show to demonstrate early binding. Now this:

    def func(arg=late print("Hello")):
        return arg

    func()
    func()

Having *both* options available, and teaching it as a choice, will (I think) make it easier to teach, not harder. "Write `arg=default` if you want the default to be evaluated just once, and `arg=late default` if you want the default to be evaluated each time it is needed. If you are not sure which one you need, use the first." That's the sort of advice I would have loved when I was a newbie. Short, simple, straight to the point, and not reliant on knowing what "is None" means.
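For what it's worth, the early-bound half of that is runnable today, and the output makes the binding time obvious:

    def func(arg=print("Hello")):   # "Hello" is printed here, once, at definition
        return arg

    print(func())  # None - print() returned None when the default was evaluated
    print(func())  # None again; the default is not re-evaluated per call

-- Steve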
Steven D'Aprano writes:
"Write `arg=default` if you want the default to be evaluated just once, and `arg=late default` if you want the default to be evaluated each time it is needed. If you are not sure which one you need, use the first."
Which of course is ambiguous, since the argument may be referenced many times in the function body or only late in the body. Someone who doesn't yet understand how early binding of mutable defaults works is also somewhat likely to misunderstand "when needed" as a promise that resolves to an object the first time it is referenced, or as a thunk that gets run every time it is referenced, both of which are incorrect. What you should write to be (more) accurate is:

    Write `arg=default` if you want the default to be evaluated when the function is defined [and the value to be stored in the function to be used when it is called], and `arg=late default` if you want the default to be evaluated each time the function is called. If you are not sure which one you need, ask someone to help you, because using the wrong one is a common source of bugs.

The part in brackets is a gloss that might be helpful to the beginner, who might experience a WTF at "evaluated when defined".

To be honest, I was surprised you chose early binding for "when in doubt". I would expect that "early binding when late is appropriate" is a much more common bug for beginners than "late binding when early is appropriate". Of course I may be biased because it's the only bug now, but I would think that would continue to be true for beginners if late binding is made available. Especially when they're testing code in the interactive interpreter. There are probably many programs where the function is called with the argument missing only once, in which case it doesn't matter, until you invoke the function repeatedly in the same context.
On Wed, Nov 3, 2021 at 6:01 PM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
Steven D'Aprano writes:
"Write `arg=default` if you want the default to be evaluated just once, and `arg=late default` if you want the default to be evaluated each time it is needed. If you are not sure which one you need, use the first."
Which of course is ambiguous, since the argument may be referenced many times in the function body or only late in the body. Someone who doesn't yet understand how early binding of mutable defaults works is also somewhat likely to misunderstand "when needed" as a promise that resolves to an object the first time it is referenced, or as a thunk that gets run every time it is referenced, both of which are incorrect.
People will often have completely wrong understandings about things. While it's possible to avoid some of those, we can't hold the language back because *someone who doesn't understand* might misunderstand. Mutable objects in general tend to be misunderstood. Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated (eg generators when you next() them). Resolving a promise to an object on first reference would break that pattern completely, so I would expect most people to assume that "each time it is needed" means "each time you omit the argument". And if they don't, they'll figure it out and go digging. Or not, as the case may be; I've seen misunderstandings linger in people's brains for a long time without ever being disproven. (My own brain included.)
What you should write to be (more) accurate is
Write `arg=default` if you want the default to be evaluated when the function is defined [and the value to be stored in the function to be used when it is called], and `arg=late default` if you want the default to be evaluated each time the function is called. If you are not sure which one you need, ask someone to help you, because using the wrong one is a common source of bugs.
The part in brackets is a gloss that might be helpful to the beginner, who might experience a WTF at "evaluated when defined"
And that's what happens when you need to be pedantically correct. Not particularly useful to a novice, especially with the FUD at the end.
To be honest, I was surprised you chose early binding for "when in doubt". I would expect that "early binding when late is appropriate" is a much more common bug for beginners than "late binding when early is appropriate". Of course I may be biased because it's the only bug now, but I would think that would continue to be true for beginners if late binding is made available. Especially when they're testing code in the interactive interpreter. There are probably many programs where the function is called with the argument missing only once, in which case it doesn't matter, until you invoke the function repeatedly in the same context.
There's a performance cost to late binding when the result will always be the same. The compiler could optimize "=>constant" to "=constant" at compilation time [1], but if it's not actually a compile-time constant, only a human can know that that can be done. But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization? I'm disinclined to choose at this point, and will leave that up to educators and style guide authors.

ChrisA

[1] Technically, there'd still be a difference, but only if you mess with the function's dunders. So for safety, probably the compiler should never optimize it.
But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization?
Given the decades of code that is using early binding now, I think it would be a really bad idea not to teach that as the default - folks absolutely need to understand early binding and its limitations. But at the end of the day, other than the legacy, I think having a late binding option will be a bit easier for newbies, but not radically different. -CHB
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Chris Angelico writes:
While it's possible to avoid some of those, we can't hold the language back because *someone who doesn't understand* might misunderstand.
Opposing the proposal wasn't the point of quoting Steve, the point was to provide IMO improved language in case the proposal gets adopted. So far, I oppose this proposal because I don't need it, I don't see that anybody else needs it *enough to add syntax*, and I especially don't see that anybody else needs it enough to add syntax that in the opinion of some folks who know this stuff way better than me *might get in the way of a more general feature in the nearish future*. (I know you differ on that point and I understand why, I just don't yet agree.) None of that has anything to do with user misunderstanding, it's a different assessment of the benefits and costs of adoption.
Resolving a promise to an object on first reference would break that pattern completely, so I would expect most people to assume that "each time it is needed" means "each time you omit the argument".
Back to the educational issues: We're not discussing most people. I understand that the new syntax is well-defined and not hard to understand. To me, most people aren't an educational issue. We're discussing programmers new to all patterns Pythonic. You're already breaking the pattern of immediate evaluation (if they understand it), and the example of generators (well-understood before default arguments are?) shows that objects that appear to be defined may be evaluated when called for. Come to think of it, I don't think next() is the comparable point, it's when you call the generator function to get an iterable object. IMO, that's a better argument for your point of view.
And that's what happens when you need to be pedantically correct. Not particularly useful to a novice, especially with the FUD at the end.
I have no idea what you mean by "that's what happens," except that apparently you don't like it. As I see it, a novice will know what a function definition is, and where it is in her code. She will know what a function call is, and where it is in her code. She will have some idea of the order in which things "get done" (in Python, "executed", but she may or may not understand that def is an executable statement). She can see the items in question, or see that they're not there when the argument is defaulted. To me, that concrete explanation in terms of the code on the screen will be more useful *to the novice* than the more abstract "when needed". As for the "FUD", are you implying that you agree with Steve's proposed text? So that if the programmer is unsure, it's perfectly OK to use early binding, no bugs there?
There's a performance cost to late binding when the result will always be the same. The compiler could optimize "=>constant" to "=constant" at compilation time [1], but if it's not actually a compile-time constant, only a human can know that that can be done.
We're talking about folks new to the late-binding syntax and probably to Python who are in doubt. I don't think performance over correctness is what we want to emphasize here.
But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization? I'm disinclined to choose at this point, and will leave that up to educators and style guide authors.
I think most educators will go with "when in doubt, ask a mentor if available, or study harder if not -- anyway, you'll get it soon, it's not that hard", or perhaps offer a much longer paragraph of concrete advice. Style guide authors should not touch it, because it's not a matter of style when either choice is a potential bug.

And that's why I think the benefits are basically limited to introspecting the deferred object, whether it's a special deferred object or an eval-able equivalent string. The choice is unpleasant for proponents: if you choose a string, Eric's "but then we have to support strings even if we get something better" is a criticism, and if you choose a descriptor-like protocol, David's criticism that it can and should be more general comes to bear.

You say that there's a deficiency with a generic deferred, in that in your plan late-bound argument expressions can access nonlocals -- but that's not a problem if the deferred is implemented as a closure. (This is an extension of a suggestion by Steven d'Aprano.) I would expect that anyway if generic deferreds are implemented, since they would be objects that could be returned from functions and "ejected" from the defining namespace.

It seems to me that access to nonlocals is also a problem for functions with late-bound argument defaults, if such a function is returned as the value of the defining function. I suppose it would have to be solved in the same way, by making that function a closure. So the difference is whether the closure is in the default itself, or in the function where the default is defined. But the basic issue, and its solution, is the same. That might be "un-Pythonic" for deferreds, but at the moment I can't see why it's more un-Pythonic for deferreds than for local functions as return values.

Regards, Steve
On Thu, Nov 4, 2021 at 5:28 AM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
And that's what happens when you need to be pedantically correct. Not particularly useful to a novice, especially with the FUD at the end.
I have no idea what you mean by "that's what happens," except that apparently you don't like it.
What I mean is that pedantically correct language inevitably ends up way too verbose to be useful in an educational context. (Please explain the behaviour of "yield from" in a generator. Ensure that you are absolutely precisely correct.)
As I see it, a novice will know what a function definition is, and where it is in her code. She will know what a function call is, and where it is in her code. She will have some idea of the order in which things "get done" (in Python, "executed", but she may or may not understand that def is an executable statement).
Given the number of people who assume that function definitions are declarations, it's clear that some things simply have to be learned.
She can see the items in question, or see that they're not there when the argument is defaulted. To me, that concrete explanation in terms of the code on the screen will be more useful *to the novice* than the more abstract "when needed".
As for the "FUD", are you implying that you agree with Steve's proposed text? So that if the programmer is unsure, it's perfectly OK to use early binding, no bugs there?
If the programmer is unsure, go ahead and pick something, then move on. It's better to just try something and go than to agonize over which one you should use. Tell people that something is crucially important to get right, and they're more likely to be afraid of it. Give them a viable default (pun partly intended) and it's far less scary.
There's a performance cost to late binding when the result will always be the same. The compiler could optimize "=>constant" to "=constant" at compilation time [1], but if it's not actually a compile-time constant, only a human can know that that can be done.
We're talking about folks new to the late-binding syntax and probably to Python who are in doubt. I don't think performance over correctness is what we want to emphasize here.
Maybe, but there's also a lot of value in defaulting to the fast option. For instance, in a lot of situations, these two will behave identically:

    for key in some_dict:
    for key in list(some_dict):

We default to iterating over the object itself, even though that could break if you mutate the dict during the loop. The slower and less efficient form is reserved for situations where it matters.

That said, it might be better in this case to recommend late-binding by default. But most importantly, either recommended default is better than making things sound scary.
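The difference only bites when the dict is mutated mid-loop; a small demonstration:

    d = {"a": 1, "b": 2}

    for key in list(d):   # iterate over a snapshot of the keys
        if d[key] == 1:
            del d[key]    # safe: the snapshot is unaffected

    print(d)              # {'b': 2}

    # By contrast, "for key in d:" with the same del raises
    # RuntimeError: dictionary changed size during iteration.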
But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization? I'm disinclined to choose at this point, and will leave that up to educators and style guide authors.
I think most educators will go with "when in doubt, ask a mentor if available, or study harder if not -- anyway, you'll get it soon, it's not that hard", or perhaps offer a much longer paragraph of concrete advice. Style guide authors should not touch it, because it's not a matter of style when either choice is a potential bug.
I would initially just recommend early-binding by default, since it's going to have better cross-version compatibility. By the time that's no longer a consideration, I personally, and the world in general, will have a lot more experience with the feature, so we'll be able to make a more informed decision.
You say that there's a deficiency with a generic deferred, in that in your plan late-bound argument expressions can access nonlocals -- but that's not a problem if the deferred is implemented as a closure. (This is an extension of a suggestion by Steven d'Aprano.) I would expect that anyway if generic deferreds are implemented, since they would be objects that could be returned from functions and "ejected" from the defining namespace.
If the deferred is implemented as a closure, it would be useless for this proposal. Look at the clunky proposals created to support the bisect example, and the weird messes to do things that, with a little bit of compiler support, are just ordinary variable references.
It seems to me that access to nonlocals is also a problem for functions with late-bound argument defaults, if such a function is returned as the value of the defining function. I suppose it would have to be solved in the same way, by making that function a closure. So the difference is whether the closure is in the default itself, or in the function where the default is defined. But the basic issue, and its solution, is the same. That might be "un-Pythonic" for deferreds, but at the moment I can't see why it's more un-Pythonic for deferreds than for local functions as return values.
Not sure what you mean, but the way I've implemented it, if you refer to a nonlocal in a late-bound default, it makes a closure just the same as if you referred to it in the function body. This is another reason that generic deferreds wouldn't work, since the compiler knows about the late default while compiling the *parent* function (and can thus make closure cells as appropriate). ChrisA
Chris Angelico writes:
What I mean is that pedantically correct language inevitably ends up way too verbose to be useful in an educational context.
Nonsense. If you leave out the part in brackets and the "FUD", it's *much* shorter than what Steve wrote, and more accurate. What would require much more verbiage is a proper explanation of the behavior of mutable objects and how to figure out when you want early binding and when you want late binding. But none of us even tried to do that, so you can't hang that on me.
That said, it might be better in this case to recommend late-binding by default. But most importantly, either recommended default is better than making things sound scary.
So is no recommended default. Why not just tell the student the fact that early binding is faster?
If the deferred is implemented as a closure, it would be useless for this proposal.
I don't believe that, because this is exactly how Common Lisp handles late-bound default expressions, by creating a closure. In fact there's no dedicated syntax for early-bound defaults at all; you just quote them, and the quote expression gets evaluated. (The compiler is allowed, but not required, to optimize the evaluation away if it can prove that this does not ever affect the value of the argument.)
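The same strategy can be spelled in today's Python - a rough sketch, with the _UNSET sentinel and _default parameter purely illustrative:

    _UNSET = object()

    default_timeout = 10

    def connect(timeout=_UNSET, _default=lambda: default_timeout):
        if timeout is _UNSET:
            timeout = _default()  # the wrapped expression is re-evaluated per call
        return timeout

    print(connect())      # 10
    default_timeout = 99
    print(connect())      # 99 - the default tracks the current value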
On 03/11/2021 18:28, Stephen J. Turnbull wrote:

So far, I oppose this proposal because I don't need it, I don't see that anybody else needs it *enough to add syntax*, and I especially don't see that anybody else needs it enough to add syntax that in the opinion of some folks who know this stuff way better than me *might get in the way of a more general feature in the nearish future*.

Some people clearly need it, viz. those who use a default such as `[]` and then ask on Stack Overflow why they get surprising results. With late binding you can do anything that you can do with early binding, but not vice versa. And IMO late binding is actually more intuitive - YMMV. You say you don't need late binding, but I would be very surprised if you never found a situation where it was useful. (Starting with avoiding the 'if x==None: x=[]' idiom, but I am sure there will be others, like the `hi:=len(a)` example.) In short, if it was available, I think you would find uses for it.

As for the "more general feature": it has shown no signs of emerging in the last decade or more, and there is no indication that it will in the "nearish future". (Define "nearish": 1 year, 5, 10, 20 ... never?) It is a *very* tricky thing indeed to specify, and understand, and use; it is arguably a niche use case that not many people will want; and if one day it finally appears, it can probably be reconciled with a long-implemented PEP 671 anyway. In short, this "more general feature" is a myth. A phantom. And, frankly, an excuse to argue against a PEP which will have immediate benefit to some (lots of?) Python programmers.

We're discussing programmers new to all patterns Pythonic. You're already breaking the pattern of immediate evaluation (if they understand it), and the example of generators (well-understood before default arguments are?) shows that objects that appear to be defined may be evaluated when called for.

Not sure what your point is. If people understand (in the example of generators) that objects that appear to be defined may be evaluated when called for, they shouldn't have too much difficulty understanding late-bound defaults. If they don't, they may still find late-bound defaults intuitive. And if not, well, everyone has to start learning somewhere.

As I see it, a novice will know what a function definition is, and where it is in her code. She will know what a function call is, and where it is in her code. She will have some idea of the order in which things "get done" (in Python, "executed", but she may or may not understand that def is an executable statement). She can see the items in question, or see that they're not there when the argument is defaulted. To me, that concrete explanation in terms of the code on the screen will be more useful *to the novice* than the more abstract "when needed".

(Semi-jocular point) I know you're trying not to be sexist, and yet perhaps in a way you are. Can we adopt a convention of using the male pronoun for novice programmers and the female for experienced ones? After all, in the very early days of computing, virtually all the coders were (underpaid) women. 😁

[snip]
On Wed, Nov 3, 2021, 9:19 PM Rob Cliffe via Python-ideas
Some people clearly need it, viz. those who use a default such as `[]` and then ask on Stack Overflow why they get surprising results.
This is silly. All those folks on StackOverflow are told "use a sentinel." The fact beginners can make a mistake doesn't mean a feature is wrong, it means beginners are beginners. They don't NEED it, there are existing solutions. Even though I don't support this proposal, there are things that beginners ask about that we don't NEED but are still worth adding. For example, even though I was only lukewarm in support of the walrus operator, I agree it makes some code constructs more concise and more readable. But it WAS new syntax to do the same thing that was already possible with an extra line or two before. I recognize that in many ways this proposal is similar. It's extra syntax to make a certain coding pattern shorter. I don't believe that's absurd, I just think the balance tips the other way. What this covers is less important than what the walrus operator covers, because all the syntax proposed is uglier and less intuitive than walrus, and because it may obstruct a much more important general feature I'd like to have added.

With late binding you can do anything that you can do with early binding, but not vice versa. And IMO late binding is actually more intuitive - YMMV.
This seems exactly opposite the real situation. Late binding is completely and straightforwardly handled by a sentinel. Yes, it doesn't make the automatic help() that pretty. Yes, it takes an extra line in the body. But the semantics are available. In contrast, how can a late-binding call POSSIBLY know what the default value was at the point of function definition?!

    x = 1234
    def foo(a, b=x):
        # ... whatever

    x = 567
    foo(88)

That definition-time value of 'x' is just lost. I don't consider that behavior especially important, I admit. There are plenty of names, and if you want one not to change, don't change it. Indeed, if Python 0.9 had come with late binding, my feelings about Python and its popularity would probably be nearly identical. But it didn't. So now we are discussing confusing and subtle syntax variations for a niche use case, and I don't believe that's worthwhile.
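Run as ordinary Python, the early-bound version shows exactly that capture:

    x = 1234
    def foo(a, b=x):
        return a, b

    x = 567
    print(foo(88))  # (88, 1234) - the definition-time value of x was saved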
On Thu, Nov 4, 2021 at 12:42 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
This seems exactly opposite the real situation. Late binding is completely and straightforwardly handled by a sentinel. Yes, it doesn't make the automatic help() that pretty. Yes it takes an extra line in the body. But the semantics are available.
In contrast, how can a late binding call POSSIBLY know what the default value was at the point of function definition?!
    x = 1234
    def foo(a, b=x):
        # ... whatever

    x = 567
    foo(88)
That definition-time value of 'x' is just lost.
No one is suggesting removing early-binding from the language. But even if we were, it wouldn't be hard to solve this purported impossibility with a decorator. For instance, the classic "loop to make a bunch of functions" problem could be solved this way:

    import functools

    def snapshot(*values):
        def wrapper(f):
            @functools.wraps(f)
            def wrapped(*a, **kw):
                return f(*a, **kw, snapshot=values)
            return wrapped
        return wrapper

    for n in range(10):
        @snapshot(n)
        def func(btn, *, snapshot):
            print("Clicked on", snapshot[0])
        new_button(onclick=func)

Tada, early binding created by closure. Pretty much by definition, nothing that we create is truly new; it's just a question of how awkward it is to spell something. But the existing spelling for argument defaults will continue to have the existing semantics. That isn't changing. All that's changing is that there will be a new way to spell parameters with defaults, which will have slightly different semantics.

(If Python had chosen late-binding and hadn't had decorators, it still wouldn't have been too hard to do things - all you'd need is an explicit closure at the definition site, which would, of course, achieve the same goal. But I think the snapshot decorator is more elegant, since it can be reused.)

ChrisA
For example, even though I was only lukewarm in support of the walrus operator, I agree it makes a some code constructs more concise and more readable. But it WAS new syntax to do the same thing that was already possible with an extra line or two before.
It's extra syntax to make a certain coding pattern shorter. I don't believe that's absurd, I just think the balance tips the other way.
It’s a little more than just shorter. There is no way to universally spell "not specified": None works fine in most cases, but not all. Custom sentinels can be confusing to users, etc.

All that being said, like any other PEP, there are two questions:

1) will this be an improvement?
2) if so, is it worth the churn?

And the SC will need to make those decisions. FWIW, I’m not totally sure where I come down on (2) myself.

because it may obstruct a much more important general feature I'd like to have added.
Could someone please flesh out this objection? I can’t see at all why having late bound defaults will obstruct the addition of a general purpose deferred evaluation system. Except maybe because we could no longer use late bound defaults as a use case for a deferred object. But as Chris A has made clear, a general purpose deferred object isn’t a great fit for this use case anyway. -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Thu, Nov 4, 2021 at 3:56 PM Christopher Barker <pythonchb@gmail.com> wrote:
All that being said, like any other PEP, there are two questions:
1) will this be an improvement? 2) if so, is it worth the churn?
And the SC will need to make those decisions.
FWIW, I’m not totally sure where I come down on (2) myself.
To try to help people make their decisions on that point, allow me to try to summarize the churn that will be involved.

1) The grammar for function signatures becomes a little more complicated. To be fair, most of that complication won't actually come up (for instance, you'd never have both an annotation AND a type comment, even though the grammar says you might), but it's extra for tools to have to cope with.

2) Any tool that does introspection of functions (or anything that uses inspect.Signature objects) will need to be updated.

2a) Not just help() and friends; this includes things like clize, which uses defaults to configure argparse.

3) As with any change, documentation and recommendations will have to depend on the version ("for compatibility, use X, otherwise, use Y")

4) Anything that synthesizes function objects will need to consider how to handle late defaults.

I don't think any of it is particularly onerous, but there are quite a few little places to check.
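Point 2 is easy to see with today's machinery (a minimal example):

    import inspect

    def f(a, b=0, *, key=None):
        pass

    sig = inspect.signature(f)
    print(sig.parameters['b'].default)    # 0 - early defaults are plain attributes
    print(sig.parameters['key'].default)  # None
    # A late-bound default would need some new representation here.

ChrisA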
Rob Cliffe via Python-ideas writes:
Some people clearly need it, viz. those who use a default such as `[]` and then ask on Stack Overflow why they get surprising results.
That's not a "need". That's a "misunderstanding".
You say you don't need late binding, but I would be very surprised if you never found a situation where it was useful. (Starting with avoiding the 'if x==None: x=[]' idiom,
I'm perfectly happy with 'x = [] if x is None else x'. I've been typing that or 'x = x or []' for 25 years; I'm not going to stop now just because new syntax has been added.
In short, if it was available, I think you would find uses for it.
Speak for yourself, not for me, please. For me, this is not even a "nice to have". I am pretty sure I will only use it if I contribute to a project where defaulting to None and testing in the body is ruled out by the style guide. What little doubt I have is that there is something that this is genuinely needed for, which is introspecting the default expression for a late-bound default. In the case of the '<expr> if x is None else x' idiom, you'd have to disassemble the byte code, if you can find it. Here, you'd have the default expression represented in some accessible form. But I have never yet introspected a function's arguments' defaults, so I suspect I won't do that for late-bound defaults either.
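Compare what introspection sees for the idiom today - a quick check with the dis module:

    import dis

    def f(x=None):
        x = [] if x is None else x
        return x

    dis.dis(f)  # the default expression is just bytecode in the body;
                # nothing in f's signature records it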
Come to think of it, I don't think next() is the comparable point, it's when you call the generator function to get an iterable object. IMO, that's a better argument for your point of view.
Not sure what your point is.
It's exactly what I wrote. In using a generator, next() corresponds to the reference to the argument in the body, while calling the generator function to get the generator's iterator corresponds to the evaluation of the default expression to bind to the argument.
(Semi-jocular point) I know you're trying not to be sexist, and yet perhaps in a way you are.
Point taken, even though it's expressed passive-aggressively. The "semi-jocular" just makes it worse, by the way. All you had to do is ask.
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "no spaces around '=' in function headers and calls". -- ~Ethan~
On Thu, Nov 4, 2021 at 5:54 AM Ethan Furman <ethan@stoneleaf.us> wrote:
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "no spaces around '=' in function headers and calls".
Not sure what you mean, but the distinction, if I'm interpreting your statement correctly, is actually the same one that exists between different languages. For instance, I tried this in JavaScript (specifically in Node.js):
> function f(a=console.log("Evaluating a")) {
...   console.log("Function body begins");
...   console.log("a is", a);
... }
undefined
> f()
Evaluating a
Function body begins
a is undefined
undefined
Evaluating the argument default as the function begins makes perfect sense. Evaluating the argument default when you *refer to* the variable in question does NOT make sense. (In the JS example, that would be having "Function body begins" before "Evaluating a".) Proposals to have generic deferreds that get calculated when referenced would be incredibly surprising. "Immediately", when code is written in the function header, can be interpreted as "when the function is created" or "when the function is called", but should not be interpreted as "when you use this variable". ChrisA
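The corresponding session in current Python, where the default is evaluated once at definition time, makes the contrast concrete (runnable as-is):

def f(a=print("Evaluating a")):   # "Evaluating a" prints here, at def time
    print("Function body begins")
    print("a is", a)

f()
# Function body begins
# a is None    <- print() returned None; nothing is re-evaluated at call time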
On 11/3/21 2:31 PM, Chris Angelico wrote:
On Thu, Nov 4, 2021 at 5:54 AM Ethan Furman wrote:
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "no spaces around '=' in function headers and calls".
Not sure what you mean,
I mean the same thing that D'Aprano has reiterated several times:

def do_something_fun(target:Any, action:int=-1, permissions:int=>target.perm): pass

vs

def do_something_fun(target:Any, action:int=-1, @permissions:int=target.perm): pass

Having the `@` in front instead of buried in the middle is clear, and just like the * and ** in `*args` and `**kwds` signal that those are different types of variables, the @ in `@permissions` signals that `permissions` is a different kind of variable -- and yes, the fact that it is late-bound does make it different; to claim otherwise is akin to claiming that `args` and `kwds` aren't different because in the end they are just names bound to objects.
[snip javascript example]
Is your javascript example trying to show that putting the sigil in front is nonsensical? If no, then what? If yes, then it is plain that you and I simply disagree and neither of us is going to convince the other. -- ~Ethan~
On Thu, Nov 4, 2021 at 9:33 AM Ethan Furman <ethan@stoneleaf.us> wrote:
On 11/3/21 2:31 PM, Chris Angelico wrote:
On Thu, Nov 4, 2021 at 5:54 AM Ethan Furman wrote:
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "no spaces around '=' in function headers and calls".
Not sure what you mean,
I mean the same thing that D'Aprano has reiterated several times:
def do_something_fun(target:Any, action:int=-1, permissions:int=>target.perm): pass
vs
def do_something_fun(target:Any, action:int=-1, @permissions:int=target.perm): pass
Having the `@` in front instead of buried in the middle is clear, and just like the * and ** in `*args` and `**kwds` signal that those are different types of variables, the @ in `@permissions` signals that `permissions` is a different kind of variable -- and yes, the fact that it is late-bound does make it different; to claim otherwise is akin to claiming that `args` and `kwds` aren't different because in the end they are just names bound to objects.
[snip javascript example]
Is your javascript example trying to show that putting the sigil in front is nonsensical? If no, then what? If yes, then it is plain that you and I simply disagree and neither of us is going to convince the other.
It's demonstrating that a plain equals sign can mean late-binding in some languages and early-binding in others. Both of those are perfectly normal interpretations. You were quoting something where I was talking about deferreds that would be evaluated on usage, potentially much later in the function, and I'm saying that that makes no sense. I'm also saying that there is no difference between the variables, only the defaults, and therefore that they shouldn't be adorned in this way. But that's clearly something where neither of us is going to convince the other. ChrisA
On Mon, Nov 1, 2021 at 7:39 PM Evpok Padding <evpok.padding@gmail.com> wrote:
I definitely agree with that sentiment. With beginners I don't even talk about function defaults at first, and when I do, it's when we have already had a talk about mutables, so I can just say that you almost never want a mutable default but rather use None as a sentinel. It's not that hard, and it serves as a reminder of how mutables work, so it's actually good for teaching!
Or you can just say "but rather use => when defining the default", and then you don't have to explain more things like whether to use "== None" or "is None" just to show how to have a default that builds a new thing. ChrisA
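For concreteness, the classic beginner trap and today's workaround, both runnable in current Python (under this proposal, the signature alone would carry the intent, as target=>[]):

def add_item_shared(item, target=[]):   # one list, created once at def time
    target.append(item)
    return target

add_item_shared(1)   # [1]
add_item_shared(2)   # [1, 2] -- the surprise that sends beginners to Stack Overflow

def add_item(item, target=None):        # the idiom taught today
    if target is None:
        target = []
    target.append(item)
    return target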
Taking a step back: Suppose Python didn't have default values AT ALL for function parameters? Say that unpassed parameters were always set to some sentinel value (maybe None, maybe some special value NotPassed). Would we want to add them to the language? Surely almost everybody would say yes. (I can't believe anyone would be happy with removing them now.) Then there would be a discussion about whether the defaults should be calculated only once (i.e. early-bound) or on every function call (i.e. late-bound). Historically the decision was made to make them early-bound. I don't know how that decision was arrived at; it was before my (Python) time. But consider this: AFAICS, *everything* you can do with early binding, you can do with late binding, but *not* vice versa. (To simulate early binding if you actually only have late binding, simply put the default value in a global variable which you never change, and use that global variable as your default value. As is commonly done today.) So PEP 671 merely attempts to restore functionality that was (regrettably IMO) left out as a result of that early decision. Best wishes Rob Cliffe
On Fri, Nov 5, 2021 at 7:36 AM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
But consider this: AFAICS, *everything* you can do with early binding, you can do with late binding, but *not* vice versa. (To simulate early binding if you actually only have late binding, simply put the default value in a global variable which you never change, and use that global variable as your default value. As is commonly done today.)
Everything you can do with either, you can do with the other. You just demonstrated one way, and if globals won't work, closures will. It's all about expressiveness and clarity of intent. ChrisA
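The reverse direction -- getting late-bound behaviour out of today's early binding -- can also be shown runnably, by binding a zero-argument factory early and calling it per invocation (a sketch of the equivalence being discussed, not of the PEP's mechanism):

def add_item(item, target_factory=list):
    target = target_factory()   # the early-bound factory runs at call time
    target.append(item)
    return target

add_item(1)   # [1] -- a fresh list on every call
add_item(2)   # [2]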
Rob Cliffe via Python-ideas writes:
So PEP 671 merely attempts to restore functionality that was (regrettably IMO) left out as a result of that early decision.
This is a *new* feature, which adds syntax. A lot of contributors to this thread think it's useful enough to overcome the normal Pythonic reluctance to add (1) new features that are syntactic sugar for one-line statements, and (2) new syntax. Others disagree. This utility is either going to be enough to convince the SC, or it's not. It's clear that the battle lines are being drawn.[1] So let's stop trying to convince each other of whether this is a good proposal or not, and turn to making it the best proposal possible, so as to give proponents the best chance of getting it in, and to give opponents the least unpalatable misfeature ;-) if it does get in.

Still on the agenda as far as I can see:

1. Syntax. The proposals I can recall are:
   a. x=>default
   b. *x=default
   c. x=@default
   d. maybe putting * or @ on the opposite component in b and c?
   e. a keyword before default, such as "late" or "defer".
   Others? I believe Chris currently favors a.

2. The implementation.
   a. Keep an abstract representation of the default expression as a string in a dunder, and prefix the compiled body with code to evaluate it in the appropriate namespace.
   b. As in a, but the abstract representation is an AST or similar.
   c. Wrap the evaluation in a function (or function-like object) and invoke it before the compiled body (this was suggested by Steven d'Aprano as a compromise, I believe).
   d. Wrap the evaluation in a general-purpose deferred object (this is not in the scope of PEP 671, discussion below).
   I believe Chris's current reference implementation is a (or if I got that wrong, closer to a than any of the others).

It would be helpful to the discussion if Chris starts by striking any of the above that he's unwilling to implement.

A question for Chris: In your proposal, as I understand it, an expensive default expression would always be evaluated, even if it's not always needed. E.g., in this toy example:

def foo(x:int=>expensive()):
    if delphic_oracle():
        return x
    else:
        return 0

expensive() is always evaluated. In that (presumably quite rare) case, we'd just use a sentinel instead, of course.

I have two further comments, which are mostly addressed to Steve, I guess. First, I don't really understand Steve's intended difference between 2c and 2d. Second, as I understand them, both 2c and 2d encapsulate the expression in bytecode, so the nice property of introspectability of the expression is lost. I guess you can decompile it more easily than if it's just interpolated into the function body, but if it's implemented as a closure, don't we lose the identity of the identifiers in the expression? And if it's not (e.g., the function-like thing encapsulates an abstract representation of the expression rather than bytecode that computes it), what's the point of 2c? I don't see how it has any advantage over 2a or 2b.

Footnotes:
[1] Nobody's right, if everybody's wrong. -- Stephen Stills

You can never have too many Steves in a discussion!
On Sat, Nov 6, 2021 at 2:57 AM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
Still on the agenda as far as I can see:
1. Syntax. The proposals I can recall are:
   a. x=>default
   b. *x=default
   c. x=@default
   d. maybe putting * or @ on the opposite component in b and c?
   e. a keyword before default, such as "late" or "defer".
Others? I believe Chris currently favors a.
Yes, I'm currently favouring "x=>default", though weakly; but I strongly favour syntax options that change only the part around the equals sign (no adornment before the variable name, no adornment after the expression). There are a few syntaxes listed in the PEP, and there's a plan in progress to strengthen one of those syntaxes somewhat.
2. The implementation.
   a. Keep an abstract representation of the default expression as a string in a dunder, and prefix the compiled body with code to evaluate it in the appropriate namespace.
   b. As in a, but the abstract representation is an AST or similar.
   c. Wrap the evaluation in a function (or function-like object) and invoke it before the compiled body (this was suggested by Steven d'Aprano as a compromise, I believe).
   d. Wrap the evaluation in a general-purpose deferred object (this is not in the scope of PEP 671, discussion below).
I believe Chris's current reference implementation is a (or if I got that wrong, closer to a than any of the others).
It would be helpful to the discussion if Chris starts by striking any of the above that he's unwilling to implement.
Sure. Let's see.

a. This is what's currently implemented, plus using Ellipsis in __defaults__ as a marker that there needs to be a default expression. It's a little bit complicated, but it does mean that the vast majority of functions aren't significantly affected by this proposal.
b. Less preferred than a, due to the higher cost of retaining the AST, but I'd be fine with this conceptually.
c. While this is philosophically interesting, I'm not sure how it would be implemented, so I'd have to see someone else's implementation before I can truly judge it.
d. Definitely not, and if someone else wants it, it can be a competing proposal.

So: a and b are yes, c is dubious, d is not.
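A hedged sketch of what introspecting option (a) could look like, assuming the Ellipsis marker described above and a dunder named __defaults_extra__ holding the saved source text (both the dunder name and the describe_defaults helper are illustrative, not settled API):

def describe_defaults(func):
    extra = getattr(func, "__defaults_extra__", None)  # hypothetical dunder
    for i, dflt in enumerate(func.__defaults__ or ()):
        if dflt is Ellipsis and extra is not None and extra[i] is not None:
            print(f"default {i}: late-bound, expression {extra[i]!r}")
        else:
            print(f"default {i}: early-bound, value {dflt!r}")

On today's functions this simply reports early-bound values; the Ellipsis branch would only fire for functions built by the reference implementation.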
A question for Chris: In your proposal, as I understand it, an expensive default expression would always be evaluated, even if it's not always needed. Eg, in this toy example:
def foo(x:int=>expensive()):
    if delphic_oracle():
        return x
    else:
        return 0
expensive() is always evaluated. In that (presumably quite rare) case, we'd just use a sentinel instead, of course.
It will be evaluated even if it's not referred to in the body, but only if the argument is omitted. There is a guarantee that, once the function body begins executing, all arguments (whether given values or populated from defaults) have been assigned. So, yes, if you want conditional evaluation, you do need to use a sentinel.
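So for genuinely conditional evaluation, the shape stays as it is today; a runnable sketch, with stand-ins for the toy example's expensive() and delphic_oracle():

def expensive():         # stand-in for a costly computation
    return 42

def delphic_oracle():    # stand-in for the unpredictable branch
    return True

_MISSING = object()

def foo(x=_MISSING):
    if delphic_oracle():
        if x is _MISSING:
            x = expensive()   # only evaluated when actually needed
        return x
    return 0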
I have two further comments, which are mostly addressed to Steve, I guess. First, I don't really understand Steve's intended difference between 2c and 2d. Second, as I understand them, both 2c and 2d encapsulate the expression in bytecode, so the nice property of introspectability of the expression is lost. I guess you can decompile it more easily than if it's just interpolated into the function body, but if it's implemented as a closure, don't we lose the identity of the identifiers in the expression? And if it's not (e.g., the function-like thing encapsulates an abstract representation of the expression rather than bytecode that computes it), what's the point of 2c? I don't see how it has any advantage over 2a or 2b.
I'm not entirely sure either. There are a few concepts that could be described, but without getting some implementation going, it'll be hard to judge them.
Footnotes: [1] Nobody's right, if everybody's wrong. -- Stephen Stills You can never have too many Steves in a discussion!
We do have a good few. Not quite as many Chrises, although I'm more likely to see Chris replying to Chris replying to Chris on a nerdy mailing list than I am anywhere else! My current implementation does have one somewhat annoying flaw: the functionality of late-bound defaults is buried in the function's bytecode, but the description of it is in a function dunder, and can be changed. I'd be open to suggestions that would make this a feature of the code object instead, thus preventing desynchronization. ChrisA
My "vote" if one has to be chosen: #1: x=defer default #2: @x=default #3: x=@default #4: x=>default #5:. *x=default Explicit is better than implicit.
On Sat, Nov 6, 2021 at 10:46 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
My "vote" if one has to be chosen:
Preferences are very important. This isn't a "vote" in the sense that the one with the most choices will be selected, but I always want to hear people's preferences.
#1: x=defer default
#2: @x=default
#3: x=@default
#4: x=>default
#5: *x=default
I don't like "defer" because it implies things that aren't true, and I really don't like *x=default since it would be very confusing with *x meaning "collect zero or more positional args", but the others are all at least somewhat viable.
Explicit is better than implicit.
That's what everyone says. Even people who are advocating precisely opposite viewpoints. :) ChrisA
On 05/11/2021 15:57, Stephen J. Turnbull wrote:
Still on the agenda as far as I can see:
1. Syntax. The proposals I can recall are:
   a. x=>default
   b. *x=default
   c. x=@default
   d. maybe putting * or @ on the opposite component in b and c?
   e. a keyword before default, such as "late" or "defer".
Others? I believe Chris currently favors a.

Please. I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is). Rob Cliffe
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas < python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it. BTW, => would have a similar problem if it's adopted as a shorter way to spell lambda. And that would be worse, as putting a lambda in a default expression might be good style in some cases :-) -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Nov 10, 2021 at 6:02 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it.
BTW, => would have a similar problem if it's adopted as a shorter way to spell lambda. And that would be worse, as putting a lambda in a default expression might be good style in some cases :-)
In both cases, it would be confusing to a human, but not technically ambiguous. I'm not sure how important that will be - neither case seems particularly common, and if you do need to do it, you can always parenthesize a bit. Using the walrus operator in a default expression would be VERY weird (why on earth would you be assigning in the middle of default arg handling?!?), but if you really want it, sure! Using a hypothetical lambda function in an argument default wouldn't be unreasonable, but the number of times you'd also need that to be late-bound would be extremely few. Normally you'd get something like this:

def merge_objects(stuff, key=lambda item: item.id):
    ...

where the default key function doesn't need to refer to any of the other parameters, so it can be early-bound. But maybe you want to be able to do something weird like:

def merge_objects(stuff, idfield="id", match=>lambda item: item[idfield]):
    ...

in which case that might end up being spelled "match=>item => item[idfield]", but aside from that, it's unlikely to cause major problems. (For the most part, lambda functions will be used when *calling* that sort of function, and there's no ambiguity there, since you'll only ever use "=" for keyword arguments, or nothing at all for positional.) My view on this is: All variants of spelling that involve changes to the equals sign are one group of options, and it's my favoured group (whether it's "=>", ":=", "=:", etc). All variants that involve adornments elsewhere ("@var=dflt", "var=@dflt@", "var=`dflt`") are less appealing to me. The ":=" syntax is listed in the current PEP, but I'll adjust the wording of things a bit to make that group clearer. ChrisA
On Wed, Nov 10, 2021 at 6:02 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it.
BTW, there is one other small wrinkle with the := spelling, which is that it's very similar to annotation syntax:

def spam(a:int=1): ...
def ham(a:=1): ...

Again, not a fundamental problem to the parser, since an empty expression isn't a valid annotation, but could be confusing. I don't think we're going to get away from that confusion. There are just too many things we want to do with the equals sign, and only so many keys on most people's keyboards. ChrisA
Chris, Will we be able to use late-bound arguments in dataclass when it's creating the __init__ function?

@dataclass
class C:
    x: int
    y: int
    ls: list[int] => [x, y]
On 10 Nov 2021, at 11:25 AM, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Nov 10, 2021 at 6:02 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it.
BTW, there is one other small wrinkle with the := spelling, which is that it's very similar to annotation syntax:
def spam(a:int=1): ...
def ham(a:=1): ...
Again, not a fundamental problem to the parser, since an empty expression isn't a valid annotation, but could be confusing.
I don't think we're going to get away from that confusion. There are just too many things we want to do with the equals sign, and only so many keys on most people's keyboards.
ChrisA
On Fri, Nov 26, 2021 at 6:22 PM Abdulla Al Kathiri <alkathiri.abdulla@gmail.com> wrote:
Chris,
Will we be able to use late-bound arguments in dataclass when it's creating the __init__ function?
@dataclass
class C:
    x: int
    y: int
    ls: list[int] => [x, y]
With that syntax, no, because there's no object that can be stored in an annotation dictionary that would represent the code construct to create that effect. But the __init__ function is constructed with exec(), and that means that, in theory, dataclasses._field_init could be enhanced to have some way to indicate this - or, possibly, to *always* use late-bound defaults, since it appears to use sentinels for everything. I don't know enough about the workings of dataclasses.dataclass to be able to say for sure, but a cursory glance does suggest that, in some way, this should be possible. It may require stringifying the default, though:

ls: list[int] = "[x, y]"

ChrisA
dataclasses use Field objects that can be created automatically, but also you can specify them if you need to do something special. And one of the special things you can do is set a default constructor -- I'm sure that could be extended to support late-bound defaults. -CHB On Thu, Nov 25, 2021 at 11:40 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Nov 26, 2021 at 6:22 PM Abdulla Al Kathiri <alkathiri.abdulla@gmail.com> wrote:
Chris,
Will we be able to use late-bound arguments in dataclass when it's creating the __init__ function?
@dataclass
class C:
    x: int
    y: int
    ls: list[int] => [x, y]
With that syntax, no, because there's no object that can be stored in an annotation dictionary that would represent the code construct to create that effect.
But the __init__ function is constructed with exec(), and that means that, in theory, dataclasses._field_init could be enhanced to have some way to indicate this - or, possibly, to *always* use late-bound defaults, since it appears to use sentinels for everything.
I don't know enough about the workings of dataclasses.dataclass to be able to say for sure, but a cursory glance does suggest that, in some way, this should be possible. It may require stringifying the default though.
ls: list[int] = "[x, y]"
ChrisA
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Yeah, it makes sense that the default_factory argument in the field object could be utilized to support late-bound defaults.
On 26 Nov 2021, at 10:42 PM, Christopher Barker <pythonchb@gmail.com> wrote:
dataclasses use Field objects that can be created automatically, but also you can specify them if you need to do something special. And one of the special things you can do is set a default constructor -- I'm sure that could be extended to support late-bound defaults.
-CHB
On Thu, Nov 25, 2021 at 11:40 PM Chris Angelico <rosuav@gmail.com> wrote: On Fri, Nov 26, 2021 at 6:22 PM Abdulla Al Kathiri <alkathiri.abdulla@gmail.com> wrote:
Chris,
Will we be able to use late-bound arguments in dataclass when it's creating the __init__ function?
@dataclass
class C:
    x: int
    y: int
    ls: list[int] => [x, y]
With that syntax, no, because there's no object that can be stored in an annotation dictionary that would represent the code construct to create that effect.
But the __init__ function is constructed with exec(), and that means that, in theory, dataclasses._field_init could be enhanced to have some way to indicate this - or, possibly, to *always* use late-bound defaults, since it appears to use sentinels for everything.
I don't know enough about the workings of dataclasses.dataclass to be able to say for sure, but a cursory glance does suggest that, in some way, this should be possible. It may require stringifying the default though.
ls: list[int] = "[x, y]"
ChrisA
-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
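For reference, today's closest runnable equivalent also shows the gap being discussed here: a default_factory takes no arguments, so it cannot see x and y the way a late-bound default could:

from dataclasses import dataclass, field

@dataclass
class C:
    x: int
    y: int
    ls: list = field(default_factory=list)   # fresh list per instance,
                                              # but the factory can't reference x or y

print(C(1, 2))   # C(x=1, y=2, ls=[])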
On 31/10/2021 21:54, David Mertz, Ph.D. wrote:
On Sun, Oct 31, 2021, 5:39 PM Rob Cliffe via Python-ideas
PEP 671 will be USEFUL to Python programmers. We want it! (When do we want it? Now!)
This feels dishonest. I believe I qualify as a Python programmer. I started using Python 1.4 in 1998. The large majority of my work life since then has been programming Python. I've written a bunch of books on Python. I was a director of the PSF.
I DO NOT want PEP 671, and do not feel it would be a net benefit to Python programmers.

Apologies. I meant to use jocular language to emphasize my point. Obviously that wasn't clear and you took me literally. I accept that some people do not want this PEP. Rob Cliffe
I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold.
I was the first in this thread to propose a far more general capability that I think WOULD meet the cost/benefit balance... and I proposed it because I think there is a way to meet the niche need that also has wide enough application to warrant new syntax.
Python isn't Perl. Not every piece of syntax that *someone* might occasionally use for a narrow need should be included in the language.
On 2021-10-31 14:38, Rob Cliffe via Python-ideas wrote:
Wonderful! +1,000,000. So are you going to write a PEP explaining exactly how you propose to implement deferred expressions and how they would be used? And perhaps provide a reference implementation? Then we can see how it works, and how it can be used to replace or improve PEP 671? Say, next month? I really don't want to be rude, but I can't think of anything more appropriate to express my frustration/impatience (and I'm not the one doing the work! I can only guess how Chris A feels) with this foot-dragging, than the adage (possibly from poker): "Put up or shut up".
No, I'm not going to do that, because as I've said repeatedly throughout this thread, I don't see late-bound defaults as a pressing problem. There is no urgency whatsoever. I do think some kind of deferred evaluation would be nice, but I'm not saying "we need to not do PEP 671 right away because we should instead do this other thing right away". We don't need to take any action on this matter at all. I would rather we do nothing for another 10 years than adopt the current proposal. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, 31 Oct 2021 at 12:45, Eric V. Smith <eric@trueblade.com> wrote:
I think it's safe to say people are opposed to the PEP as it currently stands, not in its final, as yet unseen, shape. But I'm willing to use other words than "I'm -1 on PEP 671". You can read my opposition as "as it currently stands, I'm -1 on PEP 671".
Same for me.
As for what seems like one major issue:
Yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal: it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple.
And to me and others, what you see as a strength, and seem opposed to changing, we see as a fatal flaw.
What if the walrus operator could only be used in "for" loops? What if f-strings were only available in function parameters? What if decorators could only be used on free-standing functions, but not on object methods?
In all of these cases, what could be a general-purpose tool would have been restricted to one specific context. That would make the language more confusing to learn. I feel you're proposing the same sort of thing with late-bound function argument defaults. And I think it's a mistake.
I agree with Eric. I can see the value in a small and specific proposal, if there's a small and specific issue involved. But as the discussions have progressed, it seems to me that the small and specific issue that we *thought* was involved has wider and more general implications:

1. There's a broader question of being able to tell if an argument was left unspecified in the call. Defaulting to None, and using a sentinel value, both try to address this in specific limited ways. PEP 671 is addressing the same issue, but from a different angle, because the code to supply the late-bound default has to say, in effect, "if this argument wasn't supplied, evaluate the late-bound expression and use that as a default". That suggests to me that a better mechanism than "use (*args, **kwargs) and check if the argument was supplied" would be generally useful, and PEP 671 might just be another workaround that could be better fixed by addressing that need directly.

2. The question of when the late-bound default gets evaluated, and in what context, leads straight into the deferred expression debate that's been ongoing for years, in many different contexts. Maybe PEP 671 is just another example of something that wouldn't be an issue if we had deferred expressions.

Sure, "now is better than never" might apply here - endlessly sticking with the status quo because we can't work out the details of the grand solution to everything is a classic "perfect is the enemy of the good" situation. But equally, maybe what we have already is *good enough*, and there's no real rush to solve just this one piece of the puzzle. It's tempting to solve the bit that we can see clearly right now, but that shouldn't blind us to the possibility of a more flexible solution that addresses the issue as part of a more general problem.
In contrast, a general deferred object would, to me, be really confusing about what scope it would get evaluated in -- I can't even imagine how I would do that -- how the heck am I supposed to know what names will be available in some function scope I pass this thing into??? Also, this would only allow a single expression, not an arbitrary amount of code -- if we're going to have some sort of "deferred object" -- folks will very soon want more than that, and want full deferred function evaluation. So that really is a whole other kettle of fish, and should be considered entirely separately.
And again, this is where we disagree. I think it should be considered in the full context of places it might be useful. I (and I think others) are concerned that we'd be painting ourselves into a corner with this proposal. For example, if the delayed evaluation were available as text via inspect.Signature, we'd be stuck with supporting that forever, even if we later added delayed evaluation objects to the language.
This is for me the main issue where I think "constraining the design of the broader feature later" really hurts. There's absolutely no doubt in my mind that *if* we already had deferred expressions, then we'd expose late-bound defaults as delayed expressions in the function's signature. If we implement them as strings just because we don't yet have delayed expressions, then when we do look at designing delayed expressions, we're stuck with the backward compatibility problem of how we fit them into function signatures without breaking all the code that's been written to expect a string. Yes, that's all 15 obscure utilities with a total user base of about 5 people ;-), I completely accept that this is a very niche issue - but backward compatibility often ends up concerned with such details, and they can be the hardest ones to fix.
I also have other problems with the PEP, not specifically about restricting the scope of where deferred evaluations are allowed. Most importantly, that it doesn't add enough expressiveness to the language to justify its existence as a new feature that everyone would have to learn. But also things such as: Where do exceptions get caught and handled (only by the caller)? How would you pass in "just use the default" from a wrapper function? And others, but even if they were all addressed except for the restricted nature of the feature, I'd still be -1 on the PEP.
These all seem like reasonable questions to me, and it feels to me that they are the sort of thing that is getting sidelined by the big deferred expression debate. Even if that debate ends up with a resolution, these questions will remain - and I can easily imagine everyone being so burned out at that point that we end up with sub-optimal, or even broken, answers to those questions, or possibly they just get completely forgotten about. Anyway, this discussion will no doubt take its course whatever I say. And once we have a final PEP, we can raise these issues again if they haven't been addressed. But until a new version of the PEP is released that addresses the points made so far, I don't think I have anything else to add, and the mountain of email is getting so big that I'm not even trying to follow the details of the arguments any more. So I'll wait and see where this ends up before commenting any more. Paul
On Sat, Oct 30, 2021, 9:40 PM Chris Angelico <rosuav@gmail.com> wrote:
I'm not sure what I think of a general statement like:
@do_later = fun1(data) + fun2(data)
I.e. we expect to evaluate the first class object `do_later` in some other context, but only if requested within a program branch where `data` is in scope.
If you want to create a "deferred" type, go ahead, but it won't conflict with this. There wouldn't be much to gain by restricting it to function arguments.
I agree there's no gain in restricting deferred computation to function arguments, but that's EXACTLY what your proposal is. The way you've written it, it's bound to an assignment, which seems very odd. Are you creating an arbitrary object which can be evaluated in some other context? Wouldn't that be some sort of constructor call?
It's true I don't particularly like the @ syntax. I was just speculating on continuity with Steven's syntax. Here's my general proposal, which I actually want, but indeed don't have an implementation for. I think a soft keyword is best, such as 'defer' or 'later', but let's call it 'delay' for now to avoid the prior hang-up on the word.

(A)

def foo(a: list, size: int = delay len(a)) -> None:
    print("The list has length", size)

Whenever a name is referenced in this future Python, the interpreter first asks if it is a special delayed object. If not, do exactly what is done now. However, if it *IS* that special kind of object, instead do something akin to 'eval()'. No, the delayed object probably shouldn't just contain a string, but perhaps a chunk of compiled bytecode.

(B) So what if we don't want to evaluate the delayed object? Either the same or a different keyword can do that:

def foo(a: list, size: int = delay len(a)) -> None:
    a.append(42)
    bar(a, delay size)

def bar(a, the_length):
    print("The expanded list has length", the_length)

(C) What if we want to be more general with delaying (and potentially skipping) actions?

expensive1 = delay big_computation(data)
expensive2 = delay slow_lookup(data)

def get_answer(data):
    # use globals for the example, less so in practice
    if approximate_compute_cost(data) > 1_000_000:
        return expensive2
    else:
        return expensive1

That's it. It covers everything in your PEP, and a great deal more that is far more important, all using the same syntax. A few special functions or operations should be able to look at the delayed object without evaluating it. I'm not sure of the details, but e.g.:
>>> print(delay expensive1)
<Delayed computation of 'big_computation(data)'>
On Sun, Oct 31, 2021 at 1:28 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 30, 2021, 9:40 PM Chris Angelico <rosuav@gmail.com> wrote:
I'm not sure what I think of a general statement like:
@do_later = fun1(data) + fun2(data)
I.e. we expect to evaluate the first class object `do_later` in some other context, but only if requested within a program branch where `data` is in scope.
If you want to create a "deferred" type, go ahead, but it won't conflict with this. There wouldn't be much to gain by restricting it to function arguments.
I agree there's no gain in restricting deferred computation to function arguments, but that's EXACTLY what your proposal is.
That's if you were to create a deferred *type*. An object which can be evaluated later. My proposal is NOT doing that, because there is no object that represents the unevaluated expression. You can't pull that expression out and evaluate it somewhere else. It wouldn't be meaningful.
The way you've written it, it's bound to an assignment, which seems very odd. Are you creating an arbitrary object which can be evaluated in some other context? Wouldn't that be some sort of constructor call?
It's true I don't particularly like the @ syntax. I was just speculating on continuity with Steven's syntax.
Here's my general proposal, which I actually want, but indeed don't have an implementation for. I think a soft keyword is best, such as 'defer' or 'later', but let's call it 'delay' for now to avoid the prior hang up on the word.
(A)
def foo(a: list, size: int = delay len(a)) -> None:
    print("The list has length", size)
Whenever a name is referenced in this future Python, the interpreter first asks if it is a special delayed object. If not, do exactly what is done now. However, if it *IS* that special kind of object, instead do something akin to 'eval()'.
Okay. Picture this situation:

def ctx():
    dflt, scratch = 1, 2
    def set_default(x):
        nonlocal dflt; dflt = x
    def frob1(a=>dflt): print(a)
    def frob2(a = delay dflt): print(a)
    return set_default, frob1, frob2

With frob1, the compiler knows exactly which names mean which variables, just as in all current code. In this case, it knows that dflt is a nonlocal, and CPython will use LOAD_DEREF to look it up. With frob2, a is some opaque object that happens to be a deferred expression. That expression could be evaluated in any context. The compiler has to snapshot every variable in every containing scope (in this case, 'scratch'), in case the evaluation of a might happen to refer to them. In fact, it's worse. *EVERY* closure has to retain *EVERY* containing variable, just in case it's used in this way. Either that, or these delayed expressions are just eval'd strings, and need to be explicitly passed their globals and locals, which fails for closures anyway. In contrast, frob1 puts the code right there in the function, so everything behaves correctly.
No, the delayed object probably shouldn't just contain a string, but perhaps a chunk of compiled bytecode.
I don't know about other Python implementations, but in CPython, bytecode needs to know what kind of name a thing represents - LOAD_FAST, LOAD_DEREF, LOAD_GLOBAL - and for closures (dereferences), it needs to ensure that the surrounding context uses DEREF as well, when it otherwise would use FAST. Plus, every name reference in CPython bytecode is a lookup to a table of names (so it'll say, for instance, LOAD_GLOBAL 3 and the third entry in the name list might be "print", so it'll look up the global named print). CPython bytecode isn't well suited to this. So it would have to be some other sort of bytecode - something that retains very little context, and only has a basic parse. In fact, I think the AST is probably the closest (at least in CPython - again, I don't know other Pythons) to what you want here. It's basically the same thing as source code, but tokenized and turned into a logical tree. Unfortunately, AST can't be executed as such, and needs to be compiled into something ready to use.
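The name-kind distinction is easy to inspect with the dis module; a small runnable demo (exact output varies by CPython version):

import dis

def outer():
    captured = 1
    def inner(x):
        # x -> LOAD_FAST, captured -> LOAD_DEREF, len -> LOAD_GLOBAL
        return x + captured + len("spam")
    return inner

dis.dis(outer())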
So what if we don't want to evaluate the delayed object? Either the same or a different keyword can do that:
def foo(a: list, size: int = delay len(a)) -> None:
    a.append(42)
    bar(a, delay size)
def bar(a, the_length):
    print("The expanded list has length", the_length)
If it's the same keyword, you have a fundamental ambiguity: does that mean "use the name size in the target context", or "use the deferred object with its existing names"?
What if we want to be more general with delaying (and potentially skipping) actions?
expensive1 = delay big_computation(data)
expensive2 = delay slow_lookup(data)
def get_answer(data):
    # use globals for the example, less so in practice
    if approximate_compute_cost(data) > 1_000_000:
        return expensive2
    else:
        return expensive1
That's it. It covers everything in your PEP, and a great deal more that is far more important, all using the same syntax.
It's a very different proposal. I don't think it's as related as it seems. For one thing, part of the point of these delayed expressions is that they are, well, delayed. Consider this difference:

def spam1(a, n=>len(a)):
    a.append(3)
    print(n)

def spam2(a, n=delay len(a)):
    a.append(4)
    print(n)

A delayed expression should be evaluated at the point where it is used. A default argument expression should be evaluated as part of the function header. And it gets worse with multiple evaluations:

def add_twice(item, lst=>[]):
    lst.append(item)
    lst.append(item)
    return lst

def add_twice(item, lst=defer []):
    # as above

My expectation from the first is that, if you don't specify a second argument, a new empty list is constructed, appended to twice, and then returned. With the defer expression, does it collapse to a value? And if so, when? Please, please, start a new discussion about delayed evaluation. I would love to participate. But I don't think it's a generalization of default argument expressions. ChrisA
On Sat, Oct 30, 2021, 11:03 PM Chris Angelico
That's if you were to create a deferred *type*. An object which can be evaluated later. My proposal is NOT doing that, because there is no object that represents the unevaluated expression. You can't pull that expression out and evaluate it somewhere else. It wouldn't be meaningful.
Put this way, I admit it is intelligible. Maybe it even reduces the order of magnitude of my opposition to it. What you are trying to create remains a "unit of computation" even though it is not itself a Python object. As Brendan points out, this is a fundamental change to Python's object model. It amounts to saying "I want to perform some computation within the body of a function, but I don't want to WRITE it in the body of a function." That feels like an anti-goal to me. But it's true that the code fragment 'if a is None: a = calculated()' is also not itself a first class object. So a synonym for that I can understand, albeit dislike. I suppose I don't *hate* something like this, which can be done with no syntax change:

@rewrite
def foo(a, size=Late("len(a)")):
    print("The list is length", size)

In the functions I actually use and write, the problem all of this is trying to solve is either trivial or irrelevant. In libraries like Pandas, for example, there are often tens of arguments with defaults of None... But it is rarely a single default to resolve. Rather, there is complex logic about the interaction of which arguments are provided or absent, leading to numerous code paths to configure the actual behavior of a call. For these, the logic of missing arguments must live in the body, where it naturally belongs.
On Sun, Oct 31, 2021 at 2:52 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 30, 2021, 11:03 PM Chris Angelico
That's if you were to create a deferred *type*. An object which can be evaluated later. My proposal is NOT doing that, because there is no object that represents the unevaluated expression. You can't pull that expression out and evaluate it somewhere else. It wouldn't be meaningful.
Put this way, I admit it is intelligible. Maybe it even reduces the order of magnitude of my opposition to it.
What you are trying to create remains a "unit of computation" even though it is not itself a Python object. As Brendan points out, this is a fundamental change to Python's object model.
It amounts to saying "I want to perform some computation within the body of a function, but I don't want to WRITE it in the body of a function." That feels like an anti-goal to me.
Not quite; I want to perform it within the *context* of the function, but not in the body. A function's context (its scope, available variables, etc, etc, etc) is where most of its body is executed, but for example, a list comprehension is part of the function's body while not being in the same context (it's in a subcontext defined by a nested function).
I suppose I don't *hate* something like this, which can be done with no syntax change:
@rewrite
def foo(a, size=Late("len(a)")):
    print("The list is length", size)
In the functions I actually use and write, the problem all of this is trying to solve is either trivial or irrelevant.
Thing is, that can't actually be done, because there's no way to make that able to see the function's parameters without some VERY weird shenanigans. You'd have to completely recreate the argument-to-parameter allocation, apply all preceding defaults, and then have a set of locals which can be passed to eval(); and then having done all that, deconstruct your locals into *a,**kw to pass back to the original function. Meanwhile, you have to capture any nonlocals, in case len isn't actually a global. In contrast, having this as a compiler construct is simple: In the context of the function, if the parameter hasn't been given a value, evaluate the expression and assign. (Side point: The current reference implementation allows assignment expressions inside default argument expressions, mainly because I didn't go to the effort of blocking them. But if you ever ACTUALLY do this, then..... *wat*)
In libraries like Pandas, for example, there are often tens of arguments with defaults of None... But it is rarely a single default to resolve. Rather, there is complex logic about the interaction of which arguments are provided or absent, leading to numerous code paths to configure the actual behavior of a call.
For these, the logic of missing arguments must live in the body, where it naturally belongs.
There will always be cases too complicated for simple default expressions, just as there are already cases too complicated for simple default values. They will continue to be handled by the body of the function, as they currently are. This proposal isn't a replacement for all argument processing. ChrisA
On Sun, Oct 31, 2021 at 03:10:56PM +1100, Chris Angelico wrote:
(Side point: The current reference implementation allows assignment expressions inside default argument expressions, mainly because I didn't go to the effort of blocking them. But if you ever ACTUALLY do this, then..... *wat*)
Why? The result is well-defined. It might not be *good* code, but let's not be overzealous with banning legal code just because it is bad :-) That's the job of linters and code reviews. In an early bound default, the walrus operator binds to a name in the scope the expression is evaluated in, i.e. the surrounding scope. So a top level function with a walrus:

string = "hello"

def func(spam=(a:=len(string)), eggs=2**a):
    return (spam, eggs)

binds to a global variable `a`. Similarly, a walrus inside a late bound default should bind to a local variable. The walrus doesn't get executed until the function namespace is up and running, and the walrus is a binding operation, so it should count as a local variable. -- Steve
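That early-bound case is already legal today, so the claimed behaviour can be checked directly (runnable in Python 3.8+):

string = "hello"

def func(spam=(a := len(string)), eggs=2**a):
    return (spam, eggs)

print(a)        # 5 -- the walrus bound a module-level name at definition time
print(func())   # (5, 32)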
On Sun, Oct 31, 2021 at 6:50 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 03:10:56PM +1100, Chris Angelico wrote:
(Side point: The current reference implementation allows assignment expressions inside default argument expressions, mainly because I didn't go to the effort of blocking them. But if you ever ACTUALLY do this, then..... *wat*)
Why? The result is well-defined. It might not be *good* code, but let's not be overzealous with banning legal code just because it is bad :-)
It's just because of the lack of clarity with where the variable is being defined. If we declare that argument expressions are evaluated in some sort of subcontext (like list comps are), then assignment expressions would no longer change the function's namespace. (Not to mention the confusion with early-bound expressions, where it would add a name to the surrounding scope - not a problem for the compiler, but confusing for humans.) By permitting it now, we make it harder to make such a change in the future. That said, though, I think that that change would be unlikely to be of value (with comprehensions, there's the inherent use of the iteration variable, which doesn't apply here), so it probably won't matter.
That's the job of linters and code reviews.
In an early bound default, the walrus operator binds to a name in the scope the expression is evaluated in, i.e. the surrounding scope. So a top level function with a walrus:
string = "hello" def func(spam=(a:=len(string)), eggs=2**a): return (spam, eggs)
binds to a global variable `a`.
Similarly, a walrus inside a late bound default should bind to a local variable. The walrus doesn't get executed until the function namespace is up and running, and the walrus is a binding operation, so it should count as a local variable.
Yeah, that's how it's currently implemented. If we're willing to define that as a permanent feature, then the definition is clear, and it is indeed the job of linters and code review to reject this kind of horrible code :) ChrisA
On Sat, Oct 30, 2021 at 09:31:38PM -0400, David Mertz, Ph.D. wrote:
Both the choice of syntax and the discussion of proposed implementation (both yours and Steven's) would make it more difficult later to advocate and implement a more general "deferred" mechanism in the future.
The choice of syntax would be independent of a more general deferred mechanism. Even if we had such a mechanism (and we do... see below) we would still want a short, readable way to mark parameters to the interpreter to use them. We already have various such forms (generator comprehensions, Futures?, Promises?), including functions themselves. We can defer evaluating an expression by putting it in a function:

x + 1                  # eagerly evaluated right now
obj = lambda: x + 1    # defer evaluation
...
obj()                  # and evaluate

or a string (evaluate it later with eval() or exec()). So, right now, we could implement delayed evaluation of defaults:

# I want a new list each time.
def func(arg=lambda: []):
    arg = arg()

This is inconvenient and annoying. The signature is obfuscated. The calling convention becomes:

result = func(lambda: [1, 2, 3])

instead of the more natural `func([1, 2, 3])`. What this proposal brings is a way of keeping the most natural signature, the most natural calling convention, and still automatically delaying the evaluation. All we need is one tiny bit of new syntax:

def func(@arg=expression)  # this is the best :-)

plus some backend stuff in the interpreter, a few dunders in function objects. Maybe a new inspect function. A more general mechanism for deferring the execution of code is interesting but also a much bigger problem to solve. We already have at least one way to do it, provided the user is willing to explicitly call the function object, or eval() the string. Unless the deferred object gives us some wildly powerful new functionality, such as automagic evaluation on need (and that is a hard problem), it isn't clear why we would bother.
I'm not sure what I think of a general statement like:
@do_later = fun1(data) + fun2(data)
do_later = lambda: fun1(data) + fun2(data)
# much later
result = do_later()

-- Steve
On Tue, 26 Oct 2021, Steven D'Aprano wrote:
def func(x=x, y=>x) # or func(x=x, @y=x)
This makes me think of a "real" use-case for assigning all early-bound defaults before late-bound defaults: consider using closure hacks (my main use of early-bound defaults) together with late-bound defaults, as in

```
for i in range(n):
    def func(arg := expensive(i), i = i):
        ...
```

I think it's pretty common to put closure hacks at the end, so they don't get in the way of the caller. (The intent is that the caller never specifies those arguments.) But then it'd be nice to be able to use those variables in the late-bound defaults. I can't say this is beautiful code, but it is an application and would probably be convenient. On Tue, 26 Oct 2021, Eric V. Smith wrote:
Among my objections to this proposal is introspection: how would that work? The PEP mentions that the text of the expression would be available for introspection, but that doesn't seem very useful.
I think what would make sense is for code objects to be visible, in the same way as `func.__code__`. But it's definitely worth fleshing out whether:

1. Late-bound defaults are in `func.__defaults__` and `func.__kwdefaults__` -- where code objects are treated as a special kind of default value. This seems problematic because we can't distinguish between a late-bound default and an early-bound default that is a code object.

or

2. There are new defaults like `func.__late_defaults__` and `func.__late_kwdefaults__`. The issue here is that it's not clear in what order to mix `func.__defaults__` and `func.__late_defaults__` (each a tuple).

Perhaps most natural is to add a new introspection object, say LateDefault, that can take place as a default value (but can't be used as an early-bound default?), and has a __code__ attribute.

---

By the way, another thing missing from the PEP: presumably lambda expressions can also have late-bound defaults? On Tue, 26 Oct 2021, Marc-Andre Lemburg wrote:
Now, it may not be obvious, but the key advantage of such deferred objects is that you can pass them around, i.e. the "defer os.listdir(DEFAULT_DIR)" could also be passed in via another function.
Are deferred code pieces dynamically scoped, i.e., are they evaluated in whatever scope they end up getting evaluated in? That would certainly be interesting, but also kind of dangerous (about as dangerous as eval), and I imagine fairly prone to error if they get passed around a lot. If they're *not* dynamically scoped, then I think they're equivalent to lambda, and then they don't solve the default parameter problem, because they'll be evaluated in the function's enclosing scope instead of the function's scope. Erik -- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
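(A quick runnable check of that last point, using a plain lambda as the stand-in deferred object; the names are invented for the demo:)

x = "outer"

def f(get_default=lambda: x):
    x = "inner"            # f's own local; the lambda cannot see it
    return get_default()

print(f())  # 'outer' -- the lambda resolves x in the enclosing (module)
            # scope, not in f's scope, so it can't express hi=>len(a)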
On Wed, Oct 27, 2021 at 3:39 AM Erik Demaine <edemaine@mit.edu> wrote:
On Tue, 26 Oct 2021, Steven D'Aprano wrote:
def func(x=x, y=>x) # or func(x=x, @y=x)
This makes me think of a "real" use-case for assigning all early-bound defaults before late-bound defaults: consider using closure hacks (my main use of early-bound defaults) together with late-bound defaults, as in
```
for i in range(n):
    def func(arg := expensive(i), i = i):
        ...
```
I think it's pretty common to put closure hacks at the end, so they don't get in the way of the caller. (The intent is that the caller never specifies those arguments.) But then it'd be nice to be able to use those variables in the late-bound defaults.
I can't say this is beautiful code, but it is an application and would probably be convenient.
Got any realistic examples? Seems very hackish to me.
Perhaps most natural is to add a new introspection object, say LateDefault, that can take the place of a default value (but can't be used as an early-bound default?), and has a __code__ attribute.
Yeah, I'll have to play with this. It may be necessary to wrap *both* types of default such that you can distinguish them. Effectively, instead of a tuple of values, you'd have a tuple of defaults, each one stating whether it's a value or a code block. ChrisA
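(To make that concrete, a minimal sketch of what such wrapped defaults might look like -- EarlyDefault and LateDefault are illustrative names only, not anything the PEP specifies:)

class EarlyDefault:
    def __init__(self, value):
        self.value = value          # a plain value, computed at def time

class LateDefault:
    def __init__(self, code, source):
        self.__code__ = code        # compiled expression, run at call time
        self.source = source        # original text, e.g. "len(a)", for help()

# A signature like  def bisect(a, lo=0, hi=>len(a))  might then expose:
#     bisect.__defaults__ == (EarlyDefault(0), LateDefault(<code>, "len(a)"))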
On 26.10.2021 18:36, Erik Demaine wrote:
On Tue, 26 Oct 2021, Marc-Andre Lemburg wrote:
Now, it may not be obvious, but the key advantage of such deferred objects is that you can pass them around, i.e. the "defer os.listdir(DEFAULT_DIR)" could also be passed in via another function.
Are deferred code pieces dynamically scoped, i.e., are they evaluated in whatever scope they end up getting evaluated in? That would certainly be interesting, but also kind of dangerous (about as dangerous as eval), and I imagine fairly prone to error if they get passed around a lot.
Yes, they would work more or less like copy & pasting the deferred code into a new context and running it there. Sure, you can abuse this, but the function running the deferred can make sure that it's working in a trusted environment.
If they're *not* dynamically scoped, then I think they're equivalent to lambda, and then they don't solve the default parameter problem, because they'll be evaluated in the function's enclosing scope instead of the function's scope.
Indeed. Lambdas are similar, but not the same. The important part is running the code in a different context. -- Marc-Andre Lemburg
Hi Chris, I feel like we're pretty close to agreement. :-) The only difference is that I still lean toward allowing one of the two left-to-right options, and not trying to raise SyntaxErrors. I feel like detecting this kind of bad code belongs more with a linter than the programming language itself. But you're definitely right that it's easier to give permissions later than take them away, and there are two natural left-to-right orders...

Speaking of implementation, as Guido just raised: maybe going with what makes the most sense in the implementation would be fitting here? I'm guessing it's left-to-right overall (among all arguments), which is also the simpler-to-explain rule. I would actually find it pretty weird for references to arguments to the right to make sense even if they could...

Actually, if we use the left-to-right overall order, this is the more conservative choice. If code worked with that order, and we later decided that the two-pass default assignment is better, it would be backward-compatible (except that some previously failing code would no longer fail).

On Tue, 26 Oct 2021, Chris Angelico wrote:
Personally, I'd expect to use late-bound defaults almost all or all the time; [...]
Interesting. In many cases, the choice will be irrelevant, and early-bound is more efficient. There aren't many situations where early-bind semantics are going to be essential, but there will be huge numbers where late-bind semantics will be unnecessary.
Indeed; you could even view those cases as optimizations, and convert late-bound immutable constants into early-bound defaults. (This optimization would only be completely equivalent if we stick to a global left-to-right ordering, though.)
A key difference from the PEP is that JavaScript doesn't have the notion of "omitted arguments"; any omitted arguments are just passed in as `undefined`; so `f()` and `f(undefined)` always behave the same (triggering default argument behavior).
Except when it doesn't, and you have to use null instead... I have never understood those weird inconsistencies!
Heh, yes, it can get confusing. But in my experience, all of JavaScript's built-in features treat `undefined` as special; it's the initial value of variables, it's the value for omitted arguments, etc. `null` is just another sentinel value, often preferred by programmers perhaps because it's shorter and/or better known. Also, confusingly, `undefined == null`. Eh, and `null ?? 5` acts the same as `undefined ?? 5` -- never mind. :-)
There is a subtlety mentioned in the case of JavaScript, which is that the default value expressions are evaluated in their own scope:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...
Yeah, well, JS scope is a weird mess of historical artifacts. Fortunately, we don't have to be compatible with it :)
That is true, but default values aren't part of the original history; they were added in ECMAScript 2015 (ES6). So the designers probably had some issue in mind here: the separate scope seems like added complexity, so it was probably an intentional addition.
This is perhaps worth considering for the Python context. I'm not sure this is as important in Python, because UnboundLocalError exists (so attempts to access things in the function's scope will fail), but perhaps I'm missing a ramification...
Hmm. I think the only way it could possibly matter would be something like this:
def f(x=>spam):
    global spam
    spam += 1
Unsure what this should do. A naive interpretation would be this:
def f(x=None):
    if x is None:
        x = spam
    global spam
    spam += 1
and would bomb with SyntaxError. But perhaps it's better to permit this, on the understanding that a global statement anywhere in a function will apply to late-bound defaults; or alternatively, to evaluate the arguments in a separate scope. Or, which would be a simpler way of achieving the same thing: all name lookups inside function defaults come from the enclosing scope unless they are other arguments. But maybe that's unnecessarily complicated.
Inspired by your example, here's one that doesn't even involve `global`:

```
spam = 5
def f(x := spam):
    spam = 10
f()
```

Does this fail (UnboundLocalError or SyntaxError or whatever) or succeed with x set to 5? If we think of the default arguments getting evaluated in their own scope, is its parent scope the function's scope or its enclosing scope? The former is closer to the `if x is None` behavior we're replacing, while the latter is a bit closer to the current semantics of default arguments. I think this is very confusing code, so it's not particularly important to make either choice, but we need to make a decision. The less permissive thing seems to be using the function's scope (and failing), so perhaps that's a better choice. On the other hand, given that `global spam` and `nonlocal spam` would just be preventing `spam` from being defined in the function's scope, it seems more reasonable for your example to work, just like the following should:

```
spam = 5
def f(x := spam):
    print(x, spam)  # 5 5
f()
```

Here's another example where it matters whether the default expressions are computed within their own scope:

```
def f(x := (y := 5)):
    print(x)  # 5
    print(y)  # 5???
f()
```

I feel like we don't want to allow accessing `y` in the body of `f` here, because whether `y` is bound depends on whether `x` was passed. (If `x` is passed, `y` won't get assigned.) This would suggest evaluating default expressions in their own scope would be beneficial. Intuitively, the parens are indicating a separate scope, in the same way that `(x for x in it)` creates its own scope and thus doesn't leak `x`. On the other hand, `((y := x) for x in it)` does seem to leak `y`, so I'm not really sure what would be best / most consistent here. Erik -- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
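(That comprehension behaviour is checkable in today's Python:)

it = [1, 2, 3]

list(x for x in it)
# x is NOT bound here: the genexp's loop variable lives in its own scope.

list((y := x) for x in it)
print(y)   # 3 -- PEP 572 deliberately binds the walrus target in the
           # enclosing scope, so y does leak.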
On Tue, Oct 26, 2021 at 5:50 AM Erik Demaine <edemaine@mit.edu> wrote:
But you're definitely right that it's easier to give permissions later than take them away, and there are two natural left-to-right orders...
Speaking of implementation as Guido just raised, maybe going with what makes the most sense in the implementation would be fitting here? I'm guessing it's left-to-right overall (among all arguments), which is also the simpler-to-explain rule. I would actually find it pretty weird for references to arguments to the right to make sense even if they could...
Actually, if we use the left-to-right overall order, this is the more conservative choice. If code worked with that order, and we later decided that the two-pass default assignment is better, it would be backward-compatible (except that some previously failing code would no longer fail).
Maybe I'm overthinking the parallels with existing idioms. There is no current idiom which can be described as having a perfect correlation, so maybe it's best to just describe all of them as very rough approximations, and think solely about the behaviour of a function in a post-PEP-671 world. Let's try this.

* Function parameters *

A function receives some number of parameters as defined by its signature. When a function is called, parameters get assigned their values in the order that they are listed; either they are assigned an argument as given by the caller, or the default given in the signature. For example:

def parrot(voltage, state='a stiff', action=>rand_verb(), type='Norwegian Blue'):
    print("This parrot wouldn't", action)
    ...

parrot('a million', 'bereft of life')

This behaves as if you wrote:

def parrot():
    voltage = 'a million'
    state = 'bereft of life'
    action = rand_verb()
    type = 'Norwegian Blue'
    print("This parrot wouldn't", action)
    ...

***

We can continue to bikeshed the precise syntax, but I think the semantics of pure left-to-right make very good sense. Will have to get an implementation together before being sure, though.
On Tue, 26 Oct 2021, Chris Angelico wrote:
Personally, I'd expect to use late-bound defaults almost all or all the time; [...]
Interesting. In many cases, the choice will be irrelevant, and early-bound is more efficient. There aren't many situations where early-bind semantics are going to be essential, but there will be huge numbers where late-bind semantics will be unnecessary.
Indeed; you could even view those cases as optimizations, and convert late-bound immutable constants into early-bound defaults. (This optimization would only be completely equivalent if we stick to a global left-to-right ordering, though.)
Yeah. And that's another good reason to keep the two default syntaxes as similar as possible - no big keyword adornment like "deferred" - so that code isn't unnecessarily ugly for using more early-bound defaults. ChrisA
My 2¢ (perhaps it should be 3¢ as I've already contributed 2¢). Chris A did ask "do Python core devs agree with less-skilled Python programmers on the intuitions?", so putting myself firmly in the second camp (though I have been using Python for over a decade), here are my thoughts in case they have some slight value.

Again, +1 on the PEP. The absence of late-binding argument defaults is a gap in Python. Whether it is a serious enough gap to warrant plugging it is of course a matter of opinion. IMO most people find late binding more natural (and probably more useful) than early binding. Witness the number of Stack Overflow questions about it. Yes, there would be more questions asking what the difference is, if late binding were provided, but hey, people have to learn to use the tools in their box.

Syntax bikeshedding: I still favour

var := expr

IMO the similarity to early binding syntax is a good thing (or at least not a bad thing). Just as the walrus operator is similar to `=` - after all they are both a form of assignment. As is `=` in a function signature. I see no need to add a new symbol. I don't like a keyword (hard or soft). It's verbose. It's unnecessary. And if it's `defer`, I find it too reminiscent of Twisted's deferreds (which I always have trouble getting my head round, although I've used them many times), suggesting that the expression is actually a thunk, or some async feature, or something else weird and wonderful.

I don't think argument defaults should be allowed to refer to later arguments (or of course the current argument). That's making the interpreter's task too complicated, not to mention (surely?) inefficient. And it's confusing. And it's a restriction which could be removed later if desirable. (Actually I think I have written code where either, but not both, of 2 arguments were mandatory, but I can't recall the details. I can live with having to code this explicitly, using arg=None or some such.) I don't think it's a huge deal whether attempting it causes a SyntaxError or a runtime (UnboundLocalError?) error, though if it can be a SyntaxError that's obviously quicker to debug. Although as Steven said:

"Why would this be a "hard-to-track-down" bug? You get an UnboundLocalError telling you exactly what the problem is.

UnboundLocalError: local variable 'b' referenced before assignment"

(and presumably the line number.)

I don't think making it a SyntaxError is 100% "breaking new ground" [contra Guido], as e.g.

def f():
    x = x+1
    global y

is not a SyntaxError, but if you change `y` to `x` it is.

I respectfully disagree with Marc-Andre Lemburg:

""Explicit is better than implicit" and this is too much "implicit" for my taste.

For simple use cases, this may save a few lines of code, but as soon as you end up having to think whether the expression will evaluate to the right value at function call time, the scope it gets executed in, what to do with exceptions, etc., you're introducing too much confusion with this syntax.

Example:

def process_files(processor, files=>os.listdir(DEFAULT_DIR)):"

(a) Why is a late-bound default any more implicit than an early-bound default? Why is a late-bound default more confusing than an early-bound default? Why should there be more confusion over an early-bound default evaluated in the outer/global scope than a late-bound default evaluated in the function scope?
(b) It's unfair to denigrate the proposal of late-bound defaults by showing how it can be abused in an example where the default value can vary wildly (and might not even be under the programmer's control). Any feature can be abused. You always have the status quo option of explicitly coding what you mean rather than using (any kind of) defaults.

I agree with Chris A here:

"Having a new category of function parameters would make these calls even more complicated. It also overemphasizes, in my opinion, the difference between ways that optional arguments are provided with their values."

though truth to tell, as a Bear of Little Brain, the existing categories with the `/` and `*` separators are quite enough to confuse me already.

Of the two options given at some point in the thread by Chris A:

"1) Arguments are defined left-to-right, each one independently of each other
2) Early-bound arguments and those given values are defined first, then late-bound arguments

The first option is much easier to explain, but will never give useful results for out-of-order references (unless it's allowed to refer to the containing scope or something). The second is closer to the "if x is None: x = y + 1" equivalent, but is harder to explain."

I prefer 1). Easier to understand and debug in examples with side-effects such as

def f(a := enter_codes(), b = assign_targets(), c := unlock_missiles(), d = FIRE()):

(not that this is something to be particularly encouraged).

Re Guido's suggestions:

"Maybe you can't combine early and late binding defaults in the same signature. Or maybe all early binding defaults must precede all late binding defaults."

I don't like the first. While it is always safer to forbid something first and maybe allow it later, IMO mixing early and late binding is something that will inevitably be wanted sooner or later. And I hazard a guess that it wouldn't (much) simplify the implementation to forbid it. As to the second: as far as I can see it would have the same effect as straight L-to-R evaluation, except that it would allow a late-binding default to refer to a subsequent early-binding default, e.g.

def f(a := b+1, b = b_default):

I don't feel strongly about this. But L-to-R is nice and simple, and I would reverse the parameter order here to make it work (and be more comprehensible). Best wishes Rob Cliffe
On Tue, Oct 26, 2021 at 11:44 AM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I prefer 1). Easier to understand and debug in examples with side-effects such as

def f(a := enter_codes(), b = assign_targets(), c := unlock_missiles(), d = FIRE()):

(not that this is something to be particularly encouraged).
It's worth noting that this would call the functions at different times; assign_targets and FIRE would be called when the function is defined, despite not entering the codes and unlocking the missiles until you actually call f(). The difference between early evaluation and late evaluation is that one retains the *value* and the other retains the *expression*. So it's something like:

_b_default = assign_targets(); _d_default = FIRE()
def f(a, b, c, d):
    if a is not set: a = enter_codes()
    if b is not set: b = _b_default
    if c is not set: c = unlock_missiles()
    if d is not set: d = _d_default

ChrisA
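(The timing difference is observable in today's Python with the lambda workaround; stamp() is an invented helper for the demo:)

def stamp(label):
    print("evaluated:", label)
    return label

def f(early=stamp("early"), late=lambda: stamp("late")):
    return early, late()

print("about to call f()")   # "evaluated: early" was already printed at def time
f()                          # only now does "evaluated: late" print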
On 2021-10-26 at 12:12:47 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Oct 26, 2021 at 11:44 AM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I prefer 1). Easier to understand and debug in examples with side-effects such as

def f(a := enter_codes(), b = assign_targets(), c := unlock_missiles(), d = FIRE()):

(not that this is something to be particularly encouraged).
It's worth noting that this would call the functions at different times; assign_targets and FIRE would be called when the function is defined, despite not entering the codes and unlocking the missiles until you actually call f().
So much for evaluating default values from left to right; this could be trouble even if the functions themselves don't have side effects, but merely access data that has been mutated between function definition time and function call time. Requiring that all late bound defaults come after all early bound defaults (which has already come up as a possibility) seems like a reasonable solution.
The difference between early evaluation and late evaluation is that one retains the *value* and the other retains the *expression*. So it's something like:
_b_default = assign_targets(); _d_default = FIRE()
def f(a, b, c, d):
    if a is not set: a = enter_codes()
    if b is not set: b = _b_default
    if c is not set: c = unlock_missiles()
    if d is not set: d = _d_default
Is the phrase/concept "retains the expression" new? Unless it's really a new concept, is there an existing way to say that?
On Tue, Oct 26, 2021 at 12:40 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-26 at 12:12:47 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Oct 26, 2021 at 11:44 AM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I prefer 1). Easier to understand and debug in examples with side-effects such as

def f(a := enter_codes(), b = assign_targets(), c := unlock_missiles(), d = FIRE()):

(not that this is something to be particularly encouraged).
It's worth noting that this would call the functions at different times; assign_targets and FIRE would be called when the function is defined, despite not entering the codes and unlocking the missiles until you actually call f().
So much for evaluating default values from left to right; this could be trouble even if the functions themselves don't have side effects, but merely access data that has been mutated between function definition time and function call time. Requiring that all late bound defaults come after all early bound defaults (which has already come up as a possibility) seems like a reasonable solution.
The difference between early evaluation and late evaluation is that one retains the *value* and the other retains the *expression*. So it's something like:
_b_default = assign_targets(); _d_default = FIRE()
def f(a, b, c, d):
    if a is not set: a = enter_codes()
    if b is not set: b = _b_default
    if c is not set: c = unlock_missiles()
    if d is not set: d = _d_default
Is the phrase/concept "retains the expression" new? Unless it's really a new concept, is there an existing way to say that?
It's sloppy terminology, so the expression is probably new. You'll find similar phenomena in lambda functions, comprehensions, and the like, where you provide an expression that gets evaluated later; but nothing's really "retained", it's really just that the code is run at a particular time. The compiler looks over all of the source code and turns it into something runnable (in the case of CPython, that's bytecode). The code for early-evaluated defaults is part of the execution of the "def" statement; late-evaluated defaults are part of the call to the function itself. Here's an example:

def f():
    print("Start")
    def g(x=q()):
        print("Inside")
    print("Done")

And here's the disassembly, with my annotations:
dis.dis(f)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Start')
              4 CALL_FUNCTION            1
              6 POP_TOP

-- here we start defining the function --

  3           8 LOAD_GLOBAL              1 (q)
             10 CALL_FUNCTION            0
             12 BUILD_TUPLE              1

-- default args are stored in a tuple --

             14 LOAD_CONST               2 (<code object g at 0x7f7bdb4098b0, file "<stdin>", line 3>)
             16 MAKE_FUNCTION            1 (defaults)
             18 STORE_FAST               0 (g)

-- the code is already compiled, so it just attaches the defaults to the existing code object --

  5          20 LOAD_GLOBAL              0 (print)
             22 LOAD_CONST               3 ('Done')
             24 CALL_FUNCTION            1
             26 POP_TOP
             28 LOAD_CONST               0 (None)
             30 RETURN_VALUE

-- this is the body of g() --

Disassembly of <code object g at 0x7f7bdb4098b0, file "<stdin>", line 3>:
  4           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Inside')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

-- by the time we get into this block of code, args have already been set --

Late-evaluated defaults would slip in just before the print("Inside") line. Technically there's no "expression" that gets "retained", since it's just a matter of where the bytecode gets placed; but in terms of explaining it usefully, the sloppy description is far easier to grok than a detailed look at bytecode - plus, the bytecode is implementation-specific, and not mandated by the language. ChrisA
On 2021-10-26 at 12:51:43 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Oct 26, 2021 at 12:40 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-26 at 12:12:47 +1100, Chris Angelico <rosuav@gmail.com> wrote:
The difference between early evaluation and late evaluation is that one retains the *value* and the other retains the *expression*. So it's something like:
_b_default = assign_targets(); _d_default = FIRE()
def f(a, b, c, d):
    if a is not set: a = enter_codes()
    if b is not set: b = _b_default
    if c is not set: c = unlock_missiles()
    if d is not set: d = _d_default
Is the phrase/concept "retains the expression" new? Unless it's really a new concept, is there an existing way to say that?
It's sloppy terminology, so the expression is probably new. You'll find similar phenomena ...
I get the feature (and I've stated my opinion thereof), and I understand what you meant. I guess I learned a long time ago not to make up new words for existing things, or to reuse old words for new things. You're staying pretty focused (good job, and thank you!), but there's enough random ideas floating around this thread that I thought I'd say something.
On Tue, Oct 26, 2021 at 1:20 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-26 at 12:51:43 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Oct 26, 2021 at 12:40 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-26 at 12:12:47 +1100, Chris Angelico <rosuav@gmail.com> wrote:
The difference between early evaluation and late evaluation is that one retains the *value* and the other retains the *expression*. So it's something like:
_b_default = assign_targets(); _d_default = FIRE()
def f(a, b, c, d):
    if a is not set: a = enter_codes()
    if b is not set: b = _b_default
    if c is not set: c = unlock_missiles()
    if d is not set: d = _d_default
Is the phrase/concept "retains the expression" new? Unless it's really a new concept, is there an existing way to say that?
It's sloppy terminology, so the expression is probably new. You'll find similar phenomena ...
I get the feature (and I've stated my opinion thereof), and I understand what you meant. I guess I learned a long time ago not to make up new words for existing things, or to reuse old words for new things. You're staying pretty focused (good job, and thank you!), but there's enough random ideas floating around this thread that I thought I'd say something.
Yep, it's absolutely worth speaking up if someone says something unclear :) Forcing me to explain myself in detail is a good thing, no need to feel bad for doing it. ChrisA
On 26/10/2021 02:12, Chris Angelico wrote:
On Tue, Oct 26, 2021 at 11:44 AM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I prefer 1). Easier to understand and debug in examples with side-effects such as

def f(a := enter_codes(), b = assign_targets(), c := unlock_missiles(), d = FIRE()):

(not that this is something to be particularly encouraged).
It's worth noting that this would call the functions at different times; assign_targets and FIRE would be called when the function is defined, despite not entering the codes and unlocking the missiles until you actually call f().
The difference between early evaluation and late evaluation is that one retains the *value* and the other retains the *expression*. So it's something like:
_b_default = assign_targets(); _d_default = FIRE()
def f(a, b, c, d):
    if a is not set: a = enter_codes()
    if b is not set: b = _b_default
    if c is not set: c = unlock_missiles()
    if d is not set: d = _d_default
ChrisA You're right, I wasn't thinking clearly. It would have been a better example if I had used late binding for ALL arguments. (Still, I averted a nuclear war, albeit accidentally. 😁) Rob
On Tue, Oct 26, 2021 at 01:32:58AM +0100, Rob Cliffe via Python-ideas wrote:
Syntax bikeshedding: I still favour var := expr
That clashes with the walrus operator. Remember that the walrus operator can appear inside the expression:

var:=spam+eggs:=(something+other) or eggs

Modifying the assignment symbol is wrong. This is not a new kind of assignment; it should use the same `=` as regular assignment. We are tagging the parameter to use late binding, not using a different sort of assignment. The tag should be on the parameter name, not the assignment.
IMO the similarity to early binding syntax is a good thing (or at least not a bad thing).
Right, because binding is binding, and we should use the same `=`.
Just as the walrus operator is similar to `=` - after all they are both a form of assignment.
But the walrus is a different form of assignment, it is an expression, not a statement. Function parameter defaults are not literally statements, they are declarations, which are a kind of statement.
I don't think argument defaults should be allowed to refer to later arguments (or of course the current argument). That's making the interpreter's task too complicated, not to mention (surely?) inefficient. And it's confusing.
Worst case, the interpreter has to do two passes over the parameters instead of one. The inefficiency is negligible. As for confusing, I think you are conflating "it's new" with "it is confusing". You aren't confused. I doubt that anyone capable of writing a Python function would be confused by the concept:

def func(a=1, @b=c+1, c=2):

is no more confusing than the status quo:

def func(a=1, b=None, c=2):
    if b is None:
        b = c + 1

If you can understand the second, you can understand the first. All you have to remember is that:

1. positional arguments are bound to parameters first, left to right;
2. keyword arguments are bound to parameters next;
3. regular (early bound) defaults are bound next;
4. and lastly, late-bound defaults are bound.

Easey-peasey. I really wish people would stop assuming that fellow Python coders are knuckle-dragging troglodytes incapable of learning behaviour equivalent to behaviour they have already learned. The status quo:

1. positional arguments are bound to parameters first, left to right;
2. keyword arguments are bound to parameters next;
3. regular (early bound) defaults are bound last.

All we're doing is adding one more step. If that is confusing to people, wait until you discover classes and operator precedence!

x = 2*3**4 - 1

Have some faith that coders aren't idiots. There are genuinely confusing features that are *inherently* complicated and complex, like threading, asynchronous code, metaclasses, the descriptor class, and we cope. But the idea that people won't be able to wrap their brains around the interpreter assigning defaults in four passes rather than three is not credible. 99% of the time you won't even think about it, and the one time in a hundred you do, it is simple. Early binding defaults are bound first, late binding defaults are bound as late as possible. (Maybe even as late as *on need* rather than before the body of the function is entered. That would be really nice, but maybe too hard to implement.)

-- Steve
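(A minimal present-day sketch of those four steps, with _MISSING as a hypothetical stand-in for "argument omitted" -- purely illustrative, not how the interpreter would implement it:)

_MISSING = object()  # hypothetical sentinel: "no argument passed"

# Hand-written equivalent of: def func(a=1, @b=c+1, c=2)
def func(a=_MISSING, b=_MISSING, c=_MISSING):
    # steps 1-2: positional/keyword binding already done by Python
    # step 3: early-bound defaults
    if a is _MISSING: a = 1
    if c is _MISSING: c = 2
    # step 4: late-bound defaults, evaluated in the call-time scope
    if b is _MISSING: b = c + 1
    return a, b, c

print(func())       # (1, 3, 2)
print(func(c=10))   # (1, 11, 10)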
On Tue, Oct 26, 2021 at 12:56:06PM +1100, Steven D'Aprano wrote:
Have some faith that coders aren't idiots. There are genuinely confusing features that are *inherently* complicated and complex, like threading, asynchronous code, metaclasses, the descriptor class, and we cope.
Sorry, that was a typo, I mean the descriptor *protocol*. -- Steve
On Tue, Oct 26, 2021 at 01:32:58AM +0100, Rob Cliffe via Python-ideas wrote:
Syntax bikeshedding: I still favour var := expr That clashes with the walrus operator. Remember that the walrus operator can appear inside the expression:
var:=spam+eggs:=(something+other) or eggs

That is a SyntaxError. I'm not sure what you mean; my best effort to make it legal is

(var:=spam+(eggs:=(something+other) or eggs))

And I don't understand what point you're making here. Yes, the walrus operator can appear in various places, how is that relevant? You could write

def f(a := (b := c)):

which might be a tad confusing but would be unambiguous and legal, just as

def f(a = (b := c)):

is currently legal (I tested it). I don't see a clash.
On 26/10/2021 02:56, Steven D'Aprano wrote:

Modifying the assignment symbol is wrong. This is not a new kind of assignment, it should use the same `=` as regular assignment. We are tagging the parameter to use late-binding, not using a different sort of assignment. The tag should be on the parameter name, not the assignment.

With respect, it IS a new kind of assignment. One which happens at a different time (and whose value may vary in multiple calls of the function). The value (however calculated) is assigned to the parameter. Once assigned, the parameter and its value are indistinguishable from ones that used early binding, or indeed had a value passed by the caller. It is not a new kind of parameter (in that sense).
IMO the similarity to early binding syntax is a good thing (or at least not a bad thing). Right, because binding is binding, and we should use the same `=`.
See above.
Just as the walrus operator is similar to `=` - after all they are both a form of assignment. But the walrus is a different form of assignment, it is an expression, not a statement.
Function parameter defaults are not literally statements, they are declarations, which are a kind of statement.
True of course. But it is also an assignment, in that the value on the RHS of the walrus is assigned to the variable on the LHS, just as with a regular assignment.

I don't think argument defaults should be allowed to refer to later arguments (or of course the current argument). That's making the interpreter's task too complicated, not to mention (surely?) inefficient. And it's confusing.

Worst case, the interpreter has to do two passes over the parameters instead of one. The inefficiency is negligible.

Perhaps I wasn't clear. When I said 'inefficiency', I meant to refer to cases like

def f(a := b+1, b := e+1, c := a+1, d := 42, e := d+1)

where late-binding defaults are allowed to refer to subsequent arguments. Here Python has to work out to assign first to d, then e, then b, then a, and finally c, which AFAICS requires multiple passes. But if it can all be worked out at compile time, it's not a runtime efficiency problem, though to my mind really obfuscated (why not write the arguments in the order they are intended to be calculated?). But to play Devil's Advocate for a moment, here is a possible use case:

def DrawCircle(centre, radius := circumference / TWO_PI, circumference := radius * TWO_PI):
    # Either radius or circumference can be passed, whichever is more convenient

As for confusing, I think you are conflating "it's new" with "it is confusing". You aren't confused. I doubt that anyone capable of writing a Python function would be confused by the concept:

def func(a=1, @b=c+1, c=2):

is no more confusing than the status quo:

def func(a=1, b=None, c=2):
    if b is None:
        b = c + 1

Confusing is perhaps the wrong word. I think the first example IS harder to read. When you read the first, you have to read a, then b, then 'oh what is the default value of b, I'll look at c', then skip back to b to see what the intention is, then forward again to c because you're interested in that too. It would be better written as

def func(a=1, c=2, @b=c+1):

There is some to-and-froing in the second example too, but the function header has fewer symbols and is easier to take in. The information is presented in more, smaller chunks (3 lines instead of 1). (Of course, this kind of argument could be used against all argument defaults (including early-bound ones), and a lot of other convenient language features as well. We have to use our common sense/intuition/judgement to decide when conciseness outweighs explicitness.)
If you can understand the second, you can understand the first. All you have to remember is that:
1. positional arguments are bound to parameters first, left to right;
2. keyword arguments are bound to parameters next;
3. regular (early bound) defaults are bound next;
4. and lastly, late-bound defaults are bound.
Easey-peasey.
I really wish people would stop assuming that fellow Python coders are knuckle-dragging troglodytes incapable of learning behaviour equivalent to behaviour they have already learned:
I don't. But equally I don't want to make their lives harder than they need be. If it turns out that binding all early-bound defaults before binding all late-bound defaults is the best solution (and there is support for this position), fine. It makes some cases legal and working that wouldn't be otherwise. My initial reaction (preferring strict L-to-R evaluation of all defaults) was very likely wrong.
The status quo:
1. positional arguments are bound to parameters first, left to right;
2. keyword arguments are bound to parameters next;
3. regular (early bound) defaults are bound last.
All we're doing is adding one more step. If that is confusing to people, wait until you discover classes and operator precedence!
x = 2*3**4 - 1
FWIW I have no trouble parsing that (although I came up with an incorrect answer of 80 because I forgot to actually do the `2*` 🙁).
[snip]
(Maybe even as late as *on need* rather than before the body of the function is entered. That would be really nice, but maybe too hard to implement.)
Never mind hard-to-implement, it would be REALLY confusing. When the default value is calculated up front, you know what it is (so to speak). If it could be calculated deep inside the function, perhaps in multiple places giving different answers, perhaps used repeatedly in a loop (but only calculated the first time), perhaps not calculated at all in some code paths, that's obfuscation and abuse. Better to use a sentinel value. Or something. Rob Cliffe
On Tue, Oct 26, 2021 at 08:59:51AM +0100, Rob Cliffe via Python-ideas wrote:
And I don't understand what point you're making here. Yes, the walrus operator can appear in various places, how is that relevant? You could write

def f(a := (b := c)):

which might be a tad confusing but would be unambiguous and legal, just as

def f(a = (b := c)):

is currently legal (I tested it). I don't see a clash.
If we have a choice between a dozen syntax variants that are not confusing, and one which is confusing, why would we prefer to pick the one that is confusing over any of the others?
Perhaps I wasn't clear. When I said 'inefficiency', I meant to refer to cases like

def f(a := b+1, b := e+1, c := a+1, d := 42, e := d+1)

where late-binding defaults are allowed to refer to subsequent arguments. Here Python has to work out to assign first to d, then e, then b, then a, and finally c, which AFAICS requires multiple passes.
No, I don't think we need to do anything that intricate. It is not the responsibility of the interpreter to **make it work** no matter what, any more than we expect the interpreter to make this work:

a = b + 1
b = e + 1
c = a + 1
d = 42   # <<<<<----- start here
e = d + 1

It is enough to have a simple rule:

- bind early bound defaults left to right first;
- bind late bound defaults left to right next.

(That's my preference.) Even simpler would be a strictly left-to-right single pass, but that would be, I think, too simple. YMMV. I'm not prepared to fight to the death over that one :-)

-- Steve
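(To make the two rules concrete, a hand-written simulation of a hypothetical `def f(@a=b+1, b=2)` -- the _MISSING sentinel is invented for the sketch, and the simulation raises TypeError where the real proposal would raise UnboundLocalError:)

_MISSING = object()  # illustrative sentinel

def f_two_pass(a=_MISSING, b=_MISSING):
    if b is _MISSING: b = 2        # pass 1: early-bound defaults
    if a is _MISSING: a = b + 1    # pass 2: late-bound defaults
    return a, b

def f_single_pass(a=_MISSING, b=_MISSING):
    if a is _MISSING: a = b + 1    # strict left to right: b is still unset here
    if b is _MISSING: b = 2
    return a, b

print(f_two_pass())    # (3, 2)
f_single_pass()        # TypeError in this sketch; UnboundLocalError under the proposal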
On Sun, Oct 31, 2021 at 7:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Oct 26, 2021 at 08:59:51AM +0100, Rob Cliffe via Python-ideas wrote:
And I don't understand what point you're making here. Yes, the walrus operator can appear in various places, how is that relevant? You could write

def f(a := (b := c)):

which might be a tad confusing but would be unambiguous and legal, just as

def f(a = (b := c)):

is currently legal (I tested it). I don't see a clash.
If we have a choice between a dozen syntax variants that are not confusing, and one which is confusing, why would we prefer to pick the one that is confusing over any of the others?
Perhaps I wasn't clear. When I said 'inefficiency', I meant to refer to cases like

def f(a := b+1, b := e+1, c := a+1, d := 42, e := d+1)

where late-binding defaults are allowed to refer to subsequent arguments. Here Python has to work out to assign first to d, then e, then b, then a, and finally c, which AFAICS requires multiple passes.
No, I don't think we need to do anything that intricate. It is not the responsibility of the interpreter to **make it work** no matter what, any more than we expect the interpreter to make this work:
a = b + 1
b = e + 1
c = a + 1
d = 42   # <<<<<----- start here
e = d + 1
It is enough to have a simple rule:
- bind early bound defaults left to right first;
- bind late bound defaults left to right next.
(That's my preference.) Even simpler would be a strictly left-to-right single pass but that would be, I think, too simple. YMMV. I'm not prepared to fight to the death over that one :-)
My current implementation uses the two-pass system, but only because that's simpler to code. Philosophically, I would prefer a one-pass system. What I will say is: I'm not going to lock the language into the two-pass system, and other Python implementations, or future versions of CPython, should be free to go one-pass. In any case, I have yet to see any examples that depend on two-pass that I wouldn't have rejected in code review. ChrisA
On 31/10/2021 08:05, Steven D'Aprano wrote:
On Tue, Oct 26, 2021 at 08:59:51AM +0100, Rob Cliffe via Python-ideas wrote:
And I don't understand what point you're making here. Yes, the walrus operator can appear in various places, how is that relevant? You could write

def f(a := (b := c)):

which might be a tad confusing but would be unambiguous and legal, just as

def f(a = (b := c)):

is currently legal (I tested it). I don't see a clash.

If we have a choice between a dozen syntax variants that are not confusing, and one which is confusing, why would we prefer to pick the one that is confusing over any of the others?
"confusing" is a subjective term. So is "evocative" - see below. Consider the colon. Currently it has various meanings, including - introducing an indented suite - slicing - dictionary displays - use in annotations (I think) and others I don't remember right now, probably half-a-dozen in all. But we don't get confused when we (very commonly) see it used in 2 different ways in the same line: if not sys.argv[1:]: # 1st example I came across in the stdlib nor would we if it were used in 3 or more ways in the same line. Because we're familiar with it. Now def f(a := (b := c)): might be a tad confusing AT FIRST, until we got used to it. But - this would be very uncommon, and IMO in most cases bad code - there would never be *more* than 2 meanings for `:=` in the same line, because there would only BE two meanings for it. (I refrain from calling it `the walrus operator` here because the first one would be better called `the late default assignment operator` or some such.) Meanwhile, BEFORE we got used to it, I maintain that the similarity of `:=` to the early default assignment operator, viz. `=`, not to mention to the real walrus operator, is definitely evocative of some sort of (default?) value being given (somehow, sometime) to the parameter. Whereas adding (somewhere) a symbol such as '@' or '?' conveys nothing to me. YMMV, naturally. And of course it keeps that symbol free for another future use. Best wishes Rob Cliffe
On 26/10/2021 02:56, Steven D'Aprano wrote:
On Tue, Oct 26, 2021 at 01:32:58AM +0100, Rob Cliffe via Python-ideas wrote:
Syntax bikeshedding: I still favour var := expr That clashes with the walrus operator. Remember that the walrus operator can appear inside the expression:
var:=spam+eggs:=(something+other) or eggs
Sorry, I finally understood your point. I still don't see a problem. This would be legal:

def f(var:=spam+(eggs:=(something+other) or egg)):

just as

def f(var=spam+(eggs:=(something+other) or egg)):

is currently legal. (The extra pair of parentheses I added are necessary.) Rob Cliffe
On 2021-10-25 18:56, Steven D'Aprano wrote:
Modifying the assignment symbol is wrong. This is not a new kind of assignment, it should use the same `=` regular assignment. We are tagging the parameter to use late-binding, not using a different sort of assignment. The tag should be on the parameter name, not the assignment.
I agree with half of this :-). I agree that it's not a new kind of assignment. But I don't think we're tagging the parameter to use late binding. We're tagging the default value itself to not be evaluated right now (i.e., at function definition time) but later (at call time). To me this is another thing that suggests a more general deferred-evaluation system is the best way to handle this. If we're tagging the default value to not be evaluated "right now", why must we restrict "right now" to be "the time when we're defining a function", and restrict this to apply to function parameters rather than, well, anything? Why not just say we can tag stuff as not being evaluated right now and then later evaluate it? -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On 25 Oct 2021, at 08:08, Steven D'Aprano <steve@pearwood.info> wrote:
I would say that it makes most sense to assign early-bound defaults first, then late-bound defaults, specifically so that late-bound defaults can refer to early-bound ones:
def func(x=0, @y=x+1)
So step 3 above should become:
In this case you do not need a new rule to make it work, since in left-to-right order x = 0 is bound first.

def func(@y=x+1, @x=0):

Is it unreasonable to get an UnboundLocalError or SyntaxError for this case? I'm not convinced that extra rules are needed. Barry
On Mon, Oct 25, 2021 at 02:59:02PM +0100, Barry Scott wrote:
def func(@y=x+1, @x=0):
Is it unreasonable to get a UnboundLocal or SyntaxError for this case?
I think that UnboundLocalError is fine, if the caller doesn't supply x. So all of these cases will succeed:

func(21, 20)
func(21, x=20)
func(y=21, x=20)
func(x=20, y=21)
func(x=20)

and I think that the only[1] case that fails is:

func()

An UnboundLocalError here is perfectly fine. That error is conceptually the same as this:

def func():
    y = x + 1
    x = 0

and we don't try to make that a syntax error.

[1] To be pedantic, there are other cases like func(x=20, x=20) and func(1, 2, 3, 4) that also fail. But you knew what I meant :-)

-- Steve
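(A quick present-day check of that list, simulating func(@y=x+1, @x=0) with the None idiom; note the simulation raises TypeError where the proposal would raise UnboundLocalError:)

def func(y=None, x=None):
    if y is None:
        y = x + 1   # late default for y, evaluated left to right
    if x is None:
        x = 0       # late default for x
    return y, x

print(func(21, 20))    # (21, 20)
print(func(x=20))      # (21, 20)
print(func(y=21))      # (21, 0)
func()                 # fails: the one case where neither argument is given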
On 2021-10-24 11:23, Chris Angelico wrote:
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
I don't understand what this means. The ones that are early-evaluated were already evaluated at function definition time. From the PEP:
Function default arguments can be defined using the new ``=>`` notation::
def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
def connect(timeout=>default_timeout):
def add_item(item, target=>[]):
The expression is saved in its source code form for the purpose of inspection, and bytecode to evaluate it is prepended to the function's body.
To me that last part clearly indicates the way things should go. They should go exactly like they currently go with the `if arg is None` idiom. The code that gets prepended to the beginning of the function should be exactly equivalent (or as exactly as something can be equivalent to pseudocode :-) to:

for arg in arglist:
    if arg is_undefined:
        arg = eval(late_evaluated_default)

Arguments that don't have any kind of default won't reach this stage, because the function should fail with a TypeError (missing argument) before even getting to evaluating late-bound arguments. Arguments that have early-bound defaults also won't reach this stage, because they can't be "undefined" --- either a value was passed, or the early-bound default was used.

A strict left-to-right evaluation order seems by far the easiest to explain, and easily allows for the kind of mutually-referential cases under discussion. If the only problem is that UnboundLocalError is a weird error, well, that's a small price to pay. If possible it would be nice to detect if the UnboundLocalError was referring to another late-bound argument in the signature and give a nicer error message. But UnboundLocalError makes more sense than SyntaxError for sure. Of course, as I said, I don't support this proposal at all, but I appear to be in the minority on that, and if it does go through I think it would be even worse if it raises SyntaxError. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 24, 2021 at 9:58 AM Chris Angelico <rosuav@gmail.com> wrote:
def puzzle(*, a=>b+1, b=>a+1): return a, b
Aside: In a functional programming language a = b + 1 b = a + 1 would be a syntax (or at least compile time) error.
but it's a NameError in Python, yes? or maybe an UnboundLocalError, depending on context.
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
This, indeed, would be similar to the behaviour of the current idiom we're trying to replace:

In [42]: def fun(a=None, b=None):
    ...:     a = a if a is not None else b + 1
    ...:     b = b if b is not None else a + 1
    ...:     print(f'{a=} {b=}')

similar in the sense that whether it works or not depends on how the function is called.

In [45]: fun()
---------------------------------------------------------------------------
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

In [46]: fun(3)
a=3 b=4
I'm currently inclined towards SyntaxError,
Except that it's not invalid syntax -- as far as syntax is concerned, you can put any valid expression in there -- it's only "bad" if it happens to use a name that is also being late bound. Is there any other place in Python where a syntax error depends on what names are used (other than keywords, of course)? I'll leave it to folks that understand the implementation a lot better than I, but how hard would it be to make that trigger a SyntaxError? So I vote for UnboundLocalError
since permitting it would
open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
Even if not, maybe it should raise UnboundLocalError rather than SyntaxError. I like how it closely mirrors the current idiom, and I don't think saying that it's evaluated left to right is all that complex. But no, I can't think of a use case :-) -CHB
On Mon, Oct 25, 2021 at 4:24 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Sun, Oct 24, 2021 at 9:58 AM Chris Angelico <rosuav@gmail.com> wrote:
def puzzle(*, a=>b+1, b=>a+1): return a, b
Aside: In a functional programming language a = b + 1 b = a + 1 would be a syntax (or at least compile time) error.
but it's a NameError in Python, yes? or maybe an UnboundLocalError, depending on context.
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
This, indeed, would be similar to the behaviour of the current idiom we're trying to replace:

In [42]: def fun(a=None, b=None):
    ...:     a = a if a is not None else b + 1
    ...:     b = b if b is not None else a + 1
    ...:     print(f'{a=} {b=}')

similar in the sense that whether it works or not depends on how the function is called.

In [45]: fun()
---------------------------------------------------------------------------
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

In [46]: fun(3)
a=3 b=4
I'm currently inclined towards SyntaxError,
Except that it's not invalid syntax -- as far as syntax is concerned, you can put any valid expression in there -- it's only "bad" if it happens to use a name that is also being late bound. Is there any other place in Python where a syntax error depends on what names are used (other than keywords, of course)?
In the first draft of the PEP, I left it underspecified, and was mentally assuming that there'd be an UnboundLocalError. Making it a SyntaxError isn't too difficult, since the compiler knows about all arguments as it's building that bytecode. It's like the checks to make sure you don't try to declare one of your parameters as 'global' (a tiny example of that existing check appears below). There are several reasons that I'm preferring SyntaxError here.

1) No use-case springs to mind where you would want arguments to depend on each other.
2) If, even years in the future, a use-case is found, it's much easier to make the language more permissive than less.
3) Precisely matching the semantics of "if x is None: x = <expr>" would allow right-hand references to early-bound defaults but not other late-bound defaults
4) OTOH, stating that arguments are processed left-to-right implies that *everything* about each arg is processed in that order
5) If the wrong-direction reference is a bug, it's much more helpful to get an early SyntaxError than to get UnboundLocalError later and more rarely.

(BTW: I'm seeing only two options here - SyntaxError at compilation time or UnboundLocalError at call time. There's no reason to have an error at function definition time. I could be wrong though.)
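(That existing name-dependent compile-time check, for reference -- this is today's behaviour, no new syntax involved:)

def f(x):
    global x    # today: SyntaxError: name 'x' is parameter and global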
I like how it closely mirrors the current idiom, and I don't think saying that it's evaluated left to right is all that complex.
But no, I can't think of a use case :-)
Yeah, it seems perfectly reasonable, but it becomes messy. Consider:

def fun(a=1, b=2):
    print(a, b)

If you change one or both of those to late-bound, it doesn't change anything. Great! Now:

def fun1(a=>b + 1, b=2):
    print(a, b)

def fun2(a=>b + 1, b=>2):
    print(a, b)

Having b late-bound doesn't change b in any way, but it could make a bizarre difference to a's legality, depending on whether "left to right" means only late-bound defaults, or all argument processing. We've recently had a bit of a thread on python-list about the restrictions on assignment expressions, and tying in with that, the recent relaxing of restrictions on decorators. Both examples give me confidence that restricting "wrong-direction" references is correct, at least for now; if it turns out to be wrong, it can be changed. ChrisA
On Mon, 2021-10-25 at 03:47 +1100, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
def puzzle(*, a=>b+1, b=>a+1): return a, b
Aside: In a functional programming language a = b + 1 b = a + 1 would be a syntax (or at least compile time) error.
I was about to ask about this. But also, how does that go together with non-required arguments?

def function(arr=>np.asarray(arr)):
    pass

Would seem like something we may be inclined to write instead of:

def function(arr):
    arr = np.asarray(arr)

(if that is legal syntax). In that case `arr` is a required parameter. Which then means that you cannot do it for optional parameters?:

def function(arr1=>np.asarray(arr1), arr2=>something):
    arr2 = np.asarray(arr2)  # in case arr2 was passed in

Which is fair, but feels like a slightly weird difference in usage between required and optional arguments?
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
Not sure that I am scared of this if it gives a clear exception:

    Parameter `a` was not passed, but it can only be omitted when parameter `b` is passed.

Not as clear (or complete) as a custom message, but not terrible? Cheers, Sebastian
ChrisA
On Mon, Oct 25, 2021 at 3:12 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Mon, 2021-10-25 at 03:47 +1100, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
def puzzle(*, a=>b+1, b=>a+1): return a, b
Aside: In a functional programming language a = b + 1 b = a + 1 would be a syntax (or at least compile time) error.
I was about to ask about this. But also, how does that go together with non-required arguments?
def function(arr=>np.asarray(arr)):
    pass
Would seem like something we may be inclined to write instead of:
def function(arr):
    arr = np.asarray(arr)
(if that is legal syntax). In that case `arr` is a required parameter. Which then means that you cannot do it for optional parameters?:
This is all about argument defaults, not transforming values that were passed in. So if you pass a value, you always get exactly that value.
def function(arr1=>np.asarray(arr), arr2=>something):
    arr2 = np.asarray(arr2)  # in case arr2 was passed in
Which is fair, but feels like a slightly weird difference in usage between required and optional arguments?
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
Not sure that I am scared of this if it gives a clear exception:
Parameter `a` was not passed, but it can only be omitted when parameter `b` is passed.
Not as clear (or complete) as a custom message, but not terrible?
A tad complicated and would require some hairy analysis. Also, I don't really want to encourage argument defaults that depend on each other :) ChrisA
On Mon, Oct 25, 2021 at 03:47:29AM +1100, Chris Angelico wrote:
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
You said it yourself: "perfectly legal and sensible". Why would this be a "hard-to-track-down" bug? You get an UnboundLocalError telling you exactly what the problem is:

UnboundLocalError: local variable 'b' referenced before assignment

-- Steve
On Sun, Oct 24, 2021 at 05:40:55PM +0100, Jonathan Fine wrote:
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
def puzzle(*, a=>b+1, b=>a+1):
    return a, b
We can consider that to be syntactic sugar for:

def puzzle(*, a=None, b=None):
    if a is None:
        a = b+1
    if b is None:
        b = a+1

So that has a perfectly sensible interpretation:

- a is optional
- b is optional
- but you must supply at least one.

and should be perfectly legal. I see no reason to prohibit it.

(It would be nice if we could give a better exception, rather than just UnboundLocalError, but that's not essential.)

-- Steve
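That desugaring can be exercised directly. A sketch, with a return statement added for demonstration; note that under the PEP the no-argument call would raise UnboundLocalError rather than the TypeError the None idiom produces:

def puzzle(*, a=None, b=None):
    if a is None:
        a = b + 1
    if b is None:
        b = a + 1
    return a, b

print(puzzle(a=1))   # (1, 2)
print(puzzle(b=5))   # (6, 5)
puzzle()             # TypeError from None + 1, i.e. "supply at least one"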
On 10/24/21 11:22 PM, Steven D'Aprano wrote:
On Sun, Oct 24, 2021 at 05:40:55PM +0100, Jonathan Fine wrote:
def puzzle(*, a=>b+1, b=>a+1):
    return a, b
We can consider that to be syntactic sugar for:
def puzzle(*, a=None, b=None):
    if a is None:
        a = b+1
    if b is None:
        b = a+1
So that has a perfectly sensible interpretation:
- a is optional
- b is optional
- but you must supply at least one.
and should be perfectly legal. I see no reason to prohibit it.
(It would be nice if we could give a better exception, rather than just UnboundLocalError, but that's not essential.)
+1
On Sun, Oct 24, 2021, 12:20 AM Chris Angelico
How would it know to look for a and b inside fn2's scope, instead of looking for x inside fn2's scope?
The same way 'eval("a+b")' knows to look in the local scope when evaluated.

I mean, of course 'x' could be rebound in some scope before it was evaluated. But a "deferred" object itself would simply represent potential computation that may or may not be performed.

If we wanted to use descriptors, and e.g. use 'x.val' rather than plain 'x', we could do it now with descriptors. But not with plain variable names now.
On Mon, Oct 25, 2021 at 12:47 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sun, Oct 24, 2021, 12:20 AM Chris Angelico
How would it know to look for a and b inside fn2's scope, instead of looking for x inside fn2's scope?
The same way 'eval("a+b")' knows to look in the local scope when evaluated.
I mean, of course 'x' could be rebound in some scope before it was evaluated. But a "deferred" object itself would simply represent potential computation that may or may not be performed.
If we wanted to use descriptors, and e.g. use 'x.val' rather than plain 'x' , we could do it now with descriptors. But not with plain variable names now.
Not sure I understand. Your example was something like:

def fn2(thing):
    a, b = 13, 21
    x = 5
    print("Thing is:", thing)

def f(x=defer: a + b):
    a, b = 3, 5
    fn2(defer: x)
    return x

So inside f(), "defer: a + b" will look in f's scope and use the variables a and b from there, but passing a "defer: x" to fn2 will use x from f's scope, and then a and b from fn2's?

ChrisA
On Sun, Oct 24, 2021, 10:11 AM Chris Angelico
Not sure I understand. Your example was something like:
def fn2(thing):
    a, b = 13, 21
    x = 5
    print("Thing is:", thing)
def f(x=defer: a + b):
    a, b = 3, 5
    fn2(defer: x)
    return x
So inside f(), "defer: a + b" will look in f's scope and use the variables a and b from there, but passing a "defer: x" to fn2 will use x from f's scope, and then a and b from fn2's?
Yes, basically as you describe. A "deferred object" is basically just a string that knows to wrap itself in eval() when accessed (maybe Steven's "thunk" is a better term... but definitely something first-class, unlike in Algol). So within fn2() the parameter 'thing' is bound to an object like '<defer eval("a+b")>'.

Indeed this means that the 'defer:' or 'thunk' or special-symbol spelling has to decide whether the thing being deferred is already itself a deferred object. If so, just pass along the identical object rather than treat it as a new expression. At least that's what would feel most intuitive to me. But I'm deliberately not pushing a specific syntax. For example, if we didn't want to "reuse" the spelling 'defer:' for both creating and passing a deferred object, we could have different spellings:

def f(x=defer: a + b):
    a, b = 3, 5
    fn2(noeval: x)
    return x

... I don't like that spelling, but just showing the concept.
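A minimal sketch of such a deferred object is possible today with eval() and frame introspection. The class name and the explicit .force() call are inventions for illustration; the idea above wants the evaluation to be transparent, which plain Python cannot express:

import sys

class Deferred:
    """Source text that is evaluated in the scope of whoever forces it."""
    def __init__(self, source):
        self.source = source

    def force(self):
        caller = sys._getframe(1)  # the frame asking for the value
        return eval(self.source, caller.f_globals, caller.f_locals)

def fn2(thing):
    a, b = 13, 21
    print("Thing is:", thing.force())  # evaluates "a + b" here -> 34

fn2(Deferred("a + b"))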
Actually, the "defer:" syntax is really readable and searchable compared to the cryptic comparison operator used in the proposal. Just thinking towards "googleability".

Furthermore, the concept is even more general than parameter definition of functions and methods. I guess a lot of people have already tried to implement this kind of object various times before (proxy objects, transparent futures etc.)

It also would fit the current semantics of default parameters of Python.

Cheers Sven

On 24.10.21 16:37, David Mertz, Ph.D. wrote:
On Sun, Oct 24, 2021, 10:11 AM Chris Angelico
Not sure I understand. Your example was something like:
def fn2(thing):
    a, b = 13, 21
    x = 5
    print("Thing is:", thing)
def f(x=defer: a + b):
    a, b = 3, 5
    fn2(defer: x)
    return x
So inside f(), "defer: a + b" will look in f's scope and use the variables a and b from there, but passing a "defer: x" to fn2 will use x from f's scope, and then a and b from fn2's?
Yes, basically as you describe. A "deferred object" is basically just a string that knows to wrap itself in eval() when accessed (maybe Steven's "thunk" is a better term... But definitely something first-class, unlike in Algol).
So within fn2() the parameter 'thing' is bound to an object like '<defer eval("a+b")>'.
Indeed this means that the 'defer:' or 'thunk' or special symbol spelling, has to decide whether the thing being deferred is already itself a deferred object. If so, just pass along the identical object rather than treat it as a new expression.
At least that's what would feel most intuitive to me. But I'm deliberately not pushing a specific syntax. For example, if we didn't want to "reuse" the spelling 'defer:' for both creating and passing a deferred object, we could have different spellings.
def f(x=defer: a + b):
    a, b = 3, 5
    fn2(noeval: x)
    return x
... I don't like that spelling, but just showing the concept.
On Sun, Oct 31, 2021 at 9:24 PM Sven R. Kunze <srkunze@mail.de> wrote:
Actually, the "defer:"-syntax is really readable and searchable compared to the cryptic comparison operator used in the proposal. Just thinking towards "googleability".
Google's smarter than that. I've searched for symbols before and found plenty of good results. For instance, I can search for information about the @ sign before a function, or independently, for a @ b, and get information about decorators or matrix multiplication. We don't need words - especially not words that will break people's code - in order for people to find information.
Furthermore, the concept is even more general than parameter definition of functions and methods. I guess a lot of people have already tried to implement this kind of object various times before (proxy objects, transparent futures etc.)
It also would fit the current semantics of default parameters of Python.
It's a completely different feature, and has very different consequences. It is not a complete replacement for default expressions. Notably, it can't refer to anything in the caller's context, without breaking a lot of things about Python's namespacing model. ChrisA
On 31.10.21 12:34, Chris Angelico wrote:
Google's smarter than that. I've searched for symbols before and found plenty of good results. For instance, I can search for information about the @ sign before a function, or independently, for a @ b, and get information about decorators or matrix multiplication. We don't need words - especially not words that will break people's code - in order for people to find information.
It seems we disagree here. :)
It's a completely different feature, and has very different consequences. It is not a complete replacement for default expressions. Notably, it can't refer to anything in the caller's context, without breaking a lot of things about Python's namespacing model.
People on the threads said that they simply want to initialize an empty list [] by a desire to avoid the None scheme. I would rather solve those kind of issues than help to squeeze complicated logic into default parameters. But that's just my take on it looking from testing and maintenance perspective here.

Another idea that comes to my mind is that a separate object allows more in terms of the open-closed principle than a fixed syntax used for one single, hopefully best use-case. Thinking here of the call-by-name and call-by-need evaluation.

About the namespacing issue: I disagree here because it is always possible to interface these kind of variables explicitly (like we do with globals, locals, builtins, etc.). So, it would be a compatible addition. Still, we talk about default parameters.

Best Sven

* Trying that, searches do not present the word "matrix multiplication"; at least not to me.
On Sun, Oct 31, 2021 at 09:20:32PM +0100, Sven R. Kunze wrote:
People on the threads said that they simply want to initialize an empty list [] by a desire to avoid the None scheme.
I would rather solve those kind of issues than help to squeeze complicated logic into default parameters. But that's just my take on it looking from testing and maintenance perspective here.
Sorry Sven, I'm not sure I understand you. Are you suggesting that it is simpler and easier to create a whole new execution model for Python, involving a general mechanism for delayed evaluation of expressions, to solve the mutable list default problem, than to merely add late-bound defaults to functions?

What sort of testing and maintenance perspective are you referring to? Testing and maintenance of the Python interpreter? Or your own code?

If it is your own code then this proposal will have no effect on your testing and maintenance. You already have tests, don't you? Then they will work the same way whether your functions use late-bound defaults or not. If you don't have tests, then the tests that you don't have will continue to not work the same as they currently don't work.

Whether your functions use the proposed new late-bound defaults, or the legacy work-around pseudo-late-bound defaults using a sentinel, is a matter of your functions' internal implementation. It's not something that you should write a doctest or unittest or regression test for:

def test_late_bound_default_is_used(self):
    # Test that late-binding is used for defaults.
    ...

Pedantry: you may be able to test for it using introspection on the function object, but *why* would you do it???

The choice between the legacy "check for None" idiom and the proposed new late-binding idiom is an implementation detail. (Except for the case that your API intentionally documents that None is usable as an argument, in which case, simply change nothing and your code will continue to work as it does today. Including your tests that None is actually usable as an argument.)
Another idea that comes to my mind is that a separate object allows more in terms of the open-closed principle than a fixed syntax used for one single, hopefully best use-case. Thinking here of the call-by-name and call-by-need evaluation.
Call-by-name and call-by-need are, as far as I can tell, specific implementations with no intentional behaviour differences that are visible to the caller. Unlike call-by-value and call-by-reference, which do have intentional behaviour differences. Call-by-value makes copies of your arguments; call-by-reference allows you to modify variables in the caller's scope. -- Steve
On Mon, Nov 1, 2021 at 11:26 AM Steven D'Aprano <steve@pearwood.info> wrote:
What sort of testing and maintenance perspective are you referring to? Testing and maintenance of the Python interpreter? Or your own code?
If it is your own code then this proposal will have no effect on your testing and maintenance. You already have tests, don't you? Then they will work the same way whether your functions use late-bound defaults or not. If you don't have tests, then the tests that you don't have will continue to not work the same as they currently don't work.
For the record: Out of 430 test files in the CPython test suite, only eight failed after I finished my first version of the implementation. That's a lot of tests that don't even care. (Some of those failures indicate actual problems that I needed to fix, or still need to. A couple are simple and very rigid tests that check things like the size of a function object. I don't think any of them indicate actual problems in any place other than the code I actually need to change - the parser, executor, and the inspect module.) Most code won't even be aware of this change. ChrisA
Love the proposed syntax, it's clear and visible without being verbose. I was a bit worried with the "=:" syntax as, at first glance, it can be easily mistaken for a regular "=", but you can't miss "=>". Also, I think the syntax fits really well with the meaning of the operator, which is important. I'm also partial to "?=", as a second option, as it reads like a ternary operator (x?=len(y) -> x if x else len(y)).

The point about scope for late binding is also interesting and a good idea. It gives us what I think is powerful syntax for OO:

class Foo:
    def bar(self, baz=>self.baz):
        ...

This is great! +10 from me.
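For comparison, the way that method has to be written today. A sketch only; the attribute value is arbitrary, and the None dance breaks down if None is itself a meaningful argument:

class Foo:
    def __init__(self):
        self.baz = 42  # arbitrary attribute for the example

    def bar(self, baz=None):  # today's spelling of baz=>self.baz
        if baz is None:
            baz = self.baz    # self *is* in scope here, at call time
        return baz

print(Foo().bar())    # 42
print(Foo().bar(7))   # 7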
Let's talk about the choice of spelling. So far, most of the suggested syntaxes have modified the assignment symbol `=` using either a prefix or a suffix. I'm going to use '?' as the generic modifier. So the parameter will look like:

# Modifier as a prefix
param?=expression
param:Type?=expression

# Modifier as a suffix
param=?expression
param:Type=?expression

One problem with these is that (depending on the symbol used), they can be visually confused with existing or possible future features, such as:

* using a colon may be confusable with a type hint or walrus operator;
* using a greater-than may be confusable with the proposed Callable sugar -> or lambda sugar =>, as well as just plain old greater-than;
* using a question mark may be confusable with hypothetical None-aware ??= assignment.

By confusable, I don't mean that a sophisticated Python programmer who reads the code carefully with full attention to detail can't work out what it means. I mean that new users may be confused between (say) the walrus `:=` and "reverse walrus" `=:`. Or the harassed and stressed coder working at 3am while their customer on the other side of the world keeps messaging them. We don't always get to work carefully with close attention to detail, so we should prefer syntax that is less likely to be confusable.

So far, I dislike all of those syntaxes (regardless of which symbol is used as the modifier). They are all predicated on the idea that this is a new sort of assignment, which I think is the wrong way to think about it. I think that the better way to think about it is one of the following:

1. It's not the assignment that is different, it is the expression being bound.

2. It is not the assignment that is different, it is the parameter.

Suggestion #1 suggests that we might want a new kind of expression, which for lack of a better term I'm going to call a thunk (the term is stolen from Algol). Thunks represent a unit of delayed evaluation, and if they are worth doing, they are worth doing anywhere, not just in parameter defaults. So this is a much bigger idea, and a lot more pie-in-the-sky as it relies on thunks being plausible in Python's evaluation model, so I'm not going to talk about #1 here.

Suggestion #2 is, I will argue, the most natural way to think about this. It is the parameter that differs: some parameters use early binding, and some use late binding. Binding is binding, regardless of when it is performed. When we do late binding manually, we don't do this:

if param is None:
    param ?= expression  # Look, it's LATE BINDING assignment!!!

So we shouldn't modify the assignment operator. It's still a binding. What we want to do is tell the compiler that *this parameter* is special, and so the assignment needs to be delayed to function call time rather than function build time. We can do that by tagging the parameter. What's another term for tagging something? Decorating it. That suggests a natural syntax:

# arg1 uses early binding, arg2 uses late binding
def function(arg1=expression, @arg2=expression):

And with type annotations:

def function(arg1:Type=expression, @arg2:Type=expression) -> Type:

I know it's not an actual decorator, but it suggests the idea that we're decorating the parameter to use late binding instead of early.

Advantages:

- The modifier is right up front, where it is obvious.
- Doesn't look like grit on the monitor.
- It can't be confused with anything inside the type hint or the expression.
- No matter how ludicrously confusing the annotation or expression gets, the @ modifier still stands out and is obvious.

- Forward-compatible: even if we invent a prefix-unary @ operator in the future, this will still work:

def function(@param:@Type=@expression)

- Likewise for postfix unary operators:

def function(@param:Type@=expression@)

Here's a trivial advantage: with the "modify the equals sign" syntax, if you decide to copy the assignment outside of the function signature, you are left with a syntax error:

def function(param?=expression)

# double-click on "param", drag to expression, copy and paste
param?=expression  # SyntaxError

It's not a big deal, but I can see it being a minor annoyance, especially confusing for newbies. But with a leading @ symbol, you can double-click on the param name, drag to the expression, copy and paste, and in most GUI editors, the @ symbol will not be selected or copied.

def function(@param=expression)

# double-click on "param", drag to expression, copy and paste
param=expression  # Legal code.

(I don't know of any GUI editors that consider @ to be part of a word when double-clicking, although I suppose there might be some.)

Disadvantages:

- Maybe "at symbol" is clunkier to talk about than "arrow operator" or "reverse walrus"?

- Search engines aren't really great at searching for the at symbol:

https://www.google.com.au/search?q=python+what+does+%40+mean
https://duckduckgo.com/?q=python+what+does+%40+mean

DDG gives as the top hit a Quora post about the at symbol, but everything else is a miss; Google is even worse. But then any other symbol is going to be subject to the same problem.

Looking back at the "modify the equals" syntax, it puts the important information right there in the middle of something which could be an extremely busy chunk of text:

param:Optional[Callable[TypeA, TypeB, Bool]]=>lambda a, b: a>lo and b>hi

Even if it is syntactically unambiguous, and not confused with anything else, it is still not obvious. It doesn't stand out when skimming the code. And we all sometimes just skim code.

@param:Optional[Callable[TypeA, TypeB, Bool]]=lambda a, b: a>lo and b>hi

Let's have a look at some real cases from the stdlib:

# bisect.py
def bisect_right(a, x, lo=0, @hi=len(a), *, key=None):

# calendar.py
class LocaleTextCalendar(TextCalendar):
    def __init__(self, firstweekday=0, @locale=_locale.getdefaultlocale()):

# copy.py
def deepcopy(x, @memo={}, _nil=[]):

# pickle.py
class _Pickler:
    def __init__(self, file, @protocol=DEFAULT_PROTOCOL, *, fix_imports=True, buffer_callback=None):

(Note: some of these cases may be backwards-incompatible changes, if the parameter is documented as accepting None.)

-- Steve
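The deepcopy rewrite above is a reminder of why a late-bound `{}` matters at all: with today's early binding, a mutable default is created once at definition time and shared across calls. A minimal demonstration of that well-known pitfall (function name is illustrative):

def remember(item, seen=[]):  # early-bound: ONE list, created at def time
    seen.append(item)
    return seen

print(remember(1))  # [1]
print(remember(2))  # [1, 2] -- state leaked between unrelated calls

# A late-bound default (@seen={} or seen=>[]) would instead build a
# fresh container on each call that omits the argument.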
On Sun, Oct 24, 2021 at 3:43 PM Steven D'Aprano <steve@pearwood.info> wrote:
So far, I dislike all of those syntaxes (regardless of which symbol is used as the modifier). They are all predicated on the idea that this is a new sort of assignment, which I think is the wrong way to think about it. I think that the better way to think about it is one of the following:
1. It's not the assignment that is different, it is the expression being bound.
2. It is not the assignment that is different, it is the parameter.
Or 3. It is not the assignment that is different, just when it occurs.
Suggestion #2 is, I will argue, the most natural way to think about this. It is the parameter that differs: some parameters use early binding, and some use late binding. Binding is binding, regardless of when it is performed. When we do late binding manually, we don't do this:
if param is None:
    param ?= expression  # Look, it's LATE BINDING assignment!!!
That's because, by the time you can even think about doing that, ANY assignment is that. So it's a plain equals sign.
- No matter how ludicrously confusing the annotation or expression gets, the @ modifier still stands out and is obvious.
I dispute the value of this. It shouldn't stand out that much, because ultimately, it is still defining an optional parameter and giving it a default value.
Here's a trivial advantage: with the "modify the equals sign" syntax, if you decide to copy the assignment outside of the function signature, you are left with a syntax error:
def function(param?=expression) # double-click on "param", drag to expression, copy and paste
param?=expression # SyntaxError
Yes, but that's true of many things, including dict display, and even assignment expressions.
Disadvantages:
- Maybe "at symbol" is clunkier to talk about than "arrow operator" or "reverse walrus"?
- Search engines aren't really great at searching for the at symbol:
https://www.google.com.au/search?q=python+what+does+%40+mean
https://duckduckgo.com/?q=python+what+does+%40+mean
DDG gives the top hit a Quora post about the at symbol, but everything else is a miss; Google is even worse. But then any other symbol is going to be subject to the same problem.
Searching "python at sign" gives some good results, but the other problem with searching for symbols is that they're used in multiple ways. Ultimately, you won't get what you want by just searching for individual characters. Short of inventing a new language keyword, I don't think we're going to solve that. ChrisA
This should probably reference PEP 661 (Sentinel Values) which is being discussed on Discourse: https://discuss.python.org/t/pep-661-sentinel-values/9126

It's a different proposal, but one of the major motivating use cases (if not the only one) for sentinels is handling function default values that can't be expressed at definition time. So how the two proposals interact should be discussed *somewhere*, IMO.

Personally I'd choose to support this proposal, and take the view that it weakens the need for PEP 661 to the point where I'd prefer not to bother with that proposal.

Paul

On Sun, 24 Oct 2021 at 01:15, Chris Angelico <rosuav@gmail.com> wrote:
[PEP text quoted in full; snipped]
On Sun, Oct 24, 2021 at 9:34 PM Paul Moore <p.f.moore@gmail.com> wrote:
This should probably reference PEP 661 (Sentinel Values) which is being discussed on Discourse: https://discuss.python.org/t/pep-661-sentinel-values/9126
It's a different proposal, but one of the major motivating use cases (if not the only one) for sentinels is handling function default values that can't be expressed at definition time. So how the two proposals interact should be discussed *somewhere*, IMO.
Personally I'd choose to support this proposal, and take the view that it weakens the need for PEP 661 to the point where I'd prefer not to bother with that proposal.
Good point; I'll add a reference. When I was searching the stdlib for examples of different arg-passing idioms, I came across quite a number of uses of object() which weren't used for argument defaults, so while it's true that PEP 661 would lose one use-case if 671 is accepted, I still think that 661 would have plenty of value. The specific example that PEP 661 cites - traceback.print_exception() - wouldn't be able to be transformed using PEP 671 alternate defaults, unless the API were to be changed somewhat. You can't specify the traceback while leaving the value at its default, and PEP 671 is stateless with regard to multiple parameters. (That said, though: traceback.format_exception_only() could benefit from PEP 671.) ChrisA
Agreed on the point about PEP 661, if this is accepted I don't think it will have much to offer. For what it's worth I'm a very strong +1 on PEP 671 as it stands. Thanks for showing me something I didn't even know I wanted, Chris :)

I'll confess though I'm not a fan of any of the alternate syntaxes. I think => works really well, particularly if lambdas in the form `(*args) => expr` are added at some point in the future, because it establishes some common semantics between both uses of the => operator (namely, deferred code execution: in both cases the expression on the right of the => isn't expected to execute when the block itself is executed, but rather at some future point when something is called).

I'm not a big fan of =: because we already have := and I can imagine it being quite easy for people to not remember which way around each one is (and it's just generally more effortful to visually parse), since neither of them has a particularly strong intuitive meaning. By contrast, => looks visually like an arrow, while <= and >= have the advantage that their ordering corresponds to how you would say it out loud or in your head (less than or equal to/greater than or equal to), so I don't think those symbols have the same issue.

I also don't love ?= because ? is one of the last few reasonable symbols we have available for new syntax (*cough* backtick *cough*, not even once). Whether it ends up being used for something like PEP 505: None-aware operators (if that is ever resurrected, I can only dream) or some other future feature we have yet to imagine, I'd prefer the ? symbol remain available without any baggage. I don't think the rationale for this PEP (as much as I agree with it) is quite strong enough to use it up.

Anyways, I'm really hoping this gets accepted. Awesome proposal!

On Sun, Oct 24, 2021 at 11:36 AM Paul Moore <p.f.moore@gmail.com> wrote:
[quoted text snipped]
On Sun, Oct 24, 2021 at 12:39:38PM +0100, Matt del Valle wrote:
I'll confess though I'm not a fan of any of the alternate syntaxes. I think => works really well, particularly if lambdas in the form: `(*args) => expr` are added at some point in the future
So if we have the arrow shortcut for type hints, and an arrow shortcut for lambda, then we can write code like this:

def func(arg:int->int=>x=>x+1)->int:

"I felt a great disturbance in the Force. As if millions of voices cried out in terror, and were suddenly silenced."

I think that the likelihood of using an arrow for lambdas and type hinting is a major point against this proposed arrow syntax. And the arrow points the wrong way for an assignment! When we want to show a name binding, we write:

name <- value
value -> name

we don't have name->value. Example:

https://www.r-bloggers.com/2018/09/why-do-we-use-arrow-as-an-assignment-oper...

I don't see any language that uses an arrow pointing from the name to value for assignment:

https://en.wikipedia.org/wiki/Assignment_%28computer_science%29#Notation
https://i.redd.it/e2kmjoxmy7k61.jpg

-- Steve
On Sun, 24 Oct 2021, Chris Angelico wrote:
Is anyone interested in coauthoring this with me? Anyone who has strong interest in seeing this happen - whether you've been around the Python lists for years, or you're new and interested in getting involved for the first time, or anywhere in between!
I have a strong interest in seeing this happen, and would be happy to help how I can. Teaching (and using) the behavior of Python argument initializers is definitely a thorn in my side. :-) I'd love to be able to easily initialize an empty list/set/dict.

For what it's worth, here are my thoughts on some of the syntaxes proposed so far:

* I don't like `def f(arg => default)` exactly because it looks like a lambda, and so I imagine arg is an argument to that lambda, but the intended meaning has nothing to do with that. I understand lambdas give delegation, but in my mind that should look more like `def f(arg = => default)` or `def f(arg = () => default)` -- except these will have a different meaning (arg's default is a function, and they would be evaluated in parent scope, not the function's scope) once `=>` is short-hand for lambda.

* I find `def f(arg := default)` reasonable. I was actually thinking about this very issue before the thread started, and this was the syntax that came to mind. The main plus for this is that it uses an existing operator (so fewer to learn) and it is "another kind of assignment". The main minus is that it doesn't really have much to do with the walrus operator; we're not using the assigned value inline like `arg := default` would mean outside `def`. Then again, `def f(arg = default)` is quite different from `arg = default` outside `def`.

* I find `def f(arg ?= default)` (or `def f(arg ??= default)`) reasonable, exactly because it is similar to None-aware operators (PEP 0505, which is currently/recently under discussion in python-dev). The main complaint about PEP 0505 in those discussions is that it's very None-specific, which feels biased. But the meaning of "omitted value" is extremely clear in a def. If both this were added and PEP 0505 were accepted, `def f(arg ?= default)` would be roughly equivalent to:

```
def f(arg = None):
    arg ??= default
```

except `def f(arg ?= default)` wouldn't trigger the default in the case of `f(None)`, whereas the above code would. I find this an acceptable difference. (FWIW, I'm also in favor of 0505.)

* I also find `def f(@arg = default)` reasonable, though it feels a little inconsistent with decorators. I expect a decorator expression after @, not an argument, more like `def f(@later arg = default)`.

* I'm not very familiar with thunks, but they seem a bit too magical for my liking. Evaluating argument defaults only sometimes (when they get read in the body) feels a bit unpredictable.

Erik

-- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
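The f(None) difference in the third bullet is easy to check against the only version expressible today (a sketch; `default` is an arbitrary placeholder value, and `??=` does not exist yet, so its effect is spelled out):

default = "fallback"  # arbitrary placeholder

def f(arg=None):
    if arg is None:   # what `arg ??= default` would boil down to
        arg = default
    return arg

print(f())      # 'fallback' -- argument omitted
print(f(None))  # 'fallback' too: None and "omitted" are conflated here,
                # which a true `arg ?= default` parameter would avoid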
On Sat, Oct 23, 2021 at 5:16 PM Chris Angelico <rosuav@gmail.com> wrote:
# Very common: Use None and replace it in the function
def bisect_right(a, x, lo=0, hi=None, *, key=None):
    if hi is None:
        hi = len(a)
Note that if this is changed to

def bisect_right(a, x, lo=0, hi[whatever]len(a), *, key=None):

then you'll lose the ability to pass hi=None to mean the end of the array. For this argument of bisect_right, None actually makes sense since it's used for that purpose in slices, and it would probably be a backward-compatibility problem to drop support for it also. So you will still have to write

def bisect_right(a, x, lo=0, hi[whatever]len(a), *, key=None):
    if hi is None:
        hi = len(a)

Self-referential expressions will result in UnboundLocalError::
def spam(eggs=>eggs): # Nope
That seems like not a rule of its own, but a special case of this rule: deferred arguments that haven't been assigned to yet are unbound (and not, say, set to some implementation-specific value analogous to _USE_GLOBAL_DEFAULT).
def bisect(a, hi=>len(a)):
That looks like an anonymous function that takes a single parameter named hi, ignores it, and returns the length of a free variable named a. It may even mean that in Python someday in expression contexts. But it's definitely not what it means here. hi<=len(a) would make a bit more sense.
def bisect(a, hi=:len(a)):
This one annoys me the least, though I can't say why.

def bisect(a, hi?=len(a)):
? seems reasonable for omitted/defaulted arguments, but that's not what this PEP is about. I can't see why call-time evaluation would have a ? and def-time evaluation wouldn't.

def bisect(a, hi!=len(a)):
Same basic problem as => (and <=). Is hi equal to len(a), or not?

def bisect(a, hi=`len(a)`):
It looks like the backticks are part of the expression rather than the function-parameter syntax, but they aren't, and never could be - even if there was a deferred-evaluation syntax at the expression level, it couldn't be used here, because the lexical environment would be wrong.
Since default arguments behave largely the same whether they're early or late bound,
They seem very different to me. They're evaluated in a different lexical scope, and many use cases for this extension depend on the changed scoping.
On 24 Oct 2021, at 01:13, Chris Angelico <rosuav@gmail.com> wrote:
Specification =============
Function default arguments can be defined using the new ``=>`` notation::
def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
def connect(timeout=>default_timeout):
def add_item(item, target=>[]):
The expression is saved in its source code form for the purpose of inspection, and bytecode to evaluate it is prepended to the function's body.
Notably, the expression is evaluated in the function's run-time scope, NOT the scope in which the function was defined (as are early-bound defaults). This allows the expression to refer to other arguments.
Self-referential expressions will result in UnboundLocalError::
def spam(eggs=>eggs): # Nope
Multiple late-bound arguments are evaluated from left to right, and can refer to previously-calculated values. Order is defined by the function, regardless of the order in which keyword arguments may be passed.
Clarification please: What is the bytecode that will be generated? Does the bytecode only run the default code if the argument is missing? And missing is not the same as is None?

Also, have you added the @var=default suggestion from Steven to the syntax options? I'm +1 on the @ syntax as it is easier to pick up on, and for the other reasons that Steven provided.

Barry
On Mon, Oct 25, 2021 at 7:51 PM Barry Scott <barry@barrys-emacs.org> wrote:
Clarification please:
What is the bytecode that will be generated?
Equivalent to:

if argument not provided:
    argument = <expr>

except that we don't have a way of saying "not provided".
Does the bytecode only run the default code if the argument is missing?
Yes. It is for default values, not for transforming.
And missing is not the same as is None?
Most assuredly not - that's part of the point. The semantics are closer to the "dedicated sentinel" idiom, but there is no value which can be passed which triggers this.
Also, have you added the @var=default suggestion from Steven to the syntax options? I'm +1 on the @ syntax as it is easier to pick up on, and for the other reasons that Steven provided.
Not really a fan, but I guess I can add it as an alternative. ChrisA
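The omitted-versus-None distinction drawn above is the one the sentinel idiom approximates today, except that even a sentinel is technically a passable value, whereas under the PEP nothing the caller passes can trigger the default. A sketch, with illustrative names:

_MISSING = object()  # private sentinel: nothing a caller would pass by accident

def f(x=_MISSING):
    if x is _MISSING:  # true only when x was omitted
        x = []         # the "late-bound default"
    return x

print(f())      # [] -- a fresh list, argument omitted
print(f(None))  # None -- a real value, passed straight through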
I don’t like the => syntax for delayed default arguments. It looks like a lambda and it’s confusing. The @ symbol is more readable, like @var=len(a) or even var@=len(a). The function decorator changes the behavior of the function; similarly, this @ default argument will change the argument value to this assignment if a value is not supplied.

Abdulla
On 24.10.2021 02:13, Chris Angelico wrote:
How to Teach This =================
Early-bound default arguments should always be taught first, as they are the simpler and more efficient way to evaluate arguments. Building on them, late bound arguments are broadly equivalent to code at the top of the function::
def add_item(item, target=>[]):
# Equivalent pseudocode:
def add_item(item, target=<OPTIONAL>):
    if target was omitted:
        target = []
I would prefer to not go down this path.

"Explicit is better than implicit" and this is too much "implicit" for my taste :-)

For simple use cases, this may save a few lines of code, but as soon as you end up having to think about whether the expression will evaluate to the right value at function call time, the scope it gets executed in, what to do with exceptions, etc., you're introducing too much confusion with this syntax.

Example:

def process_files(processor, files=>os.listdir(DEFAULT_DIR)):

Some questions:

- What happens if the dir does not exist? How would I be able to process the exception in the context of process_files()?
- Since the same code is valid without the ">", would a user notice that os.listdir() is called in the scope of the function call?
- What if DEFAULT_DIR == '.'? Would the user notice that the current work dir may have changed compared to when the module with the function was loaded?

Having the explicit code at the start of the function is more flexible and does not introduce such questions.

-- Marc-Andre Lemburg
On Mon, Oct 25, 2021 at 10:39 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
I would prefer to not go down this path.
"Explicit is better than implicit" and this is too much "implicit" for my taste :-)
For simple use cases, this may save a few lines of code, but as soon as you end up having to think whether the expression will evaluate to the right value at function call time, the scope it gets executed in, what to do with exceptions, etc., you're introducing too much confusion with this syntax.
It's always possible to be more "explicit", as long as explicit means "telling the computer precisely what to do". But Python has default arguments for a reason. Instead of simply allowing arguments to be optional, and then ALWAYS having code inside the function to provide values when they are omitted, Python allows us to provide actual default values that are visible to the caller (eg in help()). This is a good thing. Is it "implicit"? Yes, in a sense. But it's very clear what happens if the argument is omitted. The exact same thing is true with these defaults; you can see what happens. The only difference is whether it is a *value* or an *expression* that defines the default. Either way, if the argument is omitted, the given default is used instead.
Example:
def process_files(processor, files=>os.listdir(DEFAULT_DIR)):
Some questions:

- What happens if the dir does not exist? How would I be able to process the exception in the context of process_files()?
Well, it wouldn't have a try/except around it, so the exception would simply bubble. Presumably someone creating an API like this would not want to process the exception. And for most functions of this nature, I would not expect to see a try/except. How could it handle a failure of that nature? Letting the exception bubble seems to be the perfect non-action to take.
- Since the same code is valid without the ">", would a user notice that os.listdir() is called in the scope of the function call?
I don't know. Would they? Would it even matter? The timing of it would make a difference, but the scope is unlikely to. (Unless 'os' is a local name within the function, which seems unlikely. Even then, you'd simply get UnboundLocalError.)
- What if DEFAULT_DIR == '.'? Would the user notice that the current work dir may have changed compared to when the module with the function was loaded?
Again, I would assume that that is intentional. If you're specifying that the default is late-evaluated, then you're accepting - possibly intending - that DEFAULT_DIR could be changed, the current directory could change, and the directory context could change. That all seems like good API choice.
Having the explicit code at the start of the function is more flexible and does not introduce such questions.
Then use the explicit code! For this situation, it seems perfectly reasonable to write it either way. But for plenty of other examples, it makes a lot of sense to late-bind in a more visible way. It's for those situations that the feature would exist. ChrisA
On 25.10.2021 13:53, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 10:39 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
I would prefer to not go down this path.
"Explicit is better than implicit" and this is too much "implicit" for my taste :-)
For simple use cases, this may save a few lines of code, but as soon as you end up having to think whether the expression will evaluate to the right value at function call time, the scope it gets executed in, what to do with exceptions, etc., you're introducing too much confusion with this syntax.
It's always possible to be more "explicit", as long as explicit means "telling the computer precisely what to do". But Python has default arguments for a reason. Instead of simply allowing arguments to be optional, and then ALWAYS having code inside the function to provide values when they are omitted, Python allows us to provide actual default values that are visible to the caller (eg in help()). This is a good thing. Is it "implicit"? Yes, in a sense. But it's very clear what happens if the argument is omitted. The exact same thing is true with these defaults; you can see what happens.
The only difference is whether it is a *value* or an *expression* that defines the default. Either way, if the argument is omitted, the given default is used instead.
I guess I wasn't clear enough. What I mean by "implicit" is that execution of the expression is delayed by simply adding a ">" to the keyword default parameter definition.

Given that this alters the timing of evaluation, a single character does not create enough attention to make this choice explicit.

If I instead write:

def process_files(processor, files=deferred(os.listdir(DEFAULT_DIR))):

it is pretty clear that something is happening at a different time than function definition time :-)

Even better: the deferred() object can be passed in as a value and does not have to be defined when defining the function, since the function will obviously know what to do with such deferred() objects.
Having the explicit code at the start of the function is more flexible and does not introduce such questions.
Then use the explicit code! For this situation, it seems perfectly reasonable to write it either way.
But for plenty of other examples, it makes a lot of sense to late-bind in a more visible way. It's for those situations that the feature would exist.
Sure, you can always find examples where late binding may make sense, and it's still possible to write explicit code for this as well, but that's not the point.

By introducing new syntax, you always increase the potential for readers not knowing about the new syntax, misunderstanding what the syntax means, or even not paying attention to the subtleties it introduces.

So whenever new syntax is discussed, I think it's important to look at it from the perspective of a user who hasn't seen it before (could be a programmer new to Python or one who has not worked with the new feature before).

In this particular case, I find the syntax not ideal in making it clear that evaluation is deferred. It's also not intuitive where exactly execution will happen (before entering the function, in which order, in a separate scope, etc).

Why not turn this into a decorator instead?

@deferred(files=os.listdir(DEFAULT_DIR))
def process_files(processor, files=None):

-- Marc-Andre Lemburg
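Such a decorator can be sketched in today's Python, though the spelling above would evaluate os.listdir() at decoration time rather than call time. The sketch below therefore takes zero-argument callables as factories; all names here (deferred, DEFAULT_DIR) are illustrative, not an existing API:

import functools
import inspect
import os

DEFAULT_DIR = "."  # assumed for the sketch

def deferred(**factories):
    """Fill omitted (or None) arguments from call-time factories."""
    def wrap(func):
        sig = inspect.signature(func)
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            for name, factory in factories.items():
                # NB: None still doubles as "omitted" here -- a decorator
                # cannot see whether the caller actually passed None.
                if bound.arguments[name] is None:
                    bound.arguments[name] = factory()
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return wrap

@deferred(files=lambda: os.listdir(DEFAULT_DIR))
def process_files(processor, files=None):
    return files

print(process_files(print))  # fresh directory listing, taken at call time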
On Mon, Oct 25, 2021 at 11:20 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
On 25.10.2021 13:53, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 10:39 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
I would prefer to not go down this path.
"Explicit is better than implicit" and this is too much "implicit" for my taste :-)
For simple use cases, this may save a few lines of code, but as soon as you end up having to think whether the expression will evaluate to the right value at function call time, the scope it gets executed in, what to do with exceptions, etc., you're introducing too much confusion with this syntax.
It's always possible to be more "explicit", as long as explicit means "telling the computer precisely what to do". But Python has default arguments for a reason. Instead of simply allowing arguments to be optional, and then ALWAYS having code inside the function to provide values when they are omitted, Python allows us to provide actual default values that are visible to the caller (eg in help()). This is a good thing. Is it "implicit"? Yes, in a sense. But it's very clear what happens if the argument is omitted. The exact same thing is true with these defaults; you can see what happens.
The only difference is whether it is a *value* or an *expression* that defines the default. Either way, if the argument is omitted, the given default is used instead.
I guess I wasn't clear enough. What I mean with "implicit" is that execution of the expression is delayed by simply adding a ">" to the keyword default parameter definition.
Given that this alters the timing of evaluation, a single character does not create enough attention to make this choice explicit.
If I instead write:
def process_files(processor, files=deferred(os.listdir(DEFAULT_DIR))):
it is pretty clear that something is happening at a different time than function definition time :-)
Even better: the deferred() object can be passed in as a value and does not have to be defined when defining the function, since the function will obviously know what to do with such deferred() objects.
Actually, I consider that to be far far worse, since it looks like deferred() is a callable that takes the *result* of calling os.listdir. Maybe it would be different if it were deferred("os.listdir(DEFAULT_DIR)"), but now we're losing a lot of clarity.

If it's done with syntax, it can have special behaviour. If it looks like a function call (or class constructor), it doesn't look like it has special behaviour.
Having the explicit code at the start of the function is more flexible and does not introduce such questions.
Then use the explicit code! For this situation, it seems perfectly reasonable to write it either way.
But for plenty of other examples, it makes a lot of sense to late-bind in a more visible way. It's for those situations that the feature would exist.
Sure, you can always find examples where late binding may make sense and it's still possible to write explicit code for this as well, but that's not the point.
By introducing new syntax, you always increase the potential for readers not knowing about the new syntax, misunderstanding what the syntax means, or even not paying attention to the subtleties it introduces.
So whenever new syntax is discussed, I think it's important to look at it from the perspective of a user who hasn't seen it before (could be a programmer new to Python or one who has not worked with the new feature before).
I actually have a plan for that exact perspective. Was going to arrange things tonight, but it may have to wait for later in the week.
In this particular case, I find the syntax not ideal in making it clear that evaluation is deferred. It's also not intuitive where exactly execution will happen (before entering the function, in which order, in a separate scope, etc).
Why not turn this into a decorator instead ?
@deferred(files=os.listdir(DEFAULT_DIR)) def process_files(processor, files=None):
Same reason. This most definitely looks like it has to calculate the directory listing in advance. It also is extremely difficult to explain how this is able to refer to other parameters, such as in the bisect example. It's also extremely verbose, given that it's making a very small difference to the behaviour - all it changes is when something is calculated (and, for technical reasons, where; but I expect that intuition will cover that).

That's why I want a syntax that keeps things close to the function header, and keeps it looking like an argument default. When someone looks at the syntax for a 'def' statement, the two ways of doing argument defaults should look more similar than, say, early-bound-default and parameter-annotation. Currently, those two are very similar (just a change of punctuation), but late-bound defaults have to be done in a very different way, either in a decorator or in the function body. That's what I want to improve.

ChrisA
On 25.10.2021 14:26, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 11:20 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
[...]
I guess I wasn't clear enough. What I mean with "implicit" is that execution of the expression is delayed by simply adding a ">" to the keyword default parameter definition.
Given that this alters the timing of evaluation, a single character does not create enough attention to make this choice explicit.
If I instead write:
def process_files(processor, files=deferred(os.listdir(DEFAULT_DIR))):
def process_files(processor, files=deferred("os.listdir(DEFAULT_DIR)")):
it is pretty clear that something is happening at a different time than function definition time :-)
Even better: the deferred() object can be passed in as a value and does not have to be defined when defining the function, since the function will obviously know what to do with such deferred() objects.
Actually, I consider that to be far far worse, since it looks like deferred() is a callable that takes the *result* of calling os.listdir. Maybe it would be different if it were deferred("os.listdir(DEFAULT_DIR)"), but now we're losing a lot of clarity.
Yes, sorry. I forgot to add the quotes. The idea is to take the argument and essentially prepend the parameter processing to the function call logic, or even build a new function with the code added at the top.
If it's done with syntax, it can have special behaviour. If it looks like a function call (or class constructor), it doesn't look like it has special behaviour.
Having the explicit code at the start of the function is more flexible and does not introduce such questions.
Then use the explicit code! For this situation, it seems perfectly reasonable to write it either way.
But for plenty of other examples, it makes a lot of sense to late-bind in a more visible way. It's for those situations that the feature would exist.
Sure, you can always find examples where late binding may make sense and it's still possible to write explicit code for this as well, but that's not the point.
By introducing new syntax, you always increase the potential for readers not knowing about the new syntax, misunderstanding what the syntax means, or even not paying attention to the subtleties it introduces.
So whenever new syntax is discussed, I think it's important to look at it from the perspective of a user who hasn't seen it before (could be a programmer new to Python or one who has not worked with the new feature before).
I actually have a plan for that exact perspective. Was going to arrange things tonight, but it may have to wait for later in the week.
Ok :-)
In this particular case, I find the syntax not ideal in making it clear that evaluation is deferred. It's also not intuitive where exactly execution will happen (before entering the function, in which order, in a separate scope, etc).
Why not turn this into a decorator instead ?
@deferred(files=os.listdir(DEFAULT_DIR))
@deferred(files="os.listdir(DEFAULT_DIR)")
def process_files(processor, files=None):
Same reason. This most definitely looks like it has to calculate the directory listing in advance. It also is extremely difficult to explain how this is able to refer to other parameters, such as in the bisect example.
Not really, since the code you provide will simply get inlined and then does have full access to the parameter names and their values.
It's also extremely verbose, given that it's making a very small difference to the behaviour - all it changes is when something is calculated (and, for technical reasons, where; but I expect that intuition will cover that).
It is verbose indeed, which is why I still think that putting such code directly at the top of the function is the better way to go :-)
That's why I want a syntax that keeps things close to the function header, and keeps it looking like an argument default. When someone looks at the syntax for a 'def' statement, the two ways of doing argument defaults should look more similar than, say, early-bound-default and parameter-annotation. Currently, those two are very similar (just a change of punctuation), but late-bound defaults have to be done in a very different way, either in a decorator or in the function body.
That's what I want to improve.
That's fair, but since the late binding code will have to sit at the top of the function definition anyway, you're not really saving much.

def add_item(item, target=>[]):

vs.

def add_item(item, target=None):
    if target is None:
        target = []

-- Marc-Andre Lemburg
On Mon, Oct 25, 2021 at 11:53 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
On 25.10.2021 14:26, Chris Angelico wrote:
[...]
If I instead write:
def process_files(processor, files=deferred(os.listdir(DEFAULT_DIR))):
def process_files(processor, files=deferred("os.listdir(DEFAULT_DIR)")):
@deferred(files="os.listdir(DEFAULT_DIR)")
Ahhh, okay. Now your explanation makes sense :)

This does deal with the problem of function calls looking like function calls. It comes at the price of using a string to represent code, so unless it has compiler support, it's going to involve eval(), which is quite inefficient. (And if it has compiler support, it should have syntactic support too, otherwise you end up with weird magical functions that don't do normal things.)
It's also extremely verbose, given that it's making a very small difference to the behaviour - all it changes is when something is calculated (and, for technical reasons, where; but I expect that intuition will cover that).
It is verbose indeed, which is why I still think that putting such code directly at the top of the function is the better way to go :-)
That's what I want to avoid though. Why go with the incredibly verbose version that basically screams "don't use this"? Use something much more akin to other argument defaults, and then it looks much more useful.
That's fair, but since the late binding code will have to sit at the top of the function definition anyway, you're not really saving much.
def add_item(item, target=>[]):
vs.
def add_item(item, target=None):
    if target is None:
        target = []
It doesn't always have to sit at the top of the function; it can be anywhere in the function, including at the use site. More importantly, this is completely opaque to introspection. Tools like help() can't see that the default is a new empty list - they just see that the default is None. That's not meaningful, that's not helpful.

It also pollutes the API with a fake argument value, such that one might think that passing None is meaningful. If, in the future, you change your API to have a unique sentinel object as the default, people's code might break. Did you document that None was an intentional parameter option, or did you intend for the default to be "new empty list"? For technical reasons, the default currently has to be a single value, but that value isn't really meaningful.

A function's header is its primary documentation and definition, and things should only have perceived meaning when they also have real meaning. (Case in point: positional-only args, where there is no keyword argument that can masquerade as that positional arg. Being forced to name every argument is a limitation.)

The purpose of late-evaluated argument defaults (I'm wondering if I should call them LEADs, or if that's too cute) is to make the function's signature truly meaningful. It shouldn't be necessary to warp your code around a technical limitation.

ChrisA
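To make the opacity Chris describes concrete, here is a minimal runnable sketch in today's Python (add_item is the running example from this thread):

import inspect

def add_item(item, target=None):
    # The *intended* default is a new empty list, but the signature
    # can only record the stand-in None.
    if target is None:
        target = []
    target.append(item)
    return target

print(inspect.signature(add_item))  # -> (item, target=None)

Both inspect and help() report target=None; the "new empty list" intent lives only in the body.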
On 25.10.2021 15:44, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 11:53 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
[...]
If I instead write:
def process_files(processor, files=deferred(os.listdir(DEFAULT_DIR))):
def process_files(processor, files=deferred("os.listdir(DEFAULT_DIR)")):
@deferred(files="os.listdir(DEFAULT_DIR)")
Ahhh, okay. Now your explanation makes sense :)
This does deal with the problem of function calls looking like function calls. It comes at the price of using a string to represent code, so unless it has compiler support, it's going to involve eval(), which is quite inefficient. (And if it has compiler support, it should have syntactic support too, otherwise you end up with weird magical functions that don't do normal things.)
The decorator version would not need eval(), since the decorator would actually rewrite the function to include the parameter defaulting logic right at the top of the function and recompile it. For the object version, the string would have to be compiled as well and then executed at the top of the function somehow :-)

I think for the latter, we'd need a more generic concept of deferred execution in Python, but even then, you'd not really save typing:

def process_files(processor, files=defer os.listdir(DEFAULT_DIR)):
    if deferred(files):
        files = eval(files)
    ...

The details are more complex than the above, but it demonstrates the idea. Note that eval() would evaluate an already compiled expression encapsulated in a deferred object, so it's not slow or dangerous to use.

Now, it may not be obvious, but the key advantage of such deferred objects is that you can pass them around, i.e. the "defer os.listdir(DEFAULT_DIR)" could also be passed in via another function.
It's also extremely verbose, given that it's making a very small difference to the behaviour - all it changes is when something is calculated (and, for technical reasons, where; but I expect that intuition will cover that).
It is verbose indeed, which is why I still think that putting such code directly at the top of the function is the better way to go :-)
That's what I want to avoid though. Why go with the incredibly verbose version that basically screams "don't use this"? Use something much more akin to other argument defaults, and then it looks much more useful.
That's fair, but since the late binding code will have to sit at the top of the function definition anyway, you're not really saving much.
def add_item(item, target=>[]):
vs.
def add_item(item, target=None): if target is None: target = []
It doesn't always have to sit at the top of the function; it can be anywhere in the function, including at the use site. More importantly, this is completely opaque to introspection. Tools like help() can't see that the default is a new empty list - they just see that the default is None. That's not meaningful, that's not helpful.
You'd typically write about those defaults in the doc string, though. At least that's how I document such more involved defaults.
It also pollutes the API with a fake argument value, such that one might think that passing None is meaningful. If, in the future, you change your API to have a unique sentinel object as the default, people's code might break. Did you document that None was an intentional parameter option, or did you intend for the default to be "new empty list"? For technical reasons, the default currently has to be a single value, but that value isn't really meaningful. A function's header is its primary documentation and definition, and things should only have perceived meaning when they also have real meaning. (Case in point: positional-only args, where there is no keyword argument that can masquerade as that positional arg. Being forced to name every argument is a limitation.)
None in this case is used as sentinel, because you are expecting a sequence, so it's clear that None doesn't work as a valid argument. The discussion around a separate standard sentinel, which can never be used as valid argument, is what brought us to this thread, AFAIR :-)
The purpose of late-evaluated argument defaults (I'm wondering if I should call them LEADs, or if that's too cute) is to make the function's signature truly meaningful. It shouldn't be necessary to warp your code around a technical limitation.
That's a good argument, but with e.g. the deferred objects, you could have the same: simply make the repr(deferred object) include the expression, e.g. "deferred(os.listdir(DEFAULT_DIR))". help() would then output this when printing out the function signature.

Ditto for the decorator, since this could add generated text to the doc-string of the function.

-- Marc-Andre Lemburg
On 26.10.2021 10:54, Marc-Andre Lemburg wrote:
[...] For the object version, the string would have to be compiled as well and then executed at the top of the function somehow :-)
I think for the latter, we'd need a more generic concept of deferred execution in Python, but even then, you'd not really save typing:
def process_files(processor, files=defer os.listdir(DEFAULT_DIR)):
    if deferred(files):
        files = eval(files)
    ...
The details are more complex than the above, but it demonstrates the idea.
Note that eval() would evaluate an already compiled expression encapsulated in a deferred object, so it's not slow or dangerous to use.
Now, it may not be obvious, but the key advantage of such deferred objects is that you can pass them around, i.e. the "defer os.listdir(DEFAULT_DIR)" could also be passed in via another function.
Here's a better way to write the above pseudo-code, which makes the intent clearer:

def process_files(processor, files=defer os.listdir(DEFAULT_DIR)):
    if isdeferred(files):
        files = files.eval()
    ...

isdeferred() would simply check the object for being a deferred object. I'm using "eval" for lack of a better word to say "please run the deferred code now and in this context". Perhaps a second keyword could be used to wrap the whole "if isdeferred()..." dance into something more intuitive.

Here's an old recipe which uses this concept: https://code.activestate.com/recipes/502206/

BTW: While thinking about defer some more, I came up with this alternative syntax for your proposal:

def process_files(processor, defer files=os.listdir(DEFAULT_DIR)):
    # results in adding the deferred statement at the top of the
    # function, if the parameter is not given, i.e.
    if files is NotGiven:
        files = os.listdir(DEFAULT_DIR)
    ...

This has the advantage of making things a lot more obvious than the small added ">", which is easy to miss and is the main obstacle I see with your PEP.

That said, I still like the idea of being able to "inject" expressions into functions. This opens up lots of doors to make dynamic programming more intuitive in Python.

-- Marc-Andre Lemburg
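For concreteness, the deferred-object idea sketched above can be emulated in today's Python. This is a minimal sketch under stated assumptions: Deferred and isdeferred mirror the pseudo-code names above, DEFAULT_DIR is a stand-in value, and a real design would want compiler support rather than compiling source strings:

import os

DEFAULT_DIR = "."  # stand-in for the thread's DEFAULT_DIR

class Deferred:
    # Holds an expression's source, compiled once, evaluated on demand.
    def __init__(self, source):
        self.source = source
        self._code = compile(source, "<deferred>", "eval")
    def eval(self):
        # Run the stored expression now, in this module's globals.
        return eval(self._code, globals())
    def __repr__(self):
        return f"deferred({self.source})"

def isdeferred(obj):
    return isinstance(obj, Deferred)

def process_files(processor, files=Deferred("os.listdir(DEFAULT_DIR)")):
    if isdeferred(files):
        files = files.eval()  # evaluated per call, not at def time
    return processor(files)

print(process_files(len))  # directory size measured at call time

Note that repr(Deferred(...)) already gives help() something readable, as suggested in the message above.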
On 10/25/2021 8:26 AM, Chris Angelico wrote:
If it's done with syntax, it can have special behaviour. If it looks like a function call (or class constructor), it doesn't look like it has special behaviour.
This has been mentioned before, but I'll say it again: It doesn't need to be syntax in the sense of non-ascii characters, it could be a (soft) keyword:

def process_files(processor, files=deferred os.listdir(DEFAULT_DIR)):

I'm -1 on the proposal, for reasons I'll eventually write up, but if it has to exist, I'd prefer a keyword of some sort.

Eric
On 10/25/21 6:54 AM, Eric V. Smith wrote:
On 10/25/2021 8:26 AM, Chris Angelico wrote:
If it's done with syntax, it can have special behaviour. If it looks like a function call (or class constructor), it doesn't look like it has special behaviour.
This has been mentioned before, but I'll say it again: It doesn't need to be syntax in the sense of non-ascii characters, it could be a (soft) keyword:
def process_files(processor, files=deferred os.listdir(DEFAULT_DIR)):
I agree. My two favorite bike-shed colors are:

- `deferred` soft keyword
- @ in the front

Both options make it much clearer that something special is happening, whilst all of the in-the-middle options can be easily missed.

-- ~Ethan~
On Mon, Oct 25, 2021 at 3:42 PM Mike Miller <python-ideas@mgmiller.net> wrote:
On 2021-10-25 11:27, Ethan Furman wrote:
- `deferred` soft keyword
"defer" please.
This construct did not happen in the past, and it's shorter of course.
-Mike
For soft keyword options, defer is better than deferred. But previously I suggested ellipses (because to me it looks kind of like "later..."). Chris A didn't care for it, but I still kind of like it despite his objections:

def f(a, b = ... []): ...

Chris A said it could be a problem because ellipses are legal in an expression, but I don't think that should be a problem for the PEG parser? This would be syntactically legal, though not very useful:

def f(a, b = ... ...): ...

...and this would be syntactically legal, but would result in an error:

def f(a, b = ... ...+1): ...

If everyone hates the ... bikeshed color, my other suggestion would be "late" (standing for "late binding"), which is 1 character shorter than "defer", and semantically meaningful:

def f(a, b = late []): ...

--- Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On 2021-10-25 3:31 p.m., Mike Miller wrote:
"defer" please.
This construct did not happen in the past, and it's shorter of course.
-Mike
I also like `defer`:
def range(a, min=0, max = defer len(a)):
    return a[min:max]
`default` is also nice:
def range(a, min=0, max default len(a)):
    return a[min:max]
I was concerned with this proposal at first because an inner function definition may be ambiguous:
def do_work():
    a = ['this', 'is', 'a', 'list']
    def range(a, min=0, max = defer len(a)):
        return a[min:max]
which `a` does `len(a)` refer to?

Looking through my code, it seems this is not a problem; outer method variables rarely conflict with inner method parameters in practice.

Can deferred defaults refer to variables in scope? Can I use this to evaluate arguments lazily?
def coalesce(a, b):
    def _coalesce(x = defer a(), y = defer b()):
        if x is not None:
            return x
        return y
    return _coalesce()
def expensive_method():
    return 84
print(coalesce(lambda: 42, expensive_method))
Thank you
On Tue, Oct 26, 2021 at 10:20 AM Kyle Lahnakoski <kyle@lahnakoski.com> wrote:
I was concerned with this proposal at first because an inner function definition may be ambiguous:
def do_work():
    a = ['this', 'is', 'a', 'list']
    def range(a, min=0, max = defer len(a)):
        return a[min:max]
which `a` does `len(a)` refer to?
In this case, there's absolutely no ambiguity; it will refer to the parameter a. There are other situations that are less clear, but for the most part, assume that a late-bound default is evaluated in the context of the function body.
Can deferred defaults refer to variables in scope? Can I use this to evaluate arguments lazily?
def coalesce(a, b):
    def _coalesce(x = defer a(), y = defer b()):
        if x is not None:
            return x
        return y
    return _coalesce()
def expensive_method():
    return 84
print(coalesce(lambda: 42, expensive_method))
They can refer to variables in scope, but all argument defaults are evaluated before the body of the function begins. This is one reason NOT to call this "defer", since this isn't a generic tool for late-evaluated thunks; it is broadly equivalent to "if y is not set: y = b()" at the start of the function.

ChrisA
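A runnable approximation of that expansion, using a sentinel in today's Python (_unset is a local convention for this sketch, not proposed syntax):

_unset = object()

def _coalesce(a, b, x=_unset, y=_unset):
    # Roughly what x=>a() and y=>b() would expand to, left to right.
    if x is _unset:
        x = a()
    if y is _unset:
        y = b()
    return x if x is not None else y

print(_coalesce(lambda: 42, lambda: 84))  # -> 42

Note that b() runs before the body even though its value is never used here - exactly Chris's point that this is not a generic lazy-evaluation tool.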
On Mon, Oct 25, 2021 at 11:26 PM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 25, 2021 at 11:20 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
So whenever new syntax is discussed, I think it's important to look at it from the perspective of a user who hasn't seen it before (could be a programmer new to Python or one who has not worked with the new feature before).
I actually have a plan for that exact perspective. Was going to arrange things tonight, but it may have to wait for later in the week.
This is only one data point, so don't take this TOO strongly, but I spoke with one of my brothers about his expectations. Walked through some things, led him up against the problem, and watched him try to make sense of things. Here's what I learned:

0) When led to the basic problem of "make the default be a new empty list", he didn't stumble on the standard "=None" idiom, so this proposal has definite value.

1) Pure left-to-right evaluation makes more sense than multi-stage evaluation.

2) He expected failed forward references to look to the enclosing scope, but on seeing that that isn't what Python does, expected UnboundLocalError.

3) To him, it's still basically an argument default value, so all syntax ideas that he came up with were small changes to the "=" sign, rather than being keywords or function-like constructs.

4) It's really REALLY hard to devise examples of APIs that logically would want out-of-order referencing, so it doesn't really even matter that much.

5) A full understanding of the exact semantics of Python argument passing is (a) not as common as you might think, and (b) not actually even necessary to most programmers.

One small additional contribution was this syntax:

def spam(foo=<expr>):

I'm not sure whether it's worth adding to the list, but it's another idea. If anyone else has friends or family who know a bit of Python, would love to hear other people's viewpoints.

ChrisA
On Mon, Oct 25, 2021, 8:20 AM Marc-Andre Lemburg wrote:
If I instead write:
def process_files(processor, files=deferred(os.listdir(DEFAULT_DIR))):
it is pretty clear that something is happening at a different time than function definition time :-)
Even better: the deferred() object can be passed in as a value and does not have to be defined when defining the function, since the function will obviously know what to do with such deferred() objects.
Gosh, that's EXACTLY what I've been suggesting :-). Except I don't think 'deferred()' can just be a function as in Marc-André's example. The arguments would still evaluate too eagerly. I think it needs syntax or a keyword.
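The eager-evaluation problem is easy to demonstrate. In this sketch, deferred() is a plain-function stand-in (not a real API), which is precisely why it cannot work:

def deferred(value):
    # Too late: the argument expression was already evaluated
    # before this function was called.
    return value

def listdir_loudly():
    print("evaluated at definition time!")
    return ["a.txt"]

# The print fires once, when the def statement runs -
# not each time process_files() is called with the default.
def process_files(processor, files=deferred(listdir_loudly())):
    return processor(files)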
On 2021-10-23 17:13, Chris Angelico wrote:
def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
Sounds like deferred execution could be useful, but I wanted to complain about the example above. I realize it is "just an example" but if I saw this in code, the first thing I'd do is ask for it to be changed.

Why? The same variable (or simple variant) shouldn't be passed twice in a signature, when it can be operated on inside the body of the function to get that variant. i.e.: DRY. Probably someone will think up an exception to that, but even so, such occurrences are rare and not enough to justify this new feature.

Believe I saw better examples in the discussion, so please go with one of those.

-Mike
On Tue, Oct 26, 2021 at 6:46 AM Mike Miller <python-ideas@mgmiller.net> wrote:
On 2021-10-23 17:13, Chris Angelico wrote:
def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
Sounds like deferred execution could be useful, but I wanted to complain about the example above. I realize it is "just an example" but if I saw this in code, the first thing I'd do is ask for it to be changed.
Why? The same variable (or simple variant) shouldn't be passed twice in a signature, when it can be operated on inside the body of the function to get that variant. i.e.: DRY.
Not sure I understand. Which variable is being passed twice? This example is straight from the standard library's bisect module. ChrisA
On Mon, Oct 25, 2021 at 12:45:03PM -0700, Mike Miller wrote:
On 2021-10-23 17:13, Chris Angelico wrote:
def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
Sounds like deferred execution could be useful, but I wanted to complain about the example above. I realize it is "just an example" but if I saw this in code, the first thing I'd do is ask for it to be changed.
Changed to what?
Why? The same variable (or simple variant) shouldn't be passed twice in a signature, when it can be operated on inside the body of the function to get that variant. i.e.: DRY.
I'm sorry, I don't understand your objection here. The function parameters are: a, x, lo, hi, key

None of them are passed twice. I'm not clear what you consider to be a DRY violation here. Is this also a DRY violation?

def bisect_right(a, x, lo=0, hi=None, *, key=None):
    if hi is None:
        hi = len(a)

If not, then how is it a DRY violation to lift the initialisation of the default `hi = len(a)` from a manual operation inside the body of the function to a late-bound default argument?

def bisect_right(a, x, lo=0, @hi=len(a), *, key=None):

*Moving* code from one place to another isn't *repeating* the code. And moving initialisation code for defaults from the body of the function to the signature is the point of the exercise. Sometimes that default initialisation code refers to other parameters.

-- Steve
On 2021-10-25 15:59, Steven D'Aprano wrote:
None of them are passed twice.
Yes, the word "passed" misses the mark.
*Moving* code from one place to another isn't *repeating* the code.
I think that captures the problem nicely. This is already possible in a much clearer way. I'd still ask for it to be removed. An example should show us something more compelling. -Mike
Among my objections to this proposal is introspection: how would that work? The PEP mentions that the text of the expression would be available for introspection, but that doesn't seem very useful.

At the very least, the PEP needs to talk about inspect.Signature objects, and how they would support these late-bound function arguments. And in particular, how would you create a Signature object that represents a function with such arguments? What would the default values look like? I think Dave Beazley has a talk somewhere where he dynamically creates objects that implement specific Signatures, I'll try to dig it up and produce an example. For me, it's a show-stopper if you can't support this with late-bound arguments.

And I suspect the answer to introspection is going to interact with the desire to create late-bound objects outside of the scope of function arguments and pass them around (as mentioned by MAL). They need to exist stand-alone, outside of function parameters.

Eric

On 10/23/2021 8:13 PM, Chris Angelico wrote:
[... full PEP draft quoted from the original post - snipped ...]
How to Teach This =================
Early-bound default arguments should always be taught first, as they are the simpler and more efficient way to evaluate arguments. Building on them, late bound arguments are broadly equivalent to code at the top of the function::
def add_item(item, target=>[]):
# Equivalent pseudocode:
def add_item(item, target=<OPTIONAL>):
    if target was omitted:
        target = []
Open Issues ===========
- yield/await? Will they cause problems? Might end up being a non-issue.
- annotations? They go before the default, so is there any way an anno could want to end with ``=>``?
References ==========
Copyright =========
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
On Wed, Oct 27, 2021 at 2:05 AM Eric V. Smith <eric@trueblade.com> wrote:
Among my objections to this proposal is introspection: how would that work? The PEP mentions that the text of the expression would be available for introspection, but that doesn't seem very useful.
Doesn't it? It would certainly be useful in help().
At the very least, the PEP needs to talk about inspect.Signature objects, and how they would support these late-bound function arguments. And in particular, how would you create a Signature object that represents a function with such arguments? What would the default values look like? I think Dave Beazley has a talk somewhere where he dynamically creates objects that implement specific Signatures, I'll try to dig it up and produce an example. For me, it's a show-stopper if you can't support this with late-bound arguments.
That's a very good point. I suspect that what will happen is that these args will simply be omitted. It's the only way to ensure that the values are calculated correctly. But I don't (yet) know enough of the details of inspect.Signature to say for sure. ChrisA
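For readers who haven't used the machinery being discussed: inspect.Signature objects can be built and applied programmatically today, and every default must be an ordinary value. A small sketch:

import inspect
from inspect import Parameter, Signature

# A hand-built signature equivalent to: def g(a, hi=None)
sig = Signature([
    Parameter("a", Parameter.POSITIONAL_OR_KEYWORD),
    Parameter("hi", Parameter.POSITIONAL_OR_KEYWORD, default=None),
])
print(sig)  # (a, hi=None)

# bind() maps call arguments onto the signature; apply_defaults()
# fills in the remaining parameters from their stored default values.
bound = sig.bind([1, 2, 3])
bound.apply_defaults()
print(bound.arguments)  # {'a': [1, 2, 3], 'hi': None}

The open question in this thread is what Parameter.default would hold for a late-bound parameter, since no plain value can stand in for "evaluate len(a) at call time".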
On 10/26/2021 11:19 AM, Chris Angelico wrote:
Among my objections to this proposal is introspection: how would that work? The PEP mentions that the text of the expression would be available for introspection, but that doesn't seem very useful.

Doesn't it? It would certainly be useful in help().

Yes, help() should be accurate. I was referring to programmatic introspection with the inspect module.

At the very least, the PEP needs to talk about inspect.Signature objects, and how they would support these late-bound function arguments. And in particular, how would you create a Signature object that represents a function with such arguments? What would the default values look like? I think Dave Beazley has a talk somewhere where he dynamically creates objects that implement specific Signatures, I'll try to dig it up and produce an example. For me, it's a show-stopper if you can't support this with late-bound arguments.

That's a very good point. I suspect that what will happen is that these args will simply be omitted. It's the only way to ensure that the values are calculated correctly. But I don't (yet) know enough of the details of inspect.Signature to say for sure.
Okay. I look forward to your thoughts. Omitting late-bound arguments or defaults would not be acceptable.

You may or may not recall that a big reason for the removal of "tuple parameter unpacking" in PEP 3113 was that they couldn't be supported by the inspect module. Quoting that PEP: "Python has very powerful introspection capabilities. These extend to function signatures. There are no hidden details as to what a function's call signature is." (Aside: I loved tuple parameter unpacking, and used it all the time! I was sad to see them go, but I agreed with PEP 3113.)

And also the "No Loss of Abilities If Removed" section sort of applies to late-bound function arguments: there's nothing proposed that can't currently be done in existing Python. I'll grant you that they might (might!) be more newbie-friendly, but I think the bar is high for proposals that make existing things doable in a different way, as opposed to proposals that add new expressiveness to the language.

Eric
On Wed, Oct 27, 2021 at 2:47 AM Eric V. Smith <eric@trueblade.com> wrote:
Okay. I look forward to your thoughts. Omitting late-bound arguments or defaults would not be acceptable.
No, I agree. We have to still be able to introspect them.

At the moment, when you look at a function's defaults, they are all values. With this change, some would be values and some would be markers saying that code would be executed. The markers would incorporate the source code for the expression in question (for human readability), but I don't think they can include anything else; it seems a bit costly to retain the AST, plus it's not going to be dependable across versions anyway (the AST can change at any time).

ChrisA
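A sketch of what such a marker might look like to introspection tools; the class name and repr format here are speculation, not anything specified by the PEP:

class LateBoundMarker:
    # Speculative stand-in: a default that is code, not a value.
    def __init__(self, source):
        self.source = source  # the expression's source text, for humans
    def __repr__(self):
        return f"<late-bound: {self.source}>"

# What a tool walking a signature's defaults might then display:
print(LateBoundMarker("len(a)"))  # <late-bound: len(a)>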
On Tue, Oct 26, 2021, 1:08 PM Chris Angelico <rosuav@gmail.com> wrote:
No, I agree. We have to still be able to introspect them.
At the moment, when you look at a function's defaults, they are all values. With this change, some would be values and some would be markers saying that code would be executed.
So why on earth NOT make these general "deferred" objects that can be used in other contexts?! Effectively that's what you need them to be. Going out of your way to make sure they aren't used outside function signatures seems backwards.

I used the construct `defer: some_code`. Marc-André spells it without the colon. I think now I like his better, but was initially emphasizing the similarities with lambda. Punctuation would be fine too. But the general idea is that a first class object is more useful, whether it's spelled:

do_later = defer: foo(bar(baz))
do_later = defer foo(bar(baz))
do_later = $( foo(bar(baz)) )

Or some other way... I actually don't hate the bash-inspired punctuation as much as I'm sure everyone else will... But I really just mean to suggest "some punctuation."
On Wed, Oct 27, 2021 at 4:51 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Tue, Oct 26, 2021, 1:08 PM Chris Angelico <rosuav@gmail.com> wrote:
No, I agree. We have to still be able to introspect them.
At the moment, when you look at a function's defaults, they are all values. With this change, some would be values and some would be markers saying that code would be executed.
So why on earth NOT make these general "deferred" objects that can be used in other contexts?!
Because they're NOT deferred objects. They're argument defaults. Consider:

def f(x=>print("Hello")): ...

When should that print happen? Argument defaults are processed before the function body begins, so you can be confident that it has happened before the first line of actual code. But a deferred object might not be evaluated until later, and in fact might not be evaluated at all.

Deferred evaluation is a useful feature, but that isn't what I'm proposing here. If you want to propose it as a completely separate feature, and then posit that it makes PEP 671 unnecessary, then go ahead; that's your proposal, not mine.

ChrisA
On 2021-10-26 10:55, Chris Angelico wrote:
So why on earth NOT make these general "deferred" objects that can be used in other contexts?!

Because they're NOT deferred objects. They're argument defaults. Consider:
def f(x=>print("Hello")): ...
When should that print happen? Argument defaults are processed before the function body begins, so you can be confident that it has happened before the first line of actual code. But a deferred object might not be evaluated until later, and in fact might not be evaluated at all.
I don't think that's necessarily true. If you have deferred objects, you could write some kind of decorator or class that evaluated deferred objects (or a particular kind of deferred object) when they occur as arguments.

You're speaking as if you think "deferred object" and "argument default" are mutually exclusive. But they're not. You could have an argument whose default value is a deferred object which is evaluated before the function body begins. Then it would be a default argument and a deferred object and would also be evaluated before the function body begins. The details would have to be worked out (just like they do for your PEP) but it's not automatically impossible.

-- Brendan Barnwell

"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Wed, Oct 27, 2021 at 5:21 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-26 10:55, Chris Angelico wrote:
So why on earth NOT make these general "deferred" objects that can be used in other contexts?!

Because they're NOT deferred objects. They're argument defaults. Consider:
def f(x=>print("Hello")): ...
When should that print happen? Argument defaults are processed before the function body begins, so you can be confident that it has happened before the first line of actual code. But a deferred object might not be evaluated until later, and in fact might not be evaluated at all.
I don't think that's necessarily true. If you have deferred objects, you could write some kind of decorator or class that evaluated deferred objects (or a particular kind of deferred object) when they occur as arguments.
You're speaking as if you think "deferred object" and "argument default" are mutually exclusive. But they're not. You could have an argument whose default value is a deferred object which is evaluated before the function body begins. Then it would be a default argument and a deferred object and would also be evaluated before the function body begins. The details would have to be worked out (just like they do for your PEP) but it's not automatically impossible.
Okay, sure, they're not mutually exclusive... but why have a deferred object that you evaluate in a decorator, when you could simply have an argument default? Using an overly-generic tool means your code ends up WAY more convoluted, and it doesn't help with introspection or anything anyway; you may as well just have a dedicated sentinel object with a suitable repr, and then use the standard "if x is sentinel" check with the code in the body.

ChrisA
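That dedicated-sentinel pattern is already spellable today; a minimal sketch (the _Sentinel class and its repr text are illustrative only):

class _Sentinel:
    # Unique "argument omitted" marker with a readable repr for help().
    def __repr__(self):
        return "<new empty list>"

_NEW_LIST = _Sentinel()

def add_item(item, target=_NEW_LIST):
    if target is _NEW_LIST:  # the caller omitted the argument
        target = []
    target.append(item)
    return target

print(add_item(1))  # [1]
# help(add_item) shows: add_item(item, target=<new empty list>)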
On Tue, 26 Oct 2021 at 16:48, Eric V. Smith <eric@trueblade.com> wrote:
And also the "No Loss of Abilities If Removed" section sort of applies to late-bound function arguments: there's nothing proposed that can't currently be done in existing Python. I'll grant you that they might (might!) be more newbie-friendly, but I think the bar is high for proposals that make existing things doable in a different way, as opposed to proposals that add new expressiveness to the language.
One issue with not having an introspection capability, which has been bothering me but I've not yet had the time to come up with a complete example, is the fact that with this new feature, you have functions where there's no way to express "just use the default" without knowing what the default actually *is*.

Take for example

def f(a, b=None):
    if b is None:
        b = len(a)
    ...

def g(a, b=>len(a)):
    ...

Suppose you want to call f as follows:

args = [
    ([1,2,3], 2),
    ([4,5,6], None),
    ([7,8,9], 4),
]

for a, b in args:
    f(a, b)

That works fine. But you cannot replace f by g, because None doesn't mean "use the default", and in fact by design there's *nothing* that means "use the default" other than "know what the default is and supply it explicitly". So if you want to do something similar with g (allowing the use of None in the list of tuples to mean "use the default"), you need to be able to introspect g to know what the default is. You may also need to manipulate first-class "deferred expression" objects as well, just to have something you can return as the default value (you could return a string and require the user to eval it, I guess, but that doesn't seem particularly user-friendly...)

I don't have a good solution for this, unfortunately. And maybe it's something where a "good enough" solution would be sufficient. But definitely, it should be discussed in the PEP so what's being proposed is clear.

Paul
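One workaround, shown here with the early-bound f so the sketch runs today (the same call pattern would be the only option with a late-bound g): omit the argument entirely whenever the data says "use the default":

def f(a, b=None):
    if b is None:
        b = len(a)
    return (a, b)

args = [
    ([1, 2, 3], 2),
    ([4, 5, 6], None),
    ([7, 8, 9], 4),
]

for a, b in args:
    # With a late-bound default there is no value meaning "use the
    # default", so the argument has to be left out instead:
    kwargs = {} if b is None else {"b": b}
    print(f(a, **kwargs))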
On Wed, Oct 27, 2021 at 5:05 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Tue, 26 Oct 2021 at 16:48, Eric V. Smith <eric@trueblade.com> wrote:
And also the "No Loss of Abilities If Removed" section sort of applies to late-bound function arguments: there's nothing proposed that can't currently be done in existing Python. I'll grant you that they might (might!) be more newbie-friendly, but I think the bar is high for proposals that make existing things doable in a different way, as opposed to proposals that add new expressiveness to the language.
One issue with not having an introspection capability, which has been bothering me but I've not yet had the time to come up with a complete example, is the fact that with this new feature, you have functions where there's no way to express "just use the default" without knowing what the default actually *is*.
Take for example
def f(a, b=None):
    if b is None:
        b = len(a)
    ...
def g(a, b=>len(a)): ...
Suppose you want to call f as follows:
args = [ ([1,2,3], 2), ([4,5,6], None), ([7,8,9], 4), ]
for a, b in args: f(a, b)
That works fine. But you cannot replace f by g, because None doesn't mean "use the default", and in fact by design there's *nothing* that means "use the default" other than "know what the default is and supply it explicitly". So if you want to do something similar with g (allowing the use of None in the list of tuples to mean "use the default"), you need to be able to introspect g to know what the default is. You may also need to manipulate first-class "deferred expression" objects as well, just to have something you can return as the default value (you could return a string and require the user to eval it, I guess, but that doesn't seem particularly user-friendly...)
I don't have a good solution for this, unfortunately. And maybe it's something where a "good enough" solution would be sufficient. But definitely, it should be discussed in the PEP so what's being proposed is clear.
Wouldn't cases like this be most likely to use *args and/or **kwargs? Simply omitting the argument from those would mean "use the default". Or am I misunderstanding your example here? ChrisA
On Tue, 26 Oct 2021 at 19:25, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 27, 2021 at 5:05 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Tue, 26 Oct 2021 at 16:48, Eric V. Smith <eric@trueblade.com> wrote:
And also the "No Loss of Abilities If Removed" section sort of applies to late-bound function arguments: there's nothing proposed that can't currently be done in existing Python. I'll grant you that they might (might!) be more newbie-friendly, but I think the bar is high for proposals that make existing things doable in a different way, as opposed to proposals that add new expressiveness to the language.
One issue with not having an introspection capability, which has been bothering me but I've not yet had the time to come up with a complete example, is the fact that with this new feature, you have functions where there's no way to express "just use the default" without knowing what the default actually *is*.
Take for example
def f(a, b=None):
    if b is None:
        b = len(a)
    ...
def g(a, b=>len(a)): ...
Suppose you want to call f as follows:
args = [ ([1,2,3], 2), ([4,5,6], None), ([7,8,9], 4), ]
for a, b in args: f(a, b)
That works fine. But you cannot replace f by g, because None doesn't mean "use the default", and in fact by design there's *nothing* that means "use the default" other than "know what the default is and supply it explicitly". So if you want to do something similar with g (allowing the use of None in the list of tuples to mean "use the default"), you need to be able to introspect g to know what the default is. You may also need to manipulate first-class "deferred expression" objects as well, just to have something you can return as the default value (you could return a string and require the user to eval it, I guess, but that doesn't seem particularly user-friendly...)
I don't have a good solution for this, unfortunately. And maybe it's something where a "good enough" solution would be sufficient. But definitely, it should be discussed in the PEP so what's being proposed is clear.
Wouldn't cases like this be most likely to use *args and/or **kwargs? Simply omitting the argument from those would mean "use the default". Or am I misunderstanding your example here?
Maybe. I don't want to make more out of this issue than it warrants. But I will say that I genuinely would write code like I included there. The reason comes from the fact that I do a lot of ad-hoc scripting, and "copy this text file of data, edit it to look like a Python list, paste it into my code" is a very common approach. I wouldn't bother writing anything generic like f(*args), because it's just a "quick hack". (I'd rewrite it to use f(*args) when I'd done the same copy/paste exercise 20 times, and I was fed up enough to insist that I was allowed the time to "write it properly this time" ;-))

Having the code stop working just because I changed the way the called function handles its defaults would be a nuisance (again, I'm not saying this would happen often, just that I could easily imagine it happening if this feature was available). The function f might well be from a library, or a script, that I wrote for something else, and it might be perfectly reasonable to use the new feature for that other use case.

There's no doubt that this is an artificial example - I'm not trying to pretend otherwise. But it's a *plausible* one, in the sort of coding I do regularly in my day job. And it's the sort of awkward edge case where you'd expect there to be support for it, by analogy with similar features elsewhere, so it would feel like a "wart" when you found out you couldn't do it. As I said, I'm not demanding a solution, but I would like to see it acknowledged and discussed in the PEP, just so the trade-offs are clear to people.

Paul
On Wed, Oct 27, 2021 at 6:15 AM Paul Moore <p.f.moore@gmail.com> wrote:
Having the code stop working just because I changed the way the called function handles its defaults would be a nuisance (again, I'm not saying this would happen often, just that I could easily imagine it happening if this feature was available). The function f might well be from a library, or a script, that I wrote for something else, and it might be perfectly reasonable to use the new feature for that other use case.
The same will be true if it changes from "x=None" to "x=_sentinel". Every change breaks someone's workflow. I think Python would benefit from a "map these arguments to that function signature" tool, but that's independent of this proposal. Don't want to have massive scope creep here. ChrisA
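A rough sketch of what such a mapping tool could look like today, using only the existing inspect module (the helper name and the treat-None-as-omitted convention are illustrative assumptions, not part of any proposal):

```
import inspect

def call_dropping_none(func, *args):
    # Map positional values onto func's parameters, dropping any that
    # are None so the function's own default (early- or late-bound)
    # takes effect. Assumes the parameters can be passed by keyword.
    names = list(inspect.signature(func).parameters)
    kwargs = {name: value for name, value in zip(names, args) if value is not None}
    return func(**kwargs)

# call_dropping_none(g, [4, 5, 6], None)  # b omitted -> b=>len(a) applies
```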
I had drafted an entire reply last night trying to explain this same concern; Paul Moore, you did a better job. Just want to say I agree: being able to say "I want the defaults" seems like a useful ability. But on the other hand, we also must recognize that right now there isn't really a great, UNIVERSAL way to say "I want the defaults". It varies from API to API. Many times getting the default means passing None, but many times it is True, or False, or a MISSING sentinel. You have to read the docs to find out.

Taking PM's examples of f and g, the way you'd have to dynamically call g would be:

    for arg_group in args:
        g(*arg_group)

But this is going to be limiting, because maybe you could also have a function j, like this:

    def j(a, b=None, c=None):
        if b is None:
            b = len(a)
        ...

    args = [
        ([1,2,3], 2, "spam"),
        ([4,5,6], None, "eggs"),
        ([7,8,9], 4, "bacon"),
    ]

    for a, b, c in args:
        j(a, b, c)

But with function k below, where the b parameter is deferred, you can't get the default b parameter by dynamically unpacking some values; you would have to pass c as a kwd arg:

    def k(a, b=>len(a), c=None):
        ...

Seems like it would be - needed? convenient? - to be able to "ask" for the default in a dynamic way... Am I making more of this than is justified?

--- Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Wed, Oct 27, 2021 at 5:30 AM Ricky Teachey <ricky@teachey.org> wrote:
But with function k below, where the b parameter is deferred, you can't get the default b parameter by dynamically unpacking some values; you would have to pass c as a kwd arg:
def k(a, b=>len(a), c=None): ...
Seems like it would be- needed? convenient?- to be able to "ask" for the default in a dynamic way... Am I making more of this than is justified?
Question: Does it make sense to ask for the default for the second positional parameter, while passing an actual value for the third? If it does, then the function needs an API that reflects this. The most obvious such API is.... what you already described: passing c as a keyword argument instead. I don't think this is a problem. When you're looking at positional args, it is only ever the last N that can be omitted. If it makes sense to pass any combination of arguments, then they should probably be keyword-only, to clarify this. Do you have any examples where this isn't the case? ChrisA
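Spelled out with Ricky's hypothetical k (a sketch in the proposed syntax):

```
def k(a, b=>len(a), c=None): ...

k([1, 2, 3], c="spam")   # b is omitted, so it falls back to len(a)
```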
On Tue, Oct 26, 2021 at 2:40 PM Chris Angelico <rosuav@gmail.com> wrote:
Do you have any examples where this isn't the case?
ChrisA
I don't. I only have a niggling feeling that maybe this is a bigger problem than we're giving it credit for. If I can, I'll try to substantiate it better. Maybe others can better flesh out the concern here if it's valid. At bottom I guess I'd describe the problem this way: with most APIs, there is a way to PASS SOMETHING that says "give me the default". With this proposed API, we don't have that; the only way to say "give me the default" is to NOT pass something. I don't KNOW if that's a problem, it just feels like one.

--- Ricky.
On 10/26/21 12:08 PM, Ricky Teachey wrote:
On Tue, Oct 26, 2021 at 2:40 PM Chris Angelico wrote:
Do you have any examples where this isn't the case?
I don't. I only have a niggling feeling that maybe this is a bigger problem than we're giving it credit for.
At bottom I guess I'd describe the problem this way: with most APIs, there is a way to PASS SOMETHING that says "give me the default". With this proposed API, we don't have that; the only way to say "give me the default" is to NOT pass something.
I don't KNOW if that's a problem, it just feels like one.
Several times I've had to write code that calls a function in several different ways, based solely on where I could pass None to get the default behavior:

    def my_wrapper_func(this, that, the_other=None):
        if framework.version > 4:
            framework.utility(this, that, the_other)
        elif framework.version > 3.5:
            if the_other is None:
                framework.utility(this, that)
            else:
                framework.utility(this, that, the_other)

What a pain. When this PEP originally came out I thought that passing None was the way to trigger it -- if that's not the case, and there is nothing we can pass to trigger it, I am much less interested.

-- ~Ethan~
On Tue, Oct 26, 2021 at 1:35 PM Ethan Furman <ethan@stoneleaf.us> wrote:
to PASS SOMETHING that says "give me the default". With this proposed API, we don't have that; the only way to say "give me the default" is to NOT pass something.
But that, in fact, is all that Python provides. Yes, None indicating "use the default" is a very common idiom, but it is only a convention, and you can't be sure it will work with every API. Not passing a value is the only way to indicate that, well, you have not passed a value ;-)

I think this idea is addressing two related issues:

1) We have to write extra boilerplate code to process mutable (or calculated) default arguments.

2) There is no well defined way to specify "not specified" -- None is a convention, but it is not used everywhere, and there are some places it can't be used, as None is heavily overloaded and is a valid value in some cases.

When this PEP originally came out I thought that passing None was the way to trigger it -- if that's not the case, and there is nothing we can pass to trigger it, I am much less interested.
See (2) above -- we really should not make None an "official" way to specify not-set, or use-default. If we wanted to, we could create a new standard sentinel for that, say "MISSING", and then allow folks to use that, or omit the argument. But frankly, I don't see the point: we already have the args tuple and kwargs dict -- it is not hard to omit items from those. But in a previous thread, I argued for a handful of standard sentinels so that we wouldn't have to overload None so much -- so MISSING does have its appeal.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
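A sketch of that idea: MISSING here is hypothetical, not an existing builtin, and default_timeout is borrowed from the earlier connect() example:

```
MISSING = object()

def connect(timeout=MISSING):
    if timeout is MISSING:   # caller omitted it, or explicitly asked for the default
        timeout = default_timeout
    ...
```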
On Tue, 26 Oct 2021, Ricky Teachey wrote:
At bottom I guess I'd describe the problem this way: with most APIs, there is a way to PASS SOMETHING that says "give me the default". With this proposed API, we don't have that; the only way to say "give me the default" is to NOT pass something.
I don't KNOW if that's a problem, it just feels like one.
I agree that it's annoying, but it's actually an existing problem with early-bound defaults too. Consider:

```
def f(eggs = [], spam = {}):
    ...
```

There isn't an easy way to get the defaults for the arguments, because they're not just *any* `[]` or `{}`, they're a specific list and dict. So if you want to specify a value for the second argument but not the first, you'd need to do one of the following:

```
f(spam = {'more'})
f(f.__defaults__[0], {'more'})
```

The former would work just as well with PEP 671. The latter depends on introspection, which we're still working out. Unfortunately, even if we can get access to the code that produces the default, we won't be able to actually call it, because it needs to be called from the function's scope. For example, consider:

```
def g(eggs := [], spam := {}):
    ...
```

In this simple case, there are no dependencies, so we could do something like this:

```
g(g.__defaults__[0](), {'more'})
```

But in general we won't be able to make this call, because we don't have the scope until `g` gets called and its scope created... So there is a bit of functionality loss with PEP 671, though I'm not sure it's that big a deal.

I wonder if it would make sense to offer a "missing argument" object (builtin? attribute of inspect.Parameter? attribute of types.FunctionType?) that actually simulates the behavior of that argument not being passed. Let me call it `_missing` for now. This would actually make it far easier to accomplish "pass in the second argument but not the first", both with early- and late-binding defaults:

```
f(_missing, {'more'})
g(_missing, {'more'})
```

I started thinking about `_missing` when thinking about how to implement late-binding defaults. It's at least one way to do it (then the function itself could even do the argument checks), though perhaps there are simpler ways that avoid the ref count increments.

Erik
--
Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Wed, Oct 27, 2021 at 8:06 AM Erik Demaine <edemaine@mit.edu> wrote:
I wonder if it would make sense to offer a "missing argument" object (builtin? attribute of inspect.Parameter? attribute of types.FunctionType?) that actually simulates the behavior of that argument not being passed. Let me call it `_missing` for now. This would actually make it far easier to accomplish "pass in the second argument but not the first", both with early- and late-binding defaults:
```
f(_missing, {'more'})
g(_missing, {'more'})
```
I started thinking about `_missing` when thinking about how to implement late-binding defaults. It's at least one way to do it (then the function itself could even do the argument checks), though perhaps there are simpler ways that avoid the ref count increments.
The trouble with sentinel values is that you always need another one. Sooner or later, you're going to need to talk about the _missing object, and you'll need to distinguish it from the absence of an object. If there is a way to say "don't pass this argument", it would have to be some kind of syntactic token. ChrisA
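A small sketch of the regress ChrisA describes (all names hypothetical):

```
_missing = object()   # a would-be universal "not passed" marker

def describe(obj=_missing):
    if obj is _missing:
        return "no object given"
    return repr(obj)

# Fine, until some caller needs to treat _missing as an ordinary object:
describe(_missing)   # indistinguishable from describe() with no argument
```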
Hmm, it seems our notes crossed paths, sorry. On Tue, Oct 26, 2021 at 2:12 PM Chris Angelico <rosuav@gmail.com> wrote:
The trouble with sentinel values is that you always need another one. Sooner or later, you're going to need to talk about the _missing object, and you'll need to distinguish it from the absence of an object.
If there is a way to say "don't pass this argument", it would have to be some kind of syntactic token.
I don't think that's true. Even now, folks can misuse None, and True and False, in all sorts of ways, but if a singleton is well documented to mean "missing argument" then anyone who uses it some other way gets what they deserve. The reason None can't be counted on to mean MISSING is that it is, in fact, used in other ways already.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Oct 27, 2021 at 10:14 AM Christopher Barker <pythonchb@gmail.com> wrote:
I don't think that's true. Even now, folks can misuse None, and True and False, in all sorts of ways, but if a singleton is well documented to mean "missing argument" then anyone who uses it some other way gets what they deserve.

The reason None can't be counted on to mean MISSING is that it is, in fact, used in other ways already.
None is a great sentinel for a lot of cases, but it can't do everything. If your API documents that None means absent, then sure, that works fine! But sometimes you can't, because None has other meaning. If you create a single global sentinel for _missing, sooner or later, it will become necessary to distinguish it from actual absence - for instance, when iterating over the builtins and doing something with each object. The truth is that there is no value that can be a truly universal representation of absence, so it *always* has to be specific to each API. Using *a, **kw does allow us to truly omit an argument. ChrisA
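A sketch of the "truly omit an argument" pattern with **kw, reusing g from the earlier example (wrapper is hypothetical):

```
def wrapper(a, **kw):
    # Forward b only if our own caller supplied it; otherwise the
    # callee's default (whatever and however bound) applies.
    return g(a, **kw)

wrapper([1, 2, 3])         # b omitted all the way down
wrapper([1, 2, 3], b=2)    # b forwarded explicitly
```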
Well, you just repeated what I said, and then again asserted: On Tue, Oct 26, 2021 at 4:21 PM Chris Angelico <rosuav@gmail.com> wrote:
The truth is that there is no value that can be a truly universal representation of absence, so it *always* has to be specific to each API.
But I don't see why that's the case -- why would anyone ever need to use a MISSING to mean anything else? If you have a different meaning, use a different sentinel. Sure, that wouldn't be enforceable, but it sure could be considered best practice. Though I suppose I'm missing something here. Could it be a soft keyword?

Using *a, **kw does allow us to truly omit an argument.

Exactly, and I don't really see why this is considered important.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Subclassing and defaults. There is one use case I'm not sure how to wrap my head around. Say a class and subclass have different default arguments, and we want to pass the fact that the argument wasn't set along to the superclass. The "use None" convention works fine:

    def __init__(self, something=None):
        super().__init__(something)

The superclass could define its own default. But deferred binding would not work:

    def __init__(self, something=>an_expression):
        super().__init__(something)

The superclass would get the result of an_expression, and not know that it should use its default. Is this an actual problem? I'm not sure -- I can't think of when I wouldn't want a subclass to override the default of the superclass, but maybe? And of course, if you really do need that, then use a sentinel in that case :-) No one's suggesting that every default should be specified explicitly!

-CHB
On Wed, Oct 27, 2021 at 10:40 AM Christopher Barker <pythonchb@gmail.com> wrote:
Hmm. Trying to figure out what this would mean in practice. Let's get some concrete names for a moment.

    class Pizza:
        def __init__(self, cheese="mozzarella"):
            ...

    class WeirdPizza(Pizza):
        def __init__(self, cheese="brie"):  # you monster
            ...

    Pizza("cheddar")       # Legal, makes a cheddar pizza
    WeirdPizza("cheddar")  # Legal, makes a cheddar weirdpizza
    Pizza()                # Standard pizza
    WeirdPizza()           # Nonstandard pizza

If you're overriding the default, then you definitely want to use the subclass's default. So in this simple case, no problem. Another case is where you're trying to specify that the subclass should retain the superclass's default. But in that sort of situation, you probably want to retain ALL unspecified defaults - and that's best spelled *a, **kw, so that's fine too. All I can think of is that you want to replace some, but not all, of the defaults. That'd be a bit clunky, but if your API is complicated like that, it should probably be made entirely kwonly for safety, which means you can write code like this:

    def __init__(self, *, tomato="ketchup", **kw):
        super().__init__(tomato=tomato, **kw)

A little clunky, but it lets you change one default, while keeping others with the same defaults. It would be nice to have a way to daisy-chain function signatures (think like functools.wraps but merging the signatures), but that would be quite a big thing to try to build. Would be an interesting idea to tackle though.

ChrisA
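The "retain ALL unspecified defaults" case from the middle of that message, sketched with the same Pizza class (QuietPizza is hypothetical):

```
class QuietPizza(Pizza):
    def __init__(self, *args, **kwargs):
        # Forward everything unchanged: any argument the caller omits
        # stays omitted, so Pizza's own defaults (early- or late-bound)
        # still apply.
        super().__init__(*args, **kwargs)
```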
On Tue, Oct 26, 2021, 7:20 PM Chris Angelico
The truth is that there is no value that can be a truly universal representation of absence, so it *always* has to be specific to each API.
That's a fact about Python rather than a fact of programming in general. For example, R has a NULL that is "even more missing" than its NA or NaN. For example:

    # operation on NULL Vector
    v1 <- c(NULL, NULL, NULL)
    str(v1)
    #  NULL

In contrast, a vector of NA would still have 3 elements (or however many). NA is basically Python None; NULL is like a true missing that cannot be anything else. I'm not saying Python should add this, but it is possible to do.
I'm very confused about the apparent convergence on the token "=>" for deferred parameter assignment.

1) As others have said, it sure feels like the arrow is going the wrong way.

But the bigger question I have is with the similarity to lambda:

2) As I understand it, there's a good chance that "=>" will be adopted to mean defining an anonymous function, i.e. a new spelling of lambda. But we can use lambdas as default arguments (indeed, that's a common idiom), and, presumably, a lambda will be able to use deferred defaults. So there's a reasonable probability that the same token will be used in the same context, meaning two different things, maybe even at the same time. I'm sure the parser will be able to figure it out, but I really don't get why folks think this is a good idea for human readability. Can someone explain?

On the other hand, others have suggested :=, which could also be used as part of the expression, so not a good idea either :-(

BTW: was it intentional that this:

    In [8]: def fun(x, y=(z:=3)):
       ...:     print(x,y,z)
       ...:

adds z to the function namespace -- sure seems odd to me.

    In [9]: fun(2)
    2 3 3

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Oct 27, 2021 at 10:27 AM Christopher Barker <pythonchb@gmail.com> wrote:
I'm very confused about the apparent convergence on the token "=>" for deferred parameter assignment.
The current text of the PEP uses that. It's not particularly important to me *what* symbol is used, so I just use the same one as in the PEP, for some measure of consistency.
On the other hand, others have suggested :=, which could also be used as part of the expression, so not a good idea either :-(
Yes. I originally was positing =: and another viable suggestion is ?=. There are a few other options listed in the PEP, and I'm happy to hear arguments for and against.
BTW: was it intentional that this:
    In [8]: def fun(x, y=(z:=3)):
       ...:     print(x,y,z)
       ...:
adds z to the function namespace -- sure seems odd to me.
If it doesn't add it to the function namespace, what does it add it to? Do parameters get their own namespace? You can reassign your parameters in the function without a nonlocal declaration, which strongly suggests that they're in the exact same namespace as assignments made in the function body; so I don't see any particular reason to isolate this one example. Of course, this wouldn't be *good* code, but it should be acceptable by the interpreter. (For starters, it will only assign z if y is omitted.) ChrisA
On Tue, Oct 26, 2021 at 5:29 PM Christopher Barker <pythonchb@gmail.com> wrote:
BTW: was it intentional that this:
    In [8]: def fun(x, y=(z:=3)):
       ...:     print(x,y,z)
       ...:
adds z to the function namespace -- sure seems odd to me.
In [9]: fun(2) 2 3 3
It doesn't. It adds it to the namespace in which the function is defined, which is what you'd expect given when function defaults are currently evaluated (at function definition time). It's just that if `z` is referenced in the function body and isn't a local, it gets looked up in the enclosing namespace (either as a global, or via closure if the function is nested.) Carl
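A quick demonstration of that point, run at module level (a sketch):

```
def fun(x, y=(z := 3)):   # default evaluated once, at definition time
    print(x, y, z)

fun(2)      # prints: 2 3 3
print(z)    # prints: 3 -- the walrus bound z in the enclosing module namespace
```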
On Wed, Oct 27, 2021 at 10:37 AM Carl Meyer <carl@oddbird.net> wrote:
It doesn't. It adds it to the namespace in which the function is defined, which is what you'd expect given when function defaults are currently evaluated (at function definition time).
It's just that if `z` is referenced in the function body and isn't a local, it gets looked up in the enclosing namespace (either as a global, or via closure if the function is nested.)
Oops, I missed seeing that that's actually an early-bound, so my response was on the misinterpretation that the function was written thus: def fun(x, y=>(z:=3)): In which case it *would* add it to the function's namespace. As it currently is, yes, that's added to the same namespace that fun is. ChrisA
On Tue, Oct 26, 2021 at 4:42 PM Chris Angelico <rosuav@gmail.com> wrote:
Oops, I missed seeing that that's actually an early-bound, so my response was on the misinterpretation that the function was written thus:
def fun(x, y=>(z:=3)):
In which case it *would* add it to the function's namespace.
That's OK -- you pre-answered my next question :-) -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Tue, Oct 26, 2021 at 4:37 PM Carl Meyer <carl@oddbird.net> wrote:
It doesn't. It adds it to the namespace in which the function is defined, which is what you'd expect given when function defaults are currently evaluated (at function definition time).
indeed:

    ----> 1 fun(2)

    <ipython-input-10-cef0a16457f4> in fun(x, y)
          1 def fun(x, y=(z:=3)):
    ----> 2     z += 1
          3     print(x,y,z)

    UnboundLocalError: local variable 'z' referenced before assignment

Sorry for the brain fart. Though it does point out the dangers of the walrus operator ...

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 27/10/2021 00:26, Christopher Barker wrote:
I'm very confused about the apparent convergence on the token "=>" for deferred parameter assignment.
1) As others have said, it sure feels like the arrow is going the wrong way.
But the bigger question I have is with the similarity to lambda:
2) As I understand it, there's a good chance that "=>" will be adopted to mean defining an anonymous function, i.e. a new spelling of lambda.
But we can use lambda as default arguments (indeed, that's a common idiom), and, presumably, lambda will be able to use deferred defaults. So there's a reasonable probability that the same token will be used in the same context, meaning two different things, maybe even at the same time.
I'm sure the parser will be able to figure it out, but I really don't get why folks think this is a good idea for human readability.
Can someone explain?
On the other hand, others have suggested :=, which could also be used as part of the expression, so not a good idea either :-(

There's no necessary clash. At present you can write

    f(a = (b:=c)):

so you would be able to write

    f(a:= (b:= c)):

Perhaps a tad confusing at first glance, but unambiguous.
BTW: was it intentional that this:
    In [8]: def fun(x, y=(z:=3)):
       ...:     print(x,y,z)
       ...:
adds z to the function namespace -- sure seems odd to me.
    In [9]: fun(2)
    2 3 3

Erm, where else should z go? What would be the point of writing it if it didn't?

Rob Cliffe
Greetings, this use case interests me.

Is anyone interested in coauthoring this with me? Anyone who has strong interest in seeing this happen - whether you've been around the Python lists for years, or you're new and interested in getting involved for the first time, or anywhere in between!

Did you find a co-author yet? If not, I apply.

Yours, Abdur-Rahmaan Janhangeer
On 10/23/21 5:13 PM, Chris Angelico wrote:
PEP: 671 Title: Syntax for late-bound function argument defaults
I have a major concern about the utility of this addition -- it was mentioned already (sorry, I don't remember who) and I don't think it has yet been addressed. Using the `bisect()` function as a stand-in for the 20+ years' worth of Python APIs in existence:

    def bisect_right(a, x, lo=0, hi=None, *, key=None):
        if hi is None:
            hi = len(a)
        while lo < hi:
            ...

That function would be transformed to:

    def bisect_right(a, x, lo=0, @hi=len(a), *, key=None):
        if hi is None:
            hi = len(a)
        while lo < hi:
            ...

Notice that the `None` check is still in the body -- why? Backwards compatibility: there is code out there that actually passes in `None` for `hi`, and removing the None-check in the body will suddenly cause TypeErrors. This seems like a lot of effort for a very marginal gain.

-- ~Ethan~
On Wed, Oct 27, 2021 at 2:30 PM Ethan Furman <ethan@stoneleaf.us> wrote:
This seems like a lot of effort for a very marginal gain.
The question then becomes: How important is this feature of being able to pass None as a stand-in for the length of the list? If it's important, then don't change the style of the code at all. But help(bisect) clearly states that the default is len(a), and doesn't say anything about passing None, so it would be just as valid to deem such code to be buggy. Obviously that decision has to be made individually for each use-case, but one thing is clear: newly-defined APIs will not need to have None-pollution in this way.

Also, anything that uses a dedicated sentinel should be safe to convert; consider:

    # Inspired by socket.create_connection, but simplified:
    _USE_GLOBAL_DEFAULT = object()
    def connect(timeout=_USE_GLOBAL_DEFAULT):
        if timeout is _USE_GLOBAL_DEFAULT:
            timeout = default_timeout

If any outside caller is reaching into the module and referring to this underscore-preceded name, it is buggy, and I would have no qualms in breaking that code. (The inspiration for this can't actually be converted, since socket.create_connection with the global default just doesn't call the settimeout() method, but there will be other cases where the same applies.)

As with every new feature, a decision has to be made as to whether old code should be changed. This is no different, but I do believe that in many cases, it will be safe to do so.

ChrisA
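For reference, the sentinel version above, converted to the proposed syntax, would collapse to (a sketch):

```
def connect(timeout=>default_timeout):
    ...
```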
On Tue, Oct 26, 2021 at 8:29 PM Ethan Furman <ethan@stoneleaf.us> wrote:
Notice that the `None` check is still in the body -- why? Backwards compatibility:
Of course. Personally, I don't think there's any reason to go in and change the standard library at all. This is a feature for new code, new APIs. *Maybe* a long time from now, some stdlib APIs could be updated, but no hurry.

This seems like a lot of effort for a very marginal gain.
Marginal gain to the stdlib, yes of course. Though now that you mention it -- there would be some marginal gain even to functions like that, making the function signature a bit more clear. Interestingly, though, here's bisect's docstring:

    In [14]: bisect.bisect?
    Docstring:
    bisect_right(a, x[, lo[, hi]]) -> index

    Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e <= x, and all e in
    a[i:] have e > x.  So if x already appears in the list, i points just
    beyond the rightmost x already there

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    Type: builtin_function_or_method

It's not actually documented that None indicates "use the default". Which, it turns out, is because it doesn't :-)

    In [24]: bisect.bisect([1,3,4,6,8,9], 5, hi=None)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-24-65fd10e3a3b5> in <module>
    ----> 1 bisect.bisect([1,3,4,6,8,9], 5, hi=None)

    TypeError: 'NoneType' object cannot be interpreted as an integer

I guess that's because in C there is a way to define optional other than using a sentinel? Or it's using an undocumented sentinel? Note: that's Python 3.8 -- I can't imagine anything's changed, but ...

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Oct 27, 2021 at 4:07 PM Christopher Barker <pythonchb@gmail.com> wrote:
It's not actually documented that None indicates "use the default".
Which, it turns out is because it doesn't :-)
I guess that's because in C there is a way to define optional other than using a sentinel? Or it's using an undocumented sentinel?
Note: that's python 3.8 -- I can't imagine anything;s changed, but ...
Actually it has. The C-accelerated version of the function changed its signature between 3.8 and 3.11 - probably when Argument Clinic got deployed.

    bisect_right(...)
        bisect_right(a, x[, lo[, hi]]) -> index

So, yes, the 3.8 version of it does indeed use "optional" without a default. And code that passes None directly is buggy.

ChrisA
On Tue, 26 Oct 2021, Christopher Barker wrote:
I guess that's because in C there is a way to define optional other than using a sentinel? Or it's using an undocumented sentinel?
Note: that's python 3.8 -- I can't imagine anything;s changed, but ...
It seems to have changed. I can reproduce the error in CPython 3.8, but the same code works in CPython 3.9 and 3.10 (all using the C version of the module, though there's also a Python version of the module that probably always supported hi=None). I think it's the result of this commit: https://github.com/python/cpython/commit/3a855b26aed02abf87fc1163ad0d564dc3d... On the plus side, this probably means that there aren't many people using the hi=None API. :-) So it might be safe to change to a late-bound default.

Erik
--
Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
One similarity that I don't think has been mentioned yet:

- decorator syntax says, "run me later, after this function is built"
- late-bound argument syntax says, "run me later, just before each function call"

Because both mean "run me later" we can leverage the @ symbol to aid understanding; also, because "run me later" can completely change the workings of a function (mutable defaults, anyone?), it deserves more attention than being buried in the middle of the expression where it is easy to miss (which is why I originally proposed the ? -- it stood out better).

-- ~Ethan~
On Thu, Nov 4, 2021 at 2:29 AM Ethan Furman <ethan@stoneleaf.us> wrote:
One similarity that I don't think has been mentioned yet:
- decorator syntax says, "run me later, after this function is built"
- late-bound argument syntax says, "run me later, just before each function call"
Hmm, I more think of decorator syntax as "modify this function". It runs at the same time that the def statement does, although the effects may be felt at call time (including a lot of simple ones like lru_cache).
Because both mean "run me later" we can leverage the @ symbol to aid understanding; also, because "run me later" can completely change the workings of a function (mutable defaults, anyone?), it deserves more attention than being buried in the middle of the expression where it is easy to miss (which is why I originally proposed the ? -- it stood out better).
One of the reasons I want to keep the latebound vs earlybound indication at the equals sign is the presence of annotations. I want to associate the lateboundness of the default with the default itself; consider:

    def func(spam: list = []) -> str: ...

Which part of that becomes late bound? The [], not the name "list", not the name "spam". So if you want to make use of the at sign, it would end up looking like matrix multiplication:

    def func(spam: list @= []) -> str: ...
    def func(spam: list =@ []) -> str: ...

rather than feeling like decorating the variable. Is that still valuable enough to prefer it?

ChrisA
On 11/3/21 9:07 AM, Chris Angelico wrote:
On Thu, Nov 4, 2021 at 2:29 AM Ethan Furman wrote:
One similarity that I don't think has been mentioned yet:
- decorator syntax says, "run me later, after this function is built"
- late-bound argument syntax says, "run me later, just before each function call"
Hmm, I more think of decorator syntax as "modify this function". It runs at the same time that the def statement does, although the effects may be felt at call time (including a lot of simple ones like lru_cache).
Well, if "at the same time" you mean "after the function is defined even though the decorator appears first", then sure. ;-)
Because both mean "run me later" we can leverage the @ symbol to aid understanding; also, because "run me later" can completely change the workings of a function (mutable defaults, anyone?), it deserves more attention than being buried in the middle of the expression where it is easy to miss (which is why I originally proposed the ? -- it stood out better).
One of the reasons I want to keep the latebound vs earlybound indication at the equals sign is the presence of annotations. I want to associate the lateboundness of the default with the default itself;
I think that level of specificity is unnecessary, and counter-productive. When discussing a particular late-bound default, how are you (usually) going to reference it? By name: "the 'spam' parameter is late-bound" -- so decorate the variable name.
So if you want to make use of the at sign, it would end up looking like matrix multiplication:
    def func(spam: list @= []) -> str: ...
    def func(spam: list =@ []) -> str: ...
rather than feeling like decorating the variable.
Which is horrible. Put the @ at the front:

- its relation to decorators, and delayed evaluation, is much more clear
- it stands out better to the reader

-- ~Ethan~
On Wed, Nov 03, 2021 at 10:25:02AM -0700, Ethan Furman wrote:
Which is horrible. Put the @ at the front:
- its relation to decorators, and delayed evaluation, is much more clear
- it stands out better to the reader
I agree. (Well, I would, wouldn't I? :-)

But if people really, really hate the idea of the @ symbol as a prefix modifier/sigil, I think that there are some alternatives which also look good. (See my previous post.)

    !parameter=expression   # bang parameter
    >parameter=expression
    ^parameter=expression
    ~parameter=expression

(but not $parameter, thank you).

I suppose that we could even add yet another overloaded meaning on the asterisk:

    # with no default, * keeps the old meaning of collecting
    # extra positional values
    *parameter

    # with a default, * triggers late-binding
    *parameter=expression

I should hate that, I know I should... but I kinda don't.

-- Steve
On Thu, Nov 4, 2021 at 4:38 AM Steven D'Aprano <steve@pearwood.info> wrote:
I should hate that, I know I should... but I kinda don't.
I'll save you the trouble: I hate that :) As I said in the other post, *param changes the meaning of param itself. If there's any meaning to *param=expr, it should be something like "provide one or more args, but if you provide zero, use this instead". Which seems pretty weird. ChrisA
On 11/3/21 10:35 AM, Steven D'Aprano wrote:
I should hate that, I know I should... but I kinda don't.
Don't worry, I do. ;-) -- ~Ethan~
There's some conflict about whether it's the name or the expression that's different, and thus should be adorned. But it's neither: it's the "=" that's different -- so that's what should be a different symbol. As for "=>" -- I like it now, but if that same symbol is adopted for lambda, it would be horrible.

-CHB
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
How about:

    parameter=*expression

so, for example:

    parameter=*[]

indicates that you get a new list for each default, so multiple lists, not a single shared list?
On Wed, Nov 3, 2021, 2:40 PM MRAB <python@mrabarnett.plus.com> wrote:
I'm sure I could get used to it, but that looks to me like you're unpacking an empty list and assigning the resulting tuple. It's weird.
On Thu, Nov 4, 2021 at 4:25 AM Ethan Furman <ethan@stoneleaf.us> wrote:
On 11/3/21 9:07 AM, Chris Angelico wrote:
Hmm, I more think of decorator syntax as "modify this function". It runs at the same time that the def statement does, although the effects may be felt at call time (including a lot of simple ones like lru_cache).
Well, if "at the same time" you mean "after the function is defined even though the decorator appears first", then sure. ;-)
Yes, I do mean that that's the same time. There are three very different points in the life of a function: 1) Compilation (converting source code to byte code) 2) Definition 3) Invocation Decorators happen at definition time. Yes, the decorator is called after the function is otherwise defined, but that's very little difference (the only way it would be visible is an early-bound default, which would be evaluated prior to decorators being applied - all the rest of the work happens at compilation time). In contrast, late-bound defaults happen at invocation, which is completely separate.
One of the reasons I want to keep the latebound vs earlybound indication at the equals sign is the presence of annotations. I want to associate the lateboundness of the default with the default itself;
I think that level of specificity is unnecessary, and counter-productive. When discussing a particular late-bound default, how are you (usually) going to reference it? By name: "the 'spam' parameter is late-bound" -- so decorate the variable name.
You use the name because you can't refer to it other than by name, and that's fine. But it's the *default* that is different. Not the parameter. If you provide a value when you call the function, it doesn't make any difference whether there's an early default, a late default, or no default at all; the name is bound identically regardless.
So if you want to make use of the at sign, it would end up looking like matrix multiplication:
    def func(spam: list @= []) -> str: ...
    def func(spam: list =@ []) -> str: ...
rather than feeling like decorating the variable.
Which is horrible. Put the @ at the front:
- its relation to decorators, and delayed evaluation, is much more clear
- it stands out better to the reader
I agree that both of those are horrible. That's why I'm not advocating either :) ChrisA
On Thu, Nov 04, 2021 at 04:36:27AM +1100, Chris Angelico wrote:
You use the name because you can't refer to it other than by name, and that's fine. But it's the *default* that is different. Not the parameter.
Wait, are you saying that the list display here:

    parameter=[]

is different from the list display here?

    parameter=>[]

Well obviously they are distinct *objects*, but we consider that the semantics of the [] is the same here:

    a = []
    b = []

even though a and b get distinct objects. So we should ignore the fact that they give distinct objects.

What if we use a singleton as our default?

    parameter=None
    parameter=>None

Now there's only a single object involved. There is no question at all that the token `None` refers to exactly the same thing in both cases. We cannot possibly say that "it's the *default* that is different" in this case, that one of the Nones is different from the other. But one parameter is still early bound and the other is still late-bound. It's not the binding itself that is different (the implementations are the same: we have a slot with a pointer to an object), and it's not the None defaults that are different, because there is only one None. It must be the parameter that is different.
I agree that both of those are horrible. That's why I'm not advocating either :)
I've lost track of what your preferred syntax is. It is this? # move parameter name into the expression parameter=>expression -- Steve
On Thu, Nov 4, 2021 at 4:56 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Nov 04, 2021 at 04:36:27AM +1100, Chris Angelico wrote:
You use the name because you can't refer to it other than by name, and that's fine. But it's the *default* that is different. Not the parameter.
Wait, are you saying that the list display here:
parameter=[]
is different from the list display here?
parameter=>[]
Well obviously they are distinct *objects*, but we consider that the semantics of the [] is the same here:
a = [] b = []
even though a and b get distinct objects. So we should ignore the fact that they give distinct objects.
Here's a closer parallel.

    a = []
    for _ in range(10):
        b = []
        ... # code that uses a and b

Are the names a and b different? Are the lists different? Is the assignment different? What, exactly, is different? In a sense, nothing is.
What if we use a singleton as our default?
parameter=None parameter=>None
Now there's only a single object involved. There is no question at all that the token `None` refers to exactly the same thing in both cases. We cannot possibly say that "it's the *default* that is different" in this case, that one of the Nones is different from the other.
But one parameter is still early bound and the other is still late-bound.
The parameters are bound at the same time. One of the defaults is evaluated at function definition time, the other at function invocation time. It's the evaluation of None that is different.
It's not the binding itself that is different (the implementations are the same: we have a slot with a pointer to an object), and it's not the None defaults that are different, because there is only one None. It must be the parameter that is different.
Well, you have two evaluations that produce the same result, but that's because None is a singleton. I mean, you could write it like this and get the same result too:

    parameter=(None,)[0]

I'm sure nobody would try to claim that that's the same expression as just None :) Different expressions can easily yield the same object. But the evaluation is what's different - not because the expression is different, but because the timing is.
I agree that both of those are horrible. That's why I'm not advocating either :)
I've lost track of what your preferred syntax is. It is this?
# move parameter name into the expression parameter=>expression
Yeah, but only a weak preference. I'm sticking to it for consistency, but only until something better comes along. However, I don't consider adorning the name to be better. :) ChrisA
On Thu, Nov 04, 2021 at 03:07:22AM +1100, Chris Angelico wrote:
def func(spam: list = []) -> str: ...
Which part of that becomes late bound?
The name "spam" of course. Without the parameter, nothing gets bound at all.
The [], not the name "list", not the name "spam".
The type annotation is irrelevant. It's an annotation. There's no sense that the earliness/lateness of the binding comes into it at all. The parameter spam is declared to be a list even if the function is never called and nothing gets bound to it at all.

And the [] is merely the expression that is bound to the parameter. I say "merely", but of course the expression is important in the sense that if you want the default value to be an empty list, it won't do to use "Hello world" as the expression. Obviously. But the parameter is, in a sense, more important than the default value, because the default value is only a default. If you pass an argument to the function, the default doesn't get used. So if you talk about the function (either in code, the function's body, or in actual prose), you will say something like:

    if condition:
        spam.append(eggs)

"if this condition is true, append eggs to spam" rather than "...to the empty list", because it might not be the empty list. We talk about the parameter. The parameter is late bound. Not the empty list. Which empty list? My program might generate millions of them. The only one that matters here is spam. It is *spam* which is late bound. Late binding or early binding is a property of the identifier, not of the value that gets bound to it.

Putting the late-bound modifier on the assignment buries it in the middle of what might be a fairly complicated (or hideously complicated!) signature with a type annotation and an expression for the default value:

    parameter:Optional[Literal[X]|List[T]]=>(alpha if value > 2 else beta)*gamma

... I'm sure that we can make it unambiguous to the compiler, but we may not make it stand out sufficiently to the human reader. It should be front and centre, like the `*` in the `*args` and `**kwargs` syntax. I personally feel that the @ symbol works as a prefix modifier on the parameter. But if you hate the @ symbol, I think that other prefix modifiers may also work:

    $parameter=expression   # perhaps a little too like Sparameter
    !parameter=expression   # "bang" parameter :-)
    ^parameter=expression
    >parameter=expression   # unary greater than :-)
    ~parameter=expression

Perl calls these symbols that modify an identifier "sigils". Other languages have them too: https://en.wikipedia.org/wiki/Sigil_(computer_programming)

I guess this is not really a "true" sigil, because the modifier doesn't follow the parameter name outside of the function header (likewise for the `*args` and `**kwargs` syntax). But the analogy is close: the sigil instructs the compiler that this identifier needs to be treated differently by using late-binding instead of early for the default, just as the `*` and `**` sigils instruct the compiler that they are to collect extra positional and keyword arguments.

-- Steve
On Thu, Nov 4, 2021 at 4:30 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Nov 04, 2021 at 03:07:22AM +1100, Chris Angelico wrote:
def func(spam: list = []) -> str: ...
Which part of that becomes late bound?
The name "spam" of course. Without the parameter, nothing gets bound at all.
True, but the parameter will be bound at the same time (minor implementation details aside) whether there's an early default, a late default, or no default, and (orthogonally) whether a value was provided by the caller. There's nothing early or late there.
The [], not the name "list", not the name "spam".
The type annotation is irrelevant. It's an annotation. There's no sense that the earliness/lateness of the binding comes into it at all. The parameter spam is declared to be a list even if the function is never called and nothing gets bound to it at all.
Correct, the annotation is irrelevant - but it is between the name and the default. That's why I'm using annotated examples here: to emphasize that separation (which will happen in many codebases), and to highlight the distinction between annotating the name and annotating the expression.
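A quick runnable illustration of that separation: inspect reports the name, the annotation, and the default as three distinct pieces of the parameter.

    import inspect

    def func(spam: list = []) -> str:
        return repr(spam)

    param = inspect.signature(func).parameters["spam"]
    print(param.name)        # 'spam'
    print(param.annotation)  # <class 'list'>
    print(param.default)     # []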
And the [] is merely the expression that is bound to the parameter.
I say "merely", but of course the expression is important in the sense that if you want the default value to be an empty list, it won't do to use "Hello world" as the expression. Obviously.
Uhh, but that's exactly the bit that's evaluated at a different time. So it's not "merely" the expression; it is the exact thing that we're changing the behaviour of.
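You can see that stored, already-evaluated object directly. A small sketch of current semantics - the result of the default expression lives on the function object from definition time onward:

    def func(spam: list = []) -> str:
        return repr(spam)

    print(func.__defaults__)        # ([],) -- evaluated once, at def time
    func.__defaults__[0].append(1)  # mutate the stored default object
    print(func())                   # '[1]' -- every defaulted call sees it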
But the parameter is, in a sense, more important than the default value, because the default value is only a default. If you pass an argument to the function, the default doesn't get used. So if you talk about the function (either in code, the function's body, or in actual prose), you will say something like:
    if condition:
        spam.append(eggs)
"if this condition is true, append eggs to spam" rather than "...to the empty list" because it might not be the empty list. We talk about the parameter. The parameter is late bound. Not the empty list.
Of course! The parameter is FAR more important than the default, and that's why it comes first. But it's not the parameter that changes.
Late binding or early binding is a property of the identifier, not of the value that gets bound to it.
I disagree :) The identifier is bound in exactly the same way.
I'm sure that we can make it unambiguous to the compiler, but we may not make it stand out sufficiently to the human reader. It should be front and centre, like the `*` in the `*args` and `**kwargs` syntax.
But *a and **kw are actually different parameter types. They change the meaning of the parameter itself, and therefore annotate that.
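That difference is visible at runtime; a short sketch using inspect shows each parameter carrying a distinct kind:

    import inspect

    def demo(a, *args, **kwargs):
        pass

    for p in inspect.signature(demo).parameters.values():
        print(p.name, p.kind)
    # a POSITIONAL_OR_KEYWORD
    # args VAR_POSITIONAL
    # kwargs VAR_KEYWORD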
I personally feel that the @ symbol works as a prefix modifier on the parameter. But if you hate the @ symbol, I think that other prefix modifiers may also work:
    $parameter=expression   # perhaps a little too like Sparameter
    !parameter=expression   # "bang" parameter :-)
    ^parameter=expression
    >parameter=expression   # unary greater than :-)
    ~parameter=expression
The at sign doesn't particularly bother me, the prefix does. So I'm not really a fan of any of these alternatives.

ChrisA
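For anyone who wants to experiment with the semantics while the spelling is still being debated, here is a rough decorator-based emulation. The helper late_default is hypothetical (not part of any library), and the real syntax would let the compiler do this work; this only sketches the call-time evaluation both spellings describe:

    import functools
    import inspect

    def late_default(name, factory):
        # Hypothetical helper: if `name` is omitted by the caller, evaluate
        # `factory` at call time, with access to the arguments bound so far.
        def deco(func):
            sig = inspect.signature(func)

            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                bound = sig.bind(*args, **kwargs)
                if name not in bound.arguments:
                    bound.arguments[name] = factory(bound.arguments)
                return func(*bound.args, **bound.kwargs)

            return wrapper
        return deco

    @late_default("hi", lambda argmap: len(argmap["a"]))
    def bisect_right_demo(a, hi=None):
        return hi

    print(bisect_right_demo([10, 20, 30]))     # 3 -- default computed per call
    print(bisect_right_demo([10, 20, 30], 1))  # 1 -- explicit argument wins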