PEP 671: Syntax for late-bound function argument defaults
Incorporates comments from the thread we just had.

Is anyone interested in coauthoring this with me? Anyone who has strong interest in seeing this happen - whether you've been around the Python lists for years, or you're new and interested in getting involved for the first time, or anywhere in between!

https://www.python.org/dev/peps/pep-0671/

PEP: 671
Title: Syntax for late-bound function argument defaults
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Oct-2021
Python-Version: 3.11
Post-History: 24-Oct-2021

Abstract
========

Function parameters can have default values which are calculated during function definition and saved. This proposal introduces a new form of argument default, defined by an expression to be evaluated at function call time.

Motivation
==========

Optional function arguments, if omitted, often have some sort of logical default value. When this value depends on other arguments, or needs to be reevaluated each function call, there is currently no clean way to state this in the function header.

Currently-legal idioms for this include::

    # Very common: Use None and replace it in the function
    def bisect_right(a, x, lo=0, hi=None, *, key=None):
        if hi is None:
            hi = len(a)

    # Also well known: Use a unique custom sentinel object
    _USE_GLOBAL_DEFAULT = object()
    def connect(timeout=_USE_GLOBAL_DEFAULT):
        if timeout is _USE_GLOBAL_DEFAULT:
            timeout = default_timeout

    # Unusual: Accept star-args and then validate
    def add_item(item, *optional_target):
        if not optional_target:
            target = []
        else:
            target = optional_target[0]

In each form, ``help(function)`` fails to show the true default value. Each one has additional problems, too: using ``None`` is only valid if None is not itself a plausible function parameter; the custom sentinel requires a global constant; and use of star-args implies that more than one argument could be given.

Specification
=============

Function default arguments can be defined using the new ``=>`` notation::

    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
    def connect(timeout=>default_timeout):
    def add_item(item, target=>[]):

The expression is saved in its source code form for the purpose of inspection, and bytecode to evaluate it is prepended to the function's body.

Notably, the expression is evaluated in the function's run-time scope, NOT the scope in which the function was defined (as early-bound defaults are). This allows the expression to refer to other arguments.

Self-referential expressions will result in UnboundLocalError::

    def spam(eggs=>eggs):  # Nope

Multiple late-bound arguments are evaluated from left to right, and can refer to previously-calculated values. Order is defined by the function, regardless of the order in which keyword arguments may be passed.

Choice of spelling
------------------

Our chief syntax proposal is ``name=>expression`` -- our two syntax proposals ... ahem. Amongst our potential syntaxes are::

    def bisect(a, hi=>len(a)):
    def bisect(a, hi=:len(a)):
    def bisect(a, hi?=len(a)):
    def bisect(a, hi!=len(a)):
    def bisect(a, hi=\len(a)):
    def bisect(a, hi=`len(a)`):
    def bisect(a, hi=@len(a)):

Since default arguments behave largely the same whether they're early or late bound, the preferred syntax is very similar to the existing early-bind syntax. The alternatives offer little advantage over the preferred one.

How to Teach This
=================

Early-bound default arguments should always be taught first, as they are the simpler and more efficient way to evaluate arguments. Building on them, late-bound arguments are broadly equivalent to code at the top of the function::

    def add_item(item, target=>[]):

    # Equivalent pseudocode:
    def add_item(item, target=<OPTIONAL>):
        if target was omitted:
            target = []

Open Issues
===========

- yield/await? Will they cause problems? Might end up being a non-issue.

- annotations? They go before the default, so is there any way an anno could want to end with ``=>``?

References
==========

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
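Until such syntax exists, the call-time semantics described above can be approximated in today's Python with a marker object and a decorator. A rough sketch for experimentation only; the `late` and `evaluate_late` names are invented here and are not part of the PEP:

```python
import functools
import inspect

class _Late:
    """Marker for a default that should be evaluated at call time."""
    def __init__(self, expr):
        self.expr = expr  # a callable receiving the arguments bound so far

def late(expr):
    return _Late(expr)

def evaluate_late(func):
    """Resolve _Late defaults left to right before calling func."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            if isinstance(value, _Late):
                # Late defaults may refer to earlier arguments via the dict
                bound.arguments[name] = value.expr(bound.arguments)
        return func(*bound.args, **bound.kwargs)

    return wrapper

@evaluate_late
def bisect_bounds(a, lo=0, hi=late(lambda args: len(args["a"]))):
    return (lo, hi)

print(bisect_bounds([10, 20, 30]))  # (0, 3) -- hi computed at call time
```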
What would the syntax for the default be? I would presume just a single expression, like a lambda, but what about something like an async function, for example:

```py
async def foo(arg=>await get_arg()):
```

Would this work?
On Mon, Oct 25, 2021 at 1:16 AM Zomatree . <angelokontaxis@hotmail.com> wrote:
What would the syntax for the default be? I would presume just a single expression, like a lambda, but what about something like an async function, for example:

```py
async def foo(arg=>await get_arg()):
```

Would this work?
That's an open question at the moment, but I suspect that it will be perfectly acceptable. Same with a yield expression in a generator. ChrisA
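For context, the only way to spell this in current Python is the usual sentinel dance inside the coroutine. A minimal sketch, with `get_arg` as the stand-in name from the question:

```python
import asyncio

async def get_arg():
    return 42

async def foo(arg=None):
    if arg is None:
        arg = await get_arg()  # what `arg=>await get_arg()` would do implicitly
    return arg

print(asyncio.run(foo()))  # 42
```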
Eight hours from the initial post on Python-Ideas, to a PEP, with just eight responses from six people. Is that some sort of a record? And in the wee hours of the morning too (3am to 11am local time). I thought my sleep habits were bad. Do you not sleep any more? :-) -- Steve
On Sun, Oct 24, 2021 at 12:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Eight hours from the initial post on Python-Ideas, to a PEP, with just eight responses from six people. Is that some sort of a record?
And in the wee hours of the morning too (3am to 11am local time). I thought my sleep habits were bad. Do you not sleep any more? :-)
Fair point, but this is something that just keeps on coming up in some form or another. Anyway, if it ends up going nowhere, it's still not wasted time. Sleep? What is sleep? https://docs.python.org/3/library/time.html#time.sleep Ah yes. That. :) ChrisA
From what I've seen so far, I'm -0 on this. I understand the pattern it addresses, but it doesn't feel all that common, nor that hard to address with the existing sentinel-check pattern shown in the PEP draft. This just doesn't feel big enough to merit its own syntax. ... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details). However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases. On Sat, Oct 23, 2021, 8:15 PM Chris Angelico <rosuav@gmail.com> wrote:
Incorporates comments from the thread we just had.
Is anyone interested in coauthoring this with me? Anyone who has strong interest in seeing this happen - whether you've been around the Python lists for years, or you're new and interested in getting involved for the first time, or anywhere in between!
https://www.python.org/dev/peps/pep-0671/
On Sun, Oct 24, 2021 at 05:49:50AM +0400, David Mertz, Ph.D. wrote:
... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details).
However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases.
Dynamic scoping is not the same as deferred computation. -- Steve
On Sat, Oct 23, 2021, 10:58 PM Steven D'Aprano
... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details).
However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases.
Dynamic scoping is not the same as deferred computation.
Of course not generally. But a dynamic deferred could cover this specific desire of the proposal. So strawman proposal:

    def fun(seq, low=0, high=defer: len(seq)):
        assert low < high
        # other stuff...

Where the implication here is that the "defer expression" creates a dynamic scope. The reason this could appeal to me is that it wouldn't be limited to function signatures, nor even necessarily most useful there. Instead, a deferred object would be a completely general thing that could be bound anywhere any object can. Such a capability would allow passing around potential computational "code blocks", but only perform the work if or when a value is required.
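For illustration only: the dynamic-scope flavour of this strawman can be faked today by resolving a wrapped expression against an explicitly passed namespace. The `Deferred` class below is invented for the sketch, not proposed API:

```python
class Deferred:
    """A wrapped expression, resolved later against a caller-chosen namespace."""
    def __init__(self, expr):
        self.expr = expr  # expression source text

    def resolve(self, namespace):
        return eval(self.expr, {"len": len}, namespace)

def fun(seq, low=0, high=Deferred("len(seq)")):
    if isinstance(high, Deferred):
        high = high.resolve({"seq": seq})  # the "dynamic scope", passed by hand
    assert low < high
    return seq[low:high]

print(fun([1, 2, 3]))  # [1, 2, 3]
```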
On Sun, Oct 24, 2021 at 2:21 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 23, 2021, 10:58 PM Steven D'Aprano
... On the other hand, if this could express a much more general deferred computation, I'd be really enthusiastic (subject to syntax and behavioral details).
However, I recognize that a new general "dynamically scoped lambda" would indeed have a lot of edge cases.
Dynamic scoping is not the same as deferred computation.
Of course not generally. But a dynamic deferred could cover this specific desire of the proposal.
So strawman proposal:
    def fun(seq, low=0, high=defer: len(seq)):
        assert low < high
        # other stuff...
Where the implication here is that the "defer expression" creates a dynamic scope.
At what point is this defer-expression to be evaluated? For instance:

    def f(x=defer: a + b):
        a, b = 3, 5
        return x

Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.

ChrisA
On Sat, Oct 23, 2021, 11:28 PM Chris Angelico <rosuav@gmail.com> wrote:
So strawman proposal:
    def fun(seq, low=0, high=defer: len(seq)):
        assert low < high
        # other stuff...
Where the implication here is that the "defer expression" creates a dynamic scope.
At what point is this defer-expression to be evaluated? For instance:
    def f(x=defer: a + b):
        a, b = 3, 5
        return x
Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
This would return 8. Basically as if the expression were passed as a string, and `eval(x)` were run when the name "x" was looked up within a scope.

An elaboration of this strawman could allow partial binding at the static scope as well. E.g.

    cursor = db_connection.cursor()
    table = "employees"
    expensive_query = defer c=cursor, t=table: c.execute(
        f"SELECT * FROM {t} WHERE name={name}")

    def employee_check(q=expensive_query):
        if random.random() > 0.5:
            name = "Smith"
        return q

So 'c' and 't' would be closed over when the deferred is defined, but 'name' would utilize the dynamic scope. In particular, the expensive operation of resolving the deferred would only occur if it was needed.
On Sat, Oct 23, 2021, 11:46 PM David Mertz, Ph.D.
    def f(x=defer: a + b):
        a, b = 3, 5
        return x
Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
Oh... Keep in mind I'm proposing a strawman deliberately, but the most natural approach to keeping an object deferred rather than evaluated is simply to say so:

    def f(x=defer: a + b):
        a, b = 3, 5
        fn2(defer: x)  # look for local a, b within fn2() if needed
        # ... other stuff
        return x  # return 8 here
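The nearest runnable analogue today is a lambda thunk, though note it resolves names lexically, in the defining function's scope, rather than in fn2's. A sketch with invented names:

```python
def fn2(thunk):
    # fn2 receives the deferred expression as a callable and forces it
    return thunk() + 1

def f():
    a, b = 3, 5
    x = a + b              # evaluated eagerly; the strawman would defer this
    return fn2(lambda: x)  # the lambda closes over f's scope, not fn2's

print(f())  # 9
```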
On Sun, Oct 24, 2021 at 2:52 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 23, 2021, 11:46 PM David Mertz, Ph.D.
    def f(x=defer: a + b):
        a, b = 3, 5
        return x
Would this return 8, or a defer-expression? If 8, then the scope isn't truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
Oh... Keep in mind I'm proposing a strawman deliberately, but the most natural approach to keeping an object deferred rather than evaluated is simply to say so:
    def f(x=defer: a + b):
        a, b = 3, 5
        fn2(defer: x)  # look for local a, b within fn2() if needed
        # ... other stuff
        return x  # return 8 here
How would it know to look for a and b inside fn2's scope, instead of looking for x inside fn2's scope? ChrisA
On Sat, Oct 23, 2021 at 9:21 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Oct 24, 2021 at 2:52 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 23, 2021, 11:46 PM David Mertz, Ph.D.
    def f(x=defer: a + b):
        a, b = 3, 5
        return x
Would this return 8, or a defer-expression? If 8, then the scope isn't
truly dynamic, since there's no way to keep it deferred until it moves to another scope. If not 8, then I'm not sure how you'd define the scope or what triggers its evaluation.
Oh... Keep in mind I'm proposing a strawman deliberately, but the most
natural approach to keeping an object deferred rather than evaluated is simply to say so:
    def f(x=defer: a + b):
        a, b = 3, 5
        fn2(defer: x)  # look for local a, b within fn2() if needed
        # ... other stuff
        return x  # return 8 here
How would it know to look for a and b inside fn2's scope, instead of looking for x inside fn2's scope?
I am worried that this side-thread about dynamic scopes (which are a ridiculous idea IMO) will derail the decent proposal of the PEP. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Sun, Oct 24, 2021, 12:25 AM Guido van Rossum
I am worried that this side-thread about dynamic scopes (which are a ridiculous idea IMO) will derail the decent proposal of the PEP.
It's really not a suggestion about dynamic scoping but about more generalized deferred computation. This has been a topic of other threads over the years, and something I've wanted at least since I first worked with Dask's 'delayed()' function.
On Sun, Oct 24, 2021 at 09:39:27AM -0400, David Mertz, Ph.D. wrote:
This has been a topic of other threads over the years, and something I've wanted at least since I first worked with Dask's 'delayed()' function.
You mean this? https://docs.dask.org/en/latest/delayed.html -- Steve
Yes! Exactly that. I believe (and believed in some discussions here since maybe 4-5 years ago) that having something close to the dask.delayed() function baked into the language would be a good thing. And as a narrow point, it could address the late-bound function argument matter as one narrow use. On Sun, Oct 24, 2021, 9:59 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 24, 2021 at 09:39:27AM -0400, David Mertz, Ph.D. wrote:
This has been a topic of other threads over the years, and something I've wanted at least since I first worked with Dask's 'delayed()' function.
You mean this?
https://docs.dask.org/en/latest/delayed.html
-- Steve
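For readers who haven't used it, dask.delayed wraps a function so that calls build a task graph instead of running immediately. A minimal example (requires dask to be installed):

```python
from dask import delayed

@delayed
def expensive(x):
    return x * 2

result = expensive(21)   # no computation happens here; result is a Delayed
print(result.compute())  # 42 -- the work runs only when explicitly demanded
```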
Hi

Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:

    def puzzle(*, a=>b+1, b=>a+1):
        return a, b

Aside: In a functional programming language

    a = b + 1
    b = a + 1

would be a syntax (or at least compile time) error.

-- Jonathan
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
    def puzzle(*, a=>b+1, b=>a+1):
        return a, b
Aside: In a functional programming language

    a = b + 1
    b = a + 1

would be a syntax (or at least compile time) error.
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them). I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this. ChrisA
On Mon, Oct 25, 2021 at 3:47 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
    def puzzle(*, a=>b+1, b=>a+1):
        return a, b
Aside: In a functional programming language

    a = b + 1
    b = a + 1

would be a syntax (or at least compile time) error.
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
In fact, on subsequent consideration, I'm inclining more strongly towards SyntaxError, due to the difficulty of explaining the actual semantics. Changing the PEP accordingly. ChrisA
Hi Chris

You wrote:

In fact, on subsequent consideration, I'm inclining more strongly
towards SyntaxError, due to the difficulty of explaining the actual semantics. Changing the PEP accordingly.
Your PEP, so your choice.

I now think that if implemented, your PEP adds to the Python compiler (and also runtime?) tools for detecting and well-ordering Directed Acyclic Graphs (DAGs).

Here's another problem. Suppose

    def puzzle(*, a=>..., z=>...)

gives rise to a directed acyclic graph, and all the initialisation functions consume and use a value from a counter. The semantics of puzzle will now depend on the linearization you choose for the DAG. (This consumption and use of the value from a counter could be internal to the initialisation function.)

-- Jonathan
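Jonathan's point is easy to demonstrate with a plain counter, no new syntax needed. A sketch with invented names:

```python
import itertools

counter = itertools.count()

def a_init():
    return next(counter)

def b_init():
    return next(counter)

# One linearization of an (empty) dependency DAG:
a, b = a_init(), b_init()
print(a, b)  # 0 1 -- the reverse linearization would have given 1 0,
             # so observable behaviour depends on the chosen order
```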
On Mon, 25 Oct 2021, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 3:47 AM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is:
    def puzzle(*, a=>b+1, b=>a+1):
        return a, b
There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them).
I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this.
In fact, on subsequent consideration, I'm inclining more strongly towards SyntaxError, due to the difficulty of explaining the actual semantics. Changing the PEP accordingly.
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual. Erik
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be: I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
On Mon, Oct 25, 2021 at 4:29 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind. ChrisA
On Mon, Oct 25, 2021 at 05:23:38AM +1100, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 4:29 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
The rules for applying parameter defaults are well-defined. I would have to look it up to be sure, but by memory the rules are:

1. apply positional arguments from left to right;
   - if there are more positional arguments than parameters, raise;

2. apply named keyword arguments to parameters:
   - if the parameter already has a value, raise;
   - if the keyword parameter doesn't exist, raise;

3. for any parameter still without a value, fetch its default;
   - if there is no default, then raise.

I would say that it makes most sense to assign early-bound defaults first, then late-bound defaults, specifically so that late-bound defaults can refer to early-bound ones:

    def func(x=0, @y=x+1)

So step 3 above should become:

3. for any parameters still without a value, skip those which are late-bound, and fetch the default of the others;
   - if there is no default, then raise;

4. for any parameters still without a value, which will all be late-bound, run from left-to-right and evaluate the default.

This will be consistent and understandable, and if you get an UnboundLocalError, the cause should be no more confusing than any other UnboundLocalError. Note that step 4 (evaluating the late-bound defaults) can raise *any* exception at all (it's an arbitrary expression, so it can fail in arbitrary ways). I see no good reason for trying to single out UnboundLocalError for extra protection by turning it into a syntax error.

-- Steve
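The two-pass order Steven describes can be mimicked with a small resolver over plain dicts. A sketch, with all names invented for illustration:

```python
def resolve_defaults(given, early, late):
    """Apply the steps above: given arguments first, then early-bound
    defaults, then late-bound defaults left to right (each late
    expression sees the namespace built so far)."""
    ns = dict(given)
    for name, value in early.items():
        ns.setdefault(name, value)  # steps 1-3: early-bound defaults
    for name, expr in late.items():
        if name not in ns:
            ns[name] = expr(ns)     # step 4: late-bound, left to right
    return ns

print(resolve_defaults({}, {"x": 0}, {"y": lambda ns: ns["x"] + 1}))
# {'x': 0, 'y': 1} -- y=x+1 sees the early-bound x
```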
On Mon, Oct 25, 2021 at 6:13 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Oct 25, 2021 at 05:23:38AM +1100, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 4:29 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual.
Sorry, that should be:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
The rules for applying parameter defaults are well-defined. I would have to look it up to be sure...
And that right there is all the evidence I need. If you, an experienced Python programmer, can be unsure, then there's a strong indication that novice programmers will have far more trouble. Why permit bad code at the price of hard-to-explain complexity? Offer me a real use-case where this would matter. So far, we have had better use-cases for arbitrary assignment expression targets than for back-to-front argument default references, and those were excluded. ChrisA
On Mon, 25 Oct 2021, Chris Angelico wrote:
On Mon, Oct 25, 2021 at 6:13 PM Steven D'Aprano <steve@pearwood.info> wrote:
The rules for applying parameter defaults are well-defined. I would have to look it up to be sure...
And that right there is all the evidence I need. If you, an experienced Python programmer, can be unsure, then there's a strong indication that novice programmers will have far more trouble. Why permit bad code at the price of hard-to-explain complexity?
I'm not sure how this helps; the rules are already a bit complicated. Steven's proposed rules are a natural way to extend the existing rules; I don't see the new rules as (much) more complicated.
Offer me a real use-case where this would matter. So far, we had better use-cases for arbitrary assignment expression targets than for back-to-front argument default references, and those were excluded.
I can think of a few examples, though they are a bit artificial:

```
def search_listdir(path = None, files := os.listdir(path),
                   start = 0, end = len(files)):
    '''specify path or files'''

# variation of the LocaleTextCalendar from stdlib
# (in a message of Steven's)
class Calendar:
    default_firstweekday = 0

    def __init__(self, firstweekday := Calendar.default_firstweekday,
                 locale := find_default_locale(),
                 firstweekdayname := locale.lookup_day_name(firstweekday)):
        ...

Calendar.default_firstweekday = 1
```

But I think the main advantage of the left-to-right semantics is simplicity and predictability. I don't think the following functions should evaluate the default values in different orders.

```
def f(a := side_effect1(), b := side_effect2()): ...
def g(a := side_effect1(), b := side_effect2() + a): ...
def h(a := side_effect1() + b, b := side_effect2()): ...
```

I expect left-to-right semantics of the side effects (so function h will probably raise an error), just like I get from the corresponding tuple expressions:

```
(a := side_effect1(), b := side_effect2())
(a := side_effect1(), b := side_effect2() + a)
(a := side_effect1() + b, b := side_effect2())
```

As Jonathan Fine mentioned, if you defined the order to be a linearization of the partial order on arguments, (a) this would be complicated and (b) it would be ambiguous. I think, if you're going to forbid `def f(a := b, b := a)` at the compiler level, you would need to forbid using late-bound arguments (at least) in late-bound argument expressions. But I don't see a reason to forbid this. It's rare that order would matter, and if it did, a quick experiment or learning "left to right" is really easy.

The tuple expression equivalence leads me to think that `:=` is decent notation. As a result, I would expect:

```
def f(a := expr1, b := expr2, c := expr3):
    pass
```

to behave the same as:

```
_no_a = object()
_no_b = object()
_no_c = object()

def f(a = _no_a, b = _no_b, c = _no_c):
    (a := expr1 if a is _no_a else a,
     b := expr2 if b is _no_b else b,
     c := expr3 if c is _no_c else c)
```

Given that `=` assignments within a function's parameter spec already only mean "assign when another value isn't specified", this is pretty similar.

On Mon, 25 Oct 2021, Chris Angelico wrote:
On Sun, 24 Oct 2021, Erik Demaine wrote:
I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
I admit I missed this subtlety, though again I don't think it would often make a difference. But working out subtleties is what PEPs and discussion are for. :-)

I'd be inclined to assign the early-bound argument defaults before the late-bound arguments, because their values are already known (they're stored right in the function argument) so they can't cause side effects, and it could offer slight incremental benefits, like being able to write the following (again, somewhat artificial):

```
def manipulate(top_list):
    def recurse(start=0, end := len(rec_list), rec_list=top_list):
        ...
```

But I don't feel strongly either way about either interpretation. Mixing both types of default arguments breaks the analogy to tuple expressions above, alas. The corresponding tuple expression with `=` is just invalid.

Personally, I'd expect to use late-bound defaults almost all or all the time; they behave more how I expect and how I usually need them (I use a fair amount of `[]` and `{}` and `set()` as default values). The only context I'd use/want the current default behavior is to hack closures, as in:

```
for thing in things:
    thing.callback = lambda thing=thing: print(thing.name)
```

I believe the general preference for late-bound defaults is why Guido called this a "wart" in https://mail.python.org/archives/list/python-ideas@python.org/message/T4VPHD...

By the way, JavaScript's semantics for default arguments are just like what I'm describing: they are evaluated at call time, in the function scope, and in left-to-right order.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...

A key difference from the PEP is that JavaScript doesn't have the notion of "omitted arguments"; any omitted arguments are just passed in as `undefined`; so `f()` and `f(undefined)` always behave the same (triggering default argument behavior).

There is a subtlety mentioned in the case of JavaScript, which is that the default value expressions are evaluated in their own scope:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...

This is perhaps worth considering for the Python context. I'm not sure this is as important in Python, because UnboundLocalError exists (so attempts to access things in the function's scope will fail), but perhaps I'm missing a ramification...

Erik
--
Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
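Erik's tuple-expression analogy is checkable today, since assignment expressions inside a parenthesized tuple display already evaluate left to right. A small demonstration with invented names:

```python
def side_effect1():
    print("first")
    return 1

def side_effect2():
    print("second")
    return 2

# Elements of a tuple display are evaluated strictly left to right,
# so later elements may refer to names bound by earlier walrus targets:
t = (a := side_effect1(), b := side_effect2() + a)
print(t)  # prints "first", "second", then (1, 3)
```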
On Tue, Oct 26, 2021 at 3:32 AM Erik Demaine <edemaine@mit.edu> wrote:
As Jonathan Fine mentioned, if you defined the order to be a linearization of the partial order on arguments, (a) this would be complicated and (b) it would be ambiguous. I think, if you're going to forbid `def f(a := b, b:= a)` at the compiler level, you would need to forbid using late-bound arguments (at least) in least-bound argument expressions. But I don't see a reason to forbid this. It's rare that order would matter, and if it did, a quick experiment or learning "left to right" is really easy.
Oh yes, absolutely. I have never at any point considered any sort of linearization or reordering of evaluation, and it would be a nightmare. They'll always be evaluated left-to-right. The two options on the table are: 1) Allow references to any value that has been provided in any way 2) Allow references only to parameters to the left Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed. The permissive option allows mutual references as long as one of the arguments is provided, but will give a peculiar error if you pass neither. I think this is bad API design. If you have a function for which one or other of two arguments must be provided, it should raise TypeError when you fail to do so, not UnboundLocalError.
Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind.
I admit I missed this subtlety, though again I don't think it would often make a difference. But working out subtleties is what PEPs and discussion are for. :-)
Yeah. I have plans to try this out on someone who knows some Python but has no familiarity with this proposal, and see how he finds it.
I'd be inclined to assign the early-bound argument defaults before the late-bound arguments, because their values are already known (they're stored right in the function argument) so they can't cause side effects, and it could offer slight incremental benefits, like being able to write the following (again, somewhat artificial):
```
def manipulate(top_list):
    def recurse(start=0, end := len(rec_list), rec_list=top_list):
        ...
```
That would be the most logical semantics, if it's permitted at all.
Personally, I'd expect to use late-bound defaults almost all or all the time; they behave more how I expect and how I usually need them (I use a fair amount of `[]` and `{}` and `set()` as default values).
Interesting. In many cases, the choice will be irrelevant, and early-bound is more efficient. There aren't many situations where early-bind semantics are going to be essential, but there will be huge numbers where late-bind semantics will be unnecessary.
A key difference from the PEP is that JavaScript doesn't have the notion of "omitted arguments"; any omitted arguments are just passed in as `undefined`; so `f()` and `f(undefined)` always behave the same (triggering default argument behavior).
Except when it doesn't, and you have to use null instead... I have never understood those weird inconsistencies!
There is a subtlety mentioned in the case of JavaScript, which is that the default value expressions are evaluated in their own scope:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/...
Yeah, well, JS scope is a weird mess of historical artifacts. Fortunately, we don't have to be compatible with it :)
This is perhaps worth considering for the Python context. I'm not sure this is as important in Python, because UnboundLocalError exists (so attempts to access things in the function's scope will fail), but perhaps I'm missing a ramification...
Hmm. I think the only way it could possibly matter would be something like this:

    def f(x=>spam):
        global spam
        spam += 1

Unsure what this should do. A naive interpretation would be this:

    def f(x=None):
        if x is None:
            x = spam
        global spam
        spam += 1

and would bomb with SyntaxError. But perhaps it's better to permit this, on the understanding that a global statement anywhere in a function will apply to late-bound defaults; or alternatively, to evaluate the arguments in a separate scope. Or, which would be a simpler way of achieving the same thing: all name lookups inside function defaults come from the enclosing scope unless they are other arguments. But maybe that's unnecessarily complicated.

ChrisA
On Mon, Oct 25, 2021 at 10:28 AM Chris Angelico <rosuav@gmail.com> wrote:
[...] The two options on the table are:
1) Allow references to any value that has been provided in any way 2) Allow references only to parameters to the left
Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
Option 2 is a simple SyntaxError on compilation (you won't even get as
far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError).
That’s why I said earlier that this is not technically a SyntaxError. Would it be possible to raise an UnboundLocalError at function definition time if any deferred parameters refer to any others? Functionality similar to a SyntaxError, but more in line with present behavior. -CHB
--
Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Tue, Oct 26, 2021 at 4:36 AM Guido van Rossum <guido@python.org> wrote:
On Mon, Oct 25, 2021 at 10:28 AM Chris Angelico <rosuav@gmail.com> wrote:
[...] The two options on the table are:
1) Allow references to any value that has been provided in any way 2) Allow references only to parameters to the left
Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError).
I'm considering this to be more similar to mismatching local and global usage, or messing up nonlocal names:
    >>> def spam():
    ...     ham
    ...     global ham
    ...
      File "<stdin>", line 3
    SyntaxError: name 'ham' is used prior to global declaration

    >>> def spam():
    ...     def ham():
    ...         nonlocal eggs
    ...
      File "<stdin>", line 3
    SyntaxError: no binding for nonlocal 'eggs' found
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?

    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...
    def f3(x=>y + 1, *, y): ...
    def f4(x=>y + 1):
        y = 2
    def f5(x=>y + 1):
        global y
        y = 2

And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?

If this should be permitted, there are two plausible semantic meanings for these kinds of constructs:

1) Arguments are defined left-to-right, each one independently of each other
2) Early-bound arguments and those given values are defined first, then late-bound arguments

The first option is much easier to explain, but will never give useful results for out-of-order references (unless it's allowed to refer to the containing scope or something). The second is closer to the "if x is None: x = y + 1" equivalent, but is harder to explain.

Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.

ChrisA
There are other options. Maybe you can't combine early and late binding defaults in the same signature. Or maybe all early binding defaults must precede all late binding defaults. FWIW have you started an implementation yet? "If the implementation is easy to explain, ..." On Mon, Oct 25, 2021 at 10:49 AM Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Oct 26, 2021 at 4:36 AM Guido van Rossum <guido@python.org> wrote:
On Mon, Oct 25, 2021 at 10:28 AM Chris Angelico <rosuav@gmail.com>
wrote:
[...] The two options on the table are:
1) Allow references to any value that has been provided in any way 2) Allow references only to parameters to the left
Option 2 is a simple SyntaxError on compilation (you won't even get as far as the def statement). Option 1 allows everything all up to the point where you call it, but then might raise UnboundLocalError if you refer to something that wasn't passed.
Note that if you were to choose the SyntaxError option, you'd be breaking new ground. Everywhere else in Python, undefined names are runtime errors (NameError or UnboundLocalError).
I'm considering this to be more similar to mismatching local and global usage, or messing up nonlocal names:
    >>> def spam():
    ...     ham
    ...     global ham
    ...
      File "<stdin>", line 3
    SyntaxError: name 'ham' is used prior to global declaration

    >>> def spam():
    ...     def ham():
    ...         nonlocal eggs
    ...
      File "<stdin>", line 3
    SyntaxError: no binding for nonlocal 'eggs' found
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...
    def f3(x=>y + 1, *, y): ...
    def f4(x=>y + 1):
        y = 2
    def f5(x=>y + 1):
        global y
        y = 2
And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?
If this should be permitted, there are two plausible semantic meanings for these kinds of constructs:
1) Arguments are defined left-to-right, each one independently of each other 2) Early-bound arguments and those given values are defined first, then late-bound arguments
The first option is much easier to explain, but will never give useful results for out-of-order references (unless it's allowed to refer to the containing scope or something). The second is closer to the "if x is None: x = y + 1" equivalent, but is harder to explain.
Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.
ChrisA
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Tue, Oct 26, 2021 at 4:54 AM Guido van Rossum <guido@python.org> wrote:
There are other options. Maybe you can't combine early and late binding defaults in the same signature. Or maybe all early binding defaults must precede all late binding defaults.
All early must precede all late would make a decent option. Will keep that in mind.
FWIW have you started an implementation yet? "If the implementation is easy to explain, ..."
Not yet. Juggling a lot of things; will get to that Real Soon™, unless someone else offers to help out, which I would definitely welcome. ChrisA
On Tue, Oct 26, 2021 at 04:48:17AM +1100, Chris Angelico wrote:
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
They should all be legal. Legal doesn't mean "works". Code that raises an exception is still legal code.
    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...
    def f3(x=>y + 1, *, y): ...
    def f4(x=>y + 1):
        y = 2
    def f5(x=>y + 1):
        global y
        y = 2
What "bizarre inconsistencies" do you think they have? Each example is different so it is hardly shocking if they behave different too. f1() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (y=2), then late-bound defaults left to right (x=y+1). That is, I argue, the most useful behaviour. But if you insist on a strict left-to-right single pass to assign defaults, then instead it will raise UnboundLocalError because y doesn't have a value. Just like the next case: f2() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which raises UnboundLocalError because y is a local but doesn't have a value yet. f3() assigns positional arguments first (there are none), then keyword arguments (still none), at which point it raises TypeError because you have a mandatory keyword-only argument with no default. f4() is just like f2(). And lastly, f5() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which might raise NameError if global y doesn't exist, otherwise it will succeed. Each of those cases is easily understandable. There is no reason to expect the behaviour in all four cases to be the same, so we can hardly complain that they are "inconsistent" let alone that they are "bizarrely inconsistent". The only novelty here is that functions with late-binding can raise arbitrary exceptions, including UnboundLocalError, before the body of the function is entered. If you don't like that, then you don't like late-bound defaults at all and you should be arguing in favour of rejecting the PEP :-( If we consider code that already exists today, with the None sentinel trick, each of those cases have equivalent errors today, even if some of the fine detail is different (e.g. getting TypeError because we attempt to add 1 to None instead of an unbound local). However there is a real, and necessary, difference in behaviour which I think you missed: def func(x=x, y=>x) # or func(x=x, @y=x) The x=x parameter uses global x as the default. The y=x parameter uses the local x as the default. We can live with that difference. We *need* that difference in behaviour, otherwise these examples won't work: def method(self, x=>self.attr) # @x=self.attr def bisect(a, x, lo=0, hi=>len(a)) # @hi=len(a) Without that difference in behaviour, probably fifty or eighty percent of the use-cases are lost. (And the ones that remain are mostly trivial ones of the form arg=[].) So we need this genuine inconsistency. If you can live with that actual inconsistency, why are you losing sleep over behaviour (functions f1 through f4) which isn't actually inconsistent? * Code that does different things is supposed to behave differently; * The differences in behaviour are easy to understand; * You can't prevent the late-bound defaults from raising UnboundLocalError, so why are you trying to turn a tiny subset of such errors into SyntaxError? * The genuine inconsistency is *necessary*: late-bound expressions should be evaluated in the function's namespace, not the surrounding (global) namespace.
And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?
We should write a list of the things that Python wouldn't have if the intuitions of "less-skilled Python programmers" were a necessary condition:

- no metaclasses, descriptors or decorators;
- no classes, inheritance (multiple or single);
- no slices or zero-based indexing;
- no mutable objects;
- no immutable objects;
- no floats or Unicode strings;

etc. I think that, *maybe*, we could have `print("Hello world")`, so long as the programmer's intuition is that print needs parentheses.
If this should be permitted, there are two plausible semantic meanings for these kinds of constructs:
1) Arguments are defined left-to-right, each one independently of each other 2) Early-bound arguments and those given values are defined first, then late-bound arguments
The first option is much easier to explain, but will never give useful results for out-of-order references (unless it's allowed to refer to the containing scope or something). The second is closer to the "if x is None: x = y + 1" equivalent, but is harder to explain.
You just explained it perfectly in one sentence. The two options are equally easy to explain. The second takes a few more words, but the concepts are no harder. And the second is much more useful. In comparison, think about how hard it is to explain your preferred behaviour, a SyntaxError. Think about how many posts you have written, and how many examples you have given, hundreds maybe thousands of words, dozens or hundreds of sentences, and you have still not convinced everyone that "raise SyntaxError" is the right thing to do. "Why does this simple function definition raise SyntaxError?" is MUCH harder to explain than "Why does a default value that tries to access an unbound local variable raise UnboundLocalError?".
Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.
Being cautious about new syntax is often worthy, but here you are being overcautious. You are trying to prohibit something as a syntax error because it *might* fail at runtime. We don't even protect against things that we know *will* fail!

    x = 1 + 'a'  # Not a syntax error.

In this case, two-pass defaults is clearly superior because it would allow everything that the one-pass behaviour would allow, *plus more* applications that we haven't even thought of yet (but others will).

Analogy: When Python 1 was first evolving, nobody said that we ought to be cautious about parallel assignment:

    a, b, c = ...

just because the user might misuse it.

    a = 1
    if False:
        b = 1  # oops I forgot to define b
    a, b = b, a  # SyntaxError just in case

Nor did we lose sleep over which parallel assignment model is better, and avoid making a decision:

    a, b = b, a

    # Model 1:
    push b
    push a
    swap
    a = pop stack
    b = pop stack

versus:

    # Model 2:
    push b
    a = pop stack
    push a
    b = pop stack

The two models are identical if the expressions on the right are all distinct from the targets on the left, e.g. `a, b = x, y`, but the first model allows us to do much more useful things that the second doesn't, such as the "swap two variables" idiom.

Be bold! The "two pass" model is clearly better than the "one pass" model. You don't need to prevaricate just in case. Worst case, the Steering Council will say "Chris we love everything about the PEP except this..." and you will have to change it. But they won't, because the two pass model is clearly the best *wink*

-- Steve
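The difference between the two models is easy to demonstrate, since model 1 is what Python actually does; model 2 is simulated by hand below:

```python
a, b = 1, 2
a, b = b, a    # model 1: the whole right-hand tuple is built before any store
print(a, b)    # 2 1

# Model 2, interleaving evaluation and binding, would effectively be:
a2, b2 = 1, 2
a2 = b2        # a2 becomes 2
b2 = a2        # b2 reads the *new* a2
print(a2, b2)  # 2 2 -- the swap idiom breaks
```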
On Tue, Oct 26, 2021 at 3:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Oct 26, 2021 at 04:48:17AM +1100, Chris Angelico wrote:
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
They should all be legal. Legal doesn't mean "works". Code that raises an exception is still legal code.
Then there's no such thing as illegal code, and my entire basis for explanation is bunk. Come on, you know what I mean. If it causes SyntaxError, it's not legal code. Just because that's a catchable exception doesn't change anything. Example:
    def f5(x=>y + 1):
        global y
        y = 2
According to the previously-defined equivalencies, this would mean:

    def f5(x=None):
        if x is None:
            x = y + 1
        global y
        y = 2

And that's a SyntaxError. Do you see what I mean now? Either these things are not consistent with existing idioms, or they're not consistent with each other. Since writing that previous post, I have come to the view that "consistency with existing idioms" is the one that gets sacrificed to resolve this.

I haven't yet gotten started on implementation (definitely gonna get to that Real Soon Now™), but one possible interpretation of f5, once disconnected from the None parallel, is that omitting x would use one more than the module-level y. That implies that a global statement *anywhere* in a function will also apply to the function header, despite it not otherwise being legal to refer to a name earlier in the function than the global statement.
And lastly, f5() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which might raise NameError if global y doesn't exist, otherwise it will succeed.
It's interesting that you assume this. By any definition, the header is a reference prior to the global statement, which means the global statement would have to be hoisted. I think that's probably the correct behaviour, but it is a distinct change from the current situation.
However there is a real, and necessary, difference in behaviour which I think you missed:
def func(x=x, y=>x) # or func(x=x, @y=x)
The x=x parameter uses global x as the default. The y=x parameter uses the local x as the default. We can live with that difference. We *need* that difference in behaviour, otherwise these examples won't work:
def method(self, x=>self.attr) # @x=self.attr
def bisect(a, x, lo=0, hi=>len(a)) # @hi=len(a)
Without that difference in behaviour, probably fifty or eighty percent of the use-cases are lost. (And the ones that remain are mostly trivial ones of the form arg=[].) So we need this genuine inconsistency.
I agree, we do need that particular inconsistency. I want to avoid others where possible.
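For concreteness, the early-bound half of that example is checkable in today's Python -- a minimal runnable sketch:

    x = 10

    def func(x=x):
        # The default was evaluated once, at definition time, in the
        # enclosing scope, so it captured the global x as it was then.
        return x

    x = 99
    print(func())  # 10 -- later rebinding of the global has no effect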
If you can live with that actual inconsistency, why are you losing sleep over behaviour (functions f1 through f4) which isn't actually inconsistent?
(Sleep? What is sleep? I don't lose what I don't have!) Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:

    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...

in that it changes what's viable and what's not. (Since you don't like the term "legal" here, I'll go with "viable", since a runtime exception isn't terribly useful.) Changing the default from y=2 to y=>2 would actually stop the example from working. Multi-pass initialization makes sense where it's necessary. Is it really necessary here?
And importantly, do Python core devs agree with less-skilled Python programmers on the intuitions?
We should write a list of the things that Python wouldn't have if the intuitions of "less-skilled Python programmers" was a neccessary condition.
- no metaclasses, descriptors or decorators;
- no classes, inheritance (multiple or single);
- no slices or zero-based indexing;
- no mutable objects;
- no immutable objects;
- no floats or Unicode strings;
etc. I think that, *maybe*, we could have `print("Hello world")`, so long as the programmer's intuition is that print needs parentheses.
No, you misunderstand. I am not saying that less-skilled programmers have to intuit things perfectly; I am saying that, when there are drastic differences of expectation, there is probably a problem. I can easily explain "arguments are assigned left to right". It is much harder to explain multi-stage initialization and why different things can be referenced.
Two-phase initialization is my second-best preference after rejecting with SyntaxError, but I would love to see some real-world usage before opening it up. Once permission is granted, it cannot be revoked, and it might turn out that one of the other behaviours would have made more sense.
Being cautious about new syntax is often worthy, but here you are being overcautious. You are trying to prohibit something as a syntax error because it *might* fail at runtime. We don't even protect against things that we know *will* fail!
x = 1 + 'a' # Not a syntax error.
But this is an error:

    x = 1
    def f():
        print(x)
        x = 2

And so is this:

    def f(x):
        global x

As is this:

    def f():
        x = 1
        global x
        x = 2

You could easily give these functions meaning using any of a variety of rules, like "the global statement applies to what's after it" or "the global statement applies to the whole function regardless of placement". Why are they SyntaxErrors? Is that being overcautious, or is it blocking code that makes no sense?

The two-pass model is closer to existing idioms. That's of value, but it isn't the greatest justification. And given that there is no idiom that perfectly matches the semantics, I don't consider that to be strong enough to justify the increase in complexity.

ChrisA
On Tue, Oct 26, 2021 at 05:27:49PM +1100, Chris Angelico wrote:
On Tue, Oct 26, 2021 at 3:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Oct 26, 2021 at 04:48:17AM +1100, Chris Angelico wrote:
The problem is the bizarre inconsistencies that can come up, which are difficult to explain unless you know exactly how everything is implemented internally. What exactly is the difference between these, and why should some be legal and others not?
They should all be legal. Legal doesn't mean "works". Code that raises an exception is still legal code.
Then there's no such thing as illegal code,
I mean that code that compiles and runs is legal, even if it raises a runtime error. Code that cannot compile due to syntax errors is "illegal", we often talk about "illegal syntax":

    None[0]  # Legal syntax, still raises

    import() = while x or and else  # Illegal syntax

Sorry if I wasn't clear.
and my entire basis for explanation is bunk. Come on, you know what I mean. If it causes SyntaxError, it's not legal code.
Sorry Chris, I don't know what you mean. It only causes a syntax error because you are forcing it to cause a syntax error, not because it cannot be interpreted under the existing (proposed or actual) semantics. You are (were?) arguing that something that is otherwise meaningful should be a syntax error because there are some circumstances in which it could fail. That's not "illegal code" in the sense I mean, and I don't know why you want it to be a syntax error (unless you've changed your mind).

We don't do this:

    y = x+1  # Syntax error, because x might be undefined

and we shouldn't make this a syntax error:

    def func(@spam=eggs+1, @eggs=spam-1):

either, just because `func()` with no arguments raises. So long as you pass at least one argument, it works fine, and that may be perfectly suitable for some uses. Let linters worry about flagging that as a violation. The interpreter should be for consenting adults.

There is plenty of code that we can already write that might raise a NameError or UnboundLocalError. This is not special enough to promote it to a syntax error.
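For what it's worth, that behaviour can be emulated today with a sentinel -- a minimal sketch (the sentinel and function names are mine, purely illustrative):

    _missing = object()

    def func(spam=_missing, eggs=_missing):
        # Emulates @spam=eggs+1, @eggs=spam-1 with left-to-right late
        # binding: works whenever at least one argument is passed.
        if spam is _missing:
            spam = eggs + 1   # fails here if eggs was also omitted
        if eggs is _missing:
            eggs = spam - 1
        return spam, eggs

    print(func(spam=10))  # (10, 9)
    print(func(eggs=10))  # (11, 10)
    # func() raises TypeError in this emulation; under the proposal it
    # would raise UnboundLocalError instead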
    def f5(x=>y + 1):
        global y
        y = 2
According to the previously-defined equivalencies, this would mean:
    def f5(x=None):
        if x is None:
            x = y + 1
        global y
        y = 2
Of course it would not mean that. That's a straw-man. You have deliberately written code which you know is illegal (now; it wasn't illegal just a few releases back).

Remember that "global y" is not an executable statement, it is a declaration; we can move the declaration anywhere we want to make the code legal. So it would be equivalent to:

    def f5(x=None):
        global y
        if x is None:
            x = y + 1
        y = 2

And it can still raise NameError if y is not defined. Caveat utilitor (let the user beware).

Parameters (and their defaults) are not written inside the function body, they are written in the function header, and the function header by definition must precede the body and any declarations inside it. We should not allow such an unimportant technicality to prevent late-bound defaults from using globals.

Remember that for two decades or so, global declarations could be placed anywhere in the function body. It is only recently that we have tightened that up with a rule that the declaration must occur before any use of a name inside the function body. We created that more restrictive rule by fiat; we can loosen it *for late-bound expressions* by fiat too: morally, global declarations inside the body are deemed to occur before the parameter defaults. Done and solved.

(I don't know why we decided on this odd rule that the global declaration has to occur before the usage of the variable, instead of just insisting that any globals be declared immediately after the function header and docstring. Oh well.)
That implies that a global statement *anywhere* in a function will also apply to the function header, despite it not otherwise being legal to refer to a name earlier in the function than the global statement.
Great minds think alike :-)

If it makes you happy, you could enforce a rule that the global has to occur after the docstring and before the function body, but honestly I'm not sure why we would bother.

Some more comments, which hopefully match your vision of the feature: if a late-bound default refers to a name -- and most of them will -- we should follow the same rules as we otherwise would, to the extent that makes sense. For example:

* If the name in the default expression matches a parameter, then it refers to the parameter, not the same name in the surrounding scope; parameters are always local to the function, so the name should be local to the function inside the default expression too.

* If the name in the default expression matches a local name in the body of the function, that is, one that we assign to and haven't declared as global or nonlocal, then the default expression should likewise treat it as local.

* If the name in the default matches a name in the function body that has been declared global or nonlocal, then treat it the same way in the default expression.

* Otherwise treat it as global/nonlocal/builtin.

(I think that covers all the cases.)

Do these scoping rules mean it is possible to write defaults that will fail at run time? Yes. So does the code we can write today. Don't worry about it. It is the coder's responsibility, not the interpreter's and not yours, to ensure that the code they write works.
And lastly, f5() assigns positional arguments first (there are none), then keyword arguments (still none), then early-bound defaults left to right (none of these either), then late-bound defaults left to right (x=y+1) which might raise NameError if global y doesn't exist, otherwise it will succeed.
It's interesting that you assume this. By any definition, the header is a reference prior to the global statement, which means the global statement would have to be hoisted. I think that's probably the correct behaviour, but it is a distinct change from the current situation.
See my comments above. What other possible meaning would make sense?

We can write a language with whatever restrictions we like:

    # Binding operations must follow
    # the I before E rule, unless after C
    self.increment = 1  # Syntax error because E occurs before I
    container = []      # Syntax error because I before E after C

but such a language would be no fun to use. Or possibly lots of fun, if you had a twisted mind *wink*

So yes, I looked at what the clear and obvious intention of the code was, and assumed that it should work. Executable pseudocode, remember? There's no need to restrict things just for the sake of restricting them.
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...
Sure. They behave differently because they are different. These are different too:

    # Block 1
    y = 2
    x = y + 1

    # Block 2
    x = y + 1
    y = 2
in that it changes what's viable and what's not. (Since you don't like the term "legal" here, I'll go with "viable", since a runtime exception isn't terribly useful.) Changing the default from y=2 to y=>2 would actually stop the example from working.
Um, yes? Changing the default from y=2 to y="two" will also stop it from working. Even if you swap the order of the parameters.
Multi-pass initialization makes sense where it's necessary. Is it really necessary here?
We already have multi-pass initialisation:

1. positional arguments are applied, left to right;
2. then keyword arguments;
3. then defaults are applied.

(It is, I think, an implementation detail whether 2 and 3 are literally two separate passes or whether they can be rolled into a single pass. There are probably many good ways to actually implement binding of arguments to parameters. But semantically, argument binding to parameters behaves as if it were multiple passes.)

Since the number of parameters is likely to be small (more likely 6 parameters than 6000), we shouldn't care about the cost of a second pass to fill in the late-bound defaults after all the early-bound defaults are done.
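A rough simulation of that binding order in plain Python (my own sketch of the semantics, not how CPython actually implements it):

    def bind(params, defaults, args, kwargs):
        # Toy binder: positional args, then keyword args, then defaults.
        ns = {}
        for name, value in zip(params, args):       # pass 1: positional
            ns[name] = value
        for name, value in kwargs.items():          # pass 2: keyword
            if name in ns:
                raise TypeError(f"multiple values for {name!r}")
            ns[name] = value
        for name, value in defaults.items():        # pass 3: defaults
            ns.setdefault(name, value)
        missing = [p for p in params if p not in ns]
        if missing:
            raise TypeError(f"missing arguments: {missing}")
        return ns

    # e.g. for def f(a, b=2, c=3): ...
    print(bind(["a", "b", "c"], {"b": 2, "c": 3}, (1,), {"c": 30}))
    # -> {'a': 1, 'c': 30, 'b': 2}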
No, you misunderstand. I am not saying that less-skilled programmers have to intuit things perfectly; I am saying that, when there are drastic differences of expectation, there is probably a problem.
I can easily explain "arguments are assigned left to right". It is much harder to explain multi-stage initialization and why different things can be referenced.
I disagree that it is much harder.

In any case, my fundamental model here is that if we can do something using pseudo-late binding (the "if arg is None" idiom), then it should (more or less) be possible using late-binding. We should be able to just move the expression from the body of the function to the parameter and in most cases it should work. Obviously some conditions apply:

- single expressions only, not a full block;
- exceptions may change (e.g. a TypeError from `None + 1` may turn into an UnboundLocalError, etc);
- not all cases will work, due to order of operations, but we should be able to get most cases to work.

Inside the body of a function, we can apply pseudo-late binding using the None idiom in any order we like. As late-binding parameters, we are limited to left-to-right. But we can get close to the (existing) status quo by ensuring that all early-bound defaults are applied before we start the late-bound defaults.

    # Status quo
    def function(arg, spam=None, eggs="something useful"):
        if spam is None:
            spam = process(eggs)

eggs is guaranteed to have a value here because the early-bound defaults are all assigned before the body of the function is entered. So in the new regime of late-binding, I want to write:

    def function(arg, @spam=process(eggs), eggs="something useful"):

and the call to process(eggs) should occur after the early-bound default is assigned. The easiest way to get that is to say that early-bound defaults are assigned in one pass, and late-bound in a second pass. Without that, many use cases for late-binding (I won't try to guess a proportion) are not going to translate to the new idiom.

-- Steve
On Tue, Oct 26, 2021 at 11:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...
Sure. They behave differently because they are different.
These are different too:
    # Block 1
    y = 2
    x = y + 1
    # Block 2
    x = y + 1
    y = 2
Yes, those ARE different. Those are more equivalent to changing the order of the parameters in the function signature, and I think we all agree that that DOES make a difference. The question is whether these could change meaning if you used a different type of assignment, such as:

    y := 2
    x = y + 1

Does that suddenly make it legal? I think you'll find that this sort of thing is rather surprising. And that's what we have here: changing from one form of argument default to another changes whether left-to-right applies or not. I don't want that. And based on an experiment with a less-experienced Python programmer (admittedly only a single data point), neither do other people. Left-to-right makes sense; multi-pass does not.
Multi-pass initialization makes sense where it's necessary. Is it really necessary here?
We already have multi-pass initialisation.
1. positional arguments are applied, left to right;
2. then keyword arguments;
3. then defaults are applied.
(It is, I think, an implementation detail whether 2 and 3 are literally two separate passes or whether they can be rolled into a single pass. There are probably many good ways to actually implement binding of arguments to parameters. But semantically, argument binding to parameters behaves as if it were multiple passes.)
Those aren't really multi-pass assignment though, because they could just as easily be assigned simultaneously. You can't, in Python code, determine which order the parameters were assigned. There are rules about how to map positional and keyword arguments to the names, but it would be just as logical to say:

1. Assign all defaults
2. Assign all keyword args, overwriting defaults
3. Assign positional args, overwriting defaults but not kwargs

and the net result would be exactly the same. But with anything that executes arbitrary Python code, it matters, and it matters what state the other values are in. So we have a few options:

a) Assign all early-evaluated defaults and explicitly-passed arguments, leaving others unbound; then process late-evaluated defaults one by one

b) Assign parameters one by one, left to right
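To make the difference concrete, here is how the two options play out for something like `def f(a=>b + 1, b=2)`, emulated with a sentinel in today's Python (the names are mine, purely illustrative):

    _unset = object()

    def f_option_a(a=_unset, b=2):
        # (a) passed-in values and early-bound defaults happen first;
        #     late-bound defaults run in a second pass, so b is bound.
        if a is _unset:
            a = b + 1
        return a, b

    def f_option_b(a=_unset, b=_unset):
        # (b) strict left-to-right: when a's default runs, b has not
        #     been assigned yet. (A truly one-by-one binding would not
        #     even have bound a passed-in b at that point; a sentinel
        #     emulation can't capture that nuance.)
        if a is _unset:
            if b is _unset:
                raise UnboundLocalError("b referenced before assignment")
            a = b + 1
        if b is _unset:
            b = 2
        return a, b

    print(f_option_a())  # (3, 2)
    # f_option_b() raises UnboundLocalError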
Since the number of parameters is likely to be small (more likely 6 parameters than 6000), we shouldn't care about the cost of a second pass to fill in the late-bound defaults after all the early-bound defaults are done.
I'm not concerned with performance, I'm concerned with semantics.
No, you misunderstand. I am not saying that less-skilled programmers have to intuit things perfectly; I am saying that, when there are drastic differences of expectation, there is probably a problem.
I can easily explain "arguments are assigned left to right". It is much harder to explain multi-stage initialization and why different things can be referenced.
I disagree that it is much harder.
In any case, my fundamental model here is that if we can do something using pseudo-late binding (the "if arg is None" idiom), then it should (more or less) be possible using late-binding.
We should be able to just move the expression from the body of the function to the parameter and in most cases it should work.
There are enough exceptions that this parallel won't really work, so I'd rather leave aside the parallel and just describe how argument defaults work. Yes, you can achieve the same effect in other ways, but you can't do a mechanical transformation and expect it to behave identically.
Inside the body of a function, we can apply pseudo-late binding using the None idiom in any order we like. As late-binding parameters, we are limited to left-to-right. But we can get close to the (existing) status quo by ensuring that all early-bound defaults are applied before we start the late-bound defaults.
    # Status quo
    def function(arg, spam=None, eggs="something useful"):
        if spam is None:
            spam = process(eggs)
eggs is guaranteed to have a result here because the early-bound defaults are all assigned before the body of the function is entered. So in the new regime of late-binding, I want to write:
def function(arg, @spam=process(eggs), eggs="something useful"):
and the call to process(eggs) should occur after the early bound default is assigned. The easiest way to get that is to say that early bound defaults are assigned in one pass, and late bound in a second pass.
Without that, many use cases for late-binding (I won't try to guess a proportion) are not going to translate to the new idiom.
Can you find some actual real-world cases where this is true? I was unable to find any examples where I didn't have to apologize for the contrivedness of them. Having an argument default depend on arguments that come after it seems very surprising, especially since they can't be passed positionally anyway; so it would only be a very narrow set of circumstances where this is a problem - if they're keyword-only args, they can be reordered into something more logical, thus solving the problem. I think you're far too caught up on equivalences that don't exist. ChrisA
On 26/10/2021 18:25, Chris Angelico wrote:

On Tue, Oct 26, 2021 at 11:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...

Sure. They behave differently because they are different.
These are different too:
    # Block 1
    y = 2
    x = y + 1
    # Block 2
    x = y + 1
    y = 2

Yes, those ARE different. Those are more equivalent to changing the order of the parameters in the function signature, and I think we all agree that that DOES make a difference. The question is whether these could change meaning if you used a different type of assignment, such as:
    y := 2
    x = y + 1
Does that suddenly make it legal? I think you'll find that this sort of thing is rather surprising. And that's what we have here: changing from one form of argument default to another changes whether left-to-right applies or not.
I don't want that. And based on an experiment with a less-experienced Python programmer (admittedly only a single data point), neither do other people. Left-to-right makes sense; multi-pass does not.

As I may be the data point in question: one of my posts seems to have got lost again, so I reproduce some of it (reworked).

What I DON'T want to see is allowing something like this to be legal:

    def f(a := b+1, b := e+1, c := a+1, d := 42, e := d+1):

If no arguments are passed, the interpreter has to work out to evaluate first d, then e, then b, then a, then finally c. If some arguments are passed, I guess the same order would work. But it feels ... messy. And obfuscated.

And if this is legal (note: it IS a legitimate use case):

    def DrawCircle(centre=(0,0), radius := circumference / TWO_PI, circumference := radius * TWO_PI):

the interpreter has to work out whether to evaluate the 2nd or 3rd arg first, depending on which is passed. AFAICS all this may need multiple passes through the args at runtime. Complicated, and inefficient. *If* it could all be sorted out at compile time, my objection would become weaker.

There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

Best wishes
Rob

PS Can I echo Guido's plea that people don't derail this PEP by trying to shoehorn deferred-evaluation-objects (or whatever you want to call them) into it? As Chris A says, that's a separate idea and should go into a separate PEP. If I need a screwdriver, I buy a screwdriver, not an expensive Swiss Army knife.
On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
On 26/10/2021 18:25, Chris Angelico wrote:
On Tue, Oct 26, 2021 at 11:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
Based on the multi-pass assignment model, which you still favour, those WOULD be quite inconsistent, and some of them would make little sense. It would also mean that there is a distinct semantic difference between:
    def f1(x=>y + 1, y=2): ...
    def f2(x=>y + 1, y=>2): ...

Sure. They behave differently because they are different.
These are different too:
    # Block 1
    y = 2
    x = y + 1
    # Block 2
    x = y + 1
    y = 2

Yes, those ARE different. Those are more equivalent to changing the order of the parameters in the function signature, and I think we all agree that that DOES make a difference. The question is whether these could change meaning if you used a different type of assignment, such as:
    y := 2
    x = y + 1
Does that suddenly make it legal? I think you'll find that this sort of thing is rather surprising. And that's what we have here: changing from one form of argument default to another changes whether left-to-right applies or not.
I don't want that. And based on an experiment with a less-experienced Python programmer (admittedly only a single data point), neither do other people. Left-to-right makes sense; multi-pass does not.

As I may be the data point in question:
You're not - I sat down with one of my brothers and led him towards the problem in question, watching his attempts to solve it. Then showed him what we were discussing, and asked his interpretations of it.
One of my posts seems to have got lost again, so I reproduce some of it (reworked): What I DON'T want to see is allowing something like this to be legal:

    def f(a := b+1, b := e+1, c := a+1, d := 42, e := d+1):

If no arguments are passed, the interpreter has to work out to evaluate first d, then e, then b, then a, then finally c. If some arguments are passed, I guess the same order would work. But it feels ... messy. And obfuscated.
At no point will the interpreter reorder things. There have only ever been two (or three) options seriously considered:

1) Assign all parameters from left to right, giving them either a passed-in value or a default

2) First assign all parameters that were passed in values, and those with early-bound defaults; then, in a separate left-to-right pass, assign all late-bound defaults

3) Same as one of the other two, but validated at compilation time and raising SyntaxError for out-of-order references
And if this is legal (note: it IS a legitimate use case):

    def DrawCircle(centre=(0,0), radius := circumference / TWO_PI, circumference := radius * TWO_PI):

the interpreter has to work out whether to evaluate the 2nd or 3rd arg first, depending on which is passed. AFAICS all this may need multiple passes through the args at runtime. Complicated, and inefficient. *If* it could all be sorted out at compile time, my objection would become weaker.
This is not as legit as you might think, since it runs into other problems. Having codependent arguments is not going to be solved by this proposal (for instance, what happens if you pass a radius of 3 and a circumference of 12?), so I'm not going to try to support narrow subsets of this that just happen to "work" (like the case where you only ever pass one of them).
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.
If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython). ChrisA
On 27/10/2021 00:50, Chris Angelico wrote:

On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython).
ChrisA

I guess I wasn't clear. I'm not sure if you understood what I meant to say, or not. First, I should have said "binding" rather than "evaluating". What I meant was that in cases like

    def f(a=earlydefault1, b:=latedefault2, c=earlydefault3, d:=latedefault4):

it makes no real difference if the bindings are done in the order a, c, b, d (early ones before late ones) or a, b, c, d (strict left-to-right), since binding an early default should have no side effects, so (I thought, wrongly) it could be an implementation detail.

Of course there IS a difference: it allows late default values to refer to subsequent early default values, e.g. in the example above `latedefault2` could refer to `c`. So yes, then that code would be legal on some interpreters and not others, as you said. If you understood exactly what I meant, I apologise.

Rob Cliffe
On Wed, Oct 27, 2021 at 12:50 PM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython).
ChrisA

I guess I wasn't clear. I'm not sure if you understood what I meant to say, or not. First, I should have said "binding" rather than "evaluating". What I meant was that in cases like

    def f(a=earlydefault1, b:=latedefault2, c=earlydefault3, d:=latedefault4):

it makes no real difference if the bindings are done in the order a, c, b, d (early ones before late ones) or a, b, c, d (strict left-to-right), since binding an early default should have no side effects, so (I thought, wrongly) it could be an implementation detail.

Of course there IS a difference: it allows late default values to refer to subsequent early default values, e.g. in the example above `latedefault2` could refer to `c`. So yes, then that code would be legal on some interpreters and not others, as you said. If you understood exactly what I meant, I apologise.
Yep, that's precisely the distinction that matters: whether it's legal to refer to parameters further to the right. If we consider tuple unpacking as an approximate parallel:

    def f():
        a = [10,20,30]
        i, a[i] = 1, "Hi"
On Wed, Oct 27, 2021 at 12:59 PM Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 27, 2021 at 12:50 PM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
On Wed, Oct 27, 2021 at 10:38 AM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option. AFAICS there would be little practical difference from straight left-to-right evaluation of defaults, since assigning an early-bound default should not have a side effect. So it could even be an implementation choice.

If it's an implementation choice, then it'll mean that code is legal on some interpreters and not others. Whether that's a major problem or not I don't know, but generally, when you can't depend on your code working on all interpreters, it's not best practice (eg "open(fn).read()" isn't recommended, even if it happens to close the file promptly in CPython).
ChrisA

I guess I wasn't clear. I'm not sure if you understood what I meant to say, or not. First, I should have said "binding" rather than "evaluating". What I meant was that in cases like

    def f(a=earlydefault1, b:=latedefault2, c=earlydefault3, d:=latedefault4):

it makes no real difference if the bindings are done in the order a, c, b, d (early ones before late ones) or a, b, c, d (strict left-to-right), since binding an early default should have no side effects, so (I thought, wrongly) it could be an implementation detail.

Of course there IS a difference: it allows late default values to refer to subsequent early default values, e.g. in the example above `latedefault2` could refer to `c`. So yes, then that code would be legal on some interpreters and not others, as you said. If you understood exactly what I meant, I apologise.
Yep, that's precisely the distinction that matters: whether it's legal to refer to parameters further to the right. If we consider tuple unpacking as an approximate parallel:
    def f():
        a = [10,20,30]
        i, a[i] = 1, "Hi"
Premature send, oops...

    def f():
        a = [10, 20, 30]
        i, a[i] = 1, "Hi"
        print(a)  # [10, 'Hi', 30]

It's perfectly valid to refer to something from earlier in the multiple assignment, because they're assigned left to right. Python doesn't start to look up the name 'a' until it's finished assigning to 'i'.

Since Python doesn't really have a concept of statics or calculated constants, we don't really have any parallel, but imagine that we could do this:

    def f():
        # Calculate this at function definition time and then save it
        # as a constant
        const pos = random.randrange(3)
        a = [10, 20, 30]
        i, a[i] = pos, "Hi"

This is something like what I'm describing. The exact value of an early-bound argument default gets calculated at definition time and saved; then it gets assigned to its corresponding parameter if one wasn't given.

(Actually, I'd really like Python to get something like this, as it'd completely replace the "random=random" optimization - there'd be no need to pollute the function's signature with something that exists solely for the optimization. It'd also make some of these kinds of things a bit easier to explain, since there would be a concept of def-time evaluation separate from the argument list. But we have what we have.)

So I think that we did indeed understand one another.

ChrisA
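(For readers who haven't seen it, the "random=random" trick referred to above is the old default-argument binding hack -- a minimal sketch of the idiom, with names of my own choosing:)

    import random

    def jitter(values, random=random.random):
        # The default binds random.random once, at definition time, so
        # each call does a fast local lookup instead of a global plus
        # attribute lookup. The parameter exists purely for this trick,
        # polluting the signature.
        return [v + random() for v in values]

    print(jitter([1.0, 2.0]))  # e.g. [1.53..., 2.07...]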
On Tue, Oct 26, 2021 at 4:46 PM Rob Cliffe via Python-ideas < python-ideas@python.org> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option.
How could that be avoided? By definition, early bound is evaluated "earlier" than late-bound :-)

Early-bound (i.e. regular) parameters are evaluated at function definition time. By the time we get to the late-bound ones, those are actual values, not expressions. The interpreter could notice that early-bound names are used in late-bound expressions and raise an error, but if not, there'd be no issue with when they were evaluated.

This could cause a bit of confusion with "getting" that it's not a simple left-to-right rule, but that's the same potential confusion with early vs late bound parameters anyway.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Oct 27, 2021 at 11:17 AM Christopher Barker <pythonchb@gmail.com> wrote:
On Tue, Oct 26, 2021 at 4:46 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
There has been support for evaluating all early-bound defaults before all late-bound defaults. I have been persuaded that this is a reasonable option.
How could that be avoided? by definition, early bound is evaluated "earlier" than late-bound :-)
Early-bound (i.e. regular) parameters are evaluated at function definition time. By the time we get to the late-bound ones, those are actual values, not expressions.
The interpreter could notice that early bound names are used in late-bound expressions and raise an error, but if not, there'd be no issue with when they were evaluated.
This could cause a bit of confusion with "getting" that it's not a simple left-to-right rule, but that's the same potential confusion with early vs late bound parameters anyway.
The question is whether code like this should work:

    def f(a=>b + 1, b=2): ...
    f()
    f(b=4)

Pure left-to-right assignment would raise UnboundLocalError in both cases. Tiered evaluation wouldn't. Are there any other places in Python where assignments aren't done left to right, but are done in two distinct phases?

ChrisA
On Tue, Oct 26, 2021 at 5:24 PM Chris Angelico <rosuav@gmail.com> wrote:
How could that be avoided? by definition, early bound is evaluated "earlier" than late-bound :-)
This could cause a bit of confusion with "getting" that it's not a simple left-to-right rule, but that's the same potential confusion with early vs late bound parameters anyway.
sorry, got tangled up between "evaluating" and "name binding"
The question is whether code like this should work:
def f(a=>b + 1, b=2): ...
    f()
    f(b=4)
Pure left-to-right assignment would raise UnboundLocalError in both cases. Tiered evaluation wouldn't.
Nice, simple example. I'm not a newbie, but my students are, and I think they'd find "tiered" evaluation really confusing.

Are there any other places in Python where assignments aren't done left to right, but are done in two distinct phases?
I sure can't think of one.

I've been thinking about this from the perspective of a teacher of Python. I'm not looking forward to having one more thing to teach about function definitions -- I struggle enough with covering all of the *args, **kwargs, keyword-only, positional-only options. Python used to be such a simple language, not so much anymore :-(

That being said, we currently have to teach, fairly early on, the consequences of using a mutable as a default value. And this PEP would make that easier to cover. But I think it's really important to keep the semantics as simple as possible, and left-to-right name binding is the way to do that.

(All this is complicated by the fact that there is a LOT of code and advice in the wild about the None idiom, but what can you do?)

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Oct 27, 2021 at 11:41 AM Christopher Barker <pythonchb@gmail.com> wrote:
I've been thinking about this from the perspective of a teacher of Python. I'm not looking forward to having one more thing to teach about function definitions -- I struggle enough with covering all of the *args, **kwargs, keyword-only, positional-only options.
Python used to be such a simple language, not so much anymore :-(
Yeah. My intention here is that this should be completely orthogonal to argument types (positional-only, positional-or-keyword, keyword-only), completely orthogonal to type hints (or, as I'm discovering as I read through the grammar, type comments), and as much else as possible. The new way will look very similar to the existing way of writing defaults, because it's still defining the default.
That being said, we currently have to teach, fairly early on, the consequences of using a mutable as a default value. And this PEP would make that easier to cover. But I think it's really important to keep the semantics as simple as possible, and left-to-right name binding is the way to do that.
Yes. It will now be possible to say "to construct a new list every time, write it like this" instead of drastically changing the style of code.

    def add_item(thing, target=[]):
        print("Oops, that uses a single shared default target")

    def add_item(thing, target=>[]):
        print("If you don't specify a target, it makes a new list")
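For completeness, the classic failure mode in today's Python that the second spelling would fix (runnable as-is):

    def add_item(thing, target=[]):
        # The list is created once, at definition time, and shared by
        # every call that omits `target`.
        target.append(thing)
        return target

    print(add_item(1))  # [1]
    print(add_item(2))  # [1, 2] -- same list as the previous call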
(all this complicated by the fact that there is a LOT of code and advice in the wild about the None idiom, but what can you do?)
And that's not all going away. There will always be some situations where you can't define the default with an expression. The idea that a parameter is optional, but doesn't have a value, may itself be worth exploring (maybe some way to say "arg=pass" and then have an operator "if unset arg:" that returns True if it's unbound), but that's for another day. ChrisA
On 27/10/2021 01:47, Chris Angelico wrote:
The idea that a parameter is optional, but doesn't have a value, may itself be worth exploring (maybe some way to say "arg=pass" and then have an operator "if unset arg:" that returns True if it's unbound), but that's for another day.
ChrisA

Indeed. And it could be useful to know if a parameter was passed a value or given the default value. Python has very comprehensive introspection abilities, but this is a (small) gap.

Rob Cliffe
On Tue, Oct 26, 2021, 9:54 PM Rob Cliffe via Python-ideas < python-ideas@python.org> wrote:
On 27/10/2021 01:47, Chris Angelico wrote:
The idea that a parameter is optional, but doesn't have a value, may itself be worth exploring (maybe some way to say "arg=pass" and then have an operator "if unset arg:" that returns True if it's unbound), but that's for another day.
ChrisA

Indeed. And it could be useful to know if a parameter was passed a value or given the default value. Python has very comprehensive introspection abilities, but this is a (small) gap.

Rob Cliffe
I'll try to summarize why I still have pause, even though after thinking about it I still can't really think of a solid example showing that this "give me the default" issue is a concrete problem.

Until this point, exactly how to provide a default argument value has been the Wild West in Python, and as Paul Moore said, there's really not a way to introspect whether a parameter was "defaulted". The result is that a cornucopia of APIs have flourished. By necessity, all these previous APIs provided ways to ask for the default through passing some special value, and this has felt like "the pythonic way" for a long time. We are all used to it (but perhaps only tolerated it in many cases).

The proposal blesses a new API with language support, and it will suddenly become THE pythonic approach. But this newly blessed, pythonic API is a radical departure from years - decades - of coding habits. And so even though I like the proposal, I'm just concerned it could be a little bit more painful than at first glance. So it just seems like some version of these concerns belongs in the PEP.

Thanks Chris A for putting up with what isn't much more than a hunch (at least on my part) and I'll say nothing more about it. Carry on.
On Wed, Oct 27, 2021 at 1:13 PM Ricky Teachey <ricky@teachey.org> wrote:
I'll try to summarize why I still have pause even though after thinking about it I still can't really think of a solid example showing that this "give me the default" issue is a concrete problem:
Until this point, exactly how to provide a default argument value has been the Wild West in Python, and as Paul Moore said, there's really not a way to introspect whether a parameter was "defaulted". The result is that a cornucopia of APIs have flourished. By necessity, all these previous APIs provided ways to ask for the default through passing some special value, and this has felt like "the pythonic way" for a long time. We are all used to it (but perhaps only tolerated it in many cases).
The proposal blesses a new API with language support, and it will suddenly become THE pythonic approach. But this newly blessed, pythonic API is a radical departure from years- decades- of coding habits.
That's a very good point, but I'd like to turn it on its head and explain the situation from the opposite perspective.

For years - decades - Python has lacked a way to express the concept that a default argument is something more than a simple value. Coders have used a variety of workarounds. In the future, most or all of those workarounds will no longer be necessary, and we will have a single obvious way to do things.

Suppose that, up until today, Python had not had support for big integers - that the only numbers representable were those that fit inside a 32-bit (for compatibility, of course) two's complement integer. People who need to work with larger numbers would use a variety of tricks: storing digits in strings and performing arithmetic manually, or using a tuple of integers to represent a larger number, or using floats and accepting a loss of precision. Then Python gains native support for big integers. Yes, it would be a radical departure from years of workarounds. Yes, some people would continue to use the other methods, because there would be enough differences to warrant it. And yes, there would be backward-incompatible API changes as edge cases get cleaned up. Is it worth it? For integers, I can respond with a resounding YES, because we have plenty of evidence that they are immensely valuable! With default argument expressions, it's less of an obvious must-have, but I do believe that the benefits outweigh the costs.

Will the standard library immediately remove all "=None" workaround defaults? No. Probably some of them, but not all. Will there be breakage as a result of something passing None where it wanted the default? Probably - if not in the stdlib, then I am sure it'll happen in third-party code. Will future code be more readable as a result? Absolutely.

ChrisA
On Tue, Oct 26, 2021 at 10:44 PM Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 27, 2021 at 1:13 PM Ricky Teachey <ricky@teachey.org> wrote:
I'll try to summarize why I still have pause even though after thinking
about it I still can't really think of a solid example showing that this "give me the default" issue is a concrete problem:
Until this point, exactly how to provide a default argument value has
been the Wild West in Python, and as Paul Moore said, there's really not a way to introspect whether a parameter was "defaulted". The result is that a cornucopia of APIs have flourished. By necessity, all these previous APIs provided ways to ask for the default through passing some special value, and this has felt like "the pythonic way" for a long time. We are all used to it (but perhaps only tolerated it in many cases).
The proposal blesses a new API with language support, and it will
suddenly become THE pythonic approach. But this newly blessed, pythonic API is a radical departure from years- decades- of coding habits.
That's a very good point, but I'd like to turn it on its head and explain the situation from the opposite perspective.
For years - decades - Python has lacked a way to express the concept that a default argument is something more than a simple value. Coders have used a variety of workarounds. In the future, most or all of those workarounds will no longer be necessary, and we will have a single obvious way to do things.
Suppose that, up until today, Python had not had support for big integers - that the only numbers representable were those that fit inside a 32-bit (for compatibility, of course) two's complement integer. People who need to work with larger numbers would use a variety of tricks: storing digits in strings and performing arithmetic manually, or using a tuple of integers to represent a larger number, or using floats and accepting a loss of precision. Then Python gains native support for big integers. Yes, it would be a radical departure from years of workarounds. Yes, some people would continue to use the other methods, because there would be enough differences to warrant it. And yes, there would be backward-incompatible API changes as edge cases get cleaned up. Is it worth it? For integers, I can respond with a resounding YES, because we have plenty of evidence that they are immensely valuable! With default argument expressions, it's less of an obvious must-have, but I do believe that the benefits outweigh the costs.
Will the standard library immediately remove all "=None" workaround defaults? No. Probably some of them, but not all. Will there be breakage as a result of something passing None where it wanted the default? Probably - if not in the stdlib, then I am sure it'll happen in third-party code. Will future code be more readable as a result? Absolutely.
ChrisA
If I might paraphrase Agrippa from the book of Acts: "Almost thou persuadest me..." ;) It's a fine answer.

Thanks for the attentiveness to my concern, Chris. Very much appreciated!

--- Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Wed, Oct 27, 2021 at 1:52 PM Ricky Teachey <ricky@teachey.org> wrote:
If I might paraphrase Agrippa from the book of Acts: "Almost thou persuadest me..." ;) It's a fine answer.
Thanks for the attentiveness to my concern, Chris. Very much appreciated!
My pleasure. This has been a fairly productive discussion thread, and highly informative; thank you for being a part of that! Now, if I could just figure out what's going on in the grammar... (Actually, the PEG parser is a lot easier to understand than Python's older grammar was. But this is my first time messing with it, so I'm going through all the stages of brand-new discovery.) ChrisA
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Wed, Oct 27, 2021 at 1:15 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
    help(bisect.bisect)

    bisect_right(a, x, lo=0, hi=None, *, key=None)
        ...
        Optional args lo (default 0) and hi (default len(a)) bound the
        slice of a to be searched.

In the docstring, both lo and hi are given useful, meaningful defaults. In the machine-readable signature, which is also what would be used for tab completion or other tools, lo gets a very useful default, but hi gets a default of None. For the key argument, None makes a very meaningful default; it means that no transformation is done. But in the case of hi, the default really and truly is len(a), but because of a technical limitation, that can't be written that way.

Suppose this function were written as:

    bisect_right(a, x, lo=None, hi=None, *, key=None)

Do we really need the ability to specify actual function defaults? I mean, you can just have a single line of code "if lo is None: lo = 0" and it'd be fine. Why do we need the ability to put "lo=0" in the signature? I put it to you that the same justification should be valid for hi.

ChrisA
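To make the technical limitation concrete, here is roughly how today's workaround looks in the function body -- a simplified sketch of the idiom, not the stdlib's exact code:

    def bisect_right(a, x, lo=0, hi=None, *, key=None):
        # The true default, len(a), cannot be written in the signature,
        # so it is reconstructed from the None sentinel in the body.
        if hi is None:
            hi = len(a)
        while lo < hi:
            mid = (lo + hi) // 2
            v = a[mid] if key is None else key(a[mid])
            if x < v:
                hi = mid
            else:
                lo = mid + 1
        return lo

    print(bisect_right([1, 3, 3, 7], 3))  # 3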
On 2021-10-27 at 13:47:31 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 27, 2021 at 1:15 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
help(bisect.bisect)
    bisect_right(a, x, lo=0, hi=None, *, key=None)
        ...
        Optional args lo (default 0) and hi (default len(a)) bound the
        slice of a to be searched.
[...]
Suppose this function were written as:
bisect_right(a, x, lo=None, hi=None, *, key=None)
Do we really need the ability to specify actual function defaults? I mean, you can just have a single line of code "if lo is None: lo = 0" and it'd be fine. Why do we need the ability to put "lo=0" in the signature? I put it to you that the same justification should be valid for hi.
I agree. Why do we need the ability to put "lo=0" in the signature? If we have to rewrite parts of bisect_right for late binding anyway, then why not go all out? If you want to bisect a slice, then bisect a slice:

    result = bisect_right(a[lo:hi], x)

Oh, wait, what if a has billions of elements and creating a slice containing a million or two out of the middle is too expensive? Then provide two functions:

    def bisect_right(a, x, key=None):
        return bisect_slice_right(a, x, 0, len(a), key)

    def bisect_slice_right(a, x, lo, hi, key=None):
        "actual guts of bisect function go here"
        "don't actually slice a; that might be really expensive"

No extraneous logic to (re)compute default values at all. Probably fewer/simpler unit tests, too, but that might depend on the programmer or other organizational considerations. On the other side, parts of doc strings may end up being duplicated, and flat is better than nested.

(No, I don't know the backwards compatibility issues that might arise from re-writing bisect_right in this manner, nor do I know what the options are to satisfy IDEs and/or users thereof.)

Running with Brendan Barnwell's point, is there a lot of code that gets a lot simpler (and who gets to define "simpler"?) with late binding default values? Can that same code achieve the same outcome by being refactored the same way as I did bisect_right? I'm willing to accept that my revised bisect_right is horrible by some reasonably objective standard, too.
a bit OT: If you want to bisect a slice, then bisect a slice:
result = bisect_right(a[lo:hi], x)
Oh, wait, what if a has billions of elements and creating a slice containing a million or two out of the middle is too expensive?
Yup, that's why I proposed about a year ago on this list that there should be a way to get a slice view (or slice iterator, AKA islice) easily :-) You can use itertools.islice for this though -- oops, no you can't:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-36-9efe49979a8e> in <module>
    ----> 1 result = bisect.bisect_right(islice(a, lo, hi), x)
    TypeError: object of type 'itertools.islice' has no len()

We really do need a slice view :-)

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
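PS a minimal sketch of such a view, enough for bisect to consume (my own illustration, not an existing stdlib class):

    import bisect

    class SliceView:
        # Read-only view of seq[start:stop] without copying.
        def __init__(self, seq, start, stop):
            self.seq = seq
            self.start = start
            self.stop = min(stop, len(seq))
        def __len__(self):
            return max(0, self.stop - self.start)
        def __getitem__(self, i):
            if not 0 <= i < len(self):
                raise IndexError(i)
            return self.seq[self.start + i]

    a = list(range(100))
    print(bisect.bisect_right(SliceView(a, 10, 20), 15))  # 6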
On 2021-10-26 19:47, Chris Angelico wrote:
help(bisect.bisect)
    bisect_right(a, x, lo=0, hi=None, *, key=None)
        ...
        Optional args lo (default 0) and hi (default len(a)) bound the
        slice of a to be searched.
In the docstring, both lo and hi are given useful, meaningful defaults. In the machine-readable signature, which is also what would be used for tab completion or other tools, lo gets a very useful default, but hi gets a default of None.
How would tab completion work with this new feature? How could a late-bound default be tab-completed? Usually when I've used tab-completion with function arguments it's only completing the argument name, not the default value.
For the key argument, None makes a very meaningful default; it means that no transformation is done. But in the case of hi, the default really and truly is len(a), but because of a technical limitation, that can't be written that way.
Here is some code:

    def foo():
        for a in x:
            print(a)
        for b in x:
            print(b)

    other_func(foo)

Due to a technical limitation, I cannot write foo as a lambda to pass it inline as an argument to other_func. Is this also a problem? If so, should we develop the ever-whispered-about multiline lambda syntax?

Yes, there are some kinds of default behaviors that cannot be syntactically written inline in the function signature. I don't see that as a huge problem. The fact that you want to write `len(a)` in such a case is not, for me, sufficient justification for all the other cans of worms opened by this proposal (as seen in various subthreads on this list), to say nothing of the additional burden on learners or on readers of code who now must add this to their list of things they have to grok when reading code.
Suppose this function were written as:
bisect_right(a, x, lo=None, hi=None, *, key=None)
Do we really need the ability to specify actual function defaults? I mean, you can just have a single line of code "if lo is None: lo = 0" and it'd be fine. Why do we need the ability to put "lo=0" in the signature? I put it to you that the same justification should be valid for hi.
I understand what you're saying, but I just don't agree.

For one thing, functions are written once and called many times. Normal default arguments provide a major leverage effect: a single default argument value can simplify many calls to the function from many call sites, because you can (potentially) omit certain arguments at every call. Late-bound defaults provide no additional simplicity at the call sites; all they do is permit a more terse spelling at the function definition site. Since the definition only has to be written once, a bit more verbosity there is not so much of a burden.

Also, the attempt to squeeze both kinds of defaults into the signature breaks the clear and simple rule we have, which is that the signature line is completely evaluated right away (i.e., it is part of the enclosing scope). Right now we have a clean separation between the function definition and the body, which this proposal will muddy quite profoundly. The problem is exacerbated by the proposed syntaxes, nearly all of which I find ugly to varying degrees. But I think even with the best syntax, the underlying problem remains that switching back and forth between definition-scope and body-scope within the signature is confusing.

Finally, I think a big problem with the proposal for me is that it really only targets what I see as a quite special case, which is late-bound expressions that are small enough to readably fit in the argument list. All of the arguments (har har) about readability go out the window if people start putting anything complex in there. Similarly, if there is any more complex logic required (such as combinations of arguments whose defaults are interdependent in nontrivial ways) that cannot easily be expressed in separable argument defaults, we're still going to have to do it in the function body anyway. So we are adding an entirely new complication to the basic argument syntax just to handle (what I see as) a quite narrow range of expressions.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On 27/10/2021 03:12, Brendan Barnwell wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
While I disagree with you re this particular feature, I sympathise with the general sentiment. Perhaps the first objective of Python 4 should be to get rid of as many features as possible. Just like when you're forced to throw out all your precious old junk (school reports, prizes, the present from Aunt Maud, theatre programmes, books you never read, clothes you never wear .....).

Nah, who am I kidding? Each feature will have its band of devotees that will defend it TO THE DEATH! Of course, what it should REALLY have are all MY favorite features, including some that haven't been added yet.😁

Rob Cliffe
On 2021-10-26 19:50, Rob Cliffe via Python-ideas wrote:
On 27/10/2021 03:12, Brendan Barnwell wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
While I disagree with you re this particular feature, I sympathise with the general sentiment. Perhaps the first objective of Python 4 should be to get rid of as many features as possible. Just like when you're forced to throw out all your precious old junk (school reports, prizes, the present from Aunt Maud, theatre programmes, books you never read, clothes you never wear .....).
Nah, who am I kidding? Each feature will have its band of devotees that will defend it TO THE DEATH! Of course, what it should REALLY have are all MY favorite features, including some that haven't been added yet.😁 Rob Cliffe
Now you're talking! 100% agree! Assuming of course that by "MY favorite features" you mean, well, MY favorite features. . . :-) -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Wed, Oct 27, 2021 at 1:52 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
On 27/10/2021 03:12, Brendan Barnwell wrote:
On 2021-10-26 17:41, Christopher Barker wrote:
Python used to be such a simple language, not so much anymore :-(
I quite agree, and I feel like this is my biggest reason why I don't want this "feature" (or any of another gazillion features that have been suggested and/or accepted here, including the walrus). The benefit of this PEP simply does not justify the added complexity of the language as a whole. Using None (or some sentinel) and including code to check for it works fine. It's not worth it to add a whole new layer of behavior to something as basic as argument-passing just to avoid having to type `if arg is None`.
While I disagree with you re this particular feature, I sympathise with the general sentiment. Perhaps the first objective of Python 4 should be to get rid of as many features as possible. Just like when you're forced to throw out all your precious old junk (school reports, prizes, the present from Aunt Maud, theatre programmes, books you never read, clothes you never wear .....).
Nah, who am I kidding? Each feature will have its band of devotees that will defend it TO THE DEATH! Of course, what it should REALLY have are all MY favorite features, including some that haven't been added yet.😁
One truism of language design is that the simpler the language is (and the easier to explain to a novice), the harder it is to actually use. For instance, we don't *need* async/await, or generators, or list comprehensions, or for loops, or any of those other tools for processing partial data; all we really need is a class with the appropriate state management. And there are people who genuinely prefer coding a state machine to writing a generator function. No problem! You're welcome to. But the language is richer for having these tools, and we can more easily express our logic using them.

Each feature adds to the complexity of the language, but if they are made as orthogonal as possible, they generally add linearly. But they add exponentially to the expressiveness. Which ultimately means that orthogonality is the greatest feature in language design; it allows you to comprehend features one by one, and build up your mental picture of the code in simple ways, while still having the full power that it offers.

As an example of orthogonality, Python's current argument defaults don't care whether you're working with integers, strings, lists, or anything else. They always behave the same way: using one specific value (object) to be the value given if one is not provided. They're also completely orthogonal with argument passing styles (positional and keyword), and which ones are valid for which parameters. And orthogonal again with type annotations and type comments. All these features go in the 'def' statement - or 'lambda', which has access to nearly all the same features (it can't have type comments, but everything else works) - but you don't have to worry about exponential complexity, because there's no conflicts between them.

One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.

(It's actually quite fascinating how language design and game design work in parallel. The challenges in making a game fair, balanced, and fun are very similar to the challenges in making a language usable, elegant, and clean. I guess it's because, ultimately, both design challenges are about the humans who'll use the thing, and humans are still humans.)

ChrisA
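To make the orthogonality point above concrete, here is a quick sketch in current Python (the function and its names are illustrative only): positional-only markers, keyword-only arguments, annotations, and defaults all combine freely in one header without interfering with each other.

def find(seq, /, target, *, lo: int = 0, key=None) -> int:
    # positional-only, keyword-only, annotations and defaults coexist
    for i in range(lo, len(seq)):
        item = seq[i] if key is None else key(seq[i])
        if item == target:
            return i
    return -1

find([3, 1, 4, 1, 5], 1, lo=2)  # -> 3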
On 2021-10-26 20:15, Chris Angelico wrote:
One truism of language design is that the simpler the language is (and the easier to explain to a novice), the harder it is to actually use. For instance, we don't *need* async/await, or generators, or list comprehensions, or for loops, or any of those other tools for processing partial data; all we really need is a class with the appropriate state management. And there are people who genuinely prefer coding a state machine to writing a generator function. No problem! You're welcome to. But the language is richer for having these tools, and we can more easily express our logic using them.
Each feature adds to the complexity of the language, but if they are made as orthogonal as possible, they generally add linearly. But they add exponentially to the expressiveness. Which ultimately means that orthogonality is the greatest feature in language design; it allows you to comprehend features one by one, and build up your mental picture of the code in simple ways, while still having the full power that it offers.
These are fascinating and great points, but again I see the issues slightly differently. I wouldn't agree that "the simpler the language the harder it is to use". That to me implies an equivalence with "the more complex the language the easier it is to use", which hopefully we agree is untrue. Rather, both extremely simple AND extremely complex languages are difficult to use. The goal is a "sweet spot" in which you add complexity in just the right areas to achieve maximum expressibility with minimum cognitive load. And I think Python overall has done a good job of this, better than pretty much any other language I know of. And I think part of that good design has involved not "sweating the small stuff" in the sense of trying to add lots of special syntax to handle every mildly annoying pain point, but instead focusing on a relatively small number of well-chosen building blocks (iteration, for instance) and providing a clean combinatoric framework to facilitate the exponential expressiveness you describe.
As an example of orthogonality, Python's current argument defaults don't care whether you're working with integers, strings, lists, or anything else. They always behave the same way: using one specific value (object) to be the value given if one is not provided. They're also completely orthogonal with argument passing styles (positional and keyword), and which ones are valid for which parameters. And orthogonal again with type annotations and type comments. All these features go in the 'def' statement - or 'lambda', which has access to nearly all the same features (it can't have type comments, but everything else works) - but you don't have to worry about exponential complexity, because there's no conflicts between them.
One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.
So what I would say here is that argument-passing styles are in the sweet spot and argument binding time is not. One reason (as I mentioned in another post on this thread) is that argument-passing occurs at every call site, so simplifying it has the kind of exponential benefit you describe. But argument binding only happens once, when the function is defined, so its benefits scale with how many functions you write, not how many calls you make. Nonetheless two different argument-binding syntaxes still impose a cognitive burden, since now to be able to read Python code a person has to be able to understand both syntaxes instead of just one. The combination of "benefit scales only with the number of definitions" and "have to know two syntaxes all the time", for me, makes the cost/benefit ratio not pencil out.

There are also other issues, such as the difficulty of finding a good syntax. I think it's typical of Pythonic style to avoid cramming too much into too small a space. If we have a long expression, we can break it up into pieces assigned to separate variables. If we have a long function, we can break it up into separate functions. But there is no straightforward way to "break up" a single function's signature into multiple signatures (while still having just one function). This means that trying to introduce late-binding logic into the signature requires us to cram it into this uniquely restricted space. And this in turn means that a lot is riding on the choice of a syntax that is visually optimal, because there isn't going to be any way to "factor it out" into something more readable. (Other than, of course, the solution we currently have, which is to put the logic in the function body.)

Also there is the question of how this new orthogonal dimension relates to existing dimensions (i.e., is it "linearly independent"). Right now I think we have a pretty simple rule: if you need complicated logic to express as an early-bound default, you move that logic before the function, assign whatever you need to a variable, and then use that variable as the default. If on the other hand you need complicated logic to express a late-bound default, you pick some sentinel (often None) to act as an early-bound placeholder, and move the logic into the function body.

Now it's true that we have asymmetry, in that SIMPLE logic can be readably inlined as an early-bound default, whereas even simple logic cannot be inlined as a late-bound default because there is no inline way to express late-bound defaults. But I still think it's worth noticing that the new syntax is not going to add a "full dimension". It is only going to be useful for VERY SIMPLE late-bound default values. If the late-bound default is too complicated, inlining it will backfire and make things LESS readable. So although the syntax is orthogonal, it is not really separating out early and late default binding; it is only separating "simple late default binding".

Finally, even if the change is orthogonal to existing dimensions of function-calling, there is still a cost to adding this new dimension in the first place (i.e., one more thing to know). And there still needs to be a decision made about whether the cognitive burden of adding that dimension is worth the gain in expressiveness. Of course there is a lot of subjectivity here and it seems I am in the minority, but to me the ability to concisely express short, simple expressions for late-binding defaults doesn't fall in that sweet spot.
-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Wed, Oct 27, 2021 at 5:14 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
Now it's true that we have asymmetry, in that SIMPLE logic can be readably inlined as an early-bound default, whereas even simple logic cannot be inlined as a late-bound default because there is no inline way to express late-bound defaults. But I still think it's worth noticing that the new syntax is not going to add a "full dimension". It is only going to be useful for VERY SIMPLE late-bound default values. If the late-bound default is too complicated, inlining it will backfire and make things LESS readable. So although the syntax is orthogonal, it is not really separating out early and late default binding; it is only separating "simple late default binding".
And that right there is the crux of it. The simple cases ARE quite common. Of course there will always be cases that don't work that way, so yes, there will always be the need to use sentinels and put the logic inside the function; but there are plenty where code will benefit from putting the default into the signature, just as we already have with early-bound defaults.

It's easy to argue against a feature by showing that it can be abused. For instance, I could rewrite your def function thus:

def foo():
    for a in x: print(a)
    for b in x: print(b)

other_func(lambda: [print(a) for lst in (x, x) for a in lst].append(0))

Tada! I've worked around a technical limitation. Is this good code? No. Would some code benefit from a multi-line lambda function? Definitely.

Workarounds can be horrifically clunky, and then they provide a strong incentive to do things better. Or they can be fairly insignificant, which provides a much weaker incentive. (In this case, it's probably fine to just use def!) But they're still workarounds, and there is always benefit to being able to express your logic without having to fight the language's limitations.

ChrisA
On 27/10/2021 08:56, Chris Angelico wrote:
On Wed, Oct 27, 2021 at 5:14 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
But I still think it's worth noticing that the new syntax is not going to add a "full dimension". It is only going to be useful for VERY SIMPLE late-bound default values. If the late-bound default is too complicated, inlining it will backfire and make things LESS readable. So although the syntax is orthogonal, it is not really separating out early and late default binding; it is only separating "simple late default binding".

And that right there is the crux of it. The simple cases ARE quite common. Of course there will always be cases that don't work that way, so yes, there will always be the need to use sentinels and put the logic inside the function; but there are plenty where code will benefit from putting the default into the signature, just as we already have with early-bound defaults.
It's easy to argue against a feature by showing that it can be abused.
+1. The same argument (Brendan's) could be used against having e.g. list comprehensions. Rob Cliffe
On 2021-10-26 20:15, Chris Angelico wrote:
One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.
Another point that I forgot to mention when replying to this before:

You are phrasing this in terms of orthogonality in argument-passing. But why think of it that way? If we think of it in terms of expression evaluation, your proposal is quite non-orthogonal, because you're essentially creating a very limited form of deferred evaluation that works only in function arguments. In a function argument, people will be able to do `x=>[]`, but they won't be able to do that anywhere else. So you're creating a "mode" for deferred evaluation.

This is why I don't get why you seem so resistant to the idea of a more general deferred evaluation approach to this problem. Generalizing deferred evaluation somehow would make the proposal MORE orthogonal to other features, because it would mean you could use a deferred expression as an argument in the same way you could use it in other places.

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 9:17 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-26 20:15, Chris Angelico wrote:
One of my goals here is to ensure that the distinction between early-bound and late-bound argument defaults is, again, orthogonal with everything else. You should be able to change "x=None" to "x=>[]" without having to wonder whether you're stopping yourself from adding a type annotation in the future. This is why I'm strongly inclined to syntaxes that adorn the equals sign, rather than those which put tokens elsewhere (eg "@x=[]"), because it keeps the default part self-contained.
Another point that I forgot to mention when replying to this before:
You are phrasing this in terms of orthogonality in argument-passing. But why think of it that way? If we think of it in terms of expression evaluation, your proposal is quite non-orthogonal, because you're essentially creating a very limited form of deferred evaluation that works only in function arguments. In a function argument, people will be able to do `x=>[]`, but they won't be able to do that anywhere else. So you're creating a "mode" for deferred evaluation.
This is why I don't get why you seem so resistant to the idea of a more general deferred evaluation approach to this problem. Generalizing deferred evaluation somehow would make the proposal MORE orthogonal to other features, because it would mean you could use a deferred expression as an argument in the same way you could use it in other places.
Please expand on this. How would you provide an expression that gets evaluated in *someone else's context*? The way I've built it, the expression is written and compiled in the context that it will run in. The code for the default expression is part of the function that it serves. If I were to generalize this in any way, it would be to separate two parts: "optional parameter, if omitted, leave unbound" and "if local is unbound: do something". Not to "here's an expression, go evaluate it later", which requires a lot more compiler help. ChrisA
On 10/26/2021 7:38 PM, Rob Cliffe via Python-ideas wrote:
PS Can I echo Guido's plea that people don't derail this PEP by trying to shoehorn deferred-evaluation-objects (or whatever you want to call them) into it? As Chris A says, that's a separate idea and should go into a separate PEP. If I need a screwdriver, I buy a screwdriver, not an expensive Swiss Army knife.
As I've said before, I disagree with this. If you're going to introduce this feature, you need some way of building an inspect.Signature object that refers to the code to be executed. My concern is that if we add something that has deferred evaluation of code, but we don't think of how it might interact with other future uses of deferred evaluation, we might not be able to merge the two ideas.

Maybe there's something that could be factored out of PEP 649 (Deferred Evaluation Of Annotations Using Descriptors) that could be used with PEP 671?

That said, I'm still -1 on PEP 671.

Eric
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations." At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features. On Sat, Oct 30, 2021, 3:57 PM Eric V. Smith <eric@trueblade.com> wrote:
On 10/26/2021 7:38 PM, Rob Cliffe via Python-ideas wrote:
PS Can I echo Guido's plea that people don't derail this PEP by trying to shoehorn deferred-evaluation-objects (or whatever you want to call them) into it? As Chris A says, that's a separate idea and should go into a separate PEP. If I need a screwdriver, I buy a screwdriver, not an expensive Swiss Army knife.
As I've said before, I disagree with this. If you're going to introduce this feature, you need some way of building an inspect.Signature object that refers to the code to be executed. My concern is that if we add something that has deferred evaluation of code, but we don't think of how it might interact with other future uses of deferred evaluation, we might not be able to merge the two ideas.
Maybe there's something that could be factored out of PEP 649 (Deferred Evaluation Of Annotations Using Descriptors) that could be used with PEP 671?
That said, I'm still -1 on PEP 671.
Eric
On 2021-10-30 15:07, David Mertz, Ph.D. wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
I'm not sure I'm -100, but still a hard -1, maybe -10. I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions. That doesn't make any sense. If we have a way to create deferred expressions we should try to make them more generally usable. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sat, 30 Oct 2021 at 23:13, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 15:07, David Mertz, Ph.D. wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
I'm not sure I'm -100, but still a hard -1, maybe -10.
I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions. That doesn't make any sense. If we have a way to create deferred expressions we should try to make them more generally usable.
I was in favour of the idea, but having seen the implications I'm now -0.5, moving towards -1. I'm uncomfortable with *not* having a "proper" mechanism for building signature objects and other introspection (I don't consider having the expression as a string and requiring consumers to eval it, to be "proper"). And so, I think the implication is that this feature would need some sort of real deferred expression to work properly - and I'd rather deferred expressions were defined as a standalone mechanism, where the full range of use cases (including, but not limited to, late-bound defaults!) can be considered. Paul
On Sun, Oct 31, 2021 at 9:20 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Sat, 30 Oct 2021 at 23:13, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 15:07, David Mertz, Ph.D. wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
I'm not sure I'm -100, but still a hard -1, maybe -10.
I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions. That doesn't make any sense. If we have a way to create deferred expressions we should try to make them more generally usable.
I was in favour of the idea, but having seen the implications I'm now -0.5, moving towards -1. I'm uncomfortable with *not* having a "proper" mechanism for building signature objects and other introspection (I don't consider having the expression as a string and requiring consumers to eval it, to be "proper"). And so, I think the implication is that this feature would need some sort of real deferred expression to work properly - and I'd rather deferred expressions were defined as a standalone mechanism, where the full range of use cases (including, but not limited to, late-bound defaults!) can be considered.
Bear in mind that the status quo is, quite honestly, a form of white lie. In the example of bisect:

def bisect(a, hi=None): ...
def bisect(a, hi=>len(a)): ...

neither form of the signature will actually say that the default value is the length of a. In fact, I have never said that the consumer should eval it. There is fundamentally no way to determine the true default value for hi without first figuring out what a is.

So which is better: to have the value None, or to have a marker saying "this will be calculated later, and here's a human-readable description: len(a)"? I know that status quo wins a stalemate, but you're holding the new feature to a FAR higher bar than current idioms.

ChrisA
On 2021-10-30 15:35, Chris Angelico wrote:
Bear in mind that the status quo is, quite honestly, a form of white lie. In the example of bisect:
def bisect(a, hi=None): ...
def bisect(a, hi=>len(a)): ...
neither form of the signature will actually say that the default value is the length of a. In fact, I have never said that the consumer should eval it. There is fundamentally no way to determine the true default value for hi without first figuring out what a is.
The way you use the term "default value" doesn't quite work for me. More and more I think that part of the issue here is that "hi=>len(a)" we aren't providing a default value at all. What we're providing is default *code*. To me a "value" is something that you can assign to a variable, and current (aka "early bound") argument defaults are that. But these new late-bound arguments aren't really default "values", they're code that is run under certain circumstances. If we want to make them values, we need to define "evaluate len(a) later" as some kind of first-class value.
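One way to see the distinction in current Python: the "default code" can be made a first-class value today only by wrapping it in a function object, something like this sketch (not the PEP's design; the helper name is made up):

def hi_default(a):      # the "default code", as an ordinary first-class object
    return len(a)

def bisect(a, hi=None):
    if hi is None:
        hi = hi_default(a)   # run the deferred expression at call time
    ...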
So which is better: to have the value None, or to have a marker saying "this will be calculated later, and here's a human-readable description: len(a)"?
Let me say that in a different way. . . :-)

Which is better, to have a marker saying "this will be calculated later", or to have a marker saying "this will be calculated later, and here's a human-readable description"?

My point is that None is already a marker. It's true it's not a special-purpose marker meaning "this will be calculated later", but I think in practice it is a marker that says "be sure to read the documentation to understand what passing None will do here".

Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.

So really the status quo is "you can already have the human-readable description but you have to type it in the docstring yourself". I don't see that as a big deal. So yes, the status quo is better, because it is not really any worse, and it avoids the complications that are arising in this thread (i.e., what order are the arguments evaluated in, can they reference each other, what symbol do we use, how do we implement it without affecting existing introspection, etc.).

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 9:54 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 15:35, Chris Angelico wrote:
Bear in mind that the status quo is, quite honestly, a form of white lie. In the example of bisect:
def bisect(a, hi=None): ...
def bisect(a, hi=>len(a)): ...
neither form of the signature will actually say that the default value is the length of a. In fact, I have never said that the consumer should eval it. There is fundamentally no way to determine the true default value for hi without first figuring out what a is.
The way you use the term "default value" doesn't quite work for me. More and more I think that part of the issue here is that "hi=>len(a)" we aren't providing a default value at all. What we're providing is default *code*. To me a "value" is something that you can assign to a variable, and current (aka "early bound") argument defaults are that. But these new late-bound arguments aren't really default "values", they're code that is run under certain circumstances. If we want to make them values, we need to define "evaluate len(a) later" as some kind of first-class value.
This is correct. In fact, the way it is in the syntax, the second one is a "default expression". But the Signature object can't actually have the default expression, so it would have to have either a default value, or some sort of marker. Argument defaults, up to Python 3.11, are always default values. If PEP 671 is accepted, it will make sense to have default values AND default expressions.
So which is better: to have the value None, or to have a marker saying "this will be calculated later, and here's a human-readable description: len(a)"?
Let me say that in a different way. . . :-)
Which is better, to have a marker saying "this will be calculated later", or to have a marker saying "this will be calculated later, and here's a human-readable description"?
My point is that None is already a marker. It's true it's not a special-purpose marker meaning "this will be calculated later", but I think in practice it is a marker that says "be sure to read the documentation to understand what passing None will do here".
That's true. The trouble is that it isn't uniquely such a marker, and in fact is very often the actual default value. When you call dict.get(), the second argument has a default of None, and if you omit it, you really truly do get None as a result.

Technically, for tools that look at func.__defaults__, I have Ellipsis doing that kind of job. But (a) that's far less common than None, and (b) there's a way to figure out whether it's a real default value or a marker. Of course, there will still be functions that have pseudo-defaults, so you can never be truly 100% sure, but at least you get an indication for those functions that actually use default expressions.

So what you have is a marker saying "this is either the value None, or something that will be calculated later". Actually there are multiple markers; None might mean that, but so might "<object object at 0x7f27e77f0570>", which is more likely to mean that it'll be calculated later, but harder to recognize reliably. And in all cases, it might not be a value that's calculated later, but it might be a change in effect (maybe causing an exception to be raised rather than a value being returned).

The markers currently are very ad-hoc and can't be depended on by tools. There is fundamentally no way to do better than a marker, but we can at least have more useful markers.
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.
Interesting. Then why have default arguments at all? What's the point of saying "if omitted, the default is zero" in a machine-readable way? After all, you could just have it in the docstring. And there are plenty of languages where that's the case. I'm of the opinion that having more information machine-readable is always better. Are you saying that it isn't? Or alternatively, that it's only useful when it fits in a strict subset of constant values (values that don't depend on anything else, and can be calculated at function definition time)?
So really the status quo is "you can already have the human-readable description but you have to type it in the docstring yourself". I don't see that as a big deal. So yes, the status quo is better, because it is not really any worse, and it avoids the complications that are arising in this thread (i.e., what order are the arguments evaluated in, can they reference each other, what symbol do we use, how do we implement it without affecting existing introspection, etc.).
And it also allows an easy transformation for functions that currently are experiencing issues from things being too constant. Consider:

default_timeout = 500
def connect(timeout=default_timeout):

If that default can be changed at run time, how do you fix the function? By my proposal, you just mark the default as being late-bound. With the status quo, now you need to bury the real default in the body of the function, and make a public statement that the default is None (or something else). That's not a true statement, since None doesn't really make sense as a default, but that's what you have to do to work around a technical limitation.

ChrisA
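A minimal, runnable sketch of the trap being described, using the names from the example above:

default_timeout = 500

def connect(timeout=default_timeout):
    return timeout

default_timeout = 1000   # the intended default changes at run time...
print(connect())         # ...but this prints 500: it was captured at definition

def connect2(timeout=None):         # status-quo workaround
    if timeout is None:
        timeout = default_timeout   # reads the *current* value
    return timeout

print(connect2())        # 1000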
On 2021-10-30 16:12, Chris Angelico wrote:
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.
Interesting. Then why have default arguments at all? What's the point of saying "if omitted, the default is zero" in a machine-readable way? After all, you could just have it in the docstring. And there are plenty of languages where that's the case.
The point of default arguments is to allow users of the function to omit arguments at the call site. It doesn't have anything to do with docstrings.

Or do you mean why not just have all omitted arguments set to some kind of "undefined" value and then check each one in the body of the function and replace it with a default if you want to? Well, for one thing it makes for cleaner error handling, since Python can tell at the time of the call that a required argument wasn't supplied and raise that error right away. It's sort of like a halfway type check, where you're not actually checking that the correct types of arguments were passed, but at least you know that the arguments that need to be passed were passed and not left out entirely.

For another thing, it does mean that if you know the default at the time you're defining the function, you can specify it then. What you can't do is specify the default if you don't know the default at function definition time, but only know "how you're going to decide what value to use" (which is a process, not a value).
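That early failure is easy to see (a trivial sketch):

def greet(name, greeting="hello"):
    return f"{greeting}, {name}"

try:
    greet()   # required argument missing: rejected at call time
except TypeError as e:
    print(e)  # greet() missing 1 required positional argument: 'name'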
I'm of the opinion that having more information machine-readable is always better. Are you saying that it isn't? Or alternatively, that it's only useful when it fits in a strict subset of constant values (values that don't depend on anything else, and can be calculated at function definition time)?
Now wait a minute, before you said the goal was for it to be human readable, but now you're saying it's about being machine readable! :-)

What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:

# this
if argument is undefined:
    argument = some_constant_value

# and this
if argument is undefined:
    # arbitrary code here

I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.

Based on some of your other posts, I'm guessing that what you mean about machine readability is that you appreciate certain kinds of labor-saving "self-documentation" techniques, whereby when we write the machine-readable code, the interpreter automatically derives some human-readable descriptions for stuff. For instance when we write `def foo` we're just defining an arbitrary symbol to be used elsewhere in the code, but if we get an exception Python doesn't just tell us "exception in function number 1234" or the line number, but also tells us the function name.

And yeah, I agree that can be useful. And I agree that it would be "nice" if we could write "len(a)" without quotes as machine-readable code, and then have that stored as some human-readable thing that could be shown when appropriate. But if that's nice, why is it only nice in function arguments? Why is it only nice to be able to associate the code `len(a)` with the human-readable string "len(a)" just when that string happens to occur in a function signature?

On top of that, even if I agree that that is useful, I see the benefit of that in this case (generating docstrings based on default arguments) as very marginal. I think I agree with the spirit of what you mean by "having more information machine-readable is always good", but of course I don't agree that that's literally true --- because you have to balance that good against other goods. In this case, perhaps most notably, we have to balance it against the cognitive load of having two different ways to write arguments, which will have quite different semantics, which based on current proposals are going to differ in a single character, and both of which can be interleaved arbitrarily in the function signature. That's leaving aside all the other questions about the more subtle details of the proposal (like mutual references between defaults), which will only increase the potential cognitive burden for code readers.

So yes, it's true that adding convenience functions to derive human-readable forms from machine-readable code is handy, but it's not ALWAYS automatically good regardless of other considerations, and I don't see that it outweighs the costs here. The benefit of autogenerating the string "len(a)" from the argument spec isn't quite zero but it's pretty tiny.

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 12:17 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 16:12, Chris Angelico wrote:
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that. We can already provide human-readable descriptions in documentation and we should do that instead of trying to create gimped human-readable descriptions that only work in special cases. Or, to put it even more bluntly, from my perspective, having help() show something maybe sort of useful just in the case where the person wrote very simple default-argument logic and didn't take the time to write a real docstring is simply not a worthwhile goal.
Interesting. Then why have default arguments at all? What's the point of saying "if omitted, the default is zero" in a machine-readable way? After all, you could just have it in the docstring. And there are plenty of languages where that's the case.
The point of default arguments is to allow users of the function to omit arguments at the call site. It doesn't have anything to do with docstrings.
That's optional parameters. It's entirely possible to have optional parameters with no concept of defaults; it means the function can be called with or without those arguments, and the corresponding parameters will either be set to some standard "undefined" value, or in some other way marked as not given.
Or do you mean why not just have all omitted arguments set to some kind of "undefined" value and then check each one in the body of the function and replace it with a default if you want to? Well, for one thing it makes for cleaner error handling, since Python can tell at the time of the call that a required argument wasn't supplied and raise that error right away.
That's the JavaScript way - every parameter is optional. But it's entirely possible to have some mandatory and some optional, while still not having a concept of defaults. Python, so far, has completely conflated those two concepts: if a parameter is optional, it always has a default argument value. (The converse is implied - mandatory args can't have defaults.)
It's sort of like a halfway type check, where you're not actually checking that the correct types of arguments were passed, but at least you know that the arguments that need to be passed were passed and not left out entirely.
Given that Python doesn't generally have any sort of argument type checking, that's exactly what we'll get by default.
For another thing, it does mean that if you know the default at the time you're defining the function, you can specify it then. What you can't do is specify the default if you don't know the default at function definition time, but only know "how you're going to decide what value to use" (which is a process, not a value).
Right. That's the current situation.
I'm of the opinion that having more information machine-readable is always better. Are you saying that it isn't? Or alternatively, that it's only useful when it fits in a strict subset of constant values (values that don't depend on anything else, and can be calculated at function definition time)?
Now wait a minute, before you said the goal was for it to be human readable, but now you're saying it's about being machine readable! :-)
Truly machine readable is the best: any tool can know exactly what will happen. That is fundamentally not possible when the default value is calculated. Mostly machine readable means that the machine can figure out what the default is, even if it doesn't know what that means. My proposal (not in the reference implementation as yet) is to have late-bound defaults contain a marker saying "the default will be len(a)", even though the "len(a)" part would be just a text string.
What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:
# this
if argument is undefined:
    argument = some_constant_value

# and this
if argument is undefined:
    # arbitrary code here
I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
Based on some of your other posts, I'm guessing that what you mean about machine readability is that you appreciate certain kinds of labor-saving "self-documentation" techniques, whereby when we write the machine-readable code, the interpreter automatically derives some human-readable descriptions for stuff. For instance when we write `def foo` we're just defining an arbitrary symbol to be used elsewhere in the code, but if we get an exception Python doesn't just tell us "exception in function number 1234" or the line number, but also tells us the function name.
And yeah, I agree that can be useful. And I agree that it would be "nice" if we could write "len(a)" without quotes as machine-readable code, and then have that stored as some human-readable thing that could be shown when appropriate. But if that's nice, why is it only nice in function arguments? Why is it only nice to be able to associate the code `len(a)` with the human-readable string "len(a)" just when that string happens to occur in a function signature?
It would be very nice to have that feature for a number of places. It's been requested for assertions, for instance. If that subfeature becomes more generally available, the language will be the richer for it.
So yes, it's true that adding convenience functions to derive human-readable forms from machine-readable code is handy, but it's not ALWAYS automatically good regardless of other considerations, and I don't see that it outweighs the costs here. The benefit of autogenerating the string "len(a)" from the argument spec isn't quite zero but it's pretty tiny.
It's mainly about writing expressive code, which can then be interpreted by humans AND machines. It's about writing function defaults as function defaults, not working around a technical limitation. It's about writing function headers such that they have what function headers should have, allowing the function body to contain only the function body.

We could just write every function with *args, **kwargs, and then do all argument checking inside the function. We don't do this, because it's the job of the function header to manage this. It's not the function body's job to replace placeholders with actual values when arguments are omitted.

ChrisA
On Sat, Oct 30, 2021 at 6:32 PM Chris Angelico <rosuav@gmail.com> wrote:
We could just write every function with *args, **kwargs, and then do all argument checking inside the function. We don't do this, because it's the job of the function header to manage this. It's not the function body's job to replace placeholders with actual values when arguments are omitted.
This is actually a great point, and I don't think it's a straw man -- a major project I'm working on is highly dynamic with a lot of subclassing with complex __init__ signatures. And due to laziness or, to be generous, attempts to be DRY, we have a LOT of mostly *args, **kwargs parameterizations, and the result is that, e.g., a typo in a parameter name doesn't get caught till the very top of the class hierarchy, and it's REALLY hard to tell where the issue is. (And there is literally code manipulating kwargs, like: thing = kwargs.get('something'))

I'm pushing to refactor that code somewhat to have more clearly laid out function definitions, even if that means some repetition -- after all, we need to document it anyway, so why not have the interpreter do some of the checking for us?

The point is: clearly specifying what's required, what's optional, and what the defaults are if optional, is really, really useful -- and this PEP will add another very handy feature to that.

-CHB

--
Christopher Barker, PhD (Chris)
Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
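A toy sketch of that failure mode (the class names and parameter here are made up for illustration):

class Base:
    def __init__(self, timeout=500):
        self.timeout = timeout

class Middle(Base):
    def __init__(self, *args, **kwargs):   # "DRY" pass-through
        super().__init__(*args, **kwargs)

class Leaf(Middle):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

Leaf(timeout=10)    # fine
Leaf(timeuot=10)    # TypeError, reported against Base.__init__ --
                    # far from where the typo was actually made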
On Sat, Oct 30, 2021 at 06:52:33PM -0700, Christopher Barker wrote:
The point is: clearly specifying what's required, what's optional, and what the defaults are if optional, is really, really useful -- and this PEP will add another very handy feature to that.
+1

Earlier I said that better help() is "icing on the cake". I like icing on my cake, and I think that having late-bound defaults clearly readable without any extra effort is a good thing.

The status quo is that if you introspect a function's parameters, you will see the sentinel, not the actual default value, or the expression that gives the default value, that the body of the function actually uses.

>>> inspect.signature(bisect.bisect_left)
<Signature (a, x, lo=0, hi=None, *, key=None)>

I challenge anyone to honestly say that if that signature read:

<Signature (a, x, lo=0, hi=len(a), *, key=None)>

they would not be able to infer the meaning, or that Python would be a worse language if the interpreter managed the evaluation of that default so you didn't have to. And if you really want to manage the late evaluation of defaults yourself, you will still be able to.

-- Steve
On Sun, Oct 31, 2021 at 3:31 PM Steven D'Aprano <steve@pearwood.info> wrote:
>>> inspect.signature(bisect.bisect_left)
<Signature (a, x, lo=0, hi=None, *, key=None)>
I challenge anyone to honestly say that if that signature read:
<Signature (a, x, lo=0, hi=len(a), *, key=None)>
they would not be able to infer the meaning, or that Python would be a worse language if the interpreter managed the evaluation of that default so you didn't have to.
There is a downside: it is possible to flat-out lie to the interpreter, by mutating bisect_left.__defaults__, so that help() will give a completely false signature. But if you want to shoot yourself in the foot, there are already plenty of gorgeous guns available. Behold, the G3SG1 "High Seas" of footguns:
>>> def spam(x, y, z="foo", *, count=4): ...
...
>>> def ham(a, *, n): ...
...
>>> spam.__wrapped__ = ham
>>> inspect.signature(spam)
<Signature (a, *, n)>
Ahhhhh whoops. We just managed to lie to ourselves. Good job, us. ChrisA
On Sun, Oct 31, 2021 at 03:43:25PM +1100, Chris Angelico wrote:
There is a downside: it is possible to flat-out lie to the interpreter, by mutating bisect_left.__defaults__, so that help() will give a completely false signature.
>>> def func(arg="finest green eggs and ham"):
...     pass
...
>>> inspect.signature(func)
<Signature (arg='finest green eggs and ham')>
>>>
>>> func.__defaults__ = ("yucky crap",)
>>> inspect.signature(func)
<Signature (arg='yucky crap')>

If help, or some other tool is caching the function signature, perhaps it shouldn't :-)
But if you want to shoot yourself in the foot, there are already plenty of gorgeous guns available.
Indeed. Beyond avoiding segmentation faults, I don't think we need to care about people who mess about with the public attributes of functions. You can touch, you can even change them, but keeping the function working is no longer our responsibility at that point. If you change the defaults, you shouldn't get a seg fault when you call the function, but you might get an exception. -- Steve
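For what it's worth, that exception-not-segfault behavior is easy to demonstrate today (a small sketch):

def f(a, b=1, c=2):
    return a + b + c

# __defaults__ aligns with the rightmost parameters, so this silently
# makes b required and gives c a default of 10:
f.__defaults__ = (10,)

print(f(1, 2))   # 13 -- still works, just with altered defaults
try:
    f(1)         # b no longer has a default
except TypeError as e:
    print(e)     # f() missing 1 required positional argument: 'b'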
On Sun, Oct 31, 2021 at 4:15 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 03:43:25PM +1100, Chris Angelico wrote:
There is a downside: it is possible to flat-out lie to the interpreter, by mutating bisect_left.__defaults__, so that help() will give a completely false signature.
>>> def func(arg="finest green eggs and ham"):
...     pass
...
>>> inspect.signature(func)
<Signature (arg='finest green eggs and ham')>
>>>
>>> func.__defaults__ = ("yucky crap",)
>>> inspect.signature(func)
<Signature (arg='yucky crap')>
If help, or some other tool is caching the function signature, perhaps it shouldn't :-)
Yep, but with late-bound defaults, there is a slight difference. With early-bound ones, you do have a guarantee that the signature and the behaviour are synchronized; with late-bound, the behaviour is encoded in the function, and the signature has (or will have, once I write that part) some sort of snapshot, either the AST or a source code snippet. (At the moment, they all just show Ellipsis.) So you could reach in and replace the __defaults_extra__ and change how the signature looks:
>>> def foo(a=[], b=>[]): ...
...
>>> dis.dis(foo)
  1           0 QUERY_FAST               1 (b)
              2 POP_JUMP_IF_TRUE         4 (to 8)
              4 BUILD_LIST               0
              6 STORE_FAST               1 (b)
        >>    8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> foo.__defaults__
([], Ellipsis)
>>> foo.__defaults_extra__
(None, '')
The first slot of __defaults_extra__ indicates that the first default is early-bound, but the second one will be the description - "[]" - that would get used in inspect/help. Replacing that would let you change what the default appears to be.

I don't think this is a major problem. It's no worse than other things you can mess with, and if you do that sort of thing, you get only what you asked for; there's no way you can get a segfault or even an exception, as long as you use either None or a string. (I should probably have some validation to make sure that those are the only two types in the tuple. Will jot that down as a TODO.)

Changing whether the extra slot is None or not is amusing, though still not particularly significant. If you have an early-bound default of Ellipsis, changing extra from None to a string will pretend that it's a late-bound with that value. (If the default value isn't Ellipsis, then according to the spec, the extra should be ignored; but it's possible that some tools will end up using extra first, which would mean they'd be deceived regardless of the actual value.) The behaviour will actually be UnboundLocalError, but inspecting the signature would show the claimed value. On the flip side, if you have a late-bound default and change the extra from a string to None, it will turn it into an early default of Ellipsis, with a small amount of dead code at the start of the function.

All of this is implementation details though. What I'll document is that changing __defaults_extra__ requires a tuple of Nones and/or strings (and __kwdefaults_extra__ requires a dict mapping strings to None and/or strings), and I won't recommend actually changing it.
But if you want to shoot yourself in the foot, there are already plenty of gorgeous guns available.
Indeed. Beyond avoiding segmentation faults, I don't think we need to care about people who mess about with the public attributes of functions. You can touch, you can even change them, but keeping the function working is no longer our responsibility at that point.
If you change the defaults, you shouldn't get a seg fault when you call the function, but you might get an exception.
Exactly, and that's what happens. I suppose in some cases it might be nice to get the exception when you assign to __defaults_extra__, but it's not that big a deal if it results in an exception when you call the function. Generally, I would expect that most uses of these dunders will be read-only, or copying them from some other function. Not a lot else. ChrisA
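The read-only or copy usage already has a present-day analogue with the early-bound dunder; a minimal sketch in current Python (function names are illustrative):

    import inspect

    def template(a, b=10, c="x"):
        pass

    def clone(a, b=None, c=None):
        pass

    # Copy the early-bound defaults wholesale; inspect picks up the change.
    clone.__defaults__ = template.__defaults__
    print(inspect.signature(clone))  # (a, b=10, c='x')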
On Sun, Oct 31, 2021 at 5:25 PM Chris Angelico <rosuav@gmail.com> wrote:
I don't think this is a major problem. It's no worse than other things you can mess with, and if you do that sort of thing, you get only what you asked for; there's no way you can get a segfault or even an exception, as long as you use either None or a string.
(I should probably have some validation to make sure that those are the only two types in the tuple. Will jot that down as a TODO.)
Since __kwdefaults__ and __kwdefaults_extra__ are mutable dicts, there's not a lot of point trying to validate them on attribute assignment. Oh well. Would have been nicer to catch errors earlier. Not that it makes a huge difference. None is the most significant value here (it means "that's an early-bound default, even though it's Ellipsis"), and everything else currently just means "that's a late-bound default". ChrisA
On 2021-10-30 18:29, Chris Angelico wrote:
What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:
# this
if argument is undefined:
    argument = some_constant_value
# and this
if argument is undefined:
    # arbitrary code here
I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.

I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):

    def foo(a=1, b="two", c@=len(b), d@=a+c):

You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments? The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?

Currently, every argument default is a first-class value. As I understand it, your proposal breaks that assumption, and now argument defaults can be some kind of "construct" that is not a first class value, not any kind of object, just some sort of internal entity that no one can see or manipulate in any way, it just gets automatically evaluated later.

I really don't like that. One of the things I like about Python is the "everything is an object" approach under which most of the things that programmers work with, apart from a very few base syntactic constructs, are objects. Many previous expansions to the language, like decorators, context managers, the iteration protocol, etc., worked by building on this object model. But this proposal seems to diverge quite markedly from that.

If the "late-bound default" is not an object of some kind just like the early-bound ones are, then I can't agree that both "are argument defaults".

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 1:03 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
What I am saying is that there is a qualitative difference between "I know now (at function definition time) what value to use for this argument if it's missing" and "I know now (at function definition time) *what I will do* if this argument is missing". Specifying "what you will do" is naturally what you do inside the function. It's a process to be done later, it's logic, it's code. It is not the same as finalizing an actual VALUE at function definition time. So yes, there is a qualitative difference between:
# this
if argument is undefined:
    argument = some_constant_value
# and this
if argument is undefined:
    # arbitrary code here
I mean, the difference is that in one case arbitrary code is allowed! That's a big difference.
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
The default for a is the integer 1. The default for b is the string "two". The default for c is the length of b. The default for d is the sum of a and c.
The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
You're assuming that every default is a *default value*. That is the current situation, but it is by no means the only logical way a default can be defined. See above: c's default is the length of b, which is presumably an integer, and d's default is the sum of that and a, which is probably also an integer (unless a is passed as a float or something).
Currently, every argument default is a first-class value. As I understand it, your proposal breaks that assumption, and now argument defaults can be some kind of "construct" that is not a first class value, not any kind of object, just some sort of internal entity that no one can see or manipulate in any way, it just gets automatically evaluated later.
I really don't like that. One of the things I like about Python is the "everything is an object" approach under which most of the things that programmers work with, apart from a very few base syntactic constructs, are objects. Many previous expansions to the language, like decorators, context managers, the iteration protocol, etc., worked by building on this object model. But this proposal seems to diverge quite markedly from that.
What object is this?

    a if random.randrange(2) else b

Is that an object? No; it's an expression. It's a rule. Not EVERYTHING is an object. Every *value* is an object, and that isn't changing. Or here, what value does a have?

    def f():
        if 0:
            a = 1
        print(a)

Does it have a value? No. Is it an object? No. The lack of value is not itself a value, and there is no object that represents it.
If the "late-bound default" is not an object of some kind just like the early-bound ones are, then I can't agree that both "are argument defaults".
Then we disagree. I see them both as perfectly valid defaults - one is a default value, the other is a default expression. ChrisA
On 2021-10-30 19:11, Chris Angelico wrote:
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
The default for a is the integer 1. The default for b is the string "two". The default for c is the length of b. The default for d is the sum of a and c.
The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
You're assuming that every default is a *default value*. That is the current situation, but it is by no means the only logical way a default can be defined. See above: c's default is the length of b, which is presumably an integer, and d's default is the sum of that and a, which is probably also an integer (unless a is passed as a float or something).
Well, at least that clarifies matters. :-)

I was already -1 on this but this moves me to a firm -100.

"The length of b" is a description in English, not any kind of programming construct. "The length of b" has no meaning in Python. What we store for the default (even if we don't want to call it a value) has to be a Python construct, not a human-language description. I could say "the default of b is the notion of human frailty poured into a golden goblet", and that would be just as valid as "the length of b" as a description and just as meaningless in terms of Python's data model.

The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".

Apart from all the other things about this proposal I don't support, I don't support the creation of a mysterious "expression" which is not a first class value and cannot be used or evaluated in any way except automatically in the context of calling a function.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 2:23 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 19:11, Chris Angelico wrote:
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
The default for a is the integer 1. The default for b is the string "two". The default for c is the length of b. The default for d is the sum of a and c.
The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
You're assuming that every default is a *default value*. That is the current situation, but it is by no means the only logical way a default can be defined. See above: c's default is the length of b, which is presumably an integer, and d's default is the sum of that and a, which is probably also an integer (unless a is passed as a float or something).
Well, at least that clarifies matters. :-)
I was already -1 on this but this moves me to a firm -100.
"The length of b" is a description in English, not any kind of programming construct. "The length of b" has no meaning in Python. What we store for the default (even if we don't want to call it a value) has to be a Python construct, not a human-language description. I could say "the default of b is the notion of human frailty poured into golden goblet", and that would be just as valid as "the length of b" as a description and just as meaningless in terms of Python's data model.
We have a very good construct to mean "the length of b". It looks like this:

    len(b)

In CPython bytecode, it looks something like this:

    LOAD_GLOBAL "len"
    LOAD_FAST "b"
    CALL_FUNCTION with 1 argument

(Very approximately, anyway.) And that's exactly what PEP 671 uses: in the source code, it uses len(b), and in the compiled executable, the corresponding bytecode sequence. Since we don't have a way to define human frailty and golden goblets, we don't have a good way to encode that in either source code or bytecode.
The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".
How important is this? Yes, it's the status quo, but if we evaluate every proposal by how closely it retains the status quo, nothing would ever be changed. Let's look at type checking for a moment, and compare a few similar functions:

    def spam(stuff: list, n: int):
        ...

Takes a list and an integer. Easy. The integer is required.

    def spam(stuff: list, n: int = None):
        if n is None:
            n = len(stuff)
        ...

Takes a list, and either an integer or None. What does None mean? Am I allowed to pass None, or is it just a construct that makes the parameter optional?

    _sentinel = object()

    def spam(stuff: list, n: int = _sentinel):
        if n is _sentinel:
            n = len(stuff)
        ...

Takes a list, and either an integer or.... some random object. Can anyone with type checking experience (eg MyPy etc) say how this would best be annotated? Putting it like this doesn't work, nor does Optional[int].

    def spam(stuff: list, n: int => len(stuff)):
        ...

MyPy (obviously) doesn't yet support this syntax, so I can't test this, but I would assume that it would recognize that len() returns an integer, and accept this. (That's what it does if I use "= len(stuff)" and a global named stuff.)

Currently, argument defaults have to be able to be evaluated at function definition time. It's fine if they can't be precomputed at compile time. Why do we have this exact restriction, neither more nor less? Is that really as important an invariant as you say?
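(A hedged aside on the annotation question above, not an answer from the thread: one common workaround today is to annotate the sentinel itself as Any, so the parameter's int annotation still holds for callers.)

    from typing import Any

    _sentinel: Any = object()

    def spam(stuff: list, n: int = _sentinel) -> int:
        # Type checkers accept the Any-typed sentinel as an int default,
        # while callers must still pass an int if they pass anything.
        if n is _sentinel:
            n = len(stuff)
        return n

    print(spam([1, 2, 3]))      # 3
    print(spam([1, 2, 3], 10))  # 10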
Apart from all the other things about this proposal I don't support, I don't support the creation of a mysterious "expression" which is not a first class value and cannot be used or evaluated in any way except automatically in the context of calling a function.
You use similarly mysterious "expressions" all the time. They can't be treated as first-class values, but they can be used in the contexts they are in. This is exactly the same. You've probably even used expressions that are only evaluated conditionally:

    recip = 1 / x if x else 0

Is the "1 / x" a first-class value here? No - and that's why if/else is a compiler construct rather than a function. If you wouldn't use this feature, that's fine, but it shouldn't stand or fall on something that the rest of the language doesn't follow.

ChrisA
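A minimal runnable illustration of that conditional evaluation, assuming nothing beyond current Python: when x is falsy, the division never runs.

    x = 0
    recip = 1 / x if x else 0  # no ZeroDivisionError: "1 / x" is skipped
    print(recip)               # 0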
On 2021-10-30 20:44, Chris Angelico wrote:
The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".
How important is this? Yes, it's the status quo, but if we evaluate every proposal by how closely it retains the status quo, nothing would ever be changed. Let's look at type checking for a moment, and compare a few similar functions: <snip>
As I've said before, the problem is that the benefit of this feature is too small to justify this large change to the status quo.
You use similarly mysterious "expressions" all the time. They can't be treated as first-class values, but they can be used in the contexts they are in. This is exactly the same. You've probably even used expressions that are only evaluated conditionally:
recip = 1 / x if x else 0
Is the "1 / x" a first-class value here? No - and that's why if/else is a compiler construct rather than a function.
No, but that example illustrates why: the expression there is not a first-class value, but that's fine because it also has no independent status. The expression as a whole is evaluated and the expression as a whole has a value and that is what you work with. You don't get to somehow stash away just the 1/x part and use it later, but without evaluating it yet. That is not even close to the situation envisioned in the proposal under discussion, because in the proposal this mysterious expression (the argument default), although not a first-class value, is going to be stored and used independently, yet without evaluating it and without "reifying" it into a function or other such object.

It's obvious that there are tons of things that aren't first-class values in Python (and in any language). The plus sign isn't a first class value, the sequence of characters `a = "this".cou` in source code isn't a first class value. But these usages aren't the same as what's in the proposal under discussion here. I'm not sure if you actually disagree with this or if you're just being disingenuous with your examples.

Anyway, I appreciate your engaging with me in this discussion, especially since I'm just some random guy whose opinion is not of major importance. :-) But I think we're both kind of just repeating ourselves at this point. I acknowledge that your proposal is essentially clear and is aimed at serving a genuine use case, but I still see it as involving too much of a departure from existing conventions, too much hairiness in the details, and too little real benefit, to justify the change.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Sun, Oct 31, 2021 at 3:08 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 20:44, Chris Angelico wrote:
The default of every argument should be a first-class value. That's how things are now, and I think that's a very useful invariant to have. If we want to break it we need a lot more justification than "I don't like typing if x is None".
How important is this? Yes, it's the status quo, but if we evaluate every proposal by how closely it retains the status quo, nothing would ever be changed. Let's look at type checking for a moment, and compare a few similar functions: <snip>
As I've said before, the problem is that the benefit of this feature is too small to justify this large change to the status quo.
You use similarly mysterious "expressions" all the time. They can't be treated as first-class values, but they can be used in the contexts they are in. This is exactly the same. You've probably even used expressions that are only evaluated conditionally:
recip = 1 / x if x else 0
Is the "1 / x" a first-class value here? No - and that's why if/else is a compiler construct rather than a function.
No, but that example illustrates why: the expression there is not a first-class value, but that's fine because it also has no independent status. The expression as a whole is evaluated and the expression as a whole has a value and that is what you work with. You don't get to somehow stash away just the 1/x part and use it later, but without evaluating it yet. That is not even close to the situation envisioned in the proposal under discussion, because in the proposal this mysterious expression (the argument default), although not a first-class value, is going to be stored and used independently, yet without evaluating it and without "reifying" it into a function or other such object.
It's actually the same. There's no "stashing away" of the expression part; it's just part of the function, same as everything else is. (I intend to have a string representation of it stashed away, but that IS a first-class value - a str object, or a PyUnicode if you look in the C API.) You can't use that later in any way other than by calling the function while not passing it the corresponding argument. You cannot use it independently, and you cannot retrieve an unevaluated version of it.
It's obvious that there are tons of things that aren't first-class values in Python (and in any language). The plus sign isn't a first class value, the sequence of characters `a = "this".cou` in source code isn't a first class value. But these usages aren't the same as what's in the proposal under discussion here. I'm not sure if you actually disagree with this or if you're just being disingenuous with your examples.
Right. Some things are simply invalid, others are compiler constructs. I'm pointing out that compiler constructs are very real things, despite not being values. I'm not sure about your ".cou" example; the only reason that isn't a first-class value is that, when you try to look it up, you'll get an error. Or are you saying that an assignment statement isn't a value? If so, then it's the same thing again: a compiler construct, a syntactic feature. In general, syntactic features fall into three broad categories:

1) Literals, which have clear values
2) Expressions, which will yield values when evaluated
3) Everything else. Mostly statements. No value ever.

For instance, the notation >> 42j << is a complex literal (an imaginary number), and >> a+b << is an expression representing a sum. Thanks to constant folding, we can pretend that >> 3+4j << is a literal too, although technically it's an expression.

Argument defaults are *always* expressions. They can be evaluated at function definition time, or - if PEP 671 is accepted - at function call time. The expression itself is nothing more than a compiler construct, and isn't a first-class value. If it's early-bound, then it'll be evaluated at def time to yield a value which then gets saved on the function object and then gets assigned any time the parameter has no value; if it's late-bound, then any time the parameter has no value, it'll be evaluated at call time, to yield a value which then gets assigned to the parameter. Either way, the same steps happen, just in a different order.
Anyway, I appreciate your engaging with me in this discussion, especially since I'm just some random guy whose opinion is not of major importance. :-) But I think we're both kind of just repeating ourselves at this point. I acknowledge that your proposal is essentially clear and is aimed at serving a genuine use case, but I still see it as involving too much of a departure from existing conventions, too much hairiness in the details, and too little real benefit, to justify the change.
I think everyone's opinion is of major importance here, because you're taking the time to discuss the feature. I could go on a long rant about this, but the short version is: neither representative democracy nor pure democracy is nearly as good as a system where those in charge (here, the PSF) can get informed debate from that specific subset of people who actually care about something :) And it's fine for you to believe that this shouldn't happen. As long as the debate is respectful, courteous, professional, and based on facts, not people ("this idea sucks because YOU thought of it"), it's worth having! I'm happy to continue answering questions or countering arguments, because I believe that this IS of more value than its costs; but I'm also open to being proven wrong on that point (and believe you me, the work of implementing it showed me that the cost isn't all where I thought it would be). ChrisA
Brendan Barnwell writes:
As I've said before, the problem is that the benefit of this feature is too small to justify this large change to the status quo.
I don't disagree with this (but the whole thing is a YAGNI fvo "you" = "me", so I'm uncomfortable agreeing as long as "x = x or default" keeps working ;-).
That is not even close to the situation envisioned in the proposal under discussion, because in the proposal this mysterious expression (the argument default), although not a first-class value, is going to be stored and used independently,
As far as I can see, this is exactly similar to the Lisp &aux, in the sense that there's no let form that a macro can grab in the body of the definition.[1] But the defaulted argument (see what I did there?) in both cases *is* a first-class value, stored in the usual place. In the case of Lisp &aux, as the current binding of the symbol on entry to the body of the defun, and in the case of Chris's proposal, as the current binding of the name on entry to the function body.

It's always true in the case of Python that introspecting code is a fraught enterprise. Even in the case of

    def foo(x, y):
        return x + y

    foo(1, 1)

it's not possible to introspect *how* foo produced "2" without messing with the byte code. I don't see this as very different: as has been pointed out several times,

    def bar(x=None):
        x = [] if x is None else x
        return x

cannot be introspected without messing with the byte code, either. In both cases, the expression that produces the first-class object that x is bound to is explicit in the source code, just in different places, and invisible in the compiled code (unless you're willing to mess with the byte code, in which case you know where to find it after all).

Footnotes: [1] True, said macro *can* see the &aux clause in the lambda form, reconstruct the let form, and do its evil work on the lambda, but such omniscience is possible because Lisp is the Language of the Gods, and rarely used by frail mortals.
On Sat, Oct 30, 2021 at 7:03 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
I'm trying to figure out if this is a (English language) semantics issue or a real conceptual issue. Yes, we don't know what the default "value" is (by either the general English definition or the Python definition), but we do know that the default will be set to the result of the expression when evaluated in the context of the function, which is very clear to me, at least.

The default for argument a is an integer. The default for argument b is a string. Can you tell me, in comparable terms, what the defaults for arguments c and d are?
yes: the default for c is the result of evaluating `len(b)`, and the default for d is the result of evaluating `a+c`.

In contrast, what are the defaults in this case?

    def foo(a=1, b="two", c=None, d=None):

obviously, they are None -- but how useful is that? How about:

    def foo(a=1, b="two", c=None, d=None):
        """
        ... if None, c is computed as the length of the value of b,
        and d is computed as that length plus a
        """
        if c is None:
            c = len(b)
        if d is None:
            d = a + c

Is that really somehow more clear? And you'd better hope that the docstring matches the code!

I'm having a really hard time seeing how this PEP would make anything less clear or confusing.

-CHB
Currently, every argument default is a first-class value. As I understand it, your proposal breaks that assumption, and now argument defaults can be some kind of "construct" that is not a first class value, not any kind of object, just some sort of internal entity that no one can see or manipulate in any way, it just gets automatically evaluated later.
I really don't like that. One of the things I like about Python is the "everything is an object" approach under which most of the things that programmers work with, apart from a very few base syntactic constructs, are objects. Many previous expansions to the language, like decorators, context managers, the iteration protocol, etc., worked by building on this object model. But this proposal seems to diverge quite markedly from that.
If the "late-bound default" is not an object of some kind just like the early-bound ones are, then I can't agree that both "are argument defaults".
-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion.

This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.

As prior art, consider Common Lisp's lambda expressions, which are effectively anonymous functions (such expressions are often bound to names, which is how Common Lisp creates named functions); see https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node64.html for reference. The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.

(OTOH, Common Lisp's lambda expressions take one more step and include so-called "aux variables," which aren't parameters at all, but variables local to the function itself. I don't have enough background or context to know why these are included in a lambda expression.)
On Sun, Oct 31, 2021 at 2:43 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion. This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).

And yes; there will always be cases where you can't define the default with a simple expression. For instance, a one-arg lookup might raise an exception where a two-arg one could return a default value. That's currently best written with a dedicated object:

    _sentinel = object()

    def fetch(thing, default=_sentinel):
        ... attempt to get stuff
        if thing in stuff:
            return stuff[thing]
        if default is _sentinel:
            raise ThingNotFoundError
        return default

In theory, optional arguments without defaults could be written something like this:

    def fetch(thing, default=pass):
        ... as above
        if not exists default:
            raise ThingNotFoundError
        return default

But otherwise, there has to be some sort of value for every parameter. (I say this as a theory, but actually, the reference implementation of PEP 671 has code very similar to this. There's a bytecode QUERY_FAST which yields True if a local has a value, False if not. It's more efficient than "try: default; True; except UnboundLocalError: False" but will have the same effect.)
The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
Lisp's execution model is quite different from Python's, but I'd be curious to hear more about this. Can you elaborate? ChrisA
On Sun, Oct 31, 2021 at 02:56:36PM +1100, Chris Angelico wrote:
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I was just thinking of suggesting that to you, so I'm glad to see you're much faster on the uptake than I am!

Of course all parameters are syntactically an expression, including now:

    # the status quo
    def func(arg=CONFIG.get('key', NULL)):

The default expression is evaluated at function definition time, and the result of that (an object, a.k.a. a value) is cached in the function object for later use. With late-binding:

    def func(@arg=CONFIG.get('key', NULL)):

the expression is stashed away somewhere (implementation details), in some form (source code? byte-code? an AST?) rather than immediately evaluated. At function call time, the expression is evaluated, and the result (an object, a.k.a. a value) is bound to the parameter.

In neither case is it correct to say that the default value of arg is the *expression* `CONFIG.get('key', NULL)`, it is in both the early and late bound cases the *result* of *using* (evaluating) the expression to generate a value.

https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction

I'm fairly confident that everyone understands that:

    "the default value is CONFIG.get('key', NULL)"

is shorthand for the tediously long and pedantic explanation that it's not the expression itself that is the default value, but the result of evaluating the expression. Just like we understand it here:

    if arg is None:
        arg = CONFIG.get('key', NULL)

The only difference is when the expression is evaluated. If we can understand that arg gets set to the result of the expression in the second case (the None sentinel), we should be able to understand it if the syntax changes to late-bound parameters.

-- Steve
On Sun, Oct 31, 2021 at 4:37 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 02:56:36PM +1100, Chris Angelico wrote:
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I was just thinking of suggesting that to you, so I'm glad to see you're much faster on the uptake than I am!
Of course all parameters are syntactically an expression, including now:
# the status quo
def func(arg=CONFIG.get('key', NULL)):
The default expression is evaluated at function definition time, and the result of that (an object, a.k.a. a value) is cached in the function object for later use. With late-binding:
def func(@arg=CONFIG.get('key', NULL)):
the expression is stashed away somewhere (implementation details), in some form (source code? byte-code? an AST?) rather than immediately evaluated. At function call time, the expression is evaluated, and the result (an object, a.k.a. a value) is bound to the parameter.
The code for it is part of the byte-code, and I'm planning to have either the source code or the AST (or a reconstituted source code) stored for documentation purposes. This, in fact, is true at compilation time regardless of whether it's early-bound or late-bound. Consider:

    def make_func():
        def func(arg=CONFIG.get('key', NULL)):
            ...

Is the expression for this default argument value "stashed away" somewhere? Well, kinda, I guess. It's part of the code that the 'def' statement produces, and will be run when make_func() runs. The difference is that here:

    def make_func():
        def func(arg=>CONFIG.get('key', NULL)):
            ...

the code is part of func, rather than make_func. So you're absolutely right: either way, the default is an expression. I'm using the term "default expression" to mean that the expression is evaluated at call time rather than def time, but I'm open to other terminology.
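A small runnable sketch of the status-quo half of that point, with CONFIG as an illustrative stand-in: the default expression runs as part of make_func's body, so later changes don't affect func.

    CONFIG = {"key": "early"}

    def make_func():
        # The default is evaluated here, when make_func() runs --
        # i.e. at func's definition time.
        def func(arg=CONFIG.get("key")):
            return arg
        return func

    f = make_func()
    CONFIG["key"] = "changed later"
    print(f())  # 'early' -- captured when make_func() ran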
In neither case is it correct to say that the default value of arg is the *expression* `CONFIG.get('key', NULL)`, it is in both the early and late bound cases the *result* of *using* (evaluating) the expression to generate a value.
https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction
I'm fairly confident that everyone understands that:
"the default value is CONFIG.get('key', NULL)"
is shorthand for the tediously long and pedantic explanation that it's not the expression itself that is the default value, but the result of evaluating the expression. Just like we understand it here:
if arg is None:
    arg = CONFIG.get('key', NULL)
The only difference is when the expression is evaluated.
Exactly. ChrisA
On 2021-10-31 at 14:56:36 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Oct 31, 2021 at 2:43 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion. This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I still see anything more complicated than a constant or an extremely simple expression (len(a)? well, ok, maybe; (len(a) if is_prime(len(a)) else next_larger_prime(len(a)))? don't push it) as no longer being a default, but something more serious, but I don't have a better name for it than "computation" or "part of the function" or even "business logic" or "a bad API."
And yes; there will always be cases where you can't define the default with a simple expression. For instance, a one-arg lookup might raise an exception where a two-arg one could return a default value ...
Am I getting ahead of myself, or veering into the weeds, if I ask whether you can catch the exception or what the stacktrace might show? (At this point, that's probably more of a rhetorical question. Again, this is an itch I don't have, so I probably won't use it much.)
The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
Lisp's execution model is quite different from Python's, but I'd be curious to hear more about this. Can you elaborate?
https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node64.html explains it in great detail with examples; you're interested in &optional, initform, and possibly supplied-p. Summarizing what I think is relevant:

(lambda (&optional x)) is a function with an optional parameter called x. Calling that function without a parameter results in x being bound to nil (Lisp's [overloaded] canonical "false"/undefined/null value) in the function body.

(lambda (&optional (x 4))) is a function with an optional parameter called x with an initform. Calling that function without a parameter results in x being bound to the value 4 in the function body. Calling that function with a parameter results in x being bound to the value of that parameter.

(lambda (&optional (x 4 p))) is a function with an optional parameter called x with an initform and a supplied-p parameter called p. Calling that function without a parameter results in p being bound to nil and x being bound to the value 4 in the function body. Calling that function with a parameter results in p being bound to t (Lisp's canonical "true" value) and x being bound to the value of that parameter.

By default (no pun intended), all of that happens at function call time, before the function body begins, but Lisp has ways of forcing evaluation to take place at other times.

(lambda (a &optional (hi (length a)))) works as expected; it's a function that takes one parameter called a (presumably a sequence), and an optional parameter called hi (which defaults to the length of a, but can be overridden by the calling code).

I am by no means an expert, and again, I tend not to use optional parameters and default values (aka initforms) except in the simplest ways.
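For Python readers, a rough present-day analogue of the supplied-p idiom; this is an emulation with illustrative names, not PEP 671 syntax:

    _unsupplied = object()  # stands in for "no argument was passed"

    def f(x=_unsupplied):
        supplied = x is not _unsupplied  # plays the role of Lisp's "p"
        if not supplied:
            x = 4                        # plays the role of the initform
        return x, supplied

    print(f())   # (4, False)
    print(f(7))  # (7, True)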
On Sat, Oct 30, 2021 at 10:51:42PM -0700, 2QdxY4RzWzUUiLuE@potatochowder.com wrote:
I still see anything more complicated than a constant or an extremely simple expression (len(a)? well, ok, maybe; (len(a) if is_prime(len(a)) else next_larger_prime(len(a)))? don't push it) as no longer being a default, but something more serious, but I don't have a better name for it than "computation" or "part of the function" or even "business logic" or "a bad API."
There is no benefit to using an actual constant as a late-bound default. If the value is constant, then why delay evaluation? You're going to get the same constant one way or another. So linters should flag misuse like:

    func(@arg=0)

List and dict displays ("literals") like [] and {} are a different story, but they aren't constants.

I agree that extremely complex expressions fall under the category of "Don't Do That". But that's a code review and/or linter problem to solve. Most uses of late-binding defaults are going to be short and simple, such as:

* an empty list or dict display;
* call a function;
* access an attribute of self;
* len of another argument.

Right now, defaults can be set to arbitrarily complex expressions. When is the last time you saw a default that was uncomfortably complex?

-- Steve
On Sun, Oct 31, 2021 at 4:55 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-31 at 14:56:36 +1100, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Oct 31, 2021 at 2:43 PM <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2021-10-30 at 18:54:51 -0700, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-10-30 18:29, Chris Angelico wrote:
Right. That is a very real difference, which is why there is a very real difference between early-bound and late-bound defaults. But both are argument defaults.
I don't 100% agree with that.
This seems to be the crux of this whole sub-discussion. This whole thing scratches an itch I don't have, likely because of the way I learned to design interfaces on all levels. A week or so ago, I was firmly in Brendan Barnwell's camp. I really don't like how the phrase "default value" applies to PEP-671's late binding, and I'm sure that there will remain cases in which actual code inside the function will be required. But I'm beginning to see the logic behind the arguments (pun intended) for the PEP.
Current versions of the PEP do not use the term "default value" when referring to late binding (or at least, if I've made a mistake there, then please point it out so I can fix it). I'm using the term "default expression", or just "default" (to cover both values and expressions).
I still see anything more complicated than a constant or an extremely simple expression (len(a)? well, ok, maybe; (len(a) if is_prime(len(a)) else next_larger_prime(len(a)))? don't push it) as no longer being a default, but something more serious, but I don't have a better name for it than "computation" or "part of the function" or even "business logic" or "a bad API."
If your default expression doesn't fit in the source code for your function header, then you're probably doing things that are too complicated :) Nothing's changing there.
And yes; there will always be cases where you can't define the default with a simple expression. For instance, a one-arg lookup might raise an exception where a two-arg one could return a default value ...
Am I getting ahead of myself, or veering into the weeds, if I ask whether you can catch the exception or what the stacktrace might show?
(At this point, that's probably more of a rhetorical question. Again, this is an itch I don't have, so I probably won't use it much.)
Ah, sorry, I wasn't too clear here. Consider this API:

    _sentinel = object()

    def get_thing(name, default=_sentinel):
        populate_thing_cache(name)
        if name in thing_cache:
            return thing_cache[name]
        if default is not _sentinel:
            return default
        raise ThingNotFoundError

In this case, there's no late-bound default value that could be used here. The code to test whether we got one argument or two has to happen down below. So this sort of code wouldn't change as a result of PEP 671; it still needs to be able to be called with either one argument or two, and the distinction can't be written as "if default is _sentinel: default = ..." at the top of the function.

But in terms of catching exceptions from default expressions: No, you can't, unless you wrap it in a function or something. I don't expect this sort of thing to be a common need, and if it is, the exception-catching part probably needs its own descriptive name.
The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
Lisp's execution model is quite different from Python's, but I'd be curious to hear more about this. Can you elaborate?
https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node64.html explains it in great detail with examples; you're interested in &optional, initform, and possibly supplied-p. Summarizing what I think is relevant:
(lambda (&optional x)) is a function with an optional parameter called x. Calling that function without a parameter results in x being bound to nil (Lisp's [overloaded] canonical "false"/undefined/null value) in the function body.
(lambda (&optional (x 4))) is a function with an optional parameter called x with an initform. Calling that function without a parameter results in x being bound to the value 4 in the function body. Calling that function with a parameter results in x being bound to the value of that parameter.
(lambda (&optional (x 4 p))) is a function with an optional parameter called x with an initform and a supplied-p parameter called p. Calling that function without a parameter results in p being bound to nil and x being bound to the value 4 in the function body. Calling that function with a parameter results in p being bound to t (Lisp's canonical "true" value) and x being bound to the value of that parameter.
Ah okay. So what this gives you is a very clear indication of whether the default value was used or not. I'm not officially recommending this.... actually, let's make that stronger: This is bad code, but it's possible...
>>> def foo(n=>(n_unset := True) and 123):
...     try: n_unset
...     except UnboundLocalError: print("n was set to", n)
...     print("The value of n is:", n)
...
>>> foo()
The value of n is: 123
>>> foo(123)
n was set to 123
The value of n is: 123
Don't. Just don't. :) ChrisA
I'm not sure this answers Chris's question about "technical and emotional baggage," but I hope to clarify the Lisp model a bit. 2QdxY4RzWzUUiLuE@potatochowder.com writes:
As prior art, consider Common Lisp's lambda expressions[...]. The human description language/wording is different, but what Python spells "default value," Common Lisp spells "initform." Python is currently much less flexible about when and in what context default values are evaluated; PEP-671 attempts to close that gap, but is hampered by certain technical and emotional baggage.
No, they're equally flexible about when. The difference is that Lisp *never* evaluates the initform at definition time, while Python *always* evaluates defaults at definition time. Lisp's approach is more consistent conceptually than Chris's proposal.[1] That is, in Lisp the initform is conceptually always a form to be evaluated[2], while in Chris's approach there are default values and default expressions. In practice, Lisp marks default values by wrapping them in a quote form[3]. Thus where Lisp always has an object that is introspectable in the usual way, Chris's proposal has an invisible thunk that can't be introspected that way because it's inlined into the compiled function. From the naive programmer's point of view, Chris vs. Lisp just flips the polarity of marked vs. unmarked. The difference only becomes apparent if you want to do a sort of metaprogramming by manipulating the default in some way. This matters in Lisp because of macros, which can and do cut deeply into list structure of code, and revise it there. I guess it might matter in MacroPy, but I don't see a huge loss in Python 3.
(OTOH, Common Lisp's lambda expressions take one more step and include so-called "aux variables," which aren't parameters at all, but variables local to the function itself. I don't have enough background or context to know why these are included in a lambda expression.)
It's syntactic sugar that does nothing more nor less than wrap a let form binding the aux variables around the body of the definition. Footnotes: [1] But then everything in Lisp is more consistent because the List is the One DataType to Rule Them All and in the darkness bind them. [2] Even the default default of nil is conceptually evaluated: (eval nil) => nil. The compiler is allowed to optimize the evaluation away if it can determine the result, as in the case of nil or a lambda form (both of which eval to themselves). [3] (quote x) or 'x is a special form that does not evaluate its argument, and returns it as-is when the quote form is evaluated.
On Sat, Oct 30, 2021 at 06:54:51PM -0700, Brendan Barnwell wrote:
I mean, here's another way to come at this. Suppose we have this under the proposal (using some random syntax picked at random):
def foo(a=1, b="two", c@=len(b), d@=a+c):
You keep saying that c and d "are argument defaults" just like a and b. So can you tell me what the argument default is for each of those arguments?
Sure. If you fail to provide an argument for c, the value that is bound to c by default (i.e. the default argument) is len(b), whatever that happens to be. If you fail to provide a value for d, the value that is bound to d by default is a+c.

There is nothing in the concept of "default argument" that requires it to be known at function-definition time, or compile time, or when the function is typed into the editor.

Do you have a problem understanding me if I say that strftime defaults to the current time? Surely you don't imagine that I mean that it defaults to 15:34:03, which was the time a few seconds ago when I wrote the words "current time". And you probably will understand me if I say that on POSIX systems such as Linux, the default permissions on newly created files is (indirectly) set by the umask.
Currently, every argument default is a first-class value.
And will remain so. This proposal does not add second-class values to the language. By the time the body of the function is entered, the late-bound parameters will have had their defaults evaluated, and the result bound to the parameter. Inside the function object itself, there will be some kind of opaque blob (possibly a function?) that holds the default's expression for later evaluation. That blob itself will be a first class object, like every other object in Python, even if its internal structure is partially or fully opaque.
As I understand it, your proposal breaks that assumption, and now argument defaults can be some kind of "construct" that is not a first class value, not any kind of object, just some sort of internal entity that no one can see or manipulate in any way, it just gets automatically evaluated later.
Kind of like functions themselves :-)

    >>> (lambda x, y: 2*x + 3**y).__code__.co_code
    b'd\x01|\x00\x14\x00d\x02|\x01\x13\x00\x17\x00S\x00'

The internal structure of that co_code object is a mystery, it is not part of the Python language, only of the implementation, but it remains a first-class value. (My *guess* is that this may be the raw byte-code of the function body.)

One difference will be, regardless of how the expression for the late-bound default is stored, there will be at the very least an API to extract a human-readable string representing the expression.

-- Steve
On Sun, Oct 31, 2021 at 3:57 PM Steven D'Aprano <steve@pearwood.info> wrote:
Kind of like functions themselves :-)
>>> (lambda x, y: 2*x + 3**y).__code__.co_code
b'd\x01|\x00\x14\x00d\x02|\x01\x13\x00\x17\x00S\x00'
The internal structure of that co_code object is a mystery, it is not part of the Python language, only of the implementation, but it remains a first-class value.
(My *guess* is that this may be the raw byte-code of the function body.)
It is; and fortunately, we have a handy tool for examining it:
>>> dis.dis(b'd\x01|\x00\x14\x00d\x02|\x01\x13\x00\x17\x00S\x00')
          0 LOAD_CONST               1
          2 LOAD_FAST                0
          4 BINARY_MULTIPLY
          6 LOAD_CONST               2
          8 LOAD_FAST                1
         10 BINARY_POWER
         12 BINARY_ADD
         14 RETURN_VALUE
The exact meaning of this string depends on Python implementation and version. Importantly, though, this bytecode must be interpreted within a particular context (here, the context is the lambda function's code and the function itself), which provides meanings for consts and name lookups. There's no sensible way to inject this into some other function and expect it to mean "2*x + 3**y"; at best, it would actually take co_consts[1] * <co_varnames[0]> + co_consts[2] ** <co_varnames[1]>, but that might not mean anything either. ChrisA
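That per-function context can be poked at directly; a small sketch (this is CPython implementation detail, so the exact contents vary by version):

    f = lambda x, y: 2*x + 3**y

    # The bytecode's operands are indices into these per-code-object
    # tables, which is why the bytes alone can't mean "2*x + 3**y".
    print(f.__code__.co_consts)    # e.g. (None, 2, 3)
    print(f.__code__.co_varnames)  # ('x', 'y')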
On Sun, Oct 31, 2021 at 12:29:18PM +1100, Chris Angelico wrote:
That's optional parameters. It's entirely possible to have optional parameters with no concept of defaults; it means the function can be called with or without those arguments, and the corresponding parameters will either be set to some standard "undefined" value, or in some other way marked as not given.
That's certainly possible with languages like bash where there are typically (never?) no explicit parameters, the function is expected to pop arguments off the argument list and deal with them as it sees fit. Then a missing argument is literally missing from the argument list, and all the parameter binding logic that the Python interpreter handles for you has to be handled manually by the programmer. Bleh. We could emulate that in Python by having the interpreter flag "optional without a default" parameters in such a way that the parameter remains unbound when called without an argument, but why would we want such a thing? That's truly a YAGNI anti-feature. If you really want to emulate bash, we can just declare the function to take `*args` and manage it ourselves, like bash.
Python, so far, has completely conflated those two concepts: if a parameter is optional, it always has a default argument value. (The converse is implied - mandatory args can't have defaults.)
If a mandatory parameter has a default argument, it would never be used, because the function is always called with an argument for that parameter. So we have a four-way table:

                 Mandatory    Optional
                 Parameters   Parameters
    -----------  ----------   ----------
    No default:  okay         YAGNI [1]
    Default:     pointless    okay
    -----------  ----------   ----------

[1] And if you do need it, it is easy to emulate with a sentinel:

    def func(arg=None):
        if arg is None:
            del arg
        process(arg)  # May raise UnboundLocalError

None of this is relevant to the question of when the default values should be evaluated.

-- Steve
On Sun, Oct 31, 2021 at 3:16 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 12:29:18PM +1100, Chris Angelico wrote:
That's optional parameters. It's entirely possible to have optional parameters with no concept of defaults; it means the function can be called with or without those arguments, and the corresponding parameters will either be set to some standard "undefined" value, or in some other way marked as not given.
That's certainly possible with languages like bash where there are typically (never?) no explicit parameters, the function is expected to pop arguments off the argument list and deal with them as it sees fit. Then a missing argument is literally missing from the argument list, and all the parameter binding logic that the Python interpreter handles for you has to be handled manually by the programmer. Bleh.
We could emulate that in Python by having the interpreter flag "optional without a default" parameters in such a way that the parameter remains unbound when called without an argument, but why would we want such a thing? That's truly a YAGNI anti-feature.
If you really want to emulate bash, we can just declare the function to take `*args` and manage it ourselves, like bash.
Python, so far, has completely conflated those two concepts: if a parameter is optional, it always has a default argument value. (The converse is implied - mandatory args can't have defaults.)
If a mandatory parameter has a default argument, it would never be used, because the function is always called with an argument for that parameter. So we have a four-way table:
                 Mandatory      Optional
                 Parameters     Parameters
    -----------  -------------  -----------
    No default:  okay           YAGNI [1]
    Default:     pointless      okay
    -----------  -------------  -----------
[1] And if you do need it, it is easy to emulate with a sentinel:
    def func(arg=None):
        if arg is None:
            del arg
        process(arg)  # May raise UnboundLocalError
None of this is relevant to the question of when the default values should be evaluated.
Agreed on all points, but I think the YAGNIness of it is less strong than you imply. Rather than a sentinel object, this would be a sentinel lack-of-object. The point of avoiding sentinels is, like everywhere else, that the sentinel isn't really a value - it's just a marker saying "this arg wasn't passed". So I agree with you that this isn't a feature of huge value, and it's not one that I'm pushing for, but it is at least internally consistent and well-defined. Usually, what ends up happening is that there's code somewhere saying "if arg is _sentinel:" (or "if arg is None:"), which would have exactly the same meaning without the sentinel, if we had a way to say "if arg isn't set". ChrisA
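For concreteness, the pattern in question looks like this today (a sketch; `_sentinel` and `get_setting` are illustrative names):

    _sentinel = object()

    def get_setting(name, default=_sentinel):
        if default is _sentinel:  # stands in for "no argument was passed"
            raise KeyError(name)
        return default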
On Sat, Oct 30, 2021 at 03:52:14PM -0700, Brendan Barnwell wrote:
The way you use the term "default value" doesn't quite work for me. More and more I think that part of the issue here is that "hi=>len(a)" we aren't providing a default value at all. What we're providing is default *code*. To me a "value" is something that you can assign to a variable, and current (aka "early bound") argument defaults are that.
I think you are twisting the ordinary meaning of "default value" to breaking point in order to argue against this proposal.

When we talk about "default values" for function parameters, we always mean something very simple: when you call the function, you can leave out the argument for some parameter, and it will automatically be assigned a default value by the time the code in the body of the function runs. It says nothing about how that default value is computed, or where it comes from; it says nothing about whether it is evaluated at compile time, or function creation time, or when the function is called.

In this case, there are at least three models for providing that default value:

1. The value must be something which can be computed by the compiler, at compile time, without access to the runtime environment. Nothing that cannot be evaluated by static analysis can be used as the default. (In practice, that may limit defaults to literals.)

2. The value must be something which can be computed by the interpreter at function definition time. (Early binding.) This is the status quo for Python, but not for other languages like Smalltalk.

3. The value can be computed at function call time. (Late binding.) That's what Lisp (Smalltalk? others?) does.

They are still default values, and it is specious to call them "code". From the perspective of the programmer, the parameter is bound to a value at call time, just like early binding.

Recall that the interpreter still has to execute code to get the early bound value. It doesn't happen by magic: the interpreter needs to run code to fetch the precomputed value and bind it to the parameter. Late binding is *exactly* the same, except we leave out the "pre". The thing bound to the parameter is still a value, not the code used to generate that value.

The Python status quo (early binding, #2 above) is that if you want to delay the computation until call time, you have to use a sentinel, then calculate the default value yourself inside the body of the function, then bind it to the parameter manually. But that's just a work-around for lack of interpreter support for late binding.
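The difference between models 2 and 3 shows up most clearly with a mutable default. A minimal illustration of the status quo (model 2, early binding):

    def append_to(item, target=[]):   # default evaluated once, at definition time
        target.append(item)
        return target

    append_to(1)  # [1]
    append_to(2)  # [1, 2] -- the single precomputed list is reused

Under model 3, each call that omits `target` would evaluate `[]` afresh and get its own list.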
But these new late-bound arguments aren't really default "values",
Of course they are, in the only sense that matters: when the body of the function runs, the parameter is bound to an object. That object is a value. Where the interpreter got that value from, and when it was evaluated, is neither here nor there. It is still a value one way or another.
they're code that is run under certain circumstances. If we want to make them values, we need to define "evaluate len(a) later" as some kind of first-class value.
No we don't. Evaluating the default value later from outside the function's local scope will usually fail, and even if it doesn't fail, there's no use-case for it. (Yet. If a good one comes up, we can add an API for it later.) And evaluating it from inside the function's local scope is unnecessary, as the interpreter has already done it for you.

I believe that the PEP should declare that how the unevaluated defaults are stored prior to evaluation is a private implementation detail. We need an API (in the inspect module? as a method on function objects?) to allow consumers to query those defaults for human-readable text, e.g. needed by help(). But beyond that, I think they should be an opaque blob.

Consider the way we implement comprehensions as functions. But we don't make that a language rule. It's just an implementation detail, and may change. Likewise unevaluated defaults may be functions, or ASTs, or blobs of compiled byte-code, or code objects, or even just stored as source code. Whatever the implementors choose.
Increasingly it seems to me as if you are placing inordinate weight on the idea that the benefit of default arguments is providing a "human readable" description in the default help() and so on. And, to be frank, I just don't care about that.
Better help() is just the icing on the cake. The bigger advantages are that we can reduce the need for special sentinels, and that default values no longer need to be evaluated by hand in the body of the function, as the interpreter handles them. And we write the default value in the function signature, where it belongs, instead of in the body.

Don't discount the value of having the interpreter take on the grunt-work of evaluating defaults. Whatever extra complexity goes into the interpreter will be outweighed by the reduced complexity of a million functions no longer having to manually test for a sentinel and evaluate a default. Reducing grunt work is a good thing.

Remember, there are languages (Perl? bash?) where there aren't even parameters to functions: the interpreter merely provides you with an argument list, and you are responsible for popping the values out of the list and binding them to the variables you want. If we think that having to pop arguments from an argument list is unbelievably primitive, but are happy with having to check for a sentinel value and then evaluate the actual desired default value yourself, then I think we are falling for the Blub paradox.

"Late bound defaults? Who needs them? It's bloat and needless frippery."
"Default values? Who needs them? It's bloat and needless frippery."
"Named parameters? Who needs them? It's bloat and needless frippery."
"Functions? Who needs them? It's bloat and needless frippery."

I have a mate who worked for a boss who was still arguing in favour of unstructured code without functions or named subroutines in the late 1990s. At the same time that they were working to fix the Y2K problem, his boss was telling them that GOTO was better than named functions, because it was no problem whatsoever to GOTO the line you wanted to jump to and then jump back again with another GOTO when you were finished. Anything more than that was just unnecessary bloat and frippery.

-- Steve
On Sun, Oct 31, 2021 at 1:52 PM Steven D'Aprano <steve@pearwood.info> wrote:
I believe that the PEP should declare that how the unevaluated defaults are stored prior to evaluation is a private implementation detail. We need an API (in the inspect module? as a method on function objects?) to allow consumers to query those defaults for human-readable text, e.g. needed by help(). But beyond that, I think they should be an opaque blob.
I haven't re-posted, but while writing up the implementation, I did add a section on implementation details to the PEP. https://www.python.org/dev/peps/pep-0671/#implementation-details (FWIW, it's very similar to what you were saying in the other thread, only I chose to use Ellipsis, since it's commonly understood as a placeholder, rather than NotImplemented, which is a special signal for binary operators.) ChrisA
On Sat, 30 Oct 2021, Brendan Barnwell wrote:
I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions.
Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*. This doesn't seem weird to me.
If we have a way to create deferred expressions we should try to make them more generally usable.
Does anyone have a proposal for deferred expressions that could match the ease of use of PEP 671 in assigning a default argument of, say, `[]`? The proposals I've seen so far in this thread involve checking `isdeferred` and then resolving that deferred. This doesn't seem any easier than the existing sentinel approach for default arguments, whereas PEP 671 significantly simplifies this use-case.

I also don't see how a function could distinguish a deferred default argument from a deferred argument passed in from another function. In my opinion, the latter would be really messy/dangerous to work with, because it could arbitrarily pollute your scope. Whereas late-bound default arguments make a lot of sense: they're written in the function itself (just in the signature instead of the body), so we can see by looking at the code what happens. I've written code in dynamically scoped languages before. I don't recall enjoying it.

But maybe I missed a proposal, or someone has an idea for how to fix these issues.

Erik
-- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
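To make the comparison concrete, here is the existing sentinel approach that PEP 671 would fold into the signature (a sketch; `add_item` is an illustrative name):

    def add_item(item, target=None):
        if target is None:   # None stands in for "no argument was passed"
            target = []
        target.append(item)
        return target

Under PEP 671 the whole dance collapses to `def add_item(item, target=>[]):`; an `isdeferred`-style check would leave the body just as long as it is now.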
On Sat, 30 Oct 2021, Erik Demaine wrote:
Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*.
I was thinking about what other forms of deferred evaluation Python has, and ran into descriptors [https://docs.python.org/3/howto/descriptor.html]. Classes support this mechanism for calling arbitrary code when accessing the attribute, instead of when calling the class:

```
class CallMeLater:
    '''Descriptor for calling a specified function with no arguments.'''
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        return self.func()

class Foo:
    early_list = []
    late_list = CallMeLater(lambda: [])

foo1 = Foo()
foo2 = Foo()
foo1.early_list == foo2.early_list == foo1.late_list == foo2.late_list
foo1.early_list is foo2.early_list  # the same []
foo1.late_list is not foo2.late_list  # two different []s
```

Written this way, it feels quite a bit like early and late arguments to me. So this got me thinking:

What if parameter defaults supported descriptors? Specifically, something like the following: If a parameter (passed or defaulted) has a __get__ method, call it with one argument (beyond self), namely, the function scope's locals(). Parameters are so processed in order from left to right.

(PEPs 549 and 649 are somewhat related in that they also propose extending descriptors.)

This would enable the following hand-rolled late-bound defaults (using two early-bound defaults):

```
def foo(early_list = [], late_list = CallMeLater(lambda: [])):
    ...
```

Or we could write a decorator to make this somewhat cleaner:

```
def late_defaults(func):
    '''Convert callable defaults into late-bound defaults'''
    func.__defaults__ = tuple(
        CallMeLater(default) if callable(default) else default
        for default in func.__defaults__
    )
    return func

@late_defaults
def foo(early_list = [], late_list = lambda: []):
    ...
```

It's also possible, but difficult, to write `end := len(a)` defaults:

```
class LateLength:
    '''Descriptor for calling len(specified name)'''
    def __init__(self, name):
        self.name = name
    def __get__(self, locals):
        return len(locals[self.name])
    def __repr__(self):
        # This is bad form for repr, but it makes help(bisect)
        # output the "right" thing: end=len(a)
        return f'len({self.name})'

def bisect(a, start=0, end=LateLength('a')):
    ...
```

One feature/bug of this approach is that someone calling the function could pass in a descriptor, and its __get__ method will get called by the function (immediately at the start of the call). Personally I find this dangerous, but those excited about general deferreds might like it? At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument.

Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`). In addition to feeling safer (to me), this would enable a lot of optimization:

* Parameters without defaults don't need any __get__ checking.
* Default values could be checked for the presence of a __get__ method at function definition time (or when setting func.__defaults__), and that flag could get checked at function call time, and __get__ semantics occur only when that flag is set. (I'm not sure whether this would actually save time, though. Maybe if it were a global flag for the function, "any late-bound arguments here?". If not, old behavior and performance.)

This proposal could be compatible with PEP 671. What I find nice about this proposal is that it's valid Python syntax today, just an extension of the data model.
But I wouldn't necessarily want to use the ugly incantations above, but rather some syntactic sugar on top of them -- and that's where PEP 671 could come in. What this proposal might offer is a *meaning* for that syntactic sugar, which is more general and perhaps more Pythonic (building on the existing Python data model). It provides another way to think about what the notation in PEP 671 means, and suggests a (different) mechanism to implement it.

Some nice features:

* __defaults__ naturally generalizes here; no need for auxiliary structures or different signatures for __defaults__. A tool looking at __defaults__ could either be aware of descriptors in this context or not. All other introspection should be the same.
* It becomes possible to skip a positional argument again: pass in the value in __defaults__ and it will behave as if that argument wasn't passed.
* The syntactic sugar could build a __repr__ (or some new dunder like __help__) that makes help() output the right thing, as in the example above.

The use of locals() (as an argument to __get__) is rather ugly, and probably prevents name lookup optimization. Perhaps there's a better way, at least with the syntactic sugar. For example, in CPython, late-bound defaults using the syntactic sugar could compile the function to include some bytecode that sets the __get__ function's frame to be the function's frame before it gets called. Hmm, but then the function needs to know whether it's the default or something else that got passed in...

What do people think? I'm still thinking about possible repercussions, but it seems like a promising direction to explore...

Erik
-- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Mon, Nov 1, 2021 at 2:39 AM Erik Demaine <edemaine@mit.edu> wrote:
On Sat, 30 Oct 2021, Erik Demaine wrote:
Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*.
I was thinking about what other forms of deferred evaluation Python has, and ran into descriptors [https://docs.python.org/3/howto/descriptor.html]. Classes support this mechanism for calling arbitrary code when accessing the attribute, instead of when calling the class:
```
class CallMeLater:
    '''Descriptor for calling a specified function with no arguments.'''
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        return self.func()

class Foo:
    early_list = []
    late_list = CallMeLater(lambda: [])

foo1 = Foo()
foo2 = Foo()
foo1.early_list == foo2.early_list == foo1.late_list == foo2.late_list
foo1.early_list is foo2.early_list  # the same []
foo1.late_list is not foo2.late_list  # two different []s
```
Written this way, it feels quite a bit like early and late arguments to me. So this got me thinking:
What if parameter defaults supported descriptors? Specifically, something like the following:
If a parameter (passed or defaulted) has a __get__ method, call it with one argument (beyond self), namely, the function scope's locals(). Parameters are so processed in order from left to right.
(PEPs 549 and 649 are somewhat related in that they also propose extending descriptors.)
This is incompatible with the existing __get__ method, so it should get a different name. Also, functions have a __get__ method, so you definitely don't want to have everything that takes a callback run into this. Let's say it's __delayed__ instead.
This would enable the following hand-rolled late-bound defaults (using two early-bound defaults):
```
def foo(early_list = [], late_list = CallMeLater(lambda: [])):
    ...
```
Or we could write a decorator to make this somewhat cleaner:
```
def late_defaults(func):
    '''Convert callable defaults into late-bound defaults'''
    func.__defaults__ = tuple(
        CallMeLater(default) if callable(default) else default
        for default in func.__defaults__
    )
    return func

@late_defaults
def foo(early_list = [], late_list = lambda: []):
    ...
```
It's also possible, but difficult, to write `end := len(a)` defaults:
```
class LateLength:
    '''Descriptor for calling len(specified name)'''
    def __init__(self, name):
        self.name = name
    def __get__(self, locals):
        return len(locals[self.name])
    def __repr__(self):
        # This is bad form for repr, but it makes help(bisect)
        # output the "right" thing: end=len(a)
        return f'len({self.name})'

def bisect(a, start=0, end=LateLength('a')):
    ...
```
I'm having a LOT of trouble seeing this as an improvement.
One feature/bug of this approach is that someone calling the function could pass in a descriptor, and its __get__ method will get called by the function (immediately at the start of the call). Personally I find this dangerous, but those excited about general deferreds might like it? At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument.
Yes, which means you can't access nonlocals or globals, only locals. So it has a subset of functionality in an awkward way.
Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`).
That part's not a problem; if this has language support, it could be much more explicit: "if the end parameter was not set".
This proposal could be compatible with PEP 671. What I find nice about this proposal is that it's valid Python syntax today, just an extension of the data model. But I wouldn't necessarily want to use the ugly incantations above, and rather use some syntactic sugar on top of it -- and that's where PEP 671 could come in. What this proposal might offer is a *meaning* for that syntactic sugar, which is more general and perhaps more Pythonic (building on the existing Python data model). It provides another way to think about what the notation in PEP 671 means, and suggests a (different) mechanism to implement it.
I'm not seeing this as less ugly. You have the exact same problems, plus some more, AND it becomes impossible to have an object with this method as an early default - that's the sentinel problem.
Some nice features:
* __defaults__ naturally generalizes here; no need for auxiliary structures or different signatures for __defaults__. A tool looking at __defaults__ could either be aware of descriptors in this context or not. All other introspection should be the same.
You've just highlighted the sentinel problem: there is no value which can be used in __defaults__ that couldn't have been a viable early-bound default.
* It becomes possible to skip a positional argument again: pass in the value in __defaults__ and it will behave as if that argument wasn't passed.
That's not as valuable as you might think. Faking that an argument wasn't passed - that is, passing an argument that pretends that an argument wasn't passed - is already dubious, and it doesn't work with *args. It would also prevent the safety check that I used above; you have to completely conflate "passed this value" and "didn't pass any value".
The use of locals() (as an argument to __get__) is rather ugly, and probably prevents name lookup optimization.
Yes. It also prevents use of anything other than locals. For instance, you can't have global helper functions, or anything like that; you could use something like len() from the builtins, but you couldn't use a function defined in the same module. Passing both globals and locals would be better, but still imperfect; and it incurs double lookups every time.
Perhaps there's a better way, at least with the syntactic sugar. For example, in CPython, late-bound defaults using the syntactic sugar could compile the function to include some bytecode that sets the __get__ function's frame to be the function's frame before it gets called. Hmm, but then the function needs to know whether it's the default or something else that got passed in...
Yes, it does. Which doesn't work if you want to be able to pass the default to pretend that nothing was passed.
What do people think? I'm still thinking about possible repercussions, but it seems like a promising direction to explore...
Sure. Explore anything you like! But I don't think that this is any less ugly than either the status quo or PEP 671, both of which involve actual real code being parsed by the compiler. ChrisA
On Mon, 1 Nov 2021, Chris Angelico wrote:
This is incompatible with the existing __get__ method, so it should get a different name. Also, functions have a __get__ method, so you definitely don't want to have everything that takes a callback run into this. Let's say it's __delayed__ instead.
Right, good point. I'm clearly still learning about descriptors. :-)
I'm having a LOT of trouble seeing this as an improvement.
It's not meant to be an improvement exactly, more of a compatible explanation of how PEP 671 works -- in the same way that `instance.method` doesn't "magically" make a bound method, but rather checks whether `instance.method` has a `__get__` attribute, and if so, calls it with `instance` as an argument, instead of returning `instance.method` directly. This mechanism makes the whole `instance.method` less magic, more introspectable, more overridable, etc., e.g. making classmethod and similar decorators possible. I'm trying to do the same thing with PEP 671 (though possibly failing :-)).
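For readers following along, the standard descriptor protocol in miniature (nothing new here, just the existing data model):

    class Ten:
        def __get__(self, obj, objtype=None):
            return 10

    class A:
        x = Ten()

    A().x  # 10 -- the attribute lookup invoked Ten.__get__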
At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument.
Yes, which means you can't access nonlocals or globals, only locals. So it has a subset of functionality in an awkward way.
My actual intent was to just be able to access the arguments, which are all locals to the function. [Conceptually, I was thinking of the arguments being in their own object, and then getting accessed once like attributes, which triggered __get__ if defined -- but this view isn't very good, in particular because we don't want to redefine what it means to pass functions as arguments!] But the __delayed__ method is already a function, so it has its own locals, nonlocals, and globals. The difference is that those are in the frame of __delayed__, which is outside the function with the defaults, and I wanted to access that function's arguments -- hence passing in the function's locals().
Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`).
That part's not a problem; if this has language support, it could be much more explicit: "if the end parameter was not set".
True. I was trying to preserve the "skip this argument" property, but it might make more sense to call __delayed__ only when the argument is omitted. This might make it possible for defaults with __delayed__ methods to actually be evaluated in the function's scope, which would make it more compatible with the current PEP 671.
AND it becomes impossible to have an object with this method as an early default - that's the sentinel problem.
That's true. I guess my point is that these *are* early defaults, but act very much like late defaults. Functions or function calls just treat these early defaults specially because they have a __delayed__ method. I agree it's not perfect, but is there a context where you'd actually want to have an early default that is one of these objects? The point is to add a method to an early default that makes the early default behave like a late default. So this feels like expected behavior...?
The use of locals() (as an argument to __get__) is rather ugly, and probably prevents name lookup optimization.
Yes. It also prevents use of anything other than locals. For instance, you can't have global helper functions, or anything like that; you could use something like len() from the builtins, but you couldn't use a function defined in the same module. Passing both globals and locals would be better, but still imperfect; and it incurs double lookups every time.
That wasn't my intent. The __delayed__ method is still a function, and has its own locals, nonlocals, and globals. It can still call len (as my example code did) -- it's just the len visible from the __delayed__ function, not the len visible from the function with the default parameter. It's true that this approach would prevent implementing something like this:

```
def foo(a => (b := 5)):
    nonlocal b
```

I'm not sure that that is particularly important: I just wanted the default expression to be able to access the arguments and the surrounding scopes.
Sure. Explore anything you like! But I don't think that this is any less ugly than either the status quo or PEP 671, both of which involve actual real code being parsed by the compiler.
This proposal was meant to help define what the compiler with PEP 671 parsed code *into*. Erik -- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Sun, Oct 31, 2021 at 9:09 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I'm -100 now on "deferred evaluation, but contorted to be useless outside of argument declarations."
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
It's nothing more than an implementation detail. If you want to suggest - or better yet, write - an alternative implementation, I would welcome it. Can you explain how this is "actively harmful"? ChrisA
On Sat, Oct 30, 2021, 6:29 PM Chris Angelico <rosuav@gmail.com> wrote:
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
It's nothing more than an implementation detail. If you want to suggest -
or better yet, write - an alternative implementation, I would welcome it. Can you explain how this is "actively harmful"?
Both the choice of syntax and the discussion of proposed implementation (both yours and Steven's) would make it more difficult later to advocate and implement a more general "deferred" mechanism in the future.

If you were proposing the form that MAL and I proposed (and a few others basically agreed with) of having a keyword like 'defer' that could, in concept, only be initially available in function signatures but later be extended to other contexts, I wouldn't see a harm. Maybe Steven's `@name=` could accommodate that too.

I'm not sure what I think of a general statement like:

    @do_later = fun1(data) + fun2(data)

I.e. we expect to evaluate the first-class object `do_later` in some other context, but only if requested within a program branch where `data` is in scope. The similarity to decorators feels wrong, even though I think it's probably not ambiguous syntactically.

In a sense, the implementation doesn't matter as much if the syntax is something that could be used more widely. Clearly, adding something to the dunders of a function object isn't a general mechanism, but if behavior was kept consistent, the underlying implementation could change in principle.

Still, the check for a sentinel in the first few lines of a function body is easy and fairly obvious, as well as long-standing. New syntax for a trivial use is just clutter and cognitive burden for learners and users.
On Sun, Oct 31, 2021 at 12:31 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sat, Oct 30, 2021, 6:29 PM Chris Angelico <rosuav@gmail.com> wrote:
At first I thought it might be harmless, but nothing I really care about. After the discussion, I think the PEP would be actively harmful to future Python features.
It's nothing more than an implementation detail. If you want to suggest - or better yet, write - an alternative implementation, I would welcome it. Can you explain how this is "actively harmful"?
Both the choice of syntax and the discussion of proposed implementation (both yours and Steven's) would make it more difficult later to advocate and implement a more general "deferred" mechanism in the future.
If you were proposing the form that MAL and I proposed (and a few others basically agreed) of having a keyword like 'defer' that could, in concept, only be initially available in function signatures but later be extended to other contexts, I wouldn't see a harm. Maybe Steven's `@name=` could accommodate that too.
I'm not sure what I think of a general statement like:
@do_later = fun1(data) + fun2(data)
I.e. we expect to evaluate the first class object `do_later` in some other context, but only if requested within a program branch where `data` is in scope.
The problem here is that you're creating an object that can be evaluated in someone else's scope. I'm not creating that. I'm creating something that gets evaluated in its own scope - the function currently being defined. If you want to create a "deferred" type, go ahead, but it won't conflict with this. There wouldn't be much to gain by restricting it to function arguments. Go ahead and write up a competing proposal - it's much more general than this.
The similarity to decorators feels wrong, even though I think it's probably not ambiguous syntactically.
The way you've written it, it's bound to an assignment, which seems very odd. Are you creating an arbitrary object which can be evaluated in some other context? Wouldn't that be some sort of constructor call?
In a sense, the implementation doesn't matter as much if the syntax is something that could be used more widely. Clearly, adding something to the dunders of a function object isn't a general mechanism, but if behavior was kept consistent, the underlying implementation could change in principle.
Still, the check for a sentinel in the first few lines of a function body is easy and fairly obvious, as well as long-standing. New syntax for a trivial use is just clutter and cognitive burden for learners and users.
A trivial use? But a very common one. The more general case would be of value, but would also be much more cognitive burden. But feel free to write something up and see how that goes. Maybe it will make a competing proposal to PEP 671; or maybe it'll end up being completely independent. ChrisA
I'm a bit confused as to why folks are making pronouncements about their support for this PEP before it's even finished, but, oh well.

As for what seems like one major issue: yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal: it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple. In contrast, a general deferred object would, to me, be really confusing about what scope it would get evaluated in -- I can't even imagine how I would do that -- how the heck am I supposed to know what names will be available in some function scope I pass this thing into???

Also, this would only allow a single expression, not an arbitrary amount of code -- if we're going to have some sort of "deferred object", folks will very soon want more than that, and want full deferred function evaluation. So that really is a whole other kettle of fish, and should be considered entirely separately.

As for inspect -- yes, it would be great for these late-evaluated defaults to have a good representation there, but I can only see that as opening the door to a more featureful deferred object, certainly not closing it.

-CHB

-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Warning: Bear of Very Little Brain talking. Aren't we in danger of missing the wood for the (deferred-evaluation) trees here?

Late-bound-argument defaults are USEFUL. Being able to specify that a default value is a new empty list or a new empty set is USEFUL. Plenty of people have asked on Stack Overflow why it doesn't "work" already. Yes, lots of those people will still ask, but with late-bound defaults they can get a completely satisfactory answer.

Using None or another sentinel to mean "no value was passed" is basically a hack. A kludge. It is MUCH CLEANER to get a default value when no value is passed without resorting to such a hack. Which is fine if the default value is constant (please don't waste your time quibbling with me if I get the technicalities wrong) and can be evaluated early, but needs a late-bound default if ... well, if it needs to be evaluated late.

Furthermore, having a late-bound default which refers to preceding parameters, such as ..., hi:=len(a) ..., is totally clear to anyone who understands function parameter default values (i.e. pretty much anybody who can read or write functions). And again VERY convenient. As Chris A has already said, the idiom

    if param is None:
        param = []  # or whatever

is VERY frequent. And saving 2 lines of code in a particular scenario may be no great virtue, but when it happens so VERY frequently (with no loss of clarity) it certainly is. PEP 671, if accepted, will undoubtedly be USEFUL to Python programmers.

As for deferred-evaluation objects: first, Python already has various ways of doing deferred evaluation:

- Lambdas.
- Strings, which can be passed to eval / exec / compile.
- Decorator(s) you can write for functions to make them defer their evaluation.
- A class of deferred-evaluation objects you can write yourself.

None of these ways is perfect; each has its pros and cons. The bottom line, if I understand correctly (maybe I don't), is that there has to be a way of specifying (implicitly or explicitly) when the (deferred) evaluation occurs, and also what the evaluation context is (e.g. for eval, locals and globals must be specified, either explicitly or implicitly).

So maybe it would be nice if Python had its own "proper" deferred-evaluation model. But it doesn't. And as far as I know, there isn't a PEP, either completed or at least well on the way, which proposes such a model. (Perhaps it's a really difficult problem. Perhaps there is not enough interest for someone to have already tried it. Perhaps there are so many ways of doing it, as CHB has said, that it's hard to decide which. I don't know.) If there were, it would be perfectly reasonable to ask how it would interact with PEP 671. But as it's just vapourware, it seems wrong to stall PEP 671 (how long? indefinitely?) because of the POSSIBILITY that it MIGHT turn out to be more convenient for some future model (IF that ever happens) if PEP 671 had been implemented in a different way.

[I just saw a post from David Mertz giving an idea, using a "delay" keyword, before I finished composing this email. But that is just a sketch, miles away from a fully-fledged proposal or PEP. And probably just one of very many that have appeared over the years on envelope backs before vanishing into the sunset ...]

Best wishes
Rob Cliffe
On Sun, Oct 31, 2021 at 2:20 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
As for deferred evaluation objects: First, Python already has various ways of doing deferred evaluation: Lambdas Strings, which can be passed to eval / exec / compile. You can write decorator(s) for functions to make them defer their evaluation. You can write a class of deferred-evaluation objects. None of these ways is perfect. Each have their pros and cons. The bottom line, if I understand correctly (maybe I don't) is that there has to be a way of specifying (implicitly or explicitly) when the (deferred) evaluation occurs, and also what the evaluation context is (e.g. for eval, locals and globals must be specified, either explicitly or implicitly).
That last part is the most important here: it has to be evaluated *in the context of the function*. That's the only way for things like "def f(a, n=len(a)):" to be possible. Every piece of code in Python is executed, if it is ever executed, in the context that it was written in. Before f-strings were implemented, this was debated in some detail, and there is no easy way to transfer a context around usefully. ChrisA
On Sun, Oct 31, 2021 at 02:24:10PM +1100, Chris Angelico wrote:
That last part is the most important here: it has to be evaluated *in the context of the function*. That's the only way for things like "def f(a, n=len(a)):" to be possible.
Agreed so far.
Every piece of code in Python is executed, if it is ever executed, in the context that it was written in.
I don't think that's quite right. We can eval() and exec() source code, ASTs and code objects in any namespace we have access to, including plain old dicts, with some limitations. (E.g. we can't get access to other function's namespace, not even if we have their locals() dict. At least not in CPython.) In the case of default expressions: def func(spam=early_expression, @eggs=late_expression): early_expression is evaluated in the scope surrounding func (it has to be since func doesn't exist yet!) and late_expression needs to be evaluated inside func's scope, rather than the scope it was written in. -- Steve
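For example, with an ordinary dict as the namespace (standard eval behaviour):

    >>> ns = {'a': [10, 20, 30]}
    >>> eval('len(a)', ns)
    3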
On Sun, Oct 31, 2021 at 6:36 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 31, 2021 at 02:24:10PM +1100, Chris Angelico wrote:
That last part is the most important here: it has to be evaluated *in the context of the function*. That's the only way for things like "def f(a, n=len(a)):" to be possible.
Agreed so far.
Every piece of code in Python is executed, if it is ever executed, in the context that it was written in.
I don't think that's quite right. We can eval() and exec() source code, ASTs and code objects in any namespace we have access to, including plain old dicts, with some limitations. (E.g. we can't get access to other function's namespace, not even if we have their locals() dict. At least not in CPython.)
True, I was a bit sloppy with my definitions there; let me try that again. Every piece of compiled Python code is executed, if it is ever executed, in a context defined by the location where it was compiled. With eval/exec, they're compiled in their own dedicated context (at least, as of Py3 - I don't think I fully understand what Py2 did there). You can provide a couple of dictionaries to help define that context, but it's still its own dedicated context. ASTs don't have contexts yet, but at the point where you compile it the rest of the way, it gets one. Code objects have their contexts fully defined. To my knowledge, there is no way to run code in any context other than the one it was compiled in, although you can come close by updating a globals dictionary. You can't get closure references (nonlocals) for eval/exec, and I don't think it's possible to finish compiling AST to bytecode in any way that allows you to access more nonlocals.
In the case of default expressions:
def func(spam=early_expression, @eggs=late_expression):
early_expression is evaluated in the scope surrounding func (it has to be since func doesn't exist yet!) and late_expression needs to be evaluated inside func's scope, rather than the scope it was written in.
Actually, by the time you're compiling that line of code, func DOES exist, to some extent. You can't compile a def statement without simultaneously compiling both its body (the scope of func) and its surrounding context (whatever that was written in). It's a little messier now since you can have each of those contexts getting code added to it, but that's still a limited number of options - for instance, you can't have a default expression that gets evaluated in the context of an imported module, nor one that's evaluated in the *caller's* context. (I'm aware that I'm using the word "context" here to mean something that exists at compilation time, and elsewhere I've used the same word to mean something that only exists at run time. Unfortunately, English has only so many ways to express the same sorts of concepts, so we end up reusing. Sorry.) ChrisA
On 10/30/2021 10:08 PM, Christopher Barker wrote:
I'm a bit confused as to why folks are making pronouncements about their support for this PEP before it's even finished, but, oh well.

I think it's safe to say people are opposed to the PEP as it currently stands, not in its final, as yet unseen, shape. But I'm willing to use other words than "I'm -1 on PEP 671". You can read my opposition as "as it currently stands, I'm -1 on PEP 671".

As for what seems like one major issue:
Yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal, it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple.
And to me and others, what you see as a strength, and seem opposed to changing, we see as a fatal flaw.

What if the walrus operator could only be used in "for" loops? What if f-strings were only available in function parameters? What if decorators could only be used on free-standing functions, but not on object methods?

In all of these cases, what could be a general-purpose tool would have been restricted to one specific context. That would make the language more confusing to learn. I feel you're proposing the same sort of thing with late-bound function argument defaults. And I think it's a mistake.

If these features had been added in their limited form above, would it be possible to extend them in the future? As they were ultimately implemented, yes, of course. But it's entirely possible that if we were proposing the limited version above we could make a design decision that would prevent them from being more widely used in the future. The most obvious being the syntax used to specify them, but I think that's not the only consideration.
In contrast, a general deferred object would, to me, be really confusing about what scope it would get evaluated in -- I can't even imagine how I would do that -- how the heck am I supposed to know what names will be available in some function scope I pass this thing into??? Also, this would only allow a single expression, not an arbitrary amount of code -- if we're going to have some sort of "deferred object" -- folks will very soon want more than that, and want full deferred function evaluation. So that really is a whole other kettle of fish, and should be considered entirely separately.
And again, this is where we disagree. I think it should be considered in the full context of places it might be useful. I (and I think others) are concerned that we'd be painting ourselves into a corner with this proposal. For example, if the delayed evaluation were available as text via inspect.Signature, we'd be stuck with supporting that forever, even if we later added delayed evaluation objects to the language.

I also have other problems with the PEP, not specifically about restricting the scope of where deferred evaluations are allowed. Most importantly, that it doesn't add enough expressiveness to the language to justify its existence as a new feature that everyone would have to learn. But also things such as: Where do exceptions get caught and handled (only by the caller)? How would you pass in "just use the default" from a wrapper function? And others. But even if they were all addressed except for the restricted nature of the feature, I'd still be -1 on the PEP.

Eric
On Sun, Oct 31, 2021 at 11:47 PM Eric V. Smith <eric@trueblade.com> wrote:
On 10/30/2021 10:08 PM, Christopher Barker wrote:
I'm a bit confused as to why folks are making pronouncements about their support for this PEP before it's even finished, but, oh well. I think it's safe to say people are opposed to the PEP as it current stands, not in it's final, as yet unseen, shape. But I'm willing to use other words that "I'm -1 on PEP 671". You can read my opposition as "as it currently stands, I'm -1 on PEP 671". As for what seems like one major issue:
Yes, this is a kind of "deferred" evaluation, but it is not a general purpose one, and that, I think, is the strength of the proposal, it's small and specific, and, most importantly, the scope in which the expression will be evaluated is clear and simple.
And to me and others, what you see as a strength, and seem opposed to changing, we see as a fatal flaw.
What if the walrus operator could only be used in "for" loops? What if f-strings were only available in function parameters? What if decorators could only be used on free-standing functions, but not on object methods?
In all of these cases, what could be a general-purpose tool would have been restricted to one specific context. That would make the language more confusing to learn. I feel you're proposing the same sort of thing with late-bound function argument defaults. And I think it's a mistake.
Deferred expressions are not the same as late-bound argument defaults. What is the correct behaviour here?

    def foo(a=>[1,2,3], b=>len(a)):
        a.append(4)
        print(b)

And what is the correct behaviour here?

    def foo(a=defer [1,2,3], b=defer len(a)):
        a.append(4)
        print(b)

When is 'a' evaluated and the list constructed? When is the length calculated and stored in 'b'? With argument defaults, it's clear: this happens as the function is called. (See other thread for a subtlety about whether this happens during frame construction or as the function-proper begins execution, but that is a minor distinction that doesn't affect non-generators very much.) With deferreds, the usual expectation is that they are evaluated on usage, which could be a very different point.

Late-bound defaults are NOT "deferreds but limited to function headers". They are quite different. You can think of them as a sort of deferred expression if that helps, but they're not a specialization of a more general feature.

ChrisA
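The two behaviours can be spelled out with today's Python, which makes the difference in evaluation time explicit (a sketch; `foo_late` and `foo_defer` are illustrative names):

    def foo_late(a=None):
        a = [1, 2, 3] if a is None else a
        b = len(a)          # late-bound default: evaluated as the call begins
        a.append(4)
        print(b)            # 3

    def foo_defer(a=None):
        a = [1, 2, 3] if a is None else a
        b = lambda: len(a)  # deferred: evaluated at the point of use
        a.append(4)
        print(b())          # 4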
On Sun, Oct 31, 2021, 8:59 AM Chris Angelico
def foo1(a=>[1,2,3], b=>len(a)):
    a.append(4)
    print(b)

And what is the correct behaviour here?

def foo2(a=defer [1,2,3], b=defer len(a)):
    a.append(4)
    print(b)
This is a nice example. I agree they are different in the natural reading of each. Specifically, suppose both features had been added to the language. I would expect foo1() to print "3" and foo2() to print "4".

This is also a good example of why the more general feature is BETTER. It is easy to emulate the foo1() behavior with 'defer', but impossible to emulate foo2() using '=>'. E.g.

    def foo3(a=defer [1,2,3], b=defer len(a)):  # behaves like foo1()
        b = b  # or eval_b = b and use new name in body
        a.append(4)
        print(b)

Note this:

    def foo4(a=defer [1,2,3], b=defer len(a)):
        print(b)  # prints 3

In order to print we actually need to walk a DAG. 'b' is an "unevaluated" object, but the interpreter would need to recognize that it depends on unevaluated 'a' ... and so on, however far up the tree it needed to walk to have only regular values (or raise a NameError maybe).

This is all precisely prior art, and is what is done by Dask Delayed: https://docs.dask.org/en/stable/delayed.html

I think it would be better as actual syntax, but generally Dask already does what I want. The amazingly powerful thing about constructing a DAG of deferred computation is that you can evaluate only the intermediate results in a complex tree if that is all you concretely need.

I recognize that this is more complex than the niche case of late evaluation of formal parameters. But I consider that niche case trivial, and certainly not worth special syntax. In contrast, the niche case falls out seamlessly from the more general idea.

In terms of other prior art, deferred evaluation is the default behavior in Haskell. I admit that I find a strictly functional language with no mutability a PITA. But inasmuch as Haskell has some elegance, and sometimes reasonably fast performance, it is largely because delayed evaluation is baked in.
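For reference, the foo4 DAG walk looks roughly like this in Dask (assuming `dask` is installed; a sketch, not tested here):

    from dask import delayed

    a = delayed(list)([1, 2, 3])  # an unevaluated task, not a list
    b = delayed(len)(a)           # a task that depends on the unevaluated a
    b.compute()                   # walks the DAG: builds a, then len(a) -> 3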
On Mon, Nov 1, 2021 at 2:59 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Sun, Oct 31, 2021, 8:59 AM Chris Angelico
def foo1(a=>[1,2,3], b=>len(a)):
    a.append(4)
    print(b)

And what is the correct behaviour here?

def foo2(a=defer [1,2,3], b=defer len(a)):
    a.append(4)
    print(b)
This is a nice example. I agree they are different in the natural reading of each.
Specifically, suppose both features had been added to the language. I would expect foo1() to print "3" and foo2() to print "4".
This is also a good example of why the more general feature is BETTER. It is easy to emulate the foo1() behavior with 'defer', but impossible to emulate foo2() using '=>'.
I'd actually say that this is a good example of why the more general feature is DIFFERENT. The emulation argument is good, but we can already emulate late-binding behaviour using early-binding, and we can emulate deferred evaluation using functions, and so on; the fact that you can emulate one thing with another does not mean that it's of no value to have it. Deferred evaluation has its own set of problems, its own set of features, its own set of edge cases. I strongly encourage you to write up a detailed specification as a completely separate proposal.
E.g.
def foo3(a=defer [1,2,3], b=defer len(a)):  # behaves like foo1()
    b = b  # or eval_b = b and use new name in body
    a.append(4)
    print(b)

Note this:

def foo4(a=defer [1,2,3], b=defer len(a)):
    print(b)  # prints 3
In order to print we actually need to walk a DAG. 'b' is an "unevaluated" object, but the interpreter would need to recognize that it depends on unevaluated 'a' ... and so on, however far up the tree it needed to walk to have only regular values (or raise a NameError maybe).
This is all precisely prior art, and is what is done by Dask Delayed: https://docs.dask.org/en/stable/delayed.html
I think it would be better as actual syntax, but generally Dask already does what I want.
The amazingly powerful thing about constructing a DAG of deferred computation is that you can evaluate only the intermediate results in a complex tree if that is all you concretely need.
I recognize that this is more complex than the niche case of late evaluation of formal parameters. But I consider that niche case trivial, and certainly not worth special syntax.
In contrast, the niche case falls out seamlessly from the more general idea.
All this is excellent and very useful, but I don't think it's the same thing as function defaults.
In terms of other prior art, deferred evaluation is the default behavior in Haskell. I admit that I find a strictly functional language with no mutability a PITA. But inasmuch as Haskell has some elegance, and sometimes reasonably fast performance, it is largely because delayed evaluation is baked in.
When mutability does not exist, deferred evaluation becomes more a matter of optimization. Plus, people code to the language they're working in - if, for example, it's normal to write deeply recursive algorithms in a language that heavily optimizes recursion, that doesn't necessarily mean that it's better to write those same algorithms the same way in other languages. If both PEP 671 and some form of deferred expression were both accepted, I think their best interaction would be in introspection (help(), inspect, etc) - the descriptive part of a late-evaluated default could be reconstructed from the AST on demand. ChrisA
On Mon, Nov 1, 2021 at 5:15 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 1/11/21 4:59 am, David Mertz, Ph.D. wrote:
b = b
I don't want to live in a universe where this could be anything other than a no-op in Python.
Be careful what you say: there are some technicalities. If you mean that it won't change the behaviour of the object referred to by b, then I absolutely agree, but there are ways that this can be more than a no-op. Notably, it has very good meaning as a keyword argument (it means "pass b along, named b"), and as a function parameter (meaning "accept b, defaulting to b from the outer scope"); and even as a stand-alone statement, it isn't technically meaningless (it'll force b to be a local). But yes, I agree that I don't want this to force the evaluation of something, which continues to be called b. Even though that's technically possible already if you have a weird namespace, I wouldn't call that a good way to write code. ChrisA
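The parameter-default case, concretely:

    b = 10

    def f(b=b):   # the default is the outer b, captured at definition time
        return b

    f()    # 10
    f(99)  # 99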
Agreed, class namespaces are weird. :-)
-- --Guido (mobile)
On Mon, Nov 1, 2021 at 5:57 PM Guido van Rossum <guido@python.org> wrote:
Agreed, class namespaces are weird. :-)
Ah yes, I forgot about class namespaces. I was thinking about deliberately wonky namespaces where the ns dict has a __missing__ method or something, but inside a class, "b = b" actually has a very useful meaning :) It still doesn't change the behaviour of the object b though. ChrisA
On 2021-10-31 05:57, Chris Angelico wrote:
Deferred expressions are not the same as late-bound argument defaults.
You keep saying this, but I still don't get the point. Descriptors are not the same as properties, but we can implement properties with descriptors. Decorators are not the same as callables, but we can implement decorators with callables. The plus sign is not the same as an __add__ method, but we can define the implementation of + in terms of __add__ (and __radd__) methods on objects. As far as I can tell the only real difference is that you seem very intent on the idea that the default argument is ONLY evaluated right at the beginning of the function and not later. But so what? A general deferred expression would still allow you to do that, it would just also allow you to evaluate the deferred expression at some other point. That might be useful or it might not, but I don't see how it prevents a deferred object from serving the function of a late-bound default. The question isn't whether deferred objects and late-bound defaults "are the same", but whether one can provide a more general framework within which the other can be expressed. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Mon, Nov 1, 2021 at 5:03 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
The whole point of deferred expressions is the time delay, but function defaults have to be evaluated immediately upon entering the function. You're contorting all manner of things to try to come up with something that would work, and the result is, in my opinion, quite inelegant; it doesn't do everything that function defaults need to, requires inordinate amounts of code to accomplish simple tasks, and makes implications that are entirely unnecessary. Deferred expressions of various kinds are certainly possible, but they are not a useful implementation of argument defaults, due to those exact contortions. ChrisA
On 31/10/2021 18:00, Brendan Barnwell wrote:

A general deferred expression would still allow you to do that, it would just also allow you to evaluate the deferred expression at some other point.

Wonderful! +1,000,000. So are you going to write a PEP explaining exactly how you propose to implement deferred expressions and how they would be used? And perhaps provide a reference implementation? Then we can see how it works, and how it can be used to replace or improve PEP 671? Say, next month?

I really don't want to be rude, but I can't think of anything more appropriate to express my frustration/impatience (and I'm not the one doing the work! I can only guess how Chris A feels) with this foot-dragging than the adage (possibly from poker): "Put up or shut up". The first car would never have been built (in 1885 or thereabouts) if the investors had insisted it wasn't worth doing unless it had air bags, satellite GPS, in-car radio and cruise control.

PEP 671 will be USEFUL to Python programmers. We want it! (When do we want it? Now!)

Best wishes
Rob Cliffe
[snip]
On Sun, Oct 31, 2021, 5:39 PM Rob Cliffe via Python-ideas
PEP 671 will be USEFUL to Python programmers. We want it! (When do we want it? Now!)
This feels dishonest. I believe I qualify as a Python programmer. I started using Python 1.4 in 1998. The large majority of my work life since then has been programming Python. I've written a bunch of books on Python. I was a director of the PSF. I DO NOT want PEP 671, and do not feel it would be a net benefit to Python programmers. I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold. I was the first in this thread to propose a far more general capability that I think WOULD meet the cost/benefit balance... and I proposed it because I think there is a way to meet the niche need that also has wide enough application to warrant new syntax. Python isn't Perl. Not every piece of syntax that *someone* might occasionally use for a narrow need should be included in the language.
On Mon, Nov 1, 2021 at 8:56 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold.
How do you currently teach about mutable argument defaults? With this proposal, it will become trivially easy to have an argument default that's evaluated fresh every time. Does that not count as beneficial?
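For concreteness, the gotcha in question is the classic one:

def add_item(item, target=[]):   # the [] is created once, when the def statement runs
    target.append(item)
    return target

print(add_item(1))   # [1]
print(add_item(2))   # [1, 2] -- the same list again, which usually surprises people

Under the PEP, spelling it target=>[] would simply rebuild the list on each call.

ChrisA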
On Sun, Oct 31, 2021, 6:11 PM Chris Angelico
On Mon, Nov 1, 2021 at 8:56 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold.
How do you currently teach about mutable argument defaults? With this proposal, it will become trivially easy to have an argument default that's evaluated fresh every time. Does that not count as beneficial?
I teach folks to use a sentinel. Yes, it is genuinely a thing to learn, but it takes far less mental effort than special syntax and a different evaluation model. At least 99% of the time, the None sentinel is fine and the best choice.... Yes, I know there are RARE cases where None isn't a good sentinel, but I can't recall the last time I encountered that situation. I've myself made errors with mutable defaults. Perhaps not for a few years, but certainly years after I should have known better. Yes, there genuinely is a possible bug with using defaults. However, I believe that having two different kinds of default parameter bindings would lead to a much larger number of bugs than the status quo. I think this would be true even if the syntax made the distinction obvious... And the '=>' syntax is far from intuitive to start with. It *could* be memorized, but it's definitely not intuitive. This isn't just beginners either. For example, in a few days, I'm giving a short talk about the pitfalls of using lru_cache with mutable arguments to folks at my work. They have, typically, 5-10 years Python experience, yet that's an error I've found in production code. This really isn't the same issue as this PEP, but it's an example of where just a little extra complexity gets experienced developers confused.
On 2021-10-31 15:08, Chris Angelico wrote:
On Mon, Nov 1, 2021 at 8:56 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I am not on the SC, so indeed I won't make the decision. But I *will* continue to teach Python. New syntax adds a burden to learners, and it should not be introduced without sufficient benefit to merit that burden. This proposal does not come close to that threshold.
How do you currently teach about mutable argument defaults? With this proposal, it will become trivially easy to have an argument default that's evaluated fresh every time. Does that not count as beneficial?
This is a good question. When I've taught Python, I teach about mutable argument defaults by building on what's been taught about mutable objects in general. As we know, there are many things about mutable objects that can confuse newbies (or even people with some experience), and many of them have nothing to do with default arguments. For instance, people are sometimes surprised if they do

x = [1, 2, 3]
some_function(x)

... and then find that x has changed in the calling environment because `some_function` mutated it. Or sometimes they're surprised "from the inside" because they're the one writing `some_function` and they mutate the argument and didn't realize that could disrupt other code that calls their function. And so on.

Once people understand the general need to be careful about mutable objects and where they're mutated, using them as argument defaults is not really a huge additional obstacle. Basically you just have to make clear that defaults are evaluated only once, when they write the function, and not again and again each time it is called. If people understand that, they will basically understand mutable argument defaults, because it is essentially the same situation as in the example above, and various other cases. It just means "be careful when you mutate a mutable object: someone else might be using that object too".

Of course, they will forget this and make mistakes. So having late-bound defaults would be a bit of a help. But my point is that many of the problems learners have with mutable default arguments are really problems with mutable objects in general, and having late-bound defaults won't help with those more general confusions.

So the situation is the same as before: yes, there will be a bit of a benefit, but the benefit is limited. Also, from a teaching perspective, there is an important cost as well, which is that you have to teach students about the new syntax and how it differs from early-bound defaults, and students have to develop the skill of reading function signatures that are now (potentially) more complex than they were before.

Based on my own (admittedly limited) experience I'm not sure if late-bound defaults would be a net win in terms of teaching ease. Right now the situation is actually not too crazy to explain because the handling of mutable defaults (i.e., the "if arg is None" stuff) goes into the function body and can be described as part of other general "prep work" that functions may need to do at the beginning (like, say, converting inputs to lowercase or doing some sanity checks on arguments). But the new proposal is something that would actually have to be learned as a separate thing because of its in-between nature (i.e., now the argument list becomes a mix of things, some of which execute in the defining context and some in the calling context).

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
I definitely agree with that sentiment; with beginners I don't even talk about function defaults at first, and when I do, it's when we have already had a talk about mutables, so I can just say that you almost never want a mutable default but rather use None as a sentinel. It's not that hard and it serves as a reminder of how mutables work, so it's actually good for teaching! I don't look forward to having to add yet another side note about syntactic sugar that does not really add much value (it saves a few characters but it's less clear, and relying on code to document the parameters is a bit meh imo). Because I won't burden beginners who are already having to ingest a lot of things with a new model of evaluation. I guess an alternative could be to only teach late binding, but since all the code written so far is early bound, it's not practical. Cheers, E
On Mon, Nov 01, 2021 at 09:39:01AM +0100, Evpok Padding wrote:
I don't look forward to having to add yet another side note about syntactic sugar that does not really add much value (it saves a few characters but it's less clear
This proposal is not about saving a few characters. We could keep the PEP and change the syntax to use a long keyword:

def func(arg=late_binding_through_delayed_evaluation expression)

(where "expression" is an actual expression) and it would still have the same benefits. Just more awkward to type :-)
and relying on code to document the parameters is a bit meh imo).
Do you think it is a problem that help() can introspect function signatures and report what the actual defaults are, rather than whatever lies are put in the docstring? I think that is a fantastic feature for Python, but it only applies to early defaults.

def func(arg=''):
    """Return a thing.

    Arguments:
        arg is a string, defaults to space.
    """

When possible, the single source of truth for a function's defaults should be the actual parameter defaults, regardless of when the default is evaluated.
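For instance, a quick check of the introspection I mean (inspect reads the real default straight off the function, no matter what the docstring claims):

import inspect

def func(arg=''):
    """Return a thing.  (The docstring claims the default is a space.)"""

print(inspect.signature(func))   # (arg='') -- the actual default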
Because I won't burden beginners who are already having to ingest a lot of thing with a new model of evaluation.
We're not proposing this feature for the benefit of newbies and beginners. Python is remarkably beginner friendly, but it's not Scratch. The first edition of Learning Python (Mark Lutz and David Ascher) didn't introduce function defaults until page 122, right at the end of Chapter Four. And of course they had to mention the mutable object gotcha. We do people a disservice if we don't introduce at least the concept of when the default is evaluated. If we don't, they will invariably trip into the "mutable default" gotcha on their own and confuse themselves. I don't think the distinction between early and late binding is a hard concept to teach. What does this do?

def func(arg=print("Hello")):
    return arg

func()
func()

That's all you need to show to demonstrate early binding. Now this:

def func(arg=late print("Hello")):
    return arg

func()
func()

Having *both* options available, and teaching it as a choice, will (I think) make it easier to teach, not harder. "Write `arg=default` if you want the default to be evaluated just once, and `arg=late default` if you want the default to be evaluated each time it is needed. If you are not sure which one you need, use the first." That's the sort of advice I would have loved when I was a newbie. Short, simple, straight to the point, and not reliant on knowing what "is None" means.

-- Steve
Steven D'Aprano writes:
"Write `arg=default` if you want the default to be evaluated just once, and `arg=late default` if you want the default to be evaluated each time it is needed. If you are not sure which one you need, use the first."
Which of course is ambiguous, since the argument may be referenced many times in the function body or only late in the body. Someone who doesn't yet understand how early binding of mutable defaults works is also somewhat likely to misunderstand "when needed" as a promise that resolves to an object the first time it is referenced, or as a thunk that gets run every time it is referenced, both of which are incorrect. What you should write to be (more) accurate is

    Write `arg=default` if you want the default to be evaluated when the function is defined [and the value to be stored in the function to be used when it is called], and `arg=late default` if you want the default to be evaluated each time the function is called. If you are not sure which one you need, ask someone to help you, because using the wrong one is a common source of bugs.

The part in brackets is a gloss that might be helpful to the beginner, who might experience a WTF at "evaluated when defined".

To be honest, I was surprised you chose early binding for "when in doubt". I would expect that "early binding when late is appropriate" is a much more common bug for beginners than "late binding when early is appropriate". Of course I may be biased because it's the only bug now, but I would think that would continue to be true for beginners if late binding is made available. Especially when they're testing code in the interactive interpreter. There are probably many programs where the function is called with the argument missing only once, in which case it doesn't matter, until you invoke the function repeatedly in the same context.
On Wed, Nov 3, 2021 at 6:01 PM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
Steven D'Aprano writes:
"Write `arg=default` if you want the default to be evaluated just once, and `arg=late default` if you want the default to be evaluated each time it is needed. If you are not sure which one you need, use the first."
Which of course is ambiguous, since the argument may be referenced many times in the function body or only late in the body. Someone who doesn't yet understand how early binding of mutable defaults works is also somewhat likely to misunderstand "when needed" as a promise that resolves to an object the first time it is referenced, or as a thunk that gets run every time it is referenced, both of which are incorrect.
People will often have completely wrong understandings about things. While it's possible to avoid some of those, we can't hold the language back because *someone who doesn't understand* might misunderstand. Mutable objects in general tend to be misunderstood. Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated (eg generators when you next() them). Resolving a promise to an object on first reference would break that pattern completely, so I would expect most people to assume that "each time it is needed" means "each time you omit the argument". And if they don't, they'll figure it out and go digging. Or not, as the case may be; I've seen misunderstandings linger in people's brains for a long time without ever being disproven. (My own brain included.)
What you should write to be (more) accurate is
Write `arg=default` if you want the default to be evaluted when the function is defined [and the value to be stored in the function to be used when it is called], and `arg=late default` if you want the default to be evaluated each time the function is called. If you are not sure which one you need, ask someone to help you, because using the wrong one is a common source of bugs.
The part in brackets is a gloss that might be helpful to the beginner, who might experience a WTF at "evaluated when defined"
And that's what happens when you need to be pedantically correct. Not particularly useful to a novice, especially with the FUD at the end.
To be honest, I was surprised you chose early binding for "when in doubt". I would expect that "early binding when late is appropriate" is a much more common bug for beginners than "late binding when early is appropriate". Of course I may be biased because it's the only bug now, but I would think that would continue to be true for beginners if late binding is made available. Especially when they're testing code in the interactive interpreter. There are probably many programs where the function is called with the argument missing only once, in which case it doesn't matter, until you invoke the function repeatedly in the same context.
There's a performance cost to late binding when the result will always be the same. The compiler could optimize "=>constant" to "=constant" at compilation time [1], but if it's not actually a compile-time constant, only a human can know that that can be done. But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization? I'm disinclined to choose at this point, and will leave that up to educators and style guide authors.
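To put a rough number on that cost, here's a sketch using today's sentinel emulation to stand in for a late default (exact numbers will vary):

import timeit

def early(x=[1, 2, 3]):       # the default is built once, at definition time
    return x

def late_emulated(x=None):    # a stand-in for a late default: rebuilt every call
    if x is None:
        x = [1, 2, 3]
    return x

print(timeit.timeit(early))          # cheap: just a load
print(timeit.timeit(late_emulated))  # pays for a fresh list on every call

ChrisA

[1] Technically, there'd still be a difference, but only if you mess with the function's dunders. So for safety, probably the compiler should never optimize it.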
But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization?
Given the decades of code that is using early binding now, I think it would be a really bad idea not to teach that as the default -- folks absolutely need to understand early binding and its limitations. But at the end of the day, other than the legacy, I think having a late binding option will be a bit easier for newbies, but not radically different. -CHB
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Chris Angelico writes:
While it's possible to avoid some of those, we can't hold the language back because *someone who doesn't understand* might misunderstand.
Opposing the proposal wasn't the point of quoting Steve, the point was to provide IMO improved language in case the proposal gets adopted. So far, I oppose this proposal because I don't need it, I don't see that anybody else needs it *enough to add syntax*, and I especially don't see that anybody else needs it enough to add syntax that in the opinion of some folks who know this stuff way better than me *might get in the way of a more general feature in the nearish future*. (I know you differ on that point and I understand why, I just don't yet agree.) None of that has anything to do with user misunderstanding, it's a different assessment of the benefits and costs of adoption.
Resolving a promise to an object on first reference would break that pattern completely, so I would expect most people to assume that "each time it is needed" means "each time you omit the argument".
Back to the educational issues: We're not discussing most people. I understand that the new syntax is well-defined and not hard to understand. To me, most people aren't an educational issue. We're discussing programmers new to all patterns Pythonic. You're already breaking the pattern of immediate evaluation (if they understand it), and the example of generators (well-understood before default arguments are?) shows that objects that appear to be defined may be evaluated when called for. Come to think of it, I don't think next() is the comparable point, it's when you call the generator function to get an iterable object. IMO, that's a better argument for your point of view.
And that's what happens when you need to be pedantically correct. Not particularly useful to a novice, especially with the FUD at the end.
I have no idea what you mean by "that's what happens," except that apparently you don't like it. As I see it, a novice will know what a function definition is, and where it is in her code. She will know what a function call is, and where it is in her code. She will have some idea of the order in which things "get done" (in Python, "executed", but she may or may not understand that def is an executable statement). She can see the items in question, or see that they're not there when the argument is defaulted. To me, that concrete explanation in terms of the code on the screen will be more useful *to the novice* than the more abstract "when needed". As for the "FUD", are you implying that you agree with Steve's proposed text? So that if the programmer is unsure, it's perfectly OK to use early binding, no bugs there?
There's a performance cost to late binding when the result will always be the same. The compiler could optimize "=>constant" to "=constant" at compilation time [1], but if it's not actually a compile-time constant, only a human can know that that can be done.
We're talking about folks new to the late-binding syntax and probably to Python who are in doubt. I don't think performance over correctness is what we want to emphasize here.
But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization? I'm disinclined to choose at this point, and will leave that up to educators and style guide authors.
I think most educators will go with "when in doubt, ask a mentor if available, or study harder if not -- anyway, you'll get it soon, it's not that hard", or perhaps offer a much longer paragraph of concrete advice. Style guide authors should not touch it, because it's not a matter of style when either choice is a potential bug.

And that's why I think the benefits are basically limited to introspecting the deferred object, whether it's a special deferred object or an eval-able equivalent string. The choice is unpleasant for proponents: if you choose a string, Eric's "but then we have to support strings even if we get something better" is a criticism, and if you choose a descriptor-like protocol, David's criticism that it can and should be more general comes to bear.

You say that there's a deficiency with a generic deferred, in that in your plan late-bound argument expressions can access nonlocals -- but that's not a problem if the deferred is implemented as a closure. (This is an extension of a suggestion by Steven d'Aprano.) I would expect that anyway if generic deferreds are implemented, since they would be objects that could be returned from functions and "ejected" from the defining namespace.

It seems to me that access to nonlocals is also a problem for functions with late-bound argument defaults, if such a function is returned as the value of the defining function. I suppose it would have to be solved in the same way, by making that function a closure. So the difference is whether the closure is in the default itself, or in the function where the default is defined. But the basic issue, and its solution, is the same. That might be "un-Pythonic" for deferreds, but at the moment I can't see why it's more un-Pythonic for deferreds than for local functions as return values.

Regards, Steve
On Thu, Nov 4, 2021 at 5:28 AM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
And that's what happens when you need to be pedantically correct. Not particularly useful to a novice, especially with the FUD at the end.
I have no idea what you mean by "that's what happens," except that apparently you don't like it.
What I mean is that pedantically correct language inevitably ends up way too verbose to be useful in an educational context. (Please explain the behaviour of "yield from" in a generator. Ensure that you are absolutely precisely correct.)
As I see it, a novice will know what a function definition is, and where it is in her code. She will know what a function call is, and where it is in her code. She will have some idea of the order in which things "get done" (in Python, "executed", but she may or may not understand that def is an executable statement).
Given the number of people who assume that function definitions are declarations, it's clear that some things simply have to be learned.
She can see the items in question, or see that they're not there when the argument is defaulted. To me, that concrete explanation in terms of the code on the screen will be more useful *to the novice* than the more abstract "when needed".
As for the "FUD", are you implying that you agree with Steve's proposed text? So that if the programmer is unsure, it's perfectly OK to use early binding, no bugs there?
If the programmer is unsure, go ahead and pick something, then move on. It's better to just try something and go than to agonize over which one you should use. Tell people that something is crucially important to get right, and they're more likely to be afraid of it. Give them a viable default (pun partly intended) and it's far less scary.
There's a performance cost to late binding when the result will always be the same. The compiler could optimize "=>constant" to "=constant" at compilation time [1], but if it's not actually a compile-time constant, only a human can know that that can be done.
We're talking about folks new to the late-binding syntax and probably to Python who are in doubt. I don't think performance over correctness is what we want to emphasize here.
Maybe, but there's also a lot of value in defaulting to the fast option. For instance, in a lot of situations, these two will behave identically:

for key in some_dict: ...
for key in list(some_dict): ...

We default to iterating over the object itself, even though that could break if you mutate the dict during the loop. The slower and less efficient form is reserved for situations where it matters. That said, it might be better in this case to recommend late-binding by default. But most importantly, either recommended default is better than making things sound scary.
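A quick demonstration of when the copy matters, in case it's not obvious:

d = {"a": 1, "b": 2}

for key in list(d):        # snapshot of the keys, so mutating d is safe
    if d[key] % 2 == 0:
        del d[key]

# The same loop written as "for key in d:" would die with
# RuntimeError: dictionary changed size during iteration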
But then the question becomes: should we recommend late-binding by default, with early-binding as an easy optimization? I'm disinclined to choose at this point, and will leave that up to educators and style guide authors.
I think most educators will go with "when in doubt, ask a mentor if available, or study harder if not -- anyway, you'll get it soon, it's not that hard", or perhaps offer a much longer paragraph of concrete advice. Style guide authors should not touch it, because it's not a matter of style when either choice is a potential bug.
I would initially just recommend early-binding by default, since it's going to have better cross-version compatibility. By the time that's no longer a consideration, I personally, and the world in general, will have a lot more experience with the feature, so we'll be able to make a more informed decision.
You say that there's a deficiency with a generic deferred, in that in your plan late-bound argument expressions can access nonlocals -- but that's not a problem if the deferred is implemented as a closure. (This is an extension of a suggestion by Steven d'Aprano.) I would expect that anyway if generic deferreds are implemented, since they would be objects that could be returned from functions and "ejected" from the defining namespace.
If the deferred is implemented as a closure, it would be useless for this proposal. Look at the clunky proposals created to support the bisect example, and the weird messes to do things that, with a little bit of compiler support, are just ordinary variable references.
It seems to me that access to nonlocals is also a problem for functions with late-bound argument defaults, if such a function is returned as the value of the defining function. I suppose it would have to be solved in the same way, by making that function a closure. So the difference is whether the closure is in the default itself, or in the function where the default is defined. But the basic issue, and its solution, is the same. That might be "un-Pythonic" for deferreds, but at the moment I can't see why it's more un-Pythonic for deferreds than for local functions as return values.
Not sure what you mean, but the way I've implemented it, if you refer to a nonlocal in a late-bound default, it makes a closure just the same as if you referred to it in the function body. This is another reason that generic deferreds wouldn't work, since the compiler knows about the late default while compiling the *parent* function (and can thus make closure cells as appropriate).
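The same mechanism is visible today with the sentinel emulation -- a default expression that refers to an enclosing variable is an ordinary closure reference (a quick sketch):

def outer():
    scale = 10
    _missing = object()
    def inner(x, factor=_missing):
        if factor is _missing:
            factor = scale    # nonlocal reference: the compiler allocates a cell
        return x * factor
    return inner

f = outer()
print(f.__closure__)   # a tuple with one cell, holding 'scale'
print(f(3))            # 30

ChrisA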
Chris Angelico writes:
What I mean is that pedantically correct language inevitably ends up way too verbose to be useful in an educational context.
Nonsense. If you leave out the part in brackets and the "FUD", it's *much* shorter than what Steve wrote, and more accurate. What would require much more verbiage is a proper explanation of the behavior of mutable objects and how to figure out when you want early binding and when you want late binding. But none of us even tried to do that, so you can't hang that on me.
That said, it might be better in this case to recommend late-binding by default. But most importantly, either recommended default is better than making things sound scary.
So is no recommended default. Why not just tell the student the fact that early binding is faster?
If the deferred is implemented as a closure, it would be useless for this proposal.
I don't believe that, because this is exactly how Common Lisp handles late-bound default expressions, by creating a closure. In fact there's no dedicated syntax for early-bound defaults at all; you just quote them, and the quote expression gets evaluated. (The compiler is allowed, but not required, to optimize the evaluation away if it can prove that this does not ever affect the value of the argument.)
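In today's Python, that closure-based approach can be sketched with a decorator; `late` and `late_defaults` here are invented names for illustration, not a proposal:

import functools
import inspect

class late:
    # Invented marker: a default whose thunk runs at call time,
    # with access to the arguments bound so far.
    def __init__(self, thunk):
        self.thunk = thunk

def late_defaults(func):
    sig = inspect.signature(func)
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            if isinstance(value, late):
                bound.arguments[name] = value.thunk(bound.arguments)
        return func(*bound.args, **bound.kwargs)
    return wrapper

@late_defaults
def bisect(a, hi=late(lambda args: len(args['a']))):
    return hi

print(bisect([1, 2, 3]))         # 3 -- len(a) evaluated at call time
print(bisect([1, 2, 3], hi=1))   # 1 -- an explicit argument suppresses the thunk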
On 03/11/2021 18:28, Stephen J. Turnbull wrote:

So far, I oppose this proposal because I don't need it, I don't see that anybody else needs it *enough to add syntax*, and I especially don't see that anybody else needs it enough to add syntax that in the opinion of some folks who know this stuff way better than me *might get in the way of a more general feature in the nearish future*.

Some people clearly need it, viz. those who use a default such as `[]` and then ask on Stack Overflow why they get surprising results. With late binding you can do anything that you can do with early binding, but not vice versa. And IMO late binding is actually more intuitive - YMMV. You say you don't need late binding, but I would be very surprised if you never found a situation where it was useful. (Starting with avoiding the 'if x == None: x = []' idiom, but I am sure there will be others, like the `hi=>len(a)` example.) In short, if it was available, I think you would find uses for it.

As for the "more general feature": it has shown no signs of emerging in the last decade or more, and there is no indication that it will in the "nearish future". (Define "nearish": 1 year, 5, 10, 20 ... never?) It is a *very* tricky thing indeed to specify, and understand, and use; it is arguably a niche use case that not many people will want; and if one day it finally appears, it can probably be reconciled with a long-implemented PEP 671 anyway. In short, this "more general feature" is a myth. A phantom. And, frankly, an excuse to argue against a PEP which will have immediate benefit to some (lots of?) Python programmers.

Come to think of it, I don't think next() is the comparable point, it's when you call the generator function to get an iterable object. IMO, that's a better argument for your point of view.

Not sure what your point is. If people understand (in the example of generators) that objects that appear to be defined may be evaluated when called for, they shouldn't have too much difficulty understanding late-bound defaults. If they don't, they may still find late-bound defaults intuitive. And if not, well, everyone has to start learning somewhere.

As I see it, a novice will know what a function definition is, and where it is in her code. She will know what a function call is, and where it is in her code. [...]

(Semi-jocular point) I know you're trying not to be sexist, and yet perhaps in a way you are. Can we adopt a convention of using the male pronoun for novice programmers and the female for experienced ones? After all, in the very early days of computing, virtually all the coders were (underpaid) women. 😁

Rob Cliffe
On Wed, Nov 3, 2021, 9:19 PM Rob Cliffe via Python-ideas
Some people clearly need it, viz. those who use a default such as `[]` and then ask on Stack Overflow why they get surprising results.
This is silly. All those folks on StackOverflow are told "use a sentinel." The fact beginners can make a mistake doesn't mean a feature is wrong, it means beginners are beginners. They don't NEED it, there are existing solutions. Even though I don't support this proposal, there are things that beginners ask about that we don't NEED but are still worth adding. For example, even though I was only lukewarm in support of the walrus operator, I agree it makes some code constructs more concise and more readable. But it WAS new syntax to do the same thing that was already possible with an extra line or two before. I recognize that in many ways this proposal is similar. It's extra syntax to make a certain coding pattern shorter. I don't believe that's absurd, I just think the balance tips the other way. What this covers is less important than what the walrus operator covers, because all the syntax proposed is uglier and less intuitive than walrus, and because it may obstruct a much more important general feature I'd like to have added.

With late binding you can do anything that you can do with early binding, but not vice versa. And IMO late binding is actually more intuitive - YMMV.
This seems exactly opposite the real situation. Late binding is completely and straightforwardly handled by a sentinel. Yes, it doesn't make the automatic help() that pretty. Yes, it takes an extra line in the body. But the semantics are available. In contrast, how can a late binding call POSSIBLY know what the default value was at the point of function definition?!

x = 1234
def foo(a, b=x):
    ...  # whatever

x = 567
foo(88)

That definition-time value of 'x' is just lost. I don't consider that behavior especially important, I admit. There are plenty of names, and if you want one not to change, don't change it. Indeed, if Python 0.9 had come with late binding, my feelings about Python and its popularity would probably be nearly identical. But it didn't. So now we are discussing confusing and subtle syntax variations for a niche use case, and I don't believe that's worthwhile.
On Thu, Nov 4, 2021 at 12:42 PM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
No one is suggesting removing early-binding from the language. But even if we were, it wouldn't be hard to solve this purported impossibility with a decorator. For instance, the classic "loop to make a bunch of functions" problem could be solved this way:

def snapshot(*values):
    def wrapper(f):
        @functools.wraps(f)
        def wrapped(*a, **kw):
            return f(*a, **kw, snapshot=values)
        return wrapped
    return wrapper

for n in range(10):
    @snapshot(n)
    def func(btn, *, snapshot):
        print("Clicked on", snapshot[0])
    new_button(onclick=func)

Tada, early binding created by closure. Pretty much by definition, nothing that we create is truly new; it's just a question of how awkward it is to spell something. But the existing spelling for argument defaults will continue to have the existing semantics. That isn't changing. All that's changing is that there will be a new way to spell parameters with defaults, which will have slightly different semantics. (If Python had chosen late-binding and hadn't had decorators, it still wouldn't have been too hard to do things - all you'd need is an explicit closure at the definition site, which would, of course, achieve the same goal. But I think the snapshot decorator is more elegant, since it can be reused.)

ChrisA
For example, even though I was only lukewarm in support of the walrus operator, I agree it makes a some code constructs more concise and more readable. But it WAS new syntax to do the same thing that was already possible with an extra line or two before.
It's extra syntax to make a certain coding pattern shorter. I don't believe that's absurd, I just think the balance tips the other way.
It’s a little more than just shorter. There is no way to universally spell “not specified”: None works fine in most cases, but not all. Custom sentinels can be confusing to users, etc.

All that being said, like any other PEP, there are two questions: 1) will this be an improvement? 2) if so, is it worth the churn? And the SC will need to make those decisions. FWIW, I’m not totally sure where I come down on (2) myself.

because it may obstruct a much more important general feature I'd like to have added.
Could someone please flesh out this objection? I can’t see at all why having late bound defaults will obstruct the addition of a general purpose deferred evaluation system. Except maybe because we could no longer use late bound defaults as a use case for a deferred object. But as Chris A has made clear, a general purpose deferred object isn’t a great fit for this use case anyway. -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Thu, Nov 4, 2021 at 3:56 PM Christopher Barker <pythonchb@gmail.com> wrote:
All that being said, like any other PEP, there are two questions:
1) will this be an improvement? 2) if so, is it worth the churn?
And the SC will need to make those decisions.
FWIW, I’m not totally sure where I come down on (2) myself.
To try to help people make their decisions on that point, allow me to try to summarize the churn that will be involved.

1) The grammar for function signatures becomes a little more complicated. To be fair, most of that complication won't actually come up (for instance, you'd never have both an annotation AND a type comment, even though the grammar says you might), but it's extra for tools to have to cope with.

2) Any tool that does introspection of functions (or anything that uses inspect.Signature objects) will need to be updated.

2a) Not just help() and friends; this includes things like clize, which uses defaults to configure argparse.

3) As with any change, documentation and recommendations will have to depend on the version ("for compatibility, use X, otherwise, use Y")

4) Anything that synthesizes function objects will need to consider how to handle late defaults.

I don't think any of it is particularly onerous, but there are quite a few little places to check.
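As a rough illustration of the sort of code point 2 is about (using today's sentinel spelling):

import inspect

def connect(timeout=None):
    ...

# Tools like this walk the parameters and read .default; they would need
# some way to surface a late-bound default's expression as well.
for name, param in inspect.signature(connect).parameters.items():
    print(name, param.default)

ChrisA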
Rob Cliffe via Python-ideas writes:
Some people clearly need it, viz. those who use a default such as `[]` and then ask on Stack Overflow why they get surprising results.
That's not a "need". That's a "misunderstanding".
You say you don't need late binding, but I would be very surprised if you never found a situation where it was useful. (Starting with avoiding the 'if x == None: x = []' idiom,
I'm perfectly happy with 'x = [] if x is None else x'. I've been typing that or 'x = x or []' for 25 years; I'm not going to stop now just because new syntax has been added.
In short, if it was available, I think you would find uses for it.
Speak for yourself, not for me, please. For me, this is not even a "nice to have". I am pretty sure I will only use it if I contribute to a project where defaulting to None and testing in the body is ruled out by the style guide. What little doubt I have is that there is something that this is genuinely needed for, which is introspecting the default expression for a late-bound default. In the case of the '<expr> if x is None else x' idiom, you'd have to disassemble the byte code, if you can find it. Here, you'd have the default expression represented in some accessible form. But I have never yet introspected a function's arguments' defaults, so I suspect I won't do that for late-bound defaults either.
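Concretely, that's the difference between reading the expression off the signature and digging through something like this (a quick sketch):

import dis

def f(x=None):
    if x is None:
        x = []
    return x

dis.dis(f)   # the real default, [], exists only as bytecode inside the body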
Come to think of it, I don't think next() is the comparable point, it's when you call the generator function to get an iterable object. IMO, that's a better argument for your point of view.
Not sure what your point is.
It's exactly what I wrote. In using a generator, next() corresponds to the reference to the argument in the body, while calling the generator function to get the generator's iterator corresponds to the evaluation of the default expression to bind to the argument.
(Semi-jocular point) I know you're trying not to be sexist, and yet perhaps in a way you are.
Point taken, even though it's expressed passive-aggressively. The "semi-jocular" just makes it worse, by the way. All you had to do is ask.
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "ne spaces around '=' in function headers and calls". -- ~Ethan~
On Thu, Nov 4, 2021 at 5:54 AM Ethan Furman <ethan@stoneleaf.us> wrote:
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "ne spaces around '=' in function headers and calls".
Not sure what you mean, but the distinction, if I'm interpreting your statement correctly, is actually the same one you find between different languages. For instance, I tried this in JavaScript (specifically in Node.js):
    function f(a=console.log("Evaluating a")) {
    ... console.log("Function body begins");
    ... console.log("a is", a);
    ... }
    undefined
    f()
    Evaluating a
    Function body begins
    a is undefined
    undefined
Evaluating the argument default as the function begins makes perfect sense. Evaluating the argument default when you *refer to* the variable in question does NOT make sense. (In the JS example, that would be having "Function body begins" before "Evaluating a".) Proposals to have generic deferreds that get calculated when referenced would be incredibly surprising. "Immediately", when code is written in the function header, can be interpreted as "when the function is created" or "when the function is called", but should not be interpreted as "when you use this variable". ChrisA
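For comparison, the three candidate moments in Python terms (a minimal sketch; the middle one is where the proposed late-bound defaults would run, so it is shown as a comment rather than syntax):

    def f(a=print("function created")):   # early-bound: prints once, at definition
        # a late-bound default under this proposal would be evaluated here,
        # as the call begins
        print("function body begins")
        print("a is", a)                  # evaluate-on-use would fire here (rejected)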
On 11/3/21 2:31 PM, Chris Angelico wrote:
On Thu, Nov 4, 2021 at 5:54 AM Ethan Furman wrote:
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "no spaces around '=' in function headers and calls".
Not sure what you mean,
I mean the same thing that D'Aprano has reiterated several times:

    def do_something_fun(target:Any, action:int=-1, permissions:int=>target.perm): pass

vs

    def do_something_fun(target:Any, action:int=-1, @permissions:int=target.perm): pass

Having the `@` in front instead of buried in the middle is clear, and just like the * and ** in `*args` and `**kwds` signal that those are different types of variables, the @ in `@permissions` signals that `permissions` is a different kind of variable -- and yes, the fact that it is late-bound does make it different; to claim otherwise is akin to claiming that `args` and `kwds` aren't different because in the end they are just names bound to objects.
[snip javascript example]
Is your javascript example trying to show that putting the sigil in front is nonsensical? If no, then what? If yes, then it is plain that you and I simply disagree and neither of us is going to convince the other. -- ~Ethan~
On Thu, Nov 4, 2021 at 9:33 AM Ethan Furman <ethan@stoneleaf.us> wrote:
On 11/3/21 2:31 PM, Chris Angelico wrote:
On Thu, Nov 4, 2021 at 5:54 AM Ethan Furman wrote:
On 11/3/21 12:13 AM, Chris Angelico wrote:
Python has a strong policy of evaluating things immediately, or being very clear about when they will be evaluated
Putting the marker in the middle of the name binding expression is not "very clear" -- particularly since the advice is "no spaces around '=' in function headers and calls".
Not sure what you mean,
I mean the same thing that D'Aprano has reiterated several times:
def do_something_fun(target:Any, action:int=-1, permissions:int=>target.perm): pass
vs
def do_something_fun(target:Any, action:int=-1, @permissions:int=target.perm): pass
Having the `@` in front instead of buried in the middle is clear, and just like the * and ** in `*args` and `**kwds` signal that those are different types of variables, the @ in `@permissions` signals that `permissions` is a different kind of variable -- and yes, the fact that it is late-bound does make it different; to claim otherwise is akin to claiming that `args` and `kwds` aren't different because in the end they are just names bound to objects.
[snip javascript example]
Is your javascript example trying to show that putting the sigil in front is nonsensical? If no, then what? If yes, then it is plain that you and I simply disagree and neither of us is going to convince the other.
It's demonstrating that a plain equals sign can mean late-binding in some languages and early-binding in others. Both of those are perfectly normal interpretations. You were quoting something where I was talking about deferreds that would be evaluated on usage, potentially much later in the function, and I'm saying that that makes no sense. I'm also saying that there is no difference between the variables, only the defaults, and therefore that they shouldn't be adorned in this way. But that's clearly something where neither of us is going to convince the other. ChrisA
On Mon, Nov 1, 2021 at 7:39 PM Evpok Padding <evpok.padding@gmail.com> wrote:
I definitely agree with that sentiment; with beginners I don't even talk about function defaults at first, and when I do, it's when we have already had a talk about mutables, so I can just say that you almost never want a mutable default but rather use None as a sentinel. It's not that hard, and it serves as a reminder of how mutables work, so it's actually good for teaching!
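For reference, the pitfall being taught here, as a minimal sketch:

    def append_to(item, bucket=[]):   # one list, created when the def runs
        bucket.append(item)
        return bucket

    append_to(1)   # [1]
    append_to(2)   # [1, 2] -- the same list is shared across calls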
Or you can just say "but rather use => when defining the default", and then you don't have to explain more things like whether to use "== None" or "is None" just to show how to have a default that builds a new thing. ChrisA
Taking a step back: Suppose Python didn't have default values AT ALL for function parameters? Say that unpassed parameters were always set to some sentinel value (maybe None, maybe some special value NotPassed). Would we want to add them to the language? Surely almost everybody would say yes. (I can't believe anyone would be happy with removing them now.) Then there would be a discussion about whether the defaults should be calculated only once (i.e. early-bound) or on every function call (i.e. late-bound). Historically the decision was made to make them early-bound. I don't know how that decision was arrived at; it was before my (Python) time. But consider this: AFAICS, *everything* you can do with early binding, you can do with late binding, but *not* vice versa. (To simulate early binding if you actually only have late binding, simply put the default value in a global variable which you never change, and use that global variable as your default value. As is commonly done today; see the sketch below.) So PEP 671 merely attempts to restore functionality that was (regrettably IMO) left out as a result of that early decision.
Best wishes
Rob Cliffe
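A minimal sketch of that simulation, written with the proposed `=>` syntax (compute_timeout is a stand-in name):

    _DEFAULT_TIMEOUT = compute_timeout()     # evaluated exactly once, at import

    def connect(timeout=>_DEFAULT_TIMEOUT):  # late-bound, but the global is never
        ...                                   # rebound, so it acts early-bound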
On Fri, Nov 5, 2021 at 7:36 AM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
But consider this: AFAICS, *everything* you can do with early binding, you can do with late binding, but *not* vice versa. (To simulate early binding if you actually only have late binding, simply put the default value in a global variable which you never change, and use that global variable as your default value. As is commonly done today.)
Everything you can do with either, you can do with the other. You just demonstrated one way, and if globals won't work, closures will. It's all about expressiveness and clarity of intent. ChrisA
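The closure variant might look like this (again a sketch using the proposed syntax; compute_timeout is a stand-in name):

    def make_connect():
        default = compute_timeout()       # captured once, at definition time
        def connect(timeout=>default):    # late-bound lookup of a constant cell
            ...
        return connect

    connect = make_connect()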
Rob Cliffe via Python-ideas writes:
So PEP 671 merely attempts to restore functionality that was (regrettably IMO) left out as a result of that early decision.
This is a *new* feature, which adds syntax. A lot of contributors to this thread think it's useful enough to overcome the normal Pythonic reluctance to add (1) new features that are syntactic sugar for one-line statements, and (2) new syntax. Others disagree. This utility is either going to be enough to convince the SC, or it's not. It's clear that the battle lines are being drawn.[1] So let's stop trying to convince each other of whether this is a good proposal or not, and turn to making it the best proposal possible, so as to give proponents the best chance of getting it in, and to give opponents the least unpalatable misfeature ;-) if it does get in.

Still on the agenda as far as I can see:

1. Syntax. The proposals I can recall are
   a. x=>default
   b. *x=default
   c. x=@default
   d. maybe putting * or @ on the opposite component in b and c?
   e. a keyword before default, such as "late" or "defer".
   Others? I believe Chris currently favors a.

2. The implementation.
   a. Keep an abstract representation of the default expression as a string in a dunder, and prefix the compiled body with code to evaluate it in the appropriate namespace.
   b. As in a, but the abstract representation is an AST or similar.
   c. Wrap the evaluation in a function (or function-like object) and invoke it before the compiled body (this was suggested by Steven d'Aprano as a compromise, I believe).
   d. Wrap the evaluation in a general-purpose deferred object (this is not in the scope of PEP 671, discussion below).
   I believe Chris's current reference implementation is a (or if I got that wrong, closer to a than any of the others).

It would be helpful to the discussion if Chris starts by striking any of the above that he's unwilling to implement.

A question for Chris: In your proposal, as I understand it, an expensive default expression would always be evaluated, even if it's not always needed. Eg, in this toy example:

    def foo(x:int=>expensive()):
        if delphic_oracle():
            return x
        else:
            return 0

expensive() is always evaluated. In that (presumably quite rare) case, we'd just use a sentinel instead, of course.

I have two further comments, which are mostly addressed to Steve, I guess. First, I don't really understand Steve's intended difference between 2c and 2d. Second, as I understand them, both 2c and 2d encapsulate the expression in bytecode, so the nice property of introspectability of the expression is lost. I guess you can decompile it more easily than if it's just interpolated into the function body, but if it's implemented as a closure, don't we lose the identity of the identifiers in the expression? And if it's not (eg, the function-like thing encapsulates an abstract representation of the expression rather than bytecode that computes it), what's the point of 2c? I don't see how it has any advantage over 2a or 2b.

Footnotes:
[1] Nobody's right, if everybody's wrong. -- Stephen Stills

You can never have too many Steves in a discussion!
On Sat, Nov 6, 2021 at 2:57 AM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
Still on the agenda as far as I can see:
1. Syntax. The proposals I can recall are
   a. x=>default
   b. *x=default
   c. x=@default
   d. maybe putting * or @ on the opposite component in b and c?
   e. a keyword before default, such as "late" or "defer".
   Others? I believe Chris currently favors a.
Yes, I'm currently favouring "x=>default", though weakly; but I strongly favour syntax options that change only the part around the equals sign (no adornment before the variable name, no adornment after the expression). There are a few syntaxes listed in the PEP, and there's a plan in progress to strengthen one of those syntaxes somewhat.
2. The implementation.
   a. Keep an abstract representation of the default expression as a string in a dunder, and prefix the compiled body with code to evaluate it in the appropriate namespace.
   b. As in a, but the abstract representation is an AST or similar.
   c. Wrap the evaluation in a function (or function-like object) and invoke it before the compiled body (this was suggested by Steven d'Aprano as a compromise, I believe).
   d. Wrap the evaluation in a general-purpose deferred object (this is not in the scope of PEP 671, discussion below).
   I believe Chris's current reference implementation is a (or if I got that wrong, closer to a than any of the others).
It would be helpful to the discussion if Chris starts by striking any of the above that he's unwilling to implement.
Sure. Let's see.

a. This is what's currently implemented, plus using Ellipsis in __defaults__ as a marker that there needs to be a default expression. It's a little bit complicated, but it does mean that the vast majority of functions aren't significantly affected by this proposal.
b. Less preferred than a, due to the higher cost of retaining the AST, but I'd be fine with this conceptually.
c. While this is philosophically interesting, I'm not sure how it would be implemented, so I'd have to see someone else's implementation before I can truly judge it.
d. Definitely not, and if someone else wants it, it can be a competing proposal.

So: a and b are yes, c is dubious, d is not.
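A sketch of how option (a) might look from the outside; the dunder name __defaults_extra__ is an assumption here, not a settled API:

    def f(a, b=>len(a)): ...    # proposed syntax

    f.__defaults__          # (Ellipsis,) -- placeholder for the late-bound slot
    f.__defaults_extra__    # ('len(a)',) -- assumed dunder holding the source text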
A question for Chris: In your proposal, as I understand it, an expensive default expression would always be evaluated, even if it's not always needed. Eg, in this toy example:
    def foo(x:int=>expensive()):
        if delphic_oracle():
            return x
        else:
            return 0
expensive() is always evaluated. In that (presumably quite rare) case, we'd just use a sentinel instead, of course.
It will be evaluated even if it's not referred to in the body, but only if the argument is omitted. There is a guarantee that, once the function body begins executing, all arguments (whether given values or populated from defaults) have been assigned. So, yes, if you want conditional evaluation, you do need to use a sentinel.
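For reference, the sentinel version that recovers conditional evaluation (a minimal sketch reusing the toy names above):

    _MISSING = object()

    def foo(x=_MISSING):
        if delphic_oracle():
            if x is _MISSING:
                x = expensive()   # evaluated only when this branch needs it
            return x
        return 0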
I have two further comments, which are mostly addressed to Steve, I guess. First, I don't really understand Steve's intended difference between 2c and 2d. Second, as I understand them, both 2c and 2d encapsulate the expression in bytecode, so the nice property of introspectability of the expression is lost. I guess you can decompile it more easily than if it's just interpolated into the function body, but if it's implemented as a closure, don't we lose the identity of the identifiers in the expression? And if it's not (eg, the function-like thing encapsulates an abstract representation of the expression rather than bytecode that computes it), what's the point of 2c? I don't see how it has any advantage over 2a or 2b.
I'm not entirely sure either. There are a few concepts that could be described, but without getting some implementation going, it'll be hard to judge them.
Footnotes:
[1] Nobody's right, if everybody's wrong. -- Stephen Stills

You can never have too many Steves in a discussion!
We do have a good few. Not quite as many Chrises, although I'm more likely to see Chris replying to Chris replying to Chris on a nerdy mailing list than I am anywhere else! My current implementation does have one somewhat annoying flaw: the functionality of late-bound defaults is buried in the function's bytecode, but the description of it is a function dunder, and can be changed. I'd be open to suggestions that would make this a feature of the code object instead, thus preventing desynchronization. ChrisA
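The desynchronization in question, sketched with the assumed dunder from above (the behaviour lives in bytecode, so editing the description changes nothing):

    def f(a=>[]): ...               # proposed syntax

    f.__defaults_extra__ = ('{}',)  # the *description* now claims a dict...
    f()                             # ...but the bytecode still evaluates []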
My "vote" if one has to be chosen: #1: x=defer default #2: @x=default #3: x=@default #4: x=>default #5:. *x=default Explicit is better than implicit.
On Sat, Nov 6, 2021 at 10:46 AM David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
My "vote" if one has to be chosen:
Preferences are very important. This isn't a "vote" in the sense that the option with the most votes will be selected, but I always want to hear people's preferences.
#1: x=defer default
#2: @x=default
#3: x=@default
#4: x=>default
#5: *x=default
I don't like "defer" because it implies things that aren't true, and I really don't like *x=default since it would be very confusing with *x meaning "collect zero or more positional args", but the others are all at least somewhat viable.
Explicit is better than implicit.
That's what everyone says. Even people who are advocating precisely opposite viewpoints. :) ChrisA
On 05/11/2021 15:57, Stephen J. Turnbull wrote:
Still on the agenda as far as I can see:
1. Syntax. The proposals I can recall are
   a. x=>default
   b. *x=default
   c. x=@default
   d. maybe putting * or @ on the opposite component in b and c?
   e. a keyword before default, such as "late" or "defer".
   Others? I believe Chris currently favors a.

Please. I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
Rob Cliffe
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it. BTW, => would have a similar problem if it's adopted as a shorter way to spell lambda. And that would be worse, as putting a lambda in a default expression might be good style in some cases :-) -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Nov 10, 2021 at 6:02 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it.
BTW, => would have a similar problem if it's adopted as a shorter way to spell lambda. And that would be worse, as putting a lambda in a default expression might be good style in some cases :-)
In both cases, it would be confusing to a human, but not technically ambiguous. I'm not sure how important that will be - neither case seems particularly common, and if you do need to do it, you can always parenthesize a bit. Using the walrus operator in a default expression would be VERY weird (why on earth would you be assigning in the middle of default arg handling?!?), but if you really want it, sure! Using a hypothetical lambda function in an argument default wouldn't be unreasonable, but the number of times you'd also need that to be late-bound would be extremely few. Normally you'd get something like this:

    def merge_objects(stuff, key=lambda item: item.id):
        ...

where the default key function doesn't need to refer to any of the other parameters, so it can be early-bound. But maybe you want to be able to do something weird like:

    def merge_objects(stuff, idfield="id", match=>lambda item: item[idfield]):
        ...

in which case that might end up being spelled "match=>item => item[idfield]", but aside from that, it's unlikely to cause major problems. (For the most part, lambda functions will be used when *calling* that sort of function, and there's no ambiguity there, since you'll only ever use "=" for keyword arguments, or nothing at all for positional.) My view on this is: All variants of spelling that involve changes to the equals sign are one group of options, and it's my favoured group (whether it's "=>", ":=", "=:", etc). All variants that involve adornments elsewhere ("@var=dflt", "var=@dflt@", "var=`dflt`") are less appealing to me. The ":=" syntax is listed in the current PEP, but I'll adjust the wording of things a bit to make that group clearer. ChrisA
On Wed, Nov 10, 2021 at 6:02 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it.
BTW, there is one other small wrinkle with the := spelling, which is that it's very similar to annotation syntax:

    def spam(a:int=1): ...
    def ham(a:=1): ...

Again, not a fundamental problem to the parser, since an empty expression isn't a valid annotation, but could be confusing. I don't think we're going to get away from that confusion. There are just too many things we want to do with the equals sign, and only so many keys on most people's keyboards. ChrisA
Chris,

Will we be able to use late-bound arguments in a dataclass when it’s creating the __init__ function?

    @dataclass
    class C:
        x: int
        y: int
        ls: list[int] => [x, y]
On 10 Nov 2021, at 11:25 AM, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Nov 10, 2021 at 6:02 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, Nov 8, 2021 at 11:22 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
I have more than once advocated x:=default (and there is no clash with the walrus operator, even if others have said/implied that there is).
not a clash, but you could have a walrus in the default expression, which could be pretty visually confusing. On the other hand, maybe that's a really bad idea anyway. And otherwise I like it.
BTW, there is one other small wrinkle with the := spelling, which is that it's very similar to annotation syntax:
    def spam(a:int=1): ...
    def ham(a:=1): ...
Again, not a fundamental problem to the parser, since an empty expression isn't a valid annotation, but could be confusing.
I don't think we're going to get away from that confusion. There are just too many things we want to do with the equals sign, and only so many keys on most people's keyboards.
ChrisA
On Fri, Nov 26, 2021 at 6:22 PM Abdulla Al Kathiri <alkathiri.abdulla@gmail.com> wrote:
Chris,
Will we be able to use late-bound arguments in a dataclass when it’s creating the __init__ function?
    @dataclass
    class C:
        x: int
        y: int
        ls: list[int] => [x, y]
With that syntax, no, because there's no object that can be stored in an annotation dictionary that would represent the code construct to create that effect. But the __init__ function is constructed with exec(), and that means that, in theory, dataclasses._field_init could be enhanced to have some way to indicate this - or, possibly, to *always* use late-bound defaults, since it appears to use sentinels for everything. I don't know enough about the workings of dataclasses.dataclass to be able to say for sure, but a cursory glance does suggest that, in some way, this should be possible. It may require stringifying the default, though:

    ls: list[int] = "[x, y]"

ChrisA
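For context, today's dataclasses already provide call-time defaults via default_factory, but the factory is called with no arguments, so it cannot see the other fields the way the proposed `=> [x, y]` would:

    from dataclasses import dataclass, field

    @dataclass
    class C:
        x: int
        y: int
        # a fresh list per instance, but with no access to x or y
        ls: list[int] = field(default_factory=list)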
dataclasses use Field objects that can be created automatically, but also you can specify them if you need to do something special. And one of the special things you can do is set a default constructor -- I'm sure that could be extended to support late-bound defaults.

-CHB

On Thu, Nov 25, 2021 at 11:40 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Nov 26, 2021 at 6:22 PM Abdulla Al Kathiri <alkathiri.abdulla@gmail.com> wrote:
Chris,
Will we be able to use late-bound arguments in a dataclass when it’s creating
the __init__ function?
    @dataclass
    class C:
        x: int
        y: int
        ls: list[int] => [x, y]
With that syntax, no, because there's no object that can be stored in an annotation dictionary that would represent the code construct to create that effect.
But the __init__ function is constructed with exec(), and that means that, in theory, dataclasses._field_init could be enhanced to have some way to indicate this - or, possibly, to *always* use late-bound defaults, since it appears to use sentinels for everything.
I don't know enough about the workings of dataclasses.dataclass to be able to say for sure, but a cursory glance does suggest that, in some way, this should be possible. It may require stringifying the default though.
ls: list[int] = "[x, y]"
ChrisA
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Yeah, it makes sense that the default_factory argument in the Field object could be utilized to support late-bound defaults.
On 26 Nov 2021, at 10:42 PM, Christopher Barker <pythonchb@gmail.com> wrote:
dataclasses use Field objects that can be created automatically, but also you can specify them if you need to do something special. And one of the special things you can do is set a default constructor -- I'm sure that could be extended to support late-bound defaults.
-CHB
On Thu, Nov 25, 2021 at 11:40 PM Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Nov 26, 2021 at 6:22 PM Abdulla Al Kathiri <alkathiri.abdulla@gmail.com> wrote:
Chris,
Will we be able to use late-bound arguments in a dataclass when it’s creating the __init__ function?
    @dataclass
    class C:
        x: int
        y: int
        ls: list[int] => [x, y]
With that syntax, no, because there's no object that can be stored in an annotation dictionary that would represent the code construct to create that effect.
But the __init__ function is constructed with exec(), and that means that, in theory, dataclasses._field_init could be enhanced to have some way to indicate this - or, possibly, to *always* use late-bound defaults, since it appears to use sentinels for everything.
I don't know enough about the workings of dataclasses.dataclass to be able to say for sure, but a cursory glance does suggest that, in some way, this should be possible. It may require stringifying the default though.
ls: list[int] = "[x, y]"
ChrisA
-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython