Non-standard evaluation for Python

Hello all, I would like to briefly share my thoughts on non-standard evaluation (NSE), why it is useful, and how it could potentially be added to Python.

In most languages, functions have access only to the *values* of their arguments, not to the expressions that yielded those values. Sometimes, however, it is useful to have access to the expressions as well. For example, let plot(x, y, ...) be a function that draws a scatter plot of y against x and adds labels for the x- and y-axes. Currently, there is no way to distinguish plot(x, *z*, ...) from plot(x, *y*, ...) from within the function, so one has to pass the names of the variables to the plot function explicitly to be used as the axes' labels. With NSE, the function could look at the expressions in the call and use them as the default labels. In R, this idea is used widely to simplify syntax.

In the following, I sketch how I think this can be implemented:

1. Let BoundExpression be a class containing an ast.Expression as well as locals and globals dictionaries. BoundExpression can also have an eval method that evaluates its expression using its locals and globals dictionaries.

2. Let `x` be a short form for BoundExpression("x", locals(), globals()). Then plot can be implemented so that plot(`x`, `y`) draws a scatter plot of x and y *and also* labels the axes correctly. There is no need to provide labels explicitly anymore.

3. A more challenging idea is to let developers declare whether their functions need NSE. For example, when a function is defined as def f(`x`), then for any call like f(x) the first argument would be wrapped in a BoundExpression instance.

I would appreciate any feedback on these ideas.

Best,
Nima
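To make point 1 concrete, here is a minimal sketch of what BoundExpression might look like with today's tools, using an explicit constructor call in place of the proposed backtick short form; the class layout and the plot stand-in are only illustrative, not a worked-out design:

    import ast

    class BoundExpression:
        """An unevaluated expression plus the namespaces it was written in."""

        def __init__(self, source, globals_, locals_):
            self.source = source
            self.node = ast.parse(source, mode="eval")  # an ast.Expression
            self.globals = globals_
            self.locals = locals_

        def eval(self):
            # Evaluate the captured expression in the captured environment.
            code = compile(self.node, "<bound-expression>", "eval")
            return eval(code, self.globals, self.locals)

    def plot(x, y):
        # The source text doubles as the default axis labels (point 2).
        print("x label:", x.source, "| y label:", y.source)
        x_values, y_values = x.eval(), y.eval()
        # ... actual drawing would happen here ...

    x = [1, 2, 3]
    y = [2, 4, 6]
    plot(BoundExpression("x", globals(), locals()),
         BoundExpression("y", globals(), locals()))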

On Jul 13, 2019, at 12:16, Nima Hamidi <hamidi@stanford.edu> wrote:
Would it be easy to build something like this on top of a more general macro system, like MacroPy? If so, you could create a proof of concept using existing Python and get useful feedback. As it is, with just an explanation and a single example, it’s hard to really evaluate the idea.

Also, what happens if the expression contains, say, a := assignment? In that case, calling eval against locals() doesn’t have the same effect as normal evaluation. Is that a problem, or is that behavior that people would and should expect when they ask for a bound expression instead of a value?

In your example, it seems like you could evaluate it normally and pass in the value along with the AST, avoiding that problem, and making everything a lot simpler. I can imagine uses where that wouldn’t be sufficient, but they all seem like the kind of thing that demands a full macro system, so I’m not sure they’re relevant. Do you have examples where you need to eval on demand?

For that matter, it seems like your example could be handled by just defining `x` to mean, say, ('x', x), without needing a live AST at all. What are the examples where the string form of the expression isn’t sufficient but this proposal is?

Thank you very much for your reply! I wasn't familiar with MacroPy; it's a good idea to implement NSE using it, and I'll work on that. Sometimes it's necessary not to evaluate the expression. Two such applications of NSE in R are as follows:

1. Data tables have cleaner syntax. For example, letting dt be a data table with a column called price, one can retrieve items cheaper than $1 with dt[price < 1]. The pandas syntax requires something like dt[dt.price < 1]. This is currently inevitable, as the expression is evaluated *before* __getitem__ is invoked. With NSE, dt.__getitem__ could first add its columns to the locals() dictionary and then evaluate the expression in the new context.

2. Pipelining in R is also much cleaner. dplyr provides an operator %>% which passes the return value of its LHS as the first argument of its RHS. In other words, f() %>% g() is equivalent to g(f()). This is pretty useful for long pipelines. The way it works is that the operator %>% rewrites the AST and then evaluates the modified expression. In this example, evaluating g() eagerly is undesirable.
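A rough sketch of the first application, with a tiny made-up Table class standing in for a real data table; since today the condition would be evaluated before __getitem__ runs, this sketch accepts it as a string and evaluates it once per row with the column values bound as names:

    class Table:
        def __init__(self, **columns):
            self.columns = columns  # e.g. {"price": [0.5, 2.0, 0.8]}

        def __getitem__(self, condition):
            # With NSE the caller could write dt[price < 1]; here the
            # condition arrives as a string instead.
            names = list(self.columns)
            kept = []
            for row in zip(*self.columns.values()):
                env = dict(zip(names, row))
                if eval(condition, {}, env):
                    kept.append(env)
            return kept

    dt = Table(price=[0.5, 2.0, 0.8], name=["tea", "wine", "bread"])
    print(dt["price < 1"])
    # [{'price': 0.5, 'name': 'tea'}, {'price': 0.8, 'name': 'bread'}]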

It's an interesting idea, and has come up occasionally before. It probably stems from Lisp, where it's common practice (IIRC control flow like 'if' is defined this way). I believe it's called quoting there -- I've never heard the term NSE before. (Is that term specific to R?)

I'm glad you start out by proposing dedicated syntax -- automatically inferring that a particular function takes (some or all) auto-quoted arguments would be hard, because it's the parser that would need to know, but the parser doesn't have access to imported function definitions. A dedicated syntax doesn't have this problem (though it might have other problems). A few problems include:

- The `x` syntax is pretty ugly -- in Python 2 this was a shorthand for repr(x), and we removed it in Python 3 because backticks are hard to read (a backtick looks like about two pixels on my screen). Also, these days `x` is pretty commonly used as an indicator for "code" (e.g. in Markdown). (OTOH I don't have a better suggestion.)

- The ast module is exempt from the usual backwards compatibility guarantees across minor versions -- if we need to change the AST structure to accommodate a new type of expression or a new feature of the compiler, we will do this without worrying too much about backwards compatibility. (As an example, in Python 3.8 the AST uses Constant for all literal node types (numbers and strings), whereas in previous versions there were separate Num and Str node types.)

- If you want to evaluate a quoted expression, there are a few different choices depending on the use case: you might want to evaluate it in exactly the environment where it was first written (i.e. the caller's environment, including nonlocals), or you might want to evaluate it in a context provided by the function that's evaluating it. I *think* you are accounting for this, but I wanted to make sure it was on your radar.

- We need a function that, given an expression, returns a reasonable string. We have such a function, thanks to PEP 563, although it's not exposed (it's written in C and used only internally to construct the strings that go into the __annotations__ dict). It also doesn't preserve whitespace exactly. In order to evaluate the expression we don't need the whitespace, but if you want to get the string and use it as an axis label it could become somewhat ugly.

With all that in mind, and without any promises, I do think it would be cool if you could code up a prototype and see how it feels. Good luck!
--Guido van Rossum (python.org/~guido)
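To spell out the third point above with nothing but eval, here is what the two evaluation choices look like for the same captured source text; all the names here are invented for illustration:

    expression = "year > start"
    start = 1990

    # Choice 1: evaluate in the environment where the expression was written.
    caller_env = {"year": 2005, "start": start}
    print(eval(expression, {}, caller_env))        # True

    # Choice 2: evaluate in a context provided by the consuming function,
    # e.g. a column of values layered over the caller's names.
    column = [1985, 1995, 2005]
    for value in column:
        callee_env = {"year": value, "start": start}
        print(eval(expression, {}, callee_env))    # False, True, True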

Delaying evaluation in some way looks like a useful feature, one which has so far been integrated into Python only by introducing special syntax for each problem as it appeared. Off the top of my head:

- The conditional expression "trueresult if conditional else falseresult" exists because, lacking delayed evaluation, one can't write something like if(conditional, trueresult, falseresult) as in Lisp.
- The f"{x=}" feature exists because one can't write something like debug(x) without losing the name of the x variable.
- A lot of uses of lambda are as an inline parameter to a function, which could be replaced by the function taking a delayed expression as a parameter.
- lambda itself could be thought of as a function and wouldn't need special syntax if delayed evaluation existed.

With delayed execution, some existing code might also read better: namedtuple('Point', ['x', 'y']) could be rewritten as namedtuple(Point, x, y), and partial(f, 1, 2) is arguably harder to understand than partial(f(1, 2)).

I have already played with ast, and as Guido says there is no backward compatibility guarantee. From memory, the code I wrote worked for only one minor version of Python. Some representation of the code that is more independent of the needs of the CPython execution model would be nice and stable. Something isomorphic to Hy syntax, where 3 + 2 corresponds to (+ 3 2), would be very nice to work with and wouldn't require knowing the particular types of 3, 2, and + in CPython internals.

MacroPy's way of making such magic appear is to first import an import hook and then import a module, which is the place where you can use delayed execution. As such it feels like only a thing you can play with, not a thing you can ship as a library usable without special instructions.

The following would be very nice if it plotted the function x, y -> 3*x + 2*y on the specified range:

    from plotting import plot3d
    plot3d(3*x+2*y, x=[0:2], y=[0,2])

I might be wrong, but the parser doesn't necessarily need to know which functions need delayed evaluation. The AST of the arguments needs to be accessible at the function call, but that is just a matter of saving it until the function executes. Whether an argument must be evaluated only needs to be known at the point where the argument is supposed to be evaluated, i.e. at execution time. So at execution time, the source code plot(3*x) could be executed as something similar to:

    if plot.is_delayed_evaluation:
        plot(("3*x", {"x": x}))
    else:
        plot(3*x)

A big difference between Lisp macros and this proposition (as well as the initial proposition, as far as I can tell) is that Lisp macros generate source code at compile time, while this proposition manipulates code at execution time. (If I am correct, for equivalent code the source-code manipulation can be done only once by memoizing the delayed-evaluation code.)

Xavier Combelle
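Here is that last fragment turned into something runnable, with a decorator standing in for whatever mechanism would mark a function as wanting delayed evaluation; the attribute name and the (source, namespace) convention are just placeholders:

    def delayed_evaluation(func):
        # Mark the function as wanting (source, namespace) pairs, not values.
        func.is_delayed_evaluation = True
        return func

    @delayed_evaluation
    def plot(expr):
        source, namespace = expr
        print("label:", source, "| value:", eval(source, {}, namespace))

    def call_site(x):
        # What the rewritten call site for plot(3*x) would amount to:
        if getattr(plot, "is_delayed_evaluation", False):
            plot(("3*x", {"x": x}))
        else:
            plot(3 * x)

    call_site(5)   # label: 3*x | value: 15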

Thank you very much for your feedback! I'm convinced that the backtick is a bad choice for this. What about something like q"x"? It resembles other Python syntax like b"x" or f"x".

This use case is nicely covered by a short form for keyword arguments. So in this case, instead of plot(x, z), we'd have plot(=x, =z), which would be transformed into plot(**{'x': x, 'z': z}) at compile time. This could work for any expression: foo(=lambda x: x*2) -> foo(**{'lambda x: x*2': lambda x: x*2}). This feature is easy to implement and has broad applications.
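The expanded form already works today, so the behavior (though not the brevity) can be tried right now; plot below is just a stand-in that uses the keyword names as labels:

    def plot(**kwargs):
        # Each keyword name becomes the label of the corresponding series.
        for label, values in kwargs.items():
            print("series", repr(label), "->", values)

    x = [1, 2, 3]
    z = [9, 8, 7]

    # What plot(=x, =z) would be transformed into:
    plot(**{'x': x, 'z': z})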

On Jul 14, 2019, at 00:33, Anders Hovmöller <boxed@killingar.net> wrote:
This use case is nicely covered by short form for keyword arguments. So in this case instead of plot(x, z), we'd have plot(=x, =z) which would be transformed into plot(**{'x': x, 'z': z}) at compile time.
This does avoid the problem of needing new syntax, but it seems to have other problems. First, the function only gets the names, not live expressions that it can eval late, or in a modified environment, etc., so it only handles this case, not cases like df[year>1990]. Second, the names show up as keyword argument names, not as values. Besides being a bit confusing to mix up names and values this way, it means everything has to be defined as def plot(**kwargs) and document its actual arguments some other way, and can’t annotate the parameter types; and if there are optional arguments like the ones pyplot takes for point formatting, scale, etc., there’s no protection against them colliding with the names of the two fake-positional args.
This could work for any expression:
foo(=lambda x: x*2) -> foo(**{'lambda x: x*2': lambda x: x*2})
Is this actually legal? The docs just say the contents of the ** mapping are treated as additional keyword arguments. CPython happens to check that they are strings but not check that those strings are valid keywords, but I don’t think the language definition actually says this is the intended behavior, it’s just an accident of the CPython implementation. So this might require at least defining that implementation behavior as the only correct one, and changing the docs to explain it. Also, is this translation literally the way it’s implemented? Does that mean bare keyword args can only appear in a call expression in the same place as **, even though they don’t look similar?
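For what it's worth, the behavior in question is easy to check; at least in CPython, ** unpacking accepts string keys that aren't valid identifiers as long as the callee collects them with **kwargs, while non-string keys are rejected:

    def f(**kwargs):
        return kwargs

    print(f(**{"lambda x: x*2": "odd but accepted"}))
    # {'lambda x: x*2': 'odd but accepted'}

    try:
        f(**{1: "rejected"})
    except TypeError as exc:
        print(exc)   # in CPython: keywords must be strings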
This feature is easy to implement and has broad applications.
How is this implemented? Doesn’t the compiler have the same problem generating a string for the keyword out of an AST that the user code would have in the OP’s proposal? Also, what are the other applications besides the plot example and equivalent things? I would guess the main uses are actually really simple cases. For example, in a lot of short, simple functions, you often create variables named x and y for the things you’re going to pass as the x and y arguments to plot, and sometimes even xscale; similarly for the url argument to requests.get; and so on. All of these get a few characters shorter to type and read (and maybe a bit harder to accidentally screw up) if you can write “=x” instead of “x=x”. But are there other benefits, for less trivial cases?

Well it does introduce new syntax, just one that is generally useful.
First, the function only gets the names, not live expressions that it can eval late, or in a modified environment, etc., so it only handles this case, not cases like df[year>1990].
I don't understand how it's not the full expression. It's the expression as a string but it's still the full code.
Second, the names show up as keyword argument names, not as values. Besides being a bit confusing to mix up names and values this way, it means everything has to be defined as def plot(**kwargs) and document its actual arguments some other way, and can’t annotate the parameter types, and if there are optional like the ones pyplot takes for point formatting, scale, etc., there’s no protection against them colliding with the names of the two fake-positional args.
I think names being put with names isn't confusing ;) But yes, a system like this would probably need other tweaks for API design but that's to be expected.
You are correct. I have checked that this is the behavior of CPython, PyPy, MicroPython, IronPython, and Jython, so it shouldn't be a big burden.
Also, is this translation literally the way it’s implemented? Does that mean bare keyword args can only appear in a call expression in the same place as **, even though they don’t look similar?
It can be implemented; it isn't currently. It's a suggestion.
This feature is easy to implement and has broad applications.
How is this implemented? Doesn’t the compiler have the same problem generating a string for the keyword out of an AST that the user code would have in the OP’s proposal?
Sure. The lack of round-tripping in the standard library AST is a problem, but in this case a simple AST dump is most likely fine, even if it loses the user's specific formatting.
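To make that concrete with the tools that exist now (ast.unparse is only available from Python 3.9 onward, so this is an approximation of such a dump), the round trip already loses the user's spacing:

    import ast

    source = "eggs  +  cheese"          # note the extra spaces
    tree = ast.parse(source, mode="eval")
    print(ast.dump(tree.body))          # BinOp(left=Name(...), op=Add(), right=Name(...))
    print(ast.unparse(tree))            # eggs + cheese  -- normalized whitespace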
Also, what are the other applications besides the plot example and equivalent things? I would guess the main uses are actually really simple cases. For example, in a lot of short, simple functions, you often create variables named x and y for the things you’re going to pass as the x and y arguments to plot, and sometimes even xscale; similarly for the url argument to requests.get; and so on. All of these get a few characters shorter to type and read (and maybe a bit harder to accidentally screw up) if you can write “=x” instead of “x=x”. But are there other benefits, for less trivial cases?
You are correct that this is a big benefit in many small doses. There are some nice use cases, like being able to create a really good debug printer. It's also very useful for us web developers, because we often need to put a bunch of variables into a template context dict. / Anders
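Both of those use cases are easy to sketch with today's syntax; the short form would only remove the name=name repetition (debug and render below are made-up stand-ins, not any real library):

    def debug(**kwargs):
        for name, value in kwargs.items():
            print(f"{name} = {value!r}")

    def render(template, **context):
        print("rendering", template, "with context", context)

    user = "alice"
    request_id = 42

    # Today every name is written twice; under the proposal these would be
    # debug(=user, =request_id) and render("page.html", =user, =request_id).
    debug(user=user, request_id=request_id)
    render("page.html", user=user, request_id=request_id)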

On Jul 14, 2019, at 11:53, Anders Hovmöller <boxed@killingar.net> wrote:
Right, sorry, I meant that it doesn’t need any new _symbols_, just a new use of an existing symbol (and in a way that’s unambiguous, both to parsers and humans). That’s obviously a big plus compared to using backticks.
First, the function only gets the names, not live expressions that it can eval late, or in a modified environment, etc., so it only handles this case, not cases like df[year>1990].
I don't understand how it's not the full expression. It's the expression as a string but it's still the full code.
But this doesn’t in any way help with late eval. The caller has already evaluated the expression to pass the value of the keyword argument. The fact that you could ignore that value and re-do the eval doesn’t help if the eval would raise a NameError like the year>1990 case (or if it would do something expensive or dangerous). And it doesn’t help you eval against a modified version of the caller environment, because you don’t have the caller environment. That’s what I mean by it not being a “live expression”.

You don’t need that for the df[year>1990] case, but you do for some of the other examples. Consider just the minor change of replacing the constant 1990 with a local start: df[year>start]. The OP didn’t explain how this would actually work, but it seems pretty simple—e.g., just eval against a chainmap of __dict__ (to get the year column) over the bound locals (to get the start value). There’s no way to do that with your proposal. (Well, no way short of frame hacks that you can already do without your proposal.)

Finally, while this is a less serious problem than the last two, the same string isn’t guaranteed to compile to the same AST in two different contexts. Consider how your proposal would play with libraries that insert token or AST processors.
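A toy version of that chainmap idea, assuming the quoted argument delivers the source text plus the caller's locals; DataFrame here is a stand-in (not pandas) that evaluates the condition once per row rather than vectorized:

    from collections import ChainMap

    class DataFrame:
        def __init__(self, **columns):
            self.columns = columns

        def select(self, source, caller_locals):
            # Names resolve to the row's column values first, then to the
            # caller's locals -- the "chainmap over the bound locals" idea.
            names = list(self.columns)
            kept = []
            for row in zip(*self.columns.values()):
                env = ChainMap(dict(zip(names, row)), caller_locals)
                if eval(source, {}, env):
                    kept.append(row)
            return kept

    def report(df):
        start = 1990
        # With a live quoted expression this could be spelled df[year > start];
        # here the source and the caller's locals are passed explicitly.
        return df.select("year > start", locals())

    df = DataFrame(year=[1985, 1995, 2005], value=[1, 2, 3])
    print(report(df))   # [(1995, 2), (2005, 3)]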
Second, the names show up as keyword argument names, not as values. Besides being a bit confusing to mix up names and values this way, it means everything has to be defined as def plot(**kwargs) and document its actual arguments some other way, and can’t annotate the parameter types, and if there are optional like the ones pyplot takes for point formatting, scale, etc., there’s no protection against them colliding with the names of the two fake-positional args.
I think names being put with names isn't confusing ;) But yes, a system like this would probably need other tweaks for API design but that's to be expected.
But they’re not being used as variable names, they’re being used as string values. That is the same confusion novices always run into that makes them ask “How do I create a local variable, for each file in a list, named after the filename?” The answer is that you don’t want to do that; you don’t want data in your variable names. But your proposal does exactly that. In the case of get(=url), that’s perfect, because you’re just saying the caller and callee both have the same value in a variable named “url”, which is exactly what you want to say. But in the case of plot(=lambda x:x*2, =lambda x:x**2), you’re saying that the callee has a variable named lambda x:x*2, which is not true. What the callee actually has is two unnamed positional arguments, each of which comes with an extra string argument buried as its name. (The fact that they’re positional, but passed as if they were keyword, accepted as if they were keyword, and then pulled out by iterating **kw in order is also confusing.)
This might actually be a good change even on its own. The fact that the docs aren’t clear on what can go in a **kw is probably not a strength of the language definition but a flaw. If every implementation does the exact same thing (especially if it’s been that way unchanged in every implementation from 2.3 to 3.8), why not document that as the rule?
It’s not just about a lack of round tripping in the standard library AST, it’s about a lack of round tripping in the implementation of the CPython, PyPy, etc. compilers. If the compilers don’t have access to that information internally, they can’t put it in the output. And I think it would be pretty confusing if spam(=eggs+cheese) gave you an argument named “eggs + cheese” instead of “eggs+cheese”. And it would be annoying if you were using it to define labels on a displayed graph—you keep trying to tweak the code to change how the label is written and it doesn’t do what you tell it to do.

Ah. Now I'm with you. Yes this is different but it also seems strange that we wouldn't just use a lambda or a function for that. That's exactly why they exist.
Finally, while this is a less serious problem than the last two, the same string isn’t guaranteed to compile to the same AST in two different contexts. Consider how your proposal would play with libraries that insert token or AST processors.
Hmm, well I've never seen one of those outside toy programs but I'll concede the point.
But in the case of plot(=lambda x:x*2, =lambda x:x**2), you’re saying that the callee has a variable named lambda x:x*2, which is not true. What the callee actually has is two unnamed positional arguments, each of which comes with an extra string argument buried as its name.
I disagree. The callee has a thing called that, it just isn't a variable but a local expression. And in any case, kwargs have never meant that the caller has a variable by the name of the callee argument, so why would it here?
(The fact that they’re positional, but passed as if they were keyword, accepted as if they were keyword, and then pulled out by iterating **kw in order is also confusing.)
What is positional? Everything here is keyword. We must be talking past each other. My proposal is just a different short syntax for keyword arguments at the call site.
Agreed! If you don't follow this you are de facto a broken Python implementation, even if the spec allows it.
Agreed. But there must be a way to explicitly specify the label anyway so I don't think this is a deal breaker. / Anders

On Jul 15, 2019, at 00:27, Anders Hovmöller <boxed@killingar.net> wrote:
Yes, but they’re verbose, so they hardly solve the problem here. Think about it this way: The OP is unhappy that he had to write df[(df.price > 500) | (df.tip >5)] instead of just df[(price > 500) | (tip >5)]. Would you really suggest that the answer is to write df[lambda d: d.price > 500 or d.tip > 5]? There have been lots of proposals for general late binding, like Nick Coghlan’s concise lambda idea, that might solve this, but none of them have ever gotten near ready for prime time status. The OP’s proposal only covers a much more limited set of uses, but it does seem to cover them well. Your proposal doesn’t even attempt to solve it, so it’s not really a substitute for the OP’s idea.
But in the case of plot(=lambda x:x*2, =lambda x:x**2), you’re saying that the callee has a variable named lambda x:x*2, which is not true. What the callee actually has is two unnamed positional arguments, each of which comes with an extra string argument buried as its name.
I disagree. The callee has a thing called that, it just isn't a variable but a local expression.
An expression isn’t a name that you call something, it’s instructions to compute something. Those are very different things. What happens if you try to assign a value to that expression, nonlocal it from a nested function, etc.? The question doesn’t even make sense, because those things only make sense for names, and expressions aren’t names.
And in any kwargs have never meant that the caller has a variable by the name of the callee argument so why would it here?
Sure, today, kwargs means that the callee has a variable with that name, not the caller. But isn’t the whole point of your proposal that often the callee _does_ have a name that exactly matches the caller name, so you shouldn’t have to repeat it?
(The fact that they’re positional, but passed as if they were keyword, accepted as if they were keyword, and then pulled out by iterating **kw in order is also confusing.)
What is positional?
The name-value pairs are positional. If the keyword names aren’t being used to match the values to parameter names, or to distinguish the values in any other way, but instead to smuggle in arbitrary extra string values, how else can you handle them but by position? Consider a concrete example, your plot(=lambda x:x*2, =lambda x:x**2). How could plot process that? It has to pull them out of kwargs in order, acting as if they were positional arguments even though they aren’t passed that way, maybe something like this:

    def plot(**kwargs):
        for name, value in kwargs.items():
            draw(value, label=name)
This might actually be a good change even on its own. The fact that the docs aren’t clear on what can go in a **kw is probably not a strength of the language definition but a flaw. If every implementation does the exact same thing (especially if it’s been that way unchanged in every implementation from 2.3 to 3.8), why not document that as the rule?
Agreed! If you don't follow this you are de facto a broken python implementation even if the spec allows it.
You ought to propose this idea separately. Buried in the middle of this thread I don’t think it’s going to get noticed, but on b.p.o., or maybe its own thread, you’ve got a good shot of convincing people.
Sure, you can write something like, say, spam(eggs+cheese, label='eggs+cheese'), just as you’d do today. But then your proposal isn’t useful for these cases. If stringifying arbitrary expressions doesn’t get you the actual string that you’d want for things like labels, and can’t give you late eval or any other kind of modified eval, what does it get you? I think in your main use case, where the callee and caller often have variables with the same names and you want to be able to write plot(=x, =xscale, =fmt) or get(=url, =data, =auth) without repeating everything twice, it’s a nice idea. But I don’t think it really helps most of the OP’s uses.

It seems a bit excessive to be upset over that added verbosity I think. But I think I'd suggest df[lambda price, tip, **_: price > 500 or tip > 5] but that's basically the same. It does scale nicer with larger expressions though.
There have been lots of proposals for general late binding, like Nick Coghlan’s concise lambda idea, that might solve this, but none of them have ever gotten near ready for prime time status. The OP’s proposal only covers a much more limited set of uses, but it does seem to cover them well. Your proposal doesn’t even attempt to solve it, so it’s not really a substitute for the OP’s idea.
Largely because I misunderstood what he wrote :)
Yea ok. I don't see the relevance to any of that. We're talking about a thing that produces a dict. The keys are strings. Pretty standard stuff.
That's one point. But the other is that we can get the string representation of the expression. It just happens that those two have an intersection when the callee expression is a simple variable name.
It's a syntax error. Just like normal keyword arguments:
Good idea!
Much simpler code in the 80% of common cases? The proposal didn't aim much higher I think.
I think in your main use case, where the callee and caller often have variables with the same names and you want to be able to write plot(=x, =xscale, =fmt) or get(=url, =data, =auth) without repeating everything twice, it’s a nice idea. But I don’t think it really helps most of the OP’s uses.
Maybe not. / Anders

On Jul 15, 2019, at 05:25, Anders Hovmöller <boxed@killingar.net> wrote:
Why would two similar but different lambdas (doubling vs. squaring) with similar but different string forms (one * vs two) give a syntax error? I didn’t come up with the plot example, so I don’t know the reason it took two functions. But let’s say they represent functions for the y and z values respectively for each x, or the real and imaginary parts of a function on complex x, or the theta and phi values for each r, or two independent functions f and g that we want to plot the area between, or whatever. So, how does plot know which argument is y and which is z? It can’t be by the names, because the names are stringified lambda expressions. The only thing it can do is treat the first name-value pair in kwargs as y, and the second as z. In other words, it has to treat them as if they were positional arguments.

Well I assumed that was a typo, since if it wasn't the question didn't make sense. If it wasn't a typo, the question would be "how can the code differentiate between two different strings?", which is a bit silly.
I didn’t come up with the plot example, so I don’t know the reason it took two functions. But let’s say they represent functions for the y and z values respectively for each x, or the real and imaginary parts of a function on complex x, or the theta and phi values for each r, or two independent functions f and g that we want to plot the area between, or whatever.
So, how does plot know which argument is y and which is z? It can’t be by the names, because the names are stringified lambda expressions.
The stringification produces two different strings. One has one * while the other has two. / Anders

On Jul 17, 2019, at 10:45, Anders Hovmöller <boxed@killingar.net> wrote:
Right, of course it can tell they’re different strings, but how can it tell which string is the name of the y function and which is the name of the z function? There’s nothing about having one * vs. two that tells you which one is y and which is z. And of course there’s nothing about the values, either. If there’s no way to tell based on the names, and no way to tell based on the values, what way can there be to tell, except for the position of the name-value pairs?

On Jul 17, 2019, at 12:29, Anders Hovmöller <boxed@killingar.net> wrote:
OK, now we can get back to the original point: code that looks like it’s using keyword arguments, but acts like it’s using positional arguments, is confusing to the reader, because it’s defeating the expected purpose of keyword arguments. Your proposal is good for your original use case, but it’s not good for the OP’s use case if it can only solve that case by confusing readers.

In case that isn’t clear: As the glossary puts it, the normal use of arguments is that “arguments are assigned to the named local variables in a function body”, and the normal use of keyword arguments is that you can name which of those variables the argument is for. When you write complex(imag=5, real=3), you know that the imag variable inside the constructor function gets 5 (and can even guess that the imag attribute of the constructed object will be 5). With your proposal, if you happened to have local variables lying around for real and imag, complex(=imag, =real) would be just as clear: the callee’s imag local gets the value of the caller’s imag local.

But in the plot example, that isn’t what’s happening. The callee y variable gets the first argument, and z gets the second. That’s not what keywords mean, or how they work, so it’s confusing. Of course you’re right that Python 3.5+ makes it possible to write that code. Python lets you define different sets of names in __dir__ and __getattr__, or name a function sin when it actually calculates the arctangent, and it lets you pretend to take keyword arguments but treat them positionally; that doesn’t mean that doing so isn’t confusing, or that it’s a good idea.

I'm not convinced it's more confusing than doing something like plot(x=y, y=x, z=w) where the leftmost of those single characters are the plot-space names, and the rightmost are the problem-space names. Just having them in the order that makes sense and always using the problem-space names seems nice to me.
Sure. But that's not what happens with dict() to take the most basic example.
But in the plot example, that isn’t what’s happening. The callee y variable gets the first argument, and z gets the second. That’s not what keywords mean, or how they work, so it’s confusing.
I don't have a problem with this. I disagree that this is "not what keywords mean, or how they work". It's not how they work in the most common case, I agree, but there are other cases. / Anders

PS. Sorry for the super late reply, this got buried a bit in my inbox

On Jul 13, 2019, at 12:16, Nima Hamidi <hamidi@stanford.edu> wrote:
Would it be easy to build something like this on top of a more general macro system, like MacroPy? If so, you could create a proof of concept using existing Python and get useful feedback. As it is, with just an explanation and a single example, it’s hard to really evaluate the idea. Also, what happens if the expression contains, say, a := assignment? In that case, calling eval against locals() doesn’t have the same effect as normal evaluation. Is that a problem, or is that behavior that people would and should expect when they ask for a bound expression instead of a value? In your example, it seems like you could evaluate it normally and pass in the value along with the AST, avoiding that problem, and making everything a lot simpler. I can imagine uses where that wouldn’t be sufficient, but they all seem like the kind of thing that demands a full macro system, so I’m not sure they’re relevant. Do you have examples where you need to eval on demand? For that matter, it seems like your example could be handled by just defining `x` to mean, say, ('x', x), without needing a live AST at all. What are the examples where the string form of the expression isn’t sufficient but this proposal is?

Thank you very much for your reply! I wasn't familiar with MacroPy. It's a good idea to implement NSE using it. I'll work on it. Sometimes it's necessary not to evaluate the expression. Two such applications of NSE in R are as follows: 1. Data-tables have cleaner syntax. For example, letting dt be a data-table with a column called price, one can retrieve items cheaper than $1 using the following: dt [price < 1]. Pandas syntax requires something like dt[dt.price < 1]. This is currently inevitable as the expression is evaluated *before* __getitem__ is invoked. Using NSE, dt.__getitem__ can, first, add its columns to locals() dictionary and then evaluate the expression in the new context. 2. Pipe-lining in R is also much cleaner. Dplyr provided an operator %>% which passes the return value of its LHS as the first argument of its RHS. In other words, f() %>% g() is equivalent to g(f()). This is pretty useful for long pipelines. The way that it works is that the operator %>% changes AST and then evaluates the modified expression. In this example, evaluating g() is undesirable. From: Andrew Barnert <abarnert@yahoo.com> Date: Saturday, July 13, 2019 at 4:26 PM To: Nima Hamidi <hamidi@stanford.edu> Cc: "python-ideas@python.org" <python-ideas@python.org> Subject: Re: [Python-ideas] Non-standard evaluation for Python On Jul 13, 2019, at 12:16, Nima Hamidi <hamidi@stanford.edu<mailto:hamidi@stanford.edu>> wrote: In the following, I sketch how I think this can be implemented: 1. Let BoundExpression be a class containing an ast.Expression as well as locals and globals dictionaries. BoundExpression can also have an eval method that evaluates its expression using its locals and globals dictionaries. 2. Let `x` be the short form for BoundExpression("x", locals(), globals()). Then, the plot function can be implemented in a way that plot(`x`, `y`) draws scatter plot of x and y and also, labels the axes correctly. No need to provide labels explicitly anymore. 3. A more challenging idea is to let developers decide whether their functions need NSE or not. For example, when a function is defined as def f(`x`), for any method call like f(x), the first argument should be wrapped with a BoundExpression instance. Would it be easy to build something like this on top of a more general macro system, like MacroPy? If so, you could create a proof of concept using existing Python and get useful feedback. As it is, with just an explanation and a single example, it’s hard to really evaluate the idea. Also, what happens if the expression contains, say, a := assignment? In that case, calling eval against locals() doesn’t have the same effect as normal evaluation. Is that a problem, or is that behavior that people would and should expect when they ask for a bound expression instead of a value? In your example, it seems like you could evaluate it normally and pass in the value along with the AST, avoiding that problem, and making everything a lot simpler. I can imagine uses where that wouldn’t be sufficient, but they all seem like the kind of thing that demands a full macro system, so I’m not sure they’re relevant. Do you have examples where you need to eval on demand? For that matter, it seems like your example could be handled by just defining `x` to mean, say, ('x', x), without needing a live AST at all. What are the examples where the string form of the expression isn’t sufficient but this proposal is?

It's an interesting idea, and has come up occasionally before. It probably stems from Lisp, where it's common practice (IIRC control flow like 'if' is defined this way). I believe it's called quoting there -- I've never heard the term NSE before. (Is that term specific to R?) I'm glad you start out by proposing dedicated syntax -- automatically inferring that a particular function takes (some or all) auto-quoted arguments would be hard because it's the parser that would need to know, but the parser doesn't have access to imported function definitions. A dedicated syntax doesn't have this problem (though it might have other problems). A few problems include: - The `x` syntax is pretty ugly -- in Python 2 this was a shorthand for repr(x), and we removed it in Python 3 because backticks are hard to read (it looks like about two pixels on my screen). Also, these days `x` is pretty commonly used as indicator for "code" (e.g. markdown). (OTOH I don't have a better suggestion.) - The ast module is exempt from the usual backwards compatibility guarantees across minor versions -- if we need to change the AST structure to accommodate a new type of expression or a new feature of the compiler, we will do his without worrying too much about backwards compatibility. (As an example, in Python 3.8 the AST uses Constant for all literal node types (numbers and strings), whereas in previous versions there are separate Num and Str node types.) - If you want to evaluate a quoted expression, there are a few different choices depending on the use case: you might want to evaluate it in exactly the environment where it was first written (i.e. the caller's environment, including nonlocals), or you might want to evaluate it in a context provided by the function that's evaluating it. I *think* you are accounting for this, but I wanted to make sure it was on your radar. - We need a function that, given an expression, returns a reasonable string. We have such a function, thanks to PEP 563, although it's not exposed (it's written in C and used only internally to construct the strings that go into the __annotations__ dict). It also doesn't preserve whitespace exactly. Now, in order to evaluate the expression we don't need the whitespace, but if you want to get the string and use that as an axis label it could become somewhat ugly. With all that in mind, and without any promises, I do think it would be cool if you could code up a prototype and see how it feels. Good luck! --Guido On Sat, Jul 13, 2019 at 12:36 PM Nima Hamidi <hamidi@stanford.edu> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him/his **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

Delaying evaluation in some way looks like a useful feature which was up to this point integrated to python only by introducing special syntax for each problem which appeared. Out of my mind: the "conditional if trueresult else falseresult" appeared because one can't write something like if(conditional,trueresult,falseresult) like in lisp because of lack of delayed execution the f"{x=}" somehow appears because one can't write something like debug(x) without losing the name of the x variable. A lot of uses of lambda is as an inline parameter of a function, which might be replaced by the function taking a delayed expression as a parameter. lambda itself, might be think as a function and don't need special syntax if delayed evaluation would exists. If delayed execution some existing code might be better namedtuple('Point',['x','y']) could be rewritten as namedtuple(Point,x,y) partial(f,1,2) might be harder to understand than partial(f(1,2)) I already played with ast and as Guido says there is no backward compatibility guarantee. From my memory, the code I wrote worked only for only one minor version of python. Maybe some representation of the code which is more independent of the needs of CPython execution model would be nice and stable would be nice. Something isomorph to hy syntax, that is 3 + 2 corespond to (+ 3 2) would be very nice to work with and don't need to know the particular type of 3, 2, + in cpython internals Macropy way to make such magic appear is to do a first import an import hook and then import a module which is the place where you could use delayed execution. As such I feel like only a thing you can play with, not a thing you can ship as a library usable without special instruction. the following would be very nice if it would plot the function x,y->3*x+2*y on the range specified from plotting import plot3d plot3d(3*x+2*y,x=[0:2],y=[0,2]) I might be wrong, but the parser don't necessarily need to know which function need delayed evaluation, The ast of the arguments need to be accessible at function call, but it is just a problem of save it until the execution of the function. The fact that an argument must be evaluated just be known at the point where the argument is supposed to be evaluated, so at execution time. So at execution time, the source code plot(3*x) could be executed in something similar to: if plot.is_delayed_evaluation: plot(("3*x",{"x":x})) else: plot(3*x) A big difference between lisp macro and this proposition (as well as initial proposition as far as can say) is that macro in lisp generate source code at compilation while this proposition manipulate code at execution time (If I am correct, for equivalent code, the source code manipulation can be made only once by memoizing the delayed evaluation code) Xavier Combelle Le dim. 14 juil. 2019 à 02:53, Guido van Rossum <guido@python.org> a écrit :

Thank you very much for your feedback! I'm convinced that backtick is a bad choice for doing this. What about something like q"x"? It resembles other python syntaxes like b"x" or f"x".

This use case is nicely covered by short form for keyword arguments. So in this case instead of plot(x, z), we'd have plot(=x, =z) which would be transformed into plot(**{'x': x, 'z': z}) at compile time. This could work for any expression: foo(=lamda x: x*2) -> foo(**{'lamda x: x*2': lamda x: x*2}) This feature is easy to implement and has broad applications.

On Jul 14, 2019, at 00:33, Anders Hovmöller <boxed@killingar.net> wrote:
This use case is nicely covered by short form for keyword arguments. So in this case instead of plot(x, z), we'd have plot(=x, =z) which would be transformed into plot(**{'x': x, 'z': z}) at compile time.
This does avoid the problem of needing new syntax, but it seems to have other problems. First, the function only gets the names, not live expressions that it can eval late, or in a modified environment, etc., so it only handles this case, not cases like df[year>1990]. Second, the names show up as keyword argument names, not as values. Besides being a bit confusing to mix up names and values this way, it means everything has to be defined as def plot(**kwargs) and document its actual arguments some other way, and can’t annotate the parameter types, and if there are optional like the ones pyplot takes for point formatting, scale, etc., there’s no protection against them colliding with the names of the two fake-positional args.
This could work for any expression:
foo(=lamda x: x*2) -> foo(**{'lamda x: x*2': lamda x: x*2})
Is this actually legal? The docs just say the contents of the ** mapping are treated as additional keyword arguments. CPython happens to check that they are strings but not check that those strings are valid keywords, but I don’t think the language definition actually says this is the intended behavior, it’s just an accident of the CPython implementation. So this might require at least defining that implementation behavior as the only correct one, and changing the docs to explain it. Also, is this translation literally the way it’s implemented? Does that mean bare keyword args can only appear in a call expression in the same place as **, even though they don’t look similar?
This feature is easy to implement and has broad applications.
How is this implemented? Doesn’t the compiler have the same problem generating a string for the keyword out of an AST that the user code would have in the OP’s proposal? Also, what are the other applications besides the plot example and equivalent things? I would guess the main uses are actually really simple cases. For example, in a lot of short, simple functions, you often create variables named x and y for the things you’re going to pass as the x and y arguments to plot, and sometimes even xscale; similarly for the url argument to requests.get; and so on. All of these get a few characters shorter to type and read (and maybe a bit harder to accidentally screw up) if you can write “=x” instead of “x=x”. But are there other benefits, for less trivial cases?

Well it does introduce new syntax, just one that is generally useful.
First, the function only gets the names, not live expressions that it can eval late, or in a modified environment, etc., so it only handles this case, not cases like df[year>1990].
I don't understand how it's not the full expression. It's the expression as a string but it's still the full code.
Second, the names show up as keyword argument names, not as values. Besides being a bit confusing to mix up names and values this way, it means everything has to be defined as def plot(**kwargs) and document its actual arguments some other way, and can’t annotate the parameter types, and if there are optional like the ones pyplot takes for point formatting, scale, etc., there’s no protection against them colliding with the names of the two fake-positional args.
I think names being put with names isn't confusing ;) But yes, a system like this would probably need other tweaks for API design but that's to be expected.
You are correct. I have checked that this is the behavior of CPython, pypy, micropythob, iron python, and jython so shouldn't be a big burden.
Also, is this translation literally the way it’s implemented? Does that mean bare keyword args can only appear in a call expression in the same place as **, even though they don’t look similar?
Can be implemented. Not is. It's a suggestion.
This feature is easy to implement and has broad applications.
How is this implemented? Doesn’t the compiler have the same problem generating a string for the keyword out of an AST that the user code would have in the OP’s proposal?
Sure. The lack of round tripping in the standard library AST is a problem but in this case a simple AST dump is most likely fine even if it loses the users specific formatting.
Also, what are the other applications besides the plot example and equivalent things? I would guess the main uses are actually really simple cases. For example, in a lot of short, simple functions, you often create variables named x and y for the things you’re going to pass as the x and y arguments to plot, and sometimes even xscale; similarly for the url argument to requests.get; and so on. All of these get a few characters shorter to type and read (and maybe a bit harder to accidentally screw up) if you can write “=x” instead of “x=x”. But are there other benefits, for less trivial cases?
You are correct that this a big benefit in many small doses. There are some nice use cases like being able to create a debug printer that is very good. It's also very useful for us web developers because we often need to put a bunch of variables in a template context dict. / Anders

On Jul 14, 2019, at 11:53, Anders Hovmöller <boxed@killingar.net> wrote:
Right, sorry, I meant that it doesn’t need any new _symbols_, just a new use of an existing symbol (and in a way that’s unambiguous, both to parsers and humans). That’s obviously a big plus compared to using backticks.
First, the function only gets the names, not live expressions that it can eval late, or in a modified environment, etc., so it only handles this case, not cases like df[year>1990].
I don't understand how it's not the full expression. It's the expression as a string but it's still the full code.
But this doesn’t in any way help with late eval. The caller has already evaluated the expression to pass the value of the keyword argument. The fact that you could ignore that value and re-do the eval doesn’t help if the eval would raise a NameError like the year>1990 case (or if it would do something expensive or dangerous). And it doesn’t help you eval against a modified version of the caller environment because you don’t have the caller environment. That’s what I mean by it not being a “live expression”. You don’t need that for the df[year>1990] case, but you do for some of the other examples. Consider just the minor change of replacing the constant 1990 with a local start: df[year>start]. The OP didn’t explain how this would actually work, but it seems pretty simple—e.g., just eval against a chainmap of __dict__ (to get the year column) over the bound locals (to get the start value). There’s no way to do that with your proposal. (Well, no way short of frame hacks that you can already do without your proposal.) Finally, while this is a less serious problem than the last two, the same string isn’t guaranteed to compile to the same AST in two different contexts. Consider how your proposal would play with libraries that insert token or AST processors.
Second, the names show up as keyword argument names, not as values. Besides being a bit confusing to mix up names and values this way, it means everything has to be defined as def plot(**kwargs) and document its actual arguments some other way, and can’t annotate the parameter types, and if there are optional like the ones pyplot takes for point formatting, scale, etc., there’s no protection against them colliding with the names of the two fake-positional args.
I think names being put with names isn't confusing ;) But yes, a system like this would probably need other tweaks for API design but that's to be expected.
But they’re not being used as variable names, they’re being used as string values. That is the same confusion novices always run into that makes them ask “How do I create a local variable, for each file in a list, named after the filename?” The answer is that you don’t want to do that; you don’t want data in your variable names. But your proposal does exactly that. In the case of get(=url), that’s perfect, because you’re just saying the caller and callee both have the same value in a variable named “url”, which is exactly what you want to say. But in the case of plot(=lambda x:x*2, =lambda x:x**2), you’re saying that the callee has a variable named lambda x:x*2, which is not true. What the callee actually has is two unnamed positional arguments, each of which comes with an extra string argument buried as its name. (The fact that they’re positional, but passed as if they were keyword, accepted as if they were keyword, and then pulled out by iterating **kw in order is also confusing.)
This might actually be a good change even on its own. The fact that the docs aren’t clear on what can go in a **kw is probably not a strength of the language definition but a flaw. If every implementation does the exact same thing (especially if it’s been that way unchanged in every implementation from 2.3 to 3.8), why not document that as the rule?
It’s not just about a lack of round tripping in the standard library AST, it’s about a lack of round tripping in the implementation of the CPython, PyPy, etc. compilers. If the compilers don’t have access to that information internally, they can’t put it in the output. And I think it would be pretty confusing if spam(=eggs+cheese) gave you an argument named “eggs + cheese” instead of “eggs+cheese”. And it would be annoying if you were using it to define labels on a displayed graph—you keep trying to tweak the code to change how the label is written and it doesn’t do what you tell it to do.

Ah. Now I'm with you. Yes this is different but it also seems strange that we wouldn't just use a lambda or a function for that. That's exactly why they exist.
Finally, while this is a less serious problem than the last two, the same string isn’t guaranteed to compile to the same AST in two different contexts. Consider how your proposal would play with libraries that insert token or AST processors.
Hmm, well I've never seen one of those outside toy programs but I'll concede the point.
But in the case of plot(=lambda x:x*2, =lambda x:x**2), you’re saying that the callee has a variable named lambda x:x*2, which is not true. What the callee actually has is two unnamed positional arguments, each of which comes with an extra string argument buried as its name.
I disagree. The callee has a thing called that, it just isn't a variable but a local expression. And in any kwargs have never meant that the caller has a variable by the name of the callee argument so why would it here?
(The fact that they’re positional, but passed as if they were keyword, accepted as if they were keyword, and then pulled out by iterating **kw in order is also confusing.)
What is positional? Everything here is keyword. We must be talking past each other. My proposal is just a different short syntax for keyword arguments at the call site.
Agreed! If you don't follow this you are de facto a broken python implementation even if the spec allows it.
Agreed. But there must be a way to explicitly specify the label anyway so I don't think this is a deal breaker. / Anders

On Jul 15, 2019, at 00:27, Anders Hovmöller <boxed@killingar.net> wrote:
Yes, but they’re verbose, so they hardly solve the problem here. Think about it this way: The OP is unhappy that he had to write df[(df.price > 500) | (df.tip >5)] instead of just df[(price > 500) | (tip >5)]. Would you really suggest that the answer is to write df[lambda d: d.price > 500 or d.tip > 5]? There have been lots of proposals for general late binding, like Nick Coghlan’s concise lambda idea, that might solve this, but none of them have ever gotten near ready for prime time status. The OP’s proposal only covers a much more limited set of uses, but it does seem to cover them well. Your proposal doesn’t even attempt to solve it, so it’s not really a substitute for the OP’s idea.
But in the case of plot(=lambda x:x*2, =lambda x:x**2), you’re saying that the callee has a variable named lambda x:x*2, which is not true. What the callee actually has is two unnamed positional arguments, each of which comes with an extra string argument buried as its name.
I disagree. The callee has a thing called that, it just isn't a variable but a local expression.
An expression isn’t a name that you call something, it’s instructions to compute something. Those are very different things. What happens if you try to assign a value to that expression, nonlocal it from a nested function, etc.? The question doesn’t even make sense, because those things only make sense for names, and expressions aren’t names.
And in any case, kwargs have never meant that the caller has a variable by the name of the callee argument, so why would it here?
Sure, today, kwargs means that the callee has a variable with that name, not the caller. But isn’t the whole point of your proposal that often the callee _does_ have a name that exactly matches the caller name, so you shouldn’t have to repeat it?
(The fact that they’re positional, but passed as if they were keyword, accepted as if they were keyword, and then pulled out by iterating **kw in order is also confusing.)
What is positional?
The name-value pairs are positional. If the keyword names aren’t being used to match the values to parameter names, or to distinguish the values in any other way, but instead to smuggle in arbitrary extra string values, how else can you handle them but by position? Consider a concrete example, your plot(=lambda x:x*2, =lambda x:x**2). How could plot process that? It has to pull them out of kwargs in order, acting as if they were positional arguments even though they aren’t passed that way, maybe something like this:

    def plot(**kwargs):
        for name, value in kwargs.items():
            draw(value, label=name)
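And since =expr isn’t real syntax, the closest simulation today is building that dict by hand, which makes the smuggling explicit (draw() is just a stand-in here):

    def draw(func, label):
        print("drawing", label)

    def plot(**kwargs):
        for name, func in kwargs.items():
            draw(func, label=name)

    # Simulating plot(=lambda x:x*2, =lambda x:x**2):
    plot(**{"lambda x:x*2": lambda x: x * 2,
            "lambda x:x**2": lambda x: x ** 2})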
This might actually be a good change even on its own. The fact that the docs aren’t clear on what can go in a **kw is probably not a strength of the language definition but a flaw. If every implementation does the exact same thing (especially if it’s been that way unchanged in every implementation from 2.3 to 3.8), why not document that as the rule?
Agreed! If you don't follow this you are de facto a broken python implementation even if the spec allows it.
You ought to propose this idea separately. Buried in the middle of this thread I don’t think it’s going to get noticed, but on b.p.o., or maybe its own thread, you’ve got a good shot of convincing people.
Sure, you can write something like, say, spam(eggs+cheese, label='eggs+cheese'), just as you’d do today. But then your proposal isn’t useful for these cases. If stringifying arbitrary expressions doesn’t get you the actual string that you’d want for things like labels, and can’t give you late eval or any other kind of modified eval, what does it get you? I think in your main use case, where the callee and caller often have variables with the same names and you want to be able to write plot(=x, =xscale, =fmt) or get(=url, =data, =auth) without repeating everything twice, it’s a nice idea. But I don’t think it really helps most of the OP’s uses.

It seems a bit excessive to be upset over that added verbosity, I think. I'd suggest df[lambda price, tip, **_: price > 500 or tip > 5], but that's basically the same. It does scale nicer with larger expressions, though.
There have been lots of proposals for general late binding, like Nick Coghlan’s concise lambda idea, that might solve this, but none of them have ever gotten near ready for prime time status. The OP’s proposal only covers a much more limited set of uses, but it does seem to cover them well. Your proposal doesn’t even attempt to solve it, so it’s not really a substitute for the OP’s idea.
Largely because I misunderstood what he wrote :)
Yea ok. I don't see the relevance to any of that. We're talking about a thing that produces a dict. The keys are strings. Pretty standard stuff.
That's one point. But the other is that we can get the string representation of the expression. It just happens that those two have an intersection when the argument expression is a simple variable name.
It's a syntax error. Just like normal keyword arguments:
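(A minimal illustration; the exact message wording varies across CPython versions.)

    >>> f(a=1, a=1)
      File "<stdin>", line 1
    SyntaxError: keyword argument repeated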
Good idea!
Much simpler code in the 80% of common cases? The proposal didn't aim much higher I think.
I think in your main use case, where the callee and caller often have variables with the same names and you want to be able to write plot(=x, =xscale, =fmt) or get(=url, =data, =auth) without repeating everything twice, it’s a nice idea. But I don’t think it really helps most of the OP’s uses.
Maybe not. / Anders

On Jul 15, 2019, at 05:25, Anders Hovmöller <boxed@killingar.net> wrote:
Why would two similar but different lambdas (doubling vs. squaring) with similar but different string forms (one * vs two) give a syntax error? I didn’t come up with the plot example, so I don’t know the reason it took two functions. But let’s say they represent functions for the y and z values respectively for each x, or the real and imaginary parts of a function on complex x, or the theta and phi values for each r, or two independent functions f and g that we want to plot the area between, or whatever. So, how does plot know which argument is y and which is z? It can’t be by the names, because the names are stringified lambda expressions. The only thing it can do is treat the first name-value pair in kwargs as y, and the second as z. In other words, it has to treat them as if they were positional arguments.

Well, I assumed that was a typo, since if it wasn't the question didn't make sense. If it wasn't a typo, the question would be "how can the code differentiate between two different strings?" which is a bit silly.
I didn’t come up with the plot example, so I don’t know the reason it took two functions. But let’s say they represent functions for the y and z values respectively for each x, or the real and imaginary parts of a function on complex x, or the theta and phi values for each r, or two independent functions f and g that we want to plot the area between, or whatever.
So, how does plot know which argument is y and which is z? It can’t be by the names, because the names are stringified lambda expressions.
The stringification produces two different strings. One has one * while the other has two. / Anders

On Jul 17, 2019, at 10:45, Anders Hovmöller <boxed@killingar.net> wrote:
Right, of course it can tell they’re different strings, but how can it tell which string is the name of the y function and which is the name of the z function? There’s nothing about having one * vs. two that tells you which one is y and which is z. And of course there’s nothing about the values, either. If there’s no way to tell based on the names, and no way to tell based on the values, what way can there be to tell, except for the position of the name-value pairs?

On Jul 17, 2019, at 12:29, Anders Hovmöller <boxed@killingar.net> wrote:
OK, now we can get back to the original point: code that looks like it’s using keyword arguments, but acts like it’s using positional arguments, is confusing to the reader, because it’s defeating the expected purpose of keyword arguments. Your proposal is good for your original use case, but it’s not good for the OP’s use case if it can only solve that case by confusing readers.

In case that isn’t clear: As the glossary puts it, the normal use of arguments is “Arguments are assigned to the named local variables in a function body”, and the normal use of keyword arguments is that you can name which of those variables the argument is for. When you write complex(imag=5, real=3), you know that the imag variable inside the constructor function gets 5 (and can even guess that the imag attribute of the constructed object will be 5). With your proposal, if you happened to have local variables lying around for real and imag, complex(=imag, =real) would be just as clear: the callee’s imag local gets the value of the caller’s imag local.

But in the plot example, that isn’t what’s happening. The callee y variable gets the first argument, and z gets the second. That’s not what keywords mean, or how they work, so it’s confusing.

Of course you’re right that Python 3.5+ makes it possible to write that code. Python lets you define different sets of names in __dir__ and __getattr__, or name a function sin when it actually calculates the arctangent, and it lets you pretend to take keyword arguments but treat them positionally; that doesn’t mean that doing so isn’t confusing, or that it’s a good idea.
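Just to spell out the contract being described (a trivial example, nothing beyond what the glossary says):

    # The keyword names the callee's parameter, so order at the call site doesn't matter:
    complex(imag=5, real=3) == complex(real=3, imag=5) == (3 + 5j)   # True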

I'm not convinced it's more confusing than doing something like plot(x=y, y=x, z=w) where the leftmost of those single characters are the plot-space names, and the rightmost are the problem-space names. Just having them in the order that makes sense and always using the problem-space names seems nice to me.
Sure. But that's not what happens with dict(), to take the most basic example.
But in the plot example, that isn’t what’s happening. The callee y variable gets the first argument, and z gets the second. That’s not what keywords mean, or how they work, so it’s confusing.
I don't have a problem with this. I disagree that this is "not what keywords mean, or how they work". It's not how they work in the most common case, I agree, but there are other cases. / Anders PS. Sorry for the super late reply, this got buried a bit in my inbox