Hello Python community,

Earlier this year I put forward a proposal for decorators on variables, which I felt was well received in concept but fell flat on the syntax. One response to that thread suggested that, rather than abusing the decorator, a new symbol could make the target of an assignment statement available as a string on the right-hand side. I would call this compile-time reflection of assignment targets, or simply reflection. Although I initially continued with my original syntax, the idea of using such a token in my own code came up on more than one occasion this year and I wanted to see what it might really look like.

From that, this proposal is to add a new token to Python, currently spelled <<<, which, at compile time, is replaced with the string representation of the nearest assignment target, either an assignment statement or an assignment expression. It does not bind to keyword arguments or default parameters, nor does it work in any other kind of non-assignment binding. When encountered anywhere other than the right-hand side of some assignment, it is an unconditional SyntaxError.

Having access to the target at runtime is most often useful in metaprogramming, and the standard library has several factory functions that require the user to repeat the variable name in the function call to get the expected behavior. Below are some examples of where I believe the reflection token would be used if adopted.
>>> Point = namedtuple(<<<, 'x, y, z')
>>> Point
<class '__main__.Point'>
>>> UUIDType = NewType(<<<, str)
>>> UUIDType
__main__.UUIDType
>>> class Colors(Enum):
...     Black = <<<
...     GRAY = <<<
...     WHITE = <<<
...
>>> Colors.GRAY.value
'GRAY'
>>> HOME = '$' + <<<
>>> HOME
'$HOME'
>>> if error := response.get(<<<):
...     if errorcode := error.get(<<<):
...         print(f"Can't continue, got {errorcode=}")
...
Can't continue, got errorcode=13
To get a feel for using this new token I have created a fork of the 3.11 alpha that implements a *very* incomplete version of this new grammar, just enough to actually produce all of the examples above. It also passes a small new test suite with further examples https://github.com/ucodery/cpython/blob/reflection/Lib/test/test_reflection.... .
On Sat, Oct 9, 2021 at 6:24 AM Jeremiah Paige <ucodery@gmail.com> wrote:
Below are some examples of where I believe the reflection token would be used if adopted.

>>> Point = namedtuple(<<<, 'x, y, z')
>>> Point
<class '__main__.Point'>

>>> UUIDType = NewType(<<<, str)
>>> UUIDType
__main__.UUIDType
Not very commonly needed. The class keyword handles this just fine; namedtuple does require that repetition, but I don't know of any other cases where people construct types like this.
>>> class Colors(Enum):
...     Black = <<<
...     GRAY = <<<
...     WHITE = <<<
...
>>> Colors.GRAY.value
'GRAY'
Can do this just as easily using Enum.auto().
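(For reference, a minimal sketch of the auto() route: Enum's documented _generate_next_value_ hook is overridden so that auto() hands back the member name itself, reproducing the 'GRAY' value from the example above.)

    from enum import Enum, auto

    class Colors(Enum):
        # With this hook overridden, auto() returns the member name as the value.
        def _generate_next_value_(name, start, count, last_values):
            return name
        BLACK = auto()
        GRAY = auto()
        WHITE = auto()

    print(Colors.GRAY.value)  # GRAY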
>>> HOME = '$' + <<<
>>> HOME
'$HOME'
Wow, this is so incredibly useful. I'm sure I would use this construct *at least* once per decade if it existed.
>>> if error := response.get(<<<):
...     if errorcode := error.get(<<<):
...         print(f"Can't continue, got {errorcode=}")
...
Can't continue, got errorcode=13
Ugh. I'm sure there would be better ways to advocate unpacking but this really isn't selling it.
To get a feel for using this new token I have created a fork of the 3.11 alpha that implements a *very* incomplete version of this new grammar, just enough to actually produce all of the examples above. It also passes a small new test suite with further examples https://github.com/ucodery/cpython/blob/reflection/Lib/test/test_reflection.....
General summary based on all of the examples in that file: Use the class statement more. There is absolutely no reason, for instance, to use make_dataclass in this way - just use the class statement instead. There *may* be some value in the use of TypeVar like this (not sure though), in which case there'd be two mildly tempting use-cases, neither of which is hugely common (TypeVar and namedtuple). For the rest, a big ol' YAGNI on a syntax that badly impairs readability.

And while I would somewhat like to see a dictionary unpacking syntax, this really isn't selling the concept well.

ChrisA
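(To make the make_dataclass comparison concrete, a small sketch of the two spellings; the names PointA and PointB are purely illustrative.)

    from dataclasses import dataclass, make_dataclass

    # Factory form: the class name has to be repeated as a string.
    PointA = make_dataclass('PointA', [('x', int), ('y', int), ('z', int)])

    # Class statement: the compiler supplies the name, so nothing is repeated.
    @dataclass
    class PointB:
        x: int
        y: int
        z: int

    print(PointA(1, 2, 3), PointB(1, 2, 3))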
On Fri, Oct 8, 2021 at 2:30 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Oct 9, 2021 at 6:24 AM Jeremiah Paige <ucodery@gmail.com> wrote:
Below are some examples of where I believe the reflection token would be used if adopted.

>>> Point = namedtuple(<<<, 'x, y, z')
>>> Point
<class '__main__.Point'>

>>> UUIDType = NewType(<<<, str)
>>> UUIDType
__main__.UUIDType
Not very commonly needed. The class keyword handles this just fine; namedtuple does require that repetition, but I don't know of any other cases where people construct types like this.
Besides these two and the two more in the test file, the standard library has type, new_class, import_module, TypedDict, ParamSpec, and probably more, less used, factories I have missed.
>>> class Colors(Enum):
...     Black = <<<
...     GRAY = <<<
...     WHITE = <<<
...
>>> Colors.GRAY.value
'GRAY'
Can do this just as easily using Enum.auto().
That's fair, but this works for constants in dataclasses, attrs, generally any class or namespace.
>>> HOME = '$' + <<<
>>> HOME
'$HOME'
Wow, this is so incredibly useful. I'm sure I would use this construct *at least* once per decade if it existed.
Perhaps the concatenation, showing it is just a string, was a poor example. In my own code I often assign strings that are reused, such as dict keys, to variables of the same spelling. It looks like CPython also does this at least a few hundred times.
>>> if error := response.get(<<<):
...     if errorcode := error.get(<<<):
...         print(f"Can't continue, got {errorcode=}")
...
Can't continue, got errorcode=13
Ugh. I'm sure there would be better ways to advocate unpacking but this really isn't selling it.
To get a feel for using this new token I have created a fork of the 3.11 alpha that implements a *very* incomplete version of this new grammar, just enough to actually produce all of the examples above. It also passes a small new test suite with further examples https://github.com/ucodery/cpython/blob/reflection/Lib/test/test_reflection.... .
General summary based on all of the examples in that file: Use the class statement more. There is absolutely no reason, for instance, to use make_dataclass in this way - just use the class statement instead. There *may* be some value in the use of TypeVar like this (not sure though), in which case there'd be two mildly tempting use-cases, neither of which is hugely common (TypeVar and namedtuple). For the rest, a big ol' YAGNI on a syntax that badly impairs readability.
And while I would somewhat like to see a dictionary unpacking syntax, this really isn't selling the concept well.
The syntax is not only helpful to dictionary unpacking, but any retrieval by string and so is general to e.g. match.group, list.index, Message.get. Regards ~ Jeremiah
On Sat, Oct 9, 2021 at 10:02 AM Jeremiah Paige <ucodery@gmail.com> wrote:
On Fri, Oct 8, 2021 at 2:30 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Oct 9, 2021 at 6:24 AM Jeremiah Paige <ucodery@gmail.com> wrote:
Below are some examples of where I believe the reflection token would be used if adopted.

>>> Point = namedtuple(<<<, 'x, y, z')
>>> Point
<class '__main__.Point'>

>>> UUIDType = NewType(<<<, str)
>>> UUIDType
__main__.UUIDType
Not very commonly needed. The class keyword handles this just fine; namedtuple does require that repetition, but I don't know of any other cases where people construct types like this.
Besides these two and the two more in the test file, the standard library has type, new_class, import_module, TypedDict, ParamSpec, and probably more, less used, factories I have missed.
But most of those don't need to be used with constants. You don't use the type constructor when you could just use a class statement. I'm not sure about the others since I have literally never used them in production; which is an indication of how much they need special syntax to support them (namely: approximately zero).
>>> class Colors(Enum):
...     Black = <<<
...     GRAY = <<<
...     WHITE = <<<
...
>>> Colors.GRAY.value
'GRAY'
Can do this just as easily using Enum.auto().
That's fair, but this works for constants in dataclasses, attrs, generally any class or namespace.
Can you provide better examples then? When you offer a new piece of syntax, saying "well, it could be useful for other things" isn't nearly as convincing as actual examples that will make people's lives better.
>>> HOME = '$' + <<<
>>> HOME
'$HOME'
Wow, this is so incredibly useful. I'm sure I would use this construct *at least* once per decade if it existed.
Perhaps the concatenation, showing it is just a string, was a poor example. In my own code I often assign strings that are reused, such as dict keys, to variables of the same spelling. It looks like CPython also does this at least a few hundred times.
Again, need better examples if it's to be of value. Preferably, show places where it's not just a matter of saving keystrokes (which are cheap) - show places where it reduces errors.
The syntax is not only helpful to dictionary unpacking, but any retrieval by string and so is general to e.g. match.group, list.index, Message.get.
Match groups (assuming they're named - personally, I more often use positional groups) and Message.get are definitely a plausible use-case for something, but this syntax isn't really selling it. I've no idea what your use-cases for list.index are.

A generic unpacking syntax might be plausible, but it would need to handle multiple unpackings in a single operation, and it'd achieve something like:

spam, ham, eggs, sausages = foo["spam"], foo["ham"], foo["eggs"], foo["sausages"]

Writing that in a way that doesn't involve repeating the keys OR the thing being unpacked *would* be tempting, but the syntax you're proposing can't handle that.

If your use-cases are like this, I would be much more inclined to recommend class syntax, maybe with a suitable decorator. It's a great way to create a namespace. You can do all kinds of namespace-like things by starting with a declarative structure and then giving that to whatever function you like.

ChrisA
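(A rough sketch of the class-plus-decorator pattern described above, applied to the spam/ham example; the unpack helper is invented for illustration, not an existing API.)

    from types import SimpleNamespace

    def unpack(source):
        """Pull one value out of `source` for every annotated name on the class."""
        def decorate(cls):
            return SimpleNamespace(**{name: source[name] for name in cls.__annotations__})
        return decorate

    foo = {"spam": 1, "ham": 2, "eggs": 3, "sausages": 4}

    @unpack(foo)
    class meal:
        spam: int
        ham: int
        eggs: int
        sausages: int

    print(meal.spam, meal.sausages)  # 1 4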
On Fri, Oct 8, 2021 at 4:27 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Oct 9, 2021 at 10:02 AM Jeremiah Paige <ucodery@gmail.com> wrote:
On Fri, Oct 8, 2021 at 2:30 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Oct 9, 2021 at 6:24 AM Jeremiah Paige <ucodery@gmail.com>
wrote:
Below are some examples of where I believe the reflection token would be used if adopted.

> Point = namedtuple(<<<, 'x, y, z')
> Point
<class '__main__.Point'>

> UUIDType = NewType(<<<, str)
> UUIDType
__main__.UUIDType
Not very commonly needed. The class keyword handles this just fine; namedtuple does require that repetition, but I don't know of any other cases where people construct types like this.
Besides these two and the two more in the test file, the standard library has type, new_class, import_module, TypedDict, ParamSpec, and probably more, less used, factories I have missed.
But most of those don't need to be used with constants. You don't use the type constructor when you could just use a class statement. I'm not sure about the others since I have literally never used them in production; which is an indication of how much they need special syntax to support them (namely: approximately zero).
> class Colors(Enum):
...     Black = <<<
...     GRAY = <<<
...     WHITE = <<<
...
> Colors.GRAY.value
'GRAY'
Can do this just as easily using Enum.auto().
That's fair, but this works for constants in dataclasses, attrs, generally any class or namespace.
Can you provide better examples then? When you offer a new piece of syntax, saying "well, it could be useful for other things" isn't nearly as convincing as actual examples that will make people's lives better.
> HOME = '$' + <<<
> HOME
'$HOME'
Wow, this is so incredibly useful. I'm sure I would use this construct *at least* once per decade if it existed.
Perhaps the concatenation, showing it is just a string, was a poor example. In my own code I often assign strings that are reused, such as dict keys, to variables of the same spelling. It looks like CPython also does this at least a few hundred times.
Again, need better examples if it's to be of value. Preferably, show places where it's not just a matter of saving keystrokes (which are cheap) - show places where it reduces errors.
The syntax is not only helpful to dictionary unpacking, but any retrieval by string and so is general to e.g. match.group, list.index, Message.get.
Match groups (assuming they're named - personally, I more often use positional groups) and Message.get are definitely a plausible use-case for something, but this syntax isn't really selling it. I've no idea what your use-cases for list.index are.
A generic unpacking syntax might be plausible, but it would need to handle multiple unpackings in a single operation, and it'd achieve something like:
spam, ham, eggs, sausages = foo["spam"], foo["ham"], foo["eggs"], foo["sausages"]
Writing that in a way that doesn't involve repeating the keys OR the thing being unpacked *would* be tempting, but the syntax you're proposing can't handle that.
If your use-cases are like this, I would be much more inclined to recommend class syntax, maybe with a suitable decorator. It's a great way to create a namespace. You can do all kinds of namespace-like things by starting with a declarative structure and then giving that to whatever function you like.
ChrisA
Here is a pseudo-program showing where I would like to use this token in my own code if it existed. I think besides the cases where one is forced to always repeat the variable name as a string (namedtuple, NewType) this is an easy way to express clear intent to link the variable name to either its value or original source.
REGION = os.getenv(<<<)
db_url = config[REGION][<<<]
name = arguments.get(<<<)
con = connect(db_url)
knights = <<<
horse = <<<
con.execute(f"SELECT * FROM {knights} WHERE {horse}=?", (name,))
Using the new token like this will remove bugs where the variable name was spelled correctly, but the string doing the lookup has a typo. Admittedly this is a small set of bugs, but I have run into them before. Where I see this being a bigger advantage is purposefully linking variable names within Python to names outside, making it easier to refactor and easier to trace usage across an entire service and across different environments.

For the other use, in factory functions, I believe we have just come to accept that it is okay to have to repeat ourselves to dynamically generate certain objects in a dynamic language. The fact is that variable names are relevant in Python and can be a useful piece of information at runtime as well as at compile time or for static analysis. This is why some objects have a __name__: it is useful information despite the fact that it may not always be accurate.
>>> def foo(): pass
...
>>> bar = foo
>>> del foo
>>> bar.__name__
'foo'
It may not be incredibly common but it is a power that the compiler has that is not really available to the programmer. And not every place where variable name access can be used would benefit from being implemented with the large class object and the complex implementation of a metaclass. Regards, ~ Jeremiah
I suspect there won’t be enough support for this proposal to ever make it happen, but at the very least could you think of a different token? The three left arrows just look too weird (esp. in the REPL examples, where they strongly seem to suggest a false symmetry with the ‘>>>’ prompt). How did you decide to use this symbol?

On Fri, Oct 15, 2021 at 14:25 Jeremiah Paige <ucodery@gmail.com> wrote:
-- --Guido (mobile)
On Fri, Oct 15, 2021 at 2:32 PM Guido van Rossum <guido@python.org> wrote:
I suspect there won’t be enough support for this proposal to ever make it happen, but at the very least could you think of a different token? The three left arrows just look too weird (esp. in the REPL examples, where they strongly seem to suggest a false symmetry with the ‘>>>’ prompt). How did you decide to use this symbol?
Yes, I would consider a different token. I am not the happiest with `<<<` to start with. I wanted a symbol that evoked "this value comes from the left hand side of the assignment". Most symbols containing a `=` either already mean some sort of assignment, or look like they might become an assignment operator in the future. I went with an arrow, pointing to the target, but one that doesn't conflict with any existing arrow symbols. When this idea first surfaced on ideas it was spelled `@@` which doesn't really seem to evoke anything; maybe that's good as it can't be confused, but I was hoping for an intuitive symbol. `$` also made some sense to me as referring to the target, but felt maybe out of place in python. So perhaps `$`, `%%`, or `@@`? It doesn't feel important enough, even to me, to use a keyword, and soft keywords are out because of where it is allowed. I'm open to other suggestions. Regards, Jeremiah
On Fri, Oct 15, 2021 at 6:02 PM Jeremiah Paige <ucodery@gmail.com> wrote:
On Fri, Oct 15, 2021 at 2:32 PM Guido van Rossum <guido@python.org> wrote:
I suspect there won’t be enough support for this proposal to ever make it happen, but at the very least could you think of a different token? The three left arrows just look too weird (esp. in the REPL examples, where they strongly seem to suggest a false symmetry with the ‘>>>’ prompt). How did you decide to use this symbol?
Yes, I would consider a different token. I am not the happiest with `<<<` to start with. I wanted a symbol that evoked "this value comes from the left hand side of the assignment". Most symbols containing a `=` either already mean some sort of assignment, or look like they might become an assignment operator in the future. I went with an arrow, pointing to the target, but one that doesn't conflict with any existing arrow symbols. When this idea first surfaced on ideas it was spelled `@@` which doesn't really seem to evoke anything; maybe that's good as it can't be confused, but I was hoping for an intuitive symbol. `$` also made some sense to me as referring to the target, but felt maybe out of place in python.
So perhaps `$`, `%%`, or `@@`? It doesn't feel important enough, even to me, to use a keyword, and soft keywords are out because of where it is allowed. I'm open to other suggestions.
Regards, Jeremiah
You say a soft keyword isn't an option and I understand why, but what about one that is incredibly unlikely to have been used very often? I'm thinking of just a simple double underscore:
>>> a = __
>>> a
'a'
This would be a breaking change, but surely __ is not in widespread use...? And I think it looks a bit of alright.
Point = NamedTuple(__, "x y")
--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Fri, Oct 15, 2021 at 3:37 PM Ricky Teachey <ricky@teachey.org> wrote:
You say a soft keyword isn't an option and I understand why, but what about one that is incredibly unlikely to have been used very often? I'm thinking of just a simple double underscore:
>>> a = __
>>> a
'a'
This would be a breaking change, but surely __ is not in widespread use...? And I think it looks a bit of alright.
Point = NamedTuple(__, "x y")
--- Ricky.
"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
IIRC those implementing pattern matching want to use `__` as literally a non-binding name. They can't use `_`, as is done in some other languages, because that name is used with some regularity by Python modules.

Regards, ~ Jeremiah
On Fri, Oct 15, 2021 at 06:37:04PM -0400, Ricky Teachey wrote:
You say a soft keyword isn't an option and I understand why, but what about one that is incredibly unlikely to have been used very often? I'm thinking of just a simple double underscore:
>>> a = __
>>> a
'a'
I frequently use double underscore as a "don't care" name to avoid shadowing the single underscore, which has special meaning in the interactive interpreter as the previous result. Jupyter and IPython use `__` for the second previous result, and `___` for the third previous result.

    In [10]: 2*6
    Out[10]: 12

    In [11]: 2*4
    Out[11]: 8

    In [12]: _ + __
    Out[12]: 20

It would be nice if somebody were to do a survey of other languages and see which ones, if any, provide this functionality and what token they use for it. The token should preferably be:

* self-explanatory, not line-noise;

* shorter rather than longer, otherwise it is easier to just type the target name as a string: 'x' is easier to type than NAME_OF_ASSIGNMENT_TARGET;

* backwards compatible, which means it can't be anything that is already a legal name or expression;

* doesn't look like an error or typo.

I don't think we can satisfy all four requirements, and I'm not entirely convinced that the use-cases are common or important enough to justify this feature if it only satisfies two or three of them.

Classes, functions, decorators and imports already satisfy the "low hanging fruit" for this functionality. My estimate is that well over 99% of the use-cases for this fall into just four examples, which are already satisfied by the interpreter:

    # like `math = __import__('math')`
    import math

    # like K = type('K', (), ns)
    class K(): ...

    # like func = types.FunctionType(code, globals,
    #              name='func', argdefs, closure)
    def func(): ...

    # like func = decorator(func)
    # similarly for classes
    @decorator
    def func(): ...

If we didn't already have interpreter support for these four cases, it would definitely be worth coming up with a solution. But the use-cases that remain are, I think, quite niche and uncommon.

-- Steve
On Sat, 16 Oct 2021, Steven D'Aprano wrote:
The token should preferably be:
* self-explanatory, not line-noise;
* shorter rather than longer, otherwise it is easier to just type the target name as a string: 'x' is easier to type than NAME_OF_ASSIGNMENT_TARGET;
* backwards compatible, which means it can't be anything that is already a legal name or expression;
* doesn't look like an error or typo.
A possible soft keyword: __lhs__ (short for 'left-hand side'):
REGION = os.getenv(__lhs__)
db_url = config[REGION][__lhs__]
It's not especially short, and it's not backward-compatible, but at least there's a history of adding double-underscore things. Perhaps, for backward compatibility, the feature could be disabled in any scope (or file?) where __lhs__ is assigned, in which case it's treated like a variable as usual. The magic version only applies when it's used in a read-only fashion. It's kind of like a builtin variable, but its value changes on every line (and it's valid only in an assignment line). One thing I wonder: what happens if you write the following?
foo[1] = __lhs__ # or <<< or whatever
Maybe you get 'foo[1]', or maybe this is invalid syntax, in the same way that the following is.
def foo[1]: pass
Classes, functions, decorators and imports already satisfy the "low hanging fruit" for this functionality. My estimate is that well over 99% of the use-cases for this fall into just four examples, which are already satisfied by the interpreter: [...]

    # like func = decorator(func)
    # similarly for classes
    @decorator
    def func(): ...
This did get me wondering about how you could simulate this feature with decorators. Probably obvious, but here's the best version I came up with:

```
def env_var(x):
    return os.getenv(x.__name__)

@env_var
def REGION(): pass
```

It's definitely ugly to avoid repetition... Using a class, I guess we could at least get several such variables at once.
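(Following that thought, one possible class-based version; env_vars is an invented helper, shown only as a sketch of the idea.)

    import os

    def env_vars(cls):
        # Fill in each annotated name from the matching environment variable,
        # then hand the class back to serve as the namespace.
        for name in cls.__annotations__:
            setattr(cls, name, os.getenv(name))
        return cls

    @env_vars
    class env:
        REGION: str
        HOME: str

    print(env.REGION, env.HOME)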
If we didn't already have interpreter support for these four cases, it would definitely be worth coming up with a solution. But the use-cases that remain are, I think, quite niche and uncommon.
To me (a mathematician), the existence of this magic in def, class, import, etc. is a sign that this is indeed useful functionality. As a fan of first-class language features, it definitely makes me wonder whether it could be generalized. But I'm not sure what the best mechanism is. (From the description in the original post, I gather that variable assignment decorators didn't work out well.) I wonder about some generalized mechanism for automatically setting the __name__ of an assigned object (like def and class), but I'm not sure what it would look like... Erik -- Erik Demaine | edemaine@mit.edu | http://erikdemaine.org/
On Sat, Oct 16, 2021 at 09:19:26AM -0400, Erik Demaine wrote:
To me (a mathematician), the existence of this magic in def, class, import, etc. is a sign that this is indeed useful functionality. As a fan of first-class language features, it definitely makes me wonder whether it could be generalized.
Obviously it is useful, because we have four major uses for it: imports, classes, functions and decorators. And I can think of at least two minor uses: the three argument form of type(), and namedtuple.

The question is, are any additional uses worth the effort and potential ugliness of generalising it? Not every generalisation is worth it. Sometimes you can overgeneralise.

https://www.joelonsoftware.com/2001/04/21/dont-let-architecture-astronauts-s...

Until we have some compelling use-cases beyond the Big Four, I think this is a case of YAGNI.

https://www.martinfowler.com/bliki/Yagni.html

I've been thinking about use-cases for this feature for literally *years*, I've even got a handful of rough notes for a proto-PEP written down. To my embarrassment, today was the first time I realised that imports are also an example of this feature. So it is possible that there are many great use-cases for this and I just can't see them.
But I'm not sure what the best mechanism is.
The mechanism is probably easy, if built into the compiler. When the compiler sees something that looks like a binding operation:

    target = expression

it knows what the target is. If the expression contains some magic token, the compiler can substitute the target as a string for the magic token. (There is at least one other possible solution.)

So the implementation is probably easy. It is the design that is hard, and the justification lacking.

-- Steve
On Sat, Oct 16, 2021 at 8:46 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Oct 16, 2021 at 09:19:26AM -0400, Erik Demaine wrote:
To me (a mathematician), the existence of this magic in def, class, import, etc. is a sign that this is indeed useful functionality. As a fan of first-class language features, it definitely makes me wonder whether it could be generalized.
Obviously it is useful, because we have four major uses for it: imports, classes, functions and decorators. And I can think of at least two minor uses: the three argument form of type(), and namedtuple.
The question is, are any additional uses worth the effort and potential ugliness of generalising it?
Not every generalisation is worth it. Sometimes you can overgeneralise.
https://www.joelonsoftware.com/2001/04/21/dont-let-architecture-astronauts-s...
Until we have some compelling use-cases beyond the Big Four, I think this is a case of YAGNI.
https://www.martinfowler.com/bliki/Yagni.html
I've been thinking about use-cases for this feature for literally *years*, I've even got a handful of rough notes for a proto-PEP written down. To my embarrassment, today was the first time I realised that imports are also an example of this feature. So it is possible that there are many great use-cases for this and I just can't see them.
You made me realize what the four built-in methods of accessing the target name have in common: they are all building namespaces. Okay, maybe decorators aren't creating a new namespace, not beyond what def already does, and def isn't often thought of as a means to create a new namespace, but it certainly does. So not having this access impedes a Pythonista's ability to create novel namespaces. This token alone isn't going to make new namespace types effortless, but it is maybe the last piece not already available at runtime. And yes, we already have SimpleNamespace for just creating a bag of things. But a SimpleNamespace is still a class object when broken down; it's just one that does some namespace things better than object.

There is actually at least one more first-class use of target names: that of async def. This is different from def because it creates a coroutine function, not simply a FunctionType. Python actually has many different types that can be created using the def keyword, but async shows that sometimes it is not even enough to use a decorator, or a metaclass with a __call__. Having access to the target would allow for new function types as well as new namespace types.

Finally, sometimes the name a class instance is assigned to *is* very significant and needs to be taken into account when creating it. Examples are the classes from the typing module given here, or a sentinel class that is only ever used as instances. It is not always possible, or desirable, to subclass and work with class objects rather than instances just to have access to the binding name.
But I'm not sure what the best mechanism is.
The mechanism is probably easy, if built into the compiler. When the compiler sees something that looks like a binding operation:
target = expression
it knows what the target is. If the expression contains some magic token, the compiler can substitute the target as a string for the magic token. (There is at least one other possible solution.)
So the implementation is probably easy. It is the design that is hard, and the justification lacking.
Yes, this is what my toy PoC does. It replaces the node with a string while building the AST. Regards, ~Jeremiah
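(For readers who want to picture that transformation, here is a rough userland sketch using the ast module; it spells the token as a hypothetical __target__ placeholder, since <<< itself cannot be parsed by ast.parse, and it only handles the simple single-name-target case.)

    import ast

    class ReflectTarget(ast.NodeTransformer):
        """Rewrite the __target__ placeholder into the assignment target's name."""

        def visit_Assign(self, node):
            # Only the simple single-name-target case is handled in this sketch.
            if len(node.targets) == 1 and isinstance(node.targets[0], ast.Name):
                node.value = _substitute(node.value, node.targets[0].id)
            return node

    def _substitute(expr, target):
        """Replace every Name node spelled __target__ with the string `target`."""
        class _Sub(ast.NodeTransformer):
            def visit_Name(self, node):
                if node.id == "__target__":
                    return ast.copy_location(ast.Constant(target), node)
                return node
        return _Sub().visit(expr)

    tree = ast.parse("Point = namedtuple(__target__, 'x y z')")
    tree = ReflectTarget().visit(tree)
    ast.fix_missing_locations(tree)
    print(ast.unparse(tree))  # Point = namedtuple('Point', 'x y z')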
On Sat, Oct 16, 2021 at 6:22 AM Erik Demaine <edemaine@mit.edu> wrote:
It's not especially short, and it's not backward-compatible, but at least there's a history of adding double-underscore things. Perhaps, for backward compatibility, the feature could be disabled in any scope (or file?) where __lhs__ is assigned, in which case it's treated like a variable as usual. The magic version only applies when it's used in a read-only fashion. It's kind of like a builtin variable, but its value changes on every line (and it's valid only in an assignment line).
This could probably be toggled via a __future__ import, which would make its usage more apparent to readers of the file. But this would imply that the keyword would eventually be turned on by default; I don't think there are any examples of __future__ imports intended to be left forever as options for the programmer.

I would be against any sort of magic replacement-only-when-otherwise-NameError. It seems too easy to mistakenly change its value because the namespace of locals or globals changed somewhere else.

Regards, ~ Jeremiah
On Tue, Oct 19, 2021 at 11:30 AM Jeremiah Paige <ucodery@gmail.com> wrote:
This could probably be toggled via a __future__ import which would make its usage more apparent to readers of the file. But this would imply that the keyword would eventually be turned on by default; I don't think there are any examples of __future__ imports intended to be left forever as options for the programmer.
Aside from barry_as_FLUFL, which doesn't really count, no - they all have clearly-defined end dates after which they remain syntactically valid but have no effect. ChrisA
On Sat, Oct 16, 2021 at 8:22 AM Jeremiah Paige <ucodery@gmail.com> wrote:
Here is a pseudo-program showing where I would like to use this token in my own code if it existed. I think besides the cases where one is forced to always repeat the variable name as a string (namedtuple, NewType) this is an easy way to express clear intent to link the variable name to either its value or original source.
REGION = os.getenv(<<<)
db_url = config[REGION][<<<]
name = arguments.get(<<<)
con = connect(db_url)
knights = <<<
horse = <<<
con.execute(f"SELECT * FROM {knights} WHERE {horse}=?", (name,))
Toys like this often don't sell the idea very well, because there's a solid criticism of every example:

1) The environment variable REGION shouldn't be assigned to a variable named REGION, because it's not a constant. In real-world code, I'd be more likely to write >> region = os.getenv('REGION') << which wouldn't work with this magic token.

2) I'd be much more likely to put the entire config block into a variable >> cfg = config[os.getenv("REGION")] << and then use cfg.db_url for all config variables, so they also wouldn't be doubling up the names.

3) Not sure what you're doing with arguments.get(), but if that's command line args, it's way easier to wrap everything up and make them into function parameters.

4) I've no idea why you'd be setting knights to the string literal "knights" outside of a toy. If it's for the purpose of customizing the table name in the query, wouldn't it be something like >> table = "knights" << ?

I'm sure there are better examples than these, but these ones really aren't a great advertisement.
Using the new token like this will remove bugs where the variable name was spelled correctly, but the string doing the lookup has a typo. Admittedly this is a small set of bugs, but I have run into them before. Where I see this being a bigger advantage is purposefully linking variable names within Python to names outside, making it easier to refactor and easier to trace usage across an entire service and across different environments.
Yes. I agree in principle, but what I usually end up with is either inverting the mapping, or using a class. Here are two real-world examples from a couple of tools of mine:

    @cmdline
    def confirm_user(id, hex_key):
        """Attempt to confirm a user's email address

        id: Numeric user ID (not user name or email)
        hex_key: Matching key to the one stored, else the confirmation fails
        """

The cmdline decorator is built on top of argparse and examines the function's name and arguments to set up the argparse config. The main() function then processes arguments and calls a function as appropriate. If you don't want any name replication at all, you could use a non-type annotation like this:

    @cmdline
    def confirm_user(
        id: "Numeric user ID (not user name or email)",
        hex_key: "Matching key to the one stored, else the confirmation fails",
    ):
        """Attempt to confirm a user's email address"""

This is what I mean by inverting the mapping. Instead of lines like >> hex_key = arguments.get("hex_key") << there are generic handlers that use func(**args) to map all necessary arguments directly.

The other example is a use (abuse?) of class syntax to provide names:

    class Heavy_Encased_Frame(Manufacturer):
        Modular_Frame: 8
        Encased_Industrial_Beam: 10
        Steel_Pipe: 36
        Concrete: 22
        time: 64
        Heavy_Modular_Frame: 3

It effectively forms a DSL that takes advantage of the names that classes have, and the way that they can "contain" a series of directives, which will be retained with name and value.

Can you find some real-world examples where you're frequently doing the sorts of assignment that this new syntax would help?
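(A minimal sketch of the "generic handler" idea described above; dispatch is an invented name, and the real cmdline decorator is built on argparse rather than this toy.)

    import inspect

    def dispatch(func, arguments):
        # Pull exactly the keys the function's signature asks for and call it,
        # so no variable ever has to repeat its own name in a lookup.
        wanted = inspect.signature(func).parameters
        return func(**{name: arguments[name] for name in wanted})

    def confirm_user(id, hex_key):
        return f"confirming {id} with {hex_key}"

    print(dispatch(confirm_user, {"id": 42, "hex_key": "ab12", "verbose": True}))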
For the other use, in factory functions, I believe we have just come to accept that it is okay to have to repeat ourselves to dynamically generate certain objects in a dynamic language. The fact is that variable names are relevant in python and can be a useful piece of information at runtime as well as compile time or for static analysis. This is why some objects have a __name__: it is useful information despite the fact it may not always be accurate.
>>> def foo(): pass
...
>>> bar = foo
>>> del foo
>>> bar.__name__
'foo'
It may not be incredibly common but it is a power that the compiler has that is not really available to the programmer. And not every place where variable name access can be used would benefit from being implemented with the large class object and the complex implementation of a metaclass.
Oh, I don't think anyone will disagree with you on that :) It's extremely helpful with functions and classes, and sometimes, the class statement can serve other duties. Proper use of a decorator, metaclass, or parent class, can turn a class into something quite different. ChrisA
On Fri, Oct 15, 2021 at 2:53 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Oct 16, 2021 at 8:22 AM Jeremiah Paige <ucodery@gmail.com> wrote:
Here is a pseudo-program showing where I would like to use this token in my own code if it existed. I think besides the cases where one is forced to always repeat the variable name as a string (namedtuple, NewType) this is an easy way to express clear intent to link the variable name to either its value or original source.
REGION = os.getenv(<<<)
db_url = config[REGION][<<<]
name = arguments.get(<<<)
con = connect(db_url)
knights = <<<
horse = <<<
con.execute(f"SELECT * FROM {knights} WHERE {horse}=?", (name,))
Toys like this often don't sell the idea very well, because there's a solid criticism of every example:
1) The environment variable REGION shouldn't be assigned to a variable named REGION, because it's not a constant. In real-world code, I'd be more likely to write >> region = os.getenv('REGION') << which wouldn't work with this magic token.

2) I'd be much more likely to put the entire config block into a variable >> cfg = config[os.getenv("REGION")] << and then use cfg.db_url for all config variables, so they also wouldn't be doubling up the names.

3) Not sure what you're doing with arguments.get(), but if that's command line args, it's way easier to wrap everything up and make them into function parameters.

4) I've no idea why you'd be setting knights to the string literal "knights" outside of a toy. If it's for the purpose of customizing the table name in the query, wouldn't it be something like >> table = "knights" << ?
I'm sure there are better examples than these, but these ones really aren't a great advertisement.
I'll admit I don't really have any further examples. After this idea came up I found myself occasionally typing out code where I tried to reach for it and wishing it were already implemented. But saying "I literally had to type T = TypeVar("T") four times!" is not in itself a compelling example, even though I know I will have to type it again, and will again feel weird that there is no better way to write this in Python.

I don't really write factory functions this way myself; I just use them with what sounds like more regularity than some. I would absolutely use this syntax to gather some env vars, or declare string flags, like in tix.py.

My real world example that got me to circle back to this idea was writing a scraper that would collect as much metadata as possible from a package. This meant walking many different static config files that have lots of optional branches, or branches that can hold many different kinds of data. This is the short bit that reads in a package's long description from pyproject.
if readme := project.get("readme"):
    if isinstance(readme, str):
        self.readme = self._read_file(readme, "utf-8")
    elif text := readme.get("text"):
        if "file" in readme:
            raise CorruptPackage()
        self.readme = text
    else:
        charset = readme.get("charset", "utf-8")
        self.readme = self._read_file(readme["file"], charset)
A lot of lines in this example block are creating very short-lived variables based on the key name in the file being read. I want the names to be kept in sync with the config file, even though I have no fear of this specific standard changing. However, you already said that unpacking in expression assignment is not selling the new syntax, and I see no one else sticking their head up to say this is of interest.

Regards, ~ Jeremiah
08.10.21 22:23, Jeremiah Paige wrote:
>>> Point = namedtuple(<<<, 'x, y, z')
>>> Point
<class '__main__.Point'>
>>> UUIDType = NewType(<<<, str)
>>> UUIDType
__main__.UUIDType
In many cases similar to namedtuple and NewType this is not enough. You need to pass to the constructor not only the name, but also the module name and the fully qualified name. The fully qualified name is needed to make nested declarations work, and to get the module name we currently need to use _getframe(), which is ugly and non-portable. It may be useful to provide access also to the globals and locals of the outer scope.
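(A sketch of the _getframe() pattern being referred to; make_type is a made-up factory, but the frame inspection mirrors what namedtuple and similar factories do to recover the caller's module name.)

    import sys

    def make_type(name):
        # Peek one frame up to learn the caller's module, since the call site
        # only passes the name; this is the ugly, non-portable part.
        module = sys._getframe(1).f_globals.get('__name__', '__main__')
        return type(name, (), {'__module__': module, '__qualname__': name})

    Widget = make_type('Widget')
    print(Widget.__module__, Widget.__qualname__)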
participants (7)
- Chris Angelico
- Erik Demaine
- Guido van Rossum
- Jeremiah Paige
- Ricky Teachey
- Serhiy Storchaka
- Steven D'Aprano