Allow starred expressions to be used as subscripts for nested lookups for getitem and setitem
Given code like this:

```
d = {1: {2: {3: 4}}}
print(d[1][2][3])
d[1][2][3] = None
print(d)
```

It should be possible to rewrite it using a starred expression, like this:

```
d = {1: {2: {3: 4}}}
keys = 1, 2, 3
print(d[*keys])
d[*keys] = None
print(d)
```

Hopefully it's clear from that example what I'm suggesting.
On 2021-09-01 17:41, Kevin Mills wrote:
Given code like this:
```
d = {1: {2: {3: 4}}}
print(d[1][2][3])
d[1][2][3] = None
print(d)
```
It should be possible to rewrite it using a starred expression, like this:
```
d = {1: {2: {3: 4}}}
keys = 1, 2, 3
print(d[*keys])
d[*keys] = None
print(d)
```
Hopefully it's clear from that example what I'm suggesting.
Would that be equivalent to d[keys[0], keys[1], keys[2]]? If so, it would be equivalent to something that's already legal, namely, a subscript that's a tuple.
No, definitely not. d[1,2,3] and d[1][2][3] are not the same thing. The latter is what I am talking about.
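The distinction drawn here can be checked directly. The following is a minimal sketch (the names `nested`, `flat`, and `missing_key` are only illustrative) showing that a comma-separated subscript is one lookup with a tuple key, while chained subscripts are separate lookups:

```python
# Nested dictionary from the original example.
nested = {1: {2: {3: 4}}}

# Chained subscripts perform three separate lookups.
print(nested[1][2][3])  # 4

# A comma-separated subscript is a single lookup with a tuple as the key.
flat = {(1, 2, 3): "tuple key"}
print(flat[1, 2, 3] == flat[(1, 2, 3)])  # True

# The nested dictionary has no key (1, 2, 3), so the tuple lookup fails.
try:
    nested[1, 2, 3]
except KeyError as exc:
    missing_key = exc.args[0]
else:
    missing_key = None

print(missing_key)  # (1, 2, 3)
```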
Kevin Mills suggested:
    d = {1: {2: {3: 4}}}
    keys = 1, 2, 3
    print(d[*keys])
    d[*keys] = None
    print(d)
Hopefully it's clear from that example what I'm suggesting.
MRAB replied:
Would that be equivalent to d[keys[0], keys[1], keys[2]]?
If so, it would be equivalent to something that's already legal, namely, a subscript that's a tuple.
and Kevin responded:
No, definitely not. d[1,2,3] and d[1][2][3] are not the same thing. The latter is what I am talking about.
Ah, I read it the same as MRAB did.

I think it would be *very* unfortunate if `*args` had a different meaning in subscripts from the meaning elsewhere:

    keys = (1, 2, 3)
    f(*keys)  # like f(1, 2, 3)
    d[*keys]  # like d[1][2][3]

I think it might be better to start with a *function* that chains subscript calls, and perhaps put it in the operator module with itemgetter and attrgetter.

    # Untested.
    def chained_item(items, obj):
        for key in items:
            obj = obj[key]
        return obj

-- Steve
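The untested `chained_item` idea can be filled out into a getter and a matching setter. This is only a sketch under stated assumptions: the names `chained_getitem` and `chained_setitem` are hypothetical, nothing like them exists in the operator module, and the argument order (object first) is an arbitrary choice that differs from the snippet above.

```python
from functools import reduce
from operator import getitem


def chained_getitem(obj, keys):
    """Return obj[k0][k1]...[kn], applying the keys in order."""
    return reduce(getitem, keys, obj)


def chained_setitem(obj, keys, value):
    """Assign value at obj[k0][k1]...[kn]; keys must not be empty."""
    *parents, last = keys
    chained_getitem(obj, parents)[last] = value


d = {1: {2: {3: 4}}}
keys = 1, 2, 3
print(chained_getitem(d, keys))  # 4
chained_setitem(d, keys, None)
print(d)  # {1: {2: {3: None}}}
```

The setter walks to the parent container first and performs a single item assignment at the end, which is what `d[1][2][3] = None` does as well.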
On Fri, 3 Sept 2021 at 12:06, Steven D'Aprano <steve@pearwood.info> wrote:
I think it might be better to start with a *function* that chains subscript calls, and perhaps put it in the operator module with itemgetter and attrgetter.
    # Untested.
    def chained_item(items, obj):
        for key in items:
            obj = obj[key]
        return obj
That's an interesting idea, but why chain only subscript calls? chained_item chains the application of the callable objects from the iterable

    map(operator.itemgetter, items)

It would be sufficient and more convenient if functools had something that chains the application of any callable objects.

Best regards,
Takuo
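A general "chain of callables" helper along these lines might look as follows. This is a sketch only: `apply_chain` is a hypothetical name, and functools does not provide such a function.

```python
from functools import reduce
from operator import itemgetter


def apply_chain(funcs, obj):
    """Apply each callable in funcs to the result of the previous one."""
    return reduce(lambda acc, func: func(acc), funcs, obj)


d = {1: {2: {3: 4}}}
keys = 1, 2, 3

# Chained subscripting is the special case where every callable is an itemgetter.
print(apply_chain(map(itemgetter, keys), d))  # 4

# Any other one-argument callables can be mixed in.
print(apply_chain([itemgetter(1), itemgetter(2), len], d))  # 1
```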
On Wed, Sep 1, 2021 at 12:45 PM Kevin Mills <kevin.mills226@gmail.com> wrote:
    d = {1: {2: {3: 4}}}
    keys = 1, 2, 3
    print(d[*keys])  # Meaning: d[1][2][3]
    d[*keys] = None
    print(d)
This is a bad idea for a couple of reasons.

One reason MRAB points to. The `*keys` syntax is more-or-less equivalent to "substitute a tuple" in other Python contexts; you are proposing to give it a completely different meaning. This would be confusing and inconsistent.

However, let's bracket the syntax for a moment. More important is that "nested" data comes in shapes other than strictly nested dictionaries. This seems to be most common in dealing with JSON data, but in concept it can arise elsewhere. A variety of libraries address this, with some small differences among them. Many are inspired by the semi-convention of "JSON Path" that is inspired by XPath for XML.

For example, what if our data looks like this:

    data = {1: [{2: {3, 4}}, {5: {6, 7}}, [8, 9, 0]]}

It's not unreasonable to want to query that, and not uncommon to encounter similar things in the JSON world (I added a deliberately ugly inconsistency in the list of values associated with the key `1`; this is painful, but common, in the JSON world).

A few libraries I find that try to handle this are:

https://pypi.org/project/jmespath/
https://pypi.org/project/jsonpath-ng/
https://pypi.org/project/path-dict/

There are differences in their approaches, but crucially, none want or need changes to Python syntax.
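For reference, the sample data can be walked with plain chained subscripts, because dicts and lists both implement `__getitem__` while sets do not. The sketch below uses only the standard library, not any of the libraries listed above; `get_path` is a hypothetical helper name.

```python
from functools import reduce
from operator import getitem

data = {1: [{2: {3, 4}}, {5: {6, 7}}, [8, 9, 0]]}


def get_path(obj, path):
    """Return obj[p0][p1]...[pn] for a mix of dict keys and list indices."""
    return reduce(getitem, path, obj)


print(get_path(data, (1, -1, 1)))  # 9

# The path can stop at a set, but it cannot index into one.
inner_set = get_path(data, (1, 0, 2))
try:
    get_path(data, (1, 0, 2, 0))
except TypeError:
    set_is_subscriptable = False
else:
    set_is_subscriptable = True

print(set_is_subscriptable)  # False
```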
One reason MRAB points to. The `*keys` syntax is more-or-less equivalent to "substitute a tuple" in other Python contexts; you are proposing to give it a completely different meaning. This would be confusing and inconsistent.
I disagree that it is a completely different meaning. If the issue is that f(1,2,3) is the same as f(*(1,2,3)), but d[1,2,3] wouldn't be the same as d[*(1,2,3)], that's only because the "1,2,3" in d[1,2,3] already means something very different than the "1,2,3" in f(1,2,3).

If we don't cheat by comparing expressions to expression lists, the two are fairly analogous. `f(t)` means pass the tuple to the function, and `d[t]` means use the tuple as a key. `f(*t)` means break up the tuple and pass each element as an argument to the function, and `d[*t]` means break up the tuple and pass each element to getitem in turn.

I could see it as somewhat confusing, but I don't agree it's particularly inconsistent. It's not exactly the same, of course, because we're invoking getitem several times on different objects, rather than once with multiple arguments. But that's only because you can't pass multiple arguments to getitem, even if you invoked it directly rather than using square brackets: it only takes one argument (or two, if you want to count self).
However, let's bracket the syntax for a moment. More important is that "nested" data comes in shapes other than strictly nested dictionaries. This seems to be most common in dealing with JSON data, but in concept it can arise elsewhere. A variety of libraries address this, with some small differences among them. Many are inspired by the semi-convention of "JSON Path" that is inspired by XPath for XML.
For example, what if our data looks like this:
data = {1: [{2: {3, 4}}, {5: {6,7}}, [8, 9, 0]]}
I'm sorry, but I don't really see how this presents an issue to what I suggested, beyond simply that you wouldn't be able to access individual elements of the sets (because sets themselves don't support getitem).

For example, if you wanted to access the 9, you would do:

    d[1][-1][1]

Which in my proposed syntax would be:

    d[*(1, -1, 1)]

---

What initially sparked this suggestion was an issue where I had to find the maximum value in a nested list (where sublists could themselves contain sublists, but potentially might not), and modify it to something else.

My initial solution was basically to walk through the list, and keep a list of all the indices to get to the currently found maximum. Then, once that's done, use the list of indices to modify what is now known to be the true maximum. I ended up writing a nested_setitem function since that sort of operation isn't natively supported.

Admittedly, I changed it afterwards so that the loop instead stored the sublist the maximum was found in and only the index in that sublist, eliminating the need for nested_setitem, so even in the problem I was working on, I wouldn't have ended up using my own proposed syntax... So it might not actually be useful, after all.
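The first approach described above (record the index path to the maximum, then assign through that path) could be sketched as follows. The original code was not posted, so this is a reconstruction under assumptions: `find_max_path` is a hypothetical name, the body of `nested_setitem` is a guess at what such a function would do, and the sample list is made up.

```python
def find_max_path(nested, path=()):
    """Return (value, path) for the largest item in a nested list.

    path is the tuple of indices leading to the value. Returns None
    when the list contains no non-list items.
    """
    best = None
    for index, item in enumerate(nested):
        if isinstance(item, list):
            candidate = find_max_path(item, path + (index,))
        else:
            candidate = (item, path + (index,))
        if candidate is not None and (best is None or candidate[0] > best[0]):
            best = candidate
    return best


def nested_setitem(obj, path, value):
    """Assign value at obj[p0][p1]...[pn] for a non-empty path."""
    *parents, last = path
    for key in parents:
        obj = obj[key]
    obj[last] = value


values = [3, [7, [42, 1]], [5]]
largest, path = find_max_path(values)
print(largest, path)  # 42 (1, 1, 0)
nested_setitem(values, path, 0)
print(values)  # [3, [7, [0, 1]], [5]]
```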
On Sat, Sep 4, 2021 at 6:06 AM Kevin Mills <kevin.mills226@gmail.com> wrote:
If we don't cheat by comparing expressions to expression lists, the two are fairly analogous. `f(t)` means pass the tuple to the function, and `d[t]` means use the tuple as a key. `f(*t)` means break up the tuple and pass each element as an argument to the function, and `d[*t]` means break up the tuple and pass each element to getitem in turn.
Cheat? But that's the exact context in which starred expressions have meaning. When you call f(1,2,3), you're not calling f(1)(2)(3). You're passing all of those arguments to a single function call. That's why d[*t] would have to mean d[1,2,3] not d[1][2][3].

ChrisA
As I said before, the "1,2,3" in `f(1,2,3)` has a very different meaning than the "1,2,3" in `d[1,2,3]`. One is a (comma-separated) list of expressions, and one is a single expression, a tuple.

`*(1,2,3)` does not evaluate to the tuple `(1,2,3)`, so I don't think expecting it to do so in the context of d[*(1,2,3)] makes sense. And, I would argue that having it do so would be far more inconsistent - why does it evaluate to the tuple `(1,2,3)` in the context of `d[*(1,2,3)]`, but not in the context of `f(*(1,2,3))`?

Also, from a practical perspective, what would even be the point in allowing the syntax here if `d[*x]` meant exactly the same thing as `d[x]` (or `d[tuple(x)]` if it's not already a tuple)?
Dear Kevin Mills,

I understand your point to some degree, but could you please clarify it further by answering the following question?

Which of the following expressions will be valid in your model, and as what currently valid expression will each of the valid ones be evaluated? (For instance, I understand d[*(1,2,3)] will be valid and evaluated as d[1][2][3]. Some may seem redundant, but just to be sure...)

(1) d[*(1,2,3), ]
(2) d[*(),1,2,3,*()]
(3) d[*(1,2,3),*()]
(4) d[]

The reason why this could clarify your point is that, for example, it's tempting to assume, possibly incorrectly, that all of (1) to (3) are going to be valid and equivalent to d[*(1,2,3)] (as their counterparts are in a function call), which could be a source of confusion since some of them might be too hard not to interpret as d[1,2,3] if valid.

(E.g., I happen to have recently proposed (1) to (3) and the like as valid expressions equivalent to d[1,2,3], while not giving any interpretation to d[*x], here

https://mail.python.org/archives/list/python-ideas@python.org/message/BEGGQE...

with a summary

https://mail.python.org/archives/list/python-ideas@python.org/message/KF37FM...
)

Best regards,
Takuo
On 2021-09-03 14:01, Kevin Mills wrote:
As I said before, the "1,2,3" in `f(1,2,3)` has a very different meaning than the "1,2,3" in `d[1,2,3]`. One is a (comma-separated) list of expressions, and one is a single expression, a tuple.
`*(1,2,3)` does not evaluate to the tuple `(1,2,3)`, so I don't think expecting it to do so in the context of d[*(1,2,3)] makes sense. And, I would argue that having it do so would be far more inconsistent - why does it evaluate to the tuple `(1,2,3)` in the context of `d[*(1,2,3)]`, but not in the context of `f(*(1,2,3))`?
I agree with others that your proposal would make things confusing. It's true that `*(1, 2, 3)` doesn't evaluate to the tuple `(1, 2, 3)`, but nonetheless `*(1, 2, 3)` does expand to something that remains contained within the argument list where it occurs. Your proposal would cause the `x` in `[*x]` to "leak" beyond the indexing brackets and (as you note in another message) would actually create additional indexing brackets, whereas there's no situation where existing *-expansion creates additional function calls.

In other words, currently `*` can turn what looks like one function call with one thing inside it into one function call with several things inside it. You are proposing to make it so `*` can turn one indexing operation with one thing inside it into several indexing operations each with one thing inside it. I think that is quite different.

To my mind, the operation of indexing itself is not sufficiently parallel to function calls to warrant a crossover of `*` between the two. A function call can have multiple arguments, but an indexing operation only ever has one index (albeit that index may be a tuple or other composite object). So there is no scope for "unpacking" because there are no extra "slots" (akin to separate arguments) to unpack into. You're proposing to not just unpack the indexes, but to "unpack" the entire indexing operation into multiple such operations, which is a bridge too far for me (as it appears to be for others).
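The point that an indexing operation only ever has one index, while a call can have several arguments, can be observed with a small probe. This is an illustrative sketch; `Probe` and `count_args` are hypothetical names.

```python
class Probe:
    """Return the single index object that a subscript passes to __getitem__."""

    def __getitem__(self, index):
        return index


def count_args(*args):
    """Return how many positional arguments the call received."""
    return len(args)


t = (1, 2, 3)
probe = Probe()

# A subscript always delivers exactly one object, here a tuple.
print(probe[1, 2, 3])  # (1, 2, 3)
print(probe[t])        # (1, 2, 3)

# A call distinguishes one tuple argument from three separate arguments.
print(count_args(t))   # 1
print(count_args(*t))  # 3
```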
Also, from a practical perspective, what would even be the point in allowing the syntax here if `d[*x]` meant exactly the same thing as `d[x]` (or `d[tuple(x)]` if it's not already a tuple)?
I agree there would be little point in that. But that doesn't mean that the `*` is now "free" to be used for some other use in this context. Rather, because the existing unpacking behavior of `*` doesn't have much use in an indexing context, that to me means that use of `*` is "blocked" in this context and we can't use it for anything. It's better to have it not mean anything at all than have it mean something different and confusing.

I would support adding some kind of multi-getter function to the stdlib. I would even support adding multi-get methods to list, dict, and tuple so that you could call `my_list.multiget(1, 2, 3)` or `my_dict.multiget(*some_keys)` and have it do a series of nested indexes. But I don't think we should add syntax for this, or at least I think we should not use `*`-unpacking-like syntax for it.

-- Brendan Barnwell
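A `multiget` method of the kind floated here does not exist on the built-in types, but its behavior can be approximated with a subclass. The sketch below is purely hypothetical (`MultiGetDict` is an invented name) and only illustrates the call shape `my_dict.multiget(*some_keys)`:

```python
class MultiGetDict(dict):
    """dict subclass with a hypothetical multiget method (not in the stdlib)."""

    def multiget(self, *keys):
        """Return self[k0][k1]...[kn], indexing one level per key."""
        obj = self
        for key in keys:
            obj = obj[key]
        return obj


my_dict = MultiGetDict({1: {2: {3: 4}}})
some_keys = (1, 2, 3)

print(my_dict.multiget(1, 2, 3))     # 4
print(my_dict.multiget(*some_keys))  # 4
```

Here `*` keeps its usual meaning: it unpacks `some_keys` into the arguments of a single call, and the method decides to turn those arguments into nested indexing.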
On Sat, 4 Sept 2021 at 16:33, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
In other words, currently `*` can turn what looks like one function call with one thing inside it into one function call with several things inside it. You are proposing to make it so `*` can turn one indexing operation with one thing inside it into several indexing operations each with one thing inside it. I think that is quite different.
I'm not strongly against your conclusion, but I think the two uses of `*` are analogous in fact. Consider for instance,

```
from functools import partial

curry = partial(partial, partial)
F = curry(curry(f))
```

Now F has the same amount of information as f does in the sense that e.g., `f(1,2,3)` is equivalent to `F(1)(2)(3)`. I suspect the proposer's idea comes from an analogy to this.

Best regards,
Takuo
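The currying snippet can be run as written once `f` is defined. In the sketch below, `f` is a made-up three-argument function chosen only so the equivalence is visible; the `curry` and `F` definitions are taken from the message above.

```python
from functools import partial


def f(a, b, c):
    """Hypothetical three-argument function used for the demonstration."""
    return (a, b, c)


curry = partial(partial, partial)
F = curry(curry(f))

print(f(1, 2, 3))     # (1, 2, 3)
print(f(*(1, 2, 3)))  # (1, 2, 3)
print(F(1)(2)(3))     # (1, 2, 3)
```

Each call on `F` binds one more argument with `partial`; only the third call actually invokes `f`.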
On 2021-09-04 05:47, Matsuoka Takuo wrote:
On Sat, 4 Sept 2021 at 16:33, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
In other words, currently `*` can turn what looks like one function call with one thing inside it into one function call with several things inside it. You are proposing to make it so `*` can turn one indexing operation with one thing inside it into several indexing operations each with one thing inside it. I think that is quite different.
I'm not strongly against your conclusion, but I think the two uses of `*` are analogous in fact. Consider for instance,
```
from functools import partial

curry = partial(partial, partial)
F = curry(curry(f))
```
Now F has the same amount of information as f does in the sense that e.g., `f(1,2,3)` is equivalent to `F(1)(2)(3)`. I suspect the proposer's idea comes from an analogy to this.
But where is the *-unpacking there? You just showed a function-call equivalent of the multi-getter function that I suggested. I agree that's useful in that if you want to convert multiple arguments into multiple function calls you can make a new function that does that, but that's not what *-unpacking does.

-- Brendan Barnwell
As I wrote before, I think a multi-getter FUNCTION is a useful thing. In fact, I linked to several libraries that provide variations on the idea.

But there are a number of choices to make about the behavior of that, and absolutely no reason it needs to be in the standard library, let alone a method on common collections, and still less dedicated syntax.

On Sat, Sep 4, 2021, 2:06 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-09-04 05:47, Matsuoka Takuo wrote:
On Sat, 4 Sept 2021 at 16:33, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
In other words, currently `*` can turn what looks like one function call with one thing inside it into one function call with several things inside it. You are proposing to make it so `*` can turn one indexing operation with one thing inside it into several indexing operations each with one thing inside it. I think that is quite different.
I'm not strongly against your conclusion, but I think the two uses of `*` are analogous in fact. Consider for instance,
```
from functools import partial

curry = partial(partial, partial)
F = curry(curry(f))
```
Now F has the same amount of information as f does in the sense that e.g., `f(1,2,3)` is equivalent to `F(1)(2)(3)`. I suspect the proposer's idea comes from an analogy to this.
But where is the *-unpacking there? You just showed a function-call equivalent of the multi-getter function that I suggested. I agree that's useful in that if you want to convert multiple arguments into multiple function calls you can make a new function that does that, but that's not what *-unpacking does.
-- Brendan Barnwell
Dear Brendan Barnwell,

On Sun, 5 Sept 2021 at 05:06, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-09-04 05:47, Matsuoka Takuo wrote:
......
```
from functools import partial

curry = partial(partial, partial)
F = curry(curry(f))
```
Now F has the same amount of information as f does in the sense that e.g., `f(1,2,3)` is equivalent to `F(1)(2)(3)`. I suspect the proposer's idea comes from an analogy to this.
But where is the *-unpacking there?
Sure. `f(1,2,3)` is equivalent to `f(*(1,2,3))`. My understanding of Kevin Mills' suggestion is that `f(1,2,3)` is not analogous to `d[1,2,3]`, but by reading it as `F(1)(2)(3)`, we can see its analogy to `d[1][2][3]`, and the notation `d[*x]`, which has no reason really to mean `d[(*x, )]`, could be a counterpart of `f(*x)`. I can sense some naturalness in the idea, but I understand it's not very straightforward.
You just showed a function-call equivalent of the multi-getter function that I suggested. I agree that's useful in that if you want to convert multiple arguments into multiple function calls you can make a new function that does that, but that's not what *-unpacking does.
Best regards, Takuo
participants (7)
- Brendan Barnwell
- Chris Angelico
- David Mertz, Ph.D.
- Kevin Mills
- Matsuoka Takuo
- MRAB
- Steven D'Aprano