PEP 472 - new dunder attribute, to influence item access

SUMMARY

Sequences and mappings are both similar and different. Let's introduce and use a new dunder attribute to give item access the appropriate behaviour. This proposal is based on an earlier thread - see Acknowledgements for the URL.

INTRODUCTION

In Python, there are two sorts of builtin objects that have item access: mappings and sequences. They have different abstract base classes. Mappings and sequences have fundamental similarities, and also fundamental differences. To see an example of this, let's consider

    >>> x = dict()   # Mapping
    >>> y = [None]   # Sequence

Consider now

    >>> x[0] = 'hi'
    >>> x[0] == 'hi'
    True
    >>> 0 in x, 'hi' in x
    (True, False)

Compare it with

    >>> y[0] = 'hi'  # Similar
    >>> y[0] == 'hi'
    True
    >>> 0 in y, 'hi' in y  # Different
    (False, True)

THE PROBLEM

Not taking into account the fundamental differences between mappings and sequences can, I think, be a cause of difficulty when considering the semantics of

    >>> z[1, 2, a=3, b=4]

If z is a sequence (or multi-dimensional array) then many of us would like to think of item access as a function call. In other words, ideally,

    >>> z[1, 2, a=3, b=4]
    >>> z.__getitem__(1, 2, a=3, b=4)

are to be equivalent. But if z is a mapping, then perhaps ideally we'd like an object

    >>> key = K(1, 2, a=3, b=4)

such that

    >>> z[1, 2, a=3, b=4]
    >>> z.__getitem__(key)

are equivalent.

PRESENT BEHAVIOUR

At present

    >>> z[1, 2, a=3, b=4]

roughly speaking calls

    >>> internal_get_function(z, 1, 2, a=3, b=4)

where we have something like

    def internal_get_function(self, *argv, **kwargs):
        if kwargs:
            raise SyntaxError
        if len(argv) == 1:
            key = argv[0]
        else:
            key = argv
        return type(self).__getitem__(self, key)

PROPOSAL

I think it will help solve our problem to give Z = type(z) a new dunder attribute that either is used as the internal_get_function, or is used inside a revised system-wide internal_get_function.
That way, depending on the new dunder attribute on Z = type(z), sometimes

    >>> z[1, 2, a=3, b=4]
    >>> z.__getitem__(1, 2, a=3, b=4)

are equivalent. And sometimes

    >>> z[1, 2, a=3, b=4]

is equivalent to

    >>> key = K(1, 2, a=3, b=4)
    >>> z.__getitem__(key)

all depending on the new dunder attribute on Z = type(z). I hope this contribution helps.

ACKNOWLEDGEMENTS

I've had much valuable help from Ricky Teachey in preparing this message. I've also been influenced by his and others' contributions to an earlier thread, which he started 3 weeks ago.
https://mail.python.org/archives/list/python-ideas@python.org/thread/FFXXO5N...

--
Jonathan
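As an editorial aside (not part of the thread): the two behaviours Jonathan describes can be sketched in plain Python. The dunder name `__subscript_adapter__` and the dispatch logic are assumptions made up for illustration, one possible shape of a revised internal_get_function, not anything proposed or implemented.

```python
# Sketch only. '__subscript_adapter__' is an assumed name, not a real dunder.
def internal_get_function(obj, *args, **kwargs):
    adapter = getattr(type(obj), '__subscript_adapter__', None)
    if adapter is not None:
        # The class opted in: item access behaves like a function call.
        return adapter(obj, *args, **kwargs)
    # Current behaviour: no keywords, and multiple indices pack into a tuple.
    if kwargs:
        raise TypeError("keywords not supported without an adapter")
    key = args[0] if len(args) == 1 else args
    return type(obj).__getitem__(obj, key)


class FunctionLike:
    # Opts in: receives the subscript contents exactly like function arguments.
    def __subscript_adapter__(self, *args, **kwargs):
        return ('call', args, kwargs)


class MappingLike:
    # Does not opt in: receives the single packed key, as today.
    def __getitem__(self, key):
        return ('key', key)
```

Under this sketch, `FunctionLike` gets the function-call semantics while `MappingLike` keeps today's single-key semantics, all decided by the attribute on the type.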

On Wed, Aug 26, 2020 at 9:44 PM Jonathan Fine <jfine2358@gmail.com> wrote:
-1. We have already had way WAY too much debate about the alternative ways to handle kwargs in subscripts, and quite frankly, I don't think this is adding anything new to the discussion. Can we just stop bikeshedding this already and let this matter settle down? You still aren't showing any advantage over just allowing the current dunder to receive kwargs. ChrisA

Personally I think Jonathan and I (and possibly a couple of others) should form a separate subgroup and come up with a sensible and logical set of options in a proto-PEP. The topic has been discussed and we have plenty of ideas and opinions, and if we want to achieve something coherent we need to take it aside and dig deeper into the various options until the whole proposal (or set of proposals) has been considered fully. There's little sense in going back and forth on the mailing list.

One thing I want to understand though, and it's clear that this is a potential dealbreaker: is there any chance that the steering council will actually accept such a feature once it's been fully fleshed out, considering that it's undeniable that there are use cases spanning multiple fields? Because if not (whatever the reason might be), regardless of the possible options to implement it, it's clear that there's no point in exploring it further.

On Wed, 26 Aug 2020 at 12:58, Chris Angelico <rosuav@gmail.com> wrote:
-- Kind regards, Stefano Borini

On Thu, Aug 27, 2020 at 8:20 AM Stefano Borini <stefano.borini@gmail.com> wrote:
Another way to word that question is: Do you have a sponsor for your PEP? Every PEP needs a core dev (or closely connected person) to sponsor it, otherwise it won't go forward. Do you have any core devs who are on side enough to put their name down as sponsor? If not, it's not even going to get as far as the Steering Council. ChrisA

I already sent a mail to D'Aprano and he said (please Steven correct me if I am wrong) that basically PEP 472 as-is is not acceptable (and I agree) without modifications. But in my opinion the argument is circular. Who is willing to sponsor a PEP that doesn't exist yet? It's like putting a signature on a blank piece of paper. So the question is: is a core developer willing to sponsor the _idea_ (not the PEP, as it doesn't exist yet)?

On Thu, 27 Aug 2020 at 00:33, Chris Angelico <rosuav@gmail.com> wrote:
-- Kind regards, Stefano Borini

On Thu, Aug 27, 2020 at 5:26 PM Stefano Borini <stefano.borini@gmail.com> wrote:
So the question is: is a core developer willing to sponsor the _idea_ (not the PEP, as it doesn't exist yet)?
It comes to the same thing (it's really just a matter of terminology). Before the PEP can be created, it needs a sponsor, which means someone needs to be willing to support the idea before you write the PEP. It's called "sponsoring the PEP" but the PEP won't exist without sponsorship. ChrisA

On Thu, Aug 27, 2020 at 08:25:45AM +0100, Stefano Borini wrote:
What I said was that *I* "cannot support any of the strategies currently in the PEP". Personally I believe that the Steering Council is unlikely to agree to the PEP in its current form, but I don't speak for them, that's merely my prediction.
I expect that you don't need a sponsor. The PEP already exists; one of the current authors (you) wishes to resurrect it. I shall ask on Python-Dev and see what they say. -- Steve

Please don't hijack an existing PEP for an unrelated issue. PEP 472 is an existing PEP for "Support for indexing with keyword arguments". https://www.python.org/dev/peps/pep-0472/ If you want to write a competing PEP, see PEP 1 for the steps you must follow: https://www.python.org/dev/peps/pep-0001 -- Steve

On Wed, Aug 26, 2020 at 7:58 AM Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Aug 26, 2020, 8:02 AM Steven D'Aprano <steve@pearwood.info> wrote:
I think Jonathan included the name of PEP 472 in the subject line just because it's related to the previous conversations that have been going on, not to hijack the PEP for a different purpose.

I understand that Steve has said he is very grumpy about all of this hijacking, derailing, bikeshedding, or whatever other metaphors are appropriate for the alternative ideas that are being proposed for future functionality that he cares about and wants to use. Perhaps this applies to Chris and others as well. But I have to say that I think this latest is a fantastic idea, and when Jonathan presented it to me it was very surprising that I had not seen it presented by anyone else yet. I think it solves a ton of problems, adds a huge amount of flexibility and functionality, and as such has the potential to be very powerful.

There are at least three substantive objections I see to this:

1. It will slow down subscripting
2. It adds complexity
3. It actually doesn't add anything helpful/useful

*Objection 1: Slowing Things Down*

The INTENDED EFFECT of the changes to the internals will be as Jonathan Fine described: every time a subscript operation occurs, this new dunder attribute gets investigated on the class, and if it is not present then the default key translation function is executed. If things were implemented in exactly that way, obviously it would slow everything down a lot. Every subscripting operation gets slowed down everywhere, and that's probably not an option.

However, for the actual implementation, there's no reason to do that. Instead we could wrap the existing item dunder methods automatically at class creation time, only when the new dunder attribute is present. If it is not present, nothing happens. In other words, what has been described as the "default key or index translation function" already exists. Let's not change that at all for classes that do not choose to use it.
This would have the same effect as the proposal, but it would avoid any slowdown when a class has not provided the new attribute.

Also note that this can coexist alongside Steve's and Chris's preferred solution, which is to just add kwarg passing to the item dunders. That change would constitute an adjustment to the "default key or index translation function", which is sort of like this (from Jonathan's first message):

    def internal_get_function(self, *argv, **kwargs):
        if kwargs:
            raise SyntaxError
        if len(argv) == 1:
            key = argv[0]
        else:
            key = argv
        return type(self).__getitem__(self, key)

*Objection 2: Adds complexity*

This is hard to argue with. It certainly does add complexity. The question is whether the complexity is worth the added benefits. But in the interest of arguing that the additional complexity is not THAT onerous, I will point out that in order to add flexibility, complexity is nearly always necessitated.

*Objection 3: Doesn't add anything useful/helpful*

This objection seems obviously false. With the proposal, the language would support any function desired to turn the "stuff" inside a subscripting operation into the item dunder calls. For example: if this proposal were already in place and PEP 472 were to continue to be held up because of terrorists like me ;) *, one could have written this translation function and PEP-472-ified their classes already:

    def yay_kwargs(self, *args, **kwargs):
        return self.__getitem__(args, **kwargs)

But another person, who doesn't like PEP 472, could do something else:

    def boo_kwargs(self, *args, **kwargs):
        if kwargs:
            raise SyntaxError("I prefer to live in the past.")
        return self.__getitem__(args)

* I'm trying to be light, I'm really not offended by Steve's or Chris's comments, and I really am earnestly trying to offer something helpful here, not hold things up for people. I care about adding kwargs to the subscript operator, too.
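As an editorial aside: the "wrap at class creation time" idea sketched above can be mocked up today with `__init_subclass__`. The attribute name `__index_translator__` is assumed for illustration; this is a sketch of the mechanism, not a proposal detail. Note that because wrapping happens once at class creation, a translator added later by monkeypatching would have no effect.

```python
# Sketch: wrap __getitem__ only for classes that opt in via an assumed
# '__index_translator__' attribute; other classes are left untouched.
class Subscriptable:
    def __init_subclass__(cls, **kw):
        super().__init_subclass__(**kw)
        translator = cls.__dict__.get('__index_translator__')
        if translator is None:
            return  # no opt-in: nothing is wrapped, no slowdown
        raw_getitem = cls.__getitem__
        def wrapped(self, key):
            # Today the interpreter delivers one packed key; unpack it,
            # run the class's translator, then call the original dunder.
            args = key if isinstance(key, tuple) else (key,)
            t_args, t_kwargs = translator(self, *args)
            return raw_getitem(self, *t_args, **t_kwargs)
        cls.__getitem__ = wrapped


class Grid(Subscriptable):
    # Translator gives the second index a default, function-call style.
    def __index_translator__(self, row, col=0):
        return (row, col), {}

    def __getitem__(self, row, col):
        return f"cell({row}, {col})"
```

With this sketch, `Grid()[1, 2]` reaches `__getitem__` as two separate arguments, and `Grid()[5]` picks up the translator's default for `col`.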

On Wed, Aug 26, 2020 at 10:10 AM Ricky Teachey <ricky@teachey.org> wrote:
After reading Steve's response to me in the other thread, where he says this: On Wed, Aug 26, 2020 at 9:48 AM Steven D'Aprano <steve@pearwood.info> wrote:
...I am less optimistic that this can be implemented without slowing things down. And wrapping the methods at class creation time may not be a good idea after all.

I am a little crestfallen now, because I still think the core of the idea is a wonderful idea. Is there another clever way this could be implemented, to avoid the slowdown of existing code?

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Wed, 26 Aug 2020 at 15:12, Ricky Teachey <ricky@teachey.org> wrote:
That would mean the effect was to disallow runtime monkeypatching - the new dunder is *only* effective if added at class creation time, not if it's added later. You may not care about this, but it is a very different behaviour from any other dunder method Python supports - so quite apart from the problems people would have learning and remembering that this is a special case, you have to document *in your proposal* that you intend to allow this. And other people *do* care about disallowing dynamic features like monkeypatching.

I do understand that Python's extreme dynamism is both disconcerting and frustrating when it makes proposals like this harder to implement. But conversely, that same dynamism is what makes tools like pytest possible, so we all benefit from it, even if we don't directly go around monkeypatching classes at runtime.

If you think you can implement this proposal without blocking Python's dynamic features, and without introducing a performance impact, I'd say go for it - provide an example implementation, and that would clearly show people concerned about performance that it's not an issue. But personally, I'm not convinced that's possible without adding constraints that *aren't* currently included in the proposal.
Objection 3: Doesn't add anything useful/helpful
This objection seems obviously false.
Hardly. What are the use cases? It's "obviously false" to state that the proposal doesn't add anything at all, true. But the question is whether the addition is *useful* or *helpful*. And not just to one individual who thinks "this would be cool", but to *enough* people, whose code would be improved, to justify the cost of adding the feature. You yourself conceded that the feature adds complexity, this is where you get to explain why that cost is justified. "It's obviously helpful" isn't much of a justification. Paul

On Wed, Aug 26, 2020 at 10:40 AM Paul Moore <p.f.moore@gmail.com> wrote:
I agree with you. Let's preserve the dynamism. I abandon that implementation idea.
Yup. That is going to be a hard one. Hopefully others smarter than me can help think about how we could do it. But I for sure see the problem.
Well I did give a couple examples: language supported opt-in for pep-472-ification, and pep-472 opt-out. But Ok, more are needed. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Wed, Aug 26, 2020 at 11:06:26AM -0400, Ricky Teachey wrote:
Well I did give a couple examples: language supported opt-in for pep-472-ification, and pep-472 opt-out. But Ok, more are needed.
That's one example, not a couple: the ability to choose whether to opt-in or opt-out of keyword subscripts. Except you can't really. Either the interpreter allows keyword syntax in subscripts, or it doesn't. You can't opt-in if the interpreter doesn't support it, and if the interpreter does support it, you can't prevent it from being supported. Of course you can choose whether to use keywords in your own classes, but we can already do that the same way we opt out of any other feature: just don't use it! (I expect that the built-ins list, tuple, dict, str etc will all do exactly that.) -- Steve

On Wed, Aug 26, 2020 at 10:10:34AM -0400, Ricky Teachey wrote:
Such as?
adds a huge amount of flexibility and functionality,
Such as?
With the proposal, the language would support any function desired to turn the "stuff" inside a subscripting operation into the item dunder calls.
I'm sorry, I don't understand that sentence.
You're calling the `__getitem__` dunder with arbitrary keyword arguments. Are you the same Ricky Teachey who just suggested that we should be free to break code that uses `__getitem__` methods that don't obey the intent that they have only a single parameter and no keywords? If PEP 472 is held up, then `obj[1, 2, axis='north']` is a SyntaxError, so how does this method yay_kwargs make it legal? -- Steve

On Wed, Aug 26, 2020 at 11:19 AM Steven D'Aprano <steve@pearwood.info> wrote:
It creates a language-supported way for the creator of the class to decide how to interpret the contents inside a subscript operation. This matters because disagreement over this question is a large part of the reason PEP 472 -- the spirit of which I support -- has been held up.

It greatly alleviates (though not perfectly -- see the end of this message) the incongruity between how the indexing operator behaves and how function calls behave. As explained by Jonathan Fine, it adds flexibility and smoothes out the differences between the different paradigms of subscript operations: sequences and mappings. And it opens up opportunities for other paradigms to be created, the most prominent example of which is type-hint creation shortcuts, like:

    Vector = Dict[i=float, j=float]
I'll provide examples.
I am talking hypothetically -- if this proposal were already in place (which *includes* passing kwargs to the *new dunder* rather than passing them to the existing item dunders by default), you could write code like yay_kwargs today, even if the default behaviour of the language did not change. In that universe, if I wrote this:

    class C:
        def __getitem__(self, key):
            print(key)

...and tried to do this:
    >>> C()[a=1]
    SomeError
...*in that universe*, without PEP 472, the language will STILL, by default, give an error as it does today (though it probably would no longer be a SyntaxError). PEP 472 is a proposal to change *the current, default key or index translation function* to pass **kwargs. *This* proposal is to allow for an intervening function that controls HOW they are passed.

If PEP 472 is held up, then `obj[1, 2, axis='north']` is a SyntaxError,
Because the proposal is that if there is a dunder present containing the class attribute function, the contents of the [ ] operator get passed to that function for translation into the key.

We could make the dunder accept the target dunder method name as a parameter. This way there is only a single new dunder, rather than 3. The single new dunder might look like this:

    class Q:
        def __subscript__(self, method_name, *args, **kwargs):
            return getattr(self, method_name)(*args, **kwargs)

        def __getitem__(self, *args, **kwargs): ...

        # Note that I have made the RHS value the first argument in __setitem__
        def __setitem__(self, value, *args, **kwargs): ...

        def __delitem__(self, *args, **kwargs): ...

Above I am calling the appropriate dunder method directly inside of __subscript__. Again, there are other ways to do it and it does not have to be this way. If it is done that way, the __subscript__ dunder gets passed which item dunder method is being called (__getitem__, __setitem__, or __delitem__), and the arguments. Examples:

    No  CODE            CALLS
    1.  q[1]            q.__subscript__("__getitem__", 1)
    2.  q[1,]           q.__subscript__("__getitem__", 1)
    3.  q[(1,)]         q.__subscript__("__getitem__", 1)
    4.  q[(1,),]        q.__subscript__("__getitem__", (1,))
    5.  q[1] = 2        q.__subscript__("__setitem__", 2, 1)
    6.  q[1,] = 2       q.__subscript__("__setitem__", 2, 1)
    7.  q[(1,)] = 2     q.__subscript__("__setitem__", 2, 1)
    8.  q[(1,),] = 2    q.__subscript__("__setitem__", 2, (1,))

And so on for the __delitem__ calls.

NOTE: #3 and #7 are very unfortunate, but we cannot change this without breaking backwards compatibility.

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
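As an editorial aside: the dispatch table above can be simulated in current Python. The `simulate_*` helpers stand in for what the interpreter would hypothetically do (unpacking today's packed key before calling `__subscript__`); they are assumptions for illustration, not proposed machinery.

```python
# Simulation of the single-dunder dispatch described above.
class Q:
    def __subscript__(self, method_name, *args, **kwargs):
        return getattr(self, method_name)(*args, **kwargs)

    def __getitem__(self, *args, **kwargs):
        return ('get', args, kwargs)

    # RHS value comes first in __setitem__, matching the sketch above.
    def __setitem__(self, value, *args, **kwargs):
        return ('set', value, args, kwargs)


def simulate_getitem(obj, key):
    # Stand-in for the interpreter: unpack today's packed tuple key
    # back into positional arguments before dispatching.
    args = key if isinstance(key, tuple) else (key,)
    return obj.__subscript__('__getitem__', *args)


def simulate_setitem(obj, key, value):
    args = key if isinstance(key, tuple) else (key,)
    return obj.__subscript__('__setitem__', value, *args)
```

Running the simulation reproduces the table, including the unfortunate collapse of rows 2 and 3 (`q[1,]` and `q[(1,)]` are indistinguishable once packed).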

In my mind, *anything* other than the straightforward and obvious signature `__getitem__(self, index, **kws)` is a pointless distraction. We don't need new custom objects to hold keywords. We don't need funny conditional logic about one versus multiple index objects. We don't need some other method that sometimes takes priority. Yes, it's slightly funny that square brackets convert to `index` rather than `*index`, but that ship sailed very long ago, and it's no big deal. There's no problem that needs solving and no need for code churn.

On Wed, 26 Aug 2020 12:46:18 -0400 David Mertz <mertz@gnosis.cx> wrote:
In my mind, *anything* other than the straightforward and obvious signature `__getitem__(self, index, **kws)` is a pointless distraction.
Probably. However, one must define how it's exposed in C. Regards Antoine.

On Wed, Aug 26, 2020 at 9:46 AM David Mertz <mertz@gnosis.cx> wrote:
I disagree here -- sure, for full backwards compatibility and preservation of performance, we'll probably have to live with it. But the fact that the square brackets don't create a tuple makes it fundamentally odd and confusing to work with -- not too big a deal when [] only accepted a single expression, and therefore passed a single value on to the dunders, but it gets very odd when you have something that looks a lot like a function call, but is different.

And it gets worse for __setitem__, which I guess will be:

    thing[ind1, ind2, kwd1=v1, kw2=v2] = value

translating to:

    thing.__setitem__((ind1, ind2), value, kwd1=v1, kw2=v2)

which is pretty darn weird -- particularly if you try to write the handler this way:

    def __setitem__(self, *args, **kwargs):

so args would always be a 2-tuple, something like ((ind1, ind2), value). At least **kwargs would be "normal", yes?

On the plus side, this weirdness is only really exposed to folks writing classes with complex custom indexing behavior, and would still look like the same semantics as a function call to the users of that class. Well, almost -- it would not support [*args] -- would it support [**kwargs]? And this kind of thing is very much a write-once, use-a-lot situation.

TL;DR -- let's watch the strong language here -- these proposals do attempt to address a real problem -- probably not worth the downsides, but this "doesn't solve any problem", "no one asked for it", etc., is really dismissive.

-CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
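As an editorial aside: the tuple-packing behaviour Christopher describes for `__setitem__` (packed index first, assigned value second) is current Python and can be checked directly.

```python
# Runnable today: record exactly what the item dunders receive.
class Recorder:
    def __getitem__(self, key):
        return key

    def __setitem__(self, key, value):
        # The packed index arrives first, the assigned value second.
        self.seen = (key, value)


r = Recorder()
r[1, 2] = 'v'   # key is the tuple (1, 2), value is 'v'
```

This confirms that multiple comma-separated indices reach the dunders as one tuple, and that slices arrive as slice objects.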

On Wed, Aug 26, 2020 at 9:39 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On Wed, Aug 26, 2020 at 9:34 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That version of the idea I gave isn't great, I don't disagree. It wasn't really intended to be the proposed dunder function... just the quickest way I could think of to write it up, to attempt to illustrate things to Steve in the bit below:
I think the argument for providing a SINGLE new dunder, rather than three, could be that you don't end up with two sets of dunders that do the same thing, just with different signatures. The purpose of a new single dunder should simply be to translate the contents of the subscript operator to the existing dunders.

The thing I am struggling with is how it would RETURN that translation...? In my slapdash implementation I just CALLED them, but that seems less than ideal. I suppose it could do it by returning a tuple and a dict, like this:

    def __key_or_index_translator__(self, pos1, *, x) -> Tuple[Tuple[Any], Dict[str, Any]]:
        """I only allow subscripting like this: >>> obj[1, x=2]"""
        return (pos1,), dict(x=x)

The return value then in turn gets unpacked for each of the existing item dunders:

    obj.__getitem__(*(pos1,), **dict(x=x))
    obj.__setitem__(value, *(pos1,), **dict(x=x))
    obj.__delitem__(*(pos1,), **dict(x=x))

Now:

    class C:
        __key_or_index_translator__ = __key_or_index_translator__

        def __getitem__(self, *args, **kwargs): ...
        def __setitem__(self, value, *args, **kwargs): ...
        def __delitem__(self, *args, **kwargs): ...
    >>> C()[1, 2, x=1, y=2]
    TypeError
--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
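As an editorial aside: Ricky's translate-then-unpack idea can be simulated today. `simulate_getitem` stands in for what the interpreter would hypothetically do; the dunder name and shapes follow his sketch and are illustrative assumptions only.

```python
from typing import Any, Dict, Tuple


# Module-level translator, mirroring the sketch in the message above.
def __key_or_index_translator__(self, pos1, *, x) -> Tuple[Tuple[Any, ...], Dict[str, Any]]:
    """Only allow subscripting of the shape obj[1, x=2]."""
    return (pos1,), dict(x=x)


class C:
    __key_or_index_translator__ = __key_or_index_translator__

    def __getitem__(self, pos1, x):
        return (pos1, x)


def simulate_getitem(obj, *args, **kwargs):
    # Stand-in for the interpreter under this proposal: translate the
    # subscript contents, then unpack the result into the item dunder.
    t_args, t_kwargs = obj.__key_or_index_translator__(*args, **kwargs)
    return obj.__getitem__(*t_args, **t_kwargs)
```

The translator's signature does the validation for free: a well-formed subscript passes through, and the `C()[1, 2, x=1, y=2]` example from above fails with a TypeError, exactly as a function call would.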

On Wed, Aug 26, 2020 at 4:01 PM Christopher Barker <pythonchb@gmail.com> wrote:
I have read hundreds of comments in this thread. As I stated, I have yet to see *anyone* identify a "problem that needs solving" with all the convoluted new objects and weird support methods. There is absolutely nothing whatsoever that cannot be easily done with the obvious and straightforward signature:

    __getitem__(self, index, **kws)

Not one person has even vaguely suggested a single use case that cannot be addressed that way, nor even a single case where a different approach would even be slightly easier to work with.

... there was an odd response that a custom class might want to specify its special keywords. Which of course they can! I was just giving the shorthand version. Some particular class can allow and define whichever keywords it wants, just like with every other function or method.

---

Yes, the parser will have to do something special with the stuff in square brackets. As it has always done something special for stuff in square brackets. Obviously, if we call:

    mything[1, 2:3, four=4, five=5]

that needs to get translated, at the bytecode level, into the equivalent of:

    mything.__getitem__((1, slice(2, 3)), four=4, five=5)

But the class MyThing is free to handle its keyword arguments `four` and `five` how it likes. They might have defaults or not (i.e. **kws). I think dict and list should, for now, raise an exception if they see any **kws (but ignoring them is not completely absurd either).

The only question that seems slightly reasonable to see as open is "What is `index` if only keywords are subscripted?" Some sort of sentinel is needed, but conceivably None or an empty tuple wouldn't be backward compatible, since those *can* be subscripts legally now.

--
The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
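As an editorial aside: the positional half of David's translation is current Python and can be checked today; the keyword half can only be shown by calling the dunder directly, since the subscript syntax for keywords does not exist yet. The `MyThing` class and its `four`/`five` parameters are illustrative, following his example.

```python
# 'four' and 'five' are named keyword parameters, as David suggests a
# class is free to define.
class MyThing:
    def __getitem__(self, index, four=0, five=0):
        return (index, four, five)


m = MyThing()
# Positional part, runnable today: m[1, 2:3] packs to (1, slice(2, 3)).
# Keyword part, by direct call only: m.__getitem__((1, slice(2, 3)), four=4, five=5)
```

The point being demonstrated: nothing about the method machinery prevents this signature; only the bracket syntax is missing.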

On Wed, 26 Aug 2020 at 17:50, David Mertz <mertz@gnosis.cx> wrote:
In my mind, *anything* other than the straightforward and obvious signature `__getitem__(self, index, **kws)` is a pointless distraction.
It isn't, and the reason why it isn't is that it makes it much harder for the implementing code to decide how to proceed, for the following reasons:

1. you will have to check if the keywords you receive are actually in the acceptable set
2. you will have to disambiguate an argument that is supposed to be passed _either_ by position or by keyword.

Always remember this use case: a matrix of acquired data, where the row is the time and the column is the detector. I've seen countless times situations where people were creating the matrix with the wrong orientation, putting the detector along the rows and the time along the columns. So you want this:

    acquired_data[time, detector]

to be allowed to be written as:

    acquired_data[time=time, detector=detector]

so that it's unambiguous and you can even mess up the order, but the right thing will be done, without having to remember which one is along the rows and which one is along the columns.

If you use the interface you propose, now the __getitem__ code has to resolve the ambiguity of intermediate cases, and do the appropriate mapping from keyword to positional. Which is annoying, and you are basically doing manually what you would get for free in any standard function call. Can you do it with the implementation you propose? Sure, why not, it's code after all. Is it easy and/or practical? Not so sure. Is there a better way? I don't know. That's what we are here for.

To me, most of the discussion is being derailed into deciding how this should look on the __getitem__ end and how to implement it, but we haven't even decided what it should look like from the outside (although it's probably in the massive number of emails I am trying to aggregate in the new PEP).

--
Kind regards, Stefano Borini
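As an editorial aside: the manual bookkeeping Stefano objects to can be made concrete. The class below is an assumed illustration of his acquired-data example under the plain `__getitem__(self, index, **kws)` interface; every line of axis-matching logic is work a normal function call would do for free. Keyword subscript syntax does not exist yet, so the keyword cases call the dunder directly.

```python
# Illustrative only: manual keyword/positional disambiguation.
class AcquiredData:
    def __getitem__(self, index=None, **kws):
        # Unpack today's packed index into a positional list.
        if isinstance(index, tuple):
            positional = list(index)
        elif index is None:
            positional = []
        else:
            positional = [index]
        names = ['time', 'detector']
        values = {}
        # Place positional arguments onto the named axes, in order.
        for name, value in zip(names, positional):
            values[name] = value
        # Merge keywords, rejecting unknowns and duplicates by hand --
        # exactly the checks a function call performs automatically.
        for name, value in kws.items():
            if name not in names:
                raise TypeError(f"unknown axis {name!r}")
            if name in values:
                raise TypeError(f"axis {name!r} given twice")
            values[name] = value
        if set(values) != set(names):
            raise TypeError("missing axis")
        return (values['time'], values['detector'])
```

So `data[3, 7]`, `data.__getitem__(None, time=3, detector=7)`, and even the reordered `data.__getitem__(None, detector=7, time=3)` all resolve to the same cell, but only because the class re-implements argument binding itself.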

On Wed, Aug 26, 2020, 18:04 Stefano Borini <stefano.borini@gmail.com> wrote:
Again, implicit in your argument here is the assumption that all keyword indices necessarily map onto positional indices. This may be the case with the use case you had in mind. But for other use cases brought up so far, that assumption is false. Your approach would make those use cases extremely difficult if not impossible.

On Wed, 26 Aug 2020 at 23:56, Todd <toddrjen@gmail.com> wrote:
Again, implicit in your argument here is the assumption that all keyword indices necessarily map onto positional indices. This may be the case with the use case you had in mind. But for other use cases brought up so far, that assumption is false. Your approach would make those use cases extremely difficult if not impossible.
Please remind me of one. I'm literally swamped. In any case, this leads to a different question: should we deprecate anonymous axes, and if not, what is the intrinsic meaning and difference between anonymous axes and named axes? -- Kind regards, Stefano Borini

On Wed, Aug 26, 2020 at 7:11 PM Stefano Borini <stefano.borini@gmail.com> wrote:
xarray, which is the primary python package for numpy arrays with labelled dimensions. It supports adding and indexing by additional dimensions that don't correspond directly to the dimensions of the underlying numpy array, and those have no position to match up to. They are called "non-dimension coordinates".

Other people have wanted to allow parameters to be added when indexing -- arguments in the index that change how the indexing behaves. These don't correspond to any dimension, either.

In any case, this leads to a different question: should we deprecate anonymous axes, and if not, what is the intrinsic meaning and difference between anonymous axes and named axes?
Anonymous axes are axes that someone hasn't spent the time adding names to. Although they are unsafe, there is an absolutely immense amount of code built around them. And naming takes a lot of additional work for simple cases. So I think deprecating them at this point would be completely unworkable.

We surely can't deprecate them, but the naming of axes is done at the dunder level, not at the client level. And if the idea were to add a new dunder, then of course you can always name them post facto, and pass through to your older __getitem__ function, if you so wish. I need to take a look at xarray tonight; I am not familiar with its interface.

On Thu, 27 Aug 2020 at 02:27, Todd <toddrjen@gmail.com> wrote:
-- Kind regards, Stefano Borini

On Wed, Aug 26, 2020 at 11:03:20PM +0100, Stefano Borini wrote:
I assumed -- possibly this was wishful thinking on my part :-) -- that David didn't mean *literally* only a collection of arbitrary keywords `**kws`, but was using that as an abbreviation for "whatever named keyword parameters are written in the method signature". E.g. I can write:

    def __getitem__(self, index, *, spam, eggs=None):

to have one mandatory keyword "spam" and one optional keyword "eggs", and it will all Just Work. I presume that David would be good with that too, but if he's not, he really needs to explain why not.

Before someone else points this out, I'll mention that *technically* someone could use index as a keyword argument too, but so what if they do? However, if you really want to prevent it, you can make it positional-only:

    def __getitem__(self, index, /, *, spam, eggs=None):

but honestly that's only necessary if you also collect arbitrary keywords in a `**kwargs` and want to allow "index" as one of them. Otherwise it's overkill.

[...]
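As an editorial aside: Steven's keyword-only signature already works at the method level in current Python; only the bracket syntax for supplying the keywords is missing, so the keyword cases below call the dunder directly. The `Table` class is an illustrative assumption.

```python
# Runnable today: named keyword-only parameters on an item dunder.
class Table:
    def __getitem__(self, index, *, spam, eggs=None):
        return (index, spam, eggs)


t = Table()
# t.__getitem__(0, spam=1)  -> spam is mandatory, eggs defaults to None
# t[0]                      -> TypeError: 'spam' was never supplied
```

Note that `t[0]` raising TypeError at runtime matches the behaviour Steven specifies below for a missing mandatory keyword.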
Is that even a question?

    obj[index, keyword=value]

where index is any comma-separated list of expressions, including slices; keyword is an identifier; and value is any expression, including slices. Are there even any other options being considered? A few points:

- The index is optional, but it will be a runtime TypeError if the method isn't defined with the corresponding parameter taking a default value.
- You can have multiple keyword=value pairs, separated by commas. They must all follow the index part. Duplicate keywords are a runtime error, just as they are for function calls.
- An empty subscript remains a syntax error, even if the method signature would allow it.
(although it's probably in the massive number of emails I am trying to aggregate in the new PEP).
As PEP author, you don't have to include every little tiny detail of every rejected idea. It should be a summary, and it is okay to skip over brief ideas that went nowhere and just point back to the thread.

One of the weaknesses, in my opinion, of the existing PEP is that it tries to exhaustively cover every possible permutation of options (and in doing so, managed to skip the most useful and practical option!). A PEP can be, and most of the time should be, an opinionated persuasive essay which aims to persuade the readers that your solution is the right solution. The PEP has to be fair to objections, but it doesn't have to be neutral on the issue. It shouldn't be neutral unless it is being written merely to document why the issue is being rejected.

So my advice is: choose the model you want, and write the PEP to persuade readers that it is the best model and why the others fail to meet the requirements. I really hope the model you choose is the one I describe above, in which case I will support it, but if you hate that model and want something else, that's your right as the PEP author. (Although the earlier co-authors may no longer wish to be associated with the PEP if you change it radically.)

-- Steve

On Thu, 27 Aug 2020 at 03:00, Steven D'Aprano <steve@pearwood.info> wrote:
Well, there are a lot of corner cases, such as what to do with obj[]. We already discussed it and we both agreed that it was a bad idea to allow it. Plus there's no consensus, as far as I can tell, on whether the index and the keywords should interact or not. I say they should, but others say they should not. By "interact" I mean that anonymous axes exist and live an independent life from named axes, and this behavior is different from a function call, where you always give names to arguments on the function declaration side (unless you have *args, of course).
All right. So we agree on that. I wanted to be sure that we were not ruling out the possibility of having _either_ no keywords _or_ only keywords. We do want to allow mixing. So now the question is what to do with the mixing.
Completely agree with your observation. The main problem when I wrote it is that there was no consensus. I had my preference, the other author had a different one, and the mailing list also didn't settle on a plausible, dead sure implementation. We were as confused back then as we are today.
Will do.
(Although the earlier co-authors may no longer wish to be associated with the PEP if you change it radically.)
I lost contact with the other author. I haven't heard of him since, even on the mailing list. -- Kind regards, Stefano Borini

Hi, other author here ! I'm still following the lists but not writing any more. :) Stefano if you want to change pep 472 there is no problem on my side (but I've read that some people preferred a new one), I remember that you did all the work anyway ! Since lots will change just remove my name, it's fine. As for the technical discussion, I like how __getitem__(self, index, **kwargs) (and variations with named keyword args, of course) looks more like a normal function. "index" is already a strange beast (it's a tuple except when it's not), I vote that we don't add new ones and stick with existing standard types. As for foo[a=1, b=2], I'd propose to keep it a SyntaxError for now, and always require an index. This way it can be changed later when people are more used to the keyword args and have more ideas of what would be good as a default argument. In the face of ambiguity, refuse the temptation to guess ;) Joseph
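The "strange beast" Joseph mentions is easy to observe today (the `Probe` class is just an illustrative stand-in):

```python
# A single subscript arrives as-is; a comma-separated subscript
# arrives as one tuple argument.
class Probe:
    def __getitem__(self, index):
        return index

p = Probe()
print(p[1])     # 1      -- not a tuple
print(p[1, 2])  # (1, 2) -- one tuple argument
print(p[1,])    # (1,)   -- trailing comma also makes a tuple
```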

On Thu, Aug 27, 2020, 7:31 PM Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:
Joseph
That last would eliminate one of the probably most important uses of the new functionality, in my opinion, which would be a type hint shortcut that looks like this: Vector = dict[i=str, j=str] Kwargs = dict[name=str, value=Any]

That's only one of the possible uses, and I'm not even sure that it would be implemented directly. Adding keywords to indexation for custom classes is not the same as modifying the standard dict type for typing. Anyway, my point was that it's still possible to do it later without breaking anything. Many features were added incrementally, including type hints !

On Thu, Aug 27, 2020 at 9:24 PM Joseph Martinot-Lagarde < contrebasse@gmail.com> wrote:
It doesn't have to be the standard dict type, it can be typing.Dict instead. But I understand the desire to take it one step at a time. I very much hope that this type hint syntax comes along before too much longer, though. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Thu, Aug 27, 2020, 19:30 Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:
This really wouldn't work for xarray, either. I think the simplest default value would be Ellipsis. So foo[a=1, b=2] would be equivalent to foo[..., a=1, b=2] But I don't see why this is a problem we have to deal with. The index argument can just not be passed at all, and it is up to the class developer to pick an appropriate sentinel if needed.
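Todd's sentinel idea can be sketched today by calling the dunder directly (the class and parameter names are hypothetical; the keyword subscript syntax itself does not exist yet):

```python
# The class author picks Ellipsis as the default index, so an
# all-keyword subscript still has a well-defined index value.
class Axes:
    def __getitem__(self, index=..., *, a=None, b=None):
        return (index, a, b)

ax = Axes()
# Under the proposal, ax[a=1, b=2] would be roughly equivalent to:
print(ax.__getitem__(a=1, b=2))  # (Ellipsis, 1, 2)
```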

On Fri, Aug 28, 2020 at 5:10 AM Todd <toddrjen@gmail.com> wrote:
That is a good point -- any existing code (or new code that doesn't support keywords) would raise an error -- though it would be a TypeError, rather than a SyntaxError, if no index were passed in. So why not allow it? This does require some thought about backward compatibility -- as passing anything other than a single index is now a SyntaxError, most code in the wild will not be set up to handle the runtime TypeError that might now arise. As "proper" exception handling should be close to the operation, and catch specific exceptions, most cases will probably be fine. But not all. For example, there might be code in the wild that does

    try:
        a_function(something)
    except TypeError:
        do_something()

And there is something in a_function that uses indexing -- someone messes with that code, and puts something new in an index that used to be a SyntaxError and is now a TypeError -- in the past, that wouldn't have even run, but now it will, and the problem might not be caught in tests because the TypeError is being handled. Given that this would be a change from a compile-time error to a runtime error, no code that used to work will break, but it would be easier to write broken code in certain situations -- maybe not a huge deal, but worth thinking about. -CHB
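The masking hazard CHB describes can be simulated today by invoking the dunder directly with a keyword it doesn't accept (all names below are hypothetical):

```python
# A legacy class with no keyword support in its item dunder.
class Legacy:
    def __getitem__(self, index):
        return index

def a_function(obj):
    # Under the proposal, obj[0, flag=True] would raise TypeError at
    # runtime; calling the dunder directly raises that same error now.
    return obj.__getitem__(0, flag=True)

try:
    a_function(Legacy())
except TypeError:
    # A broad handler written for some other purpose silently
    # swallows the new error -- the situation described above.
    print("TypeError silently handled")
```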

On Wed, Aug 26, 2020 at 12:32:56PM -0400, Ricky Teachey wrote:
We already have that: `__getitem__`.
As I see it, the disagreement comes about from two main areas:

- People who want to bake into the language semantics that solve a tiny niche problem ("what if I want to treat keyword:argument pairs as keys?"), making the common use-cases much harder to solve.
- People trying to over-engineer a complicated solution to something that isn't actually a problem: the supposed inconsistency between subscripts and function calls.

The disagreement isn't from people arguing about how your class should interpret the keywords. That isn't holding up the PEP. Interpreting the keywords is easy: interpret them however you want. Python doesn't tell you what the keywords mean, it just hands them to your method.
What flexibility is added that `__getitem__` doesn't already provide?
Won't sequences and mappings still call the same dunder method?
That's not a paradigm. That's just an application of the feature. *Type-hints* was a major change to the Python's language paradigm. This is just a (potential) nice syntax that allows a concise but readable type-hint. https://en.wikipedia.org/wiki/Paradigm [...]
You're suggesting that if subscript keywords were permitted, rather than having the keywords passed directly to `__getitem__` where they are wanted, the interpreter should pass them to another method and then that method could pass them to `__getitem__`. Or we could just pass them directly into `__getitem__`, as requested. How does the interpreter know to call yay_kwargs rather than some other method?
That is a remarkably unenlightening example. There's no sign of what your new method is or what it does. There's an error, but you don't know what it is except it "probably" won't be a syntax error, so you don't even know if this error is detected at compile time or runtime. You say PEP 472 is not accepted, but maybe it is accepted since keyword subscripts are possibly not a syntax error, or maybe they are still a syntax error. Who knows? Not me, that's for sure. I feel like I'm trying to nail jelly to a wall, trying to understand your proposal. Every time I ask a question, it seems to twist and mutate and become something else. And how does this relate to Jonathan's idea of "signature dependent semantics" which you were singing the praises of, but seems to have nothing to do with what you are describing now. [...]
We could make the dunder to accept the target dunder method name as a parameter. This way there is only a single new dunder, rather than 3.
Or we could have no new dunders at all, and just pass the keywords to the appropriate, and existing, `__*item__` methods. I'm going to ignore the business about `__subscript__` because you seem to have backed away from it in a later email, except for one relatively minor point. You proposed this:
But that's not what subscripting does now. If you use `q[1,]` the getitem method gets the tuple (1,), not the int 1. So if you're serious about that, it's a breaking change.

    py> "abcd"[1]
    'b'
    py> "abcd"[1,]
    TypeError: string indices must be integers

-- Steve

On Wed, Aug 26, 2020 at 10:34 PM Steven D'Aprano <steve@pearwood.info> wrote:
Actually no, we have *THREE* dunders: get, set, and del -item. And they accept a single argument containing a single object representing a single key or index. And if I had to make a prediction, then probably pretty soon they'll accept kwargs, too. And these are great as-is when the way you want to use them is in total agreement with the way they were envisioned to be used: for single argument keys/indexes. But any use case that deviates from that vision requires effort spread across three dunder methods to set it up right. And it requires writing code that breaks up the positional arguments in just the right way; code that you pretty much never have to write when using the function paradigm (rather than the item dunder paradigm), because the language does it all for us: def f(pos1, pos2, *args, kwd1, kwd2=None, **kwargs): ... I don't have to write code in the function body of `f` breaking the arguments up the way I intend to use them, and errors are raised when required arguments are missing, etc etc. This is a set of fantastic language features. But the positional part of it is missing from subscripting, from __getitem__, __setitem__, and __delitem__ and you have to break up your positionals in three different dunder methods. If you are a person who wants to use the subscripting operator from a more function-like paradigm, this is kind of a lot of work and extra code to maintain-- code that simply would not be needed at all if there was a single dunder responsible for accepting the key/index "stuff" and then sending it off-- somehow-- into the item dunders to be used. How is it sent? I don't know yet. Thinking about it.
I explained above: a single point of subscript argument acceptance that provides all the language features of function signatures.
Yes, but again, all the existing power the language has for calling functions and passing them to the function signature can be brought to the table for all three subscript operators using a single new dunder. This smoothes things out, because if you don't want to use the default python treatment of positional subscript arguments as a single key or index object, it makes it much easier to do things another way.
I'm not going to respond to this other than to say that I think I was clear and providing a link to a dictionary entry for "paradigm" sort of feels like picking on my choice of words rather than actually responding to me with candor. If so, that doesn't seem like a very kind conversation choice. If I am correct, I'd like to ask that in the future, please choose to be more kind when you respond to me? Thank you. On the other hand if I am wrong and it wasn't clear what I am saying, or if you were just being humorous, I very much apologize for being oversensitive. :) You're suggesting that if subscript keywords were permitted, rather than
The only thing Jonathan and I are proposing at this time is a __dunder__ that gets passed the contents of the subscript operator and handles how those contents get passed along:

1. New __dunder__ that gets the contents
2. ???
3. Item dunders called the way the user wants them called.

The details of 2-- how that function "passes on" that information to the item dunders-- I am currently unsure about. ONE way might be to have the new method call the existing dunders. But that's not my preference-- I would like to find another way. I do have another idea, and I presented it in a previous response. Here it is again:

    def __key_or_index_translator__(self, pos1, pos2, *, x) -> Tuple[Tuple[Any], Dict[str, Any]]:
        """I only allow subscripting like this:
        >>> obj[1, x=2]
        """
        return (pos1, pos2), dict(x=x)

The returned two-tuple, `t`, in turn gets unpacked like this for each of the existing item dunders:

    obj.__getitem__(*t[0], **t[1])
    obj.__setitem__(value, *t[0], **t[1])
    obj.__delitem__(*t[0], **t[1])

Or we could just pass them directly into `__getitem__`, as requested.
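A runnable, pure-Python emulation of the flow Ricky sketches (the interpreter hook doesn't exist, so the translation and the unpacking are performed by hand; the `Obj` class is hypothetical):

```python
from typing import Any, Dict, Tuple

class Obj:
    # The proposed translation dunder: fixes the allowed subscript shape.
    def __key_or_index_translator__(self, pos1, pos2, *, x) -> Tuple[Tuple[Any, ...], Dict[str, Any]]:
        return (pos1, pos2), dict(x=x)

    def __getitem__(self, *args, **kwargs):
        return (args, kwargs)

obj = Obj()
# Under the proposal, obj[1, 2, x=3] would roughly do:
t = obj.__key_or_index_translator__(1, 2, x=3)
print(obj.__getitem__(*t[0], **t[1]))  # ((1, 2), {'x': 3})
```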
How does the interpreter know to call yay_kwargs rather than some other method?
The yay_kwargs would be assigned to the new dunder attribute:

    class C:
        __new_dunder__ = yay_kwargs

However, again, I am not really proposing that the dunder actually calls __getitem__, etc. That is one way to do it, but I don't like it much. I think I like the way above a lot better. But I am open to other suggestions.
That's because I have been exploring multiple ideas at the same time. And this particular proposal is really the nugget of an idea currently, the details of which are changing as part of an ongoing conversation with input from other people. Parenthetically: my comments above about what I felt like was a lack of kindness notwithstanding, I very much appreciate your direct engagement on this topic with me. You've pointed several things out in the process that have informed my thinking. And you've been really patient considering that you've made it clear you'd prefer not to have anymore additional proposals derailing things. :) So really, thank you.
Well I don't want Jonathan Fine blamed for that one: it was actually my idea. I floated it in the other thread, the title of which was: "Changing item dunder method signatures to utilize positional arguments (open thread)" The purpose of that thread is to explore options for how we might find a backwards compatible way to allow subscripting to be used from a function paradigm, if desired by the programmer. The signature-dependent-semantics approach was just one such idea, and it doesn't relate to this. I have been presenting several ideas I realize, but it's the ideas list after all. I'm going to ignore the business about `__subscript__` because you seem
Yes, backed away from calling __getitem__ inside of __subscript__. And yes: I know that's not what it does. That's what it will do if we created a new dunder that uses the function paradigm for accepting the arguments so they can be passed along to the existing dunders. If you use `q[1,]` the
getitem method gets the tuple (1,), not the int 1. So if you're serious about that, it's a breaking change.
It isn't breaking. Because the default key or index translation function, or __subscript__ function-- the one that implicitly exists today-- won't change. It does this, as you know:

    CODE             CALLS
    q[1,]            q.__getitem__((1,))
    q[1,] = None     q.__setitem__((1,), None)
    del q[1,]        q.__delitem__((1,))

The proposal is that if a different translation function isn't provided, things just get passed to the existing 3 dunders as they do today. Only in the case that a new dunder-- something like __subscript__-- gets called does it change to this:

    CODE             FIRST CALLED                LATER CALLED
    q[1,]            q.__subscript__(1)          q.__getitem__ in whatever way __subscript__ directs
    q[1,] = None     q.__subscript__(None, 1)    q.__setitem__ in whatever way __subscript__ directs
    del q[1,]        q.__subscript__(1)          q.__delitem__ in whatever way __subscript__ directs

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

Cripes, I screwed up the __key_or_index_translator__ docstring. See correction below, sorry.

-------------------

    def __key_or_index_translator__(self, pos1, pos2, *, x) -> Tuple[Tuple[Any], Dict[str, Any]]:
        """I only allow subscripting like this:
        >>> obj[1, 2, x=2]
        """
        return (pos1, pos2), dict(x=x)

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 2020-08-27 12:33 a.m., Ricky Teachey wrote:
I'm not seeing what problem adding a new dunder and indirection of __*item__ solves that isn't solved by something like

    class K:
        def _make_key(self, index, **kwargs):
            # Where **kwargs is any kind of keywordey params, not necessarily just a **kwargs
            return 'something'

        def __getitem__(self, index, **kwargs):
            key = self._make_key(index, **kwargs)
            ...

        def __delitem__(self, index, **kwargs):
            key = self._make_key(index, **kwargs)
            ...

        def __setitem__(self, index, value, **kwargs):
            key = self._make_key(index, **kwargs)
            ...

Sure, on a pedantic level I had to put effort across three dunders, but the effort is a single method call *and* I would still have needed to do it in the __subscript__ scenario, except I would also have to have written a __subscript__ that is a combination of _make_key and boilerplate to call the method that the interpreter would have previously called for me. Alex

On Thu, Aug 27, 2020, 1:29 AM Alexandre Brault <abrault@mapgears.com> wrote:
It kills at least 3 birds with one stone:

1. Brings kwd arguments to item dunders (PEP 472 does this too, but a key/index translation dunder kills two other birds)
2. Switches positional arguments over to the function paradigm, bringing the full power of python signature parsing to subscripting.
3. Removes the need to write calls to the same supporting function in the item dunders. They are called automatically.
I don't care for how I wrote the details of how __subscript__ passes the args and kwargs to the item dunders, by calling them directly. Looking for another logical way to do that. My current attempt is this:

    def __subscript__(self, *args, **kwargs) -> Tuple[Tuple[Any], Dict[str, Any]]:
        return t

Which, for a getitem dunder call, `t` becomes:

    obj.__getitem__(*t[0], **t[1])

Actually there's a fourth bird.

4. Once you've decided on the signature of your key/index translation dunder, you can ignore the signature in the rest of the item dunders:

    class C:
        def __subscript__(self, x, y):
            return (y,), dict(x=x)

        def __getitem__(self, *args, **kwargs):
            print(args, kwargs)
    >>> C()[100, y='foo']
    ('foo',) {'x': 100}

On Thu, Aug 27, 2020 at 07:24:01AM -0400, Ricky Teachey wrote:
I would put it another way: your proposal to redesign subscripting is independent of whether or not keyword subscripts are permitted. We could redesign the comma-separated item part of it without introducing keywords at all. There is nothing in your proposal that requires keywords. Just delete the bits about `**kwargs` and the rest of it stands as it is. PEP 472, on the other hand, is *all about keywords*. Keywords are irrelevant to your proposal to redesign subscripting. We could take it with or without keywords.
2. Switches positional arguments over to the function paradigm, bringing the full power of python signature parsing to subscripting.
If you want a function call, use function call syntax. PEP 472 is about adding keyword support to subscripting. The aim of the PEP is still for subscripting to fundamentally be about an index or key, not about making square brackets to be a second way to do function calls. I'll have more to say about that later.
3. Removes the need to write calls to the same supporting function in the item dunders. They are called automatically.
I don't understand that sentence. [...]
NameError: name 't' is not defined
Which, for a getitem dunder call, `t` becomes:
obj.__getitem__(*t[0], **t[1])
What does this mean? You assign t = obj.__getitem__(*t[0], **t[1]) and then return t? -- Steve

On Thu, Aug 27, 2020, 8:34 AM Steven D'Aprano <steve@pearwood.info> wrote:
Sorry, I need to stop coding in shorthand. Here is a more fleshed out illustration of what I am imagining would happen. I welcome suggestions for better ways to pass the arguments on to the item dunders-- the proposal remains the basic idea nugget Jonathan Fine presented:

1. A dunder that handles the args and kwargs sent into the subscript operator in whatever way the programmer wants (the default way, when no such dunder is provided, being current python behavior).
2. MAGIC --- a suggestion for what the magic could be, below.
3. The appropriate item dunder is called with the parsed parameters.

First, we have an example class with a new __subscript__ dunder. Remember, the __subscript__ dunder can have any signature you want, and do anything to the arguments inside that you want, it just needs to return something matching the type hint below when done:

    class Q:
        def __subscript__(self, a, b, c, *, x, y, z, **kwargs) -> Tuple[Tuple[Any, ...], Mapping[str, Any]]:
            kwargs.update(x=x, y=y, z=z)
            return (a, b, c), kwargs

        def __getitem__(self, *args, **kwargs): ...
        def __setitem__(self, __value, *args, **kwargs): ...
        def __delitem__(self, *args, **kwargs): ...

Here is what I imagine happens at the python level (but I am writing it in python):

    MISSING = object()

    def cpython_subscript_operator_function(obj, *args, __item_dunder_name, __value=MISSING, **kwargs):
        if __item_dunder_name == "__setitem__" and __value is not MISSING:
            args = (__value, *args)
        obj_class = type(obj)
        subscript_dunder = getattr(obj_class, "__subscript__", None)
        if subscript_dunder:
            args, kwargs = subscript_dunder(obj, *args, **kwargs)
        return getattr(obj_class, __item_dunder_name)(obj, *args, **kwargs)

Now, when I write this python code:
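A runnable condensation of the dispatch described above, with the operator function called by hand since the interpreter hook does not exist (all names here are hypothetical):

```python
# Emulated "internal operator function": consult __subscript__, if
# present, to translate the subscript contents, then call the item dunder.
def emulated_getitem(obj, *args, **kwargs):
    cls = type(obj)
    subscript = getattr(cls, "__subscript__", None)
    if subscript is not None:
        args, kwargs = subscript(obj, *args, **kwargs)
    return cls.__getitem__(obj, *args, **kwargs)

class Demo:
    def __subscript__(self, a, *, x):
        return (a,), dict(x=x)

    def __getitem__(self, *args, **kwargs):
        return (args, kwargs)

# Under the proposal, Demo()[1, x=4] would behave like:
print(emulated_getitem(Demo(), 1, x=4))  # ((1,), {'x': 4})
```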
q=Q() q[1, x=4, y=5, z=6, foo='bar', b=2, c=3]
That code calls this cpython pseudo code:

    cpython_subscript_operator_function(q, 1, __item_dunder_name="__getitem__", x=4, y=5, z=6, foo='bar', b=2, c=3)

And if I write set item code, it is like this:
q[1, 2, 3, x=4, y=5, z=6, foo='bar'] = "baz"
cpython level call looks like:

    cpython_subscript_operator_function(q, 1, __item_dunder_name="__getitem__", __value="baz", x=4, y=5, z=6, foo='bar', b=2, c=3)

Sorry: forgot to change "__getitem__" to "__setitem__". Correction below.

    cpython_subscript_operator_function(q, 1, __item_dunder_name="__setitem__", __value="baz", x=4, y=5, z=6, foo='bar', b=2, c=3)

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

Here are two examples, which I hope help us understand our options. Here's an example of how the new dunder might work in practice.

    class A:
        __keyfn__ = None
        def __setitem__(self, val, x=0, y=0, z=0):
            print((val, x, y, z))

    >>> a = A()
    >>> a[1, z=2] = 'hello'
    ('hello', 1, 0, 2)

Here's my understanding of Steven's proposal. (Please correct me if I've got something wrong.)

    class B:
        def __setitem__(self, argv, val, *, x=0, y=0, z=0):
            print((val, argv, x, y, z))

    >>> b = B()
    >>> b[1, z=2] = 'hello'
    ('hello', 1, 0, 0, 2)

By the way, I've not tested this code. -- Jonathan
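Since Jonathan notes the code is untested, the intended dispatch for class A can be checked today by calling the dunder directly (the keyword subscript syntax itself is not yet valid):

```python
class A:
    __keyfn__ = None
    def __setitem__(self, val, x=0, y=0, z=0):
        return (val, x, y, z)

a = A()
# Under the proposal, a[1, z=2] = 'hello' would call
# a.__setitem__('hello', 1, z=2), giving the tuple shown above:
print(a.__setitem__('hello', 1, z=2))  # ('hello', 1, 0, 2)
```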

This is a continuation of my previous post. I wrote: Here's an example of how the new dunder might work in practice.
To continue, suppose that True is the default value for __keyfn__. Consider now

    class C:
        __keyfn__ = True
        def __setitem__(self, *argv, **kwargs):
            print(f'argv={argv} | kwargs={kwargs}')

Here's one option for what should happen.

    >>> c = C()
    >>> c[1] = 'val'
    argv=(1, 'val') | kwargs={}
    >>> c[1, 2] = 'val'
    argv=((1, 2), 'val') | kwargs={}
    >>> c[a=1] = 'val'
    TypeError: __keyfn__ got unexpected keyword argument 'a'

By the way, I've not tested this code. In short, the present behaviour continues, except that the compile-time error

    >>> c[a=1] = 'val'
    SyntaxError: invalid syntax

is replaced by the run-time error

    >>> c[a=1] = 'val'
    TypeError: __keyfn__ got unexpected keyword argument 'a'

Some of us want

    >>> d = dict()
    >>> d[a=1] = 'val'

to raise an exception. I've just described how having True as the default value for __keyfn__ allows that to happen, should the Python community so decide. I hope this message helps. -- Jonathan

On Thu, Aug 27, 2020 at 09:57:26AM -0400, Ricky Teachey wrote:
Sorry, I need to stop coding in shorthand.
That might help. What might help even more is if you spend less time showing imaginary, and invariably buggy, examples and more time explaining in words the intended semantics of this, and the reason why you want those semantics. I had to read over your email three times before the penny dropped what you are actually saying. Partly because I haven't had breakfast yet, partly because I was still thinking about earlier versions of your proposal e.g. when you had a single subscript dunder get passed the name of the get- set- or del-item dunder, and was expected to dispatch to that method. But mostly because your description is so full of fine detail that the big picture is missing. So let me see if I have this. You want to add a special dunder method which, if it exists, is automatically called by the interpreter to preprocess the subscript before passing it to the usual get-, set- and del-item dunders. So instead of having this:

    # hypothetical change to subscript behaviour
    # to allow multiple arguments
    def __getitem__(self, fee, fi, fo, fum):
        ...
    # and similar for __setitem__ and __delitem__

We will have this:

    def __subscript__(self, fee, fi, fo, fum):
        return (fee, fi, fo, fum), {}

    def __getitem__(self, fee, fi, fo, fum, **kw):
        assert kw == {}
        ...
    # and similar for __setitem__ and __delitem__

and the process changes from:

* interpreter passes arguments to the appropriate dunder

to this instead:

* interpreter passes arguments to the subscript dunder
* which preprocesses them and returns them
* and the interpreter then passes them to the appropriate dunder.

I'm underwhelmed. I *think* your intention here is to handle the transition from the status quo to full function-like parameters in subscripts in a backwards compatible way, but that's not going to work. The status quo is that the subscript is passed as either a single value, or a tuple, not multiple arguments.
If that *parsing rule* remains in place, then these two calls are indistinguishable:

    obj[spam, eggs]
    obj[(spam, eggs)]

and your subscript dunder will only receive a single argument because that's what the parser sees. So you need to change the parser rule. But that breaks code that doesn't include the subscript dunder, because now this:

    obj[spam, eggs]

gets passed as two args, not one, and `__getitem__` has only been written to accept one, so you get a TypeError. Your subscript preprocessor would allow the coder to stick spam and eggs into a tuple and pass it on, but it also returns a dict so the getitem dunder still needs to be re-written to accept `**kwargs` and check that it's empty, so you're adding more, not less, work. And besides, if I have to add a brand new dunder method to my class in order for my item getter to not break, it's not really backwards-compatible. Okay, let's make the interpreter smarter: it parses spam, eggs as two arguments, and then sees that there is no subscript dunder, so it drops back to "legacy mode", and assembles spam and eggs into a tuple before passing it on to the item getter. Only that's not really backwards compatible either, because the interpreter can't distinguish the two cases:

    # single argument with trailing comma
    obj[spam,]
    # no trailing comma
    obj[spam]

In both cases this looks like a single argument to the interpreter, but the status quo is that they are different. The first one needs to be a tuple of one item. Why add all this complexity only to fail to remain backwards-compatible? Better would be to add a new future directive to change the parsing of subscripts, and allow people to opt-in when they are ready on a per-module basis.

    from __future__ import subscript_arguments

This sort of change in behaviour is exactly why the future mechanism was invented.
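Steven's indistinguishability point is observable in current Python (the `Recorder` class is just an illustrative stand-in):

```python
# Both subscript forms deliver exactly the same object to __getitem__,
# so no dunder-level machinery can tell them apart.
class Recorder:
    def __getitem__(self, key):
        return key

r = Recorder()
print(r[1, 2])    # (1, 2)
print(r[(1, 2)])  # (1, 2) -- identical; the parser can't distinguish them
```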
If it is desirable to change subscripting to pass multiple positional arguments, then we should use that, not complicated jerry-rigged "Do What I Mean" cunning plans that fail to Do What I Meant. Notice that none of the above needs to refer to keyword arguments. We could leave keyword arguments out of your proposal, and the argument parsing issue remains. -- Steve

On Fri, Aug 28, 2020 at 10:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
I'm sorry you had to read over my email so many times. I'm sorry I have confused things by throwing out more than one idea at a time. I am sorry I am so bad at explaining things. Hopefully now it is clear, regardless of how we got there. It sounds like it is. I was really trying to explain the semantics multiple times but as I look over my messages your criticism is correct, I was throwing out too many detailed examples rather than focusing on the idea. For your benefit- since my explanations weren't sufficient- I wrote a bunch of admittedly junkie code, which is sometimes easier to understand (even if it is buggy) than English, in an attempt to showcase the idea more clearly. So let me see if I have this. You want to add a special dunder method
Yes, that's the basic idea as I envision it and as Jonathan Fine wrote in the first message in this thread.
I think a new dunder is a good idea. I've explained why a couple times but I can try again if you'd like. On the other hand, we've established I'm bad at explaining things so maybe not a great idea. I can point you to this comment from Greg Ewing in the other thread where I first brought up the new dunders (3 new dunders, in that case) idea, maybe it will be better than I can do (however he's talking about 3 dunders-- still hoping he and others might come around to the idea of just one): https://mail.python.org/archives/list/python-ideas@python.org/message/NIJAZK...
I'm not so sure that's fully true. There are certainly problems that need to be worked out.
Yes I've made this observation myself in a couple different replies and I agree it's a problem. Greg Ewing (again!) had a helpful comment about it, perhaps he is correct: https://mail.python.org/archives/list/python-ideas@python.org/message/XWE73V... "We could probably cope with that by generating different bytecode when there is a single argument with a trailing comma, so that a runtime decision can be made as to whether to tupleify it. However, I'm not sure whether it's necessary to go that far. The important thing isn't to make the indexing syntax exactly match function call syntax, it's to pass multiple indexes as positional arguments to __getindex__. So I'd be fine with having to write a[(1,)] to get a one-element tuple in both the old and new cases. It might actually be better that way, because having trailing commas mean different things depending on the type of object being indexed could be quite confusing."
No, that's not right. The kwargs mapping included in the return by the preprocessor gets unpacked in the item dunder call. An unpacked empty dict in an existing item dunder (without kwargs support) creates no error at all. Yes, if the kwargs dunder contains argument names not supported by the signature of the item dunders, we will get an error. But that's true with any function call. So, you know, don't do that. And besides, if I
have to add a brand new dunder method to my class in order for my item getter to not break, it's not really backwards-compatible.
How is it going to be broken? Unless you add the new dunder to your class, it won't be operative. The implicitly existing internal python function that does the job of this proposed dunder acts as the preprocessor instead of the dunder. Okay, let's make the interpreter smarter: it parses spam, eggs as two
That's one way to think about it. Another way to think about it is, there is an existing preprocessing function in cpython, today-- and that function is currently not exposed for external use, and it currently gets passed (spam, eggs) as a tuple and silently sends that tuple on to the item dunders:

    def cpython_subscript_preprocessor(key_or_index):
        ...

We modify that existing function signature so that it is passed positional arguments, something like this:

    def cpython_subscript_preprocessor(*args):
        ...

To achieve the behavior we have today, that updated function checks to see if len(args) == 1, and if it is, it returns args[0]. Otherwise, it returns just args (a tuple). I believe that should replicate the behavior we have now. This is intended as a semantic description, just using code as an aid in explaining the idea.
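A runnable sketch of the replication rule Ricky describes (the function name is hypothetical). Note it covers the no-trailing-comma cases; the `obj[spam,]` case Steven raises elsewhere in the thread remains the sticking point:

```python
# Collapse a single positional argument back to itself; otherwise pass
# the whole tuple through -- replicating today's subscript behaviour
# (except for the trailing-comma case, obj[spam,]).
def subscript_preprocessor(*args):
    return args[0] if len(args) == 1 else args

print(subscript_preprocessor(1))     # 1      -- like obj[1]
print(subscript_preprocessor(1, 2))  # (1, 2) -- like obj[1, 2]
```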
Only that's not really backwards compatible either, because the
As I explained above, that can be handled just fine by replicating the existing cpython preprocessor with the correct signature, and parsing it consistently with current python behavior. Cpython pseudo-code to explain:

def cpython_subscript_preprocessor(*args):
    ...
    try:
        args, = args
    except (ValueError, TypeError):
        pass

Then return args in whatever way the API becomes defined. My idea at the moment is to return it as a two-tuple that looks like this:

(item_dunder_args, item_dunder_kwargs)

...and that two-tuple gets unpacked in the item dunders this way:

obj.__getitem__(*item_dunder_args, **item_dunder_kwargs)

For the current default internal python preprocessing function, in the case of spam, eggs it would return:

(((spam, eggs),), {})

...and that two-tuple gets unpacked in the item dunders this way:

obj.__getitem__(*((spam, eggs),), **{})

In both cases this looks like a single argument to the interpreter, but
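A runnable sketch of the two-tuple API described in this message; the preprocessor name and the (item_dunder_args, item_dunder_kwargs) return shape come from the message, while the implementation itself is my assumption:

```python
def subscript_preprocessor(*args, **kwargs):
    # Default behavior: collapse the positionals the way Python does
    # today, then return (item_dunder_args, item_dunder_kwargs).
    if len(args) == 1:
        key = args[0]
    else:
        key = args
    return ((key,), kwargs)

# For d[spam, eggs] the preprocessor returns (((spam, eggs),), {}),
# which unpacks to a single tuple argument in __getitem__.
dunder_args, dunder_kwargs = subscript_preprocessor('spam', 'eggs')
assert dunder_args == (('spam', 'eggs'),)
assert dunder_kwargs == {}
```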
It doesn't fail, as I explained.
This might be an even better idea. Are you proposing it? But it would certainly break a lot of code to eventually make that change, so I'm unsure I would support it... maybe I could be talked into it, I don't know. A new dunder seems far more friendly to existing code. This sort of change in behaviour is exactly why the future mechanism was
Cool. I'm interested. Notice that none of the above needs to refer to keyword arguments. We
Understood, but if the intention of the entire proposal is to shift subscripting as much as possible to a function-calling paradigm, it would be very weird to leave out keyword arguments in the process. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 29/08/20 4:50 pm, Ricky Teachey wrote:
Whereas I'm hoping that you might come around to the idea of three, :-) Your version is simpler in the sense that it uses fewer new dunders. However, it comes at the cost of more complexity, both conceptually and implementation-wise, and requires constructing new objects on every indexing operation, which is a fairly expensive thing to do. My version is based on a vision of what the indexing dunders might have been like if we'd had positional and keyword indexing from the beginning. If Python 4000 ever happens, the old dunders could be cleanly removed, just leaving the new ones. There's no such upgrade path for your version. -- Greg

On Sat, 29 Aug 2020 at 05:53, Ricky Teachey <ricky@teachey.org> wrote:
On Fri, Aug 28, 2020 at 10:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
I've read the discussion. I'm also not impressed by this proposal. It's not lack of understanding, so repeating your explanations isn't going to be worth it, I just don't think this is a good trade-off in terms of complexity vs benefit. There seems to be quite a lot of tendency here (not just you, others are doing it too) to assume "you didn't find my arguments convincing, so I'll explain them again and hopefully you'll understand them better". The problem isn't lack of understanding, it's just that *the arguments aren't convincing people*. Come up with new arguments, or accept that people don't agree with you. It's getting pretty hard to follow this discussion, simply because any genuinely new points are getting lost in a swamp of re-hashed explanations of arguments and suggestions that have already been made. On Sat, 29 Aug 2020 at 07:27, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
[...]
Three dunders makes sense to me, *because* it has a consistent underlying vision, we have three dunders right now, and they all handle subscripts in the same way. Any new approach to subscripting will need to be reflected either 3 times in the three dunders (your approach) or via some sort of intermediate "adapter" (Ricky's approach). Doing it cleanly seems better. Having said that, the whole thing seems like an over-complicated attempt to solve a quirk of the existing syntax that doesn't really need solving (unless there are important use cases lost in the swamp of repeated arguments I mentioned above :-(). We can simply allow for keywords to be passed to the *existing* dunders, and get enough benefit to address the key use cases, without all of this hassle. So I remain unimpressed by the bulk of the arguments here, and unconvinced that we need *any* of these proposals. Paul

On Sat, Aug 29, 2020, 5:08 AM Paul Moore <p.f.moore@gmail.com> wrote:
I am fully willing to accept that people don't think it's a good idea. I spent time continuing to explain because it was actually clear that Steve, at least, did not understand what the proposal on this thread actually was until his most recent message. I'm sure that's mostly my fault. Anyway, moving on from explaining what it is. I am not a person that assumes my ideas are so fantastic that if people don't support them they must not understand. On the contrary, I'm assuming my ideas probably aren't good and if people don't like them I seek to understand why. And everyone has been immensely helpful. The signature dependent semantics idea I suggested on the other thread, for example, was rightly shot down because it won't work with python's dynamic nature. So I abandoned it. So I remain unimpressed by the bulk of the arguments here, and
unconvinced that we need *any* of these proposals.
Paul
Here's one reason I find a new dunder or dunders compelling that I haven't seen anyone respond to directly: I can write functions that define named arguments, and if I pass them positionally, they get assigned the right name automatically (unless disallowed by the signature using 3.8 positional-only syntax):

def f(x, y): ...

f(1, 2)
f(1, y=2)
f(y=2, x=1)

If we add kwargs to the subscript operator, we'll be able to add new required or optional names to item dunder signatures and supply named arguments:

def __getitem__(self, key, x, y): ...

q[x=1, y=2]

But if we want to have the same behavior without supporting function-style syntax, we will have to write code like this:

MISSING = object()

def __getitem__(self, key, x=MISSING, y=MISSING):
    if x is MISSING and y is MISSING:
        x, y = key
    if x is MISSING:
        x, = key
    if y is MISSING:
        y, = key

And probably that code I just wrote has bugs. And it gets more complicated if we want to have more arguments than just two. And even more complicated if we want some of the arguments to be positional-only or any other combination of things. This is code you would not have to write if we could do this instead with a new dunder or subscript processor:

def __getx__(self, x, y): ...

And these all just work:

q[1, 2]
q[1, y=2]
q[y=2, x=1]

1 is assigned to x and 2 is assigned to y in all of these for both versions, but the second version requires no parsing of parameters. Python does it for us. That's a lot of easily available flexibility.
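For what it's worth, here is a runnable version of the manual-parsing fallback sketched in that message; the MISSING sentinel and the elif structure are my assumptions, and the dunder is called explicitly where the keyword subscript syntax doesn't exist yet:

```python
MISSING = object()  # sentinel distinguishing "not supplied" from None

class Q:
    def __getitem__(self, key=MISSING, x=MISSING, y=MISSING):
        # Reconstruct x and y from whichever combination was supplied.
        if x is MISSING and y is MISSING:
            x, y = key          # q[1, 2]: both packed into the key tuple
        elif x is MISSING:
            x = key             # q[1, y=2]: the bare key is x
        elif y is MISSING:
            y = key             # the symmetric case with x named
        return (x, y)

q = Q()
assert q[1, 2] == (1, 2)
assert q.__getitem__(1, y=2) == (1, 2)    # models q[1, y=2]
assert q.__getitem__(x=1, y=2) == (1, 2)  # models q[y=2, x=1]
```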

For me, a satisfactory outcome from the current PEP process would be a new dunder, which I am calling __keyfn__, that has two possible values, namely None or True. (And of course, the associated syntax and semantics changes. And documentation changes. These are not minor matters.) As __keyfn__ has only two values, storing the choice in the class requires only a single bit (not byte, but bit). That's the memory cost. And the run-time performance cost would also be small. Ricky has given some examples. Here are more, all assuming __keyfn__ = True. First, this use of __keyfn__ would allow
Some further examples:

>>> d[1, 2]
>>> d.__getitem__(1, 2)

>>> d[(1, 2)]
>>> d.__getitem__((1, 2))

>>> d[a=1, b=2]
>>> d.__getitem__(a=1, b=2)

I find the above easy to understand and use. For Steven's proposal the calls to __getitem__ would be

>>> d[1, 2, z=3]
>>> d.__getitem__((1, 2), z=3)

>>> d[1, 2]
>>> d.__getitem__((1, 2))

>>> d[(1, 2)]  # Same result as d[1, 2]
>>> d.__getitem__((1, 2))  # From d[(1, 2)]

>>> d[a=1, b=2]
>>> d.__getitem__((), a=1, b=2)
I find these harder to understand and use, which is precisely the point Ricky made in his most recent post. That's because there's a clear and precise analogy between

>>> x(1, 2, a=3, b=4)
>>> x[1, 2, a=3, b=4]

I think it reasonable to argue that adding a single bit to every class is not worth the benefit it provides. However, this argument should be supported by evidence. (As indeed should the argument that it is worth the benefit.) I also think it reasonable to argue that now is not the time to allow __keyfn__ to have values other than None or True, and that allowing further values should require an additional PEP. I don't recall seeing an argument that Steven's proposal is as easy to understand and use as mine (with __keyfn__ == None). -- Jonathan

So __keyfn__ has literally only two allowed values, None and True. Can you explain (in words or with examples) what happens in each case? You were so excited to show off one case that you never showed how the other case would work (and you tripped up over the value of __keyfn__ that you were demonstrating). Could you just start over? And explain the name? And why it can't be False/True? On Sat, Aug 29, 2020 at 8:52 AM Jonathan Fine <jfine2358@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

Thank you, Guido, for your interest in this discussion. You wrote:
So __keyfn__ has literally only two allowed values, None and True.
That's my proposal, for now. I'm happy for further allowed values to be added via another PEP. In fact, that is my preference. You also wrote:
Could you just start over? And explain the name? And why it can't be False/True?
Consider

>>> d[1, 2] = 'val'
>>> d.__setitem__((1, 2), 'val')

Here I call (1, 2) the key. As in

def __setitem__(self, key, val):
    # set the value

The passage from

>>> d[1, 2] = 'val'

to

>>> d.__setitem__((1, 2), 'val')

goes via tuple. If tuple didn't already exist, we'd have to invent it. And so here tuple is what I call "the key function". The key function is the function that is the intermediary between

>>> d[EXPRESSION]
>>> d.__getitem__(key)

When there is a key function, the signatures are

__getitem__(key)
__delitem__(key)
__setitem__(key, val)

So we're allowed to have

class A:
    __keyfn__ = None
    def __setitem__(self, val, a, b, c, d):
        # set the value

Recall that dict has, implicitly, a way of getting a key from permitted arguments. The meaning of

class B:
    __keyfn__ = True
    def __getitem__(self, key):
        # Get the value

is that the key is produced by exactly the same method as in dict. The meaning of __keyfn__ = True is "produce a key from the arguments, in exactly the same way as in dict". Or in other words, "Yes, it's True. We do have a keyfn. Use the default, dict, keyfn." I think (None, True) works better than (False, True). This is because a further PEP might allow the user to supply a custom keyfn. So while it is at this time a binary choice, there might be further choices in future.
How about:

class A:
    __keyfn__ = None
    def __setitem__(self, val, x=0, y=0, z=0):
        print((val, x, y, z))

>>> a = A()
>>> a[1, z=2] = 'hello'
('hello', 1, 0, 2)

[Above copied from https://mail.python.org/archives/list/python-ideas@python.org/message/P3AW6G... ]

And also

class C:
    __keyfn__ = True
    def __setitem__(self, *argv, **kwargs):
        print(f'argv={argv} | kwargs={kwargs}')

>>> c = C()
>>> c[1] = 'val'
argv=(1, 'val') | kwargs={}
>>> c[1, 2] = 'val'
argv=((1, 2), 'val') | kwargs={}
>>> c[a=1] = 'val'
TypeError: __keyfn__ got unexpected keyword argument 'a'

[Above copied from https://mail.python.org/archives/list/python-ideas@python.org/message/RNQFT4... ]

I hope this helps. -- Jonathan
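Both examples can be emulated today with a small dispatch helper; the helper name is hypothetical, but the __keyfn__ semantics follow the description above (None means function-call style with the value first; True or absent means build a dict-style key):

```python
def subscript_set(obj, val, *args, **kwargs):
    # Hypothetical stand-in for what `obj[...] = val` would compile to.
    if getattr(type(obj), '__keyfn__', True):
        # Default / True: build a key exactly as dict-style subscription does.
        if kwargs:
            raise TypeError("__keyfn__ got unexpected keyword argument")
        key = args[0] if len(args) == 1 else args
        type(obj).__setitem__(obj, key, val)
    else:
        # __keyfn__ = None: function-call style, value first.
        type(obj).__setitem__(obj, val, *args, **kwargs)

class A:
    __keyfn__ = None
    def __setitem__(self, val, x=0, y=0, z=0):
        self.seen = (val, x, y, z)

a = A()
subscript_set(a, 'hello', 1, z=2)   # models a[1, z=2] = 'hello'
assert a.seen == ('hello', 1, 0, 2)

class C:
    __keyfn__ = True
    def __setitem__(self, *argv, **kwargs):
        self.seen = (argv, kwargs)

c = C()
subscript_set(c, 'val', 1, 2)       # models c[1, 2] = 'val'
assert c.seen == (((1, 2), 'val'), {})
```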

On Sat, Aug 29, 2020 at 11:04 AM Jonathan Fine <jfine2358@gmail.com> wrote: [snip]
IIUC in order to get these semantics under your proposed system, I should either leave __keyfn__ unset (for backward compatible behavior) or set it explicitly to True. Is that correct? The key function is the function that is the intermediary between
Okay, I am beginning to understand your proposal (despite vehemently disagreeing). You propose that setting __keyfn__ = None should change the signature of __setitem__ so that 1. the value is placed first (before the "key" values) 2. the rest of the arguments (whether positional or keywords) are passed the same way as for a function
Yes. I find it a big flaw that the signature of __setitem__ is so strongly influenced by the value of __keyfn__. For example, a static type checker (since PEP 484 I care deeply about those and they're popping up like mushrooms :-) would have to hard-code a special case for this, because there really is nothing else in Python where the signature of a dunder depends on the value of another dunder. And in case you don't care about static type checkers, I think it's the same for human readers. Whenever I see a __setitem__ function I must look everywhere else in the class (and in all its base classes) for a __keyfn__ before I can understand how the __setitem__ function's signature is mapped from the d[...] notation. Finally, I am unsure how you would deal with the difference between d[1] and d[1,], which must be preserved (for __keyfn__ = True or absent, for backwards compatibility). The bytecode compiler cannot assume to know the value of __keyfn__ (because d could be defined in another module or could be an instance of one of several classes defined in the current module). (I think this problem is also present in the __subscript__ version.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020 at 4:08 PM Guido van Rossum <guido@python.org> wrote:
This problem is actually also present in Steven's version (which just passes keyword args as **kwargs to `__getitem__` and `__setitem__`). We could treat d[1, a=3] either as d[1,] + kwargs or as d[1] + kwargs. Have people debated this yet? (It is not a problem in Jonathan's version for `__keyfn__ = None`, but since the proposal also has to support `__keyfn__ = True`, it is still a problem there.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020 at 04:31:53PM -0700, Guido van Rossum wrote:
Good catch! I don't think that anyone wants adding a keyword to a single-valued subscript to change it to a tuple. At least, I really hope that nobody wants this! So given the current behaviour:

obj[1]   # calls __getitem__(1)
obj[1,]  # calls __getitem__((1,))

I expect that the first will be the most common. If we add a keyword to the subscript:

obj[1, a=3]

I would expect that it turns into `__getitem__(1, a=3)`, which is almost surely what the reader and coder expects. It would be quite weird for the subscript 1 to turn into a tuple just because I add a keyword. That does leave the second case a little trickier to add a keyword to; it would require a pair of parens to disambiguate it from the above:

obj[(1,), a=3]

but I think that's likely to be obvious to the developer who is adding in the keyword where previously no keyword existed. -- Steve

On Sat, Aug 29, 2020 at 7:30 PM Steven D'Aprano <steve@pearwood.info> wrote:
That's a fair ruling. In general, when keywords are present, the rule that you can always omit an outermost pair of parentheses is no longer true. That is, d[(...)] and d[...] are always equivalent regardless what "..." stands for, as long as (...) is a valid expression (which it isn't if there are slices involved). Example: ``` d[1] ~~~ d[(1)] d[1,] ~~~ d[(1,)] d[1, 2] ~~~ d[(1, 2)] ``` But there is absolutely no such rule if keywords are present. FYI, Jonathan's post (once I "got" it) led me to a new way of reasoning about the various proposals (__keyfn__, __subscript__ and what I will keep calling "Steven's proposal") based on what the compiler and interpreter need to do to support this corner case. My tentative conclusion is that Steven's proposal is superior. But I have been reviewing my reasoning and pseudo-code a few times and I'm still not happy with it, so posting it will have to wait. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020 at 10:29 PM Guido van Rossum <guido@python.org> wrote:
I've spent some more time thinking about this, focused on two things: absolute backwards compatibility for d[1] vs. d[1,], and what should happen at the C level. There are both the type slots and API functions like PyObject_{Get,Set}Item to consider.

Interestingly, it seems Jonathan's proposal and __subscript__ are inspired by the desire to keep both type slots and PyObject_{Get,Set}Item unchanged, but have the disadvantage that it's hard to keep d[1] vs. d[1,] straight, and add a lot of complexity to the bytecode interpreter.

At this point my recommendation is to go with Steven's proposal, constructing a "key" from the positional args, taking d[1] vs. d[1,] into account for backwards compatibility, and passing keywords as a dict to new C API functions PyObject_GetItemEx and PyObject_SetItemEx. These functions then call the mp_subscript and mp_ass_subscript slots.

There will have to be a flag in the type object to declare whether the slot functions take a dict of keywords. This flag is off by default (for backwards compatibility) but on for type objects wrapping Python classes -- the dict of keywords will then be passed as **kwargs to __getitem__ or __setitem__ (the call will fail if the Python __getitem__ or __setitem__ doesn't take the specific keywords in the dict). If the flag in the type slot says "doesn't take dict of keywords" then passing a non-empty dict causes a type error. Ditto if there are keywords but the type slots are in tp_as_sequence rather than in tp_as_mapping. (The sequence slots only take a single int and seem to be mostly a legacy for sequences -- even slices go through tp_as_mapping.)

There is one final case -- PyObject_GetItem on a type object may call __class_getitem__. This is used for PEP 585 (list[int] etc.). In this case we should pass keywords along. This should be relatively straightforward (it's not a slot).
A quick summary of the proposal at the pure Python level: ``` d[1] -> d.__getitem__(1) d[1,] -> d.__getitem__((1,)) d[1, 2] -> d.__getitem__((1, 2)) d[a=3] -> d.__getitem__((), a=3) d[1, a=3] -> d.__getitem__((1,), a=3) d[1, 2, a=3] -> d.__getitem__((1, 2), a=3) d[1] = val -> d.__setitem__(1, val) d[1,] = val -> d.__setitem__((1,), val) d[1, 2] = val -> d.__setitem__((1, 2), val) d[a=3] = val -> d.__setitem__((), val, a=3) d[1, a=3] = val -> d.__setitem__((1,), val, a=3) d[1, 2, a=3] = val -> d.__setitem__((1, 2), val, a=3) ``` Do we want to support d[**kwargs]? It can be done, alternatively we could just ask the user to write the __getitem__/__setitem__ call explicitly. I think we should say no to d[*args], because that will just become d[(*args)], with awkward questions around what if args == (1,). Maybe then for consistency we needn't bother with **kwargs, though the case for that is definitely stronger. Sorry for telegraphing this -- I am way past my bedtime but looking at this from the C API POV definitely made some things clear to me. I'm probably posting this in the wrong thread -- I can't keep up (and GMail splits threads after 100 messages, which doesn't help). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

IMPORTANT CORRECTION! I was too eager to get to bed and made a mistake in the summary for the d[1, a=3] cases. The key here should be '1', not '(1,)'. On Sun, Aug 30, 2020 at 12:45 AM Guido van Rossum <guido@python.org> wrote:
SHOULD BE: d[1, a=3] -> d.__getitem__(1, a=3)
SHOULD BE: d[1, a=3] = val -> d.__setitem__(1, val, a=3)
d[1, 2, a=3] = val -> d.__setitem__((1, 2), val, a=3)
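Putting the summary together with this correction, a toy class can check the full mapping; the bracketed forms in the comments are what the proposal would compile to, and the dunder calls are written out explicitly since the syntax doesn't exist yet:

```python
class Demo:
    def __getitem__(self, key, **kwargs):
        return (key, kwargs)

d = Demo()
assert d.__getitem__(1) == (1, {})                        # d[1]
assert d.__getitem__((1,)) == ((1,), {})                  # d[1,]
assert d.__getitem__((1, 2)) == ((1, 2), {})              # d[1, 2]
assert d.__getitem__((), a=3) == ((), {'a': 3})           # d[a=3]
assert d.__getitem__(1, a=3) == (1, {'a': 3})             # d[1, a=3], per the correction
assert d.__getitem__((1, 2), a=3) == ((1, 2), {'a': 3})   # d[1, 2, a=3]
```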
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 30/08/20 7:45 pm, Guido van Rossum wrote:
I thought we usually discouraged directly calling dunders unless there's no alternative, because there is often extra processing in between the language syntax and the corresponding dunder that would get skipped. It wouldn't make a difference in this case given the implementation you describe, but I think it's just tidier to be able to avoid the direct call. We don't have to decide now, though -- it can be added later.
I think we should say no to d[*args], because that will just become d[(*args)],
Which is also equivalent to d[args]. But if we have d[**kwds] without d[*args] I expect there will forever be people asking how to do d[*args]. So maybe allow it but just ignore the *. -- Greg

On Sun, Aug 30, 2020 at 2:43 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't think so. People who are asking for that are probably not expecting what they will get. Dropping such an operator silently will probably cause more confusion than it resolves. IIRC there's a situation in C where you can call a function pointer using either `(fp)(args)` or `(*fp)(args)`, and that drives me nuts. If this question is asked a lot, a StackOverflow entry for it can be crafted to explain it once and for all. (We've done this for other things.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 31/08/20 4:23 pm, Guido van Rossum wrote:
I'm wondering whether parens should be required when there are both keyword args and more than one positional arg in an index. I.e. instead of

a[1, 2, k=3]

you would have to write

a[(1, 2), k=3]

That would make it clearer that the indexing syntax still really only takes one positional arg, and why, if you transform it into

idx = (1, 2)
a[idx, k=3]

you don't/can't write it as

a[*idx, k=3]

-- Greg

On Mon, Aug 31, 2020 at 12:12 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
takes one positional arg, I see the appeal here, but then: a[1, 2] would be legal, and suddenly a[1, 2, this=4] would not. If the keywords were optional (aren't they always?) then that would be pretty confusing. -CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython

On Mon, Aug 31, 2020, 3:10 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Yes. Please. We've already created a screwy situation now that people seem to agree kwd args is something we want here. I really don't want to see it get even more screwy by allowing syntax that looks so very much like a standard function call, except with a different bracket shape, and acts totally differently underneath. If we are not going to fix things to utilize functional-style arguments using a new dunder or dunders, and we're not willing to break anything now or in the future, this limitation seems very wise to me. And the limitation will also leave open the door to more easily allowing a functional-style option. Right now we have this problem to deal with:

f(1)
f(1,)
q[1]
q[1,]

...the f's and the q's can't be reconciled because History. We are creating a second problem, except worse:

f(1, 2, k=3)
f((1, 2), k=3)
q[1, 2, k=3]
q[(1, 2), k=3]

...these won't be able to be reconciled with each other if we go forward with allowing tuples to *poof* into existence regardless of whether there is a bunch of keywords hanging off the end in the subscript brackets. Would we ever in a million years have syntax like this? What could this mean?

a = [x, y, k=3]

I'd say no, syntax like this is just weird. But if it did exist, would you expect x, y to get packed into a tuple together...? Nobody would.

On Mon, Aug 31, 2020 at 12:09 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I think this would be horrible for the poor user who just wants to do something with an array data structure (e.g. something like xarray) using three dimensions, only one of which is named.
This would make things simpler to understand for the implementer, but it's bad for the user. I'm sure if you ask the xarray implementers they don't care *how* keywords work as long as their users can write a[1, 2, k=3]. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

A thought just occurred to me. If we hadn't got rid of tuple unpacking in argument lists, we would have been able to write

def __getitem__(self, (x, y, z), **kwds):
    ...

-- Greg
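For comparison, the Python 3 spelling of the same thing needs an explicit unpack in the body (a sketch; the function name is hypothetical):

```python
# What Python 2's tuple parameter did implicitly in the signature:
def getitem(self, key, **kwds):
    x, y, z = key  # manual unpacking, post PEP 3113
    return (x, y, z, kwds)

assert getitem(None, (1, 2, 3), k=4) == (1, 2, 3, {'k': 4})
```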

On 8/30/2020 7:25 AM, Steven D'Aprano wrote:
I was sad to see tuple arg unpacking go. But PEP 3113 doesn't mention parsability as a reason to remove it. And indeed, since it worked with the old parser, the PEG parser would presumably have no problem with it. PEP 3113 mentions introspection, which is why I grudgingly accept that it had to go. Maybe if someone solved that problem we could get them back. But that would be a major effort, separate from this proposal, and would of course require another PEP. Eric

On 31/08/20 3:37 am, Guido van Rossum wrote:
That can't be an absolute law. If a species loses a feature, it's because the environment has changed so as to make it no longer advantageous. If there is another environmental change that makes it advantageous again, I can't see why it couldn't come back. Maybe not exactly the same, but something very similar. Here we have a situation where the environment has changed. I'd like to propose bringing back something that is superficially similar, but with some differences.

def __getitem__(self, (i, j, k)):

The differences are:

1. There can only be one set of parens, and they must enclose all except the first positional argument.

2. Arguments within the parens can be specified by keyword.

Difference 2 is what makes this more than just syntactic sugar for taking one argument and unpacking it later, which was the reason for eliminating argument unpacking originally. If you're worried about people abusing this for purposes other than the one intended, another restriction could be added:

3. (Optional) It can only be used in functions named __getitem__, __setitem__ or __delitem__.

-- Greg

On Sun, Aug 30, 2020 at 4:58 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I'm sure it's not an absolute law. But it's an apt observation.
How would this even work? Can I write a.__getitem__((1, 2), k=3) and the function will see (i, j, k) == (1, 2, 3)? Okay, and if I write a.__getitem__((1, 3), k=2) will the function see the same thing? I've got the feeling you're pranking me here, and I'm falling for it hook, line and sinker.
Let's define a decorator or other helper to do this instead. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 31/08/20 4:11 pm, Guido van Rossum wrote:
Can I write a.__getitem__((1, 2), k=3) and the function will see (i, j, k) == (1, 2, 3)?
Yes.
Okay, and if I write a.__getitem__((1, 3), k=2) will the function see the same thing?
No, it will see (i, j, k) == (1, 3, 2). It's the same as if you were calling an ordinary function and passing the tuple using *.
I've got the feeling you're pranking me here, and I'm falling for it hook, line and sinker.
No, it was a serious suggestion. But if you don't like it, that's fine. -- Greg

On Sun, Aug 30, 2020 at 11:56 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That was a typo. I meant to ask whether `a.__getitem__((1, 3), j=2)` would see `(i, j, k) == (1, 2, 3)`. But it should probably be an error ("duplicate value for j").
No, it was a serious suggestion. But if you don't like it, that's fine.
Okay. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

I've written a decorator to go along with Guido's proposed implementation, to make it easier to write item dunders that take positional args that can also be specified by keyword.

#-----------------------------------------------
from inspect import signature

def positional_indexing(m):
    def f(self, args, **kwds):
        if isinstance(args, tuple) and len(args) != 1:
            return m(self, *args, **kwds)
        else:
            return m(self, args, **kwds)
    f.__name__ = m.__name__
    f.__signature__ = signature(m)
    return f
#-----------------------------------------------

Usage example:

#-----------------------------------------------
class Test:

    def __init__(self):
        self.data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    @positional_indexing
    def __getitem__(self, i, j):
        return self.data[i][j]

t = Test()
print(signature(t.__getitem__))
print(t[1, 2])
# Have to fake this for now until we get real keyword index syntax
print(t.__getitem__((), j = 2, i = 1))
#-----------------------------------------------

Output:

(i, j)
6
6

-- Greg

Guido van Rossum wrote:
I hope I understood correctly because Mailman eats the * signs for formatting, but is it possible (and desirable) to have a different behaviour for *args and index when there is only one positional value? Using "index" would keep the current behaviour: pass a tuple except when there is only one value; in that case the value is passed as-is. On the other hand if *args is used in the signature, it always gets the positional arguments in a tuple, whatever their number. It would avoid the classical isinstance(index, tuple) check. Here are some examples of what I mean:

# Usual signature
class Simple:
    def __getitem__(self, index):
        print(index)

simple = Simple()
simple[0]     # 0
simple[0, 1]  # (0, 1)

# This is valid python, but useless?
class Star:
    def __getitem__(self, *index):
        print(index)

star = Star()
star[0]     # (0,)
star[0, 1]  # ((0, 1),)

# I propose this breaking change
class NewStar:
    def __getitem__(self, *index):
        print(index)

star = NewStar()
star[0]     # (0,)
star[0, 1]  # (0, 1)

This is theoretically a breaking change, but who in his right mind would write such a Star class with the current python? Any thoughts?

You appear to be making a connection between star-args in a call and in a function definition. They are unrelated. The more I hear about this the more I favor not supporting it in the subscript syntax. On Sun, Aug 30, 2020 at 08:44 Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:

Guido van Rossum wrote:
You appear to be making a connection between star-args in a call and in a function definition. They are unrelated.
I initially thought that indexing was more function-like with a bit of magic to handle one or multiple arguments, and I was wondering if this "magic" can be changed. Reading the other mails, I understand now that it's just a single expression inside the brackets, and the tuple creation is just because of the commas. So my proposal is wrong, sorry for the noise. Joseph

On Sun, Aug 30, 2020 at 8:44 AM Joseph Martinot-Lagarde < contrebasse@gmail.com> wrote:
Sorry about that. Yes, where Mailman shows `args` and following text suddenly in italics, it ate a single `*`. Until the next `args` where it ended the italics, that was another `*args`.
This cannot be done, because the signature of the function definition is thoroughly hidden from the call site. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sun, Aug 30, 2020 at 8:44 AM Joseph Martinot-Lagarde < contrebasse@gmail.com> wrote:
I hope I understood correctly because Mailman eats the * signs for formatting,
I just realized that Mailman (or some other part of the email toolchain -- maybe GMail?) has apparently a handy (:-) feature to work around this. There's an attachment named "attachment.htm". If you click to download this and then open the downloaded file in the browser, it shows the original message in plaintext. Yeah, it's not ideal, but at least it proves that I did type what I meant. :-) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020, 11:42 AM Jonathan Fine <jfine2358@gmail.com> wrote:
This is obviously a different way of doing things than I have been explaining but it accomplishes nearly the same purpose in a simpler and probably cheaper way. Thank you Jonathan Fine. I would support this version of the idea implementation, too. Mine is far more complicated, as Paul Moore rightly pointed out.

On Sat, 29 Aug 2020 at 15:12, Ricky Teachey <ricky@teachey.org> wrote:
I was partway through writing a message outlining this very point. It's all well and good stating that named indices are an intended use case (as in the PEP), but in cases where named indices would be useful they presumably aren't currently being used (abuses of slice notation notwithstanding). As such, if a variant of PEP 472 were implemented, code that would benefit from named indices must still find a way to support 'anonymous' indices for backwards compatibility. I believe the code to implement that ought to be much more obvious (both to the author and to readers) if 'anonymous' and named indices were simply handled as positional and keyword arguments, rather than manually parsing and validating the allowable combinations of indices. That being said, it should be noted that even if there are no new dunders, you don't necessarily need to parse the subscripts manually. You could still take advantage of python's function-argument parsing via something resembling the following:

```python
def _parse_subscripts(self, /, x, y):
    return x, y

def __getitem__(self, item=MISSING, /, **kwargs):
    if item is MISSING:
        args = ()
    elif isinstance(item, tuple):
        args = item
    else:
        args = (item,)
    x, y = self._parse_subscripts(*args, **kwargs)
    return do_stuff(x, y)
```

However that's still not exactly obvious, as the 'true' signature has been moved away from `__getitem__`, to an arbitrarily named (non-dunder) method. A major difference between the above, and the case where we had one or more new dunders, is that of introspection: new dunders would mean that there would be a 'blessed' method whose signature exactly defines the accepted subscripts. That would be useful in terms of documentation and could be used to implement parameter completion within subscripts. --- Theoretically the signature could instead be defined in terms of `typing.overload`s, something like the following (assuming x & y are integers):

```python
@overload
def __getitem__(self, item: tuple[int, int], /): ...
@overload
def __getitem__(self, item: int, /, *, y: int): ...
@overload
def __getitem__(self, /, *, x: int, y: int): ...
def __getitem__(self, item=MISSING, /, **kwargs):
    # actual implementation, as above
    ...
```

However that is incredibly verbose compared to the signature of any new dunder, and would only grow worse with a greater number of keyword subscripts.

On 30/08/20 3:49 am, Adam Johnson wrote:
I think this could be done more simply as

    def __getitem__(self, index, **kwds):
        return self.real_getitem(*index, **kwds)

    def real_getitem(self, x, y):
        ...

The point about obscuring the signature still remains, though. Also, this is a hack that we would never be able to get rid of, whereas new dunders would provide a path to cleaning things up in the future. -- Greg
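Greg's two-method pattern can be fleshed out into a runnable sketch. The `Grid` class and its dict backing store are invented here, and a guard for the single-subscript case is added (his sketch glossed over it); keywords only become reachable through subscript syntax if a proposal like this one lands:

```python
class Grid:
    """Hypothetical 2-D container using the two-method delegation pattern."""
    def __init__(self):
        self._data = {}

    def __getitem__(self, index, **kwds):
        # The subscript arrives as one object; normalise to a tuple
        # so it can be unpacked into real positional parameters.
        if not isinstance(index, tuple):
            index = (index,)
        return self.real_getitem(*index, **kwds)

    def real_getitem(self, x, y=0):
        # The "true" signature lives here, as in Greg's sketch.
        return self._data.get((x, y))

g = Grid()
g._data[(1, 2)] = "hi"
assert g[1, 2] == "hi"
assert g[1] is None   # single subscript: y defaults to 0
```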

On Sat, Aug 29, 2020, at 20:14, Greg Ewing wrote:
The thing that bothers me with new dunders is - can it be done in such a way that *all* possible ways of calling it with only positional arguments behave the same as presently when the old one is defined, and an intuitive way with the new one, without requiring the calling bytecode to know which signature is going to be used?

__getitem__(self, arg) vs __getitem_ex__(self, *args, **kwargs)

    x[1]      old arg=1;      new args=(1,)
    x[1,]     old arg=(1,);   new args=(1,)?
    x[(1,)]   old arg=(1,);   new args=((1,),)?
    x[1,2]    old arg=(1,2);  new args=(1,2)
    x[(1,2)]  old arg=(1,2);  new args=((1,2),)?
    x[*a]     old arg=a;      new args=a?

Also, do we want to walk the MRO looking for both in turn, or look for the new one on the whole chain before then looking for the old one? Is there a precedent for two different ways to define a method on a single class? [getattr vs getattribute is not a case of this, and I say "single class" to exclude the complexity of binary operators]

Another concern is where a new setitem should put the value argument? I may have missed something mentioning it, but I don't think I've seen a proposal that goes into detail on that? Having the user define a __setitem__ that calls the real_setitem if needed gets around that by leaving the signature up to the user [where they can e.g. put it in a keyword arg with a name they know they won't need]
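The "old" column of these cases reflects today's behavior, which a probe class can confirm (the class is hypothetical, for illustration); note in particular that `x[1,]` and `x[(1,)]` are indistinguishable, as is a pre-built tuple in a variable:

```python
class Old:
    """Probe for the current ("old") single-argument protocol."""
    def __getitem__(self, arg):
        return arg

x = Old()
a = (1, 2)
assert x[1] == 1            # old: arg = 1
assert x[1,] == (1,)        # old: arg = (1,)
assert x[(1,)] == (1,)      # indistinguishable from x[1,]
assert x[1, 2] == (1, 2)    # old: arg = (1, 2)
assert x[(1, 2)] == (1, 2)  # parentheses make no difference
assert x[a] == (1, 2)       # a tuple in a variable: same again
```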

On Mon, Aug 31, 2020 at 1:38 PM Random832 <random832@fastmail.com> wrote:
Another concern is where a new setitem should put the value argument? i may have missed something mentioning it, but I don't think I've seen a proposal that goes into detail on that? Having the user define a __setitem__ that calls the real_setitem if needed gets around that by leaving the signature up to the user [where they can e.g. put it in a keyword arg with a name they know they won't need]
Maybe I'm misreading this, but my understanding is that what the proposal defines is how the dunder is called, not what its signature must be. The simplest and most obvious way to do it would be to have the interpreter pass the value positionally, and then keyword arguments separately. You can use whatever name you like for the value, and if you want, you can mandate that it be positional-only:

    def __setitem__(self, key, value, /, **kw):

and then you can accept any arguments you like by keyword, even "self", "key", or "value". This doesn't change the issue of tuplification, but it does mean that the issue is no worse for setitem than for getitem. ChrisA
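A runnable sketch of the signature Chris describes; the `Store` class and its storage scheme are invented, and since keywords cannot yet arrive through subscript syntax, the keyword cases call the dunder directly:

```python
class Store:
    """Sketch of a positional-only __setitem__ signature."""
    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value, /, **kw):
        # key and value are positional-only, so keyword names like
        # "key" or "value" land safely in **kw instead of colliding.
        self._data[(key, tuple(sorted(kw.items())))] = value

s = Store()
s[1] = "a"                                 # ordinary subscript assignment
s.__setitem__(2, "b", axis="x")            # simulating s[2, axis="x"] = "b"
s.__setitem__(3, "c", key="odd", value=9)  # no collision, thanks to "/"
assert s._data[(1, ())] == "a"
assert s._data[(2, (("axis", "x"),))] == "b"
```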

On Mon, Aug 31, 2020, at 00:26, Chris Angelico wrote:
Keep in mind that I am responding to a post that seems to call for new dunder methods that are passed multiple positional "key" arguments instead of a single one. Passing the value last [i.e. in between the passed-as-positional and passed-as-keyword arguments] seems like a non-starter:

    a[1, 2] = 3:    f(self, 1, 2, 3)
    a[1, y=2] = 3:  f(self, 1, 3, y=2)
    a[2, x=1] = 3:  f(self, 2, 3, x=1)

There's no possible function signature that can reasonably deal with those. Essentially, by being a named argument that is not an intended keyword argument, value acts like / in the argument list, preventing any arguments before it from being passed in as keywords without causing errors. But since it is passed in positionally, it also prevents any arguments after it from being passed in as positional at all. Passing the value *first* might be reasonable (it may be the only viable way), but it would be a change from how it's currently done, and I do think this needs to be discussed for any proposal around passing in multiple positional keys. Passing the value in as a keyword called __value__ might be another possible way.
This doesn't change the issue of tuplification, but it does mean that the issue is no worse for setitem than for getitem.

On Sun, Aug 30, 2020 at 8:36 PM Random832 <random832@fastmail.com> wrote:
Probably not. (And knowing the signature is impossible -- there are too many layers of C code between the bytecode and the function object being called.) This is why I ended up with the simplest proposal possible -- keyword args get added to the end of `__getitem__` and `__setitem__`, ensuring that `d[1, k=3]` is like `d[1]` + keyword, not like `d[1,]` + keyword. -- --Guido van Rossum (python.org/~guido)

On Mon, Aug 31, 2020, at 00:28, Guido van Rossum wrote:
Perhaps, but I did have a thought after making that post. A new bytecode operation (we'll need one anyway, right?) which, in addition to passing in the positionals and the keywords, also passes along the information of whether or not the subscript contents consisted of precisely a single expression without a trailing comma (or with one, if that'd work better... I believe flagging either one of these cases provides enough information to determine what form of argument should be passed to the single-argument __getitem__).

On Sun, Aug 30, 2020 at 21:58 Random832 <random832@fastmail.com> wrote:
A new bytecode operation (we'll need one anyway, right?)
Yes. which, in addition to passing in the positionals and the keywords, also
Sure, but that flag would have to be passed through all the C API layers until it reaches the tp slots. In particular it would uglify PyObject_[GS]etItemEx. -- --Guido (mobile)

On Sun, Aug 30, 2020 at 10:01 PM Random832 <random832@fastmail.com> wrote:
I *think* the trailing comma is shorthand for a larger class of problems. That is, in the current system, you can only put a single expression in the [], so a comma creates a tuple. Which means that:

    i = (a,)
    thing[i] = x

is the same as:

    thing[a,] = x

and

    i = (a, b, c)
    thing[i] = x

is the same as:

    thing[a, b, c] = x

etc. And I don't think, when the interpreter sees a comma inside a [], there is any way to know whether the user explicitly wanted a tuple, or wanted to express multiple indexes. I suppose we could assume that a single comma was not intended to be a tuple, but maybe it was? The real challenge here is that I suspect most users don't realize that when you do, e.g. arr[i, j] in numpy, you are passing a tuple of two indexes, rather than two separate values. Conversely, when you do, e.g. a_dict[(a,b,c)], you could have omitted the brackets and gotten the same result. But in current Python, it doesn't matter that folks don't realize that, as you can in fact only put one expression in there, and the tuple creation ends up being an implementation detail for most users. But once we add keyword arguments, then it will look a LOT more like a function call, and then folks will expect it to behave like a function call, where thing(a, b) and thing((a,b)) are not, in fact, the same -- but I don't think there is any way to meet that expectation without breaking existing code. However, if we keep it like it is, only adding keywords, but not changing anything about how the "positional" arguments are handled, then we have backward compatibility, and naive users will only notice (maybe) that you can't do *args. Which is not that big a deal -- all you have to say, really, is that tuple unpacking isn't allowed in indexing. Done. Anyone that wants to know *why* not can do more research. Key point: as a rule, classes that handle keyword indexes (e.g. xarray and the like) will be written by a small number of people, and used by many more -- and the authors of such classes are by definition more experienced in general. So the priority has to be to keep things as simple as possible for users of such classes. -CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
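The tuple equivalence Christopher describes is easy to verify with a plain dict: the brackets hold one expression, so a pre-built tuple and comma-separated subscripts produce the same key.

```python
d = {}
i = (1, 2, 3)
d[i] = "x"
assert d[1, 2, 3] == "x"    # same key: the commas built the same tuple
assert d[(1, 2, 3)] == "x"  # explicit parentheses change nothing
assert (1, 2, 3) in d and len(d) == 1
```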

On Sun, Aug 30, 2020 at 11:33 PM Christopher Barker <pythonchb@gmail.com> wrote:
It's not *quite* so simple (though almost so). The parser still treats it specially, because the slice notation `a:b`, `a:b:c`, (and degenerate forms like `:` or `::`) are only allowed at the top level. That is, `d[::]` is syntactically valid, but `d[(::)]` is not. Try it.
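Guido's "try it" can be reproduced without a REPL by handing both forms to `compile`:

```python
# Slice notation is accepted by the parser at the top level of a subscript...
compile("d[::]", "<test>", "eval")
compile("d[1:2, 3:4]", "<test>", "eval")

# ...but not inside parentheses, where an ordinary expression is required.
try:
    compile("d[(::)]", "<test>", "eval")
except SyntaxError:
    parenthesised_slice_ok = False
else:
    parenthesised_slice_ok = True
assert parenthesised_slice_ok is False
```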
In the current system nobody has ever had to think about that. A single comma most definitely makes a tuple.
These are all mindgames. What your intention is depends on what kind of data structure it is (e.g dicts have a one-dimensional key set, but keys may be tuples, whereas numpy arrays have any number of dimensions, but each dimension is numeric). How what you write is interpreted is entirely up to the `__getitem__` and `__setitem__` implementation of the data structure you are using. Those methods must be written to handle edge cases according to their intended model (I'm sure there's lots of code in e.g. numpy and Pandas to handle the special case where the "key" argument is not a tuple.) The protocol is what it is and you can use it in different ways, as long as you follow the protocol.
(Technically not so, because of the slices. Note that dict doesn't even check for slices -- they fail because they're unhashable. But you can write `d[...] = 1` and now you have `Ellipsis` as a key.) But once we add keyword arguments, then it will look a LOT more
I think it's up to the implementers of extended `__getitem__` and `__setitem__` methods to set the right expectations **for their specific data type**.
Right.
Which leads to what design choices? I think we need to watch out that we're not trying to make `a[1, 2, k=3]` look like a function call with funny brackets. It is a subscript operation with keyword parameters. But it is still first and foremost a subscript operation. And yeah, backwards compatibility is a b****. -- --Guido van Rossum (python.org/~guido)

On Mon, Aug 31, 2020 at 11:38 AM Guido van Rossum <guido@python.org> wrote:
Omg. This is a huge problem that I didn't even consider when I wrote my previous reply asking for naked, comma-separated subscript arguments to be disallowed... thanks GvR, for surfacing that one! Yick. It looks like we'd have to snowball many more syntax changes to disallow that. I think I still find myself in the camp of asking for either 3 new dunders or a single new dunder. But it doesn't appear that is the direction things are going. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 31/08/20 3:35 pm, Random832 wrote:
x[(1,)] old arg=(1,); new args=((1,),)? x[(1,2)] old arg=(1,2); new args=((1,2),)?
No, I proposed *not* to do those -- putting parens around the arguments would continue to make no difference, regardless of which dunder was being called. As for walking the MRO for both dunders in parallel: that would be weird, unprecedented and probably not necessary.
Another concern is where a new setitem should put the value argument?
My solution is to put it before the index arguments, e.g. def __setindex__(self, value, i, j, k):
If you're worried about people doing things like

    a[1, 2, 3, value=4] = 5

I'm not sure that's really a problem -- usually it will result in an exception due to specifying more than one value for a parameter. If you're creating a class that needs to be able to take arbitrary keyword indexes, you can use a positional-only parameter for the value:

    def __setindex__(self, value, /, *args, **kwds):

-- Greg
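A sketch of how Greg's value-first `__setindex__` could be driven from today's protocol; the `Tensor` class is invented, and the forwarding `__setitem__` stands in for the interpreter support the proposal would add (keywords are exercised by calling the dunder directly, since subscript syntax cannot yet deliver them):

```python
class Tensor:
    """Hypothetical container illustrating a value-first set dunder."""
    def __init__(self):
        self._data = {}

    # Existing protocol: translate to the proposed value-first dunder.
    def __setitem__(self, index, value):
        if not isinstance(index, tuple):
            index = (index,)
        self.__setindex__(value, *index)

    # Greg's proposed signature: value first (positional-only), then indexes.
    def __setindex__(self, value, /, *args, **kwds):
        self._data[(args, tuple(sorted(kwds.items())))] = value

t = Tensor()
t[1, 2, 3] = 5                     # positional indexes only
t.__setindex__(7, 1, 2, axis="x")  # simulating t[1, 2, axis="x"] = 7
assert t._data[((1, 2, 3), ())] == 5
assert t._data[((1, 2), (("axis", "x"),))] == 7
```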

On Mon, Aug 31, 2020, at 02:45, Greg Ewing wrote:
What about passing in a tuple object that's in a variable?

    a = 1, 2
    x[a]

Should args be ((1,2),) or (1, 2)? Having x[a] be different from x[(1,2)] would be *bizarre*, but having it result in args=(1,2) would be keeping almost as much baggage from the current paradigm as not having a new dunder at all. I think that's why I assumed as a matter of course that a new dunder meant a tuple argument would unambiguously become a single argument.

On Sat, 29 Aug 2020 at 15:12, Ricky Teachey <ricky@teachey.org> wrote:
But you don't give any reason why you'd want to do that. Why are you using subscript notation rather than a simple function call? This is just another variation on the "it would be nice if..." argument, which has been covered over and over again. What is the use case here, and why is the proposed solution more compelling than anything that can already be done in Python? Not simply compelling in the sense of "it looks nice", but how does subscript notation map to the problem domain, and how would all the variations you say would "just work" be meaningful in terms of the real-world problem? The question isn't whether named arguments are useful - function calls demonstrate that perfectly well. The question is why are they needed *for subscripting*, and what do they *mean* when used in applications where subscripts are the natural way to express the problem logic? Paul

Paul Moore wrote:
But you don't give any reason why you'd want to do that. Why are you using subscript notation rather than a simple function call?
Good point. Consider

    >>> def f(*argv): pass
    >>> d = dict()

Now compare

    >>> f(1, 2) = 3
    SyntaxError: can't assign to function call
    >>> d[1, 2] = 3
    >>> d[1, 2]
    3

Item assignment (ie __setitem__) is the one thing that a function call can't do. If we want keywords in our __getitem__ (and related) operations, then one route for item assignment is to allow

    >>> d[1, 2, a=3, b=4] = 5

as valid syntax. By the way, another route is to use a simple function call, like so

    >>> d[o(1, 2, a=3, b=4)] = 5

which is already possible today. Some of us don't like this route. -- Jonathan
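The function-call route is expressible today with a small hashable key class. This standalone `K`/`o` pair is illustrative only and is not claimed to match the kwkey package's actual API:

```python
class K:
    """A hashable key bundling positional and keyword subscripts."""
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = tuple(sorted(kwargs.items()))

    def __hash__(self):
        return hash((self.args, self.kwargs))

    def __eq__(self, other):
        return (isinstance(other, K)
                and (self.args, self.kwargs) == (other.args, other.kwargs))

def o(*args, **kwargs):
    return K(*args, **kwargs)

d = {}
d[o(1, 2, a=3, b=4)] = 5  # valid syntax today, no language change needed
assert d[o(1, 2, a=3, b=4)] == 5
assert o(1, 2, a=3) != o(1, 2, a=4)
```

Crucially, because the key is an ordinary expression, assignment and deletion work for free, which plain function calls cannot offer.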

On Sat, 29 Aug 2020 at 18:04, Jonathan Fine <jfine2358@gmail.com> wrote:
Obviously. As it says, you can't assign to a function call.
Again, obvious. But you still haven't given any reason why we would want to do that. No-one's arguing that these things aren't possible, or that the proposals can't be implemented. What I'm asking, and you aren't answering, is what is the use case? When, in real world code, would this be used? Also, by being this abstract, you've ended up arguing without any context. If I say "yes, d[1, 2, a=3, b=4] = 5" might be useful, you've not got anywhere, because all of the proposals being discussed are about allowing this, so even if you *do* get agreement on this point, you're barely moving the discussion forward at all.
Exactly. Pick a proposed solution and find use cases and arguments for it. Don't argue for ideas that are so abstract that they gloss over all the details that differentiate between the various implementation options. Paul

Paul Moore wrote: Again, obvious. But you still haven't given any reason why we would want to
The canonical answer to this question is https://www.python.org/dev/peps/pep-0472/#use-cases I'd like the examples there to be expanded and improved upon. To help support this I've created https://pypi.org/project/kwkey/, which emulates in today's Python the proposed enhancement. I'd like us to write code now that can use the enhancement, if and when it arrives. By the way, thank you Paul and your colleagues for your work on creating and developing pip. It's one of our most important tools. I'm all for having real-world use cases, and where possible working solutions, as a key part of the discussion. I admit that there has been a past deficiency here. I hope that we will correct this soon. For my part, I'm working on an enhanced kwkey, which will provide a wider range of emulation. -- Jonathan

On Sat, Aug 29, 2020, 2:24 PM Paul Moore <p.f.moore@gmail.com> wrote:
I'd like to use syntax like this for defining named mathematical functions:
Why? Because it looks like handwritten math. Far more pleasant for my colleagues to read than lambda expressions and def statements. I can already do most of this now of course (in fact, I have). But it'll be far far easier to write, read, and maintain the supporting code with function-like argument parsing in subscripts.

On Sat, Aug 29, 2020 at 07:22:52PM +0100, Paul Moore wrote:
See the PEP: https://www.python.org/dev/peps/pep-0472/ although I think we can eliminate a couple of the use-cases from contention, such as changes to built-in dicts and lists. For example, consider the case where we might have named axes (I think it is Todd who wants this for xarray). We want to be able to use the same notation for getters, setters and deleters. I don't know the axis names xarray uses, so I'm going to make them up. If you can use subscripting to get an axis:

    obj[7:87:3, axis='widdershins']

then you ought to be able to use subscripting to set or delete an axis:

    obj[15::9, axis='hubwise']
    del obj[2:5, axis='turnwise']

(Users of xarray may be able to suggest some more realistic examples.) I expect pandas could use this too. I would love to be able to do this in a matrix:

    # standard element access notation
    matrix[2, 3] = 2.5

    # delete an entire column
    del matrix[column=4]

    # and replace an entire row
    matrix[row=3] = [1.5, 2.5, 3.5, 4.5, 6.5]

Of course we could use named getter, setter and deleter methods for this, but subscript syntax is a succinct notation which comes very close to the standard notation used in the relevant domains. The bottom line here is that if your operations come in sets of three, for getting, setting and deleting a specific sub-element (a row, a column, item, key, etc.) then subscript syntax is a more natural notation than separate getter/setter/deleter methods. Even if your object is immutable and so only the getter is defined, it's still more natural to use subscript notation. -- Steve
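Steven's "sets of three" point can be illustrated with a minimal dict-backed sparse matrix (invented for this example). Only the tuple-subscript forms below are valid syntax today; `matrix[row=3]` and friends remain the proposal:

```python
class SparseMatrix:
    """Sparse 2-D matrix keyed by (row, column) tuples; missing cells read 0.0."""
    def __init__(self):
        self._cells = {}

    def __getitem__(self, key):
        row, col = key
        return self._cells.get((row, col), 0.0)

    def __setitem__(self, key, value):
        row, col = key
        self._cells[(row, col)] = value

    def __delitem__(self, key):
        row, col = key
        del self._cells[(row, col)]

m = SparseMatrix()
m[2, 3] = 2.5          # one notation serves all three operations...
assert m[2, 3] == 2.5
del m[2, 3]
assert m[2, 3] == 0.0  # ...which named getters/setters/deleters would split up
```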

On Sat, Aug 29, 2020, at 10:12, Ricky Teachey wrote:
[side note: I don't believe the behavior suggested by this last "if y is MISSING:" clause is supported by your proposal either. It's certainly not supported by function calls. Are you suggesting q[x=1, 2], or q[2, x=1] to be equivalent to y=2?]

    def _getitem(self, x=..., y=...):
        ...

    def __getitem__(self, arg, /, **kwargs):
        if isinstance(arg, tuple):
            return self._getitem(*arg, **kwargs)
        else:
            return self._getitem(arg, **kwargs)

or, as a more compact if slightly less efficient version

    def __getitem__(self, arg, /, **kwargs):
        return self._getitem(*(arg if isinstance(arg, tuple) else (arg,)), **kwargs)

This is only a few lines [which would have to be duplicated for set/del, but still], you're free to implement it on your own if necessary for your use case. My code assumes that no positional arguments results in an empty tuple being passed - it would be slightly more complex but still manageable if something else is used instead.

On Sat, Aug 29, 2020 at 12:50:15AM -0400, Ricky Teachey wrote:
Also there are so many different proposals being thrown out at the same time, by different people, that it is difficult to keep track of what is what.
Coders often say that a line of code says more than ten lines of explanation, and often that's true, but sometimes, especially when dealing with a new proposal for functionality that doesn't actually exist yet, a line of explanation is better than a hundred lines of code :-) [...]
[... snip Greg's suggestion ...]
But the problem is that this is a change in behaviour, which is not backwards compatible and will break old code. You might be happy with writing `a[(1,)]` to get a tuple but there is almost certainly a ton of existing code that is already writing `a[1,]` to get a tuple, and I don't think you are volunteering to go out and fix it all :-) So we have some choices:

(1) No breaking old code at all. In that case your proposal is dead in the water, or we need a complicated (maybe fragile?) technical solution that fixes it, as suggested by Greg. (Assuming it is practical.) The more complicated the fix, the less attractive the proposal becomes.

(2) We're okay with breaking code but need a future directive. In which case, why bother with this signature dunder? It's no longer necessary.

(3) It's only a *little* breakage, so just go ahead and make the change and don't worry about breaking old classes. I would expect that the Steering Council would only agree if this little breakage was balanced by a *large* benefit, and I'm not seeing the large benefit given that we've heard from a numpy developer that the status quo re positional arguments is no burden, and I don't think anyone from the xarray or pandas projects have disagreed.

So it seems to me that this "change the parsing of commas" is being driven by people who aren't actually affected by the change, and the people who are affected are saying "we don't need this". (Which is very different from the original part of the proposal, namely adding *keywords*.)
Yes, you're right, sorry.
So this preprocessor is a do-nothing identity function that does nothing and doesn't actually get called by anything or even exist. I don't think it is a helpful argument to invent non-existent pre- processors that do nothing. [...]
So if I currently do this:

    my_tuple = (1,)
    obj[my_tuple]

your updated preprocessor will pass just 1, not the tuple, on to the item getter. That's not the current behaviour.
If we decide that this change in behaviour is required, we define a new future directive that simply changes the way subscripts are parsed. People who want to use the new behaviour include that future directive at the top of their module to get the new behaviour. Those who don't, don't, and nothing changes. Eventually the change in behaviour becomes standard, but that could be put off for two or four releases. https://docs.python.org/3/reference/simple_stmts.html#future https://docs.python.org/3/library/__future__.html -- Steve

On 29/08/20 2:07 pm, Steven D'Aprano wrote:
I don't think that would help, at least not on its own. The style of subscript argument passing required depends on the object being indexed, not the module it's being done from. There could be a future import that just makes a[1,] mean a[1] instead of a[(1,)], but how useful would that really be? How often have you wanted to put a trailing comma on your indexes and have it do nothing? -- Greg

Another idea: Instead of new dunders, have a class attribute that flags the existing ones as taking positional and keyword arguments.

    class ILikeMyIndexesPositional:
        __newstyleindexing__ = True
        def __getitem__(self, i, j, spam=42):
            ...

Advantages: No new dunders or type slots required. Disadvantages: Code that expects to be able to delegate to another object's item dunder methods could break. There are two ways that existing code might perform such delegation:

    def __getitem__(self, index):
        return other_object[index]

    def __getitem__(self, index):
        return other_object.__getitem__(index)

Neither of these will work if index is a tuple and other_object takes new-style indexes, because it will get passed as a single argument instead of being unpacked into positional arguments. The only reliable way to perform such delegation would be

    __newstyleindexing__ = True
    def __getitem__(self, *args, **kwds):
        return other_object[*args, **kwds]

but that requires the delegating object to be aware of new-style indexing. So having thought it through, I'm actually anti-proposing this solution. -- Greg

On Wed, Aug 26, 2020 at 9:44 PM Jonathan Fine <jfine2358@gmail.com> wrote:
-1. We have already had way WAY too much debate about the alternative ways to handle kwargs in subscripts, and quite frankly, I don't think this is adding anything new to the discussion. Can we just stop bikeshedding this already and let this matter settle down? You still aren't showing any advantage over just allowing the current dunder to receive kwargs. ChrisA

Personally I think Jonathan and I (and possibly a couple others) should form a separate subgroup and come up with a sensible and logical set of options in a proto-PEP. The topic has been discussed and we have plenty of ideas and opinions, and if we want to achieve something coherent we need to take it aside and dig deeper the various options until the whole proposal (or set of proposals) has been considered fully. There's little sense in going back and forth in the mailing list. One thing I want to understand though, and it's clear that this is a potential dealbreaker: is there any chance that the steering council will actually accept such a feature once it's been fully fleshed out, considering that it's undeniable that there are use cases spanning through multiple fields? Because if not (whatever the reason might be) regardless of the possible options to implement it, it's clear that there's no point in exploring it further. On Wed, 26 Aug 2020 at 12:58, Chris Angelico <rosuav@gmail.com> wrote:
-- Kind regards, Stefano Borini

On Thu, Aug 27, 2020 at 8:20 AM Stefano Borini <stefano.borini@gmail.com> wrote:
Another way to word that question is: Do you have a sponsor for your PEP? Every PEP needs a core dev (or closely connected person) to sponsor it, otherwise it won't go forward. Do you have any core devs who are on side enough to put their name down as sponsor? If not, it's not even going to get as far as the Steering Council. ChrisA

I already sent a mail to D'Aprano and he said (please Steven correct me if I am wrong) that basically PEP 472 as-is is not acceptable (and I agree) without modifications. But in my opinion the argument is circular. Who is willing to sponsor a PEP that doesn't exist yet? It's like putting a signature on a blank piece of paper. So the question is: is a core developer willing to sponsor the _idea_ (not the PEP, as it doesn't exist yet)? On Thu, 27 Aug 2020 at 00:33, Chris Angelico <rosuav@gmail.com> wrote:
-- Kind regards, Stefano Borini

On Thu, Aug 27, 2020 at 5:26 PM Stefano Borini <stefano.borini@gmail.com> wrote:
So the question is: is a core developer willing to sponsor the _idea_ (not the PEP, as it doesn't exist yet)?
It comes to the same thing (it's really just a matter of terminology). Before the PEP can be created, it needs a sponsor, which means someone needs to be willing to support the idea before you write the PEP. It's called "sponsoring the PEP" but the PEP won't exist without sponsorship. ChrisA

On Thu, Aug 27, 2020 at 08:25:45AM +0100, Stefano Borini wrote:
What I said was that *I* "cannot support any of the strategies currently in the PEP". Personally I believe that the Steering Council is unlikely to agree to the PEP in its current form, but I don't speak for them, that's merely my prediction.
I expect that you don't need a sponsor. The PEP already exists; one of the current authors (you) wishes to resurrect it. I shall ask on Python-Dev and see what they say. -- Steve

Please don't hijack an existing PEP for an unrelated issue. PEP 472 is an existing PEP for "Support for indexing with keyword arguments". https://www.python.org/dev/peps/pep-0472/ If you want to write a competing PEP, see PEP 1 for the steps you must follow: https://www.python.org/dev/peps/pep-0001 -- Steve

On Wed, Aug 26, 2020 at 7:58 AM Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Aug 26, 2020, 8:02 AM Steven D'Aprano <steve@pearwood.info> wrote:
I think Jonathan included the name of pep 472 in the subject line just because it's related to the previous conversations that have been going on, not to hijack the PEP for a different purpose. I understand that Steve has said he is very grumpy about all of this hijacking, derailing, bikeshedding, or whatever other metaphors are appropriate for the alternative ideas that are being proposed for future functionality that he cares about and wants to use. Perhaps this applies to Chris and others as well. But I have to say that I think this latest is a fantastic idea, and when Jonathan presented it to me it was very surprising that I had not seen it presented by anyone else yet. I think it solves a ton of problems, adds a huge amount of flexibility and functionality, and as such has the potential to be very powerful. There are at least three substantive objections I see to this:

1. It will slow down subscripting
2. It adds complexity
3. It actually doesn't add anything helpful/useful

*Objection 1: Slowing Things Down*

The INTENDED EFFECT of the changes to internals will be as Jonathan Fine described: every time a subscript operation occurs, this new dunder attribute gets investigated on the class, and if it is not present then the default key translation function is executed. If things were implemented in exactly that way, obviously it would slow everything down a lot. Every subscripting operation gets slowed down everywhere and that's probably not an option. However, for actual implementation, there's no reason to do that. Instead we could wrap the existing item dunder methods automatically at class creation time only when the new dunder attribute is present. If it is not present, nothing happens. In other words, what has been described as the "default key or index translation function" already exists. Let's not change that at all for classes that do not choose to use it.
This would have the same effect as the proposal, but it would avoid any slowdown when a class has not provided the new attribute.

Also note that this can coexist alongside Steve's and Chris's preferred solution, which is to just add kwarg passing to the item dunders. That change would constitute an adjustment to the "default key or index translation function", which is sort of like this (from Jonathan's first message):

    def internal_get_function(self, *argv, **kwargs):
        if kwargs:
            raise SyntaxError
        if len(argv) == 1:
            key = argv[0]
        else:
            key = argv
        type(self).__getitem__(self, key)

*Objection 2: Adds complexity*

This is hard to argue with. It certainly does add complexity. The question is whether the complexity is worth the added benefits. But in the interest of arguing that the additional complexity is not THAT onerous, I will point out that in order to add flexibility, complexity is nearly always necessitated.

*Objection 3: Doesn't add anything useful/helpful*

This objection seems obviously false. With the proposal, the language would support any function desired to turn the "stuff" inside a subscripting operation into the item dunder calls. For example: if this proposal were already in place and PEP 472 were to continue to be held up because of terrorists like me ;) *, one could have written this translation function and PEP-472-ified their classes already:

    def yay_kwargs(self, *args, **kwargs):
        return self.__getitem__(args, **kwargs)

But another person, who doesn't like PEP 472, could do something else:

    def boo_kwargs(self, *args, **kwargs):
        if kwargs:
            raise SyntaxError("I prefer to live in the past.")
        return self.__getitem__(args)

* I'm trying to be light; I'm really not offended by Steve's or Chris's comments, and I really am earnestly trying to offer something helpful here, not hold things up for people. I care about adding kwargs to the subscript operator, too.
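To make the proposed dispatch concrete, here is a minimal runnable sketch. The helper name `subscript_get` and the dunder name `__subscript_translate__` are invented for illustration only; they are not part of any proposal text, and the real mechanism would live inside the interpreter rather than in a Python helper:

```python
# Hypothetical stand-in for the interpreter's internal_get_function,
# consulting an (invented) translator dunder on the class.
def subscript_get(obj, *args, **kwargs):
    translator = getattr(type(obj), "__subscript_translate__", None)
    if translator is not None:
        # The class opted in: it controls how arguments reach __getitem__.
        return translator(obj, *args, **kwargs)
    # Otherwise, today's default translation: no keywords allowed,
    # multiple positionals packed into a tuple.
    if kwargs:
        raise TypeError("keyword subscripts not supported")
    key = args[0] if len(args) == 1 else args
    return type(obj).__getitem__(obj, key)

class OptedIn:
    def __subscript_translate__(self, *args, **kwargs):
        # "yay_kwargs" behaviour: pass keywords straight through.
        return self.__getitem__(args, **kwargs)
    def __getitem__(self, key, **kwargs):
        return key, kwargs

class Legacy:
    def __getitem__(self, key):
        return key

print(subscript_get(OptedIn(), 1, 2, a=3))  # ((1, 2), {'a': 3})
print(subscript_get(Legacy(), 1, 2))        # (1, 2)
```

A class with no translator keeps exactly today's behaviour, including rejecting keywords, which is the "no slowdown, no change" case argued for above.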

On Wed, Aug 26, 2020 at 10:10 AM Ricky Teachey <ricky@teachey.org> wrote:
After reading Steve's response to me in the other thread, where he says this: On Wed, Aug 26, 2020 at 9:48 AM Steven D'Aprano <steve@pearwood.info> wrote:
...I am less optimistic that this can be implemented without slowing things down, and wrapping the methods at class creation time may not be a good idea after all.

I am a little crestfallen now, because I still think the core of the idea is wonderful. Is there another clever way this could be implemented, to avoid the slowdown of existing code? --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Wed, 26 Aug 2020 at 15:12, Ricky Teachey <ricky@teachey.org> wrote:
That would mean the effect was to disallow runtime monkeypatching -- the new dunder is *only* effective if added at class creation time, not if it's added later. You may not care about this, but it is a very different behaviour than any other dunder method Python supports -- so quite apart from the problems people would have learning and remembering that this is a special case, you have to document *in your proposal* that you intend to allow this. And other people *do* care about disallowing dynamic features like monkeypatching.

I do understand that Python's extreme dynamism is both disconcerting and frustrating when it makes proposals like this harder to implement. But conversely, that same dynamism is what makes tools like pytest possible, so we all benefit from it, even if we don't directly go around monkeypatching classes at runtime.

If you think you can implement this proposal without blocking Python's dynamic features, and without introducing a performance impact, I'd say go for it -- provide an example implementation, and that would clearly show people concerned about performance that it's not an issue. But personally, I'm not convinced that's possible without adding constraints that *aren't* currently included in the proposal.
Objection 3: Doesn't add anything useful/helpful
This objection seems obviously false.
Hardly. What are the use cases? It's "obviously false" to state that the proposal doesn't add anything at all, true. But the question is whether the addition is *useful* or *helpful*. And not just to one individual who thinks "this would be cool", but to *enough* people, whose code would be improved, to justify the cost of adding the feature. You yourself conceded that the feature adds complexity, this is where you get to explain why that cost is justified. "It's obviously helpful" isn't much of a justification. Paul

On Wed, Aug 26, 2020 at 10:40 AM Paul Moore <p.f.moore@gmail.com> wrote:
I agree with you. Let's preserve the dynamism. I abandon that implementation idea.
Yup. That is going to be a hard one. Hopefully others smarter than me can help think about how we could do it. But I for sure see the problem.
Well I did give a couple examples: language supported opt-in for pep-472-ification, and pep-472 opt-out. But Ok, more are needed. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Wed, Aug 26, 2020 at 11:06:26AM -0400, Ricky Teachey wrote:
Well I did give a couple examples: language supported opt-in for pep-472-ification, and pep-472 opt-out. But Ok, more are needed.
That's one example, not a couple: the ability to choose whether to opt-in or opt-out of keyword subscripts. Except you can't really. Either the interpreter allows keyword syntax in subscripts, or it doesn't. You can't opt-in if the interpreter doesn't support it, and if the interpreter does support it, you can't prevent it from being supported. Of course you can choose whether to use keywords in your own classes, but we can already do that the same way we opt out of any other feature: just don't use it! (I expect that the built-ins list, tuple, dict, str etc will all do exactly that.) -- Steve

On Wed, Aug 26, 2020 at 10:10:34AM -0400, Ricky Teachey wrote:
Such as?
adds a huge amount of flexibility and functionality,
Such as?
With the proposal, the language would support any function desired to turn the "stuff" inside a subscripting operation into the item dunder calls.
I'm sorry, I don't understand that sentence.
You're calling the `__getitem__` dunder with arbitrary keyword arguments. Are you the same Ricky Teachey who just suggested that we should be free to break code that uses `__getitem__` methods that don't obey the intent that they have only a single parameter and no keywords? If PEP 472 is held up, then `obj[1, 2, axis='north']` is a SyntaxError, so how does this method yay_kwargs make it legal? -- Steve

On Wed, Aug 26, 2020 at 11:19 AM Steven D'Aprano <steve@pearwood.info> wrote:
It creates a language-supported way for the creator of the class to decide how to interpret the contents inside a subscript operation. This is a problem because disagreement over this matter is a large part of the reason PEP 472 -- the spirit of which I support -- has been held up.

It greatly alleviates (though not perfectly -- see the end of this message) the incongruity between how the indexing operator behaves and how function calls behave. As explained by Jonathan Fine, it adds flexibility and smooths out the differences between the different paradigms of subscript operations: sequences and mappings. And it opens up opportunities for other paradigms to be created, the most prominent example of which is type-hint creation shortcuts, like:

    Vector = Dict[i=float, j=float]
I'll provide examples.
I am talking hypothetically -- if this proposal were already in place (which *includes* passing kwargs to the *new dunder* rather than passing them to the existing item dunders by default), you could write code like yay_kwargs today, even if the default behaviour of the language did not change. In that universe, if I wrote this:

    class C:
        def __getitem__(self, key):
            print(key)

...and tried to do this:

    >>> C()[a=1]
    SomeError

...*in that universe*, without PEP 472, the language will STILL, by default, give an error as it does today (though it probably would no longer be a SyntaxError). PEP 472 is a proposal to change *the current, default key or index translation function* to pass **kwargs. *This* proposal is to allow for an intervening function that controls HOW they are passed.

> If PEP 472 is held up, then `obj[1, 2, axis='north']` is a SyntaxError,
Because the proposal is that if there is a dunder present containing the class attribute function, the contents of the [ ] operator get passed to that function for translation into the key.

We could make the dunder accept the target dunder method name as a parameter. This way there is only a single new dunder, rather than three. The single new dunder might look like this:

    class Q:
        def __subscript__(self, method_name, *args, **kwargs):
            return getattr(self, method_name)(*args, **kwargs)

        def __getitem__(self, *args, **kwargs): ...

        # Note that I have made the RHS value the first argument in __setitem__
        def __setitem__(self, value, *args, **kwargs): ...

        def __delitem__(self, *args, **kwargs): ...

Above I am calling the appropriate dunder method directly inside of __subscript__. Again, there are other ways to do it and it does not have to be this way. If it is done that way, the __subscript__ dunder gets passed which item dunder method is being called (__getitem__, __setitem__, or __delitem__), and the arguments. Examples:

    NO  CODE          CALLS
    1.  q[1]          q.__subscript__("__getitem__", 1)
    2.  q[1,]         q.__subscript__("__getitem__", 1)
    3.  q[(1,)]       q.__subscript__("__getitem__", 1)
    4.  q[(1,),]      q.__subscript__("__getitem__", (1,))
    5.  q[1] = 2      q.__subscript__("__setitem__", 2, 1)
    6.  q[1,] = 2     q.__subscript__("__setitem__", 2, 1)
    7.  q[(1,)] = 2   q.__subscript__("__setitem__", 2, 1)
    8.  q[(1,),] = 2  q.__subscript__("__setitem__", 2, (1,))

And so on for the __delitem__ calls.

NOTE: #3 and #7 are very unfortunate, but we cannot change this without breaking backwards compatibility. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
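The collapse of cases #2 and #3 can be demonstrated today with a small recorder class. The `__subscript__` bridge below is only an emulation layered on top of the existing item dunders (the real proposal would live in the interpreter), so cases that are indistinguishable today necessarily stay indistinguishable here:

```python
class Recorder:
    """Emulates the proposed __subscript__ dispatch on top of today's
    item dunders, recording every call it receives."""
    def __init__(self):
        self.calls = []

    def __subscript__(self, method_name, *args, **kwargs):
        self.calls.append((method_name, args))

    def __getitem__(self, key):
        # Unpack the tuple that today's machinery packs for us.
        args = key if isinstance(key, tuple) else (key,)
        self.__subscript__("__getitem__", *args)

    def __setitem__(self, key, value):
        args = key if isinstance(key, tuple) else (key,)
        # RHS value goes first, as in the sketch above.
        self.__subscript__("__setitem__", value, *args)

q = Recorder()
q[1]         # ('__getitem__', (1,))
q[1,]        # ('__getitem__', (1,))  -- same as q[1]
q[(1,)]      # ('__getitem__', (1,))  -- the unfortunate case #3
q[(1,),]     # ('__getitem__', ((1,),))
q[1] = 2     # ('__setitem__', (2, 1))
```

Because `q[1,]` and `q[(1,)]` both reach `__getitem__` as the tuple `(1,)` today, no bridge written in Python can tell them apart; that is exactly the backwards-compatibility constraint noted above.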

In my mind, *anything* other than the straightforward and obvious signature `__getitem__(self, index, **kws)` is a pointless distraction. We don't need new custom objects to hold keywords. We don't need funny conditional logic about one versus multiple index objects. We don't need some other method that sometimes takes priority. Yes, it's slightly funny that square brackets convert to `index` rather than `*index`, but that ship sailed very long ago, and it's no big deal. There's no problem that needs solving and no need for code churn.

On Wed, 26 Aug 2020 12:46:18 -0400 David Mertz <mertz@gnosis.cx> wrote:
In my mind, *anything* other than the straightforward and obvious signature `__getitem__(self, index, **kws)` is a pointless distraction.
Probably. However, one must define how it's exposed in C. Regards Antoine.

On Wed, Aug 26, 2020 at 9:46 AM David Mertz <mertz@gnosis.cx> wrote:
I disagree here -- sure, for full backwards compatibility and preservation of performance, we'll probably have to live with it. But the fact that the square brackets don't create a tuple makes it fundamentally odd and confusing to work with -- not too big a deal when [] only accepted a single expression, and therefore passed a single value on to the dunders, but it gets very odd when you have something that looks a lot like a function call, but is different.

And it gets worse for __setitem__, which I guess will be:

    thing[ind1, ind2, kwd1=v1, kwd2=v2] = value

translating to:

    thing.__setitem__((ind1, ind2), value, kwd1=v1, kwd2=v2)

which is pretty darn weird -- particularly if you try to write the handler this way:

    def __setitem__(self, *args, **kwargs):

so args would always be a 2-tuple, something like ((ind1, ind2), value). At least **kwargs would be "normal", yes?

On the plus side, this weirdness is only really exposed to folks writing classes with complex custom indexing behavior, and would still look like the same semantics as a function call to the users of that class. Well, almost -- it would not support [*args] -- would it support [**kwargs]? And this kind of thing is very much a write-once, use-a-lot situation.

TL;DR -- let's watch the strong language here. These proposals do attempt to address a real problem -- probably not worth the downsides, but this "doesn't solve any problem", "no one asked for it", etc., is really dismissive. -CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
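The positional half of that translation can be checked against today's behaviour with a tiny probe class (keywords are left out, since they are not legal subscript syntax today):

```python
class Probe:
    # Record exactly what the interpreter hands to the item dunders today.
    def __init__(self):
        self.seen = None

    def __setitem__(self, key, value):
        self.seen = ("set", key, value)

    def __getitem__(self, key):
        self.seen = ("get", key)
        return key

p = Probe()
p[1, 2] = "v"
print(p.seen)   # ('set', (1, 2), 'v') -- indices arrive pre-packed as one tuple
p[1, 2]
print(p.seen)   # ('get', (1, 2))
```

The two indices arrive as a single tuple argument, with the value as a separate trailing argument -- the "2-tuple of ((ind1, ind2), value)" shape described above.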

On 27/08/20 8:00 am, Christopher Barker wrote:
If we use a new set of dunders, the signature of the one for setting could be

    def __setindex__(self, value, *args, **kwds)

which is a little surprising (value first), but plays better with an arbitrary number of positional indexes. -- Greg

On Wed, Aug 26, 2020 at 9:39 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On Wed, Aug 26, 2020 at 9:34 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That version of the idea I gave isn't great, I don't disagree. It wasn't really intended to be the proposed dunder function.... just the quickest way I could think of to write it up to attempt to illustrate to Steve in the bit below:
I think the argument for providing a SINGLE new dunder, rather than three, could be that you don't have two sets of dunders that do the same thing, just with different signatures. The purpose of a new single dunder should simply be to translate the contents of the subscript operator to the existing dunders. The thing I am struggling with is how would it RETURN that translation...? In my slapdash implementation I just CALLED them, but that seems less than ideal. I supposed it could do it by returning a tuple and a dict, like this: def __key_or_index_translator__(self, pos1, *, x) -> Tuple[Tuple[Any], Dict[str, Any]]: """I only allow subscripting like this: >>> obj[1, x=2] """ return (pos1,), dict(x=x) The return value then in turn gets unpacked for each of the existing item dunders: obj.__getitem__(*(pos1,), **dict(x=x)) obj.__setitem__(value, *(pos1,), **dict(x=x)) obj.__delitem__(*(pos1,), **dict(x=x)) Now: class C: __key_or_index_translator__ = __key_or_index_translator__ def __getitem__(self, *args, **kwargs): ... def __setitem__(self, value, *args, **kwargs): ... def __delitem__(self, *args, **kwargs): ...
C()[1, 2, x=1, y=2] TypeError
--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
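Under the stated assumptions -- the dunder name and the return-a-(tuple, dict) protocol are Ricky's sketch above, not an accepted design -- the round trip can be simulated directly. The `simulated_getitem` helper stands in for the interpreter step:

```python
# Simulated interpreter step for the translator-returns-(args, kwargs) idea.
def simulated_getitem(obj, *args, **kwargs):
    pos, kw = obj.__key_or_index_translator__(*args, **kwargs)
    return obj.__getitem__(*pos, **kw)

class C:
    def __key_or_index_translator__(self, pos1, *, x):
        # Only subscripts shaped like obj[p, x=...] are accepted; anything
        # else fails signature binding here with an ordinary TypeError.
        return (pos1,), dict(x=x)

    def __getitem__(self, pos1, *, x):
        return ("get", pos1, x)

print(simulated_getitem(C(), 1, x=2))   # ('get', 1, 2)

try:
    simulated_getitem(C(), 1, 2, x=1, y=2)   # too many positionals, stray 'y'
except TypeError as e:
    print("rejected:", e)
```

Note that the error checking comes for free from the translator's ordinary function signature, which is the main attraction of this variant.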

On Wed, Aug 26, 2020 at 4:01 PM Christopher Barker <pythonchb@gmail.com> wrote:
I have read hundreds of comments in this thread. As I stated, I have yet to see *anyone* identify a "problem that needs solving" with all the convoluted new objects and weird support methods. There is absolutely nothing whatsoever that cannot be easily done with the obvious and straightforward signature:

    __getitem__(self, index, **kws)

Not one person has even vaguely suggested a single use case that cannot be addressed that way, nor even a single case where a different approach would even be slightly easier to work with.

...there was an odd response that a custom class might want to specify its special keywords. Which of course they can! I was just giving the shorthand version. Some particular class can allow and define whichever keywords it wants, just like with every other function or method.

---

Yes, the parser will have to do something special with the stuff in square brackets. As it has always done something special for stuff in square brackets. Obviously, if we call:

    mything[1, 2:3, four=4, five=5]

that needs to get translated, at the bytecode level, into the equivalent of:

    mything.__getitem__((1, slice(2, 3)), four=4, five=5)

But the class MyThing is free to handle its keyword arguments `four` and `five` however it likes. They might have defaults or not (i.e. **kws). I think dict and list should, for now, raise an exception if they see any **kws (but ignoring them is not completely absurd either).

The only question that seems slightly reasonable to see as open is "What is `index` if only keywords are subscripted?" Some sort of sentinel is needed, but conceivably None or an empty tuple wouldn't be backward compatible, since those *can* be subscripts legally now.

-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
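The positional half of that bytecode-level translation already works exactly as described and can be checked directly (the keyword half is the part that does not exist today):

```python
class IndexEcho:
    # A class whose only job is to echo back what subscripting passes in.
    def __getitem__(self, index, **kws):
        return index, kws

t = IndexEcho()
print(t[1, 2:3])    # ((1, slice(2, 3, None)), {}) -- tuple containing a slice
print(t[5])         # (5, {}) -- a single index is not wrapped in a tuple
```

So under this signature, a class author only ever has to handle `index` the same way they always have, plus whatever named keywords they choose to declare.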

On Wed, 26 Aug 2020 at 17:50, David Mertz <mertz@gnosis.cx> wrote:
In my mind, *anything* other than the straightforward and obvious signature `__getitem__(self, index, **kws)` is a pointless distraction.
It isn't, and the reason why it isn't is that it makes it much harder for the implementing code to decide how to proceed, for the following reasons:

1. you will have to check if the keywords you receive are actually in the acceptable set
2. you will have to disambiguate an argument that is supposed to be passed _either_ as positional or with keyword.

Always remember this use case: a matrix of acquired data, where the row is the time and the column is the detector. I've seen countless times situations where people were creating the matrix with the wrong orientation, putting the detector along the rows and the time along the columns. So you want this:

    acquired_data[time, detector]

to be allowed to be written as:

    acquired_data[time=time, detector=detector]

so that it's unambiguous and you can even mess up the order, but the right thing will be done, without having to remember which one is along the row and which one is along the column.

If you use the interface you propose, now the __getitem__ code has to resolve the ambiguity of intermediate cases, and do the appropriate mapping from keyword to positional. Which is annoying, and you are basically doing manually what you would get for free in any standard function call. Can you do it with the implementation you propose? Sure, why not, it's code after all. Is it easy and/or practical? Not so sure. Is there a better way? I don't know. That's what we are here for.

To me, most of the discussion is being derailed into deciding what this should look like on the __getitem__ end and how to implement it, but we haven't even decided what it should look like from the outside (although it's probably in the massive number of emails I am trying to aggregate in the new PEP). -- Kind regards, Stefano Borini
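A sketch of the bookkeeping burden being described. Since `acquired_data[time=..., detector=...]` is not legal syntax today, an ordinary `get` method stands in for `__getitem__`; the class name and merging logic are illustrative only:

```python
class AcquiredData:
    """Rows are time, columns are detector -- the orientation people
    keep getting wrong in the example above."""
    def __init__(self, matrix):
        self.matrix = matrix

    def get(self, *index, time=None, detector=None):
        # Manually merge positional axes with named axes -- exactly the
        # work an ordinary function call would do for us for free.
        names = ["time", "detector"]
        values = {"time": time, "detector": detector}
        positionals = iter(index)
        for name in names:
            if values[name] is None:
                values[name] = next(positionals)
        return self.matrix[values["time"]][values["detector"]]

data = AcquiredData([[10, 11], [20, 21]])
print(data.get(1, 0))                  # 20 -- positional: time=1, detector=0
print(data.get(detector=0, time=1))   # 20 -- order no longer matters
print(data.get(1, detector=0))        # 20 -- mixed positional and keyword
```

Every line of that merging loop is boilerplate that the plain `__getitem__(self, index, **kws)` interface forces onto the class author, which is the objection being made.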

On Wed, Aug 26, 2020, 18:04 Stefano Borini <stefano.borini@gmail.com> wrote:
Again, implicit on your argument here is the assumption that all keyword indices necessarily map into positional indices. This may be the case with the use-case you had in mind. But for other use-cases brought up so far that assumption is false. Your approach would make those use cases extremely difficult if not impossible.

On Wed, 26 Aug 2020 at 23:56, Todd <toddrjen@gmail.com> wrote:
Again, implicit on your argument here is the assumption that all keyword indices necessarily map into positional indices. This may be the case with the use-case you had in mind. But for other use-cases brought up so far that assumption is false. Your approach would make those use cases extremely difficult if not impossible.
Please remind me of one. I'm literally swamped. In any case, this leads to a different question: should we deprecate anonymous axes, and if not, what is the intrinsic meaning and difference between anonymous axes and named axes? -- Kind regards, Stefano Borini

On Wed, Aug 26, 2020 at 7:11 PM Stefano Borini <stefano.borini@gmail.com> wrote:
xarray, which is the primary python package for numpy arrays with labelled dimensions. It supports adding and indexing by additional dimensions that don't correspond directly to the dimensions of the underlying numpy array, and those have no position to match up to. They are called "non-dimension coordinates". Other people have wanted to allow parameters to be added when indexing, arguments in the index that change how the indexing behaves. These don't correspond to any dimension, either. In any case, this leads to a different question: should we deprecate
anonymous axes, and if not, what is the intrinsic meaning and difference between anonymous axes and named axes?
Anonymous axes are axes that someone hasn't spent the time adding names to. Although they are unsafe, there is an absolutely immense amount of code built around them, and naming axes takes a lot of additional work for simple cases. So I think deprecating them at this point would be completely unworkable.

We surely can't deprecate them, but the naming of axes is done at the dunder level, not at the client level. And if the idea were to add a new dunder, then of course you can always name them post facto and pass through to your older __getitem__ function, if you so wish. I need to take a look at xarray tonight; I am not familiar with its interface.
-- Kind regards, Stefano Borini

On Wed, Aug 26, 2020 at 11:03:20PM +0100, Stefano Borini wrote:
I assumed -- possibly this was wishful thinking on my part :-) -- that David didn't mean *literally* only a collection of arbitrary keywords `**kws`, but was using that as an abbreviation for "whatever named keyword parameters are written in the method signature". E.g. I can write:

    def __getitem__(self, index, *, spam, eggs=None):

to have one mandatory keyword "spam" and one optional keyword "eggs", and it will all Just Work. I presume that David would be good with that too, but if he's not, he really needs to explain why not.

Before someone else points this out, I'll mention that *technically* someone could use index as a keyword argument too, but so what if they do? However, if you really want to prevent it, you can make it positional-only:

    def __getitem__(self, index, /, *, spam, eggs=None):

but honestly that's only necessary if you also collect arbitrary keywords in a `**kwargs` and want to allow "index" as one of them. Otherwise it's overkill. [...]
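The "Just Work" claim can be verified today by calling the dunder directly, since only the bracket syntax for supplying keywords is missing; the ordinary signature machinery already enforces required and optional keywords (the `Grid` class is illustrative):

```python
class Grid:
    # Named keyword-only parameters in an item dunder: one mandatory
    # ("spam"), one optional ("eggs"), exactly as in the text above.
    def __getitem__(self, index, *, spam, eggs=None):
        return (index, spam, eggs)

g = Grid()
print(g.__getitem__(3, spam="a"))            # (3, 'a', None)
print(g.__getitem__(3, spam="a", eggs="b"))  # (3, 'a', 'b')

try:
    g.__getitem__(3)    # required keyword 'spam' missing
except TypeError as e:
    print("TypeError:", e)
```

The missing-keyword TypeError is produced by the normal function-call machinery, with no checking code in the method body at all.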
Is that even a question?

    obj[index, keyword=value]

where index is any comma-separated list of expressions (including slices), keyword is an identifier, and value is any expression (including slices). Are there even any other options being considered?

A few points:

- The index is optional, but it will be a runtime TypeError if the method isn't defined with the corresponding parameter taking a default value.
- You can have multiple keyword=value pairs, separated by commas. They must all follow the index part. Duplicate keywords are a runtime error, just as they are for function calls.
- An empty subscript remains a syntax error, even if the method signature would allow it.
(although it's probably in the massive number of emails I am trying to aggregate in the new PEP).
As PEP author, you don't have to include every little tiny detail of every rejected idea. It should be a summary, and it is okay to skip over brief ideas that went nowhere and just point back to the thread.

One of the weaknesses, in my opinion, of the existing PEP is that it tries to exhaustively cover every possible permutation of options (and in doing so, managed to skip the most useful and practical option!). A PEP can be, and most of the time should be, an opinionated persuasive essay which aims to persuade the readers that your solution is the right solution. The PEP has to be fair to objections, but it doesn't have to be neutral on the issue. It shouldn't be neutral unless it is being written merely to document why the issue is being rejected.

So my advice is: choose the model you want, and write the PEP to persuade readers that it is the best model and why the others fail to meet the requirements. I really hope the model you choose is the one I describe above, in which case I will support it, but if you hate that model and want something else, that's your right as the PEP author. (Although the earlier co-authors may no longer wish to be associated with the PEP if you change it radically.) -- Steve

On Thu, 27 Aug 2020 at 03:00, Steven D'Aprano <steve@pearwood.info> wrote:
Well, there are a lot of corner cases, such as what to do with obj[]. We already discussed it and we both agreed that it was a bad idea to allow it. Plus there's no consensus, as far as I can tell, on whether the index and the keywords should interact or not. I say they should, but others say they should not. By "interact" I mean that anonymous axes exist and live an independent life from named axes; this behavior is different from a function call, where you always give names to arguments on the function declaration side (unless you have *args, of course).
All right. So we agree on that. I wanted to be sure that we were not excluding at all the possibility to do _either_ no keyword _or_ only keywords. We do want to allow mixing. So now the question is what to do with the mixing.
Completely agree with your observation. The main problem when I wrote it is that there was no consensus. I had my preference, the other author had a different one, and the mailing list also didn't settle on a plausible, dead sure implementation. We were as confused back then as we are today.
Will do.
(Although the earlier co-authors may no longer wish to be associated with the PEP if you change it radically.)
I lost contact with the other author. I haven't heard of him since, even on the mailing list. -- Kind regards, Stefano Borini

Hi, other author here! I'm still following the lists but not writing any more. :) Stefano, if you want to change PEP 472 there is no problem on my side (but I've read that some people preferred a new one); I remember that you did all the work anyway! Since lots will change, just remove my name, it's fine.

As for the technical discussion, I like how __getitem__(self, index, **kwargs) (and variations with named keyword args, of course) looks more like a normal function. "index" is already a strange beast (it's a tuple except when it's not); I vote that we don't add new ones and stick with existing standard types.

As for foo[a=1, b=2], I'd propose to keep it a SyntaxError for now, and always require an index. This way it can be changed later, when people are more used to the keyword args and have more ideas of what would be good as a default argument. In the face of ambiguity, refuse the temptation to guess ;) Joseph

On Thu, Aug 27, 2020, 7:31 PM Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:
Joseph
That last would eliminate what is, in my opinion, probably one of the most important uses of the new functionality: a type-hint shortcut that looks like this:

    Vector = dict[i=str, j=str]
    Kwargs = dict[name=str, value=Any]

That's only one of the possible uses, and I'm not even sure that it would be implemented directly. Adding keywords to indexation for custom classes is not the same as modifying the standard dict type for typing. Anyway, my point was that it's still possible to do it later without breaking anything. Many features were added incrementally, including type hints !

On Thu, Aug 27, 2020 at 9:24 PM Joseph Martinot-Lagarde < contrebasse@gmail.com> wrote:
It doesn't have to be the standard dict type, it can be typing.Dict instead. But I understand the desire to take it one step at a time. I very much hope that this type hint syntax comes along before too much longer, though. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Thu, Aug 27, 2020, 19:30 Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:
This really wouldn't work for xarray, either. I think the simplest default value would be Ellipsis, so

    foo[a=1, b=2]

would be equivalent to

    foo[..., a=1, b=2]

But I don't see why this is a problem we have to deal with: the index argument can just not be passed at all, and it is up to the class developer to pick an appropriate sentinel if needed.
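The Ellipsis-as-default idea can be sketched by calling the dunder directly, since `foo[a=1, b=2]` is not legal syntax today (the `Foo` class is illustrative):

```python
class Foo:
    # Ellipsis as the sentinel default for a missing positional index,
    # per the suggestion above.
    def __getitem__(self, index=..., **kwargs):
        return index, kwargs

f = Foo()
# What foo[a=1, b=2] would translate to under this scheme:
print(f.__getitem__(a=1, b=2))          # (Ellipsis, {'a': 1, 'b': 2})
# ...which is indeed equivalent to foo[..., a=1, b=2]:
print(f.__getitem__(..., a=1, b=2))     # (Ellipsis, {'a': 1, 'b': 2})
```

Unlike None or an empty tuple, `...` is rarely a meaningful key on its own outside of multi-dimensional slicing, which is why it is attractive as a default here -- though for classes (like xarray's) where `...` already means something, the not-passed-at-all approach avoids the collision entirely.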

On Fri, Aug 28, 2020 at 5:10 AM Todd <toddrjen@gmail.com> wrote:
That is a good point -- any existing code (or new code that doesn't support keywords) would raise an error, though it would be a TypeError rather than a SyntaxError if no index were passed in. So why not allow it?

This does require some thought about backward compatibility. As passing anything other than a single index is now a SyntaxError, most code in the wild will not be set up to handle the runtime TypeError that might now arise. As "proper" exception handling should be close to the operation, and catch specific exceptions, most cases will probably be fine. But not all. For example, there might be code in the wild that does:

    try:
        a_function(something)
    except TypeError:
        do_something()

And there is something in a_function that uses indexing -- someone messes with that code, and puts something new in an index that used to be a SyntaxError and is now a TypeError. In the past, that wouldn't have even run, but now it will, and the problem might not be caught in tests because the TypeError is being handled.

Given that this would be a change from a compile-time error to a runtime error, there is no code that used to work that will break, but it would be easier to write broken code in certain situations -- maybe not a huge deal, but worth thinking about. -CHB

On Wed, Aug 26, 2020 at 12:32:56PM -0400, Ricky Teachey wrote:
We already have that: `__getitem__`.
As I see it, the disagreement comes about from two main areas:

- People who want to bake into the language semantics which solve a tiny niche problem ("what if I want to treat keyword:argument pairs as keys?"), making the common use-cases much harder to solve.
- And people trying to over-engineer some complicated solution to something that isn't actually a problem: the supposed inconsistency between subscripts and function calls.

The disagreement isn't from people arguing about how your class should interpret the keywords. That isn't holding up the PEP. Interpreting the keywords is easy: interpret them however you want. Python doesn't tell you what the keywords mean, it just hands them to your method.
What flexibility is added that `__getitem__` doesn't already provide?
Won't sequences and mappings still call the same dunder method?
That's not a paradigm. That's just an application of the feature. *Type-hints* was a major change to the Python's language paradigm. This is just a (potential) nice syntax that allows a concise but readable type-hint. https://en.wikipedia.org/wiki/Paradigm [...]
You're suggesting that if subscript keywords were permitted, rather than having the keywords passed directly to `__getitem__` where they are wanted, the interpreter should pass them to another method and then that method could pass them to `__getitem__`. Or we could just pass them directly into `__getitem__`, as requested. How does the interpreter know to call yay_kwargs rather than some other method?
That is a remarkably unenlightening example. There's no sign of what your new method is or what it does. There's an error, but you don't know what it is except it "probably" won't be a syntax error, so you don't even know if this error is detected at compile time or runtime. You say PEP 472 is not accepted, but maybe it is accepted since keyword subscripts are possibly not a syntax error, or maybe they are still a syntax error. Who knows? Not me, that's for sure. I feel like I'm trying to nail jelly to a wall, trying to understand your proposal. Every time I ask a question, it seems to twist and mutate and become something else. And how does this relate to Jonathan's idea of "signature dependent semantics" which you were singing the praises of, but seems to have nothing to do with what you are describing now. [...]
We could make the dunder to accept the target dunder method name as a parameter. This way there is only a single new dunder, rather than 3.
Or we could have no new dunders at all, and just pass the keywords to the appropriate, and existing, `__*item__` methods. I'm going to ignore the business about `__subscript__` because you seem to have backed away from it in a later email, except for one relatively minor point. You proposed this:
But that's not what subscripting does now. If you use `q[1,]` the getitem method gets the tuple (1,), not the int 1. So if you're serious about that, it's a breaking change.

py> "abcd"[1]
'b'
py> "abcd"[1,]
TypeError: string indices must be integers

-- Steve

On Wed, Aug 26, 2020 at 10:34 PM Steven D'Aprano <steve@pearwood.info> wrote:
Actually no, we have *THREE* dunders: get, set, and del -item. And they accept a single argument containing a single object representing a single key or index. And if I had to make a prediction, then probably pretty soon they'll accept kwargs, too.

And these are great as-is when the way you want to use them is in total agreement with the way they were envisioned to be used: for single argument keys/indexes. But any use case that deviates from that vision requires effort spread across three dunder methods to set it up right. And it requires writing code that breaks up the positional arguments in just the right way; code that you pretty much never have to write when using the function paradigm (rather than the item dunder paradigm), because the language does it all for us:

def f(pos1, pos2, *args, kwd1, kwd2=None, **kwargs): ...

I don't have to write code in the function body of `f` breaking the arguments up the way I intend to use them, and errors are raised when required arguments are missing, etc etc. This is a set of fantastic language features.

But the positional part of it is missing from subscripting, from __getitem__, __setitem__, and __delitem__, and you have to break up your positionals in three different dunder methods. If you are a person who wants to use the subscripting operator from a more function-like paradigm, this is kind of a lot of work and extra code to maintain-- code that simply would not be needed at all if there was a single dunder responsible for accepting the key/index "stuff" and then sending it off-- somehow-- into the item dunders to be used. How is it sent? I don't know yet. Thinking about it.
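The duplicated tuple-splitting that Ricky describes can be made concrete with a small sketch; the `Grid` class below is hypothetical, and this is just today's Python, not any proposed syntax:

```python
# Status quo: the single "key" object arriving at each item dunder must be
# picked apart by hand, and the same splitting logic is needed three times.
class Grid:
    def __init__(self):
        self._data = {}

    def _split(self, key):
        # Every dunder repeats this step: bare index vs. tuple of indexes.
        return key if isinstance(key, tuple) else (key,)

    def __getitem__(self, key):
        return self._data[self._split(key)]

    def __setitem__(self, key, value):
        self._data[self._split(key)] = value

    def __delitem__(self, key):
        del self._data[self._split(key)]

g = Grid()
g[1, 2] = "a"   # key arrives as the tuple (1, 2)
g[3] = "b"      # key arrives as the bare int 3
assert g[1, 2] == "a" and g[3] == "b"
```

With function-call argument handling, `_split` and its three call sites would not need to be written at all.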
I explained above: a single point of subscript argument acceptance that provides all the language features of function signatures.
Yes, but again, all the existing power the language has for calling functions and passing them to the function signature can be brought to the table for all three subscript operators using a single new dunder. This smooths things out, because if you don't want to use the default python treatment of positional subscript arguments as a single key or index object, it makes it much easier to do things another way.
I'm not going to respond to this other than to say that I think I was clear, and providing a link to a dictionary entry for "paradigm" sort of feels like picking on my choice of words rather than actually responding to me with candor. If so, that doesn't seem like a very kind conversation choice. If I am correct, I'd like to ask that in the future, please choose to be more kind when you respond to me? Thank you. On the other hand, if I am wrong and it wasn't clear what I am saying, or if you were just being humorous, I very much apologize for being oversensitive. :)

> You're suggesting that if subscript keywords were permitted, rather than [...]
The only thing Jonathan and I are proposing at this time is a __dunder__ that receives the contents of the subscript operator and handles passing them along:

1. New __dunder__ that gets the contents
2. ???
3. Item dunders called the way the user wants them called.

The details of 2-- how that function "passes on" that information to the item dunders-- I am currently unsure about. ONE way might be to have the new method call the existing dunders. But that's not my preference-- I would like to find another way. I do have another idea, and I presented it in a previous response. Here it is again:

def __key_or_index_translator__(self, pos1, pos2, *, x) -> Tuple[Tuple[Any], Dict[str, Any]]:
    """I only allow subscripting like this: >>> obj[1, x=2]"""
    return (pos1, pos2), dict(x=x)

The returned two-tuple, `t`, in turn gets unpacked like this for each of the existing item dunders:

obj.__getitem__(*t[0], **t[1])
obj.__setitem__(value, *t[0], **t[1])
obj.__delitem__(*t[0], **t[1])

> Or we could just pass them directly into `__getitem__`, as requested.
> How does the interpreter know to call yay_kwargs rather than some other method?
The yay_kwargs would be assigned to the new dunder attribute:

class C:
    __new_dunder__ = yay_kwargs

However, again, I am not really proposing that the dunder actually calls __getitem__, etc. That is one way to do it, but I don't like it much. I think I like the way above a lot better. But I am open to other suggestions.
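A minimal runnable sketch of how the proposed translator could interact with the item dunders, simulated in plain Python (no interpreter support exists; the dunder name and the `simulated_subscript` helper are illustrative only):

```python
from typing import Any, Dict, Tuple

class Point:
    # Hypothetical translator dunder: accepts whatever signature you like,
    # and returns the (args, kwargs) the item dunders should receive.
    def __key_or_index_translator__(self, pos1, pos2, *, x=0) -> Tuple[Tuple[Any, ...], Dict[str, Any]]:
        return (pos1, pos2), {"x": x}

    def __getitem__(self, a, b, x=0):
        return (a, b, x)

def simulated_subscript(obj, *args, **kwargs):
    # Stand-in for what the interpreter would do under the proposal.
    t = obj.__key_or_index_translator__(*args, **kwargs)
    return obj.__getitem__(*t[0], **t[1])

# simulated_subscript(Point(), 1, 2, x=3) stands in for: Point()[1, 2, x=3]
assert simulated_subscript(Point(), 1, 2, x=3) == (1, 2, 3)
assert simulated_subscript(Point(), 1, 2) == (1, 2, 0)
```

The key design point is that all signature parsing (defaults, keyword-only parameters, error reporting) happens once, in the translator, instead of in each of the three item dunders.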
That's because I have been exploring multiple ideas at the same time. And this particular proposal is really the nugget of an idea currently, the details of which are changing as part of an ongoing conversation with input from other people. Parenthetically: my comments above about what I felt like was a lack of kindness notwithstanding, I very much appreciate your direct engagement on this topic with me. You've pointed several things out in the process that have informed my thinking. And you've been really patient considering that you've made it clear you'd prefer not to have anymore additional proposals derailing things. :) So really, thank you.
Well I don't want Jonathan Fine blamed for that one: it was actually my idea. I floated it in the other thread, the title of which was: "Changing item dunder method signatures to utilize positional arguments (open thread)". The purpose of that thread is to explore options for how we might find a backwards-compatible way to allow subscripting to be used from a function paradigm, if desired by the programmer. The signature-dependent-semantics approach was just one such idea, and it doesn't relate to this. I have been presenting several ideas, I realize, but it's the ideas list after all.

> I'm going to ignore the business about `__subscript__` because you seem [...]
Yes, I backed away from calling __getitem__ inside of __subscript__. And yes: I know that's not what it does. That's what it will do if we create a new dunder that uses the function paradigm for accepting the arguments so they can be passed along to the existing dunders.

> If you use `q[1,]` the getitem method gets the tuple (1,), not the int 1. So if you're serious about that, it's a breaking change.
It isn't breaking. Because the default key or index translation function, or __subscript__ function-- the one that implicitly exists today-- won't change. It does this, as you know:

CODE            CALLS
q[1,]           q.__getitem__((1,))
q[1,] = None    q.__setitem__((1,), None)
del q[1,]       q.__delitem__((1,))

The proposal is that if a different translation function isn't provided, things just get passed to the existing 3 dunders as they do today. Only in the case that a new dunder-- something like __subscript__-- gets called does it change to this:

CODE            FIRST CALLED              LATER CALLED
q[1,]           q.__subscript__(1)        q.__getitem__ in whatever way __subscript__ directs
q[1,] = None    q.__subscript__(None, 1)  q.__setitem__ in whatever way __subscript__ directs
del q[1,]       q.__subscript__(1)        q.__delitem__ in whatever way __subscript__ directs

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
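The first table above -- today's unchanged default behaviour -- can be verified directly with a small recording class (the `Rec` class here is just for illustration):

```python
# Record exactly what the item dunders receive under current Python.
calls = []

class Rec:
    def __getitem__(self, key):
        calls.append(("get", key))
    def __setitem__(self, key, value):
        calls.append(("set", key, value))
    def __delitem__(self, key):
        calls.append(("del", key))

q = Rec()
q[1,]            # trailing comma: key is the one-element tuple (1,)
q[1,] = None
del q[1,]
assert calls == [("get", (1,)), ("set", (1,), None), ("del", (1,))]
```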

Cripes, I screwed up the __key_or_index_translator__ docstring. See correction below, sorry.
-------------------
def __key_or_index_translator__(self, pos1, pos2, *, x) -> Tuple[Tuple[Any], Dict[str, Any]]:
    """I only allow subscripting like this: >>> obj[1, 2, x=2]"""
    return (pos1, pos2), dict(x=x)

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 2020-08-27 12:33 a.m., Ricky Teachey wrote:
I'm not seeing what problem adding a new dunder and indirection of __*item__ solves that isn't solved by something like

class K:
    def _make_key(self, index, **kwargs):
        # Where **kwargs is any kind of keywordey params, not necessarily just a **kwargs
        return 'something'
    def __getitem__(self, index, **kwargs):
        key = self._make_key(index, **kwargs)
        ...
    def __delitem__(self, index, **kwargs):
        key = self._make_key(index, **kwargs)
        ...
    def __setitem__(self, index, value, **kwargs):
        key = self._make_key(index, **kwargs)
        ...

Sure, on a pedantic level I had to put effort across three dunders, but the effort is a single method call *and* I would still have needed to do it in the __subscript__ scenario, except I would also have to have written a __subscript__ that is a combination of _make_key and boilerplate to call the method that the interpreter would have previously called for me.

Alex

On Thu, Aug 27, 2020, 1:29 AM Alexandre Brault <abrault@mapgears.com> wrote:
It kills at least 3 birds with one stone:

1. Brings kwd arguments to item dunders (PEP 472 does this too, but a key/index translation dunder kills two other birds)
2. Switches positional arguments over to the function paradigm, bringing the full power of python signature parsing to subscripting.
3. Removes the need to write calls to the same supporting function in the item dunders. They are called automatically.
I don't care for how I wrote the details of how __subscript__ passes the args and kwargs to the item dunders, by calling them directly. Looking for another logical way to do that. My current attempt is this:

def __subscript__(self, *args, **kwargs) -> Tuple[Tuple[Any], Dict[str, Any]]:
    return t

Which, for a getitem dunder call, `t` becomes:

obj.__getitem__(*t[0], **t[1])

Actually there's a fourth bird. 4. Once you've decided on the signature of your key/index translation dunder, you can ignore the signature in the rest of the item dunders:

class C:
    def __subscript__(self, x, y):
        return (y,), dict(x=x)
    def __getitem__(self, *args, **kwargs):
        print(args, kwargs)
C()[100, y='foo']
(100,), {'y': 'foo'}

On Thu, Aug 27, 2020 at 07:24:01AM -0400, Ricky Teachey wrote:
I would put it another way: your proposal to redesign subscripting is independent of whether or not keyword subscripts are permitted. We could redesign the comma-separated item part of it without introducing keywords at all. There is nothing in your proposal that requires keywords. Just delete the bits about `**kwargs` and the rest of it stands as it is. PEP 472, on the other hand, is *all about keywords*. Keywords are irrelevant to your proposal to redesign subscripting. We could take it with or without keywords.
> 2. Switches positional arguments over to the function paradigm, bringing the full power of python signature parsing to subscripting.
If you want a function call, use function call syntax. PEP 472 is about adding keyword support to subscripting. The aim of the PEP is still for subscripting to fundamentally be about an index or key, not about making square brackets to be a second way to do function calls. I'll have more to say about that later.
> 3. Removes the need to write calls to the same supporting function in the item dunders. They are called automatically.
I don't understand that sentence. [...]
NameError: name 't' is not defined
> Which, for a getitem dunder call, `t` becomes:
> obj.__getitem__(*t[0], **t[1])
What does this mean? You assign t = obj.__getitem__(*t[0], **t[1]) and then return t? -- Steve

On Thu, Aug 27, 2020, 8:34 AM Steven D'Aprano <steve@pearwood.info> wrote:
Sorry, I need to stop coding in shorthand. Here is a more fleshed out illustration of what I am imagining would happen. I welcome suggestions for better ways to pass the arguments on to the item dunders-- the proposal remains the basic idea nugget Jonathan Fine presented:

1. A dunder that handles the args and kwargs sent into the subscript operator in whatever way the programmer wants (the default way, when no such dunder is provided, being current python behavior).
2. MAGIC --- a suggestion for what the magic could be, below.
3. The appropriate item dunder is called with the parsed parameters.

First, we have an example class with a new __subscript__ dunder. Remember, the __subscript__ dunder can have any signature you want, and do anything to the arguments inside that you want, it just needs to return something matching the type hint below when done:

class Q:
    def __subscript__(self, a, b, c, *, x, y, z, **kwargs) -> Tuple[Sequence[Any], Mapping[str, Any]]:
        kwargs.update(x=x, y=y, z=z)
        return (a, b, c), kwargs
    def __getitem__(self, *args, **kwargs): ...
    def __setitem__(self, __value, *args, **kwargs): ...
    def __delitem__(self, *args, **kwargs): ...

Here is what I imagine happens at the python level (but I am writing it in python):

MISSING = object()

def cpython_subscript_operator_function(obj, *args, __item_dunder_name, __value=MISSING, **kwargs):
    if __item_dunder_name == "__setitem__" and __value is not MISSING:
        args = (__value, *args)
    obj_class = type(obj)
    subscript_dunder = getattr(obj_class, "__subscript__", None)
    if subscript_dunder:
        args, kwargs = subscript_dunder(obj, *args, **kwargs)
    return getattr(obj_class, __item_dunder_name)(obj, *args, **kwargs)

Now, when I write this python code:
q=Q() q[1, x=4, y=5, z=6, foo='bar', b=2, c=3]
That code calls this cpython pseudo code:

cpython_subscript_operator_function(q, 1, __item_dunder_name="__getitem__", x=4, y=5, z=6, foo='bar', b=2, c=3)

And if I write set item code, it is like this:
q[1, 2, 3, x=4, y=5, z=6, foo='bar'] = "baz"
cpython level call looks like:

cpython_subscript_operator_function(q, 1, __item_dunder_name="__getitem__", __value="baz", x=4, y=5, z=6, foo='bar', b=2, c=3)

Sorry: forgot to change "__getitem__" to "__setitem__". Correction below. cpython_subscript_operator_function(q, 1, __item_dunder_name="__setitem__", __value="baz", x=4, y=5, z=6, foo='bar', b=2, c=3) --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

Here are two examples, which I hope help us understand our options. Here's an example of how the new dunder might work in practice.

class A:
    __keyfn__ = None
    def __setitem__(self, val, x=0, y=0, z=0):
        print((val, x, y, z))

>>> a = A()
>>> a[1, z=2] = 'hello'
('hello', 1, 0, 2)

Here's my understanding of Steven's proposal. (Please correct me if I've got something wrong.)

class B:
    def __setitem__(self, argv, val, *, x=0, y=0, z=0):
        print((val, argv, x, y, z))

>>> b = B()
>>> b[1, z=2] = 'hello'
('hello', 1, 0, 0, 2)

By the way, I've not tested this code.

-- Jonathan

This is a continuation of my previous post. I wrote: Here's an example of how the new dunder might work in practice.
To continue, suppose that True is the default value for __keyfn__. Consider now

class C:
    __keyfn__ = True
    def __setitem__(self, *argv, **kwargs):
        print(f'argv={argv} | kwargs={kwargs}')

Here's one option for what should happen.

>>> c = C()
>>> c[1] = 'val'
argv=(1, 'val') | kwargs={}
>>> c[1, 2] = 'val'
argv=((1, 2), 'val') | kwargs={}
>>> c[a=1] = 'val'
TypeError: __keyfn__ got unexpected keyword argument 'a'

By the way, I've not tested this code. In short, the present behaviour continues, except that the compile time error

>>> c[a=1] = 'val'
SyntaxError: invalid syntax

is replaced by the run-time error

>>> c[a=1] = 'val'
TypeError: __keyfn__ got unexpected keyword argument 'a'

Some of us want

>>> d = dict()
>>> d[a=1] = 'val'

to raise an exception. I've just described how having True as the default value for __keyfn__ allows that to happen, should the Python community so decide. I hope this message helps.

-- Jonathan
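Jonathan's untested sketch can be simulated in today's Python. Everything here is hypothetical: `simulated_setitem` stands in for interpreter behaviour under the proposal, and `c[1] = 'val'` is written as `simulated_setitem(c, 'val', 1)`:

```python
class C:
    __keyfn__ = True
    def __setitem__(self, *argv, **kwargs):
        self.seen = (argv, kwargs)   # record instead of print, for checking

def simulated_setitem(obj, value, *args, **kwargs):
    # Hypothetical interpreter logic: with __keyfn__ = True, positional
    # subscripts are tupled exactly as today, and keywords raise.
    if getattr(type(obj), "__keyfn__", True):
        if kwargs:
            raise TypeError("__keyfn__ got unexpected keyword argument "
                            f"{next(iter(kwargs))!r}")
        key = args[0] if len(args) == 1 else args
        obj.__setitem__(key, value)
    else:
        obj.__setitem__(value, *args, **kwargs)

c = C()
simulated_setitem(c, 'val', 1)        # c[1] = 'val'
assert c.seen == ((1, 'val'), {})
simulated_setitem(c, 'val', 1, 2)     # c[1, 2] = 'val'
assert c.seen == (((1, 2), 'val'), {})
```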

On Thu, Aug 27, 2020 at 09:57:26AM -0400, Ricky Teachey wrote:
Sorry, I need to stop coding in shorthand.
That might help. What might help even more is if you spend less time showing imaginary, and invariably buggy, examples and more time explaining in words the intended semantics of this, and the reason why you want those semantics.

I had to read over your email three times before the penny dropped what you are actually saying. Partly because I haven't had breakfast yet, partly because I was still thinking about earlier versions of your proposal, e.g. when you had a single subscript dunder get passed the name of the get- set- or del-item dunder, and was expected to dispatch to that method. But mostly because your description is so full of fine detail that the big picture is missing.

So let me see if I have this. You want to add a special dunder method which, if it exists, is automatically called by the interpreter to preprocess the subscript before passing it to the usual get-, set- and del-item dunders. So instead of having this:

# hypothetical change to subscript behaviour
# to allow multiple arguments
def __getitem__(self, fee, fi, fo, fum): ...
# and similar for __setitem__ and __delitem__

We will have this:

def __subscript__(self, fee, fi, fo, fum):
    return (fee, fi, fo, fum, {})

def __getitem__(self, fee, fi, fo, fum, **kw):
    assert kw == {}
    ...
# and similar for __setitem__ and __delitem__

and the process changes from:

* interpreter passes arguments to the appropriate dunder

to this instead:

* interpreter passes arguments to the subscript dunder
* which preprocesses them and returns them
* and the interpreter then passes them to the appropriate dunder.

I'm underwhelmed. I *think* your intention here is to handle the transition from the status quo to full function-like parameters in subscripts in a backwards compatible way, but that's not going to work. The status quo is that the subscript is passed as either a single value, or a tuple, not multiple arguments.
If that *parsing rule* remains in place, then these two calls are indistinguishable:

obj[spam, eggs]
obj[(spam, eggs)]

and your subscript dunder will only receive a single argument because that's what the parser sees. So you need to change the parser rule. But that breaks code that doesn't include the subscript dunder, because now this:

obj[spam, eggs]

gets passed as two args, not one, and `__getitem__` has only been written to accept one, so you get a TypeError. Your subscript preprocessor would allow the coder to stick spam and eggs into a tuple and pass it on, but it also returns a dict so the getitem dunder still needs to be re-written to accept `**kwargs` and check that it's empty, so you're adding more, not less, work. And besides, if I have to add a brand new dunder method to my class in order for my item getter to not break, it's not really backwards-compatible.

Okay, let's make the interpreter smarter: it parses spam, eggs as two arguments, and then sees that there is no subscript dunder, so it drops back to "legacy mode", and assembles spam and eggs into a tuple before passing it on to the item getter. Only that's not really backwards compatible either, because the interpreter can't distinguish the two cases:

# single argument with trailing comma
obj[spam,]
# no trailing comma
obj[spam]

In both cases this looks like a single argument to the interpreter, but the status quo is that they are different. The first one needs to be a tuple of one item. Why add all this complexity only to fail to remain backwards-compatible?

Better would be to add a new future directive to change the parsing of subscripts, and allow people to opt-in when they are ready on a per-module basis.

from __future__ import subscript_arguments

This sort of change in behaviour is exactly why the future mechanism was invented.
If it is desirable to change subscripting to pass multiple positional arguments, then we should use that, not complicated jerry-rigged "Do What I Mean" cunning plans that fail to Do What I Meant. Notice that none of the above needs to refer to keyword arguments. We could leave keyword arguments out of your proposal, and the argument parsing issue remains.

-- Steve
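Steven's point about today's parsing can be checked directly: by the time `__getitem__` runs, `obj[1, 2]` and `obj[(1, 2)]` are identical, while `obj[1,]` and `obj[1]` differ only by tupling (the `Probe` class is just for recording):

```python
# Record the key each subscript form delivers under current Python.
seen = []

class Probe:
    def __getitem__(self, key):
        seen.append(key)

p = Probe()
p[1, 2]      # parsed as one tuple argument...
p[(1, 2)]    # ...exactly the same as writing the tuple explicitly
p[1,]        # trailing comma: a one-element tuple
p[1]         # no comma: the bare int
assert seen == [(1, 2), (1, 2), (1,), 1]
```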

On Fri, Aug 28, 2020 at 10:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
I'm sorry you had to read over my email so many times. I'm sorry I have confused things by throwing out more than one idea at a time. I am sorry I am so bad at explaining things. Hopefully now it is clear, regardless of how we got there. It sounds like it is.

I was really trying to explain the semantics multiple times, but as I look over my messages your criticism is correct: I was throwing out too many detailed examples rather than focusing on the idea. For your benefit, since my explanations weren't sufficient, I wrote a bunch of admittedly junky code, which is sometimes easier to understand (even if it is buggy) than English, in an attempt to showcase the idea more clearly.

> So let me see if I have this. You want to add a special dunder method [...]
Yes, that's the basic idea as I envision it and as Jonathan Fine wrote in the first message in this thread.
I think a new dunder is a good idea. I've explained why a couple times but I can try again if you'd like. On the other hand, we've established I'm bad at explaining things so maybe not a great idea. I can point you to this comment from Greg Ewing in the other thread where I first brought up the new dunders (3 new dunders, in that case) idea, maybe it will be better than I can do (however he's talking about 3 dunders-- still hoping he and others might come around to the idea of just one): https://mail.python.org/archives/list/python-ideas@python.org/message/NIJAZK...
I'm not so sure that's fully true. There are certainly problems that need to be worked out.
Yes I've made this observation myself in a couple different replies and I agree it's a problem. Greg Ewing (again!) had a helpful comment about it, perhaps he is correct: https://mail.python.org/archives/list/python-ideas@python.org/message/XWE73V... "We could probably cope with that by generating different bytecode when there is a single argument with a trailing comma, so that a runtime decision can be made as to whether to tupleify it. However, I'm not sure whether it's necessary to go that far. The important thing isn't to make the indexing syntax exactly match function call syntax, it's to pass multiple indexes as positional arguments to __getindex__. So I'd be fine with having to write a[(1,)] to get a one-element tuple in both the old and new cases. It might actually be better that way, because having trailing commas mean different things depending on the type of object being indexed could be quite confusing."
No, that's not right. The kwargs mapping included in the return by the preprocessor gets unpacked in the item dunder call. An unpacked empty dict in an existing item dunder (without kwargs support) creates no error at all. Yes, if the kwargs dict contains argument names not supported by the signature of the item dunders, we will get an error. But that's true with any function call. So, you know, don't do that.

> And besides, if I have to add a brand new dunder method to my class in order for my item getter to not break, it's not really backwards-compatible.
How is it going to be broken? Unless you add the new dunder to your class, it won't be operative. The implicitly existing internal python function that does the job of this proposed dunder acts as the preprocessor instead of the dunder.

> Okay, let's make the interpreter smarter: it parses spam, eggs as two [...]
That's one way to think about it. Another way to think about it is: there is an existing preprocessing function in cpython, today-- and that function is currently not exposed for external use, and it currently gets passed (spam, eggs) as a tuple and silently sends that tuple on to the item dunders:

def cpython_subscript_preprocessor(key_or_index): ...

We modify that existing function signature so that it is passed positional arguments, something like this:

def cpython_subscript_preprocessor(*args): ...

To achieve the behavior we have today, that updated function checks to see if len(args) == 1, and if it is, it returns args[0]. Otherwise, it returns just args (a tuple). I believe that should replicate the behavior we have now. This is intended as a semantic description, just using code as an aid in explaining the idea.
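A runnable sketch of that default preprocessor, under the assumption (from Ricky's later description) that it returns the (args, kwargs) two-tuple that gets unpacked into the item dunders; the function name is hypothetical:

```python
# Default preprocessor: reproduce today's tuple-or-scalar key behaviour,
# packaged in the proposed (positional_args, keyword_args) return shape.
def default_subscript_preprocessor(*args, **kwargs):
    if len(args) == 1:
        key = args[0]   # single subscript: pass the object itself through
    else:
        key = args      # multiple subscripts: pass them as one tuple
    return (key,), kwargs

# q[5]          -> __getitem__(5)
assert default_subscript_preprocessor(5) == ((5,), {})
# q[spam, eggs] -> __getitem__((spam, eggs))
assert default_subscript_preprocessor(1, 2) == (((1, 2),), {})
```

Note this assumes the parser has already been changed to deliver subscripts as separate positional arguments, which is exactly the parsing question Steven raises.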
> Only that's not really backwards compatible either, because the [...]

As I explained above, that can be handled just fine by replicating the existing cpython preprocessor with the correct signature, and parsing it consistent with current python behavior. Cpython pseudo code to explain:

def cpython_subscript_preprocessor(*args):
    ...
    try:
        args, = args
    except ValueError:
        pass

Then return args in whatever way the API becomes defined. My idea at the moment is to return it as a two-tuple that looks like this:

(item_dunder_args, item_dunder_kwargs)

...and that two-tuple gets unpacked in the item dunders this way:

obj.__getitem__(*item_dunder_args, **item_dunder_kwargs)

For the current default internal python preprocessing function, in the case of spam, eggs it would return:

(((spam, eggs),), {})

...and that two-tuple gets unpacked in the item dunders this way:

obj.__getitem__(*((spam, eggs),), **{})

> In both cases this looks like a single argument to the interpreter, but [...]
It doesn't fail, as I explained.
This might be an even better idea. Are you proposing it? But it would certainly break a lot of code to eventually make that change, so I'm unsure I would support it... maybe I could be talked into it, I don't know. A new dunder seems far more friendly to existing code.

> This sort of change in behaviour is exactly why the future mechanism was [...]
Cool. I'm interested.

> Notice that none of the above needs to refer to keyword arguments. We [...]
Understood, but if the intention of the entire proposal is to shift subscripting as much as possible to a function calling paradigm, it would be very weird to leave out kwd arguments in the process.

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 29/08/20 4:50 pm, Ricky Teachey wrote:
Whereas I'm hoping that you might come around to the idea of three, :-) Your version is simpler in the sense that it uses fewer new dunders. However, it comes at the cost of more complexity, both conceptually and implementation-wise, and requires constructing new objects on every indexing operation, which is a fairly expensive thing to do. My version is based on a vision of what the indexing dunders might have been like if we'd had positional and keyword indexing from the beginning. If Python 4000 ever happens, the old dunders could be cleanly removed, just leaving the new ones. There's no such upgrade path for your version. -- Greg

On Sat, 29 Aug 2020 at 05:53, Ricky Teachey <ricky@teachey.org> wrote:
On Fri, Aug 28, 2020 at 10:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
I've read the discussion. I'm also not impressed by this proposal. It's not lack of understanding, so repeating your explanations isn't going to be worth it, I just don't think this is a good trade-off in terms of complexity vs benefit. There seems to be quite a lot of tendency here (not just you, others are doing it too) to assume "you didn't find my arguments convincing, so I'll explain them again and hopefully you'll understand them better". The problem isn't lack of understanding, it's just that *the arguments aren't convincing people*. Come up with new arguments, or accept that people don't agree with you. It's getting pretty hard to follow this discussion, simply because any genuinely new points are getting lost in a swamp of re-hashed explanations of arguments and suggestions that have already been made. On Sat, 29 Aug 2020 at 07:27, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
[...]
Three dunders makes sense to me, *because* it has a consistent underlying vision, we have three dunders right now, and they all handle subscripts in the same way. Any new approach to subscripting will need to be reflected either 3 times in the three dunders (your approach) or via some sort of intermediate "adapter" (Ricky's approach). Doing it cleanly seems better. Having said that, the whole thing seems like an over-complicated attempt to solve a quirk of the existing syntax that doesn't really need solving (unless there are important use cases lost in the swamp of repeated arguments I mentioned above :-(). We can simply allow for keywords to be passed to the *existing* dunders, and get enough benefit to address the key use cases, without all of this hassle. So I remain unimpressed by the bulk of the arguments here, and unconvinced that we need *any* of these proposals. Paul

On Sat, Aug 29, 2020, 5:08 AM Paul Moore <p.f.moore@gmail.com> wrote:
I am fully willing to accept that people don't think it's a good idea. I spent time continuing to explain because it was actually clear that Steve, at least, did not understand what the proposal on this thread actually was until his most recent message. I'm sure that's mostly my fault. Anyway, moving on from explaining what it is.

I am not a person that assumes my ideas are so fantastic that if people don't support them they must not understand. On the contrary, I'm assuming my ideas probably aren't good, and if people don't like them I seek to understand why. And everyone has been immensely helpful. The signature dependent semantics idea I suggested on the other thread, for example, was rightly shot down because it won't work with python's dynamic nature. So I abandoned it.

> So I remain unimpressed by the bulk of the arguments here, and unconvinced that we need *any* of these proposals.
>
> Paul
Here's one reason I find a new dunder or dunders compelling that I haven't seen anyone respond to directly: I can write functions that define named arguments, and if I pass them positionally, they get assigned the right name automatically (unless disallowed by the signature using 3.8 positional-only syntax):

def f(x, y): ...

f(1, 2)
f(1, y=2)
f(y=2, x=1)

If we add kwargs to the subscript operator, we'll be able to add new required or optional names to item dunder signatures and supply named arguments:

def __getitem__(self, key, x, y): ...

q[x=1, y=2]

But if we want to have the same behavior without supporting function style syntax, we will have to write code like this:

MISSING = object()

def __getitem__(self, key, x=MISSING, y=MISSING):
    if x is MISSING and y is MISSING:
        x, y = key
    if x is MISSING:
        x, = key
    if y is MISSING:
        y, = key

And probably that code I just wrote has bugs. And it gets more complicated if we want to have more arguments than just two. And even more complicated if we want some of the arguments to be positional only or any other combination of things. This is code you would not have to write if we could do this instead with a new dunder or subscript processor:

def __getx__(self, x, y): ...

And these all just work:

q[1, 2]
q[1, y=2]
q[y=2, x=1]

1 is assigned to x and 2 is assigned to y in all of these for both versions, but the second version requires no parsing of parameters. Python does it for us. That's a lot of easily available flexibility.
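The "Python does the parsing for us" benefit Ricky describes can be demonstrated today with the standard library's own signature machinery; this is a sketch of the idea, not the proposal itself (the proposal would have the interpreter do this binding):

```python
import inspect

def bind_subscript(func, *args, **kwargs):
    # Let Python's signature machinery match positionals to names,
    # apply defaults, and raise TypeError on bad calls.
    sig = inspect.signature(func)
    bound = sig.bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)

# A hypothetical translation target with named subscript parameters.
def getx(x, y):
    return (x, y)

# All three call shapes resolve to the same named arguments, with no
# hand-written unpacking code anywhere.
assert bind_subscript(getx, 1, 2) == {"x": 1, "y": 2}
assert bind_subscript(getx, 1, y=2) == {"x": 1, "y": 2}
assert bind_subscript(getx, y=2, x=1) == {"x": 1, "y": 2}
```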

For me, a satisfactory outcome from the current PEP process would be a new dunder, which I am calling __keyfn__, that has two possible values, namely None or True. (And of course, the associated syntax and semantics changes. And documentation changes. These are not minor matters.) As __keyfn__ has only two values, storing the choice in the class requires only a single bit (not byte, but bit). That's the memory cost. And the run-time performance cost would also be small. Ricky has given some examples. Here are more, all assuming __keyfn__ = True. First, this use of __keyfn__ would allow
Some further examples:

>>> d[1, 2]
>>> d.__getitem__(1, 2)
>>> d[(1, 2)]
>>> d.__getitem__((1, 2))
>>> d[a=1, b=2]
>>> d.__getitem__(a=1, b=2)

I find the above easy to understand and use. For Steven's proposal the calls to __getitem__ would be

>>> d[1, 2, z=3]
>>> d.__getitem__((1, 2), z=3)
>>> d[1, 2]
>>> d.__getitem__((1, 2))
>>> d[(1, 2)]  # Same result as d[1, 2]
>>> d.__getitem__((1, 2))  # From d[(1, 2)]
>>> d[a=1, b=2]
>>> d.__getitem__((), a=1, b=2)
I find these harder to understand and use, which is precisely the point Ricky made in his most recent post. That's because there's a clear and precise analogy between

>>> x(1, 2, a=3, b=4)
>>> x[1, 2, a=3, b=4]

I think it reasonable to argue adding a single bit to every class is not worth the benefit it provides. However, this argument should be supported by evidence. (As indeed should the argument that it is worth the benefit.) I also think it reasonable to argue that now is not the time to allow __keyfn__ to have values other than None or True. And that allowing further values should require an additional PEP. I don't recall seeing an argument that Steven's proposal is as easy to understand and use as mine (with __keyfn__ == None).

-- Jonathan

So __keyfn__ has literally only two allowed values, None and True. Can you explain (in words or with examples) what happens in each case? You were so excited to show off one case that you never showed how the other case would work (and you tripped up over the value of __keyfn__ that you were demonstrating). Could you just start over? And explain the name? And why it can't be False/True? On Sat, Aug 29, 2020 at 8:52 AM Jonathan Fine <jfine2358@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

Thank you, Guido, for your interest in this discussion. You wrote:
So __keyfn__ has literally only two allowed values, None and True.
That's my proposal, for now. I'm happy for further allowed values to be added via another PEP. In fact, that is my preference. You also wrote:
Could you just start over? And explain the name? And why it can't be False/True?
Consider

>>> d[1, 2] = 'val'
>>> d.__setitem__((1, 2), 'val')

Here I call (1, 2) the key. As in

def __setitem__(self, key, val):
    # set the value

The passage from

>>> d[1, 2] = 'val'

to

>>> d.__setitem__((1, 2), 'val')

goes via tuple. If tuple didn't already exist, we'd have to invent it. And so here tuple is what I call "the key function". The key function is the function that is the intermediary between

>>> d[EXPRESSION]
>>> d.__getitem__(key)

When there is a key function, the signatures are

__getitem__(key)
__delitem__(key)
__setitem__(key, val)

So we're allowed to have

class A:
    __keyfn__ = None
    def __setitem__(self, val, a, b, c, d):
        # set the value

Recall that dict has, implicitly, a way of getting a key from permitted arguments. The meaning of

class B:
    __keyfn__ = True
    def __getitem__(self, key):
        # Get the value

is that the key is produced by exactly the same method as in dict. The meaning of __keyfn__ = True is "produce a key from the arguments, in exactly the same way as in dict". Or in other words, "Yes, it's True. We do have a keyfn. Use the default, dict, keyfn."

I think (None, True) works better than (False, True). This is because a further PEP might allow the user to supply a custom keyfn. So while it is at this time a binary choice, there might be further choices in future.
How about:

class A:
    __keyfn__ = None
    def __setitem__(self, val, x=0, y=0, z=0):
        print((val, x, y, z))

>>> a = A()
>>> a[1, z=2] = 'hello'
('hello', 1, 0, 2)

[Above copied from https://mail.python.org/archives/list/python-ideas@python.org/message/P3AW6G... ]

And also

class C:
    __keyfn__ = True
    def __setitem__(self, *argv, **kwargs):
        print(f'argv={argv} | kwargs={kwargs}')

>>> c = C()
>>> c[1] = 'val'
argv=(1, 'val') | kwargs={}
>>> c[1, 2] = 'val'
argv=((1, 2), 'val') | kwargs={}
>>> c[a=1] = 'val'
TypeError: __keyfn__ got unexpected keyword argument 'a'

[Above copied from https://mail.python.org/archives/list/python-ideas@python.org/message/RNQFT4... ]

I hope this helps. -- Jonathan
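[Editorial note] Jonathan's two modes can be simulated with an ordinary helper function standing in for the proposed machinery. This is only a sketch of the semantics described above: `__keyfn__` is the proposed (not existing) attribute, and `subscript_set` is a made-up name standing in for the `d[...] = val` dispatch.

```python
def subscript_set(obj, val, *argv, **kwargs):
    # Simulates the proposed d[...] = val dispatch under __keyfn__
    # (a hypothetical attribute; absent is treated like True here).
    if getattr(type(obj), '__keyfn__', True):
        # Default keyfn: build a single key exactly as dict-style
        # subscripting does today; keywords are not allowed.
        if kwargs:
            raise TypeError("keywords not allowed with the default keyfn")
        key = argv[0] if len(argv) == 1 else argv
        obj.__setitem__(key, val)
    else:
        # Function-style: value first, then the arguments as given.
        obj.__setitem__(val, *argv, **kwargs)

class A:
    __keyfn__ = None
    def __setitem__(self, val, x=0, y=0, z=0):
        self.last = (val, x, y, z)

a = A()
subscript_set(a, 'hello', 1, z=2)   # stands in for a[1, z=2] = 'hello'
assert a.last == ('hello', 1, 0, 2)
```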

On Sat, Aug 29, 2020 at 11:04 AM Jonathan Fine <jfine2358@gmail.com> wrote: [snip]
IIUC in order to get these semantics under your proposed system, I should either leave __keyfn__ unset (for backward compatible behavior) or set it explicitly to True. Is that correct?
Okay, I am beginning to understand your proposal (despite vehemently disagreeing). You propose that setting __keyfn__ = None should change the signature of __setitem__ so that 1. the value is placed first (before the "key" values) 2. the rest of the arguments (whether positional or keywords) are passed the same way as for a function
Yes. I find it a big flaw that the signature of __setitem__ is so strongly influenced by the value of __keyfn__. For example, a static type checker (since PEP 484 I care deeply about those and they're popping up like mushrooms :-) would have to hard-code a special case for this, because there really is nothing else in Python where the signature of a dunder depends on the value of another dunder. And in case you don't care about static type checkers, I think it's the same for human readers. Whenever I see a __setitem__ function I must look everywhere else in the class (and in all its base classes) for a __keyfn__ before I can understand how the __setitem__ function's signature is mapped from the d[...] notation.

Finally, I am unsure how you would deal with the difference between d[1] and d[1,], which must be preserved (for __keyfn__ = True or absent, for backwards compatibility). The bytecode compiler cannot assume to know the value of __keyfn__ (because d could be defined in another module or could be an instance of one of several classes defined in the current module). (I think this problem is also present in the __subscript__ version.)

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020 at 4:08 PM Guido van Rossum <guido@python.org> wrote:
This problem is actually also present in Steven's version (which just passes keyword args as **kwargs to `__getitem__` and `__setitem__`). We could treat d[1, a=3] either as d[1,] + kwargs or as d[1] + kwargs. Have people debated this yet? (It is not a problem in Jonathan's version for `__keyfn__ = None`, but since the proposal also has to support `__keyfn__ = True`, it is still a problem there.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020 at 04:31:53PM -0700, Guido van Rossum wrote:
Good catch! I don't think that anyone wants adding a keyword to a single-valued subscript to change it to a tuple. At least, I really hope that nobody wants this!

So given the current behaviour:

obj[1]   # calls __getitem__(1)
obj[1,]  # calls __getitem__((1,))

I expect that the first will be the most common. If we add a keyword to the subscript:

obj[1, a=3]

I would expect that it turns into `__getitem__(1, a=3)`, which is almost surely what the reader and coder expects. It would be quite weird for the subscript 1 to turn into a tuple just because I add a keyword.

That does leave the second case a little trickier to add a keyword to; it would require a pair of parens to disambiguate it from the above:

obj[(1,), a=3]

but I think that's likely to be obvious to the developer who is adding in the keyword where previously no keyword existed.

-- Steve
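[Editorial note] Steven's rule can be previewed today with a class whose `__getitem__` simply reports its arguments. The keyword forms are simulated as direct dunder calls, since the bracket syntax does not yet accept keywords; only the keyword-free lines use real subscript syntax.

```python
class Obj:
    def __getitem__(self, key, **kwargs):
        return (key, kwargs)

o = Obj()
# Current behaviour, unchanged by the proposal:
assert o[1] == (1, {})
assert o[1,] == ((1,), {})
# Proposed keyword forms, simulated as direct calls:
assert o.__getitem__(1, a=3) == (1, {'a': 3})          # obj[1, a=3]
assert o.__getitem__((1,), a=3) == ((1,), {'a': 3})    # obj[(1,), a=3]
```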

On Sat, Aug 29, 2020 at 7:30 PM Steven D'Aprano <steve@pearwood.info> wrote:
That's a fair ruling. In general, when keywords are present, the rule that you can always omit an outermost pair of parentheses is no longer true. That is, d[(...)] and d[...] are always equivalent regardless of what "..." stands for, as long as (...) is a valid expression (which it isn't if there are slices involved). Example:
```
d[1] ~~~ d[(1)]
d[1,] ~~~ d[(1,)]
d[1, 2] ~~~ d[(1, 2)]
```
But there is absolutely no such rule if keywords are present.

FYI, Jonathan's post (once I "got" it) led me to a new way of reasoning about the various proposals (__keyfn__, __subscript__ and what I will keep calling "Steven's proposal") based on what the compiler and interpreter need to do to support this corner case. My tentative conclusion is that Steven's proposal is superior. But I have been reviewing my reasoning and pseudo-code a few times and I'm still not happy with it, so posting it will have to wait.

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020 at 10:29 PM Guido van Rossum <guido@python.org> wrote:
I've spent some more time thinking about this, focused on two things: absolute backwards compatibility for d[1] vs. d[1,], and what should happen at the C level. There are both the type slots and API functions like PyObject_{Get,Set}Item to consider. Interestingly, it seems Jonathan's proposal and __subscript__ are inspired by the desire to keep both type slots and PyObject_{Get,Set}Item unchanged, but have the disadvantage that it's hard to keep d[1] vs. d[1,] straight, and add a lot of complexity to the bytecode interpreter.

At this point my recommendation is to go with Steven's proposal, constructing a "key" from the positional args, taking d[1] vs. d[1,] into account for backwards compatibility, and passing keywords as a dict to new C API functions PyObject_GetItemEx and PyObject_SetItemEx. These functions then call the mp_subscript and mp_ass_subscript slots.

There will have to be a flag in the type object to declare whether the slot functions take a dict of keywords. This flag is off by default (for backwards compatibility) but on for type objects wrapping Python classes -- the dict of keywords will then be passed as **kwargs to __getitem__ or __setitem__ (the call will fail if the Python __getitem__ or __setitem__ doesn't take the specific keywords in the dict). If the flag in the type slot says "doesn't take dict of keywords" then passing a non-empty dict causes a type error. Ditto if there are keywords but the type slots are in tp_as_sequence rather than in tp_as_mapping. (The sequence slots only take a single int and seem to be mostly a legacy for sequences -- even slices go through tp_as_mapping.)

There is one final case -- PyObject_GetItem on a type object may call __class_getitem__. This is used for PEP 585 (list[int] etc.). In this case we should pass keywords along. This should be relatively straightforward (it's not a slot).
A quick summary of the proposal at the pure Python level: ``` d[1] -> d.__getitem__(1) d[1,] -> d.__getitem__((1,)) d[1, 2] -> d.__getitem__((1, 2)) d[a=3] -> d.__getitem__((), a=3) d[1, a=3] -> d.__getitem__((1,), a=3) d[1, 2, a=3] -> d.__getitem__((1, 2), a=3) d[1] = val -> d.__setitem__(1, val) d[1,] = val -> d.__setitem__((1,), val) d[1, 2] = val -> d.__setitem__((1, 2), val) d[a=3] = val -> d.__setitem__((), val, a=3) d[1, a=3] = val -> d.__setitem__((1,), val, a=3) d[1, 2, a=3] = val -> d.__setitem__((1, 2), val, a=3) ``` Do we want to support d[**kwargs]? It can be done, alternatively we could just ask the user to write the __getitem__/__setitem__ call explicitly. I think we should say no to d[*args], because that will just become d[(*args)], with awkward questions around what if args == (1,). Maybe then for consistency we needn't bother with **kwargs, though the case for that is definitely stronger. Sorry for telegraphing this -- I am way past my bedtime but looking at this from the C API POV definitely made some things clear to me. I'm probably posting this in the wrong thread -- I can't keep up (and GMail splits threads after 100 messages, which doesn't help). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

IMPORTANT CORRECTION! I was too eager to get to bed and made a mistake in the summary for the d[1, a=3] cases. The key here should be '1', not '(1,)'. On Sun, Aug 30, 2020 at 12:45 AM Guido van Rossum <guido@python.org> wrote:
SHOULD BE: d[1, a=3] -> d.__getitem__(1, a=3)
SHOULD BE: d[1, a=3] = val -> d.__setitem__(1, val, a=3)
d[1, 2, a=3] = val -> d.__setitem__((1, 2), val, a=3)
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
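[Editorial note] With the correction applied, the full mapping can be exercised with a recording class. This is only a sketch: the keyword subscript forms are written as the direct dunder calls they would desugar to, since the syntax itself is not yet valid.

```python
class Rec:
    def __getitem__(self, key, **kw):
        return ('get', key, kw)
    def __setitem__(self, key, val, **kw):
        self.last = ('set', key, val, kw)

d = Rec()
# Existing syntax keeps its meaning:
assert d[1] == ('get', 1, {})
assert d[1,] == ('get', (1,), {})
assert d[1, 2] == ('get', (1, 2), {})
# Proposed keyword forms, simulated as direct calls:
assert d.__getitem__((), a=3) == ('get', (), {'a': 3})          # d[a=3]
assert d.__getitem__(1, a=3) == ('get', 1, {'a': 3})            # d[1, a=3] (corrected)
assert d.__getitem__((1, 2), a=3) == ('get', (1, 2), {'a': 3})  # d[1, 2, a=3]
d.__setitem__((1, 2), 'v', a=3)                                 # d[1, 2, a=3] = 'v'
assert d.last == ('set', (1, 2), 'v', {'a': 3})
```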

On 30/08/20 7:45 pm, Guido van Rossum wrote:
I thought we usually discouraged directly calling dunders unless there's no alternative, because there is often extra processing in between the language syntax and the corresponding dunder that would get skipped. It wouldn't make a difference in this case given the implementation you describe, but I think it's just tidier to be able to avoid the direct call. We don't have to decide now, though -- it can be added later.
I think we should say no to d[*args], because that will just become d[(*args)],
Which is also equivalent to d[args]. But if we have d[**kwds] without d[*args] I expect there will forever be people asking how to do d[*args]. So maybe allow it but just ignore the *. -- Greg

On Sun, Aug 30, 2020 at 2:43 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't think so. People who are asking for that are probably not expecting what they will get. Dropping such an operator silently will probably cause more confusion than it resolves. IIRC there's a situation in C where you can call a function pointer using either `(fp)(args)` or `(*fp)(args)`, and that drives me nuts. If this question is asked a lot, a StackOverflow entry for it can be crafted to explain it once and for all. (We've done this for other things.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 31/08/20 4:23 pm, Guido van Rossum wrote:
I'm wondering whether parens should be required when there are both keyword args and more than one positional arg in an index. I.e. instead of

a[1, 2, k = 3]

you would have to write

a[(1, 2), k = 3]

That would make it clearer that the indexing syntax still really only takes one positional arg, and why, if you transform it into

idx = (1, 2)
a[idx, k = 3]

you don't/can't write it as

a[*idx, k = 3]

-- Greg

On Mon, Aug 31, 2020 at 12:12 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I see the appeal here, but then:

a[1, 2]

would be legal, and suddenly

a[1, 2, this=4]

would not. If the keywords were optional (aren't they always?) then that would be pretty confusing.

-CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython

On Mon, Aug 31, 2020, 3:10 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Yes. Please. We've already created a screwy situation now that people seem to agree kwd args are something we want here. I really don't want to see it get even more screwy by allowing syntax that looks so very much like a standard function call, except with a different bracket shape, and acts totally differently underneath.

If we are not going to fix things to utilize functional style arguments using a new dunder or dunders, and we're not willing to break anything now or in the future, this limitation seems very wise to me. And the limitation will also leave open the door to more easily allowing a functional style option.

Right now we have this problem to deal with:

f(1)
f(1,)
q[1]
q[1,]

...the f's and the q's can't be reconciled because History. We are creating a second problem, except worse:

f(1, 2, k=3)
f((1, 2), k=3)
q[1, 2, k=3]
q[(1, 2), k=3]

...these won't be able to be reconciled with each other if we go forward with allowing tuples to *poof* into existence regardless of whether there is a bunch of keywords hanging off the end in the subscript brackets.

Would we ever in a million years have syntax like this? What could this mean?

a = [x, y, k=3]

I'd say no, syntax like this is just weird. But if it did exist, would you expect x, y to get packed into a tuple together...? Nobody would.

On Mon, Aug 31, 2020 at 12:09 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I think this would be horrible for the poor user who just wants to do something with an array data structure (e.g. something like xarray) using three dimensions, only one of which is named.
This would make things simpler to understand for the implementer, but it's bad for the user. I'm sure if you ask the xarray implementers they don't care *how* keywords work as long as their users can write a[1, 2, k=3]. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

A thought just occurred to me. If we hadn't got rid of tuple unpacking in argument lists, we would have been able to write def __getitem__(self, (x, y, z), **kwds): ... -- Greg

On 8/30/2020 7:25 AM, Steven D'Aprano wrote:
I was sad to see tuple arg unpacking go. But PEP 3113 doesn't mention parsability as a reason to remove it. And indeed, since it worked with the old parser, the PEG parser would presumably have no problem with it. PEP 3113 mentions introspection, which is why I grudgingly accept that it had to go. Maybe if someone solved that problem we could get them back. But that would be a major effort, separate from this proposal, and would of course require another PEP. Eric

On 31/08/20 3:37 am, Guido van Rossum wrote:
That can't be an absolute law. If a species loses a feature, it's because the environment has changed so as to make it no longer advantageous. If there is another environmental change that makes it advantageous again, I can't see why it couldn't come back. Maybe not exactly the same, but something very similar.

Here we have a situation where the environment has changed. I'd like to propose bringing back something that is superficially similar, but with some differences.

def __getitem__(self, (i, j, k)):

The differences are:

1. There can only be one set of parens, and they must enclose all except the first positional argument.

2. Arguments within the parens can be specified by keyword.

Difference 2 is what makes this more than just syntactic sugar for taking one argument and unpacking it later, which was the reason for eliminating argument unpacking originally.

If you're worried about people abusing this for purposes other than the one intended, another restriction could be added:

3. (Optional) It can only be used in functions named __getitem__, __setitem__ or __delitem__.

-- Greg

On Sun, Aug 30, 2020 at 4:58 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I'm sure it's not an absolute law. But it's an apt observation.
How would this even work? Can I write a.__getitem__((1, 2), k=3) and the function will see (i, j, k) == (1, 2, 3)? Okay, and if I write a.__getitem__((1, 3), k=2) will the function see the same thing? I've got the feeling you're pranking me here, and I'm falling for it hook, line and sinker.
Let's define a decorator or other helper to do this instead. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 31/08/20 4:11 pm, Guido van Rossum wrote:
Can I write a.__getitem__((1, 2), k=3) and the function will see (i, j, k) == (1, 2, 3)?
Yes.
Okay, and if I write a.__getitem__((1, 3), k=2) will the function see the same thing?
No, it will see (i, j, k) == (1, 3, 2). It's the same as if you were calling an ordinary function and passing the tuple using *:

f(*(1, 3), k=2)
I've got the feeling you're pranking me here, and I'm falling for it hook, line and sinker.
No, it was a serious suggestion. But if you don't like it, that's fine. -- Greg

On Sun, Aug 30, 2020 at 11:56 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
That was a typo. I meant to ask whether `a.__getitem__((1, 3), j=2)` would see `(i, j, k) == (1, 2, 3)`. But it should probably be an error ("duplicate value for j").
No, it was a serious suggestion. But if you don't like it, that's fine.
Okay. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

I've written a decorator to go along with Guido's proposed implementation, to make it easier to write item dunders that take positional args that can also be specified by keyword.

#-----------------------------------------------
from inspect import signature

def positional_indexing(m):
    def f(self, args, **kwds):
        if isinstance(args, tuple) and len(args) != 1:
            return m(self, *args, **kwds)
        else:
            return m(self, args, **kwds)
    f.__name__ = m.__name__
    f.__signature__ = signature(m)
    return f
#-----------------------------------------------

Usage example:

#-----------------------------------------------
class Test:

    def __init__(self):
        self.data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    @positional_indexing
    def __getitem__(self, i, j):
        return self.data[i][j]

t = Test()
print(signature(t.__getitem__))
print(t[1, 2])
# Have to fake this for now until we get real keyword index syntax
print(t.__getitem__((), j = 2, i = 1))
#-----------------------------------------------

Output:

(i, j)
6
6

-- Greg

Guido van Rossum wrote:
I hope I understood correctly because Mailman eats the * signs for formatting, but is it possible (and desirable) to have a different behaviour for *args and index when there is only one positional value?

Using "index" would keep the current behaviour: pass a tuple except when there is only one value, in which case the value is passed as-is. On the other hand if *args is used in the signature, it would always get the positional arguments in a tuple, whatever their number. It would avoid the classical isinstance(index, tuple) check.

Here are some examples of what I mean:

# Usual signature
class Simple:
    def __getitem__(self, index):
        print(index)

simple = Simple()
simple[0]     # 0
simple[0, 1]  # (0, 1)

# This is valid python, but useless ?
class Star:
    def __getitem__(self, *index):
        print(index)

star = Star()
star[0]     # (0,)
star[0, 1]  # ((0, 1),)

# I propose this breaking change
class NewStar:
    def __getitem__(self, *index):
        print(index)

star = NewStar()
star[0]     # (0,)
star[0, 1]  # (0, 1)

This is theoretically a breaking change, but who in their right mind would write such a Star class with the current python? Any thoughts?

You appear to be making a connection between star-args in a call and in a function definition. They are unrelated. The more I hear about this the more I favor not supporting it in the subscript syntax. On Sun, Aug 30, 2020 at 08:44 Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:

Guido van Rossum wrote:
You appear to be making a connection between star-args in a call and in a function definition. They are unrelated.
I initially thought that indexing was more function-like with a bit of magic to handle one or multiple arguments, and I was wondering if this "magic" could be changed. Reading the other mails, I understand now that it's just a single expression inside the brackets, and the tuple creation is just because of the commas. So my proposal is wrong, sorry for the noise. Joseph

On Sun, Aug 30, 2020 at 8:44 AM Joseph Martinot-Lagarde < contrebasse@gmail.com> wrote:
Sorry about that. Yes, where Mailman shows `args` and following text suddenly in italics, it ate a single `*`. Until the next `args` where it ended the italics, that was another `*args`.
This cannot be done, because the signature of the function definition is thoroughly hidden from the call site. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sun, Aug 30, 2020 at 8:44 AM Joseph Martinot-Lagarde < contrebasse@gmail.com> wrote:
I hope I understood correctly because Mailman eats the * signs for formatting,
I just realized that Mailman (or some other part of the email toolchain -- maybe GMail?) has apparently a handy (:-) feature to work around this. There's an attachment named "attachment.htm". If you click to download this and then open the downloaded file in the browser, it shows the original message in plaintext. Yeah, it's not ideal, but at least it proves that I did type what I meant. :-) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Sat, Aug 29, 2020, 11:42 AM Jonathan Fine <jfine2358@gmail.com> wrote:
This is obviously a different way of doing things than I have been explaining but it accomplishes nearly the same purpose in a simpler and probably cheaper way. Thank you Jonathan Fine. I would support this version of the idea implementation, too. Mine is far more complicated, as Paul Moore rightly pointed out.

On Sat, 29 Aug 2020 at 15:12, Ricky Teachey <ricky@teachey.org> wrote:
I was partway through writing a message outlining this very point. It's all well and good stating that named indices are an intended use case (as in the PEP), but in cases where named indices would be useful they presumably aren't currently being used (abuses of slice notation notwithstanding). As such, if a variant of PEP 472 were implemented, code that would benefit from named indices must still find a way to support 'anonymous' indices for backwards compatibility.

I believe the code to implement that ought to be much more obvious (both to the author and to readers) if 'anonymous' and named indices were simply handled as positional and keyword arguments, rather than manually parsing and validating the allowable combinations of indices.

That being said, it should be noted that even if there are no new dunders, you don't necessarily need to parse the subscripts manually. You could still take advantage of python's function-argument parsing via something resembling the following:

```python
def _parse_subscripts(self, /, x, y):
    return x, y

def __getitem__(self, item=MISSING, /, **kwargs):
    if item is MISSING:
        args = ()
    elif isinstance(item, tuple):
        args = item
    else:
        args = (item,)
    x, y = self._parse_subscripts(*args, **kwargs)
    return do_stuff(x, y)
```

However that's still not exactly obvious, as the 'true' signature has been moved away from `__getitem__`, to an arbitrarily named (non-dunder) method.

A major difference between the above, and the case where we had one or more new dunders, is that of introspection: new dunders would mean that there would be a 'blessed' method whose signature exactly defines the accepted subscripts. That would be useful in terms of documentation and could be used to implement parameter completion within subscripts.

---

Theoretically the signature could instead be defined in terms of `typing.overload`s, something like the following (assuming x & y are integers):

```python
@overload
def __getitem__(self, item: tuple[int, int], /): ...

@overload
def __getitem__(self, item: int, /, *, y: int): ...

@overload
def __getitem__(self, /, *, x: int, y: int): ...

def __getitem__(self, item=MISSING, /, **kwargs):
    # actual implementation, as above
    ...
```

However that is incredibly verbose compared to the signature of any new dunder, and would only grow worse with a greater number of keyword subscripts.
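[Editorial note] Adam's first pattern runs today once `MISSING` and `do_stuff` are given concrete definitions. In this sketch `do_stuff` is a stand-in that just returns its arguments; the keyword subscript forms are simulated as direct dunder calls.

```python
MISSING = object()

def do_stuff(x, y):
    # Stand-in for whatever the class actually does with x and y.
    return (x, y)

class T:
    def _parse_subscripts(self, /, x, y):
        return x, y

    def __getitem__(self, item=MISSING, /, **kwargs):
        # Normalize the subscript into positional args, then let
        # ordinary argument binding do the parsing.
        if item is MISSING:
            args = ()
        elif isinstance(item, tuple):
            args = item
        else:
            args = (item,)
        x, y = self._parse_subscripts(*args, **kwargs)
        return do_stuff(x, y)

t = T()
assert t[1, 2] == (1, 2)
assert t.__getitem__(1, y=2) == (1, 2)    # stands in for t[1, y=2]
assert t.__getitem__(x=1, y=2) == (1, 2)  # stands in for t[x=1, y=2]
```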

On 30/08/20 3:49 am, Adam Johnson wrote:
I think this could be done more simply as

def __getitem__(self, index, **kwds):
    return self.real_getitem(*index, **kwds)

def real_getitem(self, x, y):
    ...

The point about obscuring the signature still remains, though. Also, this is a hack that we would never be able to get rid of, whereas new dunders would provide a path to cleaning things up in the future. -- Greg
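[Editorial note] Filled out minimally, Greg's two-method hack works as below. This is a sketch: the tuple normalization for a single index is added here so `m[5]` works (Greg's outline assumes a tuple), and `real_getitem` is the arbitrary helper name from the message above.

```python
class M:
    def __getitem__(self, index, **kwds):
        # Normalize the single-expression case into a tuple,
        # then forward to the real signature.
        if not isinstance(index, tuple):
            index = (index,)
        return self.real_getitem(*index, **kwds)

    def real_getitem(self, x, y=0):
        return (x, y)

m = M()
assert m[1, 2] == (1, 2)
assert m[5] == (5, 0)
assert m.__getitem__((1,), y=2) == (1, 2)  # stands in for m[1, y=2]
```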

On Sat, Aug 29, 2020, at 20:14, Greg Ewing wrote:
The thing that bothers me with new dunders is - can it be done in such a way that *all* possible ways of calling it with only positional arguments behave the same as presently when the old one is defined, and an intuitive way with the new one, without requiring the calling bytecode to know which signature is going to be used?

__getitem__(self, arg) vs __getitem_ex__(self, *args, **kwargs)

x[1]     - old arg=1;      new args=(1,)
x[1,]    - old arg=(1,);   new args=(1,)?
x[(1,)]  - old arg=(1,);   new args=((1,),)?
x[1,2]   - old arg=(1,2);  new args=(1,2)
x[(1,2)] - old arg=(1,2);  new args=((1,2),)?
x[*a]    - old arg=a;      new args=a?

Also, do we want to walk the MRO looking for both in turn, or look for the new one on the whole chain before then looking for the old one? Is there a precedent for two different ways to define a method on a single class? [getattr vs getattribute is not a case of this, and I say "single class" to exclude the complexity of binary operators]

Another concern is where a new setitem should put the value argument. I may have missed something mentioning it, but I don't think I've seen a proposal that goes into detail on that. Having the user define a __setitem__ that calls the real_setitem if needed gets around that by leaving the signature up to the user [where they can e.g. put it in a keyword arg with a name they know they won't need]

On Mon, Aug 31, 2020 at 1:38 PM Random832 <random832@fastmail.com> wrote:
Another concern is where a new setitem should put the value argument? i may have missed something mentioning it, but I don't think I've seen a proposal that goes into detail on that? Having the user define a __setitem__ that calls the real_setitem if needed gets around that by leaving the signature up to the user [where they can e.g. put it in a keyword arg with a name they know they won't need]
Maybe I'm misreading this, but my understanding is that the definition of the dunder is actually how it's called, not what its signature is. The simplest and most obvious way to do it would be to have the interpreter pass the value positionally, and then keyword arguments separately. You can use whatever name you like for the value, and if you want, you can mandate that it be positional-only: def __setitem__(self, key, value, /, **kw): and then you can accept any arguments you like by keyword, even "self", "key", or "value". This doesn't change the issue of tuplification, but it does mean that the issue is no worse for setitem than for getitem. ChrisA
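[Editorial note] Chris's positional-only signature can be verified directly (it requires Python 3.8+ for the `/` marker). The keyword call is simulated as a direct dunder call, since the proposed subscript syntax does not exist yet.

```python
class S:
    def __setitem__(self, key, value, /, **kw):
        # key and value are positional-only, so even keywords named
        # 'key' or 'value' land safely in **kw instead of colliding.
        self.last = (key, value, kw)

s = S()
s[1, 2] = 'v'                      # current syntax, unchanged
assert s.last == ((1, 2), 'v', {})
s.__setitem__(1, 'v', value=9)     # stands in for s[1, value=9] = 'v'
assert s.last == (1, 'v', {'value': 9})
```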

On Mon, Aug 31, 2020, at 00:26, Chris Angelico wrote:
Keep in mind that I am responding to a post that seems to call for new dunder methods that are passed multiple positional "key" arguments instead of a single one.

Passing the value last [i.e. in between the passed-as-positional and passed-as-keyword arguments] seems like a non-starter:

a[1, 2] = 3:    f(self, 1, 2, 3)
a[1, y=2] = 3:  f(self, 1, 3, y=2)
a[2, x=1] = 3:  f(self, 2, 3, x=1)

There's no possible function signature that can reasonably deal with those. Essentially, by being a named argument that is not an intended keyword argument, value acts like / in the argument list, preventing any arguments before it from being passed in as keywords without causing errors. But since it is passed in positionally, it also prevents any arguments after it from being passed in as positional at all.

Passing the value *first* might be reasonable (it may be the only viable way), but it would be a change from how it's currently done, and I do think this needs to be discussed for any proposal around passing in multiple positional keys. Passing the value in as a keyword called __value__ might be another possible way.
This doesn't change the issue of tuplification, but it does mean that the issue is no worse for setitem than for getitem.
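[Editorial note] The value-last conflict can be made concrete with `inspect.Signature.bind`, using a hypothetical value-last signature `f(self, x, y, value)`: the first call shape binds fine, but the keyword shape collides.

```python
import inspect

def f(self, x, y, value):
    # Hypothetical value-last __setitem__-style signature.
    return (x, y, value)

sig = inspect.signature(f)

# a[1, 2] = 3  ->  f(self, 1, 2, 3): binds fine.
sig.bind(None, 1, 2, 3)

# a[1, y=2] = 3  ->  f(self, 1, 3, y=2): 3 lands in y positionally,
# then the keyword y=2 collides with it.
try:
    sig.bind(None, 1, 3, y=2)
    raised = False
except TypeError:
    raised = True
assert raised
```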

On Sun, Aug 30, 2020 at 8:36 PM Random832 <random832@fastmail.com> wrote:
Probably not. (And knowing the signature is impossible -- there are too many layers of C code between the bytecode and the function object being called.) This is why I ended up with the simplest proposal possible -- keyword args get added to the end of `__getitem__` and `__setitem__`, ensuring that `d[1, k=3]` is like `d[1]` + keyword, not like `d[1,]` + keyword. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Mon, Aug 31, 2020, at 00:28, Guido van Rossum wrote:
Perhaps, but I did have a thought after making that post. A new bytecode operation (we'll need one anyway, right?) which, in addition to passing in the positionals and the keywords, also passes along the information of whether or not the subscript contents consisted of precisely a single expression without a trailing comma (or with one, if that'd work better... I believe flagging either one of these cases provides enough information to determine what form of argument should be passed to the single-argument __getitem__).

On Sun, Aug 30, 2020 at 21:58 Random832 <random832@fastmail.com> wrote:
> A new bytecode operation (we'll need one anyway, right?)

Yes.

> which, in addition to passing in the positionals and the keywords, also passes along the information of whether or not the subscript contents consisted of precisely a single expression

Sure, but that flag would have to be passed through all the C API layers until it reaches the tp slots. In particular it would uglify PyObject_[GS]etItemEx.

--
--Guido (mobile)

On Sun, Aug 30, 2020 at 10:01 PM Random832 <random832@fastmail.com> wrote:
I *think* the trailing comma is shorthand for a larger class of problems. That is, in the current system, you can only put a single expression in the [], so a comma creates a tuple. Which means that:

    i = (a,)
    thing[i] = x

is the same as:

    thing[a,] = x

and

    i = (a, b, c)
    thing[i] = x

is the same as:

    thing[a, b, c] = x

etc.

And I don't think, when the interpreter sees a comma inside a [], there is any way to know whether the user explicitly wanted a tuple, or wanted to express multiple indexes. I suppose we could assume that a single comma was not intended to be a tuple, but maybe it was?

The real challenge here is that I suspect most users don't realize that when you do, e.g. arr[i, j] in numpy, you are passing a tuple of two indexes, rather than two separate values. Conversely, when you do, e.g. a_dict[(a, b, c)], you could have omitted the brackets and gotten the same result. But in current Python, it doesn't matter that folks don't realize that, as you can in fact only put one expression in there, and the tuple creation ends up being an implementation detail for most users.

But once we add keyword arguments, then it will look a LOT more like a function call, and then folks will expect it to behave like a function call, where thing(a, b) and thing((a, b)) are not, in fact, the same -- but I don't think there is any way to meet that expectation without breaking existing code.

However, if we keep it like it is, only adding keywords, but not changing anything about how the "positional" arguments are handled, then we have backward compatibility, and naive users will only notice (maybe) that you can't do *args. Which is not that big a deal -- all you have to say, really, is that tuple unpacking isn't allowed in indexing. Done. Anyone who wants to know *why* not can do more research.

Key point: as a rule, classes that handle keyword indexes (e.g. xarray and the like) will be written by a small number of people, and used by many more -- and the authors of such classes are by definition more experienced in general. So the priority has to be to keep things as simple as possible for users of such classes.

-CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
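The tuplification behaviour described above is easy to verify in current Python with a small probe class (`ShowKey` is just an illustrative name):

```python
class ShowKey:
    """Probe that returns whatever object the subscript passes in."""
    def __getitem__(self, key):
        return key

s = ShowKey()
assert s[1] == 1
assert s[1,] == (1,)        # trailing comma makes a tuple
assert s[(1,)] == (1,)      # identical: the parens add nothing
assert s[1, 2] == (1, 2)
i = (1, 2)
assert s[i] == (1, 2)       # a tuple variable is indistinguishable
assert s[::2] == slice(None, None, 2)  # slices allowed at top level
```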

On Sun, Aug 30, 2020 at 11:33 PM Christopher Barker <pythonchb@gmail.com> wrote:
It's not *quite* so simple (though almost so). The parser still treats it specially, because the slice notation `a:b`, `a:b:c` (and degenerate forms like `:` or `::`) are only allowed at the top level. That is, `d[::]` is syntactically valid, but `d[(::)]` is not. Try it.
In the current system nobody has ever had to think about that. A single comma most definitely makes a tuple.
These are all mindgames. What your intention is depends on what kind of data structure it is (e.g. dicts have a one-dimensional key set, but keys may be tuples, whereas numpy arrays have any number of dimensions, but each dimension is numeric). How what you write is interpreted is entirely up to the `__getitem__` and `__setitem__` implementation of the data structure you are using. Those methods must be written to handle edge cases according to their intended model (I'm sure there's lots of code in e.g. numpy and Pandas to handle the special case where the "key" argument is not a tuple.) The protocol is what it is and you can use it in different ways, as long as you follow the protocol.
(Technically not so, because of the slices. Note that dict doesn't even check for slices -- they fail because they're unhashable. But you can write `d[...] = 1` and now you have `Ellipsis` as a key.)

> But once we add keyword arguments, then it will look a LOT more like a function call
I think it's up to the implementers of extended `__getitem__` and `__setitem__` methods to set the right expectations **for their specific data type**.
Right.
Which leads to what design choices? I think we need to watch out that we're not trying to make `a[1, 2, k=3]` look like a function call with funny brackets. It is a subscript operation with keyword parameters. But it is still first and foremost a subscript operation. And yeah, backwards compatibility is a b****.

--
--Guido van Rossum (python.org/~guido)
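The dict behaviour mentioned in this exchange -- slices failing only because they are unhashable, while `Ellipsis` is a perfectly good key -- can be checked directly in current Python:

```python
# dict performs no special slice check: slices fail only because they
# are unhashable, whereas Ellipsis is hashable and works as a key.
d = {}
d[...] = 1
assert ... in d and d[...] == 1

try:
    d[:] = 1                      # i.e. d[slice(None)] = 1
except TypeError as e:
    assert 'unhashable' in str(e)
```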

On Mon, Aug 31, 2020 at 11:38 AM Guido van Rossum <guido@python.org> wrote:
Omg. This is a huge problem that I didn't even consider when I wrote my previous reply asking for naked, comma-separated subscript arguments to be disallowed... thanks GvR, for surfacing that one! Yick. It looks like we'd have to snowball many more syntax changes to disallow that. I think I still find myself in the camp of asking for either 3 new dunders or a single new dunder. But it doesn't appear that is the direction things are going.

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 31/08/20 3:35 pm, Random832 wrote:
> x[(1,)]    old arg=(1,); new args=((1,),)?
> x[(1,2)]   old arg=(1,2); new args=((1,2),)?
No, I proposed *not* to do those -- putting parens around the arguments would continue to make no difference, regardless of which dunder was being called. parallel would be weird, unprecedented and probably not necessary.
> Another concern is where a new setitem should put the value argument.
My solution is to put it before the index arguments, e.g.

    def __setindex__(self, value, i, j, k):
        ...
If you're worried about people doing things like

    a[1, 2, 3, value=4] = 5

I'm not sure that's really a problem -- usually it will result in an exception due to specifying more than one value for a parameter. If you're creating a class that needs to be able to take arbitrary keyword indexes, you can use a positional-only parameter for the value:

    def __setindex__(self, value, /, *args, **kwds):
        ...

--
Greg
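A sketch of how a value-first `__setindex__` with a positional-only value might behave. The `Grid` class, its storage scheme, and the act of calling the dunder directly are all assumptions -- no interpreter support for this protocol exists yet:

```python
class Grid:
    """Toy container using a value-first __setindex__ sketch."""
    def __init__(self):
        self._cells = {}

    def __setindex__(self, value, /, *args, **kwds):
        # value is positional-only, so even a keyword index literally
        # named 'value' cannot collide with it
        self._cells[(args, tuple(sorted(kwds.items())))] = value

g = Grid()
# Under the proposal, a[1, 2, 3, value=4] = 5 would become:
g.__setindex__(5, 1, 2, 3, value=4)
```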

On Mon, Aug 31, 2020, at 02:45, Greg Ewing wrote:
What about passing in a tuple object that's in a variable?

    a = 1, 2
    x[a]

Should args be ((1,2),) or (1, 2)? Having x[a] be different from x[(1,2)] would be *bizarre*, but having it result in args=(1,2) would be keeping almost as much baggage from the current paradigm as not having a new dunder at all. I think that's why I assumed as a matter of course that a new dunder meant a tuple argument would unambiguously become a single argument.

On Sat, 29 Aug 2020 at 15:12, Ricky Teachey <ricky@teachey.org> wrote:
But you don't give any reason why you'd want to do that. Why are you using subscript notation rather than a simple function call? This is just another variation on the "it would be nice if..." argument, which has been covered over and over again. What is the use case here, and why is the proposed solution more compelling than anything that can already be done in Python? Not simply compelling in the sense of "it looks nice", but how does subscript notation map to the problem domain, and how would all the variations you say would "just work" be meaningful in terms of the real-world problem? The question isn't whether named arguments are useful - function calls demonstrate that perfectly well. The question is why are they needed *for subscripting*, and what do they *mean* when used in applications where subscripts are the natural way to express the problem logic? Paul

Paul Moore wrote:
> But you don't give any reason why you'd want to do that. Why are you
> using subscript notation rather than a simple function call?
Good point. Consider

    >>> def f(*argv): pass
    >>> d = dict()

Now compare

    >>> f(1, 2) = 3
    SyntaxError: can't assign to function call

with

    >>> d[1, 2] = 3
    >>> d[1, 2]
    3

Item assignment (ie __setitem__) is the one thing that a function call can't do. If we want keywords in our __getitem__ and related commands, then one route for item assignment is to allow

    >>> d[1, 2, a=3, b=4] = 5

as valid syntax. By the way, another route is to use a simple function call, like so

    >>> d[o(1, 2, a=3, b=4)] = 5

which is already possible today. Some of us don't like this route.

--
Jonathan
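The key-object route is indeed possible today. A minimal sketch follows; the class name `K` and its equality/hash scheme are assumptions (the kwkey project on PyPI provides a fuller version):

```python
class K:
    """A hashable key bundling positional and keyword parts."""
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def _key(self):
        return (self.args, tuple(sorted(self.kwargs.items())))

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return isinstance(other, K) and self._key() == other._key()

d = {}
d[K(1, 2, a=3, b=4)] = 5
assert d[K(1, 2, a=3, b=4)] == 5
```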

On Sat, 29 Aug 2020 at 18:04, Jonathan Fine <jfine2358@gmail.com> wrote:
Obviously. As it says, you can't assign to a function call.
Again, obvious. But you still haven't given any reason why we would want to do that. No-one's arguing that these things aren't possible, or that the proposals can't be implemented. What I'm asking, and you aren't answering, is what is the use case? When, in real world code, would this be used? Also, by being this abstract, you've ended up arguing without any context. If I say "yes, d[1, 2, a=3, b=4] = 5" might be useful, you've not got anywhere, because all of the proposals being discussed are about allowing this, so even if you *do* get agreement on this point, you're barely moving the discussion forward at all.
Exactly. Pick a proposed solution and find use cases and arguments for it. Don't argue for ideas that are so abstract that they gloss over all the details that differentiate between the various implementation options. Paul

Paul Moore wrote:
> Again, obvious. But you still haven't given any reason why we would want to do that.
The canonical answer to this question is https://www.python.org/dev/peps/pep-0472/#use-cases I'd like the examples there to be expanded and improved upon. To help support this I've created https://pypi.org/project/kwkey/, which emulates in today's Python the proposed enhancement. I'd like us to write code now that can use the enhancement, if and when it arrives. By the way, thank you Paul and your colleagues for your work on creating and developing pip. It's one of our most important tools. I'm all for having real-world use cases, and where possible working solutions, as a key part of the discussion. I admit that there has been a past deficiency here. I hope that we will correct this soon. For my part, I'm working on an enhanced kwkey, which will provide a wider range of emulation. -- Jonathan

On Sat, Aug 29, 2020, 2:24 PM Paul Moore <p.f.moore@gmail.com> wrote:
I'd like to use syntax like this for defining named mathematical functions:
Why? Because it looks like handwritten math. Far more pleasant for my colleagues to read than lambda expressions and def statements. I can already do most of this now of course (in fact, I have). But it'll be far far easier to write, read, and maintain the supporting code with function-like argument parsing in subscripts.

On Sat, Aug 29, 2020 at 07:22:52PM +0100, Paul Moore wrote:
See the PEP: https://www.python.org/dev/peps/pep-0472/ although I think we can eliminate a couple of the use-cases from contention, such as changes to built-in dicts and lists.

For example, consider the case where we might have named axes (I think it is Todd who wants this for xarray). We want to be able to use the same notation for getters, setters and deleters. I don't know the axis names xarray uses, so I'm going to make them up. If you can use subscripting to get an axis:

    obj[7:87:3, axis='widdershins']

then you ought to be able to use subscripting to set or delete an axis:

    obj[15::9, axis='hubwise']
    del obj[2:5, axis='turnwise']

(Users of xarray may be able to suggest some more realistic examples.) I expect pandas could use this too.

I would love to be able to do this in a matrix:

    # standard element access notation
    matrix[2, 3] = 2.5

    # delete an entire column
    del matrix[column=4]

    # and replace an entire row
    matrix[row=3] = [1.5, 2.5, 3.5, 4.5, 6.5]

Of course we could use named getter, setter and deleter methods for this, but subscript syntax is a succinct notation which comes very close to the standard notation used in the relevant domains.

The bottom line here is that if your operations come in sets of three, for getting, setting and deleting a specific sub-element (a row, a column, item, key, etc.) then subscript syntax is a more natural notation than separate getter/setter/deleter methods. Even if your object is immutable and so only the getter is defined, it's still more natural to use subscript notation.

--
Steve
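The matrix example can be emulated today with named methods -- exactly the less succinct alternative being argued against here. The `Matrix` class below, and its method names, are illustrative assumptions:

```python
class Matrix:
    """Toy row-major matrix emulating keyword subscripts with methods."""
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    # matrix[row=3] = [...] would call something like:
    def set_row(self, row, values):
        self.rows[row] = list(values)

    # del matrix[column=4] would call something like:
    def del_column(self, column):
        for r in self.rows:
            del r[column]

m = Matrix([[1, 2], [3, 4]])
m.set_row(0, [9, 9])   # replace an entire row
m.del_column(1)        # delete an entire column
assert m.rows == [[9], [3]]
```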

On Sat, Aug 29, 2020, at 10:12, Ricky Teachey wrote:
[side note: I don't believe the behavior suggested by this last "if y is MISSING:" clause is supported by your proposal either. It's certainly not supported by function calls. Are you suggesting q[x=1, 2], or q[2, x=1], to be equivalent to y=2?]

    def _getitem(self, x=..., y=...): ...

    def __getitem__(self, arg, /, **kwargs):
        if isinstance(arg, tuple):
            return self._getitem(*arg, **kwargs)
        else:
            return self._getitem(arg, **kwargs)

or, as a more compact if slightly less efficient version

    def __getitem__(self, arg, /, **kwargs):
        return self._getitem(*(arg if isinstance(arg, tuple) else (arg,)), **kwargs)

This is only a few lines [which would have to be duplicated for set/del, but still]; you're free to implement it on your own if necessary for your use case. My code assumes that no positional arguments results in an empty tuple being passed - it would be slightly more complex but still manageable if something else is used instead.
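This adapter pattern runs unmodified in current Python, apart from choosing concrete parameter names (here `x`/`y` with `None` defaults, both assumptions). Keyword subscripts still require calling the dunder directly:

```python
class Point:
    def _getitem(self, x=None, y=None):
        return ('x', x, 'y', y)

    def __getitem__(self, arg, /, **kwargs):
        # unpack a tuple subscript into named index parameters
        if isinstance(arg, tuple):
            return self._getitem(*arg, **kwargs)
        return self._getitem(arg, **kwargs)

p = Point()
assert p[1] == ('x', 1, 'y', None)
assert p[1, 2] == ('x', 1, 'y', 2)
# keywords can't yet be spelled p[1, y=2], but the dunder accepts them:
assert p.__getitem__(1, y=2) == ('x', 1, 'y', 2)
```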

On Sat, Aug 29, 2020 at 12:50:15AM -0400, Ricky Teachey wrote:
> Also there are so many different proposals being thrown out at the same
> time, by different people, that it is difficult to keep track of what is
> what.
Coders often say that a line of code says more than ten lines of explanation, and often that's true, but sometimes, especially when dealing with a new proposal for functionality that doesn't actually exist yet, a line of explanation is better than a hundred lines of code :-) [...]
[... snip Greg's suggestion ...]
But the problem is that this is a change in behaviour, which is not backwards compatible and will break old code. You might be happy with writing `a[(1,)]` to get a tuple but there is almost certainly a ton of existing code that is already writing `a[1,]` to get a tuple, and I don't think you are volunteering to go out and fix it all :-)

So we have some choices:

(1) No breaking old code at all. In that case your proposal is dead in the water, or we need a complicated (maybe fragile?) technical solution that fixes it, as suggested by Greg. (Assuming it is practical.) The more complicated the fix, the less attractive the proposal becomes.

(2) We're okay with breaking code but need a future directive. In which case, why bother with this signature dunder? It's no longer necessary.

(3) It's only a *little* breakage, so just go ahead and make the change and don't worry about breaking old classes. I would expect that the Steering Council would only agree if this little breakage was balanced by a *large* benefit, and I'm not seeing the large benefit given that we've heard from a numpy developer that the status quo re positional arguments is no burden, and I don't think anyone from the xarray or pandas projects has disagreed.

So it seems to me that this "change the parsing of commas" is being driven by people who aren't actually affected by the change, and the people who are affected are saying "we don't need this". (Which is very different from the original part of the proposal, namely adding *keywords*.)
Yes, you're right, sorry.
So this preprocessor is a do-nothing identity function that does nothing and doesn't actually get called by anything or even exist. I don't think it is a helpful argument to invent non-existent preprocessors that do nothing. [...]
So if I currently do this:

    my_tuple = (1,)
    obj[my_tuple]

your updated preprocessor will pass just 1, not the tuple, on to the item getter. That's not the current behaviour.
If we decide that this change in behaviour is required, we define a new future directive that simply changes the way subscripts are parsed. People who want to use the new behaviour include that future directive at the top of their module to get the new behaviour. Those who don't, don't, and nothing changes. Eventually the change in behaviour becomes standard, but that could be put off for two or four releases. https://docs.python.org/3/reference/simple_stmts.html#future https://docs.python.org/3/library/__future__.html -- Steve

On 29/08/20 2:07 pm, Steven D'Aprano wrote:
I don't think that would help, at least not on its own. The style of subscript argument passing required depends on the object being indexed, not the module it's being done from. There could be a future import that just makes a[1,] mean a[1] instead of a[(1,)], but how useful would that really be? How often have you wanted to put a trailing comma on your indexes and have it do nothing? -- Greg

Another idea: Instead of new dunders, have a class attribute that flags the existing ones as taking positional and keyword arguments.

    class ILikeMyIndexesPositional:
        __newstyleindexing__ = True
        def __getitem__(self, i, j, spam=42):
            ...

Advantages: No new dunders or type slots required.

Disadvantages: Code that expects to be able to delegate to another object's item dunder methods could break. There are two ways that existing code might perform such delegation:

    def __getitem__(self, index):
        return other_object[index]

    def __getitem__(self, index):
        return other_object.__getitem__(index)

Neither of these will work if index is a tuple and other_object takes new-style indexes, because it will get passed as a single argument instead of being unpacked into positional arguments. The only reliable way to perform such delegation would be

    __newstyleindexing__ = True
    def __getitem__(self, *args, **kwds):
        return other_object[*args, **kwds]

but that requires the delegating object to be aware of new-style indexing. So having thought it through, I'm actually anti-proposing this solution.

--
Greg
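For comparison, here is how a system-wide dispatcher might consult such a flag. The names `internal_get` and `__newstyleindexing__` follow this thread; everything else in the sketch is an assumption:

```python
def internal_get(obj, *args, **kwargs):
    """Sketch of subscript dispatch honouring __newstyleindexing__."""
    cls = type(obj)
    if getattr(cls, '__newstyleindexing__', False):
        # new style: pass positionals and keywords straight through
        return cls.__getitem__(obj, *args, **kwargs)
    # old style: collapse positionals into a single key, reject keywords
    if kwargs:
        raise TypeError('this object takes no keyword subscripts')
    key = args[0] if len(args) == 1 else args
    return cls.__getitem__(obj, key)

class Old:
    def __getitem__(self, key):
        return key

class New:
    __newstyleindexing__ = True
    def __getitem__(self, *args, **kwargs):
        return args, kwargs

assert internal_get(Old(), 1, 2) == (1, 2)          # single tuple key
assert internal_get(New(), 1, 2, k=3) == ((1, 2), {'k': 3})
```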
participants (17)
- Adam Johnson
- Alexandre Brault
- Antoine Pitrou
- Chris Angelico
- Christopher Barker
- David Mertz
- Eric V. Smith
- Greg Ewing
- Guido van Rossum
- Jonathan Fine
- Joseph Martinot-Lagarde
- Paul Moore
- Random832
- Ricky Teachey
- Stefano Borini
- Steven D'Aprano
- Todd