On Thu, Aug 20, 2020 at 5:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Aug 20, 2020 at 10:41:42AM -0700, Christopher Barker wrote:
> Current indexing behavior is an oddball now:
 
> The signature of __getitem__ is always:
>
> def __getitem__(self, index):

Mostly correct -- you can still declare the method with additional
parameters, so long as they have default values. Calling the method with
subscript syntax will always bind the entire subscript to the first
parameter, but if you call the dunder method directly you can do
anything you like. It's just a method.

This is likely to be a rare and unusual case, but we don't want to break
anyone who already has a dunder something like this:

    def __getitem__(self, index, extra=None):
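
To make the current behavior concrete, here is a minimal sketch. The class name `Demo` and the `extra` parameter are made up for illustration; nothing here comes from a real library:

```python
class Demo:
    # Hypothetical class: __getitem__ declared with an extra defaulted parameter.
    def __getitem__(self, index, extra=None):
        return (index, extra)


d = Demo()

# Subscript syntax binds the entire subscript to the first parameter.
single = d[1]        # index=1, extra stays at its default
multi = d[1, 2]      # index=(1, 2): the whole subscript arrives as one tuple

# Calling the dunder directly is an ordinary method call.
direct = d.__getitem__(1, extra="x")

print(single, multi, direct)
```

Only the direct call can ever reach `extra` today, which is the case that "regular" keyword passing could collide with.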

Is that an issue we care about? As I think about it, the dunders are special: they are reserved by Python to support particular things, and in this case the "official" use will never pass anything other than the one argument. So isn't it OK to break code that might be "abusing" the __getitem__ dunder for some other use?

Honestly, folks could have code that used any dunder in any way -- but I think it's OK to break that code.

xarray is an example: it *could* have extended __getitem__ in a similar way, but it didn't, because that really would have been a "bad idea".

But in any case the keywords_index object approach would be less likely to break this code than "regular" keywords would be.

> So: if we want to maintain backward compatibility, we *can't* use the
> regular argument passing approach, it will have to be a slightly odd
> special case.

It's already a slightly odd special case, the question is whether we
want to extend that oddness to the keyword arguments as well as the
positional arguments?

Yes, that was my point :-) The question is which new odd special case we go with: one that's closer to the current one, or one that's closer to "regular" argument passing.
 
> def __getitem__(self, index):
>     if isinstance(index, tuple):
>         handle_the_tuple_of_indices(index)
>     else:
>         handle_a_single_index(index)
>
> would still work as it does now.

I don't think it would, because that "single index" may turn into an
unexpected keywords_object, as you describe here:

Yes, but existing code doesn't accept any old thing as a single index. Each class has particular things it accepts. For example:

Sequences accept something with __index__ or a slice object. So if one got a keywords_index object it would raise a TypeError, just like it does when you pass in any number of other types.
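
A quick sketch of the sequence case, using a plain list. `KeywordsIndex` is a hypothetical stand-in for the proposed object (no such type exists), and `try_index` is just a helper for this example:

```python
class KeywordsIndex:
    # Hypothetical stand-in for the proposed keywords_index object.
    # It deliberately defines neither __index__ nor anything slice-like.
    def __init__(self, **kwargs):
        self.kwargs = kwargs


def try_index(seq, index):
    """Return seq[index], or the name of the exception the lookup raised."""
    try:
        return seq[index]
    except Exception as exc:
        return type(exc).__name__


items = [10, 20, 30]

by_int = try_index(items, 1)                     # a normal integer index
by_slice = try_index(items, slice(0, 2))         # a slice object
by_str = try_index(items, "a")                   # an unsupported type today
by_kw = try_index(items, KeywordsIndex(a=1))     # the hypothetical new object

print(by_int, by_slice, by_str, by_kw)
```

Under these assumptions the new object is rejected the same way a string index is, so a built-in sequence would not silently misbehave.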

Mappings are more of a challenge, 'cause they accept any hashable type -- so if the keywords_index object were hashable, then it would just work -- which is what Jonathan wants, but I'm not sure that's a great idea. If it weren't hashable, then nothing would change for Mapping either.
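
Here is a rough sketch of the two mapping outcomes. Both classes are hypothetical; how a real keywords_index object would hash (if at all) is exactly the open design question:

```python
class HashableKeywords:
    # Hypothetical: hashes and compares by its keyword contents.
    # Assumes every keyword value is itself hashable.
    def __init__(self, **kwargs):
        self._items = frozenset(kwargs.items())

    def __eq__(self, other):
        if not isinstance(other, HashableKeywords):
            return NotImplemented
        return self._items == other._items

    def __hash__(self):
        return hash(self._items)


class UnhashableKeywords:
    # Hypothetical: explicitly opts out of hashing.
    __hash__ = None

    def __init__(self, **kwargs):
        self.kwargs = kwargs


d = {}

# Hashable variant: a plain dict accepts it as a key with no changes.
d[HashableKeywords(a=1, b=2)] = "stored"
found = d[HashableKeywords(b=2, a=1)]   # keyword order does not matter here

# Unhashable variant: the dict refuses it, as it does for a list key.
try:
    d[UnhashableKeywords(a=1)]
    unhashable_result = "no error"
except TypeError:
    unhashable_result = "TypeError"

print(found, unhashable_result)
```

So for dict itself, whether anything changes comes down to that one choice about hashability.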

Of course, arbitrary objects can have arbitrary handling of the single index -- but at least most would expect to get a particular type or types, and would hopefully raise on some new object that has never existed before. With dynamic duck typing, it's possible that it would work, but do the wrong thing, but that's got to be rare.

But if anyone has an example of already existing code that would be broken by this, I'd love to see it.

Also: I think that type hints use the [] operator as well -- though I don't know anything about them -- but do the semantics of [] for type hints need to be the same as for indexing?

But if you *don't* add that clause, your `handle_a_single_index` will
start to receive keyword_objects instead of whatever you expect. That
may lead to an easily diagnosed exception, but it may not.

Sure -- though I would like to see an actual example of where it wouldn't be a TypeError (from otherwise robust code).

> But for libraries that expect a single index to be key or int or slice,
say, Jonathan's proposal will mean they will receive unexpected
keyword_objects as well, and we don't know how they will be handled.

We can already pass arbitrary objects in as an index, so any robust code would already handle that well. Unless it was designed to handle an object that happened to duck-type to the new keywords_index object -- which seems very unlikely to me.

> Jonathan's proposal would be grand if we all agreed that the primary
use-case for this feature is "I want to use a collection of name:value
pairs as a key in a dict":

I agree that that is not the primary use case for this feature, and that that is really a separate issue anyway. I think an ImmutableMapping would be nice to have, but as you say, you can write one yourself, and it's apparently not useful enough to be a commonly used thing.

Which is why I'm making the case for this approach completely separately from how dicts might work.

I'm kind of thinking out loud here; I'm still not sure which approach I prefer.

-CHB


--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython