On Fri, Aug 28, 2020 at 10:10 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Aug 27, 2020 at 09:57:26AM -0400, Ricky Teachey wrote:

> Sorry, I need to stop coding in shorthand.

That might help.

What might help even more is if you spend less time showing imaginary,
and invariably buggy, examples and more time explaining in words the
intended semantics of this, and the reason why you want those semantics.

I'm sorry you had to read over my email so many times. I'm sorry I have confused things by throwing out more than one idea at a time. I am sorry I am so bad at explaining things. Hopefully now it is clear, regardless of how we got there. It sounds like it is. 

I was really trying to explain the semantics multiple times, but as I look over my messages your criticism is correct: I was throwing out too many detailed examples rather than focusing on the idea. For your benefit, since my explanations weren't sufficient, I wrote a bunch of admittedly junky code, which is sometimes easier to understand (even if it is buggy) than English, in an attempt to showcase the idea more clearly.

So let me see if I have this. You want to add a special dunder method
which, if it exists, is automatically called by the interpreter to
preprocess the subscript before passing it to the usual get-, set- and
del-item dunders.

Yes, that's the basic idea as I envision it and as Jonathan Fine wrote in the first message in this thread.
 
...

* interpreter passes arguments to the subscript dunder
* which preprocesses them and returns them
* and the interpreter then passes them to the appropriate dunder.

I'm underwhelmed.

I think a new dunder is a good idea. I've explained why a couple times but I can try again if you'd like. On the other hand, we've established I'm bad at explaining things so maybe not a great idea.

I can point you to this comment from Greg Ewing in the other thread where I first brought up the new dunders (3 new dunders, in that case) idea, maybe it will be better than I can do (however he's talking about 3 dunders-- still hoping he and others might come around to the idea of just one):

https://mail.python.org/archives/list/python-ideas@python.org/message/NIJAZKTHPBP6Q3462V3DBUWDTQ5NU7QU/
 
I *think* your intention here is to handle the transition from the
status quo to full function-like parameters in subscripts in a backwards
compatible way, but that's not going to work.

I'm not so sure that's fully true. There are certainly problems that need to be worked out.
 
The status quo is that the subscript is passed as either a single value,
or a tuple, not multiple arguments. If that *parsing rule* remains in
place, then these two calls are indistinguishable:

    obj[spam, eggs]
    obj[(spam, eggs)]

and your subscript dunder will only receive a single argument because
that's what the parser sees. So you need to change the parser rule.
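The status quo described above can be checked in current Python. This is a minimal sketch; the `Echo` class is a hypothetical helper introduced only for illustration, and it simply returns whatever key the interpreter passes in.

```python
class Echo:
    """Hypothetical helper: returns the key exactly as __getitem__ receives it."""

    def __getitem__(self, key):
        return key


obj = Echo()
spam, eggs = "spam", "eggs"

bare = obj[spam, eggs]
parenthesized = obj[(spam, eggs)]

# Both spellings reach __getitem__ as one and the same tuple argument.
print(bare == parenthesized)
print(type(bare).__name__)
```

Because `__getitem__` only ever sees one object, nothing inside the method can tell which spelling was used.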

Yes I've made this observation myself in a couple different replies and I agree it's a problem. Greg Ewing (again!) had a helpful comment about it, perhaps he is correct:

https://mail.python.org/archives/list/python-ideas@python.org/message/XWE73VLLGYWYLNFMRKZXIBILBOLAI6Z3/

"We could probably cope with that by generating different bytecode when there is a single argument with a trailing comma, so that a runtime decision can be made as to whether to tupleify it.

However, I'm not sure whether it's necessary to go that far. The important thing isn't to make the indexing syntax exactly match function call syntax, it's to pass multiple indexes as positional arguments to __getindex__. So I'd be fine with having to write a[(1,)] to get a one-element tuple in both the old and new cases.

It might actually be better that way, because having trailing commas mean different things depending on the type of object being indexed could be quite confusing."

Your subscript preprocessor would allow the coder to stick spam and eggs
into a tuple and pass it on, but it also returns a dict so the getitem
dunder still needs to be re-written to accept `**kwargs` and check that
it's empty, so you're adding more, not less, work.

No, that's not right. The kwargs mapping included in the return by the preprocessor gets unpacked in the item dunder call. An unpacked empty dict in an existing item dunder (without kwargs support) creates no error at all.

Yes, if the kwargs mapping contains argument names not supported by the signature of the item dunders, we will get an error. But that's true with any function call. So, you know, don't do that.
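Both claims can be tried with ordinary calls today. The sketch below assumes a hypothetical `Legacy` class whose `__getitem__` has no keyword support, and calls the dunder directly the way the proposal would have the interpreter call it.

```python
class Legacy:
    """Hypothetical class: an item getter with no keyword support."""

    def __getitem__(self, key):
        return key


obj = Legacy()
item_args = (("spam", "eggs"),)

# Unpacking an empty mapping adds no arguments, so the old signature still works.
result = obj.__getitem__(*item_args, **{})

# A name the signature does not accept fails like any other function call.
try:
    obj.__getitem__(*item_args, **{"units": "cm"})
except TypeError as exc:
    error = exc
else:
    error = None

print(result)
print(type(error).__name__)
```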

And besides, if I
have to add a brand new dunder method to my class in order for my item
getter to not break, it's not really backwards-compatible.

How is it going to be broken? Unless you add the new dunder to your class, it won't be operative. The implicitly existing internal python function that does the job of this proposed dunder acts as the preprocessor instead of the dunder.

Okay, let's make the interpreter smarter: it parses spam, eggs as two
arguments, and then sees that there is no subscript dunder, so it drops
back to "legacy mode", and assembles spam and eggs into a tuple before
passing it on to the item getter.

That's one way to think about it. Another way to think about it: there is an existing preprocessing function in CPython today. That function is not currently exposed for external use; it gets passed (spam, eggs) as a tuple and silently sends that tuple on to the item dunders:

def cpython_subscript_preprocessor(key_or_index): ...

We modify that existing function signature so that it is passed positional arguments, something like this:

def cpython_subscript_preprocessor(*args): ...

To achieve the behavior we have today, that updated function checks to see if len(args) == 1, and if it is, it returns args[0]. Otherwise, it returns just args (a tuple). I believe that should replicate the behavior we have now.

This is intended as a semantic description, just using code as an aid in explaining the idea.

Only that's not really backwards compatible either, because the
interpreter can't distinguish the two cases:

    # single argument with trailing comma
    obj[spam,]

    # no trailing comma
    obj[spam]

As I explained above, that can be handled just fine by replicating the existing CPython preprocessor with the correct signature, and parsing it consistently with current Python behavior. CPython pseudocode to explain:

def cpython_subscript_preprocessor(*args):
    # A single positional argument is unwrapped; anything else stays a tuple.
    try:
        args, = args
    except (ValueError, TypeError):
        pass

Then return args in whatever way the API becomes defined.

My idea at the moment is to return it as a two tuple that looks like this:

(item_dunder_args, item_dunder_kwargs)

...and that two-tuple gets unpacked in the item dunders this way:

obj.__getitem__(*item_dunder_args, **item_dunder_kwargs)

For the current default internal Python preprocessing function, in the case of spam, eggs it would return:

(((spam, eggs),), {})

...and that two-tuple gets unpacked in the item dunders this way:

obj.__getitem__(*((spam, eggs),), **{})
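To make the two-tuple protocol concrete, here is a rough emulation in today's Python. Everything in it is an assumption made for illustration: the dunder name `__subscript__`, the helper `subscript_get` (standing in for what the interpreter would do for `obj[...]`), `default_preprocessor`, and the `Legacy` and `Grid` classes. It is a sketch of the idea, not a proposed implementation.

```python
def default_preprocessor(*args, **kwargs):
    """Hypothetical stand-in for the interpreter's built-in behaviour."""
    # One positional argument passes through; several are packed into a tuple.
    key = args[0] if len(args) == 1 else args
    return (key,), kwargs


def subscript_get(obj, *args, **kwargs):
    """Emulate obj[...] under the proposal (helper name is an assumption)."""
    preprocess = getattr(type(obj), "__subscript__", None)
    if preprocess is None:
        # No new dunder on the class: fall back to the legacy packing.
        item_args, item_kwargs = default_preprocessor(*args, **kwargs)
    else:
        item_args, item_kwargs = preprocess(obj, *args, **kwargs)
    return obj.__getitem__(*item_args, **item_kwargs)


class Legacy:
    # Existing class, untouched: still receives a single key.
    def __getitem__(self, key):
        return key


class Grid:
    # Opts in: positional indexes and keywords are forwarded unchanged.
    def __subscript__(self, *args, **kwargs):
        return args, kwargs

    def __getitem__(self, row, col, *, units=None):
        return row, col, units


legacy_result = subscript_get(Legacy(), "spam", "eggs")
grid_result = subscript_get(Grid(), 1, 2, units="cm")

print(legacy_result)
print(grid_result)
```

In this sketch the class without the new dunder keeps receiving one tuple, while the class that defines it receives separate positional and keyword arguments.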

In both cases this looks like a single argument to the interpreter, but
the status quo is that they are different. The first one needs to be a
tuple of one item.
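The difference between the two spellings is easy to confirm in current Python. A minimal sketch, again using a hypothetical `Echo` class that returns the key it was given:

```python
class Echo:
    """Hypothetical helper: returns the key exactly as __getitem__ receives it."""

    def __getitem__(self, key):
        return key


obj = Echo()
spam = "spam"

with_comma = obj[spam,]     # trailing comma: a one-element tuple
without_comma = obj[spam]   # no comma: the bare object

print(repr(with_comma))
print(repr(without_comma))
```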

Why add all this complexity only to fail to remain backwards-compatible?

It doesn't fail, as I explained.
 
Better would be to add a new future directive to change the parsing of
subscripts, and allow people to opt-in when they are ready on a
per-module basis.

    from __future__ import subscript_arguments

This might be an even better idea. Are you proposing it? But it would certainly break a lot of code to eventually make that change, so I'm unsure I would support it... maybe I could be talked into it, I don't know.

A new dunder seems far more friendly to existing code.

This sort of change in behaviour is exactly why the future mechanism was
invented. If it is desirable to change subscripting to pass multiple
positional arguments, then we should use that, not complicated jerry-
rigged "Do What I Mean" cunning plans that fail to Do What I Meant.

Cool. I'm interested.

Notice that none of the above needs to refer to keyword arguments. We
could leave keyword arguments out of your proposal, and the argument
parsing issue remains.



--
Steve

Understood, but if the intention of the entire proposal is to shift subscripting as much as possible to a function-calling paradigm, it would be very weird to leave out keyword arguments in the process.

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler