PEP 9999 (provisional): support for indexing with keyword arguments
The PEP for support for indexing with keyword arguments is now submitted as a PR: https://github.com/python/peps/pull/1612 Thanks to everybody involved in the development of the PEP and the interesting discussion. All your contributions have been received and often added to the document. If the PEP is approved, I would like to attempt an implementation, but I am not particularly skilled in the Python internals. If a core developer is willing to teach me the ropes (especially in the new parser: I barely understood the syntax of the old one, and have no idea about the new one) and review my code, I can give it a try. I would not mind refreshing my C a bit. -- Kind regards, Stefano Borini
Dear all, "Support for indexing with keyword arguments" has now been merged with the assigned PEP number 637. For future reference, this email will be added to the Post-History of the PEP.
Hmmm, getting a 404 at: https://www.python.org/dev/peps/pep-0637 Is this just a temporary condition or a bug? --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/25ZPXA... Code of Conduct: http://python.org/psf/codeofconduct/
On Thu, Sep 24, 2020 at 7:33 AM Ricky Teachey <ricky@teachey.org> wrote:
Hmmm, getting a 404 at:
https://www.python.org/dev/peps/pep-0637
Is this just a temporary condition or a bug?
Everything seems to be happy, so I'm going to guess that it's just taking a bit of time to propagate out. Give it a few minutes, maybe an hour tops, and see if it's live; if it still isn't, there might be an infrastructure issue somewhere. For now, you can read the text of the PEP here: https://github.com/python/peps/blob/master/pep-0637.txt ChrisA
It is working for me now.
I noticed a sentence that was not completed in PEP 637. Though I have made (pretty minor) contributions to CPython and some other things, it isn't entirely clear to me whether it would be appropriate for me to submit an issue or pull request for this, or what the general policy is. https://github.com/python/peps/blob/master/CONTRIBUTING.rst does not make it any clearer to me; I guess it is kinda the Wild West and depends on the PEP and what the change is? Would it be worth clarifying this? Secondly, I wasn't 100% sure whether this mistake was a rendering mistake or in the source. Would it be possible (or an idea) to include a link to the source, like the Python documentation does? [I am sure people will be curious what the mistake was:
a[3] # returns the fourth element of a
has the comment unfinished. I guess it should say list or something similar.]
On Thu, Sep 24, 2020 at 8:29 AM Henk-Jaap Wagenaar <wagenaarhenkjaap@gmail.com> wrote:
[I am sure people will be curious what the mistake was:
a[3] # returns the fourth element of a
has the comment unfinished. I guess it should say list or something similar.]
Actually it's returning the fourth element of "the thing in the variable named 'a'", so it's not the English article. If you feel that's unclear, you could propose a change that renames the variable, perhaps. Pull requests are absolutely appropriate for simple changes, copyediting, etc. You'll be asked to sign the licensing agreement, but for extremely tiny changes (say, just fixing a missed bit of punctuation), we can override the CLA bot and merge the change anyway. ChrisA
And here: https://github.com/python/peps/blob/master/pep-0637.txt is the source to look at -- and the repo to do a PR against. -CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 2020-09-23 at 23:28:11 +0100, Henk-Jaap Wagenaar <wagenaarhenkjaap@gmail.com> wrote:
a[3] # returns the fourth element of a
has the comment unfinished. I guess it should say list or something similar.]
Yes, I agree: it looks like it's broken, but it's okay. a[3] returns the fourth element of a sequence named "a." If the sequence were named something_descriptive, then something_descriptive[3] returns the fourth element of a sequence named "something_descriptive."
Where is the discussion on this PEP going to be? In this thread, or a new thread? Sorry for not having followed these closely enough to know. I'd like to point out that boost-histogram and xarray (at least) would love this as well, as we both (independently) came up with dict-in-index workarounds for missing keyword arguments. An argument for "why not a function call" could be added: function calls do not allow assignment, so `h[x=3] = ...` does not have a pretty functional replacement. Also, maybe a mention as to why simply making a new set of magic methods, `__get_item__(self, *args, **kwargs)` for example, is not a valid option? The nice thing about that is the current oddity that you can't tell `[(1,2)]` from `[1, 2]` could also be fixed, along with avoiding all the special cases mentioned (which are pretty safe). Then a class that defines the new magic methods would use those instead of the old one, similar to getslice/setslice/delslice?
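The dict-in-index workaround mentioned above can be sketched in today's Python. This is only an illustration: the `Histogram` class and its axis names are hypothetical and are not the boost-histogram or xarray API. It also shows the assignment case, which a plain function call cannot express.

```python
class Histogram:
    """Toy container addressed by named axes (hypothetical, for illustration)."""

    def __init__(self):
        self._bins = {}

    @staticmethod
    def _normalize(index):
        # A dict stands in for the keyword arguments that subscripts lack.
        if not isinstance(index, dict):
            raise TypeError("index must be a dict of axis name -> position")
        return tuple(sorted(index.items()))

    def __getitem__(self, index):
        return self._bins.get(self._normalize(index), 0)

    def __setitem__(self, index, value):
        self._bins[self._normalize(index)] = value


h = Histogram()
h[{"x": 3}] = 10          # works today; the proposal would spell it h[x=3] = 10
h[{"x": 3, "y": 1}] = 7
```

A call such as `h(x=3) = 10` is not valid Python, which is the point about function calls not allowing assignment.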
Not sure where further discussion will be, but absolutely look at the lengthy discussion already on this list, where your question has been much discussed. -CHB
On Thu, Sep 24, 2020 at 07:32:37PM -0000, henryfs@princeton.edu wrote:
Where is the discussion on this PEP going to be? In this thread, or a new thread?
I see no reason why we cannot continue discussion in this thread.
Sorry for not having followed these closely enough to know. I'd like to point out that boost-histogram and xarray (at least) would love this as well, as we both (independently) came up with dict-in-index workarounds for missing keyword arguments.
xarray should already be in the PEP. Do you have an example from boost-histogram, and do you speak for the developers or just as a user? Can you give an example of where boost-histogram might use this?
Also, maybe a mention as to why simply making a new set of magic methods, `__get_item__(self, *args, **kwargs)` for example, is not a valid option?
Having to add an additional three methods (get, set and delete) is a much more heavyweight change than proposed in the PEP, requiring more language changes, more changes to the interpreter, and some runtime costs. It will add significant confusion and code duplication between the old `__getitem__` and new `__get_item__` methods, especially if they are spelled so similarly as that. My recollection is that the PEP discusses this, but I might be conflating that with previous discussions on the mailing list. -- Steve
I feel like this paragraph in the PEP goes a little bit too far, but I understand its good intention.
The first difference is in meaning to the reader. A function call says "arbitrary function call potentially with side-effects". An indexing operation says "lookup", typically to point at a subset or specific sub-aspect of an entity (as in the case of typing notation). This fundamental difference means that, while we cannot prevent abuse, implementors should be aware that the introduction of keyword arguments to alter the behavior of the lookup may violate this intrinsic meaning.
Smuggling generic function calls into square brackets is undesirable (but obviously not preventable at a language level). However, I feel like keywords might very often "alter the behavior." For example, imagine we have a distributed array within an engine that is intended to be "eventually consistent." I would find code like this, in some hypothetical library, to be clear and useful, and not to violate the spirit of indexing.
snapshot1 = remote_array[300:310, 50:60, 30:35, source=worker1]
snapshot2 = remote_array[300:310, 50:60, 30:35, source=worker2]
if not (snapshot1 == snapshot2).all():
    print("Wait a bit for worker synchronization...")
-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
Is this a breaking change? It feels borderline.
Keyword-only subscripts are permitted. The positional index will be the empty tuple:
obj[spam=1, eggs=2] # calls type(obj).__getitem__(obj, (), spam=1, eggs=2)
I.e. consider:
>>> d = dict()
>>> d[()] = "foo"
>>> d
{(): 'foo'}
I don't really object to this fact, and one could argue it's not a breaking change since a built-in dict will simply raise an exception with keyword arguments. However, it does make the empty tuple the "default key" for new objects that will accept keyword indices.
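A minimal sketch of the concern, under stated assumptions. Since `obj[spam=1, eggs=2]` is a SyntaxError in current Python, the dunder methods are called directly in the form quoted from the PEP draft. The `Grid` class is hypothetical, and the setter signature (keywords after the value) is an assumption for illustration.

```python
class Grid:
    """Hypothetical mapping-like class that accepts keyword indices."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, index, value, **kwargs):
        self._data[(index, tuple(sorted(kwargs.items())))] = value

    def __getitem__(self, index, **kwargs):
        return self._data[(index, tuple(sorted(kwargs.items())))]


g = Grid()

# Real syntax today: the empty tuple is an ordinary, legitimate key.
g[()] = "plain"

# Emulates the proposed g[spam=1, eggs=2] = "keyword only":
# the positional index arrives as the empty tuple.
type(g).__setitem__(g, (), "keyword only", spam=1, eggs=2)

# Emulates the proposed g[(), spam=1, eggs=2]: an explicit empty-tuple index.
explicit = type(g).__getitem__(g, (), spam=1, eggs=2)
```

Under this reading, the class cannot tell an explicit `()` index from "no positional index at all", which is what makes `()` act as a default key.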
I'd like to hear more about why the empty tuple has been selected as the default index.
On Fri, Sep 25, 2020 at 6:05 AM Ricky Teachey <ricky@teachey.org> wrote:
I'd like to hear more about why the empty tuple has been selected as the default index.
It makes sense to me: if more than one index is passed, they are passed as a tuple, so many classes need to handle tuples anyway. What other options are there? I suppose None is a possibility, but None is a valid dict key, so probably not a great idea. Hmm, so is an empty tuple. Darn. I think having no default is a better option, as someone pointed out already in this thread. -CHB
On Fri, Sep 25, 2020 at 3:36 PM Christopher Barker <pythonchb@gmail.com> wrote:
I think having no default is a better option, as someone pointed out already in this thread.
That is where my thinking went as well, but I probably haven't thought through all the implications. Essentially, you'd be letting the writer of the __XXXitem__ method(s) choose the default, rather than making the decision for them. --- Ricky.
On 2020-09-25 20:36, Christopher Barker wrote:
It makes sense to me: if more than one index is passed, they are passed as a tuple. so many classes need to handle tuples anyway.
What other options are there? I suppose None is a possibility, but None is a valid dict key, so probably not a great idea. Hmm, so is an empty tuple. Darn.
I think having no default is a better option, as someone pointed out already in this thread.
It currently doesn't support multiple indexes, so there's no distinction between one index that's a 2-tuple and 2 indexes: d[(1, 2)] == d[1, 2]. Using an empty tuple as the default index isn't that bad, assuming you're going to allow a default.
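The current behavior described here can be checked with a small class that simply returns whatever index it receives. The `Recorder` name is made up for this sketch.

```python
class Recorder:
    """Returns whatever index object __getitem__ receives."""

    def __getitem__(self, index):
        return index


r = Recorder()
one_tuple_index = r[(1, 2)]   # a single index that happens to be a 2-tuple
two_indexes = r[1, 2]         # written as two indexes, received as one tuple
single = r[1]                 # a lone index is passed bare, not as a 1-tuple
```

Both spellings hand `__getitem__` the same tuple, so the method has no way to tell them apart.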
On Fri, Sep 25, 2020 at 1:36 PM MRAB <python@mrabarnett.plus.com> wrote:
It currently doesn't support multiple indexes, so there's no distinction between one index that's a 2-tuple and 2 indexes: d[(1, 2)] == d[1, 2].
Yeah, but one index isn't in a 1-tuple (much discussed on this thread), so now we have the somewhat awkward (if consistent with the language):
obj[i] --> __getitem__(i)
obj[i, j] --> __getitem__((i, j))
If we make the default an empty tuple, then we'll have:
obj[] --> __getitem__(())
obj[i] --> __getitem__(i)
obj[i, j] --> __getitem__((i, j))
Or would the default only be used if there were one or more keyword arguments? If so, we'd still have:
obj[keyword=k] --> __getitem__((), keyword=k)
obj[i, keyword=k] --> __getitem__(i, keyword=k)
obj[i, j, keyword=k] --> __getitem__((i, j), keyword=k)
Which is, shall we say, not ideal. -CHB
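The keyword cases in the mapping above can be emulated in current Python with a helper function. This is a sketch only: `subscript` and `Recorder` are hypothetical names, and the helper encodes the draft rule as described in this thread rather than any released behavior.

```python
def subscript(obj, *indices, **keywords):
    """Emulate the draft translation of obj[...] into a __getitem__ call."""
    if len(indices) == 1:
        index = indices[0]    # obj[i] -> bare index
    else:
        index = indices       # obj[i, j] -> tuple; keyword-only -> ()
    return type(obj).__getitem__(obj, index, **keywords)


class Recorder:
    def __getitem__(self, index, **keywords):
        return index, keywords


r = Recorder()
calls = [
    subscript(r, keyword="k"),         # stands for obj[keyword=k]
    subscript(r, 1, keyword="k"),      # stands for obj[i, keyword=k]
    subscript(r, 1, 2, keyword="k"),   # stands for obj[i, j, keyword=k]
]
```

The three results show the index arriving as `()`, a bare value, and a tuple respectively, which is the inconsistency being pointed out.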
On Fri, 25 Sep 2020 at 14:07, Ricky Teachey <ricky@teachey.org> wrote:
I'd like to hear more about why the empty tuple has been selected as the default index.
It's still not settled. Steven proposed None, I propose empty tuple for affinity with *args behavior, others proposed no positional arg at all, but then there's the problem of the setter. I sent a mail to the numpy mailing list yesterday, and all people who replied seemed to prefer empty tuple. Here is the thread http://numpy-discussion.10968.n7.nabble.com/Request-for-comments-on-PEP-637-... -- Kind regards, Stefano Borini
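One plausible reading of "the problem of the setter", sketched under assumptions: if no positional index is passed at all, the assigned value lands in the slot that normally holds the index. The `Store` class is hypothetical, and `value=None` is only there so the second call does not raise.

```python
class Store:
    """Hypothetical class following the usual (index, value) setter order."""

    def __init__(self):
        self.log = []

    def __setitem__(self, index, value=None, **keywords):
        self.log.append({"index": index, "value": value, "keywords": keywords})


s = Store()

# Sentinel approach: the proposed s[unit="m"] = 5 would pass () as the index.
Store.__setitem__(s, (), 5, unit="m")

# "No positional index" approach: only the value is passed positionally,
# so it binds to the parameter that normally receives the index.
Store.__setitem__(s, 5, unit="m")
```

With a getter there is no second positional argument to shift, so the issue only shows up for assignment.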
On Fri, Sep 25, 2020 at 4:57 PM Stefano Borini <stefano.borini@gmail.com> wrote:
It's still not settled. Steven proposed None, I propose empty tuple for affinity with *args behavior, others proposed no positional arg at all, but then there's the problem of the setter. I sent a mail to the numpy mailing list yesterday, and all people who replied seemed to prefer empty tuple.
* sheepishly, whispering mostly to myself * All of that could be solved with new dunders. --- Ricky.
Ricky, I'd love new dunders too but it would explode in complexity. The likelihood of being approved is pretty much non-existent as far as I understand. If you want, we can write together a competing PEP with that solution, so that the Steering Council can have two ideas. I would not mind exploring that option a bit more, but it's likely to become an exercise. -- Kind regards, Stefano Borini
On Fri, Sep 25, 2020 at 05:05:52PM -0400, Ricky Teachey wrote:
* sheepishly, whispering mostly to myself *
All of that could be solved with new dunders.
What would be the signature of the new dunders? How will the new dunders be backwards compatible? Previous discussions on this required runtime introspection to decide which set of dunders is called. That's going to be slow. There's also the confusion of having two sets of very similar dunders. We got rid of `__getslice__` and friends back in the early 2.x days. Do we want to bring back another set of dunders with very similar names? What would they be called? I've seen proposals `__get_item__` and `__getindex__` which are begging to be confused with `__getitem__`. -- Steve
On Fri, Sep 25, 2020 at 6:06 PM Stefano Borini <stefano.borini@gmail.com> wrote:
Ricky, I'd love new dunders too but it would explode in complexity. The likelihood of being approved is pretty much non-existent as far as I understand.
If you want, we can write together a competing PEP with that solution, so that the Steering Council can have two ideas. I would not mind exploring that option a bit more, but it's likely to become an exercise.
You're kind to offer. I really don't feel qualified. I'm not even a professional developer. On Fri, Sep 25, 2020 at 10:47 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Sep 25, 2020 at 05:05:52PM -0400, Ricky Teachey wrote:
* sheepishly, whispering mostly to myself *
All of that could be solved with new dunders.
What would be the signature of the new dunders?
Well, either new dunders, or the other approach where there is a flag that changes the way the index contents are passed to the existing dunders, in this form:
def __setindex__(self, value, *args, **kwds)
But honestly I was apprehensive of even bringing it up again (hence my *sheepishly* comment) and now regret doing so. It's probably not worth rehashing it. Especially after Guido's summary post, in which he looked at the various ideas and what it would take to implement them at the C level, it feels like the idea was defeated. So I should probably just give over. However, if any of the people who were in favor of the idea want to take up the flag for it, I would be very happy for them to do that. If it's just me at this point, I'm just not capable/qualified to make the case for it, so it isn't appropriate for me to continue hammering at it. Just bringing it up one last time to make sure nobody else is as passionate about it as me. --- Ricky.
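For readers who missed the earlier threads, here is a rough sketch of how a new-dunder scheme might dispatch. It uses the `__setindex__` signature quoted above; everything else is an assumption. The `assign` helper stands in for what the interpreter would do, and the `getattr` lookup is the kind of runtime introspection Steven objected to.

```python
class NewStyle:
    """Defines the hypothetical __setindex__ dunder."""

    def __init__(self):
        self.calls = []

    def __setindex__(self, value, *args, **kwds):
        self.calls.append((value, args, kwds))


class OldStyle:
    def __init__(self):
        self.calls = []

    def __setitem__(self, index, value):
        self.calls.append((index, value))


def assign(obj, value, *args, **kwds):
    """Emulate obj[*args, **kwds] = value, preferring the new dunder."""
    new_dunder = getattr(type(obj), "__setindex__", None)
    if new_dunder is not None:
        return new_dunder(obj, value, *args, **kwds)
    if kwds:
        raise TypeError("keyword indices need __setindex__")
    index = args[0] if len(args) == 1 else args
    return type(obj).__setitem__(obj, index, value)


new, old = NewStyle(), OldStyle()
assign(new, "v", 1, 2, axis="x")
assign(new, "v", (1, 2))    # a tuple index stays distinct from two indexes
assign(old, "v", 1, 2)
```

With `*args`, the new-style class can tell `(1, 2)` as one index from `1, 2` as two, at the cost of a second set of methods and a lookup on every subscript.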
Quoting levels are a bit messed up in David's post. I've tried to fix them, but apologies if I'm attributing words to David that he didn't write. On Thu, Sep 24, 2020 at 07:04:31PM -1000, David Mertz wrote:
Is this a breaking change? It feels borderline.
Keyword-only subscripts are permitted. The positional index will be the empty tuple:
obj[spam=1, eggs=2] # calls type(obj).__getitem__(obj, (), spam=1, eggs=2)
I.e. consider:
>>> d = dict()
>>> d[()] = "foo"
>>> d
{(): 'foo'}
I don't really object to this fact, and one could argue it's not a breaking change since a built-in dict will simply raise an exception with keyword arguments. However, it does make the empty tuple the "default key" for new objects that will accept keyword indices.
I agree with Ricky that the choice of empty tuple should be justified better by the PEP, and alternatives (None, NotImplemented) discussed. But I don't think this is a breaking change. Can you explain further what you think will break? -- Steve
On Fri, Sep 25, 2020 at 4:36 PM Steven D'Aprano <steve@pearwood.info> wrote:
I.e. consider:
>>> d = dict()
>>> d[()] = "foo"
>>> d
{(): 'foo'}
I agree with Ricky that the choice of empty tuple should be justified better by the PEP, and alternatives (None, NotImplemented) discussed. But I don't think this is a breaking change. Can you explain further what you think will break?
I think empty tuple is the best choice, and the discussion on the NumPy list mostly seems to agree. I agree that it's basically just a question of making the PEP more explicit in explaining the choice.
In a direct sense, no matter what `newthing[foo=1, bar=4:5]` does internally, it CANNOT be a breaking change, since that is a SyntaxError now.
However, it's also the case that if any sentinel is used for "keywords only", whether None or () or EmptyIndexSentinel, it "steps on" using that value as an actual index. E.g. we would have:
newthing[(), foo=1, bar=4:5] == newthing[foo=1, bar=4:5]
... well, we have that if subscript access is not mutation, that is. Which it hopefully will not be, but anything is possible in code.
On Fri, Sep 25, 2020 at 8:26 PM David Mertz <mertz@gnosis.cx> wrote:
E.g. we would have:
newthing[(), foo=1, bar=4:5] == newthing[foo=1, bar=4:5]
Right, but we also have newthing[(2, 3), foo=1, bar=4:5] == newthing[2, 3, foo=1, bar=4:5] which seems exactly analogous. A disambiguation scheme that worked for every n might be worth it, but one that only works for n=0 and makes that case less consistent with n>=2 doesn't seem worth it to me.
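To make the packing rules discussed above concrete, here is a minimal sketch. Keyword subscripts are not valid syntax in current Python, so it calls `__getitem__` directly with the arguments the draft PEP says the subscript would translate to. The `Recorder` class is purely illustrative and not part of the PEP.

```python
class Recorder:
    """Hypothetical class that just reports what __getitem__ received."""

    def __getitem__(self, index, **kwargs):
        return index, kwargs


obj = Recorder()

# Proposed obj[spam=1, eggs=2]: the positional index is the empty tuple.
keyword_only = type(obj).__getitem__(obj, (), spam=1, eggs=2)

# Proposed obj[(), spam=1, eggs=2]: the dunder receives exactly the same call.
explicit_empty = type(obj).__getitem__(obj, (), spam=1, eggs=2)

# Today's syntax already has the analogous collision for two or more indices.
same_for_pairs = obj[2, 3] == obj[(2, 3)]

print(keyword_only, explicit_empty, same_for_pairs)
```

Under these assumptions the class cannot tell the keyword-only form from an explicit `()` index, just as it cannot tell `obj[2, 3]` from `obj[(2, 3)]` today.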
On Fri, 25 Sep 2020 at 05:55, David Mertz <mertz@gnosis.cx> wrote:
Smuggling generic function calls into square brackets is undesirable (but obviously not preventable at a language level). However, I feel like keywords might very often "alter the behavior."
For example, imagine we have a distributed array within an engine that is intended to be "eventually consistent." I would find code like this, in some hypothetical library, to be clear and useful, and not to violate the spirit of indexing.
snapshot1 = remote_array[300:310, 50:60, 30:35, source=worker1]
snapshot2 = remote_array[300:310, 50:60, 30:35, source=worker2]
I would personally be very uneasy with this. A better solution would be to have a proxy object that handles that:
snapshot1 = remote_array.source(worker1)[300:310, 50:60, 30:35]
Of course people can (and will) abuse the feature, but I would personally consider it poor design. These tricks were discussed in PEP-472 (e.g. specify the unit to be returned), but I always felt uncomfortable with them. -- Kind regards, Stefano Borini
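The proxy-object alternative mentioned above can be sketched as follows. This runs on current Python because the worker is chosen by a method call rather than inside the subscript. `RemoteArray`, `_SourceView`, and the per-worker dictionary are assumptions made up for illustration, not any real library's API.

```python
class RemoteArray:
    """Hypothetical stand-in for a distributed array; data is kept per worker."""

    def __init__(self, copies):
        self._copies = copies  # maps a worker name to that worker's local data

    def source(self, worker):
        # Return a lightweight view bound to one worker instead of
        # passing the worker through the subscript.
        return _SourceView(self._copies[worker])


class _SourceView:
    """Proxy that indexes into a single worker's copy of the data."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, index):
        return self._data[index]


remote_array = RemoteArray({
    "worker1": [10, 20, 30, 40],
    "worker2": [10, 20, 31, 40],  # not yet consistent with worker1
})
snapshot1 = remote_array.source("worker1")[1:3]
snapshot2 = remote_array.source("worker2")[1:3]
print(snapshot1, snapshot2)
```

The trade-off is an extra object per lookup in exchange for keeping the subscript purely positional.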
On Fri, Sep 25, 2020 at 09:53:41PM +0100, Stefano Borini wrote:
On Fri, 25 Sep 2020 at 05:55, David Mertz <mertz@gnosis.cx> wrote:
Smuggling generic function calls into square brackets is undesirable (but obviously not preventable at a language level). However, I feel like keywords might very often "alter the behavior."
For example, imagine we have a distributed array within an engine that is intended to be "eventually consistent." I would find code like this, in some hypothetical library, to be clear and useful, and not to violate the spirit of indexing.
snapshot1 = remote_array[300:310, 50:60, 30:35, source=worker1]
snapshot2 = remote_array[300:310, 50:60, 30:35, source=worker2]
I would personally be very uneasy with this. A better solution would be to have a proxy object that handles that:
snapshot1 = remote_array.source(worker1)[300:310, 50:60, 30:35]
Of course people can (and will) abuse the feature, but I would personally consider it poor design.
Did you just completely undermine the rationale for your own PEP? Isn't the entire purpose of this PEP to allow subscripts to include keyword arguments? And now you are describing it as "poor design"?
For what it's worth, I think David's example is perfectly clear, within the spirit of subscripting, and much more understandable to the reader than a `.source` method that returns a proxy. I'm not really sure why this hypothetical call:
snapshot1 = remote_array[300:310, 50:60, 30:35, source=worker1]
is "abuse" or should make us more uneasy than this hypothetical call:
snapshot1 = remote_array[300:310, 50:60, 30:35, axis=1]
say. Both cases would change the behaviour in some way. David's example specifies the distributed source to use, the second specifies the axis to use. Why is the first abuse and the second not?
-- Steve
On Sat, 26 Sep 2020 at 04:02, Steven D'Aprano <steve@pearwood.info> wrote:
Did you just completely undermine the rationale for your own PEP?
Isn't the entire purpose of this PEP to allow subscripts to include keyword arguments? And now you are describing it as "poor design"?
Not really. To _me_, an indexing operation remains an indexing operation. My personal use cases are two:
1. naming axes (e.g. replace, if desired, obj[1, 2] with obj[row=1, col=2])
2. typing generics MyType[T=int]
Other use cases are certainly allowed, but to me, something like
a[1, 2, unit="meters"]
makes me feel uncomfortable, although I might learn to accept it. In particular, the above case becomes kind of odd when you use it for setitem and delitem
a[1, 2, unit="meters"] = 3
what does this mean? convert 3 to meters and store the value in a? Then why isn't the unit close to 3, as in
a[1,2] = 3 * meters
and what about this one?
del a[1, 2, unit="meters"] # and this one?
I feel that, for some of those use cases (like the source one), there's a well established, traditional design pattern that fits it "better" (as in, it feels "right", "more familiar")
I'm not really sure why this hypothetical call:
snapshot1 = remote_array[300:310, 50:60, 30:35, source=worker1]
is "abuse" or should make us more uneasy than this hypothetical call:
I don't know... it just doesn't feel... "right" :) but maybe there's a logic to it. You are indexing on the indexes, and also on the source. Yeah, makes sense. Sold. -- Kind regards, Stefano Borini
On Sat, Sep 26, 2020 at 7:51 AM Stefano Borini <stefano.borini@gmail.com> wrote:
On Sat, 26 Sep 2020 at 04:02, Steven D'Aprano <steve@pearwood.info> wrote:
Did you just completely undermine the rationale for your own PEP?
Isn't the entire purpose of this PEP to allow subscripts to include keyword arguments? And now you are describing it as "poor design"?
Not really. to _me_, an indexing operation remains an indexing operation. My personal use cases are two:
1. naming axes (e.g. replace, if desired obj[1, 2] with obj[row=1, col=2]) 2. typing generics MyType[T=int]
In this fashion, have you considered having keyword-only indices, that is, to only allow either obj[1, 2] or obj[row=1, col=2] (if the class supports it), and disallow mixing positional and keyword indices, meaning obj[1, col=2] would be a SyntaxError? If we followed that path, then adding a new set of dunders may not be that problematic, as the use case would be slightly different from the current semantics. One could implement the current set of dunders __[get|set|del]item__ to support keywordless indexing, and this hypothetical new set of dunders to support keyword indices. If you want both, you need to implement both. However, a decorator may be added to easily allow both semantics. Has anyone provided compelling use cases for mixing indices with and without keywords? I also agree with Stefano that something like a[1, 2, unit="meters"] feels really odd, but maybe by adding the names to the first 2 dimensions the intent could be clearer.
Other use cases are certainly allowed, but to me, something like
a[1, 2, unit="meters"]
makes me feel uncomfortable, although I might learn to accept it. In particular, the above case becomes kind of odd when you use it for setitem and delitem
a[1, 2, unit="meters"] = 3
what does this mean? convert 3 to meters and store the value in a? Then why isn't the unit close to 3, as in
a[1,2] = 3 * meters
and what about this one?
del a[1, 2, unit="meters"] # and this one?
I feel that, for some of those use cases (like the source one), there's a well established, traditional design pattern that fits it "better" (as in, it feels "right", "more familiar")
I'm not really sure why this hypothetical call:
snapshot1 = remote_array[300:310, 50:60, 30:35, source=worker1]
is "abuse" or should make us more uneasy than this hypothetical call:
I don't know... it just doesn't feel... "right" :) but maybe there's a logic to it. You are indexing on the indexes, and also on the source.
Yeah, makes sense.
Sold.
-- Kind regards,
Stefano Borini _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/IUGWBN... Code of Conduct: http://python.org/psf/codeofconduct/
-- Sebastian Kreft
On Sat, Sep 26, 2020 at 01:47:56PM -0300, Sebastian Kreft wrote:
In this fashion, have you considered having keyword-only indices, that is, to only allow either obj[1, 2] or obj[row=1, col=2] (if the class supports it), and disallow mixing positional and keyword indices, meaning obj[1, col=2] would be a SyntaxError?
That would severely reduce the usefulness of this feature for me, probably by 80 or 90%, and possibly make it useless for xarray and pandas. (I don't speak for the pandas or xarray devs, I'm happy to be corrected.)
But for my own code, my primary use-case is to have a mixed positional index and keyword arguments. I would *much* rather give up keyword-only subscripting. Keyword-only subscripting would be a convenience and a useful feature, but it's mixed subscripting that I am really hungry for.
Hmmm, that's a thought... maybe we should just prohibit keyword-only calls for now? People who want a "keyword only" call can provide their own "do nothing" sentinel as the index. For my own purposes, I could easily adapt the keyword-only cases to use None as an explicit sentinel:
matrix[row=1] # Nice to have.
matrix[None, row=1] # Good enough for my purposes.
so for my own purposes, if keyword-only is too hard, I'd be happy to drop it out of the PEP and require an explicit index.
If we followed that path, then adding a new set of dunders may not be that problematic as the use case would be slightly different than the current semantics.
I don't see how "the use case is different" solves the problems with adding new dunders. Adding new dunders is problematic because:
* they probably require new C level slots in objects, increasing their size;
* and the added complexity and likelihood of confusion for developers. Do I write `__getitem__` or `__getindex__`?
We had this back in Python 1 and 2 days with `__*item__` and `__*slice__` dunders and moved away from that, let's not revert back to that design if we can avoid it.
I also agree with Stefano that something like a[1, 2, unit="meters"] feels really odd,
Could be because of the misspelling of metres (unit of measurement) versus meters (things you read data from) *wink* Without knowing what the object `a` represents, or what the meaning of the subscript is, how can we possibly judge whether this is a reasonable use of a keyword subscript or not? Stefano wrote:
a[1, 2, unit="meters"]
makes me feel uncomfortable, although I might learn to accept it. In particular, the above case becomes kind of odd when you use it for setitem and delitem
a[1, 2, unit="meters"] = 3
what does this mean? convert 3 to meters and store the value in a?
You made the example up, so you should know what it means :-) I don't even know what `a[1, 2, unit="meters"]` means since we don't know what `a` is.
Then why isn't the unit close to 3, as in
a[1,2] = 3 * meters
and what about this one?
del a[1, 2, unit="meters"] # and this one?
In my own use-case, all three get/set/delete operations will support the same keywords, but just because we syntactically allow keywords in subscripts doesn't make it mandatory for them to be identically useful for all three operations in all cases. There is no need for the dunder to support a keyword that makes no sense for that dunder. For example, suppose we had a sequence-like mapping that supports this get operation:
table[index, default="index out of range"]
That doesn't mean that table has to support the same keyword when setting or deleting, it can use different keywords that make no sense when getting:
table = Table([1, 2, 3, 4])
table[10, fill=-1] = 5
print(table) # => [1, 2, 3, 4, -1, -1, -1, -1, -1, -1, 5]
del table[index, errors="ignore"] # Silence error if out of range.
There's no requirement that the three get/set/delete operations have to use the same keywords, or any keywords at all, only that they *could* if required. *Allowing* symmetry between the three options is a must; *requiring* symmetry between them is not.
-- Steve
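As a rough illustration of the asymmetric keywords described above, the following sketch gives each dunder its own keyword-only parameter. Because the keyword subscript syntax does not exist in current Python, the dunders are called explicitly. The `Table` class body is an assumption, since the message only shows how it would be used.

```python
class Table:
    """Hypothetical sequence-like container with per-operation keywords."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index, *, default=None):
        try:
            return self._items[index]
        except IndexError:
            return default

    def __setitem__(self, index, value, *, fill=None):
        # Pad with `fill` until `index` becomes a valid position.
        if index >= len(self._items):
            self._items.extend([fill] * (index - len(self._items) + 1))
        self._items[index] = value

    def __delitem__(self, index, *, errors="raise"):
        try:
            del self._items[index]
        except IndexError:
            if errors != "ignore":
                raise

    def __repr__(self):
        return repr(self._items)


table = Table([1, 2, 3, 4])
table.__setitem__(10, 5, fill=-1)        # proposed: table[10, fill=-1] = 5
missing = table.__getitem__(99, default="index out of range")
table.__delitem__(99, errors="ignore")   # proposed: del table[99, errors="ignore"]
print(table)  # => [1, 2, 3, 4, -1, -1, -1, -1, -1, -1, 5]
```

Each dunder accepts only the keyword that makes sense for it, so passing `fill` to the getter would raise a `TypeError` in the usual way.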
On Sun, Sep 27, 2020 at 12:43 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 26, 2020 at 01:47:56PM -0300, Sebastian Kreft wrote:
In this fashion, have you considered having keyword-only indices, that is, to only allow either obj[1, 2] or obj[row=1, col=2] (if the class supports it), and disallow mixing positional and keyword indices, meaning obj[1, col=2] would be a SyntaxError?
That would severely reduce the usefulness of this feature for me, probably by 80 or 90%, and possibly make it useless for xarray and pandas.
(I don't speak for the pandas or xarray devs, I'm happy to be corrected.)
But for my own code, my primary use-case is to have a mixed positional index and keyword arguments. I would *much* rather give up keyword-only subscripting. Keyword-only subscripting would be a convenience and a useful feature, but it's mixed subscripting that I am really hungry for.
Hi Steven, could you share some examples of what you have in mind. Having a more concrete example of an API that would benefit from mixed-subscripting would allow us to better understand its usefulness. I'm uneasy with the abuse-of-notation potential of this proposed feature, especially around the use of **kwd and mixed subscripting. That being said, I do understand why it may be appealing to others, as that could open the door for a more expressive language and maybe even better suited for some DSLs.
Hmmm, that's a thought... maybe we should just prohibit keyword-only calls for now? People who want a "keyword only" call can provide their own "do nothing" sentinel as the index.
For my own purposes, I could easily adapt the keyword-only cases to use None as an explicit sentinel:
matrix[row=1] # Nice to have.
matrix[None, row=1] # Good enough for my purposes.
so for my own purposes, if keyword-only is too hard, I'd be happy to drop it out of the PEP and require an explicit index.
If we followed that path, then adding a new set of dunders may not be that problematic as the use case would be slightly different than the current semantics.
I don't see how "the use case is different" solves the problems with adding new dunders.
Adding new dunders is problematic because:
* they probably require new C level slots in objects, increasing their size;
* and the added complexity and likelihood of confusion for developers. Do I write `__getitem__` or `__getindex__`?
We had this back in Python 1 and 2 days with `__*item__` and `__*slice__` dunders and moved away from that, let's not revert back to that design if we can avoid it.
I also agree with Stefano that something like a[1, 2, unit="meters"] feels really odd,
Could be because of the misspelling of metres (unit of measurement) versus meters (things you read data from) *wink*
Without knowing what the object `a` represents, or what the meaning of the subscript is, how can we possibly judge whether this is a reasonable use of a keyword subscript or not?
Stefano wrote:
a[1, 2, unit="meters"]
makes me feel uncomfortable, although I might learn to accept it. In particular, the above case becomes kind of odd when you use it for setitem and delitem
a[1, 2, unit="meters"] = 3
what does this mean? convert 3 to meters and store the value in a?
You made the example up, so you should know what it means :-)
I don't even know what `a[1, 2, unit="meters"]` means since we don't know what `a` is.
Then why isn't the unit close to 3, as in
a[1,2] = 3 * meters
and what about this one?
del a[1, 2, unit="meters"] # and this one?
In my own use-case, all three get/set/delete operations will support the same keywords, but just because we syntactically allow keywords in subscripts doesn't make it mandatory for them to be identically useful for all three operations in all cases. There is no need for the dunder to support a keyword that makes no sense for that dunder.
For example, suppose we had a sequence-like mapping that supports this get operation:
table[index, default="index out of range"]
That doesn't mean that table has to support the same keyword when setting or deleting, it can use different keywords that make no sense when getting:
table = Table([1, 2, 3, 4])
table[10, fill=-1] = 5
print(table) # => [1, 2, 3, 4, -1, -1, -1, -1, -1, -1, 5]
del table[index, errors="ignore"] # Silence error if out of range.
There's no requirement that the three get/set/delete operations have to use the same keywords, or any keywords at all, only that they *could* if required.
*Allowing* symmetry between the three options is a must; *requiring* symmetry between them is not.
-- Steve
-- Sebastian Kreft
On Sun, Sep 27, 2020 at 07:59:18AM -0300, Sebastian Kreft wrote:
Hi Steven, could you share some examples of what you have in mind. Having a more concrete example of an API that would benefit from mixed-subscripting would allow us to better understand its usefulness.
I have an experimental Matrix class:
https://en.wikipedia.org/wiki/Matrix_(mathematics)
There are (at least) three indexing operations needed:
- row
- column
- individual cell
The first two support get, set and delete; the last supports only get and set. One obvious API would be a keyword to disambiguate between the first two cases:
matrix[3, 4] # unambiguously a cell reference
matrix[3] # ambiguous, forbidden
matrix[3, axis='row'] # unambiguously a row
matrix[3, axis='col'] # unambiguously a column
These could be supported for all of get, set and delete (except for cells) operations. A quick sketch of the implementation with minimal error checking for brevity:
def __setitem__(self, index, value, *, axis=None):
    if isinstance(index, tuple):
        # Operate on a cell.
        if axis is not None:
            raise TypeError("cell ops don't take axis keyword")
        i, j = index
        ... # bind a single cell
    elif isinstance(index, int):
        if axis == 'row':
            ... # bind the row
        elif axis == 'col':
            ... # bind the column
        else:
            raise ValueError('bad axis')
    else:
        raise TypeError
-- Steve
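A read-side counterpart to the `__setitem__` sketch above might look like the following. It is only an illustration: the list-of-rows storage and the `__getitem__` body are assumptions, and the keyword forms are spelled as explicit dunder calls because `matrix[1, axis='row']` does not parse in current Python.

```python
class Matrix:
    """Hypothetical matrix stored as a list of rows."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def __getitem__(self, index, *, axis=None):
        if isinstance(index, tuple):
            # A pair of ints always means a single cell.
            if axis is not None:
                raise TypeError("cell ops don't take axis keyword")
            i, j = index
            return self._rows[i][j]
        if isinstance(index, int):
            if axis == "row":
                return list(self._rows[index])
            if axis == "col":
                return [row[index] for row in self._rows]
            # A bare int without an axis is ambiguous, so it is rejected.
            raise ValueError("bad axis")
        raise TypeError("index must be an int or a pair of ints")


matrix = Matrix([[1, 2, 3], [4, 5, 6]])
cell = matrix[1, 2]                       # valid syntax today
row = matrix.__getitem__(1, axis="row")   # proposed: matrix[1, axis='row']
col = matrix.__getitem__(1, axis="col")   # proposed: matrix[1, axis='col']
print(cell, row, col)
```

With this layout `matrix[1]` raises `ValueError`, matching the "ambiguous, forbidden" case in the message.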
On Wed, 30 Sep 2020 at 06:44, Steven D'Aprano <steve@pearwood.info> wrote:
matrix[3, 4] # unambiguously a cell reference
matrix[3] # ambiguous, forbidden
matrix[3, axis='row'] # unambiguously a row
matrix[3, axis='col'] # unambiguously a column
I guess everybody already knows this, but I would hope the obvious choice here is matrix[row=3]. But in any case, the PEP does not prescribe how the keyword arguments should be used, only that they are possible. How they are used (or misused.. :) ) is of course up to the implementation. -- Kind regards, Stefano Borini
On Wed, Sep 30, 2020 at 2:44 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Sep 27, 2020 at 07:59:18AM -0300, Sebastian Kreft wrote:
Hi Steven, could you share some examples of what you have in mind. Having a more concrete example of an API that would benefit from mixed-subscripting would allow us to better understand its usefulness.
I have an experimental Matrix class:
https://en.wikipedia.org/wiki/Matrix_(mathematics)
There are (at least) three indexing operations needed:
- row
- column
- individual cell
The first two support get, set and delete; the last supports only get and set.
One obvious API would be a keyword to disambiguate between the first two cases:
matrix[3, 4] # unambiguously a cell reference
matrix[3] # ambiguous, forbidden
matrix[3, axis='row'] # unambiguously a row
matrix[3, axis='col'] # unambiguously a column
Have you considered using matrix[row=3], matrix[col=3]? In that case it would be a keyword only access. What advantages do you see with your current API? Or alternatively, using numpy's current syntax matrix[3, :], matrix[:, 3] (maybe `...` could be another option, if `:` is too magic)
These could be supported for all of get, set and delete (except for cells) operations. A quick sketch of the implementation with minimal error checking for brevity:
def __setitem__(self, index, value, *, axis=None):
    if isinstance(index, tuple):
        # Operate on a cell.
        if axis is not None:
            raise TypeError("cell ops don't take axis keyword")
        i, j = index
        ... # bind a single cell
    elif isinstance(index, int):
        if axis == 'row':
            ... # bind the row
        elif axis == 'col':
            ... # bind the column
        else:
            raise ValueError('bad axis')
    else:
        raise TypeError
-- Steve
-- Sebastian Kreft
On Wed, Sep 30, 2020 at 11:03:28AM -0300, Sebastian Kreft wrote:
Have you considered using matrix[row=3], matrix[col=3]? In that case it would be a keyword only access. What advantages do you see with your current API?
Yes I have considered it. One possible advantage is that with axis='row', I don't have to worry about, or test for, mutually exclusive keywords: matrix[row=2, col=3] # can't get both a row and a column at the same time But it's too early for me to worry too much about the exact choice of keywords. The PEP isn't even accepted yet :-) -- Steve
On Wed, 7 Oct 2020 at 11:39, Steven D'Aprano <steve@pearwood.info> wrote:
One possible advantage is that with axis='row', I don't have to worry about, or test for, mutually exclusive keywords:
matrix[row=2, col=3] # can't get both a row and a column at the same time
Well, they are not mutually exclusive in this case. You would get the element at row 2 col 3. -- Kind regards, Stefano Borini
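The keyword-only reading suggested here, where `row` and `col` together select a single element, could be sketched like this. `KeywordMatrix` and its storage are hypothetical, and the empty tuple passed as the index follows the draft PEP's rule for keyword-only subscripts. The calls are written as explicit dunder calls because the syntax is not available in current Python.

```python
class KeywordMatrix:
    """Hypothetical matrix indexed only through row= and col= keywords."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def __getitem__(self, index=(), *, row=None, col=None):
        if index != ():
            raise TypeError("this sketch only accepts keyword indices")
        if row is not None and col is not None:
            return self._rows[row][col]          # both given: one element
        if row is not None:
            return list(self._rows[row])         # row only: the whole row
        if col is not None:
            return [r[col] for r in self._rows]  # col only: the whole column
        raise TypeError("expected row=, col=, or both")


matrix = KeywordMatrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
element = matrix.__getitem__((), row=2, col=0)   # proposed: matrix[row=2, col=0]
whole_row = matrix.__getitem__((), row=2)        # proposed: matrix[row=2]
whole_col = matrix.__getitem__((), col=0)        # proposed: matrix[col=0]
print(element, whole_row, whole_col)
```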
On Sat, Sep 26, 2020 at 8:40 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 26, 2020 at 01:47:56PM -0300, Sebastian Kreft wrote:
In this fashion, have you considered having keyword-only indices, that is, to only allow either obj[1, 2] or obj[row=1, col=2] (if the class supports it), and disallow mixing positional and keyword indices, meaning obj[1, col=2] would be a SyntaxError?
That would severely reduce the usefulness of this feature for me, probably by 80 or 90%, and possibly make it useless for xarray and pandas.
(I don't speak for the pandas or xarray devs, I'm happy to be corrected.)
From my perspective as a developer for both xarray and pandas, both "mixed" and "keyword only" indexing have use cases, but I would guess keyword only indexing is more important. In xarray, we currently have methods that awkwardly approximate keyword only indexing (e.g., xarray.DataArray.sel() and xarray.DataArray.isel() both allow for named dimensions with **kwargs), but nothing for the "mixed" case (neither method supports positional *args).
Sorry if this isn't the right thread -- there are a few now. But for an example of using both positional and keyword index parameters:
I maintain a library (gridded) that provides an abstraction over data on various types of grid (in this case generally Oceanographic model output) -- they can be rectangular grids, curvilinear, unstructured triangular, .... The point of the library is to save the user from having to understand how all those grids work and, rather, be able to work with the data as if it were a continuous field. For example, if I want to know the sea surface temperature at a given location, I need to figure out what cell that location is in, what the values are at the corners of that cell, and then interpolate over the cell.
After abstracting that, one can create a gridded.Variable object, and then do:
sea_surface_temp.at(-78.123, 28.432)
and get the value at those coordinates. So it would be pretty nifty to do:
sea_surface_temp[-78.123, 28.432]
which of course I could do with Python as it is. But in some instances, there is more than one way to interpolate, so it would be great to have:
sea_surface_temp[-78.123, 28.432, interp='linear']
and that would require having mixed positional and keyword index parameters.
-CHB
On Sun, Sep 27, 2020 at 6:48 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Sat, Sep 26, 2020 at 8:40 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 26, 2020 at 01:47:56PM -0300, Sebastian Kreft wrote:
In this fashion, have you considered having keyword-only indices, that is, to only allow either obj[1, 2] or obj[row=1, col=2] (if the class supports it), and disallow mixing positional and keyword indices, meaning obj[1, col=2] would be a SyntaxError?
That would severely reduce the usefulness of this feature for me, probably by 80 or 90%, and possibly make it useless for xarray and pandas.
(I don't speak for the pandas or xarray devs, I'm happy to be corrected.)
From my perspective as a developer for both xarray and pandas, both "mixed" and "keyword only" indexing have use cases, but I would guess keyword only indexing is more important.
In xarray, we currently have methods that awkwardly approximate keyword only indexing (e.g., xarray.DataArray.sel() and xarray.DataArray.isel() both allow for named dimensions with **kwargs), but nothing for the "mixed" case (neither method supports positional *args).
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
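The mixed positional-plus-keyword lookup described in the message above can be sketched as follows. This is not the gridded API: `Field`, its regular-grid layout, and the `"nearest"` option are assumptions chosen to keep the example small, and the keyword form is called through the dunder because the syntax is not available in current Python.

```python
import math


class Field:
    """Hypothetical field on a regular lon/lat grid (not the gridded API)."""

    def __init__(self, lon0, lat0, step, values):
        self.lon0, self.lat0, self.step = lon0, lat0, step
        # values[i][j] is the value at (lon0 + i * step, lat0 + j * step)
        self.values = values

    def at(self, lon, lat, *, interp="linear"):
        # Fractional grid coordinates; no bounds checking in this sketch.
        x = (lon - self.lon0) / self.step
        y = (lat - self.lat0) / self.step
        if interp == "nearest":
            return self.values[round(x)][round(y)]
        if interp == "linear":
            i, j = math.floor(x), math.floor(y)
            fx, fy = x - i, y - j
            v = self.values
            # Bilinear interpolation over the four corners of the cell.
            return (v[i][j] * (1 - fx) * (1 - fy)
                    + v[i + 1][j] * fx * (1 - fy)
                    + v[i][j + 1] * (1 - fx) * fy
                    + v[i + 1][j + 1] * fx * fy)
        raise ValueError(f"unknown interp: {interp!r}")

    def __getitem__(self, index, *, interp="linear"):
        lon, lat = index
        return self.at(lon, lat, interp=interp)


sea_surface_temp = Field(0.0, 0.0, 1.0, [[0.0, 1.0], [2.0, 3.0]])
linear = sea_surface_temp[0.5, 0.5]  # valid today; uses the default interp
# Proposed spelling: sea_surface_temp[0.25, 0.75, interp='nearest']
nearest = sea_surface_temp.__getitem__((0.25, 0.75), interp="nearest")
print(linear, nearest)
```

Points on the upper edge of the grid would index past the last cell in the linear branch, so a real implementation would need bounds handling.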
On Tue, Sep 29, 2020 at 2:29 AM Christopher Barker <pythonchb@gmail.com> wrote:
Sorry if this isn't the right thread -- there's a few now.
But for an example of using both positional and keyword index parameters:
I maintain a library (gridded) that provides an abstraction over data on various types of grid (in this case generally Oceanographic model output) -- they can be rectangular grids, curvilinear, unstructured triangular, .... The point of the library is to save the user from having to understand how all those grids work and, rather, be able to work with the data as if it were a continuous field. For example, if I want to know the sea surface temperature at a given location, I need to figure out what cell that location is in, what the values are at the corners of that cell, and then interpolate over the cell.
After abstracting that, one can create a gridded.Variable object, and then do:
sea_surface_temp.at(-78.123, 28.432)
and get the value at those coordinates.
So it would be pretty nifty to do:
sea_surface_temp[-78.123, 28.432], which of course I could do with Python as it is.
But in some instances, there is more than one way to interpolate, so it would be great to have:
sea_surface_temp[-78.123, 28.432, interp='linear']
I presume you would only use this to get the temperature and not to set or delete measurements. Is that correct?
and that would require having mixed positional and keyword index parameters.
-CHB
On Sun, Sep 27, 2020 at 6:48 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Sat, Sep 26, 2020 at 8:40 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 26, 2020 at 01:47:56PM -0300, Sebastian Kreft wrote:
In this fashion, have you considered having keyword-only indices, that is, to only allow either obj[1, 2] or obj[row=1, col=2] (if the class supports it), and disallow mixing positional and keyword indices, meaning obj[1, col=2] would be a SyntaxError?
That would severely reduce the usefulness of this feature for me, probably by 80 or 90%, and possibly make it useless for xarray and pandas.
(I don't speak for the pandas or xarray devs, I'm happy to be corrected.)
From my perspective as a developer for both xarray and pandas, both "mixed" and "keyword only" indexing have use cases, but I would guess keyword only indexing is more important.
In xarray, we currently have methods that awkwardly approximate keyword only indexing (e.g., xarray.DataArray.sel() and xarray.DataArray.isel() both allow for named dimensions with **kwargs), but nothing for the "mixed" case (neither method supports positional *args).
-- Christopher Barker, PhD
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
-- Sebastian Kreft
On Tue, Sep 29, 2020 at 8:02 AM Sebastian Kreft <skreft@gmail.com> wrote:
But in some instances, there is more than one way to interpolate, so it would be great to have:
sea_surface_temp[-78.123, 28.432, interp='linear']
I presume you would only use this to get the temperature and not to set or delete measurements. Is that correct?
Yes. Working with this gridded data is pretty much a one-way street. If you want to change the values, you really need to work with the underlying arrays. Though now you mention it, I may think about that some more -- maybe there's something that could be done there? I haven't thought about it because neither I nor any of my users have a use case for that. -CHB
and that would require having mixed positional and keyword index parameters.
-CHB
On Sun, Sep 27, 2020 at 6:48 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Sat, Sep 26, 2020 at 8:40 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Sep 26, 2020 at 01:47:56PM -0300, Sebastian Kreft wrote:
In this fashion, have you considered having keyword-only indices, that is, to only allow either obj[1, 2] or obj[row=1, col=2] (if the class supports it), and disallow mixing positional and keyword indices, meaning obj[1, col=2] would be a SyntaxError?
That would severely reduce the usefulness of this feature for me, probably by 80 or 90%, and possibly make it useless for xarray and pandas.
(I don't speak for the pandas or xarray devs, I'm happy to be corrected.)
From my perspective as a developer for both xarray and pandas, both "mixed" and "keyword only" indexing have use cases, but I would guess keyword only indexing is more important.
In xarray, we currently have methods that awkwardly approximate keyword only indexing (e.g., xarray.DataArray.sel() and xarray.DataArray.isel() both allow for named dimensions with **kwargs), but nothing for the "mixed" case (neither method supports positional *args).
-- Christopher Barker, PhD
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
-- Sebastian Kreft
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Sat, Sep 26, 2020 at 12:50 AM Stefano Borini <stefano.borini@gmail.com> wrote:
Other use cases are certainly allowed, but to me, something like
a[1, 2, unit="meters"]
a[1, 2, unit="meters"] = 3
makes me feel uncomfortable, although I might learn to accept it. Then why isn't the unit close to 3, as in
a[1,2] = 3 * meters
I think this is a very natural looking use case, and is not as well addressed by putting the multiplier outside the indexing. Imagine you have two arrays that store lengths, but not necessarily in the same units. Whatever the underlying representation, we would be able to ask, e.g.:
a[1, 2, unit="furlongs"] == b[1, 2, unit="furlongs"]
Sure, `a` might store its data as AUs and `b` might store its data as Planck lengths. But the comparison like that would still work. More importantly, a user wouldn't have to KNOW the underlying representation if she knew there was a "unit" term. Even though there obviously *is* some computation in converting the units, it still feels like it's within the concept of "lookup" into an array (or other data structure).
del a[1, 2, unit="meters"] # and this one?
I presume in this case that the 'unit' parameter would just be ignored, but the operation would still succeed. Obviously, that's up to the class, but that feels like a natural use case. I'm deleting the thing at index (1, 2), but for deletion, the idea of conversion from "native units" to the specified units is merely irrelevant (but doesn't make the intent unclear either). -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
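A minimal sketch of the unit-aware lookup described above, offered only as an illustration. The class `LengthGrid` and the table `METERS_PER` are hypothetical names, not part of any existing library. Because `a[1, 2, unit="furlongs"]` is not valid syntax in current Python, the sketch calls `__getitem__` explicitly with the index tuple and the keyword.

```python
import math

# Hypothetical conversion table: metres per unit.
METERS_PER = {"meters": 1.0, "inches": 0.0254, "furlongs": 201.168}


class LengthGrid:
    """Toy 2-D store of lengths kept in a single native unit (hypothetical)."""

    def __init__(self, native_unit, cells):
        self.native_unit = native_unit
        self.cells = dict(cells)  # {(row, col): value in native units}

    def __getitem__(self, index, *, unit=None):
        value = self.cells[index]
        if unit is None:
            return value  # plain lookup, native units
        # Convert native -> metres -> requested unit.
        return value * METERS_PER[self.native_unit] / METERS_PER[unit]


a = LengthGrid("meters", {(1, 2): 201.168})
b = LengthGrid("inches", {(1, 2): 7920.0})

# Stand-ins for a[1, 2, unit="furlongs"] and b[1, 2, unit="furlongs"].
in_a = a.__getitem__((1, 2), unit="furlongs")
in_b = b.__getitem__((1, 2), unit="furlongs")

# Floating-point conversion makes a tolerant comparison the safer choice.
same_length = math.isclose(in_a, in_b)
```

The caller never needs to know that `a` stores metres and `b` stores inches; only the requested unit appears at the call site.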
On Sat, Sep 26, 2020 at 9:35 PM David Mertz <mertz@gnosis.cx> wrote:
On Sat, Sep 26, 2020 at 12:50 AM Stefano Borini <stefano.borini@gmail.com> wrote:
Other use cases are certainly allowed, but to me, something like
a[1, 2, unit="meters"]
a[1, 2, unit="meters"] = 3
makes me feel uncomfortable, although I might learn to accept it.
Then why isn't the unit close to 3, as in
a[1,2] = 3 * meters
I think this is a very natural looking use case, and is not as well addressed by putting the multiplier outside the indexing. Imagine you have two arrays that store lengths, but not necessarily in the same units. Whatever the underlying representation, we would be able to ask, e.g.:
a[1, 2, unit="furlongs"] == b[1, 2, unit="furlongs"]
Sure, `a` might store its data as AUs and `b` might store its data as Planck lengths. But the comparison like that would still work. More importantly, a user wouldn't have to KNOW the underlying representation if she knew there was a "unit" term.
I don't find such examples a conclusive argument in favour of this syntax at all. Anything that stored a length and could do conversions in this way would presumably need to be some kind of length object that was able to handle conversions. In that case, the way that equality was calculated should take care of any conversions needed. I can't help feeling that this is a syntax looking for a set of solutions to apply itself to. It will add enormous complexity (together, no doubt, with speed issues and backwards-compatibility issues) to the language, and I'm not sure that I see the gain, yet. I'm obviously missing something.
On Sat, Sep 26, 2020 at 2:44 PM Nicholas Cole <nicholas.cole@gmail.com> wrote:
I can't help feeling that this is a syntax looking for a set of solutions to apply itself to. It will add enormous complexity (together, no doubt, with speed issues and backwards-compatibility issues) to the language, and I'm not sure that I see the gain, yet. I'm obviously missing something.
Yes, reading through about 500 messages before the PEP was drafted. Nearly every aspect of this has been debated to death.
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?)
On Sat, Sep 26, 2020 at 11:45 AM Nicholas Cole <nicholas.cole@gmail.com> wrote:
I think this is a very natural looking use case, and is not as well addressed by putting the multiplier outside the indexing. Imagine you have two arrays that store lengths, but not necessarily in the same units. Whatever the underlying representation, we would be able to ask, e.g.:
a[1, 2, unit="furlongs"] == b[1, 2, unit="furlongs"]
I don't find such examples a conclusive argument in favour of this syntax at all. Anything that stored a length and could do conversions in this way would presumably need to be some kind of length object that was able to handle conversions.
My comment was not remotely meant to be "the conclusive argument." If you read the PEP, or the 500 prior discussion messages that Guido references, you can see much more. My comment was simply about a corner case, and the phrasing of one paragraph in the PEP. One of the PEP authors said he thought that usage was a misuse, and I claimed that I think it would be a good and appropriate use for the syntax. Not the only use. Not the primary one. And obviously not the only possible way to deal with units (although I think it WOULD be somewhat more natural than the other approaches you mention in your comment). However, right now it's obviously a purely hypothetical library that would do that. Possibly someone would create such an API if the syntax is added, but quite possibly not as well.
On Sat, Sep 26, 2020 at 10:43:01PM +0100, Nicholas Cole wrote:
I don't find such examples a conclusive argument in favour of this syntax at all.
I wouldn't expect you would -- Stefano made it up as an example of something that he felt was a poor use of subscripts! You should not be surprised that an example designed to be a poor fit for keyword subscripts is a poor fit for keyword subscripts.
I can't help feeling that this is a syntax looking for a set of solutions to apply itself to.
Have you read the PEP? This solves existing problems. The PEP gives examples of existing code that uses workarounds for the lack of this.
# Existing
df[df['x'] == 1]
# Proposed
df[x=1]
# Existing
ds["empty"].loc[{'lon': slice(1, 5), 'lat': slice(3, None)}]
# Proposed
ds["empty"][lon=1:5, lat=6:]
Do you wish to say that the existing workarounds are superior to the proposed syntax?
I know that for my own case, this would immediately solve a lot of problems for me. I'll be able to drop something like five proxy classes from a single public class:
Public class: Matrix
Proxies to support:
* 1-based cell indexing (start at 1 rather than 0)
* rows (0-based and 1-based)
* columns (0-based and 1-based)
Of course I could refuse to provide certain APIs, or redesign the Matrix class to use methods for indexing, but I'd much rather use indexing for indexing. This would be a far more natural and clean way of handling the interface, with a simpler implementation (no more proxies!).
It will add enormous complexity (together, no doubt, with speed issues and backwards-compatibility issues) to the language,
That is, I think, pure FUD. We have spent a lot of time working on this to ensure that there are no backwards compatibility issues, or at least none that are serious. By my recollection, the *only* change for existing code is that code like this:
{}[None, kw=1]
will change from a SyntaxError to a TypeError.
As far as speed and complexity goes, I do not understand the C implementation well enough to categorically dismiss your claims, but from everything I have seen, neither is true: this should not have any significant slowdown, and the increase in complexity should be quite small. There have been more complex proposals that would require significantly more complex rules, and costly runtime look-ups. But this PEP rejects those. As far as I can tell, you are unfairly accusing the PEP of being everything that the PEP authors, and the Python-Ideas mailing list, have spent a *lot* of effort designing the PEP to *not* be.
-- Steve
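The "before" half of the backwards-compatibility claim above can be checked on any interpreter that does not implement the proposal. The sketch below only compiles the expressions without running them; the helper name `subscript_status` is made up for the illustration.

```python
def subscript_status(source):
    """Return 'ok' if the expression compiles, else the exception class name."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return type(exc).__name__
    return "ok"


# An ordinary positional subscript compiles today.
plain = subscript_status("{}[None]")

# A keyword inside the subscript is rejected by the parser on interpreters
# without the proposed feature, so no existing program can contain it.
keyword = subscript_status("{}[None, kw=1]")
```

Since the keyword form cannot appear in code that currently runs, changing its outcome from a compile-time error to a run-time one does not alter the behaviour of working programs.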
On Sun, 27 Sep 2020 at 06:27, Steven D'Aprano <steve@pearwood.info> wrote:
As far as speed and complexity goes, I do not understand the C implementation well enough to categorically dismiss your claims, but from everything I have seen, neither is true: this should not have any significant slowdown, and the increase in complexity should be quite small.
If my understanding of the C part is correct, this will have practically zero impact on speed. The compiler will (should) be able to create BINARY_SUBSCR opcodes when it sees an invocation without keyword arguments. It will use a new opcode BINARY_SUBSCR_KW when there's a keyword argument. The only loss in performance is that the C handler for the old variant will likely end up calling the new handler with a kwarg set to NULL, so the total cost is one routine call. But I am not an expert in the C interpreter internals, so I might be wrong. -- Kind regards, Stefano Borini
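The call structure described above can be modelled in plain Python, purely as an illustration. This is not the CPython implementation; the two function names are hypothetical stand-ins for the existing handler and the keyword-aware one, and the `Scaled` class exists only to give the keyword path something to do.

```python
def getitem_with_keywords(obj, index, kwargs):
    """Stand-in for the new handler; kwargs is None when no keywords were given."""
    if kwargs is None:
        return type(obj).__getitem__(obj, index)
    return type(obj).__getitem__(obj, index, **kwargs)


def getitem(obj, index):
    """Stand-in for the existing handler: it only forwards, with kwargs unset."""
    return getitem_with_keywords(obj, index, None)


class Scaled(list):
    """Toy list whose lookup accepts an optional keyword (hypothetical)."""

    def __getitem__(self, index, *, factor=1):
        return super().__getitem__(index) * factor


values = Scaled([10, 20, 30])
plain = getitem(values, 1)                                # models values[1]
scaled = getitem_with_keywords(values, 1, {"factor": 3})  # models values[1, factor=3]
```

In this model the keyword-free path pays for exactly one extra function call, which is the cost estimated in the message above.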
Do you have an example from boost-histogram..
One of the devs here, and actually, I think "Hist" would use it, rather than boost-histogram directly. In boost-histogram, axes are only represented by numbers, so `h[{0:slice(2,5)}]` probably could not be written `h[0=2:5]` even after this PEP; but we could have ax0, etc, for example. See the SciPy 2020 or PyHEP 2020 talks for more examples. Hist extends boost-histogram and adds named axes, where this would be _very_ useful. Now you could make a histogram:
```python
h = Hist(axis.Regular(10,-1,1, name="x"), axis.Boolean(name="signal"))
h_signal= h[x=2:8, signal=True]
```
(of course, you could actually have a lot of axes, and you don't need to slice all of them every time)
I realize now on further reading the PEP does discuss the idea, though I only agree with one of the arguments; the slow part might be solvable, the complexity issue is reversed (adding a new one-off rule on top of the _already_ one-off rule for tuplizing arguments is more complex, IMO, than reusing function syntax), the transition/mix could be done, the one where "*args" would be needed to represent a tuple of arguments makes no sense to me (of course that's how you should write a Python signature for a variable number of args, that's not a downside). It would be reasonably easy to write a conversion to/from `__getitem_func__`, and the end result would be a cleaner, nicer language (in 10ish years?). There have been similar transitions in the past.
Furthermore, you currently can't tell the difference between `x[(a, b)]` and `x[a, b]`; with the new function, libraries could differentiate, and maybe eventually make them behave reasonably (you can always use x[*c] if you already have a tuple, just like for a function, and it's one of the rare / only places where list vs. tuple matters in Python).
Just some thoughts, still excited to see this become available in some form. :)
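The point about `x[(a, b)]` and `x[a, b]` can be seen with current Python. The sketch below uses a throwaway class `Probe` and function `as_function` (both invented for the illustration) to compare what a subscript receives with what a function call receives.

```python
class Probe:
    """Returns whatever the subscript machinery hands to __getitem__."""

    def __getitem__(self, item):
        return item


def as_function(*args):
    """Returns the positional arguments exactly as a function sees them."""
    return args


p = Probe()

# Both spellings deliver the same tuple (1, 2) to __getitem__.
same_in_subscript = p[(1, 2)] == p[1, 2]

# A function call keeps them apart: ((1, 2),) versus (1, 2).
same_in_call = as_function((1, 2)) == as_function(1, 2)
```

This is the distinction a function-style dunder would preserve and that the existing subscript protocol erases.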
At this point I think we're all set for use cases, both for keyword-only and for mixed use. Clearly a lot of libraries are going to be able to provide better APIs using this PEP, and mixed use of positionals and keywords will be quite useful to some of these.
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?)
I still think it would improve the PEP significantly if it added one case of mixed positional/keyword indexing. Lots of suggestions have floated by, and I don't care which one is used, but something to demonstrate that within the PEP. On Tue, Sep 29, 2020 at 11:43 AM Guido van Rossum <guido@python.org> wrote:
At this point I think we're all set for use cases, both for keyword-only and for mixed use. Clearly a lot of libraries are going to be able to provide better APIs using this PEP, and mixed use of positionals and keywords will be quite useful to some of these.
On Tue, Sep 29, 2020 at 6:56 PM David Mertz <mertz@gnosis.cx> wrote:
I still think it would improve the PEP significantly if it added one case of mixed positional/keyword indexing. Lots of suggestions have floated by, and I don't care which one is used, but something to demonstrate that within the PEP.
I agree; that use case should ideally be one that could have get, set and delete semantics, as most of the examples provided seem to be meant only for accessing the data.
On Tue, Sep 29, 2020 at 11:43 AM Guido van Rossum <guido@python.org> wrote:
At this point I think we're all set for use cases, both for keyword-only and for mixed use. Clearly a lot of libraries are going to be able to provide better APIs using this PEP, and mixed use of positionals and keywords will be quite useful to some of these.
-- Sebastian Kreft
On Tue, Sep 29, 2020 at 12:09 PM Sebastian Kreft <skreft@gmail.com> wrote:
On Tue, Sep 29, 2020 at 6:56 PM David Mertz <mertz@gnosis.cx> wrote:
I still think it would improve the PEP significantly if it added one case of mixed positional/keyword indexing. Lots of suggestions have floated by, and I don't care which one is used, but something to demonstrate that within the PEP.
I agree; that use case should ideally be one that could have get, set and delete semantics, as most of the examples provided seem to be meant only for accessing the data.
Both the units and source examples I gave would be very natural with set and del semantics; enough that I just assumed those were understood. Probably units moreso.
internally_inches[4:6, 8:10, unit="meters"] = [[4, 2], [1, 3]] # Stores [[157.48, ...], [..., ...]] in block
internally_cm[4:6, 8:10, unit="metres"] = [[4, 2], [1, 3]] # apparently allow two spellings
del internally_inches[1, unit="furlong"] # Delete one row, unit ignored
On Tue, Sep 29, 2020 at 7:27 PM David Mertz <mertz@gnosis.cx> wrote:
On Tue, Sep 29, 2020 at 12:09 PM Sebastian Kreft <skreft@gmail.com> wrote:
On Tue, Sep 29, 2020 at 6:56 PM David Mertz <mertz@gnosis.cx> wrote:
I still think it would improve the PEP significantly if it added one case of mixed positional/keyword indexing. Lots of suggestions have floated by, and I don't care which one is used, but something to demonstrate that within the PEP.
I agree; that use case should ideally be one that could have get, set and delete semantics, as most of the examples provided seem to be meant only for accessing the data.
Both the units and source examples I gave would be very natural with set and del semantics; enough that I just assumed those were understood. Probably units moreso.
internally_inches[4:6, 8:10, unit="meters"] = [[4, 2], [1, 3]] # Stores [[157.48, ...], [..., ...]] in block
You mean that internally_inches means the stored values are in inches, and that by specifying a unit, you will scale up all the passed values? So that,
internally_inches[4:6, 8:10, unit="meters"] = [[4, 2], [1, 3]]
would save in the (4:6, 8:10) block the value [[157.48, 78.7402], [39.3701, 118.11]]? Is that right?
When I originally read these examples, I thought the unit modifier would change the block location. It feels weird to me to have the unit disassociated from the actual value being stored.
If I understood correctly what's going on, what would the difference be between
del internally_inches[4:6, 8:10, unit="meters"]
and
del internally_inches[4:6, 8:10, unit="inches"]
internally_cm[4:6, 8:10, unit="metres"] = [[4, 2], [1, 3]] # apparently allow two spellings
del internally_inches[1, unit="furlong"] # Delete one row, unit ignored
-- Sebastian Kreft
On Wed, Sep 30, 2020 at 10:41 AM Sebastian Kreft <skreft@gmail.com> wrote:
If I understood correctly what's going on, what would the difference be between
del internally_inches[4:6, 8:10, unit="meters"] and del internally_inches[4:6, 8:10, unit="inches"]
Probably nothing, which shows that the ability to have keyword args doesn't mean you HAVE to have keyword args. It would make good sense to have an API that involves units for set and get, but ignores them for delete. ChrisA
On Tue, Sep 29, 2020, 2:41 PM Sebastian Kreft
You mean that internally_inches means the stored values are in inches, and that by specifying a unit, you will scale up all the passed values?
So that, internally_inches[4:6, 8:10, unit="meters"] = [[4, 2], [1, 3]] would save in the (4:6, 8:10) block the value [[157.48, 78.7402], [39.3701, 118.11]]? Is that right?
Yes, that's exactly the hypothetical library/class I imagined. A keyword changing an "aspect" or "interpretation" of the data rather than the memory location where bits are stored. The reason I'd want this is because, e.g. several HDF5 files I work with together might have different native units. Rather than globally convert all 10 billion numbers on first read, and convert them back before writing, only those values specifically utilized/modified would need conversion.
del internally_inches[4:6, 8:10, unit="meters"] and
del internally_inches[4:6, 8:10, unit="inches"]
Again, this is hypothetical. But my way of thinking is that for this code, unit is ignored. But it shows consistency with the same unit being used everywhere else in the same program.
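A rough sketch of the set and delete behaviour discussed above, under stated assumptions: `InchStore` and `INCHES_PER` are hypothetical names, scalar cell indices are used instead of slices to keep the example short, and the dunders are called explicitly because the keyword subscript syntax is not available in current Python.

```python
# Hypothetical conversion table: inches per unit, with both spellings of metre.
INCHES_PER = {"inches": 1.0, "meters": 1 / 0.0254, "metres": 1 / 0.0254}


class InchStore:
    """Toy container that always stores values in inches (hypothetical)."""

    def __init__(self):
        self.data = {}

    def __setitem__(self, index, value, *, unit="inches"):
        # Convert the incoming value to the native unit before storing it.
        self.data[index] = value * INCHES_PER[unit]

    def __getitem__(self, index, *, unit="inches"):
        return self.data[index] / INCHES_PER[unit]

    def __delitem__(self, index, *, unit=None):
        # The unit is accepted for symmetry with get/set and deliberately ignored.
        del self.data[index]


store = InchStore()

# Stand-in for: store[4, 8, unit="meters"] = 4
store.__setitem__((4, 8), 4, unit="meters")
stored_inches = round(store[4, 8], 2)

# Stand-in for: del store[4, 8, unit="furlong"]
store.__delitem__((4, 8), unit="furlong")
remaining = len(store.data)
```

Here deletion gives the same result whatever unit is named, which matches the reading that the keyword changes the interpretation of a value rather than the location it lives in.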
On Tue, 29 Sep 2020 at 22:31, <henryfs@princeton.edu> wrote:
Furthermore, you currently can't tell the difference between `x[(a, b)]` and `x[a, b]`; with the new function, libraries could differentiate, and maybe eventually make them behave reasonably (you can always use x[*c] if you already have a tuple, just like for a function, and it's one of the rare / only places where list vs. tuple matters in Python).
You still would not be able to. The semantics of that operation are unchanged (for backward compatibility) -- Kind regards, Stefano Borini
I know, I was just referring to making this a standard python function with *args, **kwargs. This PEP only solves one specific problem (keyword arguments), while it seems like it would be worthwhile solving all of them (no arguments, tuple vs. list, and having to learn a special one-off syntax for just this special method). That’s what I meant. Being able to select one axis out of many to perform an operation (possibly using h[_3=2:5] ) would already be a big win for boost-histogram, for example.
You still would not be able to. The semantics of that operation are unchanged (for backward compatibility)
By the way, a second problem with __getitem__ and __setitem__ is they are really messy to type (and this likely would make it worse). If, however, there was a new __getindex__ and __setindex__ with normal Python *args and **kwargs semantics, that would be really easy to type. For example, when I recently added typing to getitem in boost-histogram/hist:
InnerIndexing = Union[SupportsIndex, str, Callable[[bh.axis.Axis], int], slice]
IndexingWithMapping = Union[InnerIndexing, Mapping[Union[int, str], InnerIndexing]]
IndexingExpr = Union[IndexingWithMapping, Tuple[IndexingWithMapping, ...]]
def __getitem__(self, item: IndexingExpr)
(And pretty much the first thing I tried it on was h[...] = , which of course fails a type check, because I forgot "ellipsis" (discussion in typing-sig).) It's also really hard to use types internally, due to the Tuple and non-Tuple versions. If this was a normal expression, then something like:
InnerIndexing = Union[SupportsIndex, str, Callable[[bh.axis.Axis], int], slice]
def __getindex__(self, *args: IndexingExpr | Mapping[int | str, IndexingExpr], **kwargs: IndexingExpr)
would work and be significantly simpler, both inside and out.
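The "Tuple and non-Tuple versions" problem can be sketched with a much smaller type than the boost-histogram one. Everything here is illustrative: `Grid`, `Index` and `IndexExpr` are made-up names, and `getindex` is an ordinary method standing in for the hypothetical function-style dunder.

```python
from typing import Dict, Tuple, Union

Index = Union[int, slice]
IndexExpr = Union[Index, Tuple[Index, ...]]


class Grid:
    def __getitem__(self, item: IndexExpr) -> Tuple[Index, ...]:
        # g[a] delivers a bare index and g[a, b] delivers a tuple, so the
        # annotation needs both shapes and the body must normalise them.
        return item if isinstance(item, tuple) else (item,)

    def getindex(self, *args: Index, **kwargs: Index) -> Tuple[Tuple[Index, ...], Dict[str, Index]]:
        # Function-style stand-in: args is always a tuple, so each element
        # can be annotated directly and no normalisation step is needed.
        return args, kwargs


g = Grid()
one = g[1]                      # normalised to a 1-tuple
two = g[1, 2]                   # already a tuple
call = g.getindex(1, axis=2)    # positional and keyword parts kept apart
```

The subscript version needs a union of a scalar and a tuple of scalars, while the function-style version annotates a single element type.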
https://www.python.org/dev/peps/pep-0637/
Thank you Stefano and Jonathan for a very carefully written and thought-out PEP. I trust that the background etc. are representing past discussion, so I am going to focus on the spec itself. Fortunately I only have a few nits, really. (If you submit new PRs to the peps repo, one of the PEP editors will merge it quickly without questions, as long as the markup passes the tests.)
- I recommend that you carefully look over the PEP as rendered on python.org (link above) and try to fix any markup oddities. E.g. some comments are line-wrapped, which looks ugly, and some bulleted lists have an extra blank line above.
- Looking at the generated ToC, you have two top-level sections labeled "Syntax and Semantics". That seems a bit odd. I'm not sure what you meant to happen here, but I recommend renaming at least one of these. (Another recommendation: don't mix description of the status quo with the specification.)
- While I can understand the desire to keep C function names short, I don't see we should continue the tradition of using the meaningless 'Ex' suffix for API variations that take an additional dict of keywords. Looking through the existing APIs, I recommend PyObject_{Get,Set,Del}ItemWithKeywords instead. (Note you have a typo here, "DetItem".) Also I recommend adding markup (e.g. bullets) so each signature is on a line by itself.
That's it from me!
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?)
On Thu, 24 Sep 2020 at 22:22, Guido van Rossum <guido@python.org> wrote:
- I recommend that you carefully look over the PEP as rendered on python.org (link above) and try to fix any markup oddities. E.g. some comments are line-wrapped, which looks ugly, and some bulleted lists have an extra blank line above.
- Looking at the generated ToC, you have two top-level sections labeled "Syntax and Semantics". That seems a bit odd. I'm not sure what you meant to happen here, but I recommend renaming at least one of these. (Another recommendation: don't mix description of the status quo with the specification.)
No matter how many times one re-reads something, there's always going to be a typo :) Just as a question: is the HTML generated using make pep-0637.html not supposed to get the CSS style? I get no stylisation, but I do see a .css file. The code seems to suggest that the CSS is only there to be pushed to the remote deployment, but maybe there's a trick I haven't spotted.
- While I can understand the desire to keep C function names short, I don't see we should continue the tradition of using the meaningless 'Ex' suffix for API variations that take an additional dict of keywords. Looking through the existing APIs, I recommend PyObject_{Get,Set,Del}ItemWithKeywords instead. (Note you have a typo here, "DetItem".) Also I recommend adding markup (e.g. bullets) so each signature is on a line by itself.
Will do. Thanks. In fact I'll focus tonight on fixing the visual aspect and typos. I also have feedback from the numpy mailing list (about the argument to pass in case no positional index is specified), but I'll add that later, as well as the rest of this ml feedback. I also tried to start implementing the feature as a PoC... but the parser grammar is beyond my skills I fear. I understand it, but I am unsure (at the moment) how to modify it. Nevertheless I am willing to spend some days and do some experiments. I'll keep you all posted if I can achieve something that seems to work.
-- Kind regards, Stefano Borini
Change of subject line as I wish to focus on a single critical point of the PEP: keyword-only subscripts.
TL;DR:
1. We have to pass a sentinel to the setitem dunder if there is no positional index passed. What should that sentinel be?
* None
* the empty tuple ()
* NotImplemented
* something else
2. Even though we don't have to pass the same sentinel to getitem and delitem dunders, we could. Should we?
* No, getitem and delitem should use no sentinel.
* Yes, all three dunders should use the same rules.
* Just prohibit keyword-only subscripts.
(Voting is non-binding. It's feedback, not a democracy :-)
Please read the details below before voting. Comments welcome.
----------------------------------------------------------------------
For all three dunders, there is no difficulty in retrofitting keyword subscripts to the dunder signature if there is a positional index:
obj[index, spam=1, eggs=2]
# => calls type(obj).__getitem__(index, spam=1, eggs=2)
del obj[index, spam=1, eggs=2]
# => calls type(obj).__delitem__(index, spam=1, eggs=2)
obj[index, spam=1, eggs=2] = value
# => calls type(obj).__setitem__(index, value, spam=1, eggs=2)
If there is no positional index, the getitem and delitem calls are easy:
obj[spam=1, eggs=2]
# => calls type(obj).__getitem__(spam=1, eggs=2)
del obj[spam=1, eggs=2]
# => calls type(obj).__delitem__(spam=1, eggs=2)
If the dunders are defined with a default value for the index, the call will succeed; if there is no default, you will get a TypeError. This is what we expect to happen.
But setitem is hard:
obj[spam=1, eggs=2] = value
# => calls type(obj).__setitem__(???, value, spam=1, eggs=2)
Python doesn't easily give us a way to call a method and skip over positional arguments. So it seems that setitem needs to fill in a fake placeholder. Three obvious choices are None, the empty tuple () or NotImplemented.
All three are hashable, so they could be used as legitimate keys in a mapping; but in practice, I expect that only None and () would be. I can't see very many objects actually using NotImplemented as a key. numpy also uses None to force creation of a new axis. I don't think that *quite* rules out None: numpy could distinguish the meaning of None as a subscript depending on whether or not there are keyword args.
But NotImplemented is special:
- I don't expect anyone is using NotImplemented as a key or index.
- NotImplemented is already used as a sentinel for operators to say "I don't know how to handle this"; it's not far from that to interpret it as saying "I don't know what value to put in this positional argument".
- Starting in Python 3.9 or 3.10, NotImplemented is even more special: it no longer ducktypes as truthy or falsey. This will encourage people to explicitly check for it:
if index is NotImplemented: ...
rather than `if index: ...`.
So I think that NotImplemented is a better choice than None or an empty tuple.
Whatever sentinel we use, that implies that setitem cannot distinguish these two cases:
obj[SENTINEL, spam=1, eggs=2] = value
obj[spam=1, eggs=2] = value
Since both None and () are likely to be legitimate indexes, and NotImplemented is less likely to be such, I think this supports using NotImplemented.
But whichever sentinel we choose, that brings us to the second part of the problem. What should getitem and delitem do? setitem must provide a sentinel for the first positional argument, but getitem and delitem don't have to. So we could have this:
# Option 1: only setitem is passed a sentinel
obj[spam=1, eggs=2]
# => calls type(obj).__getitem__(spam=1, eggs=2)
del obj[spam=1, eggs=2]
# => calls type(obj).__delitem__(spam=1, eggs=2)
obj[spam=1, eggs=2] = value
# => calls type(obj).__setitem__(SENTINEL, value, spam=1, eggs=2)
Advantages:
- The simple getitem and delitem cases stay simple; it is only the complicated setitem case that is complicated.
- getitem and delitem can distinguish the "no positional index at all" case from the case where the caller explicitly passes the sentinel as a positional index; only setitem cannot distinguish them. If your class doesn't support setitem, this might be useful to you.
Disadvantages:
- Inconsistency: the rules for one dunder are different from the other two dunders.
- If your class does distinguish between no positional index, and the sentinel, that means that there is a case that getitem and delitem can handle but setitem cannot.
Or we could go with an alternative:
# Option 2: all three dunders are passed a sentinel
obj[spam=1, eggs=2]
# => calls type(obj).__getitem__(SENTINEL, spam=1, eggs=2)
del obj[spam=1, eggs=2]
# => calls type(obj).__delitem__(SENTINEL, spam=1, eggs=2)
obj[spam=1, eggs=2] = value
# => calls type(obj).__setitem__(SENTINEL, value, spam=1, eggs=2)
Even though the getitem and delitem cases don't need the sentinel, they get them anyway. This has the advantage that all three dunders are treated the same, and that there is no case that two of the dunders will handle but the third does not. But it also means that subscript dunders cannot meaningfully provide their own default for the index in the function signature:
def __getitem__(self, index=0, *, spam, eggs)
will always receive a value for index, not the default. So we need to check that inside the body:
if index is SENTINEL: index = 0
There's a third option: just prohibit keyword-only subscripts. I think that's harsh, throwing the baby out with the bathwater. I personally have use-cases where I would use keyword-only subscripts so I would prefer options 1 or 2.
-- Steve
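The two options above can be emulated in current Python by writing out the translated calls by hand. This is only a sketch: `Table` is a hypothetical class, `SENTINEL` is bound to NotImplemented as one of the candidates under discussion, and the explicit dunder calls stand in for the keyword subscripts, which are not valid syntax today.

```python
SENTINEL = NotImplemented  # one candidate from the discussion; an assumption here


class Table:
    """Toy class that records how its subscript dunders were called."""

    def __init__(self):
        self.calls = []

    def __getitem__(self, index=0, *, spam=None, eggs=None):
        # Under Option 2 the signature default is never used, so the
        # sentinel has to be checked for explicitly in the body.
        if index is SENTINEL:
            index = 0
        self.calls.append(("get", index, spam, eggs))
        return index

    def __setitem__(self, index, value, *, spam=None, eggs=None):
        if index is SENTINEL:
            index = 0
        self.calls.append(("set", index, value, spam, eggs))


t = Table()

# Option 1 translation of t[spam=1, eggs=2]: no index, signature default applies.
opt1 = t.__getitem__(spam=1, eggs=2)

# Option 2 translation of t[spam=1, eggs=2]: the sentinel fills the index slot.
opt2 = t.__getitem__(SENTINEL, spam=1, eggs=2)

# setitem needs the placeholder under either option, because value is
# positional and comes after index.
t.__setitem__(SENTINEL, "v", spam=1, eggs=2)
```

With the in-body check both options resolve to the same index, which is the extra work Option 2 asks of every class that wants its own default.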
On Fri, Sep 25, 2020 at 11:50 PM Steven D'Aprano <steve@pearwood.info> wrote:
TL;DR:
1. We have to pass a sentinel to the setitem dunder if there is no positional index passed. What should that sentinel be?
Isn't there a problem with this starting assumption? Say that I have item dunder code of the signature that is common today, and I have no interest in providing direct support for kwd args in my item dunders. Say also, kwd unpacking is supported. If a person unpacks an empty dictionary (something that surely will happen occasionally), and a SENTINEL is provided, this poses a pretty good possibility to lead to unintended behavior in many cases:
class C:
    def __getitem__(self, index): ...
d = {}
c = C()
c[**d]
The above will call:
c.__getitem__(SENTINEL)
Who knows what the effect of c[SENTINEL] will be for so much existing code out there? Bugs are sure to be created, aren't they? So a lot of people will have to make changes to existing code to handle this with a sentinel. This seems like a big ugly problem to me that needs to be avoided.
There's a third option: just prohibit keyword-only subscripts. I think
that's harsh, throwing the baby out with the bathwater. I personally have use-cases where I would use keyword-only subscripts so I would prefer options 1 or 2.
Maybe there's a middle way option that could avoid the particular problem outlined above:

    # Option 4: don't use a sentinel at all, user provides own (if desired)
    obj[spam=1, eggs=2]
    # => calls type(obj).__getitem__(USER_PROVIDED_SENTINEL, spam=1, eggs=2)
    del obj[spam=1, eggs=2]
    # => calls type(obj).__delitem__(USER_PROVIDED_SENTINEL, spam=1, eggs=2)
    obj[spam=1, eggs=2] = value
    # => calls type(obj).__setitem__(USER_PROVIDED_SENTINEL, value, spam=1, eggs=2)

How do we write the dunder item to allow for this? Simple: just require that if it is desired to support kwd-only item setting, the writer of the code has to provide a default-- ANY default-- to the value argument in setitem (and this default will never be used). I suggest the Ellipsis object as the standard convention:

    MyPreferredSentinel = object()

    class C:
        def __setitem__(self, index=MyPreferredSentinel, value=..., **kwargs): ...

    c = C()
    c[x=1] = "foo"  # calls => c.__setitem__(MyPreferredSentinel, "foo", x=1)

This option provides an additional benefit: if I don't want to provide support for the kwd-only case, I don't have to handle it explicitly-- errors occur without any effort on my part, just as they do today (although the error itself is different):

    class C:
        def __setitem__(self, index, value, **kwargs): ...

    c = C()
    c[x=1] = "foo"  # yay, this produces an error with no effort

Providing a default sentinel nullifies this possibility. In that case, I have to handle kwd-only indexing myself:

    class C:
        def __setitem__(self, index, value, **kwargs):
            if index is SENTINEL:
                handle_kwd_args_only()

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
But this requires introspection. On Fri, Sep 25, 2020 at 22:46 Ricky Teachey <ricky@teachey.org> wrote:
-- --Guido (mobile)
On Sat, Sep 26, 2020 at 1:51 AM Guido van Rossum <guido@python.org> wrote:
But this requires introspection.
I'm sorry Guido, I've stared at your message and reread mine several times, and I don't understand where introspection is required. It seems like all the parser has to do is what it has always done-- pass the arguments to the item dunders and let the chips fall where they may (other than perhaps adjusting the TypeError message that could result). But you don't have to explain to me, I am certain you know what you're talking about.

Anyway: I don't have an opinion on which of the offered possibilities is the best SENTINEL. But I would also suggest considering adding the Ellipsis object, ..., to the list of possibilities. It's a unique value that sort of stands out (though it is truthy rather than falsy like None and the empty tuple).

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Fri, Sep 25, 2020 at 11:26 PM Ricky Teachey <ricky@teachey.org> wrote:
On Sat, Sep 26, 2020 at 1:51 AM Guido van Rossum <guido@python.org> wrote:
But this requires introspection.
I'm sorry Guido I'm staring at your message and reread mine several times and I don't understand where introspection is required. It seems like all the parser has to do is what it has always done-- pass the arguments to the item dunders and let the chips fall where they may (other than perhaps adjusting the TypeError message that could result). But you don't have to explain to me, I am certain you know what you're talking about.
Anyway: I don't have an opinion on which of the offered possibilities is the best SENTINEL.
But I would also suggest considering adding the Ellipsis object, ..., to the list of possibilities. It's a unique value that sort of stands out (though it is truthy rather than falsy like None and the empty tuple).
I meant that introspection would be needed to determine whether a default index was given or not. On second thought, I made a mistake, and it's worse -- it's just impossible to do it right from every POV.

IIUC you're proposing that the user add a default to the `index` value, like this:
```
def __setitem__(self, index=42, value=None, **kwargs): ...
```
But in order to be correct we'd have to make `index` and `value` (and `self`) positional-only arguments (PEP 570), otherwise you'd get problems with `a[43, value=1] = 2`, so the def should really be
```
def __setitem__(self, index=42, value=None, /, **kwargs): ...
```
But then the interpreter must pass `index` and `value` by position. How would it pass `value` without passing `index`? By keyword -- but we've just determined it must be a positional-only arg. Ergo, we've reached a contradiction, and there's no solution.

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
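The two signatures can be compared in current Python with direct method calls standing in for the proposed subscript syntax. This is a sketch, not part of the proposal: the class names `WithoutSlash` and `WithSlash` are invented, and treating `a[43, value=1] = 2` as "index 43, assigned value 2, keyword `value=1`" is an assumption about how the syntax would be translated.

```python
class WithoutSlash:
    # index and value can also be passed by keyword.
    def __setitem__(self, index=42, value=None, **kwargs):
        return index, value, kwargs


class WithSlash:
    # index and value are positional-only (PEP 570).
    def __setitem__(self, index=42, value=None, /, **kwargs):
        return index, value, kwargs


# Stand-in for `a[43, value=1] = 2`.
try:
    WithoutSlash().__setitem__(43, 2, value=1)
    clash = None
except TypeError as exc:
    clash = exc  # `value` was supplied both by position and by keyword

# With positional-only parameters the keyword lands in **kwargs instead.
ok = WithSlash().__setitem__(43, 2, value=1)

# Stand-in for `a[x=1] = "foo"`: the only positional argument is the assigned
# value, and left-to-right binding puts it in `index`, not in `value`.
shifted = WithSlash().__setitem__("foo", x=1)
```

The last call shows the dead end: with positional-only parameters there is no way to reach `value` while leaving `index` at its default.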
On Sat, Sep 26, 2020 at 01:43:18AM -0400, Ricky Teachey wrote:
On Fri, Sep 25, 2020 at 11:50 PM Steven D'Aprano <steve@pearwood.info> wrote:
TL;DR:
1. We have to pass a sentinel to the setitem dunder if there is no positional index passed. What should that sentinel be?
Isn't there a problem with this starting assumption?
It's not an assumption, it's a conclusion. I can't see any other clean, and *easy*, way for this to work with setitem. If you can, I'm listening. The problem is that there's no simple way to get an optional first positional argument and mandatory second positional argument: def __setitem__(self, [index,] value, [*, spam]): but adding this to the language is out of scope for the PEP. But see below.
Say that I have item dunder code of the signature that is common today, and I have no interest in providing direct support for kwd args in my item dunders. Say also, kwd unpacking is supported.
Okay. Existing code doesn't change.
If a person unpacks an empty dictionary (something that surely will happen occasionally), and a SENTINEL is provided, this poses a pretty good possibility to lead to unintended behavior in many cases:
Correct, and well-spotted. The solution is: *don't do that*. If you, the caller, do so, that's on your own head. We cannot expect to protect users from their own errors. Don't insert arbitrary unpacking into any function or method call, and that includes subscripts, if you don't know what it will do.

Consider:

    obj = {}
    kwargs = {'spam': 1}
    obj[**kwargs] = value

Right now this is a syntax error, so nobody is doing it. If the PEP is accepted, it will become a TypeError since dict won't support keywords. Either way, nobody is going to get into the habit of doing this in dicts, or lists, or any other class that doesn't support keyword subscripts.

The only way to get something surprising is:

    obj = {}
    kwargs = {}
    obj[**kwargs] = value

which is equivalent to:

    obj[SENTINEL] = value

That's another argument for choosing NotImplemented over None or the empty tuple. We can't expect subscriptable classes to reject a subscript of () or None, since these are legitimate keys and have existing uses, but we can say to class creators "if you care about this, just reject NotImplemented if it is the subscript". I expect most classes won't bother. I wouldn't expect `dict` to bother either.
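A class that does care could reject the sentinel with a single check. The following is a minimal sketch, assuming NotImplemented were chosen as the sentinel; `StrictStore` is a hypothetical name, and the example uses only ordinary subscripts, so it runs in current Python.

```python
class StrictStore:
    """Hypothetical mapping-like class that refuses NotImplemented as an index."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, index, value):
        if index is NotImplemented:
            raise TypeError("StrictStore requires a positional index")
        self._data[index] = value

    def __getitem__(self, index):
        if index is NotImplemented:
            raise TypeError("StrictStore requires a positional index")
        return self._data[index]


store = StrictStore()
store[()] = "the empty tuple is still a legitimate key"
store[None] = "and so is None"

try:
    store[NotImplemented] = "rejected"
    rejected = False
except TypeError:
    rejected = True
```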
Who knows what the effect of c[SENTINEL] will be for so much existing code out there? Bugs are surely to be created, aren't they? So a lot of people will have to make changes to existing code to handle this with a sentinel. This seems like a big ugly problem to me that needs to be avoided.
If this PEP is accepted, it won't magically insert keyword unpacking into existing code. So it's not going to break existing code. The only code that might be broken is future code, and even then, we can just declare that it's not broken, it's working as designed:

    obj[**{}]  # equivalent to obj[NotImplemented] by definition

If you don't want that, don't do it. I don't think that's the perfect solution, but I think it is good enough.
Maybe there's a middle way option that could avoid the particular problem outlined above:
# Option 4: don't use a sentinel at all, user provides own (if desired) [...]
How do we write the dunder item to allow for this? Simple: just require that if it is desired to support kwd-only item setting, the writer of the code has to provide a default-- ANY default-- to the value argument in setitem (and this default will never be used).
Simple to say, but I don't think it will be simple to implement. Currently, the interpreter treats all methods alike. Given some arguments, it binds them to the parameters from left to right, then handles keyword arguments. That's why it can't just skip the first positional argument (the index) and bind to the second (the value from the right hand side).

Currently the interpreter doesn't know or care what method is being called when it packs and unpacks arguments, all method and function calls get unpacked more or less in the same way: positional arguments get bound from the left to the right. If we were willing to add a second set of rules for parameter binding, we could avoid this problem. Something roughly like this.

For every other function and method:

* bind positional arguments from left to right;
* if too many positional arguments, raise
* handle keyword arguments;
* if too many keyword arguments, or duplicate, raise
* any parameter that doesn't have a value, fetch its default
* if any parameter still doesn't have a value, raise
* call the function

For `__setitem__` only:

* if there is a non-keyword subscript, use the procedure above
* otherwise, bind the RHS value to the second positional parameter
* handle keywords as above
* fetch the default for the first positional argument
* if there is no default, raise
* call the function

I've probably skipped a few steps, but you get the drift. We would need two distinct ways of packing arguments into parameters, one of which is specialised for setitem alone.

Is it worth it? In my opinion, I don't think so, but I'd like to hear from somebody who knows more about the parameter binding code and can comment on how hard this would be. I assume it would be tedious but not impossible. To me it seems like deploying a giant hammer to squash a tiny insect, but maybe my instinct for the difficulty of this change is way off. Maybe it's trivially easy and we should just do it :-)
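The `__setitem__`-only procedure can be emulated from the outside to see what it would produce. This is a rough sketch and not how the interpreter would do it: a real implementation would live in the parameter binding code, whereas this helper uses `inspect.signature` purely for illustration. The names `call_setitem`, `_NO_INDEX`, `WithDefault` and `WithoutDefault` are all hypothetical.

```python
import inspect

_NO_INDEX = object()  # hypothetical marker: the subscript had no positional part


def call_setitem(obj, value, index=_NO_INDEX, /, **keywords):
    """Emulate the specialised __setitem__ binding described above."""
    method = type(obj).__setitem__
    if index is not _NO_INDEX:
        # A positional subscript was given: ordinary left-to-right binding.
        return method(obj, index, value, **keywords)

    # No positional subscript: fetch the default of the first parameter after
    # `self` and pass it explicitly, so that `value` lands in second place.
    params = list(inspect.signature(method).parameters.values())
    index_param = params[1]
    if index_param.default is inspect.Parameter.empty:
        raise TypeError("__setitem__ has no default for its index parameter")
    return method(obj, index_param.default, value, **keywords)


class WithDefault:
    def __setitem__(self, index=0, value=None, /, **kwargs):
        return index, value, kwargs


class WithoutDefault:
    def __setitem__(self, index, value, /, **kwargs):
        return index, value, kwargs


# Stand-in for `obj[x=1] = "foo"` and `obj[5, x=1] = "foo"`.
keyword_only = call_setitem(WithDefault(), "foo", x=1)
indexed = call_setitem(WithDefault(), "foo", 5, x=1)
```

Calling `call_setitem(WithoutDefault(), "foo", x=1)` raises TypeError, matching the "if there is no default, raise" step.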
I suggest the Ellipsis object as the standard convention:
Ellipsis doesn't make a good choice, because it is heavily used by numpy. In fact it was specifically added to Python for numpy. -- Steve
On Fri, Sep 25, 2020 at 11:50 PM Steven D'Aprano <steve@pearwood.info> wrote:
1. We have to pass a sentinel to the setitem dunder if there is no positional index passed.
I still don't follow this logic -- why can't nothing be passed? The dunders either require an index or they don't; wouldn't that be just like function calling? So (adapting the example in the PEP):

    obj[spam=1, eggs=2]
    # calls type(obj).__getitem__(obj, spam=1, eggs=2)

This sure seems like the obvious way to handle it. If the class requires a positional argument then it will fail with a TypeError. This sure seems like the most straightforward way to handle it.

I'm sure I'm missing something, but reading the PEP and my own experiments haven't clarified it for me.

NOTE: one inconsistency would be that:

    obj[]

would be a SyntaxError, and

    obj[this=x]

would be a TypeError, if a positional index were expected. But that's less weird to me than the other inconsistencies we are introducing for backward compatibility.

-CHB

-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Another inconsistency is that the case of keyword arguments only would bind the RHS value to the first positional argument, which is the index, and not the value. I think this is what Guido was referring to when he responded talking about introspection being required? Not sure. In any case, to me that doesn't seem like such a big deal. It might lead to some weird error messages but I'm not sure why it's such a big problem. Maybe it poses a difficulty for type hinting? On Sat, Sep 26, 2020, 6:48 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Sat, Sep 26, 2020 at 07:12:00PM -0400, Ricky Teachey wrote:
Another inconsistency is that the case of keyword arguments only would bind the RHS value to the first positional argument, which is the index, and not the value. I think this is what Guido was referring to when he responded talking about introspection being required? Not sure.
In any case, to me that doesn't seem like such a big deal. It might lead to some weird error messages but I'm not sure why it's such a big problem. Maybe it poses a difficulty for type hinting?
I had a few paragraphs about exactly that scenario in my earlier post, but I deleted it because I didn't think it would be popular. So let me see if I can recreate it from memory.

You are suggesting that if there is no positional index, the interpreter should just pack the value into the left-most parameter. Here's the signature of my method:

    # Round 1
    def __setitem__(self, index, value, *, spam=0):

and the caller uses this:

    obj[spam=1] = 99

so 99 gets packed into the index and the exception says something like:

    TypeError: missing 1 required positional argument: 'value'

which is surely going to confuse a lot of people, because the value is clearly 99. It's the index which is missing.

Round 2: change the signature.

    def __setitem__(self, index, value=None, *, spam=0):

and now the parameters get packed as index=99 and value=None, so there's no exception, but that looks like `obj[99, spam=1] = None` which would be indistinguishable from a perfectly normal call.

Round 3:

    SENTINEL = object()  # Private sentinel value for subscripts.
    # Or maybe use NotImplemented?

    def __setitem__(self, index, value=SENTINEL, *, spam=0):
        if value is SENTINEL:
            value = index
            index = SENTINEL  # or raise
        if index is SENTINEL:
            # Handle keywords with no index
        else:
            # Handle index + keywords

and now we have something usable. That's not a lot of boilerplate code, but if you don't write it, you get a weirdly misleading error message, as in Round 1 above.

Ultimately I think it's going to be up to the PEP authors to decide what they want to propose here. I think we now have these choices:

1. Fill in a default index with one of:
   a. None
   b. empty tuple ()
   c. NotImplemented
   d. a new, unhashable builtin Missing or NoIndex
2. Prohibit keyword-only subscripts.
3. Bind the right-hand side value to the index parameter, and leave the value parameter blank (as above).

Did I miss any?
There are probably more heavyweight alternatives that require changes to parameter binding, or new dunders, or runtime introspection of the dunder, but I'm not sure that the PEP authors want to consider those (and if they do, they should probably be in a competing PEP).

I think that:

1a. is difficult (but not impossible) for numpy to use;
1b. makes a certain sense but is confusable with an actual tuple subscript;
1c. recycles an existing builtin that is very unlikely to be currently used as a subscript;
1d. avoids any chance of that, but requires a new builtin;
2. would be disappointing but I could live with it;
3. feels ugly and inelegant and will probably confuse people.

Stefano, Jonathan, I think that if there's no more comments on this specific issue in the next few days, it's up to you now to make a choice, put it in the PEP, list the alternatives as above and why you are rejecting them (perhaps by linking to this thread -- you don't have to recap the entire discussion inside the PEP, just a brief summary).

-- Steve
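Rounds 1 and 3 from this message can be run today if the dunder is called directly with the arguments that choice 3 would pass. This is a sketch under that assumption: the class names `Round1` and `Round3` are invented, and the return values exist only so the outcome is visible.

```python
SENTINEL = object()  # private marker, as in Round 3


class Round1:
    def __setitem__(self, index, value, *, spam=0):
        return index, value, spam


class Round3:
    def __setitem__(self, index, value=SENTINEL, *, spam=0):
        if value is SENTINEL:
            # Only one positional argument arrived, so it was really the value.
            value, index = index, SENTINEL
        if index is SENTINEL:
            return ("keywords only", value, spam)
        return ("index + keywords", index, value, spam)


# Stand-in for `obj[spam=1] = 99` under choice 3: 99 is the only positional.
try:
    Round1().__setitem__(99, spam=1)
    round1_error = ""
except TypeError as exc:
    round1_error = str(exc)  # complains about 'value', although 99 was given

keyword_only = Round3().__setitem__(99, spam=1)

# Stand-in for `obj[5, spam=1] = 99`.
indexed = Round3().__setitem__(5, 99, spam=1)
```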
On Sun, Sep 27, 2020 at 7:08 PM Steven D'Aprano <steve@pearwood.info> wrote:
1. Fill in a default index with one of:
a. None b. empty tuple () c. NotImplemented d. a new, unhashable builtin Missing or NoIndex
1d. avoids any chance of that, but requires a new builtin;
An interesting and very good point. I kinda like the idea that it would cause an immediate exception if you use it with any of the default types (by not being numeric or hashable), but it raises the question "why isn't this hashable", since it would be unique and immutable. Ultimately I don't think it's worth it, but the unhashability WOULD be advantageous here. I'm currently inclined towards the empty tuple option, myself. ChrisA
On Sun, Sep 27, 2020 at 8:49 AM Christopher Barker <pythonchb@gmail.com> wrote:
On Fri, Sep 25, 2020 at 11:50 PM Steven D'Aprano <steve@pearwood.info> wrote:
1. We have to pass a sentinel to the setitem dunder if there is no positional index passed.
I still don't follow this logic -- why can't nothing be passed? The dunders either require an index or they don't, would that be just like function calling? So (adapting the example in the PEP:
obj[spam=1, eggs=2] # calls type(obj).__getitem__(obj, spam=1, eggs=2)
Yes, getitem is fine - but look at setitem. ChrisA
On Sat, Sep 26, 2020 at 03:48:42PM -0700, Christopher Barker wrote:
On Fri, Sep 25, 2020 at 11:50 PM Steven D'Aprano <steve@pearwood.info> wrote:
1. We have to pass a sentinel to the setitem dunder if there is no positional index passed.
I still don't follow this logic -- why can't nothing be passed? The dunders either require an index or they don't, would that be just like function calling? So (adapting the example in the PEP:
obj[spam=1, eggs=2] # calls type(obj).__getitem__(obj, spam=1, eggs=2)
This sure seems like the obvious way to handle it.
Christopher, with the greatest respect, it is really demoralising for me to explain this issue something like three, four, maybe five times now (I'm not going to go back and count), including this thread which is specifically about this issue, and then have people seemingly not even read it before writing back to disagree :-(

The problem isn't with the `__getitem__` dunder. It's the `__setitem__` dunder. I said it right there in the comment you quoted. Okay, I was lazy and dropped the underscores, but still, **set**item is right there. Then I spent a lot of time explaining why setitem is a problem.

The problem is, how does the interpreter pass the second positional argument without passing something as the first positional argument? This isn't a problem for subscripting alone. It's a problem for any function call:

    def function(first=None, second=None, /):
        print(first, second)

I've explicitly flagged the arguments as "positional only" to avoid (non-)solutions that rely on the interpreter knowing the names of the parameters at runtime. You can only pass arguments by position, they have to be filled left-to-right, and the aim is to successfully pass a value for `second` but no value for `first`.

You can't rely on the author of the function to do the argument processing like this:

    def function(*args):
        if len(args) == 1:
            second = args[0]
            first = None  # default
        elif len(args) == 2:
            first, second = args
        else:
            raise TypeError('wrong number of arguments')

That's why there are so few functions in the Python ecosystem with a signature similar to range:

    range( [start,] end [, step] )

and that's the problem that we solve here by auto-filling some sentinel for the index when the subscript is keyword-only.
If the constraints are:

* the right hand side assignment value gets bound to the second positional argument (not counting "self"), as expected;
* there are no changes to the way the interpreter binds arguments to parameters;
* there is no requirement that all `__setitem__` methods use the same fixed parameter name for the value argument so that it can be passed by a standard name (even "self" is just a convention);
* and no runtime introspection by the interpreter to find out what the parameter name is (too slow);

then it is hard to see any other solution than to pass a special sentinel to setitem to represent the missing index.

If you have any other solutions, or if you have a persuasive argument in favour of relaxing one or more of those constraints, I'd love to hear it.

-- Steve
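The positional-only example from this message behaves as described in current Python. A small sketch, with the function returning its arguments instead of printing them so the result can be inspected; using `NotImplemented` as the placeholder is just one of the candidates discussed in the thread, not a decision.

```python
def function(first=None, second=None, /):
    return first, second


# Positional arguments fill from the left, so a single argument lands in `first`.
only_one = function(2)

# Naming the parameter is not allowed for positional-only parameters.
try:
    function(second=2)
    by_keyword = None
except TypeError as exc:
    by_keyword = exc

# The sentinel approach: the caller fills the first slot with a placeholder.
with_sentinel = function(NotImplemented, 2)
```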
On Sat, Sep 26, 2020 at 22:57 Steven D'Aprano wrote:
Christopher, with the greatest respect, it is really demoralising for me to explain this issue something like three, four, maybe five times now (I'm not going to go back and count), including this thread which is specifically about this issue, and then have people seemingly not even read it before writing back to disagree :-(
+1 —Guido -- --Guido (mobile)
On Fri, Sep 25, 2020, 5:49 PM Steven D'Aprano
Since both None and () are likely to be legitimate indexes, and NotImplemented is less likely to be such, I think this supports using NotImplemented.
I think your arguments for NotImplemented vs None or () are solid. But I'm having trouble crossing my eyes in just the way that will make those words fit the purpose. What about instead subclassing NotImplemented as maybe NoIndex? That would make sense in reading, and have the same favorable characteristics. Moreover, being a brand new class, we can be certain no one is already using it.
On Fri, Sep 25, 2020 at 08:52:45PM -1000, David Mertz wrote:
On Fri, Sep 25, 2020, 5:49 PM Steven D'Aprano
Since both None and () are likely to be legitimate indexes, and NotImplemented is less likely to be such, I think this supports using NotImplemented.
I think your arguments for NotImplemented vs None or () are solid. But I'm having trouble crossing my eyes in just the way that will make those words fit the purpose.
What about instead subclassing NotImplemented as maybe NoIndex? That would make sense in reading, and have the same favorable characteristics. Moreover, being a brand new class, we can be certain no one is already using it.
I don't object to this. If we added NoIndex, we could make it unhashable too. But I wouldn't want the PEP to rely on the Steering Council accepting a new builtin just for this. If the Steering Council will accept a new builtin, then I agree, this is a nice way of handling it and avoiding even the faintest possibility of collision with code that uses NotImplemented as a key. -- Steve
I don't understand the problem here.

    d[p=q]               --> d.__{get,set,del}item__((), ..., p=q)
    d[1, p=q]            --> d.__{get,set,del}item__((1), ..., p=q)
    d[1, 2, p=q]         --> d.__{get,set,del}item__((1, 2), ..., p=q)
    d[1, 2, 3, p=q]      --> d.__{get,set,del}item__((1, 2, 3), ..., p=q)
    d[1, 2, ..., n, p=q] --> d.__{get,set,del}item__((1, 2, ..., n), ..., p=q)

Now obviously the n=1 case is a wart. But the n=0 case isn't a wart. It's just like n=2, n=3, etc.

You can't tell the difference between a single tuple argument and n arguments for any n≠1. As far as I can tell the problem when n=0 is no more or less serious than the problem when n≥2. I don't see the point of adding another special case to the spec when it doesn't even solve the general problem.
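The tuple packing this argument relies on is existing behaviour and can be checked without any keyword support. A minimal sketch, where `Probe` is an invented class that simply returns whatever index it receives:

```python
class Probe:
    def __getitem__(self, index):
        return index


p = Probe()

one = p[1]             # n=1: the bare object, not a 1-tuple (the "wart")
two = p[1, 2]          # n=2: packed into a tuple
explicit = p[(1, 2)]   # an explicit tuple is indistinguishable from n=2
empty = p[()]          # an explicit empty tuple is already a valid subscript
```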
On Sat, Sep 26, 2020 at 10:30 PM Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
I don't understand the problem here.
    d[p=q]               --> d.__{get,set,del}item__((), ..., p=q)
    d[1, p=q]            --> d.__{get,set,del}item__((1), ..., p=q)
    d[1, 2, p=q]         --> d.__{get,set,del}item__((1, 2), ..., p=q)
    d[1, 2, 3, p=q]      --> d.__{get,set,del}item__((1, 2, 3), ..., p=q)
    d[1, 2, ..., n, p=q] --> d.__{get,set,del}item__((1, 2, ..., n), ..., p=q)
Now obviously the n=1 case is a wart. But the n=0 case isn't a wart. It's just like n=2, n=3, etc.
You can't tell the difference between a single tuple argument and n arguments for any n≠1. As far as I can tell the problem when n=0 is no more or less serious than the problem when n≥2. I don't see the point of adding another special case to the spec when it doesn't even solve the general problem.
The problem is that there is lots of existing code like this:

    def __setitem__(self, index, value): ...

But the new features will almost certainly lead people to write new code like this:

    d={}
    obj[**d] = "foo"  # no kwd arguments provided here

...which, if someone does this arbitrarily against classes that use the existing kind of code I gave at the first, will call (if the sentinel is () ):

    obj.__setitem__(())

...which could have all kinds of weird effects rather than giving an error, as I think would be better.

Steven's response to that was essentially "well then, don't unpack dictionaries against arbitrary subscriptable types" which I fully agree is a perfectly legitimate response. But I think having no sentinel at all and instead telling people "either provide a default argument for both index AND value in your __setitem__ method, or neither" is also a perfectly legitimate way forward. Not sure which is better.

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Sat, Sep 26, 2020 at 10:43 PM Ricky Teachey <ricky@teachey.org> wrote:
...which, if someone does this arbitrarily against classes that use the existing kind of code I gave at the first, will call (if the sentinel is () ):
obj.__setitem__(())
...which could have all kinds of weird effects rather than giving an error, as I think would be better.
Sorry correction: obj.__setitem__((), "foo") --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On 27/09/20 3:43 pm, Ricky Teachey wrote:
telling people "either provide a default argument for both index AND value in your __setitem__ method, or neither" is also a perfectly legitimate way forward.
Giving the value parameter of a __setitem__ a default seems like a weird thing to do, since under normal circumstances it will never get used. -- Greg
On Sat, Sep 26, 2020, 11:30 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 27/09/20 3:43 pm, Ricky Teachey wrote:
telling people "either provide a default argument for both index AND value in your __setitem__ method, or neither" is also a perfectly legitimate way forward.
Giving the value parameter of a __setitem__ a default seems like a weird thing to do, since under normal circumstances it will never get used.
-- Greg
I agree it's certainly weird. But it fixes the problem. And there's not much effort required.
On Sat, Sep 26, 2020 at 7:44 PM Ricky Teachey <ricky@teachey.org> wrote:
The problem is that there is lots of existing code like this:
def __setitem__(self, index, value): ...
But the new features will almost certainly lead people to write new code like this:
d={} obj[**d] = "foo" # no kwd arguments provided here
...which, if someone does this arbitrarily against classes that use the existing kind of code I gave at the first, will call (if the sentinel is () ):
obj.__setitem__((), "foo") # [Correction by Guido after Ricky's email]
...which could have all kinds of weird effects rather than giving an error, as I think would be better.
Steven's response to that was essentially "well then, don't unpack dictionaries against arbitrary subscriptable types" which I fully agree is a perfectly legitimate response. But I think having no sentinel at all and instead telling people "either provide a default argument for both index AND value in your __setitem__ method, or neither" is also a perfectly legitimate way forward.
As I said there clearly are problems with your solution, because there's no way to call a function with the first positional argument omitted and the second positional argument given (and as I have argued, in the presence of `**kwargs`, index and value *must* be positional-only arguments).

And honestly, if you want to shoot yourself in the foot (or have some use for it!), you can write
```
d[()] = "foo"
```
in today's Python. Presumably someone who uses `obj[**kwargs]` is doing it for an object that takes keyword args, so that object must already be prepared for index being `()`.

PS. Maybe we can just stop debating this PEP? I have a feeling that we're going around in circles. I think Stefano has some changes that he would like to see vetted, but the best way to signal an empty index is not one of them.

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
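What `d[()] = "foo"` does today depends on the container, which is a quick way to see both sides of this exchange. A small sketch using only built-in types:

```python
d = {}
d[()] = "foo"  # a dict accepts the empty tuple as an ordinary key

items = [1, 2, 3]
try:
    items[()] = "foo"  # a list only accepts integers or slices
    list_error = None
except TypeError as exc:
    list_error = exc
```

So an empty-tuple index is stored silently by a dict and rejected with TypeError by a list.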
On Sat, Sep 26, 2020 at 11:49 PM Guido van Rossum <guido@python.org> wrote:
On Sat, Sep 26, 2020 at 7:44 PM Ricky Teachey <ricky@teachey.org> wrote:
The problem is that there is lots of existing code like this:
def __setitem__(self, index, value): ...
But the new features will almost certainly lead people to write new code like this:
    d = {}
    obj[**d] = "foo"  # no kwd arguments provided here
...which, if someone does this arbitrarily against classes that use the existing kind of code I gave at the first, will call (if the sentinel is () ):
obj.__setitem__((), "foo") # [Correction by Guido after Ricky's email]
...which could have all kinds of weird effects rather than giving an error, as I think would be better.
Steven's response to that was essentially "well then, don't unpack dictionaries against arbitrary subscriptable types" which I fully agree is a perfectly legitimate response. But I think having no sentinel at all and instead telling people "either provide a default argument for both index AND value in your __setitem__ method, or neither" is also a perfectly legitimate way forward.
As I said there clearly are problems with your solution, because there's no way to call a function with the first positional argument omitted and the second positional argument given (and as I have argued, in the presence of `**kwargs`, index and value *must* be positional-only arguments).
Ok I do understand, my suggestion was just to go ahead and let them be bound to the wrong arguments when someone makes what is a mistake.
And honestly, if you want to shoot yourself in the foot (or have some use for it!), you can write ``` d[()] = "foo" ``` in today's Python. Presumably someone who uses `obj[**kwargs]` is doing it for an object that takes keyword args, so that object must already be prepared for index being `()`.
Yeah, good point, people can make mistakes as things stand. My desire was to prevent the creation of a specific new possibility of mistake, one that leads to side effects that might be hard to uncover/debug.
PS. Maybe we can just stop debating this PEP? I have a feeling that we're going around in circles. I think Stefano has some changes that he would like to see vetted, but the best way to signal an empty index is not one of them.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
Ok I'll refrain from responding on this topic anymore. Sorry Guido just trying to help since the assumption that a sentinel is needed at all didn't really make sense to me (still doesn't). --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Sun, 27 Sep 2020 at 05:06, Ricky Teachey <ricky@teachey.org> wrote:
obj[**d] = "foo" # no kwd arguments provided here
I committed yesterday the following proposal https://github.com/python/peps/pull/1622
But to be honest I am not sure if we should disallow these two constructs
    d[*()]
    d[**{}]
as equivalent to the disallowed d[] or allow them as equivalent to d[()] (or whatever the sentinel will be)
PS. Maybe we can just stop debating this PEP? I have a feeling that we're going around in circles. I think Stefano has some changes that he would like to see vetted, but the best way to signal an empty index is not one of them.
I am not sure. I am on the fence on many topics. There seem to be no clear solution on many of them, it boils down to taste and compromise. In any case, I listen to all proposals (although with a small delay). I am working on the sentinel issue at the moment https://github.com/python/peps/compare/master...stefanoborini:pep-637-on-def... -- Kind regards, Stefano Borini
On Sun, 27 Sep 2020 at 12:28, Stefano Borini <stefano.borini@gmail.com> wrote:
I am not sure. I am on the fence on many topics. There seem to be no clear solution on many of them, it boils down to taste and compromise. In any case, I listen to all proposals (although with a small delay). I am working on the sentinel issue at the moment
https://github.com/python/peps/compare/master...stefanoborini:pep-637-on-def...
Sentinel is now a PR https://github.com/python/peps/pull/1626 I kept the tuple as the accepted option, but I am personally open to NoIndex as well. I am not sure how the SC would take a non-hashable, new constant to be honest, for such a specific use case. It would help if more people replied to Steven's poll to gather a clearer picture of the general opinion. -- Kind regards, Stefano Borini
On Sun, Sep 27, 2020 at 10:41 AM Stefano Borini <stefano.borini@gmail.com> wrote:
I kept the tuple as the accepted option, but I am personally open to NoIndex as well. I am not sure how the SC would take a non-hashable, new constant to be honest, for such a specific use case.
My "vote" is for NoIndex. I suggested that new object and name, after all :-). But I find empty tuple perfectly fine, and would have no objection if that is chosen. -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
On 2020-09-27 21:47, David Mertz wrote:
On Sun, Sep 27, 2020 at 10:41 AM Stefano Borini <stefano.borini@gmail.com> wrote:
I kept the tuple as the accepted option, but I am personally open to NoIndex as well. I am not sure how the SC would take a non-hashable, new constant to be honest, for such a specific use case.
My "vote" is for NoIndex. I suggested that new object and name, after all :-).
But I find empty tuple perfectly fine, and would have no objection if that is chosen.
I'd go for something that would have a wider use. Consider, for example, the use-case of a function that has an optional parameter and you want a way to know whether an argument has been provided but None is a valid value that could be passed in. Having a singleton such as Missing would be helpful there. An index/subscript that has no positional component is just a special case of that.
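The use case described here is usually handled today with a private sentinel object. The sketch below assumes a hypothetical module-level `_MISSING` object and a hypothetical `describe` function; it is not a proposed built-in, only an illustration of the pattern a shared `Missing` singleton would standardize.

```python
_MISSING = object()  # hypothetical module-private sentinel


def describe(value=_MISSING):
    """Distinguish 'no argument given' from an explicit None."""
    if value is _MISSING:
        return "no argument"
    if value is None:
        return "explicit None"
    return f"value {value!r}"
```

Because the check uses `is`, no caller can trigger the "no argument" branch by accident unless they pass `_MISSING` itself.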
On Sun, Sep 27, 2020 at 2:18 PM MRAB <python@mrabarnett.plus.com> wrote:
Consider, for example, the use-case of a function that has an optional parameter and you want a way to know whether an argument has been provided but None is a valid value that could be passed in.
Having a singleton such as Missing would be helpful there.
The trouble is that's what None was supposed to be for. Over time, Missing would presumably fall victim to the same fate that befell None. You'd also probably have to extend PEP 505 to support Missing-aware operators. Maybe a singleton that supported no useful operations, not even __eq__ or __bool__, would be sufficiently inconvenient that it would only be used for defaults and "is" tests for said defaults.
On Sun, Sep 27, 2020 at 10:52 PM Chris Angelico <rosuav@gmail.com> wrote:
English treats 1 as special and 0 as the same as other numbers when it comes to singulars and plurals. [...] What do other languages do in this way?
You were asking about natural languages but perhaps it's worth mentioning that Haskell has tuples of length 0, 2, 3, ..., but no tuples of length 1. Tuples are meant for putting n values where 1 value is expected, and when n=1 you just put the value there.
On Mon, Sep 28, 2020 at 11:47 AM Stefano Borini <stefano.borini@gmail.com> wrote:
Also, it would give different behavior between d[x] and d[x, **kw], which in my opinion should be a fully degenerate case.
On the other hand, it would make d[x,] and d[x, **kw] consistent, which they also ought to be. What a mess.
On 29/09/20 4:19 pm, Ben Rudiak-Gould wrote:
it's worth mentioning that Haskell has tuples of length 0, 2, 3, ..., but no tuples of length 1. Tuples are meant for putting n values where 1 value is expected, and when n=1 you just put the value there.
To elaborate on that a bit, the length of a tuple is part of its static type in Haskell. It's impossible to write a function that takes a tuple of arbitrary length -- a function taking a 1-tuple could only ever take a 1-tuple, so it might as well just take the value directly. Things are different in Python -- functions can operate on tuples of varying lengths, so if 1-tuples didn't exist some things would be a bit awkward. Not sure what bearing this has, if any, on the indexing problem. -- Greg
On Mon, Sep 28, 2020 at 08:19:01PM -0700, Ben Rudiak-Gould wrote:
Maybe a singleton that supported no useful operations, not even __eq__ or __bool__, would be sufficiently inconvenient that it would only be used for defaults and "is" tests for said defaults.
NotImplemented is halfway there: it supports equality test, but as of 3.9 or 3.10, I forget which, it no longer supports use in a bool context. -- Steve
On Sun, Sep 27, 2020 at 1:51 PM David Mertz <mertz@gnosis.cx> wrote:
On Sun, Sep 27, 2020 at 10:41 AM Stefano Borini <stefano.borini@gmail.com> wrote:
I kept the tuple as the accepted option, but I am personally open to NoIndex as well. I am not sure how the SC would take a non-hashable, new constant to be honest, for such a specific use case.
Would it need to be in __builtins__? Could it not be hidden away in a module somewhere (operator maybe?) I don't really see a reason not to put it in __builtins__, but if there is resistance to that then it may not be necessary. And non-hashable is better, but not absolutely necessary either. Particularly if it's not in __builtins__, folks are unlikely to go use it as a dict key. So it could be a pretty lightweight new constant. The key is that only the authors of classes that use this new fancy indexing behavior (and the interpreter) will need access to it -- "regular" users will never see it.
As for using an empty tuple, thanks Guido for laying out the logic so succinctly, and it does make it pretty simple that only the one index case is special. Nevertheless, I think most folks expect the special case to be at the end of the "series", not in the middle -- i.e.
    four is the same as three is the same as two, one is special, and zero is not allowed.
rather than
    four is the same as three is the same as two, one is special, and zero is the same as two, three, four, ...
For that reason, I prefer a separate sentinel. But in fact, the number of people that write classes that use this new behavior will be orders of magnitude smaller than the number of users of such classes [*], so as long as it looks simple and consistent from the users' side, anything that can be explained is fine.
The other thing to keep in mind is that even with the current, simpler situation, I suspect that a lot of people don't know that multiple indexes don't actually exist in current Python, and that the comma is not passing another index, but is actually simply making a tuple, just like it does everywhere else. It looks like passing multiple arguments to a function and, if you don't look inside the class, it acts like it too. If keyword indices are added, it will look even more like a function call.
So: it may be rare, but it would be pretty confusing if
    obj[(), kw=x]
is the same as:
    obj[kw=x]
whereas:
    obj[NoIndex, kw=x]
is the same as
    obj[kw=x]
would be less surprising, and even less likely to be stumbled upon anyway.
-CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
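The remark that the comma only builds a tuple can be checked in current Python. The `Echo` class below is a hypothetical helper that returns whatever index the interpreter hands to `__getitem__`.

```python
class Echo:
    """Hypothetical helper: return whatever index the interpreter passes."""

    def __getitem__(self, index):
        return index


e = Echo()
one = e[1]             # a bare value
single = e[1,]         # a one-element tuple
pair = e[1, 2]         # the comma builds a tuple; there is still one index
explicit = e[(1, 2)]   # indistinguishable from the line above
```

From inside `__getitem__` there is no way to tell `e[1, 2]` from `e[(1, 2)]`, which is why the empty-index case has to be represented by some single object.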
On Mon, Sep 28, 2020 at 3:33 PM Christopher Barker <pythonchb@gmail.com> wrote:
As for using an empty tuple, thanks Guido for laying out the logic so succinctly, and it does make it pretty simple that only the one index case is special. Nevertheless, I think most folks expect the special case to be at the end of the "series", not in the middle -- i.e.
four is the same as three is the same as two, one is special, and zero is not allowed.
rather than
four is the same as three is the same as two, one is special, and zero is the same as two, three, four, ...
For that reason, I prefer a separate sentinel.
English treats 1 as special and 0 as the same as other numbers when it comes to singulars and plurals. You talk about "an item", but "no items" or "two items" or "three items". As you descend through the numbers, four ("items") is the same as three ("items") is the same as two ("items"), and one is special ("item"), and zero is back to the normal case ("items"). I'm not saying that English should be the fundamental basis upon which Python is designed, but at least there's some precedent :) What do other languages do in this way? Are other human languages based around the idea that zero matches others, or are there different ways to write things? Dutch is, of course, of particular interest here :) ChrisA
On 2020-09-28 06:51, Chris Angelico wrote:
On Mon, Sep 28, 2020 at 3:33 PM Christopher Barker <pythonchb@gmail.com> wrote:
As for using an empty tuple, thanks Guido for laying out the logic so succinctly, and it does make it pretty simple that only the one index case is special. Nevertheless, I think most folks expect the special case to be at the end of the "series", not in the middle -- i.e.
four is the same as three is the same as two, one is special, and zero is not allowed.
rather than
four is the same as three is the same as two, one is special, and zero is the same as two, three, four, ...
For that reason, I prefer a separate sentinel.
English treats 1 as special and 0 as the same as other numbers when it comes to singulars and plurals. You talk about "an item", but "no items" or "two items" or "three items". As you descend through the numbers, four ("items") is the same as three ("items") is the same as two ("items"), and one is special ("item"), and zero is back to the normal case ("items").
I'm not saying that English should be the fundamental basis upon which Python is designed, but at least there's some precedent :) What do other languages do in this way? Are other human languages based around the idea that zero matches others, or are there different ways to write things?
Dutch is, of course, of particular interest here :)
Some languages have singular, dual, and plural. Some languages use the singular if the number ends with 1, so "20 items", "21 item", "22 items". I believe I read somewhere that some languages are OK with "found 2 files" and "found 1 file", but reject "found 0 files"; it has to be "didn't find any files". And some languages don't inflect for number at all.
On Sun, Sep 27, 2020 at 4:29 AM Stefano Borini <stefano.borini@gmail.com> wrote:
```
obj[**d] = "foo" # no kwd arguments provided here
```
I committed yesterday the following proposal https://github.com/python/peps/pull/1622 But to be honest I am not sure if we should disallow these two constructs
```
d[*()]
d[**{}]
```
as equivalent to the disallowed `d[]` or allow them as equivalent to `d[()]` (or whatever the sentinel will be)
I have thought extensively about this issue (in fact I lay awake thinking it through last night :-) and have come to some kind of resolution.
TL;DR: these should be allowed and use `d[()]` or whatever the sentinel will be, but I prefer the sentinel to be `()`. But there are more cases than just the above two. Below I try to catch them all.
We've already established that in the absence of `*a` and keyword args, the index is a tuple when it looks like a tuple:
```
SYNTAX      INDEX
d[x]        x
d[x,]       (x,)
d[x, y]     (x, y)
```
We've also established that in the absence of `*a` but with at least one keyword arg, the index is a tuple unless there is exactly one index value:
```
SYNTAX          INDEX     KWARGS
d[x, k=z]       x         {"k": z}
d[x, y, k=z]    (x, y)    {"k": z}
```
I propose to treat `*a` in a similar fashion: If _after expansion of `*a`_ there is exactly one index value, the index is that value, otherwise it's a tuple. IOW:
```
SYNTAX        INDEX
d[x, *[]]     x
d[x, *[y]]    (x, y)
d[*[], x]     x
d[*[y], x]    (y, x)
d[*[]]        ()
d[*[x]]       x
d[*[x, y]]    (x, y)
```
Note that I use `*[...]` instead of `*(...)` -- since `*a` takes an arbitrary iterable, it doesn't matter whether `a` is a list, tuple, another type of sequence, or a general iterable (e.g. a set or dict, or a generator). I'm using `*[...]` consistently just to remind us of this fact (and to avoid having to type a trailing comma to create a singleton tuple).
We can easily extend this scheme to keyword args: As soon as either a keyword argument or `**kwargs` (or both) is present, we apply the same rule: If _after expansion of `*a`_ there is exactly one index value, the index is that value, otherwise it's a tuple.
I'm not going to show all the cases, but here are some examples:
```
SYNTAX              INDEX     KWARGS
d[*[], k=z]         ()        {"k": z}
d[*[x], k=z]        x         {"k": z}
d[*[x, y], k=z]     (x, y)    {"k": z}
d[*[], **{}]        ()        {}
d[*[x], **{}]       x         {}
d[*[x, y], **{}]    (x, y)    {}
```
I propose to then treat the cases where there are no positional index values, only keywords (either `k=1` or `**{...}`, even `**{}`) _as if preceded by `*[]`_. So:
```
SYNTAX           INDEX    KWARGS
d[k=z]           ()       {"k": z}
d[**{"k": z}]    ()       {"k": z}
d[**{}]          ()       {}
```
The reason for these choices is to minimize the number of inconsistencies. We have a few unavoidable inconsistencies:
- if there's only one index value the index is not a tuple, in all other cases it's a tuple -- backward compatibility
- the special case for `d[x,]` (not the same as `d[x]`) -- also backward compatibility
- the difference between `d[x,]` (index is a tuple) and `d[x, k=1]` (index not a tuple) -- user expectations
There's also the special case for `d[k=1]`. Here our choices are to either forbid it syntactically or to provide a sentinel. I think forbidding it will prevent some reasonable use cases (e.g. tables with columns that have both positions and names). So I think it's best to use a sentinel.
Using `()` as the sentinel reduces the number of special cases: the rule becomes "if there's exactly one positional value, use that value as the index; otherwise use a tuple", while with another sentinel the rule would become "if there's more than one positional value, use a tuple, if there's exactly one use that value, else (there are no positional values) use the sentinel". But if in the end people prefer a sentinel other than `()`, I can live with that -- in all the above cases, just replace `()` with the selected sentinel.
In both cases we also have an exception for the form `d[x,]`, but this factors out because it's the same complication in each case. Note that this exception is unique -- it only applies if the syntactic form has no keywords, no `**kwargs`, and no `*args`.
The introduction of `*args` requires us to decide what to do with the edge cases. I think the rule that best matches user expectations is to combine the plain positional values with the expansion of `*args`, take the resulting sequence of values, and _then_ apply the rule from the previous paragraph (using whichever sentinel we end up deciding on).
The introduction of `**kwargs` should pose no extra difficulties. Again, if there are no positional index values the sentinel index is used, and syntactic keywords are combined with `**kwargs` to form a single dict of keyword args. We end up finding that `d[**kwargs]` uses the index sentinel regardless of whether `kwargs` is empty or not. (If we were to end up forbidding `d[k=1]` syntactically, the only consistent choice would be to raise for `d[**{}]`, but I don't see a good reason to go this way.)
Note that `*args` and `**kwargs` both must combine the hard-coded arguments of the same nature (positional or keyword) with the dynamic ones before deciding. Anything else would lead to more rules and more inconsistencies.
Finally, in the above examples, `x`, `y` and `z`, when occurring in hard-coded arguments of either nature, may also be slices, e.g. `d[x]` could be `d[i:j]` or `d[i:j:k]` or e.g. `d[::]`. This makes no difference for the analysis. Note that in `*args` and `**kwargs` the slice notation is not syntactically valid -- but you can use explicit calls to `slice()`, e.g. `slice(i, j)`, `slice(i, j, k)` or `slice(None, None, None)`.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
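The rule proposed in this message ("exactly one positional value after expansion: pass it bare; otherwise pass a tuple") can be modelled with an ordinary function call, since the proposed subscript syntax itself cannot be run. `build_subscript` is a hypothetical name, and the model assumes `()` as the sentinel; it also cannot represent the `d[x,]` exception, because a trailing comma in a call is not significant.

```python
def build_subscript(*positional, **keywords):
    """Model of the proposed rule; not the interpreter's actual mechanism.

    d[x, *a, k=z, **kw] is modelled as build_subscript(x, *a, k=z, **kw),
    so hard-coded and unpacked arguments are combined before deciding.
    """
    if len(positional) == 1:
        index = positional[0]   # exactly one value: pass it bare
    else:
        index = positional      # zero, two, or more values: pass a tuple
    return index, keywords


# A few rows from the tables above, expressed as calls:
no_positional = build_subscript(k="z")        # models d[k=z]
one_unpacked = build_subscript(*["x"])        # models d[*[x]]
mixed = build_subscript("x", *["y"], k="z")   # models d[x, *[y], k=z]
```

Swapping in a different sentinel would only change the zero-length branch, which is the extra special case the message argues against.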
Would another option be to just stop using the tuple-less index in the presence of new syntax - so by example,
```
SYNTAX          INDEX    KWARGS
d[*[]]          ()       {}
d[*[],]         ()       {}
d[**{}]         ()       {}
d[**{},]        ()       {}
d[foo=1]        ()       {'foo': 1}
d[foo=1,]       ()       {'foo': 1}
d[x]            x        {}            (valid existing syntax)
d[x,]           (x,)     {}            (valid existing syntax)
d[x, *[]]       (x,)     {}
d[x, *[],]      (x,)     {}
d[x, **{}]      (x,)     {}
d[x, **{},]     (x,)     {}
d[x, foo=1]     (x,)     {'foo': 1}
d[x, foo=1,]    (x,)     {'foo': 1}
```
Essentially, the rules are:
* Does the indexing contain any of the new syntax proposed in this PEP (`*`, `**`, or explicit kwarg)?
  * If yes: always pass a tuple as the index. This is new-style indexing, let's leave behind the weird corner cases. `()` is a natural choice of singleton.
  * If no: use the existing rule of parsing the contents of `[]` as a tuple.
The one downside of this approach is that perfect forwarding of `__getitem__` requires a small dance
```
def __getitem__(self, args, **kwargs):
    if not type(args) is tuple and not kwargs:
        # this is pre-PEP-637 indexing
        return self._wrapped[args]
    else:
        # the presence of `*` or `**` is enough to force PEP-637 indexing, even
        # if those are empty
        return self._wrapped[*args, **kwargs]
```
This would only come up in new code though - any forwarding `__getitem__` without `**kwargs` would already be written correctly. I'd argue it would be sufficient to note this as a usage note in the PEP.
Eric
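For comparison, the alternative rule in this message can be modelled the same way. Because a plain function cannot see which syntax was used, the sketch takes an explicit `new_syntax` flag; that flag, the function name `build_subscript_alt`, and the argument layout are all assumptions made for illustration.

```python
def build_subscript_alt(positional, keywords=None, new_syntax=False):
    """Model of the alternative rule (names are hypothetical).

    positional: values inside the brackets after any `*` expansion.
    new_syntax: True if the subscript used `*`, `**`, or a keyword.
    """
    keywords = dict(keywords or {})
    if new_syntax or keywords:
        return tuple(positional), keywords   # new-style: always a tuple
    # Old-style subscript: one value stays bare, several form a tuple.
    # (The trailing-comma form d[x,] is not modelled here.)
    if len(positional) == 1:
        return positional[0], keywords
    return tuple(positional), keywords


old_style = build_subscript_alt(["x"])                    # models d[x]
with_empty_kw = build_subscript_alt(["x"], new_syntax=True)  # models d[x, **{}]
```

Under this model `d[x]` yields `x` while `d[x, **{}]` yields `(x,)`, so the index seen by `__getitem__` depends on how the subscript was spelled.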
On Mon, 28 Sep 2020 at 15:45, Eric Wieser <wieser.eric+numpy@gmail.com> wrote:
Would another option be to just stop using the tuple-less index in the presence of new syntax - so by example,
I don't think it would be a reliable approach. Now you end up with a "randomly" occurring special case depending on how it's invoked. Also, it would give different behavior between d[x] and d[x, **kw], which in my opinion should be a fully degenerate case. -- Kind regards, Stefano Borini
On Sat, Sep 26, 2020 at 10:43:23PM -0400, Ricky Teachey wrote:
The problem is that there is lots of existing code like this:
def __setitem__(self, index, value): ...
But the new features will almost certainly lead people to write new code like this:
d={} obj[**d] = "foo" # no kwd arguments provided here
I don't think that this will almost certainly lead people to write code like that. Why would you do that deliberately? Apart from wanting to win a bet or demonstrate a point.
A better argument is that people will write code that unpacks a dict which is supposed to have keyword arguments, but *accidentally* ends up being empty.
    kw = get_keywords()  # oops, this returns an empty dict
    obj[**kw] = value
which will be equivalent to:
    obj[()] = value  # or whatever sentinel we choose
I think that counts as *user error*, not a design flaw. Any function that takes defaults is vulnerable to the same sort of problem: "I intended to unpack a bunch of arguments (whether keyword or positional) but they were unexpectedly empty, so I got the default values. Python is buggy!!!" No, Python is working as designed.
The only new thing here is that Python is passing a default for the leftmost parameter, but even that is not precisely *new*. Although the internal implementation is different, that's how the range API works:
    # intended to unpack a two-tuple (start, end)
    # but accidentally unpacked a *one*-tuple and got start=0
    range(*args)
So honestly, I think there's nothing to see here. Choose a sentinel, document it, and move on.
-- Steve
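Both halves of this argument can be reproduced in current Python. The `greet` function is hypothetical; `range` behaves as described in the message.

```python
args = (5,)            # meant to be (start, end), but only one value arrived
r = range(*args)       # silently treated as range(0, 5)


def greet(name="world", **options):
    """Hypothetical function with a defaulted leading parameter."""
    return f"hello {name}"


kw = {}                # meant to carry name=..., but ended up empty
message = greet(**kw)  # falls back to the default without complaint
```

In neither case does Python raise an error; the unexpectedly short input just selects the defaults.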
On 27/09/20 7:10 pm, Steven D'Aprano wrote:
    kw = get_keywords()  # oops, this returns an empty dict
    obj[**kw] = value
If an explicit d[] is going to be a compile-time error, maybe anything that has the same effect at run time should be an error too? -- Greg
Salve Stefano, Stefano Borini schrieb am 23.09.20 um 22:55:
"Support for indexing with keyword arguments" has now been merged with the assigned PEP number 637.
Cool, this looks like a great addition to the language! One thing that I'm missing from the PEP is the C side of things, though. How are C extension types going to implement this? Will there be two new slot methods for them? Something like (but better named than) "mp_subscript_kw()" and "mp_ass_subscript_kw()"? Would the extension types signal their availability with a new "TP_*" class feature flag? Are the slot methods going to use the vectorcall calling convention, i.e. pass the keyword names as a tuple, or should they accept (and thus, require the overhead of) a Python dict as argument? This design must clearly be part of the PEP. What's the current status of the discussion there? Stefan
Stefan Behnel schrieb am 29.09.20 um 11:48:
Salve Stefano,
Stefano Borini schrieb am 23.09.20 um 22:55:
"Support for indexing with keyword arguments" has now been merged with the assigned PEP number 637.
Cool, this looks like a great addition to the language!
One thing that I'm missing from the PEP is the C side of things, though. How are C extension types going to implement this?
Will there be two new slot methods for them? Something like (but better named than) "mp_subscript_kw()" and "mp_ass_subscript_kw()"?
Would the extension types signal their availability with a new "TP_*" class feature flag?
Are the slot methods going to use the vectorcall calling convention, i.e. pass the keyword names as a tuple, or should they accept (and thus, require the overhead of) a Python dict as argument?
This design must clearly be part of the PEP. What's the current status of the discussion there?
So, should I conclude from the silence that there has not been any discussion of this so far? Stefan
On Tue, Oct 6, 2020 at 1:16 AM Stefan Behnel <stefan_ml@behnel.de> wrote:
Stefan Behnel schrieb am 29.09.20 um 11:48:
One thing that I'm missing from the PEP is the C side of things, though. How are C extension types going to implement this?
Will there be two new slot methods for them? Something like (but better named than) "mp_subscript_kw()" and "mp_ass_subscript_kw()"?
Would the extension types signal their availability with a new "TP_*" class feature flag?
Are the slot methods going to use the vectorcall calling convention, i.e. pass the keyword names as a tuple, or should they accept (and thus, require the overhead of) a Python dict as argument?
This design must clearly be part of the PEP. What's the current status of the discussion there?
So, should I conclude from the silence that there has not been any discussion of this so far?
That's unfortunately right. I think the main reason is that the PEP authors are not very familiar with the C API. You ask some good questions about the style of the API. Right now the only bit of C API in the PEP is the proposal to add `PyObject_GetItemWithKeywords()` and friends, which do take a dict as argument. We could define new slots that have the same signature, or we could add a new flag to the type object's tp_flags field stating that the existing slots take a dict as argument. I'm not sure that we need a vector style API here -- I have a feeling that this isn't going to be performance critical. (Certainly not for most people, and not for the PEP authors' use cases.) If you really want to help, maybe you can do a little prototype coding and propose a specific API? -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>