This has been discussed. The current consensus approach would be to keep the index argument as a single value, while passing keyword indices as keyword arguments. So something like:
def __getitem__(self, index, **kwargs):
Classes that don't want to handle keyword indices just don't have to implement handling for keyword arguments.
This has the advantage of making it easier to hard-code dimension labels if desired. I haven't heard from pandas devs on this, but since dataframes are fixed at two dimensions, "row" and "column", you could potentially implement something like:
def __getitem__(self, index, row=None, column=None):
No special work would be needed to check whether the keywords match, which would be needed with a new class.
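A rough sketch of what that could look like. The proposed obj[..., key=value] syntax is not valid Python, so the keyword form is simulated here by calling __getitem__ directly. The Frame and Plain classes are hypothetical, and passing None as the positional index for a keyword-only lookup is an assumption of this sketch, not something the thread settled.

```python
class Frame:
    """Hypothetical two-dimensional table with fixed labels 'row' and 'column'."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def __getitem__(self, index, row=None, column=None):
        # Positional form: index is a (row, column) tuple, as it is today.
        if index is not None:
            row, column = index
        return self._rows[row][column]


class Plain:
    """A class that opts out: it declares no keyword parameters at all."""

    def __getitem__(self, index):
        return index


frame = Frame([[1, 2], [3, 4]])
by_position = frame[1, 0]
# Stand-in for the proposed frame[row=1, column=0].
by_label = frame.__getitem__(None, row=1, column=0)

try:
    Plain().__getitem__(0, row=1)  # stand-in for Plain()[0, row=1]
    rejected = False
except TypeError:
    rejected = True  # the normal argument machinery rejects the keyword
```

Both lookups return the same cell, and the class without keyword parameters fails with an ordinary TypeError, with no keyword-matching code written by hand.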
On Thu, Aug 20, 2020 at 1:42 PM Christopher Barker firstname.lastname@example.org wrote:
I have not fully thought this out yet, but while my first instinct was to agree with others to “just use the calling conventions we already have”, there is a wrinkle:
Current indexing behavior is an oddball now:
( you all know this, but I think it’s helpful to lay it out)
The signature of __getitem__ is always:
def __getitem__(self, index):
If you pass a single item:
then that thing gets assigned to index, whatever it is.
If you pass more than one item:
    an_object[thing1, thing2, thing3]
then a tuple of those gets assigned to index, and the implementation of __getitem__ has to parse that out itself. That is different from "usual" argument passing, where it would always be a tuple, whether it was length one or not (and to get that tuple, you'd need to use *args, or specify a particular number of arguments).
So: if we want to maintain backward compatibility, we *can't* use the regular argument passing approach; it will have to be a slightly odd special case.
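The current behaviour is easy to check. A small illustration, where Probe and collect are throwaway names made up for this example:

```python
class Probe:
    """Returns whatever the interpreter hands to __getitem__."""

    def __getitem__(self, index):
        return index


def collect(*args):
    """Ordinary call: positional arguments always arrive as a tuple."""
    return args


probe = Probe()
single = probe[5]          # the object itself, not a 1-tuple
several = probe[1, 2, 3]   # packed into a tuple by the subscript syntax
called_single = collect(5)  # a function call gives a 1-tuple instead
```

Here single is 5 while called_single is (5,), which is the mismatch with regular argument passing.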
Which brings us to (what I think is) Jonathan's idea: we keep the idea that __getitem__ always accepts a single argument.
Right now it's either a single object or a tuple of objects. If we extend that, then it's either a single object, or a tuple of objects, or a new "keywords" object that would hold both the positional and keyword "arguments", so any old code that did something like:
    def __getitem__(self, index):
        if isinstance(index, tuple):
            handle_the_tuple_of_indices(index)
        else:
            handle_a_single_index(index)
would still work as it does now.
and if something wanted to implement keywords, it could add a clause:
        elif isinstance(index, keywords_object):
            handle_all_the_args_and_keywords(index)
and away we go.
TL;DR: Indexing would now create one of:
- a single item
- a tuple of items
- a keywords_object of positional and keyword arguments
And just like we can now create a tuple of indices and pass them in as a single object, we could also create a keywords_object some other way and pass that in directly.
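To make the three cases concrete, here is a sketch under the assumption that the keywords object is an ordinary class that can be built by hand. KeywordsObject and Grid are hypothetical names; nothing like them exists in the language today, so the object is created explicitly and passed in as a single index.

```python
class KeywordsObject:
    """Hypothetical holder for the positional and keyword parts of an index."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class Grid:
    def __getitem__(self, index):
        # Existing branches are untouched; keyword support is one extra clause.
        if isinstance(index, tuple):
            return ("tuple", index)
        elif isinstance(index, KeywordsObject):
            return ("keywords", index.args, index.kwargs)
        else:
            return ("single", index)


grid = Grid()
as_single = grid[3]
as_tuple = grid[1, 2]
as_keywords = grid[KeywordsObject(1, 2, x=3)]
```

Old code that only knows about the first two branches keeps working, because a plain index or a tuple never reaches the new clause.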
If we did not do this, could we use:
and if *args was length-1, it would get extracted from the tuple? Or would the zeroth item of *args always get extracted from the tuple?
So creating a new object to hold the arguments of an indexing operation is a bit awkward, yes, but more consistent with how it currently works.
On Thu, Aug 20, 2020 at 9:55 AM Jonathan Fine email@example.com wrote:
It has the same capabilities, the question is whether it has any
additional abilities that would justify the added complexity.
The most obvious additional ability is that always
    >>> d[SOME_EXPRESSION]
is equivalent to
    >>> d[key]
for a suitable key.
This is a capability that we already have, which would sometimes be lost under the scheme you support. Also lost would be the equivalence between
    val = d[key]
    getter = operator.itemgetter(key)
    val = getter(d)
More exactly, sometimes it wouldn't be possible to find and use a key. Docs would have to be changed. See: https://docs.python.org/3/library/operator.html#operator.itemgetter
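For reference, the equivalence as it holds today, using only the standard library. The dictionary contents are made up for illustration:

```python
import operator

d = {"a": 1, ("x", "y"): 2}

# Any subscript expression can be captured as one key object...
key = ("x", "y")
val = d[key]

# ...and that same object can be handed to operator.itemgetter.
getter = operator.itemgetter(key)
val_via_getter = getter(d)
```

With a single argument, itemgetter looks up exactly that object, so the tuple key is passed through unchanged and both lookups agree.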
As I understand it, xarray uses dimension names to slice data. Here's an example from
http://xarray.pydata.org/en/stable/indexing.html#indexing-with-dimension-nam...
    >>> da[dict(space=0, time=slice(None, 2))]
Presumably, this would be replaced by something like
    >>> da[space=0, time=:2]
Now, the commands
    da[space=0, time=:2]
    da[space=0, time=:2] = True
    del da[space=0, time=:2]
would at the beginning of the call, presumably, do the same processing on the keyword arguments. (Let this stand for a wide range of examples.)
It is arguable that making it easier for the implementer of type(da) to do all that processing in the same place would be a REDUCTION of complexity. Allowing the processing to produce an intermediate object, say
    >>> key = dict(space=0, time=slice(None, 2))
would help here.
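A sketch of what "all that processing in the same place" might mean, using the dict form of the key that already works today. LabelledStore and its _resolve helper are hypothetical and are not xarray's implementation; they only show get, set and delete sharing one key-handling routine.

```python
class LabelledStore:
    """Hypothetical 2-D store addressed by a dict of dimension names."""

    dims = ("space", "time")

    def __init__(self, n_space, n_time):
        self._shape = (n_space, n_time)
        self._cells = {(s, t): 0 for s in range(n_space) for t in range(n_time)}

    def _resolve(self, key):
        # The single place where the key is validated and expanded.
        unknown = set(key) - set(self.dims)
        if unknown:
            raise KeyError(f"unknown dimensions: {sorted(unknown)}")
        axes = []
        for dim, size in zip(self.dims, self._shape):
            sel = key.get(dim, slice(None))  # a missing dimension selects all
            positions = range(size)
            axes.append(positions[sel] if isinstance(sel, slice) else [positions[sel]])
        return [(s, t) for s in axes[0] for t in axes[1]]

    def __getitem__(self, key):
        return [self._cells[c] for c in self._resolve(key) if c in self._cells]

    def __setitem__(self, key, value):
        for c in self._resolve(key):
            self._cells[c] = value

    def __delitem__(self, key):
        for c in self._resolve(key):
            self._cells.pop(c, None)


da = LabelledStore(2, 3)
key = dict(space=0, time=slice(None, 2))
da[key] = True
selected = da[key]
del da[key]
remaining = len(da._cells)
```

All three operations go through _resolve, so a change to how dimension names are interpreted is made once.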
Real world examples are required, I think, to ground any discussions of complexity and simplicity. We want to optimise for what people do, for the problems they face. And this is a new feature.
We have a perfectly good way of handling keywords, so it is up to you to
explain why we shouldn't use it.
The scheme you support does not distinguish
    >>> d[1, 2, x=3, y=4]
    >>> d[(1, 2), x=3, y=4]
I don't regard that as being perfectly good.
In addition, I would like
    >>> d = dict()
    >>> d[x=1, y=2] = 5
to work. It works out-of-the-box for my scheme. It can be made to work with a subclass of dict for the D'Aprano scheme.
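A sketch of how a single key object could provide both properties with an unmodified dict. The Key class below is hypothetical and written only for this illustration; it is not the kwkey API. Since the d[x=1, y=2] syntax does not exist, the key is constructed explicitly.

```python
class Key:
    """Hypothetical hashable key holding positional and keyword parts."""

    def __init__(self, *args, **kwargs):
        self._args = args
        # Sort so that keyword order does not affect equality or hashing.
        self._kwargs = tuple(sorted(kwargs.items()))

    def _parts(self):
        return (self._args, self._kwargs)

    def __eq__(self, other):
        if not isinstance(other, Key):
            return NotImplemented
        return self._parts() == other._parts()

    def __hash__(self):
        return hash(self._parts())


d = dict()
d[Key(x=1, y=2)] = 5            # stand-in for d[x=1, y=2] = 5
looked_up = d[Key(y=2, x=1)]

# The two spellings from the example stay distinguishable.
distinct = Key(1, 2, x=3, y=4) != Key((1, 2), x=3, y=4)
```

Because the positional parts are stored as received, (1, 2) passed as one argument is kept apart from 1 and 2 passed separately.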
I think that is enough for now.
I'd prefer to discuss this further by writing Python modules that contain code that can be tested. The testing should cover both the technical correctness and the user experience. To support this I intend now to focus on the next version of kwkey. https://pypi.org/project/kwkey/