PEP 472 - regarding d[x=1, y=2] and similar
I'd like to sound out consensus regarding mapping access, where none of the keys are positional. In particular, I suggest that PEP 472 allow syntax and semantics such as

>>> d[x=1, y=2] = 42
>>> d[x=1, y=2]
42

and ask whether the class

>>> X = type(d)

should be part of standard Python.

NO ARGUMENTS

At present, item access requires an argument, as a matter of syntax.

>>> d[]
SyntaxError: invalid syntax

Compare this to

>>> fn()
NameError: name 'fn' is not defined

I'd like d[] to become valid syntax.

SEMANTICS OF NO ARGUMENTS

I can see two basic ways of allowing no arguments. One is for the interpreter to construct an object that is the argument passed to __getitem__ and so forth. The other is to not pass an argument at all. I see this as a secondary question.

NO POSITIONAL ARGUMENTS

I'd like

>>> d[x=1, y=2]

to be valid syntax. It's not clear to me that all agree with this. Even if there are no objections, I'd like positive confirmation.

CONSEQUENCE

Suppose d[x=1, y=2] is valid syntax. If so, then there is I think consensus that

>>> d[x=1, y=2] = 42
>>> d[x=1, y=2]
42

can be implemented, where d is an instance of a suitable class. Otherwise, what's the point?
QUESTION

Suppose we have

>>> d[x=1, y=2] = 42
>>> d[x=1, y=2]
42

where d is an instance of a suitable class X that has no special knowledge of keywords. In other words, we also have for example

>>> d[a='alpha', g='gamma', z=12] = 'cheese'
>>> d[a='alpha', g='gamma', z=12]
'cheese'

My question is this: Should such a class

X = type(d)

be part of standard Python, as part of PEP 472? (My answer is: Yes, it should be in standard Python.)
At this time, I'm interested in canvassing opinions. Discussion of different opinions perhaps can take place later, or elsewhere. My main concern is to know if there is at present a rough consensus regarding the above. -- Jonathan
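[For context, the "construct an object" option mirrors what subscription already does for positional keys. A minimal sketch of current semantics (Recorder is an illustrative name, not part of any proposal): __getitem__ always receives exactly one key object, however the subscript is spelled.]

```python
# Inspecting current subscription semantics: __getitem__ always
# receives exactly one object as the key.
class Recorder:
    def __getitem__(self, key):
        return key  # hand back whatever object the interpreter built

d = Recorder()
assert d[1] == 1                  # a single index is passed as-is
assert d[1, 2] == (1, 2)          # comma-separated indices arrive as one tuple
assert d[1:2] == slice(1, 2)      # slice syntax builds a slice object
```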
On 14/08/20 10:03 pm, Jonathan Fine wrote:
NO POSITIONAL ARGUMENTS I'd like >>> d[x=1, y=2] to be valid syntax. It's not clear to me that all agree with this.
If keywords are to be allowed, it seems reasonable to me that this should be legal.
>>> d[x=1, y=2] = 42
>>> d[x=1, y=2]
42
>>> d[a='alpha', g='gamma', z=12] = 'cheese'
>>> d[a='alpha', g='gamma', z=12]
'cheese'
My question is this: Should such a class ... be part of standard Python,
Do you have any use cases in mind for this? To justify being built in, it would need to have a wide range of uses. -- Greg
Hi Greg

Thank you for your support for d[x=1, y=2] being valid syntax.

You ask if I have any use cases in mind for a general keyword key class being part of standard Python. Good question.

Anyone who is experimenting with keyword keys would, I think, appreciate having something they can use straight away. Thus, I think, any use case for PEP 472 is also a use case for the general keyword class I'm suggesting. No use cases for PEP 472 would of course be fatal.

Storing function call results would be a use case. For this, look at the implementation of functools.lru_cache.

I am keen to develop, with others, examples of how PEP 472 could help us in real-world programming. Even if convinced that PEP 472 is a good idea, the examples would help us get the details right, and also help with formal and informal documentation.

For a specific example, a simple ad hoc way of recording data. For example

>>> height[x=10, y=14] = 2010

We can already use dictionaries in this way, as in

>>> height[10, 14] = 2010

So this is a bit like using named tuples instead of tuples.

I think that's enough for now. -- Jonathan

On Fri, Aug 14, 2020 at 12:28 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 14/08/20 10:03 pm, Jonathan Fine wrote:
NO POSITIONAL ARGUMENTS I'd like >>> d[x=1, y=2] to be valid syntax. It's not clear to me that all agree with this.
If keywords are to be allowed, it seems reasonable to me that this should be legal.
>>> d[x=1, y=2] = 42
>>> d[x=1, y=2]
42
>>> d[a='alpha', g='gamma', z=12] = 'cheese'
>>> d[a='alpha', g='gamma', z=12]
'cheese'
My question is this: Should such a class ... be part of standard Python,
Do you have any use cases in mind for this?
To justify being built in, it would need to have a wide range of uses.
-- Greg
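[The named-tuple comparison in Jonathan's message can be tried today with an ordinary dict; Point is an illustrative name, not part of any proposal.]

```python
# Using a named tuple as a keyword-style key with an ordinary dict.
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
height = {}
height[Point(x=10, y=14)] = 2010       # cf. the proposed height[x=10, y=14]
assert height[Point(10, 14)] == 2010   # named and positional forms agree
```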
It seems very obvious to me that keyword-only subscripts should be valid, as Greg Ewing said, if keyword args are added. I'm less sure about no arguments, but perhaps.

On Fri, Aug 14, 2020, 8:11 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Storing function call results would be a use case. For this, look at the implementation of functools.lru_cache.
So maybe something like this?:

from functools import storagefunc

@storagefunc
def fibonacci(*, n):
    return fibonacci[n=n-1] + fibonacci[n=n-2]

fibonacci[n=1] = 0
fibonacci[n=2] = 1
fibonacci[n=4]
2
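[Since keyword subscripts are not yet valid syntax, here is a hedged, runnable emulation of the hypothetical storagefunc idea, passing the keywords as a dict literal in place of the proposed d[n=1] form. storagefunc does not exist in functools; this sketch defines its own.]

```python
# Hypothetical sketch: a storagefunc-style decorator with today's syntax.
# Keywords are passed as a dict literal, frozen into a hashable cache key.
class storagefunc:
    def __init__(self, func):
        self.func = func
        self.cache = {}

    @staticmethod
    def _freeze(kwargs):
        # Sort items so {'a': 1, 'b': 2} and {'b': 2, 'a': 1} share a key.
        return tuple(sorted(kwargs.items()))

    def __getitem__(self, kwargs):
        key = self._freeze(kwargs)
        if key not in self.cache:
            self.cache[key] = self.func(**kwargs)
        return self.cache[key]

    def __setitem__(self, kwargs, value):
        self.cache[self._freeze(kwargs)] = value

@storagefunc
def fibonacci(n):
    return fibonacci[{'n': n - 1}] + fibonacci[{'n': n - 2}]

fibonacci[{'n': 1}] = 0
fibonacci[{'n': 2}] = 1
assert fibonacci[{'n': 4}] == 2
```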
Of course this example function may not absolutely require kwd-only args, but any other example that can be contrived probably doesn't absolutely require them either.

For a specific example, a simple ad hoc way of recording data. For example
>>> height[x=10, y=14] = 2010
We can already use dictionaries in this way, as in >>> height[10, 14] = 2010
So this is a bit like using named tuples instead of tuples.
This seems like another nice usage to me. But isn't it important to identify what the underlying storage is, and the rest of the API, before I would want to use something like this? After all, I can do these with both a list and a dictionary:

mylist[0]
mydict[0]

But the one I want to use is determined somewhat by what else I intend to do with it, and how it behaves under the hood... Use a dictionary for very fast item access, for example. A list or tuple for ordered things.
On Fri, Aug 14, 2020 at 08:45:44AM -0400, Ricky Teachey wrote:
It seems very obvious to me that kwd args only should be valid, as Greg Ewing said, if kwd args are added.
Absolutely. We shouldn't even need to debate this. Would we consider a language change that forced every single function to accept arbitrary keyword-only arguments whether they made sense to the function or not? Of course we would not. So why are we even considering a language change to force every single subscriptable object to accept arbitrary keyword-only arguments unless the maintainer changes their class to explicitly reject them? If you want to use an arbitrary bunch of key:value pairs as a dict key, you should bundle them into a frozen dict or frozen SimpleNamespace object, and pass that as your key. (I would support the addition of either or both of those frozen types to the standard library.)
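[The frozen-dict key suggested above can be sketched in a few lines. FrozenDict is a hypothetical name, not a stdlib class: it is hashable, so it works as an ordinary mapping key today.]

```python
# A minimal sketch of a "frozen dict" key (hypothetical class, not in
# the stdlib): immutable in spirit, hashable, usable as a dict key.
class FrozenDict:
    def __init__(self, **kwargs):
        # Sort so keyword order doesn't affect equality or hashing.
        self._items = tuple(sorted(kwargs.items()))

    def __hash__(self):
        return hash(self._items)

    def __eq__(self, other):
        return isinstance(other, FrozenDict) and self._items == other._items

d = {}
d[FrozenDict(x=1, y=2)] = 42
assert d[FrozenDict(y=2, x=1)] == 42   # keyword order is irrelevant
```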
I'm less sure about no arguments, but perhaps.
In mathematics notation, subscripts can be values, or they can be descriptions, and sometimes a combination of the two. For example:

x[i]
x[i even]

(Excuse the lack of WYSIWYG display, I'm limited to plain unformatted text here, but I trust the intention is obvious.)

Python subscripting supports the first version, but not the second. Allowing keyword args would go some way to rectifying that; see PEP 472 for motivation and use-cases.

But with *no subscript at all*, you just have x. Unlike function calls with no arguments, I maintain that `x[]` is meaningless and, in practice, would only be an error. Either a coding error or a conceptual error. That should remain an error. -- Steven
Hi Steven

You wrote:

why are we even considering a language change to force every single subscriptable object to accept arbitrary keyword-only arguments unless the maintainer changes their class to explicitly reject them?

I think there might be a misunderstanding here. I'd like to see an example of where such a change would be required (see my conclusion below).

Recall that my proposal implies

>>> d = dict()
>>> d[a=1, b=2] = 42
>>> d[a=1, b=2]
42

Recall that your proposal implies

>>> d = dict()
>>> d[a=1, b=2] = 42
TypeError: wrapper __setitem__ doesn't take keyword arguments

Your statement above helps me understand the motivation for your proposal, for which I'm grateful. However, there might be a misunderstanding. I'd like now to show you a consequence of my proposal.

First consider

>>> lst = []
>>> lst['a']
TypeError: list indices must be integers or slices, not str

A dict will accept any hashable object as the key, but every list raises a type error if a string is passed as the index. No special action is required for the list to reject a string as the index.

Let's see what happens when we pass a keyword key to a list.

>>> from kwkey import o
>>> lst = []
>>> lst[o(a=1)]
TypeError: list indices must be integers or slices, not K

My proposal implies that the following be equivalent

>>> anything[o(1, 2, a=3, b=4)]
>>> anything[1, 2, a=3, b=4]

for a suitable definition of 'o'. And kwkey I believe will provide such an 'o' (once the bugs have been removed - see below).

CONCLUSION

You wrote
why are we even considering a language change to force every single subscriptable object to accept arbitrary keyword-only arguments unless the maintainer changes their class to explicitly reject them?
I don't see how this is a consequence of my proposal. To help me understand, please provide an example such as

>>> something[o(1, 2, a=3, b=4)] = 42
>>> something[o(1, 2, a=3, b=4)]

whose behaviour is unexpected or unwelcome. (You might want to wait until I've fixed the bug stated below.)

CONFESSION

There's a serious bug in kwkey. We don't have

>>> o(key) == key
True

but we should have that. Please accept my apologies. I've reported the bug, and there'll be a new release next week.

https://github.com/jfine2358/python-kwkey/issues/2

-- Jonathan
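[A hedged sketch, not the real kwkey implementation, of what an `o` satisfying o(key) == key might look like: with no keywords and a single positional argument, the key object reduces to the bare key.]

```python
# Hypothetical keyword-key class, sketching how kwkey's `o` could
# satisfy o(key) == key. Not the actual kwkey code.
class K:
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def _reduced(self):
        # One positional argument and no keywords reduces to the bare key,
        # so o(key) behaves exactly like key.
        if not self.kwargs and len(self.args) == 1:
            return self.args[0]
        return (self.args, tuple(sorted(self.kwargs.items())))

    def __eq__(self, other):
        other = other._reduced() if isinstance(other, K) else other
        return self._reduced() == other

    def __hash__(self):
        return hash(self._reduced())

def o(*args, **kwargs):
    return K(*args, **kwargs)

assert o('spam') == 'spam'   # o(key) == key now holds
assert o(a=1) == o(a=1)
assert o(a=1) != o(a=2)
```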
On Sat, Aug 15, 2020, at 01:55, Steven D'Aprano wrote:
Of course we would not. So why are we even considering a language change to force every single subscriptable object to accept arbitrary keyword-only arguments unless the maintainer changes their class to explicitly reject them?
what? why would they have to "explicitly" reject them? it seems like the most obvious way to implement this would be to just... pass the keyword arguments. if you pass an argument to a function that doesn't have them in its signature it's a TypeError. Thus *automatically* "rejecting" them, with no effort required by the author. right? if that's not the case, what exactly is being proposed?
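[That automatic rejection can be demonstrated today by calling the dunder directly, since keyword subscripts are not yet syntax: the normal argument-passing machinery raises TypeError for unexpected keywords, with no effort from the class author.]

```python
# A class whose __setitem__ takes no keyword arguments.
class Plain:
    def __setitem__(self, key, value):
        pass

d = Plain()
d[0] = 'fine'   # ordinary subscript assignment works

# Simulate what d[0, a=1] = 'x' might dispatch to under the proposal.
rejected = False
try:
    Plain.__setitem__(d, 0, 'x', a=1)
except TypeError:
    rejected = True   # unexpected keyword automatically rejected
assert rejected
```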
On Fri, 14 Aug 2020 at 13:12, Jonathan Fine <jfine2358@gmail.com> wrote:
Anyone who is experimenting with keyword keys would, I think, appreciate having something they can use straight away. Thus, I think, any use case for PEP 472 is also a use case for the general keyword class I'm suggesting. No use cases for PEP 472 would of course be fatal.
When experimenting, I routinely write throwaway classes and functions like

def f(*args, **kw):
    print(f"In f, {args=} {kw=}")

I don't see why writing

class A:
    def __getitem__(self, *args, **kw):
        print(f"Getting {args=}, {kw=}")

would be any more onerous. A stdlib class that used the new syntax should stand on its own merits, not as "something people can use to experiment with".

Paul
My own personal use for this would be for generating anonymous protocols and dataclasses:

class T(Protocol):
    x: int
    y: str

# with some abuse of notation, obviously these would generate unique types
assert T == Struct[x=int, y=str]

# similarly
@dataclass
class S:
    x: int
    y: str

assert S == Struct[x=int, y=str]

I often want to create such types “on the fly” without needing to put a name on them. Now as I don’t need mixed keyword / positional arguments I can achieve this with:

# K = dict
Struct[K(x=int, y=str)]

But that costs 3 more keystrokes and is certainly less beautiful.

While I would not personally use this, I think a real killer app would be slicing named axes, as the slice syntax is exclusive to __getitem__ and hence can not leverage the dict trick.

Caleb

On Fri, Aug 14, 2020 at 6:30 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Fri, 14 Aug 2020 at 13:12, Jonathan Fine <jfine2358@gmail.com> wrote:
Anyone who is experimenting with keyword keys would, I think, appreciate having something they can use straight away. Thus, I think, any use case for PEP 472 is also a use case for the general keyword class I'm suggesting. No use cases for PEP 472 would of course be fatal.
When experimenting, I routinely write throwaway classes and functions like
def f(*args, **kw):
    print(f"In f, {args=} {kw=}")
I don't see why writing
class A:
    def __getitem__(self, *args, **kw):
        print(f"Getting {args=}, {kw=}")
would be any more onerous. A stdlib class that used the new syntax should stand on its own merits, not as "something people can use to experiment with".
Paul _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/H7KIXV... Code of Conduct: http://python.org/psf/codeofconduct/
On Fri, Aug 14, 2020, 7:39 PM Caleb Donovick
class T(Protocol):
    x: int
    y: str

# with some abuse of notation, obviously these would generate unique types
assert T == Struct[x=int, y=str]
I don't see what that can possibly get you that `Struct(x=int, y=str)` doesn't.
I'm +0 on the idea, but I don't think "square brackets look nicer" is sufficient reason for a change.
I don't see what that can possibly get you that `Struct(x=int, y=str)` doesn't.
Using `Struct(x=int, y=str)` requires a metaclass, where `Struct[x=int, y=str]` does not. On Fri, Aug 14, 2020 at 4:45 PM David Mertz <mertz@gnosis.cx> wrote:
On Fri, Aug 14, 2020, 7:39 PM Caleb Donovick
class T(Protocol):
    x: int
    y: str

# with some abuse of notation, obviously these would generate unique types
assert T == Struct[x=int, y=str]
I don't see what that can possibly get you that `Struct(x=int, y=str)` doesn't.
I'm +0 on the idea, but I don't think "square brackets look nicer" is sufficient reason for a change.
On Fri, Aug 14, 2020, 7:53 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
I don't see what that can possibly get you that `Struct(x=int, y=str)` doesn't.
Using `Struct(x=int, y=str)` requires a metaclass, where `Struct[x=int, y=str]` does not.
Why would it require a metaclass? Rather than just:

class Struct:
    def __init__(self, **kws): ...

Yes, that won't get you the MRO for T, but neither can __getitem__() on an entirely different object Struct. A class factory is also an option, of course.
Why would it require a metaclass? Rather than just: ...
Because I want the following to be true:

```
x = Struct[x=int, y=str](...)
assert isinstance(x, Struct)
assert isinstance(x, Struct[x=int, y=str])
assert not isinstance(x, Struct[x=int, y=int])
```

On Fri, Aug 14, 2020 at 5:27 PM David Mertz <mertz@gnosis.cx> wrote:
On Fri, Aug 14, 2020, 7:53 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
I don't see what that can possibly get you that `Struct(x=int, y=str)` doesn't.
Using `Struct(x=int, y=str)` requires a metaclass, where `Struct[x=int, y=str]` does not.
Why would it require a metaclass? Rather than just:
class Struct:
    def __init__(self, **kws): ...
Yes, that won't get you the MRO for T, but neither can __getitem__() on an entirely different object Struct.
A class factory is also an option, of course.
On Sat, Aug 15, 2020 at 4:38 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
Why would it require a metaclass? Rather than just: ...
Because I want the following to be true:

x = Struct[x=int, y=str](...)
assert isinstance(x, Struct)
assert isinstance(x, Struct[x=int, y=str])
assert not isinstance(x, Struct[x=int, y=int])
Hmm... OK, that's an interesting desire. How do square brackets get you any closer to that?

If this proposal, in whatever variation, is adopted, `Struct[x=int, y=str]` is going to be some kind of call to .__getitem__(). There is some debate about exactly how the information gets passed into the method, but we can bracket that for this question. One way or another, positional and named arguments are available to this future .__getitem__().

So how do you make this true:

assert isinstance(x, Struct.__getitem__(x=int, y=str))
assert not isinstance(x, Struct.__getitem__(x=int, y=int))

For demonstration, maybe it's easiest just to give a new name to the hypothetical method. Say `Struct.bracket(...)`. It's not obvious to me how you'll get the behavior you want.

... and if you CAN get the behavior, why can't we name this method .__call__()?

I'm not really sure what kind of thing Struct is meant to be, as well. Is it a class? An instance? A class factory? A metaclass?

-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
Hmm... OK, that's an interesting desire. How do square brackets get you any closer to that?
I use `__class_getitem__` as a memoized type factory that builds a new subtype when it's called, or returns the cached type. You are correct that I could name it as something else; there is nothing special about the square brackets other than notation, but having a visual distinction between type creation and instantiation is useful.

On Sat, Aug 15, 2020 at 3:44 PM David Mertz <mertz@gnosis.cx> wrote:
On Sat, Aug 15, 2020 at 4:38 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
Why would it require a metaclass? Rather than just: ...
Because I want the following to be true:

x = Struct[x=int, y=str](...)
assert isinstance(x, Struct)
assert isinstance(x, Struct[x=int, y=str])
assert not isinstance(x, Struct[x=int, y=int])
Hmm... OK, that's an interesting desire. How do square brackets get you any closer to that?
If this proposal, in whatever variation, is adopted, `Struct[x=int, y=str]` is going to be some kind of call to .__getitem__(). There is some debate about exactly how the information gets passed into the method, but we can bracket that for this question. One way or another, positional and named arguments are available to this future .__getitem__().
So how do you make this true:
assert isinstance(x, Struct.__getitem__(x=int, y=str))
assert not isinstance(x, Struct.__getitem__(x=int, y=int))
For demonstration, maybe it's easiest just to give a new name to the hypothetical method. Say `Struct.bracket(...)`. It's not obvious to me how you'll get the behavior you want.
... and if you CAN get the behavior, why can't we name this method .__call__()?
I'm not really sure what kind of thing Struct is meant to be, as well. Is it a class? An instance? A class factory? A metaclass?
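[The memoized __class_getitem__ factory Caleb describes can be sketched with today's syntax, using a dict subscript in place of the proposed keyword form. Struct here is an illustrative stand-in, not his actual code: identical field specs yield the identical cached subtype, so the isinstance checks behave as he wants.]

```python
# Hypothetical sketch of a memoized type factory via __class_getitem__.
# Today the subscript must be an ordinary object (here a dict literal)
# rather than the proposed Struct[x=int, y=str].
class Struct:
    _cache = {}

    def __class_getitem__(cls, fields):
        # Freeze the field spec into a hashable cache key.
        key = tuple(fields.items()) if isinstance(fields, dict) else fields
        if key not in cls._cache:
            # Build a fresh subtype for this spec and memoize it.
            cls._cache[key] = type(f'Struct{list(key)}', (cls,), {'_fields': key})
        return cls._cache[key]

S1 = Struct[{'x': int, 'y': str}]
S2 = Struct[{'x': int, 'y': str}]
S3 = Struct[{'x': int, 'y': int}]
x = S1()
assert S1 is S2                 # identical spec -> identical cached type
assert isinstance(x, Struct)
assert isinstance(x, S2)
assert not isinstance(x, S3)
```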
On Fri, Aug 14, 2020, 7:45 PM David Mertz <mertz@gnosis.cx> wrote:
On Fri, Aug 14, 2020, 7:39 PM Caleb Donovick
class T(Protocol):
    x: int
    y: str

# with some abuse of notation, obviously these would generate unique types
assert T == Struct[x=int, y=str]
I don't see what that can possibly get you that `Struct(x=int, y=str)` doesn't.
I'm +0 on the idea, but I don't think "square brackets look nicer" is sufficient reason for a change.
One problem is that type hint creation has been extended to built-ins in Python 3.9, so that you do not have to import Dict, List, et al anymore. Without kwd args inside [ ], you would not be able to do this:

Vector = dict[i=float, j=float]

...but for obvious reasons, call syntax using built-ins to create custom type hints isn't an option:

dict(i=float, j=float)  # this syntax isn't available

So the improvement allowing usage of built-ins for typing is somewhat negated. Instead you would have to import Dict from typing:

Vector = typing.Dict(i=float, j=float)

To me it is instructive that the line of code above currently causes a TypeError. For some reason, the mypy team decided not to go this direction, and instead in Python 3.8 introduced typing.TypedDict:

class Vector(TypedDict):
    i: float
    j: float

I suppose type hint creation could be extended to allow syntax you are proposing, like Dict(i=float, j=float). But it certainly appears to me that it was rejected for a reason, though I can't say what that reason is (I'm sure someone else could).
On Fri, Aug 14, 2020 at 08:58:36PM -0400, Ricky Teachey wrote:
One problem is type hint creation has been extended to built-ins in python 3.9, so that you do not have to import Dict, List, et al anymore.
Without kwd args inside [ ], you would not be able to do this:
Vector = dict[i=float, j=float]
If it were decided that we do want to be able to type hint dicts with keys 'i' and 'j', then there is probably no reason that we couldn't add that to the dict builtin. But given the existence of TypedDict, I think that the typing maintainers don't want this on builtin dict. In any case, concrete changes to any builtin are a separate proposal. Obviously those changes depend on allowing the syntax, but we could allow the syntax without changing any builtins. The use of this proposed syntax for typing is orthogonal to the use of it in other contexts. Typing wants to subscript *types*, while other uses such as those in PEP 472 want to subscript *instances*. Of the two, I personally think subscripting instances is more interesting :-)
dict(i=float, j=float) # this syntax isn't available
Well, it's only not available because it already has a meaning:

py> dict(i=float, j=float)
{'i': <class 'float'>, 'j': <class 'float'>}

-- Steven
On Fri, Aug 14, 2020 at 04:07:33PM -0700, Caleb Donovick wrote:
My own personal use for this would be for generating anonymous protocols and dataclasses:
class T(Protocol): x: int y: str # with some abuse of notation obviously these would generate unique typesassert T == Struct[x=int, y=str]
I don't know how to interpret these examples. What's Protocol and where does it come from? What's Struct?

As I recall, one of the motivations for re-visiting PEP 472 is to allow such keyword notation in type hints, so that we could write

Struct[x=int, y=str]

in a type hint and have it mean a struct with fields x (an int) and y (a str). I'm not sure whether that use in type hinting would allow the use of this Struct to create anonymous classes. I suppose it would, but I'm not expert enough on type hints to be sure.

But assuming the two uses are compatible, I must say that having the same notation for type-hinting a struct and actually creating an anonymous struct class would be desirable:

def func(widget: Struct[x=int, y=str]) -> gadget:
    pass

# Later:
MyWidget = Struct[x=int, y=str]
func(MyWidget(19, 'hello'))

I really like the look of that, and I think that having the Struct call use the same subscript notation as the Struct type hint is a plus.
While I would not personally use this, I think a real killer app would be slicing named axes, as the slice syntax is exclusive to __getitem__ and hence can not leverage the dict trick.
This is one of the motivating use-cases of PEP 472. -- Steven
I don't know how to interpret these examples. What's Protocol and where does it come from? What's Struct?
`Protocol` comes from `typing`. `Struct` is my own class which generates anonymous dataclasses and protocols, as you gathered (unfortunately I currently have two versions, one for building the protocol and one for building the dataclass, but that's because of stupid engineering requirements).

On Fri, Aug 14, 2020 at 11:14 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Aug 14, 2020 at 04:07:33PM -0700, Caleb Donovick wrote:
My own personal use for this would be for generating anonymous protocols and dataclasses:
class T(Protocol):
    x: int
    y: str

# with some abuse of notation, obviously these would generate unique types
assert T == Struct[x=int, y=str]
I don't know how to interpret these examples. What's Protocol and where does it come from? What's Struct?
As I recall, one of the motivations for re-visiting PEP 472 is to allow such keyword notation in type hints, so that we could write
Struct[x=int, y=str]
in a type hint and have it mean a struct with fields x (an int) and y (a str). I'm not sure whether that use in type hinting would allow the use of this Struct to create anonymous classes. I suppose it would, but I'm not expert enough on type hints to be sure.
But assuming the two uses are compatible, I must say that having the same notation for type-hinting a struct and actually creating an anonymous struct class would be desirable:
def func(widget:Struct[x=int, y=str]) -> gadget: pass
# Later: MyWidget = Struct[x=int, y=str]
func(MyWidget(19, 'hello'))
I really like the look of that, and I think that having the Struct call use the same subscript notation as the Struct type hint is a plus.
While I would not personally use this, I think a real killer app would be slicing named axes, as the slice syntax is exclusive to __getitem__ and hence can not leverage the dict trick.
This is one of the motivating use-cases of PEP 472.
-- Steven
On Fri, Aug 14, 2020 at 4:38 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
My own personal use for this would be for generating anonymous protocols and dataclasses:
class T(Protocol):
    x: int
    y: str

# with some abuse of notation, obviously these would generate unique types
assert T == Struct[x=int, y=str]

# similarly
@dataclass
class S:
    x: int
    y: str

assert S == Struct[x=int, y=str]
I often want to create such types “on the fly” without needing to put a name on them.
Now as I don’t need mixed keyword / positional arguments I can achieve this with:
# K = dict
Struct[K(x=int, y=str)]
But that costs 3 more keystrokes and is certainly less beautiful.
While I would not personally use this, I think a real killer app would be slicing named axes, as the slice syntax is exclusive to __getitem__ and hence can not leverage the dict trick.
To me, the main weakness here is that you couldn't move forward with this unless you also got the various static type checkers on board. But I don't think those care much about this use case (an inline notation for what you can already do with a class definition and annotations). And without static checking this isn't going to be very popular. If and when we have `__getitem__` with keyword args we can start thinking about how to best leverage it in type annotations -- I would assume that describing axes of objects like numpy arrays would be the first use case. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
To me, the main weakness here is that you couldn't move forward with this unless you also got the various static type checkers on board. But I don't think those care much about this use case (an inline notation for what you can already do with a class definition and annotations). And without static checking this isn't going to be very popular.
You underestimate my willingness to generate Python files which could be consumed by static checkers via a preprocessing step. Also, my real goal is to abuse type hints for the purposes of my DSL. But DSL is a naughty term on the list so we won't mention that :)

On Sat, Aug 15, 2020 at 4:05 PM Guido van Rossum <guido@python.org> wrote:
On Fri, Aug 14, 2020 at 4:38 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
My own personal use for this would be for generating anonymous protocols and dataclasses:
class T(Protocol):
    x: int
    y: str

# with some abuse of notation, obviously these would generate unique types
assert T == Struct[x=int, y=str]

# similarly
@dataclass
class S:
    x: int
    y: str

assert S == Struct[x=int, y=str]
I often want to create such types “on the fly” without needing to put a name on them.
Now as I don’t need mixed keyword / positional arguments I can achieve this with:
# K = dict
Struct[K(x=int, y=str)]
But that costs 3 more keystrokes and is certainly less beautiful.
While I would not personally use this, I think a real killer app would be slicing named axes, as the slice syntax is exclusive to __getitem__ and hence can not leverage the dict trick.
To me, the main weakness here is that you couldn't move forward with this unless you also got the various static type checkers on board. But I don't think those care much about this use case (an inline notation for what you can already do with a class definition and annotations). And without static checking this isn't going to be very popular.
If and when we have `__getitem__` with keyword args we can start thinking about how to best leverage it in type annotations -- I would assume that describing axes of objects like numpy arrays would be the first use case.
--Guido van Rossum (python.org/~guido)
On Sat, Aug 15, 2020 at 7:14 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
To me, the main weakness here is that you couldn't move forward with this unless you also got the various static type checkers on board. But I don't think those care much about this use case (an inline notation for what you can already do with a class definition and annotations). And without static checking this isn't going to be very popular.
You underestimate my willingness to generate Python files which could be consumed by static checkers via a preprocessing step. Also, my real goal is to abuse type hints for the purposes of my DSL. But DSL is a naughty term on the list so we won't mention that :)
Fine, so the use case you claimed was fiction. If you had just said "DSL" instead of "anonymous protocols and dataclasses" you would have gotten straight to the point, and we would have been talking about whether extended subscription would be useful for DSLs (I can see various use cases), rather than arguing over whether Struct can be spelled with () instead of [] (a total waste of time).

In fact, I don't know why you think "DSL" is a naughty term. (I find "runtime use of annotations" much naughtier. :-)

--Guido van Rossum (python.org/~guido)
Fine, so the use case you claimed was fiction. If you had just said "DSL" instead of "anonymous protocols and dataclasses" you would have gotten straight to the point and we would have been talking about whether extended subscription would be useful for DSLs (I can see various use cases), rather than arguing over whether Struct can be spelled with () instead of [] (a total waste of time).
Oh but the dataclasses and protocols part is not fiction, I am just concerned with mypy being able to leverage my annotations. On Sat, Aug 15, 2020 at 7:27 PM Guido van Rossum <guido@python.org> wrote:
On Sat, Aug 15, 2020 at 7:14 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
To me, the main weakness here is that you couldn't move forward with this unless you also got the various static type checkers on board. But I don't think those care much about this use case (an inline notation for what you can already do with a class definition and annotations). And without static checking this isn't going to be very popular.
You underestimate my willingness to generate Python files which could be consumed by static checkers via a preprocessing step. Also, my real goal is to abuse type hints for the purposes of my DSL. But DSL is a naughty term on the list so we won't mention that :)
Fine, so the use case you claimed was fiction. If you had just said "DSL" instead of "anonymous protocols and dataclasses" you would have gotten straight to the point and we would have been talking about whether extended subscription would be useful for DSLs (I can see various use cases), rather than arguing over whether Struct can be spelled with () instead of [] (a total waste of time).
In fact, I don't know why you think "DSL" is a naughty term. (I find "runtime use of annotations" much naughtier. :-)
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
*unconcerned (sorry for the spam)
On Sat, 15 Aug 2020 19:27:24 -0700 Guido van Rossum <guido@python.org> wrote:
In fact, I don't know why you think "DSL" is a naughty term.
Probably because exploiting Python abstraction facilities to build DSLs has/had long been frowned upon in this community? That was the leitmotiv back when people were marvelling over Ruby's flexibility in the area. You can't fault people for remembering that theme. (Arguably, annotations are a standard-enshrined DSL which even motivated unusual execution rules in the language...) Regards Antoine.
On 17/08/20 9:58 pm, Antoine Pitrou wrote:
Probably because exploiting Python abstraction facilities to build DSLs has/had long been frowned upon in this community? That was the leitmotiv back when people were marvelling over Ruby's flexibility in the area.
As far as I remember, what was frowned on was adding weird and wonderful syntax (e.g. function calls without parens) to Python purely because "it might be useful for DSLs". There's nothing wrong with using existing features to build DSLs (although people might look askance at you if you use them in particularly obscure ways). -- Greg
On Mon, 17 Aug 2020 22:54:32 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
As far as I remember, what was frowned on was adding weird and wonderful syntax (e.g. function calls without parens) to Python purely because "it might be useful for DSLs".
Notice that such weird and wonderful syntax was proposed again by Guido recently... And as I said, annotations have become a DSL in themselves, and several PEPs are dedicated to that DSL.
There's nothing wrong with using existing features to build DSLs (although people might look askance at you if you use them in particularly obscure ways).
Well, we're not talking about an existing feature in this thread. Regards Antoine.
RE: empty subscripts: Should something[] be allowed syntax? TL;DR: no.

In a way this comes down to design philosophy vs implementation. From an implementation perspective, the [] operator is another way to call __getitem__ and __setitem__. And from that perspective, why not have it act like a function call: no arguments, positional arguments, keyword arguments, the whole shebang.

But from a language design perspective, the [] operator is a way to "index" a container -- get part of the container's contents. And from this perspective, no index makes no sense.

I like to think of the dunders as an implementation detail, so no: the square brackets have a distinct meaning that is different from the parentheses, and should not have the same features. Another way to think of it is that we shouldn't encourage folks to "abuse" [] as simply an alternative way to call the object.

-CHB

On Mon, Aug 17, 2020 at 4:06 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/SZQUVE...
Code of Conduct: http://python.org/psf/codeofconduct/
Here are a few words on whether we should allow, and whether we can forbid: >>> something[]

First, in >>> something[x=1] what I call the argv is empty, as it is with >>> something[]

If we represent an empty argv by passing the empty tuple () to __getitem__, then how are >>> something[(), x=1] >>> something[x=1] to be distinguished from each other? Or perhaps they shouldn't be.

Next, if >>> something[*argv] is allowed, then what was a syntax error becomes a run-time error. Put another way, an optimising compiler might want to raise a syntax error for something[*()], although I think that would be wrong. Compare with >>> 1 if True else 1 / 0 1 as it's possible that something[*()] won't be called.

Finally, even if we forbid something[*argv] in order to prevent the empty key, the door is still open. We can use >>> something[**dict()] to access something with the empty key (assuming ** arguments are allowed).

And one more thing. There's rightly a strong association between [] and an empty list literal. To my mind, this makes >>> d[] look very odd. We're expecting something, but there's nothing there. Perhaps d[-] would work better for signifying an empty key. Here, '[-]' is a special syntactic token.

Aside: Consistent with this, we could use {-} for the empty set literal. At present the closest we can do for an empty set literal is >>> {0} - {0} set()

I hope all this helps.

-- Jonathan
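The ambiguity Jonathan raises is concrete under today's rules: comma-separated subscripts are packed into a single tuple, and the empty tuple is already a legal subscript and a legal dict key. A minimal runnable sketch of the current semantics (the Show class is my own, purely for illustration):

```python
class Show:
    """Echo back whatever object the subscript machinery passes in."""
    def __getitem__(self, key):
        return key

s = Show()
print(s[1])        # 1 -- a single index is passed through as-is
print(s[1, 2])     # (1, 2) -- commas pack the indices into one tuple
print(s[()])       # () -- the empty tuple is already a valid subscript today

# So if a future d[x=1] were represented by passing () plus keywords,
# it would collide with the already-meaningful d[(), x=1] spelling.
d = {}
d[()] = 42         # the empty tuple is an ordinary, hashable dict key
print(d[()])       # 42
```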
Ugh. It is becoming gradually clear to me that obj[] should be a syntax error. We should definitely not indulge in obj[-] or {-}. On Mon, Aug 17, 2020 at 11:41 AM Jonathan Fine <jfine2358@gmail.com> wrote:
On 18/08/20 6:37 am, Jonathan Fine wrote:
if >>> something[*argv] is allowed, then what was a syntax error becomes a run-time error.
I don't think so. If argv is empty, this is more akin to

x = ()
something(x)

which we presumably still want to allow. Opponents of a[] aren't trying to prevent an empty tuple being passed as an index, only to disallow that particular way of spelling it.

One possible reason for that is that it's not clear whether it should really mean passing an empty tuple or passing no positional argument at all. However, we still have to grapple with that issue if we want to allow keyword-only indexes, and it seems to me that whatever answer we come up with should also apply to a[]. So that's not really an argument against allowing a[].

Another reason put forward is that someone might accidentally type d[] = something and get an unexpected empty tuple in their dict. But I don't see why that's such a heinous possibility that a syntactic restriction is needed to protect us from it. Normal debugging techniques should be adequate to find it.

So there don't seem to be any real technical arguments against allowing a[], only ones based on tradition ("we never allowed it before") and emotion ("this looks weird and unfamiliar").

-- Greg
On 18/08/20 6:00 am, Christopher Barker wrote:
But from a language design perspective, the [] operator is a way to "index" a container -- get part of the container's contents. And from this perspective, no index makes no sense.
It can make sense if you have a zero-dimensional array. Or as much sense as zero-dimensional arrays make in the first place. Although, as has been pointed out, there is some ambiguity as to whether a[] should mean the scalar inside the zero-dimensional array or a view of the array, so maybe the refusal-to-guess principle argues that it should be disallowed. -- Greg
On Mon, Aug 17, 2020, at 14:00, Christopher Barker wrote:
From an implementation perspective, the [] operator is another way to call __getitem__ and __setitem__. And from that perspective, why not have it act like a function call: no arguments, positional arguments, keyword arguments, the whole shebang.
But from a language design perspective, the [] operator is a way to "index" a container -- get part of the container's contents. And from this perspective, no index makes no sense.
I think it makes perfect sense. Remember that numpy *currently* has a concept of "no index" [an empty tuple is used for this], it results in a view of the whole array, or the content as a scalar for a 0-dimensional array. [I've occasionally been tempted to try the same thing on ctypes objects, incidentally, I think it might be useful to make obj[] equivalent to obj.value]
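The numpy behaviour described here can be checked directly today, using an explicit empty tuple as the index (since a[] itself is a syntax error):

```python
import numpy as np

a = np.array(5)       # a zero-dimensional array
print(a.shape)        # ()
print(a[()])          # 5 -- the empty-tuple index yields the scalar content

b = np.arange(6).reshape(2, 3)
print((b[()] == b).all())   # True -- on an n-d array, [()] is a view of the whole array
```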
On Thu, Aug 20, 2020 at 09:03:40AM -0400, Random832 wrote:
On Mon, Aug 17, 2020, at 14:00, Christopher Barker wrote:
From an implementation perspective, the [] operator is another way to call __getitem__ and __setitem__. And from that perspective, why not have it act like a function call: no arguments, positional arguments, keyword arguments, the whole shebang.
But from a language design perspective, the [] operator is a way to "index" a container -- get part of the container's contents. And from this perspective, no index makes no sense.
I think it makes perfect sense. Remember that numpy *currently* has a concept of "no index" [an empty tuple is used for this], it results in a view of the whole array, or the content as a scalar for a 0-dimensional array.
I wouldn't necessarily be taking numpy as the gold standard of good API design.

Treating a missing index as the object itself makes a certain logical sense. Here's a variable with a subscript: xₑ and here it is again with a missing subscript: x. So by analogy, we might say that `x[]` should be treated as just `x`. But honestly that's more likely to be an error, not a feature.

I think that zero-dimensional arrays are a pretty dubious concept, but if you did have one, since it has *no dimensions* it cannot contain any content at all.
[I've occasionally been tempted to try the same thing on ctypes objects, incidentally, I think it might be useful to make obj[] equivalent to obj.value]
And what about all the objects that don't have a .value attribute? What's so special about that attribute that subscripting with a missing subscript should return that attribute rather than some other? -- Steve
Hmmm, sorry, there was an encoding issue with my previous email. Somehow my attempted U+2091 LATIN SUBSCRIPT SMALL LETTER E got turned into the ^J control character. -- Steve
On Thu, Aug 20, 2020, at 09:46, Steven D'Aprano wrote:
And what about all the objects that don't have a .value attribute? What's so special about that attribute that subscripting with a missing subscript should return that attribute rather than some other?
er, I meant "it should, on ctypes scalar objects specifically, have the same semantics as those classes' .value attribute", not "it should proxy to the attribute called 'value' on all objects" sorry if that was unclear
On Fri, 14 Aug 2020 at 11:07, Jonathan Fine <jfine2358@gmail.com> wrote:
NO ARGUMENTS
I'd like d[] to become valid syntax.
This makes me a bit uncomfortable.
SEMANTICS OF NO ARGUMENTS I can see two basic ways of allowing no arguments. One is for the interpreter to construct an object that is the argument passed to __getitem__ and so forth. The other is to not pass an argument at all. I see this as a secondary question.
I understand it makes sense if you assume that you can have default arguments. But still, it's kind of weird, and it's not always obvious how each structure is going to implement it.
NO POSITIONAL ARGUMENTS I'd like >>> d[x=1, y=2] to be valid syntax. It's not clear to me that all agree with this. Even if there are no objections, I'd like positive confirmation.
Yes, it should be.
CONSEQUENCE Suppose
d[x=1, y=2] is valid syntax. If so, then there is I think consensus that >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 can be implemented, where d is an instance of a suitable class. Otherwise, what's the point?
Yes. Agreed.
QUESTION Suppose we have >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 where d is an instance of a suitable class X that has no special knowledge of keywords.
Initially, when I wrote the PEP, the idea was that there was no distinction between kwargs and normal args. Basically the idea was that currently the only "metainfo" associated with every argument is purely positional (e.g. the meaning of position 1 is implicit). But index 1 can have a specific semantic meaning (e.g. it could be a day). So in practice they would be one and the same; you just add non-positional semantic meaning to indexes, and you can refer to them either through the position or through this additional semantic meaning.

In other words, if you claim that the first index is day and the second index is detector, somehow there is no difference between these:

d[3, 4]
d[day=3, detector=4]
d[detector=4, day=3]

In fact, my initial feeling would be that you can use either one or the other. You should not be able to mix and match.

The PEP went through various revisions, and we came to a possible proposal, but it's not set in stone.
In other words, we also have for example >>> d[a='alpha', g='gamma', z=12] = 'cheese' >>> d[a='alpha', g='gamma', z=12] 'cheese'
My question is this: Should such a class
X = type(d) be part of standard Python, as part of PEP 472? (My answer is: Yes, it should be in standard Python.)
Yes and no. In my opinion, the current classes (e.g. dict) *may* be extended to support this (optional) functionality. But the target should probably be something like numpy or pandas, or any other class that wants the flexibility for a named index approach. I would not add X to python stdlib. -- Kind regards, Stefano Borini
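None of the proposed d[x=1, y=2] spellings parse today, but the protocol under discussion can be prototyped by calling the dunders directly. This is a sketch only: the KeyedStore class and its _key helper are my own names, and the signatures assume the interpreter would simply pass keywords through as **kwargs.

```python
class KeyedStore:
    """Toy mapping whose item dunders accept keyword 'indices'."""
    def __init__(self):
        self._data = {}

    @staticmethod
    def _key(index, kwargs):
        # Fold the positional and keyword parts into one hashable key;
        # frozenset makes keyword order irrelevant.
        return (index, frozenset(kwargs.items()))

    def __getitem__(self, index=(), **kwargs):
        return self._data[self._key(index, kwargs)]

    def __setitem__(self, index=(), value=None, **kwargs):
        self._data[self._key(index, kwargs)] = value

d = KeyedStore()
# Until the syntax exists, d[x=1, y=2] = 42 has to be spelled:
d.__setitem__((), 42, x=1, y=2)
print(d.__getitem__((), x=1, y=2))   # 42
```

Note the clash risk: a keyword literally named index or value would be captured by the named parameters -- one of the details any real protocol would have to pin down.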
On Sat, Aug 15, 2020 at 7:26 PM Stefano Borini <stefano.borini@gmail.com> wrote:
This would definitely not be sufficient for xarray, which I see as being one of the main users of this syntax. The whole point is to be able to specify an arbitrary subset of labeled dimensions.
On Sat, Aug 15, 2020 at 8:02 PM Todd <toddrjen@gmail.com> wrote:
Are you saying that for xarray it is important to distinguish between `d[day=3, detector=4]` and `d[detector=4, day=3]`? If we just passed the keyword args to `__getitem__` as an extra `**kwds` argument (which preserves order, since Python 3.6 at least), that should work, right? If not, can you clarify?
On Sat, Aug 15, 2020 at 8:27 PM Guido van Rossum <guido@python.org> wrote:
An extra **kwds would be quite sufficient for xarray. We don't need to distinguish between `d[day=3, detector=4]` and `d[day=4, detector=3]`, at least not any differently from normal Python keyword arguments.

What might not suffice is a required one-to-one mapping between positional and keyword arguments. Xarray has a notion of a "Dataset" that contains multiple arrays, some of which may have different dimensions or dimensions in a different order. For this reason, on Dataset we allow keyword-based indexing, but not positional-based indexing. This is basically the same as the use-cases for keyword-only parameters.

One question that comes up: should d[**kwargs] be valid syntax? d[*args] currently is not, but that's OK since d[tuple(args)] is identical. On the other hand, we probably do need d[**kwargs] since there's no way to dynamically unpack keyword arguments (short of directly calling __getitem__). And perhaps for symmetry this suggests d[*args] should be valid, too, defined as equivalent to d[tuple(args)].
On Sat, Aug 15, 2020 at 09:25:00PM -0700, Stephan Hoyer wrote:
One question that comes up: should d[**kwargs] be valid syntax? d[*args] currently is not, but that's OK since d[tuple(args)] is identical.
If we're going to support keyword arguments, what reason would we have for not supporting `**kw` unpacking?
On the other hand, we probably do need d[**kwargs] since there's no way to dynamically unpack keyword arguments (short of directly calling __getitem__).
Indeed. So there is an excellent reason to support keyword unpacking, and no good reason not to support it. (Assuming keyword args are supported at all.)
And perhaps for symmetry this suggests d[*args] should be valid, too, defined as equivalent to d[tuple(args)].
Since positional arguments to `__getitem__` are automatically packed into a single argument, there is no need to call it with `*args`. That would be a waste of time: have the interpreter unpack the arguments, then pack them again. As you say, that makes it functionally equivalent to just passing `tuple(args)`. "For symmetry" is at best a weak argument. -- Steven
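The equivalence Steven relies on is easy to check with a plain dict: the subscript commas already build the tuple, so spelling out tuple(args) reaches the same key.

```python
d = {(1, 2): 'value'}
args = [1, 2]

# d[*args] is a SyntaxError today, but no expressive power is lost:
print(d[1, 2] == d[tuple(args)])   # True -- both look up the key (1, 2)
```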
On Sat, Aug 15, 2020 at 08:26:10PM -0700, Guido van Rossum wrote:
Are you saying that for xarray it is important to distinguish between `d[day=3, detector=4]` and `d[detector=4, day=3]`? If we just passed the keyword args to `__getitem__` as an extra `**kwds` argument (which preserves order, since Python 3.6 at least), that should work, right? If not, can you clarify?
Just to clarify here, I assume you mean that if xarray cares about order-preserving keywords, they should write their methods this way: def __getitem__(self, index=None, **kwargs): rather than mandating that keyword args are *always* bundled into a single dict parameter. -- Steven
On Sat, Aug 15, 2020 at 10:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Aug 15, 2020 at 08:26:10PM -0700, Guido van Rossum wrote:
Are you saying that for xarray it is important to distinguish between `d[day=3, detector=4]` and `d[detector=4, day=3]`? If we just passed the keyword args to `__getitem__` as an extra `**kwds` argument (which preserves order, since Python 3.6 at least), that should work, right? If not, can you clarify?
Just to clarify here, I assume you mean that if xarray cares about order-preserving keywords, they should write their methods this way:
def __getitem__(self, index=None, **kwargs):
rather than mandating that keyword args are *always* bundled into a single dict parameter.
Um, I'm not sure what "bundled into a single dict parameter" refers to. That the signature would be ``` def __getitem__(self, index, kwargs, /): ``` ? That sounds bad for people who want to use a few choice keywords. (And I think you'd be against that, for that very reason; as am I.) Or Jonathan Fine's proposal to create a "Key" class that bundles positional and keyword args? Same thing. (And I *know* you're against that. So am I.)
On Sun, Aug 16, 2020 at 11:18:40AM -0700, Guido van Rossum wrote:
Just to clarify here, I assume you mean that if xarray cares about order-preserving keywords, they should write their methods this way:
def __getitem__(self, index=None, **kwargs):
rather than mandating that keyword args are *always* bundled into a single dict parameter.
Um, I'm not sure what "bundled into a single dict parameter" refers to.
That the signature would be ``` def __getitem__(self, index, kwargs, /): ``` ? That sounds bad for people who want to use a few choice keywords. (And I think you'd be against that, for that very reason; as am I.)
Yes, that's exactly it! The analogy is with comma-separated items in the subscript, which get collected (bundled) into a tuple, rather than allocated to multiple parameters. And you are correct that I am against that. If people want that behaviour for their class, they can use `**kwargs`. So I think we're on the same page here. -- Steven
No, I am saying it is important to distinguish between "d[3, 4]" and "d[day=3, detector=4]". In xarray there can be label-only dimensions, that is, dimensions with no corresponding position. So having all labelled dimensions necessarily mapped onto positional dimensions wouldn't work. On Sat, Aug 15, 2020, 23:26 Guido van Rossum <guido@python.org> wrote:
On Sat, Aug 15, 2020 at 8:02 PM Todd <toddrjen@gmail.com> wrote:
On Sat, Aug 15, 2020 at 7:26 PM Stefano Borini <stefano.borini@gmail.com> wrote:
QUESTION Suppose we have >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 where d is an instance of a suitable class X that has no special knowledge of keywords.
Initially, when I wrote the pep, the idea was that there was no distinction of kwargs and normal args. Basically the idea was that currently the only "metainfo" associated to every argument is purely positional (e.g. the meaning of position 1 is implicit). But index 1 can have a specific semantic meaning (e.g. it could be a day). So in practice they would be one and the same, just that you add non-positional semantic meaning to indexes, and you can refer to them either through the position, or this additional semantic meaning.
In other words, if you claim that the first index is day, and the second index is detector, somehow, there is no difference between these
d[3, 4] d[day=3, detector=4] d[detector=4, day=3]
In fact, my initial feeling would be that you can use either one or the other. You should not be able to mix and match.
the pep went through various revisions, and we came to a possible proposal, but it's not set in stone.
This would definitely not be sufficient for xarray, which I see as being one of the main users of this syntax. The whole point is to be able to specify an arbitrary subset labeled dimensions.
Are you saying that for xarray it is important to distinguish between `d[day=3, detector=4]` and `d[detector=4, day=3]`? If we just passed the keyword args to `__getitem__` as an extra `**kwds` argument (which preserves order, since Python 3.6 at least), that should work, right? If not, can you clarify?
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
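The order preservation Guido mentions is easy to check: since PEP 468 (Python 3.6), `**kwds` keeps the order in which keywords were written at the call site, so a `__getitem__` receiving keyword arguments could distinguish `d[day=3, detector=4]` from `d[detector=4, day=3]` if it wanted to. A quick sketch:

```python
def capture(**kwds):
    # **kwds preserves the call-site keyword order (PEP 468)
    return list(kwds)

assert capture(day=3, detector=4) == ['day', 'detector']
assert capture(detector=4, day=3) == ['detector', 'day']
```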
On Fri, Aug 14, 2020 at 3:05 AM Jonathan Fine <jfine2358@gmail.com> wrote:
I'd like to sound out consensus regarding mapping access, where none of the keys are positional. In particular, I suggest that PEP 472 allow syntax and semantics such as >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 and ask whether the class >>> X = type(d) should be part of standard Python.
This way of phrasing it rubs me the wrong way. The way Python is defined, notations like `x+y` or `a[i]` translate to dunder calls and we separately specify which built-in types support which dunder methods. If you think `dict` should support the extended subscript notation, just say so, don't play guessing games with `X = type(d)`. (You've given the answer away by naming the variable `d` anyway.) Personally I think `dict` should *not* support the extended subscript notation, but I think the extended `__getitem__` protocol should support passing just keyword args.
NO ARGUMENTS At present, item access requires an argument, as a matter of syntax. >>> d[] SyntaxError: invalid syntax
Compare this to >>> fn() NameError: name 'fn' is not defined
I'd like d[] to become valid syntax.
This looks like a syntax error that would trip over a lot of people. I guess we could make it less dangerous if common types like `dict` rejected this at runtime. But if you insist that `dict` should support keyword args, I would personally insist on making this syntax illegal.
SEMANTICS OF NO ARGUMENTS I can see two basic ways of allowing no arguments. One is for the interpreter to construct an object that is the argument passed to __getitem__ and so forth. The other is to not pass an argument at all. I see this as a secondary question.
The signatures could be

```
def __getitem__(self, index=default, /, **kwargs): ...
def __setitem__(self, index=default, value, /, **kwargs): ...
```

A class that doesn't want to support `a[]` would indicate so by not providing a default index, and then it would simply raise a TypeError, just like calling a function with a missing required argument.
NO POSITIONAL ARGUMENTS I'd like >>> d[x=1, y=2] to be valid syntax. It's not clear to me that all agree with this. Even if there are no objections, I'd like positive confirmation.
Both Greg Ewing and Steven D'Aprano agree with this, and it looks fine to me as well, so I think you've got this.
CONSEQUENCE Suppose
d[x=1, y=2] is valid syntax. If so, then there is I think consensus that >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 can be implemented, where d is an instance of a suitable class. Otherwise, what's the point?
Yes of course. A MutableMapping subclass that wanted to do this could be defined like this:

```
class KeyKey(MutableMapping):

    def __getitem__(self, key=None, /, **kwds):
        return super().__getitem__(self.makekey(key, kwds))

    def __setitem__(self, key=None, value, /, **kwds):
        return super().__setitem__(self.makekey(key, kwds), value)

    @classmethod
    def makekey(cls, key, kwds):
        return (key, tuple(sorted(kwds)))  # Or something more sophisticated
```

(A few other methods would have to be added, like `__delitem__` and `__contains__`; the caller of `__contains__` would need to know about the makekey() method.)
QUESTION Suppose we have >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 where d is an instance of a suitable class X that has no special knowledge of keywords.
Why should X not be required to have knowledge of keywords?
In other words, we also have for example >>> d[a='alpha', g='gamma', z=12] = 'cheese' >>> d[a='alpha', g='gamma', z=12] 'cheese'
My question is this: Should such a class
X = type(d) be part of standard Python, as part of PEP 472? (My answer is: Yes, it should be in standard Python.)
At this time, I'm interested in canvassing opinions. Discussion of different opinions perhaps can take place later, or elsewhere. My main concern is to know if there is at present a rough consensus regarding the above.
I don't think you have consensus at all -- in fact I haven't seen anyone besides you agreeing that `d[x=1, y=2]` should construct a key of a special type and pass that to `__getitem__` -- everyone else (including myself) appears to think that it's better to pass the keyword args to `__getitem__` and decide the edge cases based on that. FWIW in my example I sorted the keywords, so that `d[x=1, y=2]` and `d[y=2, x=1]` construct the same internal key. But for some use cases it might be better if these constructed *different* internal keys. For example, Caleb's Struct class, when used to construct a dataclass, would need to be able to tell them apart, since the field order in a dataclass is part of the type -- the argument order of the constructor depends on it. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On 16/08/20 11:49 am, Guido van Rossum wrote:
SEMANTICS OF NO ARGUMENTS I can see two basic ways of allowing no arguments. One is for the interpreter to construct an object that is the argument passed to __getitem__ and so forth. The other is to not pass an argument at all. I see this as a secondary question.
If d[] were to be allowed, I would expect it to pass an empty tuple as the index, since it's the limiting case of reducing the number of positional indices. -- Greg
On Mon, Aug 17, 2020 at 12:32:08AM +1200, Greg Ewing wrote:
On 16/08/20 11:49 am, Guido van Rossum wrote:
SEMANTICS OF NO ARGUMENTS I can see two basic ways of allowing no arguments. One is for the interpreter to construct an object that is the argument passed to __getitem__ and so forth. The other is to not pass an argument at all. I see this as a secondary question.
If d[] were to be allowed, I would expect it to pass an empty tuple as the index, since it's the limiting case of reducing the number of positional indices.
So you would expect `obj[]` and `obj[()]` to be the same? Personally, I think that unless there is an overwhelmingly good use-case for an empty subscript, we should continue to treat empty subscripts (no positional or keyword arguments) as a syntax error. -- Steven
On Sun, Aug 16, 2020 at 5:45 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Aug 17, 2020 at 12:32:08AM +1200, Greg Ewing wrote:
On 16/08/20 11:49 am, Guido van Rossum wrote:
SEMANTICS OF NO ARGUMENTS I can see two basic ways of allowing no arguments. One is for the interpreter to construct an object that is the argument passed to __getitem__ and so forth. The other is to not pass an argument at all. I see this as a secondary question.
If d[] were to be allowed, I would expect it to pass an empty tuple as the index, since it's the limiting case of reducing the number of positional indices.
So you would expect `obj[]` and `obj[()]` to be the same?
That's not terrible, since these are also the same:

```
obj[x] === obj[(x)]
obj[x, y] === obj[(x, y)]
```

(Even though `obj[x]` is still an exception because it's the only form that isn't tuplified.) On the other hand, it's *also* the limiting case of reducing the number of keyword arguments, so whatever is passed here should also be passed as the positional part of the key when only keyword arguments are present, and I'm not sure what I think of using `()` for that.
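The tuplification rules that Greg's limiting-case argument extrapolates from can be demonstrated with a class that simply echoes its subscript (the `Echo` class is invented for illustration):

```python
class Echo:
    def __getitem__(self, key):
        return key

e = Echo()
assert e[1, 2] == (1, 2)    # comma-separated subscripts become a tuple
assert e[(1, 2)] == (1, 2)  # an explicit tuple is the same thing
assert e[1] == 1            # a single subscript is not tuplified
assert e[()] == ()          # the empty tuple is already a legal subscript
```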
Personally, I think that unless there is an overwhelmingly good use-case for an empty subscript, we should continue to treat empty subscripts (no positional or keyword arguments) as a syntax error.
That's where I am too right now. But I think there may be at least a _decent_ use case: the `Tuple` type in type annotations. (And since PEP 585 also the `tuple` type.)

We have `Tuple[int, int]` as a tuple of two integers. And we have `Tuple[int]` as a tuple of one integer. And occasionally we need to spell a tuple of *no* values, since that's the type of `()`. But we currently are forced to write that as `Tuple[()]`. If we allowed `Tuple[]` that odd edge case would be removed.

So I probably would be okay with allowing `obj[]` syntactically, as long as the dict type could be made to reject it.

Alas, I thought I had a solution, but it doesn't work for `__setitem__`: we can easily state that `obj[]` calls `obj.__getitem__()` and whether that's accepted or not depends on whether `obj.__getitem__` has a default value for its `key` positional argument. But what to do for `obj[] = x`? We can't call `obj.__setitem__(x)` -- well, we could, but it would be super ugly to write such a `__setitem__` method properly -- similar to supporting `range(n)`.

So my intuition is failing me. It looks like `d[] = x` will need to come up with *some* key, and the only two values that sound at all reasonable are `()` (for the reason Greg mentioned) and `None` (because it's the universal "nothing here" value). But either way it's not reasonable for `dict` to reject those keys -- they are legitimate keys when passed explicitly. Using `()` is slightly better because it helps debugging: if you have a dict with an unexpected `None` key you should look for a key computation that unexpectedly returned `None`, and we can now add that if you have a dict with an unexpected `()` key, you should look for an assignment of the form `d[] = x`.

But it would be better if `d[] = x` could be simply rejected -- either at runtime (perhaps with a TypeError, like for calling a function with insufficient arguments), or syntactically. (That is, if you believe, like me, that `d[key, kwd=val]` should be rejected.)
Hence, `Tuple[]` qualifies as a decent use case, but not as an overwhelmingly good one.

I can think of another way to deal with this -- we could define a new sentinel object (e.g. `Nope` :-), `d[]` could be equivalent to `d[Nope]`, and the dict class could reject `Nope` as a key value. But that's quite ugly, and it's arbitrary, too.

PS. All this reminds me of a complaint I heard 4-5 decades ago from an experienced programmer when I was just learning the ropes, and which somehow stuck in my mind ever since: "... and yet again, the empty [sequence] is treated rather shabbily." (It sounded better in Dutch -- "stiefmoederlijk", meaning "stepmotherly".) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On Sun, 2020-08-16 at 12:07 -0700, Guido van Rossum wrote:
On Sun, Aug 16, 2020 at 5:45 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Aug 17, 2020 at 12:32:08AM +1200, Greg Ewing wrote:
On 16/08/20 11:49 am, Guido van Rossum wrote:
SEMANTICS OF NO ARGUMENTS I can see two basic ways of allowing no arguments. One is for the interpreter to construct an object that is the argument passed to __getitem__ and so forth. The other is to not pass an argument at all. I see this as a secondary question.
If d[] were to be allowed, I would expect it to pass an empty tuple as the index, since it's the limiting case of reducing the number of positional indices.
So you would expect `obj[]` and `obj[()]` to be the same?
That's not terrible, since these are also the same: ``` obj[x] === obj[(x)] obj[x, y] === obj[(x, y)] ``` (Even though `obj[x]` is still an exception because it's the only form that isn't tuplified.)
On the other hand, it's *also* the limiting case of reducing the number of keyword arguments, so whatever is passed here should also be passed as the positional part of the key when only keyword arguments are present, and I'm not sure what I think of using `()` for that.
Personally, I think that unless there is an overwhelmingly good use-case for an empty subscript, we should continue to treat empty subscripts (no positional or keyword arguments) as a syntax error.
That's where I am too right now. But I think there may be at least a _decent_ use case: the `Tuple` type in type annotations. (And since PEP 585 also the `tuple` type.)
We have `Tuple[int, int]` as a tuple of two integers. And we have `Tuple[int]` as a tuple of one integer. And occasionally we need to spell a tuple of *no* values, since that's the type of `()`. But we currently are forced to write that as `Tuple[()]`. If we allowed `Tuple[]` that odd edge case would be removed.
So I probably would be okay with allowing `obj[]` syntactically, as long as the dict type could be made to reject it.
Alas, I thought I had a solution, but it doesn't work for `__setitem__`: we can easily state that `obj[]` calls `obj.__getitem__()` and whether that's accepted or not depends on whether `obj.__getitem__` has a default value for its `key` positional argument. But what to do for `obj[] = x`? We can't call `obj.__setitem__(x)` -- well, we could, but it would be super ugly to write such a `__setitem__` method properly -- similar to supporting `range(n)`.
So my intuition is failing me. It looks like `d[] = x` will need to come up with *some* key, and the only two values that sound at all reasonable are `()` (for the reason Greg mentioned) and `None` (because it's the universal "nothing here" value). But either way it's not reasonable for `dict` to reject those keys -- they are legitimate keys when passed explicitly. Using `()` is slightly better because it helps debugging: if you have a dict with an unexpected `None` key you should look for a key computation that unexpectedly returned `None`, and we can now add that if you have a dict with an unexpected `()` key, you should look for an assignment of the form `d[] = x`.
For what it's worth, NumPy uses `None` to indicate inserting a new axis/dimension (we have an `np.newaxis` alias as well):

    arr = np.array(5)
    arr.ndim == 0
    arr[None].ndim == arr[None,].ndim == 1

So that would be problematic. There are two (subtly different [1]) acceptable choices for `ndarray[]`:

    arr[] is arr[()]

or:

    arr[] is arr[...]

In either case, though, I guess the arguments above against `None` apply likely similarly for `Ellipsis`. And since there are two different choices, refusing to choose is fair for NumPy itself (for some other array-objects the result of those two choices may be identical).

- Sebastian

[1] The difference is that `arr[()]` extracts a scalar, while `arr[...]` returns the array (container) unchanged. This difference only matters for zero-dimensional arrays. There may be reasons to prefer one over the other, but I can't think of any right now.
But it would be better if `d[] = x` could be simply rejected -- either at runtime (perhaps with a TypeError, like for calling a function with insufficient arguments), or syntactically. (That is, if you believe, like me, that `d[key, kwd=val]` should be rejected.) Hence, `Tuple[]` qualifies as a decent use case, but not as an overwhelmingly good one.
I can think of another way to deal with this -- we could define a new sentinel object (e.g. `Nope` :-), `d[]` could be equivalent to `d[Nope]`, and the dict class could reject `Nope` as a key value. But that's quite ugly, and it's arbitrary, too.
PS. All this reminds me of a complaint I heard 4-5 decades ago from an experienced programmer when I was just learning the ropes, and which somehow stuck in my mind ever since: "... and yet again, the empty [sequence] is treated rather shabbily." (It sounded better in Dutch -- "stiefmoederlijk", meaning "stepmotherly".)
On 17/08/20 8:19 am, Sebastian Berg wrote:
[1] The difference is that `arr[()]` extracts a scalar, while `arr[...]` returns the array (container) unchanged. This difference only matters for zero dimensional arrays. There may be reasons to prefer one over the other, but I can't think of any right now.
Usually in numpy, omitting trailing indices is equivalent to filling them with ':', so I would expect a[] to be a slice that covers the whole array, i.e. equivalent to a[...]. -- Greg
On Sun, Aug 16, 2020 at 12:07:47PM -0700, Guido van Rossum wrote: [...]
So I probably would be okay with allowing `obj[]` syntactically, as long as the dict type could be made to reject it.
I don't absolutely hate the idea, but I do feel that it's semantically rather dubious. `obj` with no subscript is just `obj`. It's not like an empty list, or string, so I'm still going to argue that there should be *something* in the subscript. Writing `obj[]` is, in my opinion, more likely to be an error than an intentional "subscript the default index".

But if we did allow empty subscripts syntactically, surely they would only be valid if the `__getitem__` method defines a default?

    def __getitem__(self, index="right here"):

Otherwise we should get a TypeError. The same would apply to subscript assignment:

    obj[] = value

would only be allowed if the object defined setitem with a default for the index. Otherwise it would be a TypeError.

In any case, we could punt on this and leave the empty subscript question for another day: Now is better than never. Although never is often better than *right* now. I would hate for the keyword question to be derailed because we can't reach a consensus on what empty subscripts mean.

-- Steven
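Steven's default-or-TypeError rule can be modelled in current Python by calling the dunder directly, since `obj[]` would translate to `obj.__getitem__()`. Both classes below are hypothetical:

```python
class WithDefault:
    # Opts in to empty subscripts by giving the index a default.
    def __getitem__(self, index="right here"):
        return index

class NoDefault:
    def __getitem__(self, index):
        return index

# obj[] would translate to obj.__getitem__(); emulate that directly:
assert WithDefault().__getitem__() == "right here"
try:
    NoDefault().__getitem__()
except TypeError:
    pass  # no default for the index -> TypeError, as Steven proposes
else:
    raise AssertionError("expected TypeError")
```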
On Fri, Aug 14, 2020, at 06:03, Jonathan Fine wrote:
I'd like to sound out consensus regarding mapping access, where none of the keys are positional. In particular, I suggest that PEP 472 allow syntax and semantics such as >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 and ask whether the class >>> X = type(d) should be part of standard Python.
I have an implementation proposal that I believe is distinct from any of the ones mentioned in the PEP currently.

Pass keyword arguments as ordinary keyword arguments [which any particular __getitem__ implementation is free to handle as **kwargs, specific keywords, or simple named arguments]. When a single positional argument is passed, it's used directly; when zero or two or more are passed, they are bundled into a tuple and passed as a single positional argument. Having zero arguments result in an empty tuple allows for easy conceptual compatibility with numpy.

    d[]: d.__getitem__(())
    d[0]: d.__getitem__(0)
    d[0, 1]: d.__getitem__((0, 1))
    d[x=0]: d.__getitem__((), x=0)
    d[0, y=1]: d.__getitem__(0, y=1)
    d[0, 1, z=2]: d.__getitem__((0, 1), z=2)

if an implementation wishes to support a more conventional style of argument handling, the tuple can be deparsed easily in python code:

    def __getitem__(self, arg, **kwargs):
        return self._getitem(*(arg if isinstance(arg, tuple) else (arg,)), **kwargs)

    def _getitem(self, x=slice(None), y=slice(None), z=slice(None)):
        ...

but an implementation could also define __getitem__ with any other signature that accepts the arguments outlined above; for example, def __getitem__(self, arg, option) would effectively treat option as a keyword-only argument, and __getitem__(self, arg) would, as currently, not accept any keyword arguments.

and if __getitem__ itself defines named keyword args instead of **kwargs, or does not define anything at all, it will result in a TypeError in exactly the same way as if the calls outlined above were made directly, with only positional arguments being accepted and handled in the same way as they are currently.
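Random832's translation table can be emulated in current Python with a helper that performs the proposed bundling before calling `__getitem__`. The `subscript` helper and `Recorder` class are invented for illustration:

```python
class Recorder:
    # Records exactly what __getitem__ would receive under the proposal.
    def __getitem__(self, key, **kwargs):
        return key, kwargs

def subscript(obj, *args, **kwargs):
    # Proposed rule: one positional argument passes through unchanged;
    # zero, or two or more, are bundled into a (possibly empty) tuple.
    key = args[0] if len(args) == 1 else args
    return obj.__getitem__(key, **kwargs)

r = Recorder()
assert subscript(r) == ((), {})                        # d[]
assert subscript(r, 0) == (0, {})                      # d[0]
assert subscript(r, 0, 1) == ((0, 1), {})              # d[0, 1]
assert subscript(r, x=0) == ((), {'x': 0})             # d[x=0]
assert subscript(r, 0, y=1) == (0, {'y': 1})           # d[0, y=1]
assert subscript(r, 0, 1, z=2) == ((0, 1), {'z': 2})   # d[0, 1, z=2]
```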
On Thu, Aug 20, 2020, at 08:56, Random832 wrote:
I have an implementation proposal that I believe is distinct from any of the ones mentioned in the PEP currently.
on further reflection, this seems mostly equivalent to the "kwargs argument" strategy [which I had wrongly read as __getitem__(self, arg, kwargs) instead of __getitem__(self, arg, **kwargs)], except I think the specification that it "should be keyword only" is unnecessary and harmful. I think it would be perfectly reasonable to have a __getitem__(self, idx, option1=None, option2=None) which can be called as d[i, option1=foo] or d[i, option2=bar] - for a class which has a clear distinction between positional and keyword arguments there's no reason to force them to use the wrapper pattern I mentioned in my previous post. Incidentally, "It doesn't preserve order, unless an OrderedDict is used" is out of place since PEP 468 was accepted.
On Thu, Aug 20, 2020 at 05:57 Random832 <random832@fastmail.com> wrote:
I have an implementation proposal that I believe is distinct from any of the ones mentioned in the PEP currently.
Pass keyword arguments as ordinary keyword arguments [which any particular __getitem__ implementation is free to handle as **kwargs, specific keywords, or simple named arguments]. When a single positional argument is passed, it's used directly; when zero or two or more are passed, they are bundled into a tuple and passed as a single positional argument. Having zero arguments result in an empty tuple allows for easy conceptual compatibility with numpy.
d[]: d.__getitem__(())
d[0] : d.__getitem__(0)
d[0,1] : d.__getitem__((0, 1))
d[x=0]: d.__getitem__((), x=0)
d[0, y=1]: d.__getitem__(0, y=1)
d[0, 1, z=2]: d.__getitem__((0, 1), z=2)
That may not be in the PEP, but apart from the edge cases for d[] and d[x=0] it’s exactly what I and Steven have been proposing for quite a while. —Guido -- --Guido (mobile)
On Thu, Aug 20, 2020 at 10:31 AM Guido van Rossum <guido@python.org> wrote:
On Thu, Aug 20, 2020 at 05:57 Random832 <random832@fastmail.com> wrote:
I have an implementation proposal that I believe is distinct from any of the ones mentioned in the PEP currently.
Pass keyword arguments as ordinary keyword arguments [which any particular __getitem__ implementation is free to handle as **kwargs, specific keywords, or simple named arguments]. When a single positional argument is passed, it's used directly; when zero or two or more are passed, they are bundled into a tuple and passed as a single positional argument. Having zero arguments result in an empty tuple allows for easy conceptual compatibility with numpy.
d[]: d.__getitem__(())
d[0] : d.__getitem__(0)
d[0,1] : d.__getitem__((0, 1))
d[x=0]: d.__getitem__((), x=0)
d[0, y=1]: d.__getitem__(0, y=1)
d[0, 1, z=2]: d.__getitem__((0, 1), z=2)
That may not be in the PEP, but apart from the edge cases for d[] and d[x=0] it’s exactly what I and Steven have been proposing for quite a while.
—Guido -- --Guido (mobile)
I want to offer a big apology if this question has been answered 1 million times already. That being said, on the edge case:

    d[x=0]

...what is the alternative proposed __getitem__ call from Guido and Steven?

And what about unpacking?

    d[*()]: d.__getitem__(())

and:

    d[**{}]: d.__getitem__(())

Thumbs up or down on one, or both? I can't remember what the current PEP says on these. Both are currently SyntaxErrors. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Thu, Aug 20, 2020 at 11:01 AM Ricky Teachey <ricky@teachey.org> wrote:
On Thu, Aug 20, 2020 at 10:31 AM Guido van Rossum <guido@python.org> wrote:
That may not be in the PEP, but apart from the edge cases for d[] and
d[x=0] it’s exactly what I and Steven have been proposing for quite a while.
—Guido -- --Guido (mobile)
I want to offer a big apology if this question has been answered 1 million
times already. That being said, on the edge case:
d[x=0]
...what is the alternative proposed __getitem__ call from Guido and Steven?
Actually sorry, I just answered my own question -- I guess it depends on what the call signature is on __getitem__. If it looked like this:

    def __getitem__(self, key=(), **kwargs): ...

Then you get () as the key. But if the signature looks like this:

    def __getitem__(self, key, **kwargs): ...

Then you will get a TypeError. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Thu, Aug 20, 2020, 09:01 Random832 <random832@fastmail.com> wrote:
On Fri, Aug 14, 2020, at 06:03, Jonathan Fine wrote:
I'd like to sound out consensus regarding mapping access, where none of the keys are positional. In particular, I suggest that PEP 472 allow syntax and semantics such as >>> d[x=1, y=2] = 42 >>> d[x=1, y=2] 42 and ask whether the class >>> X = type(d) should be part of standard Python.
I have an implementation proposal that I believe is distinct from any of the ones mentioned in the PEP currently.
Pass keyword arguments as ordinary keyword arguments [which any particular __getitem__ implementation is free to handle as **kwargs, specific keywords, or simple named arguments]. When a single positional argument is passed, it's used directly; when zero or two or more are passed, they are bundled into a tuple and passed as a single positional argument. Having zero arguments result in an empty tuple allows for easy conceptual compatibility with numpy.
d[]: d.__getitem__(()) d[0] : d.__getitem__(0) d[0,1] : d.__getitem__((0, 1)) d[x=0]: d.__getitem__((), x=0) d[0, y=1]: d.__getitem__(0, y=1) d[0, 1, z=2]: d.__getitem__((0, 1), z=2)
if an implementation wishes to support a more conventional style of argument handling, the tuple can be deparsed easily in python code:
def __getitem__(self, arg, **kwargs): return self._getitem(*(arg if isinstance(arg, tuple) else (arg,)))
def _getitem(self, x=slice(), y=slice(), z=slice()): ...
but an implementation could also define __getitem__ with any other signature that accepts the arguments outlined above, for example def __getitem__(self, arg, option) would effectively treat option as a keyword-only argument, and __getitem__(self, arg) would, as currently, not accept any keyword arguments.
and if __getitem__ itself defines named keyword args instead of **kwargs, or does not define anything at all, it will result in a TypeError in exactly the same way as if the calls outlined above were made directly, with only positional arguments being accepted and handled in the same way as they are currently.
This is the proposal pretty much everyone already supports. Only Jonathan seems to want to do it differently. We are trying to find out exactly why he prefers this approach. So far the only advantage I have seen is that it is easier to experiment with.
Todd wrote: Only Jonathan seems to want to do it differently. We are trying to find
out exactly why he prefers this approach. So far the only advantage I have seen is that it is easier to experiment with.
I think it's good to make experiments before making a decision. That's what I'd like us to do next. Let's learn from shared experience. By the way, using >>> d[o(1, 2, a=3, b=4)] for a suitable 'o' and item function decorator has, I believe, all the capabilities of any scheme proposed. I'd rather make my case by doing experiments using various values of 'o' (and the associated function decorator). -- Jonathan
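A minimal sketch of such an 'o' object -- hypothetical, and not necessarily what Jonathan's kwkey package actually provides -- is just a hashable bundle of positional and keyword parts, which therefore works as a key in a plain dict:

```python
class o:
    """Hypothetical bundling key (kwkey-style sketch, not the real API).

    This sketch sorts the keyword items, making keyword order irrelevant;
    the real kwkey project may make different choices.
    """
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = tuple(sorted(kwargs.items()))

    def __hash__(self):
        return hash((self.args, self.kwargs))

    def __eq__(self, other):
        return (isinstance(other, o)
                and (self.args, self.kwargs) == (other.args, other.kwargs))

d = dict()
d[o(1, 2, a=3, b=4)] = 42
assert d[o(1, 2, a=3, b=4)] == 42
assert o(a=3, b=4) == o(b=4, a=3)  # keyword order ignored by this sketch
```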
On Thu, Aug 20, 2020, 11:03 Jonathan Fine <jfine2358@gmail.com> wrote:
Todd wrote:
Only Jonathan seems to want to do it differently. We are trying to find
out exactly why he prefers this approach. So far the only advantage I have seen is that it is easier to experiment with.
I think it's good to make experiments before making a decision. That's where I'd like us to do next. Let's learn from shared experience.
Nobody is saying that experimentation is bad, only that the final implementation doesn't need to be exactly the same as the experiment.
By the way, using >>> d[o(1, 2, a=3, b=4)] for a suitable 'o' and item function decorator has I believe all the capabilities of any scheme proposed (for a suitable 'o' and decorator).
It has the same capabilities, the question is whether it has any additional abilities that would justify the added complexity. We have a perfectly good way of handling keywords, so it is up to you to explain why we shouldn't use it. I'd rather make my case by doing experiments using various values of 'o'
(and the associated function decorator).
If this is going to be accepted you need to be able to articulate why your approach is superior to the one everyone else supports. At the very least this will require a pep, and the pep will need to explain this. And I am not sure what you expect an experiment to show that can't be seen right now.
Todd wrote: It has the same capabilities, the question is whether it has any additional
abilities that would justify the added complexity.
The most obvious additional ability is that always >>> d[SOME_EXPRESSION] is equivalent to >>> d[key] for a suitable key. This is a capability that we already have, which would sometimes be lost under the scheme you support. Also lost would be the equivalence between
    val = d[key]
    getter = operator.itemgetter(key)
    val = getter(d)
More exactly, sometimes it wouldn't be possible to find and use a key. Docs would have to be changed. See: https://docs.python.org/3/library/operator.html#operator.itemgetter

As I understand it, xarray uses dimension names to slice data. Here's an example from http://xarray.pydata.org/en/stable/indexing.html#indexing-with-dimension-nam...

    >>> da[dict(space=0, time=slice(None, 2))]

Presumably, this would be replaced by something like

    >>> da[space=0, time=:2]

Now, the commands

    da[space=0, time=:2]
    da[space=0, time=:2] = True
    del da[space=0, time=:2]

would at the beginning of the call, presumably, do the same processing on the keyword arguments. (Let this stand for a wide range of examples.)
It is arguable that making it easier for the implementer of type(da) to do all that processing in the same place would be a REDUCTION of complexity. Allowing the processing to produce an intermediate object, say >>> key = dict(space=0, time=slice(None, 2)) would help here. Real world examples are required, I think, to ground any discussions of complexity and simplicity. We want to optimise for what people do, for the problems they face. And this is a new feature. We have a perfectly good way of handling keywords, so it is up to you to
explain why we shouldn't use it.
The scheme you support does not distinguish

    >>> d[1, 2, x=3, y=4]
    >>> d[(1, 2), x=3, y=4]

I don't regard that as being perfectly good. In addition, I would like

    >>> d = dict()
    >>> d[x=1, y=2] = 5

to work. It works out-of-the-box for my scheme. It can be made to work with a subclass of dict for the D'Aprano scheme.

I think that is enough for now. I'd prefer to discuss this further by writing Python modules that contain code that can be tested. The testing should cover both the technical correctness and the user experience. To support this I intend now to focus on the next version of kwkey. https://pypi.org/project/kwkey/ -- Jonathan
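The itemgetter equivalence Jonathan refers to holds today whenever subscripting takes a single key object, which is the property his single-key scheme preserves:

```python
import operator

d = {('a', 'b'): 1}
key = ('a', 'b')
# d[key] and operator.itemgetter(key)(d) are interchangeable today
assert d[key] == operator.itemgetter(key)(d) == 1
```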
On Thu, Aug 20, 2020 at 12:55 PM Jonathan Fine <jfine2358@gmail.com> wrote:
In addition, I would like >>> d = dict() >>> d[x=1, y=2] = 5 to work. It works out-of-the-box for my scheme. It can be made to work with a subclass of dict for the D'Aprano scheme.
This raises the question about this advantage (I agree it could be an advantage): is providing such a subclass really such a heavy lift? Additionally and related: what would be the real world advantages of d[x=1, y=2], where d is just a vanilla dict, working right out of the box? --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
On Thu, Aug 20, 2020 at 12:55 PM Jonathan Fine <jfine2358@gmail.com> wrote:
In addition, I would like
>>> d = dict() >>> d[x=1, y=2] = 5 to work. It works out-of-the-box for my scheme.
1) it does? could you explain that, I can't see it. 2) so what? -- it would still only work with the next version of Python, and the dict could be updated in that version 3) I don't think I want that to "just work" anyway -- in fact, I have no idea what it means. I can guess that it essentially created something like a namedtuple that is used as a key -- is that correct? in which case, I'm not sure I would want that, as you'd need/want a way to make that same key object outside of indexing, and then you might as well just make it. as an example, you can now do: In [2]: d[1,2] = 'this' In [3]: t = (1,2) In [4]: d[t] Out[4]: 'this' But I'm pretty sure I have never done that before just now, though I have certainly used tuples as keys in dicts. In any case, I'd suggest you keep the discussion of extending dict behavior a bit separate from the more general extension to indexing -- it's fine as an example, but this is not about adding functionality for dicts, and I suspect that dicts are the least interesting use case to most of us. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Thu, Aug 20, 2020 at 1:58 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Thu, Aug 20, 2020 at 12:55 PM Jonathan Fine <jfine2358@gmail.com> wrote:
In addition, I would like
>>> d = dict() >>> d[x=1, y=2] = 5 to work. It works out-of-the-box for my scheme.
1) it does? could you explain that, I can't see it.
My understanding is that the "o" class would be static and thus hashable, so anything that accepts a hashable key/index would be able to use this (correct me if I am wrong Jonathan). I think this is a bad idea, since it would mean classes could seem to support keyword arguments but silently do the completely wrong thing, especially if someone accidentally uses an older version.
Hi Todd You wrote: I think this is a bad idea, since it would mean classes could seem to
support keyword arguments but silently do the completely wrong thing, especially if someone accidentally uses an older version.
I don't see this happening, and certainly don't see it as a new problem. To help me understand, please provide an example where this happens. It would be best to wait a few days first, after I've fixed https://github.com/jfine2358/python-kwkey/issues/2 -- Jonathan
On Thu, Aug 20, 2020 at 2:28 PM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi Todd
You wrote:
I think this is a bad idea, since it would mean classes could seem to
support keyword arguments but silently do the completely wrong thing, especially if someone accidentally uses an older version.
I don't see this happening, and certainly don't see it as a new problem. To help me understand, please provide an example where this happens.
It would be best to wait a few days first, after I've fixed https://github.com/jfine2358/python-kwkey/issues/2
As I said, it could very easily happen in xarray. xarray has a structure called a Dataset, which is a dict-style container for DataArrays (labelled multidimensional arrays). It supports both selecting an array by key and indexing all the arrays by their indices. Imagine a labelled array 'y' with dimensions 'a' and 'b', and consider this example: ds = Dataset({'y': y}) ds[a=1, b=2] = y[a=2, b=3] If xarray supports keyword arguments, this would assign to the corresponding values. If it didn't, it would create a new element of the Dataset containing "y[a=2, b=3]". But "y" would continue working as before, only with different values than expected.
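The failure mode Todd describes can be seen in miniature with a plain dict. Under a scheme where `d[a=1, b=2]` becomes `d[some_hashable_key_object]`, a container that knows nothing about keywords silently stores under the key object instead of raising (`KwKey` below is a hypothetical stand-in for that key object, not any proposed class):

```python
# Sketch of the silent-failure mode: a hashable keyword-key object is
# accepted by any mapping, so no error is raised.
class KwKey:
    def __init__(self, **kwargs):
        self._items = frozenset(kwargs.items())

    def __hash__(self):
        return hash(self._items)

    def __eq__(self, other):
        return isinstance(other, KwKey) and self._items == other._items


ds = {}                       # stands in for an xarray Dataset
ds[KwKey(a=1, b=2)] = "y"     # no error -- a brand new entry appears
assert KwKey(a=1, b=2) in ds  # it silently "worked"; nothing was indexed
```

With keyword arguments passed through to the dunder instead, the same operation on a class that does not accept them would raise a TypeError immediately.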
On Thu, Aug 20, 2020 at 02:58:28PM -0400, Todd wrote:
If xarray supports keyword arguments, this would assign to the corresponding values. If it didn't, it would create a new element of the Dataset containing "y[a=2, b=3]". But "y" would continue working as it would, only with different values than expected.
Thanks Todd, that's an excellent example of how an unexpected change could lead to code doing the wrong thing instead of a nice, easy to diagnose exception. -- Steve
On Thu, Aug 20, 2020 at 5:16 PM Steven D'Aprano <steve@pearwood.info> wrote:
If xarray supports keyword arguments, this would assign to the corresponding values. If it didn't, it would create a new element of the Dataset containing "y[a=2, b=3]". But "y" would continue working as it would, only with different values than expected.
Thanks Todd, that's an excellent example of how an unexpected change could lead to code doing the wrong thing instead of a nice, easy to diagnose exception.
well, that isn't currently supported syntax anyway, so xarray won't break. But if we did add support for this, then xarray might use it, and then it's a question of what it means -- I can't see how it would be ambiguous or break anything. though maybe I'm missing something .... And what I'm missing might be this: Jonathan's proposal is designed to make it easy for: [x=4, y=5] to make a little immutable mapping that can be used as a dict key -- essentially shorthand for what you now have to do with: [CustomImmutableMapping(x=4, y=5)] But the problem with that is that if a class wants to use keyword indexing in a different way, then it would require extra effort to do so. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
I have not fully thought this out yet, but while my first instinct was to agree with others to “just use the calling conventions we already have”, there is a wrinkle: current indexing behavior is an oddball now. (You all know this, but I think it’s helpful to lay it out.) The signature of __getitem__ is always:

def __getitem__(self, index):

If you pass a single item:

an_object[thing]

then that thing gets assigned to index, whatever it is. If you pass more than one item:

an_object[thing1, thing2, thing3]

then a tuple of those gets assigned to index, and the implementation of __getitem__ has to parse that out itself. That is different from the "usual" argument passing, where it would always be a tuple, whether it was length-one or not (and to get that tuple, you'd need to use *args, or specify a particular number of arguments).

So: if we want to maintain backward compatibility, we *can't* use the regular argument-passing approach; it will have to be a slightly odd special case.

Which brings us to (what I think is) Jonathan's idea: we keep the idea that __getitem__ always accepts a single argument. Now it's either a single object or a tuple of objects. If we extend that, then it's either a single object, or a tuple of objects, or a new "keywords" object that would hold both the positional and keyword "arguments", so any old code that did something like:

def __getitem__(self, index):
    if isinstance(index, tuple):
        handle_the_tuple_of_indices(index)
    else:
        handle_a_single_index(index)

would still work as it does now. And if something wanted to implement keywords, it could add a clause:

elif isinstance(index, keywords_object):
    handle_all_the_args_and_keywords(index)

and away we go.

TL;DR: Indexing would now create one of:
- a single item
- a tuple of items
- a keywords_object of positional and keyword arguments.
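The dispatch described above can be made concrete today (a sketch only: `KeywordsObject` is a hypothetical class, and since no bracket syntax exists yet, the object is constructed by hand):

```python
# Sketch of Christopher's dispatch-on-index-type idea.
class KeywordsObject:
    """Hypothetical object bundling positional and keyword 'arguments'."""
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class Demo:
    def __getitem__(self, index):
        if isinstance(index, KeywordsObject):
            return ("args+kwargs", index.args, index.kwargs)
        elif isinstance(index, tuple):
            return ("tuple", index)
        else:
            return ("single", index)


d = Demo()
assert d[1] == ("single", 1)
assert d[1, 2] == ("tuple", (1, 2))
# Under the proposal, the interpreter would build this object for
# d[1, x=2]; today we can construct it by hand and pass it in:
assert d[KeywordsObject(1, x=2)] == ("args+kwargs", (1,), {"x": 2})
```

Old code that only handles the tuple and single-object branches keeps working unchanged; only classes that opt in need the extra clause.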
And just like we can now create a tuple of indices and pass them in as a single object, we could also create a keywords_object some other way and pass that in directly. If we did not do this, could we use: an_object[*args, **kwargs] and if *args was length-1, would it get extracted from the tuple? Or would the zeroth item of *args always get extracted from the tuple? So creating a new object to hold the arguments of an indexing operation is a bit awkward, yes, but more consistent with how it currently works. -CHB On Thu, Aug 20, 2020 at 9:55 AM Jonathan Fine <jfine2358@gmail.com> wrote:
Todd wrote:
It has the same capabilities, the question is whether it has any
additional abilities that would justify the added complexity.
The most obvious additional ability is that always >>> d[SOME_EXPRESSION] is equivalent to >>> d[key] for a suitable key.
This is a capability that we already have, which would sometimes be lost under the scheme you support. Also lost would be the equivalence between
val = d[key] getter = operator.itemgetter(key) val = getter(d)
More exactly, sometimes it wouldn't be possible to find and use a key. Docs would have to be changed. See: https://docs.python.org/3/library/operator.html#operator.itemgetter
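The itemgetter equivalence Jonathan refers to holds today for any hashable key:

```python
import operator

# A keyword-style key could be any hashable object; a tuple stands in here.
d = {(1, 2): "cheese"}
key = (1, 2)

# Subscripting and operator.itemgetter are interchangeable.
getter = operator.itemgetter(key)
assert d[key] == getter(d) == "cheese"
```

Under a scheme where keywords are passed as keyword arguments rather than bundled into a key object, there is no `key` value for which this equivalence can be written, which is the loss being described.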
As I understand it, xarray uses dimension names to slice data. Here's an example from
http://xarray.pydata.org/en/stable/indexing.html#indexing-with-dimension-nam... >>> da[dict(space=0, time=slice(None, 2))]
Presumably, this would be replaced by something like >>> da[space=0, time=:2]
Now, the commands

>>> da[space=0, time=:2]
>>> da[space=0, time=:2] = True
>>> del da[space=0, time=:2]

would at the beginning of the call, presumably, do the same processing on the keyword arguments. (Let this stand for a wide range of examples.)
It is arguable that making it easier for the implementer of type(da) to do all that processing in the same place would be a REDUCTION of complexity. Allowing the processing to produce an intermediate object, say >>> key = dict(space=0, time=slice(None, 2)) would help here.
Real world examples are required, I think, to ground any discussions of complexity and simplicity. We want to optimise for what people do, for the problems they face. And this is a new feature.
We have a perfectly good way of handling keywords, so it is up to you to
explain why we shouldn't use it.
The scheme you support does not distinguish >>> d[1, 2, x=3, y=4] >>> d[(1, 2), x=3, y=4] I don't regard that as being perfectly good.
In addition, I would like >>> d = dict() >>> d[x=1, y=2] = 5 to work. It works out-of-the-box for my scheme. It can be made to work with a subclass of dict for the D'Aprano scheme.
I think that is enough for now.
I'd prefer to discuss this further by writing Python modules that contain code that can be tested. The testing should cover both the technical correctness and the user experience. To support this I intend now to focus on the next version of kwkey. https://pypi.org/project/kwkey/
-- Jonathan
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/3XRS7W...
Code of Conduct: http://python.org/psf/codeofconduct/
This has been discussed. The current consensus approach would be to keep the index argument as a single value, while passing keyword indices as keyword arguments. So something like: def __getitem__(self, index, **kwargs): Classes that don't want to handle keyword indices just don't have to implement handling for keyword arguments. This has the advantage that it makes it easier to hard-code dimension labels if desired. Although I haven't heard from the pandas devs on this: for dataframes, which are fixed at having two dimensions, "row" and "column", you could potentially implement something like: def __getitem__(self, index, row=None, column=None): No special work would be needed to check whether the keywords match, which would be needed with a new class. On Thu, Aug 20, 2020 at 1:42 PM Christopher Barker <pythonchb@gmail.com> wrote:
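Todd's suggested signature can be sketched in runnable form (the `Frame` class and its two hard-coded dimensions are hypothetical; until the new syntax exists, the keyword form must call the dunder directly):

```python
# Sketch: keyword indices arrive as ordinary keyword arguments, so a
# two-dimensional container can name its dimensions in the signature.
class Frame:
    def __init__(self, data):
        self._data = data  # dict of (row, column) -> value

    def __getitem__(self, index=None, *, row=None, column=None):
        if index is not None:
            return self._data[index]      # classic positional subscript
        return self._data[(row, column)]  # keyword subscript


f = Frame({(0, "a"): 42})
assert f[(0, "a")] == 42
# Pending syntax like f[row=0, column="a"], call the dunder directly:
assert f.__getitem__(row=0, column="a") == 42
```

The point being illustrated: no isinstance check on a new key class is needed; unknown keywords fail with an ordinary TypeError, as in any function call.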
On Thu, Aug 20, 2020 at 10:41:42AM -0700, Christopher Barker wrote:
Current indexing behavior is an oddball now:
( you all know this, but I think it’s helpful to lay it out)
Correct.
The signature of __getitem__ is always:
def __getitem__(self, index):
Mostly correct -- you can still declare the method with additional parameters, so long as they have default values. Calling the method with subscript syntax will always bind the entire subscript to the first parameter, but if you call the dunder method directly you can do anything you like. It's just a method. This is likely to be a rare and unusual case, but we don't want to break anyone who already has a dunder something like this: def __getitem__(self, index, extra=None) [...]
So: if we want to maintain backward compatibility, we *can't* use the regular argument-passing approach; it will have to be a slightly odd special case.
It's already a slightly odd special case, the question is whether we want to extend that oddness to the keyword arguments as well as the positional arguments? [...]
now it's either a single object or a tuple of objects. If we extend that, then it's either a single object, or a tuple of objects, or a new "keywords" object that would hold both the positional and keyword "arguments", so any old code that did something like:
def __getitem__(self, index): if isinstance(index, tuple): handle_the_tuple_of_indices(index) else: handle_a_single_index(index)
would still work as it does now.
I don't think it would, because that "single index" may turn into an unexpected keywords_object, as you describe here:
and if something wanted to implement keywords, it could add a clause:
elif isinstance(index, keywords_object): handle_all_the_args_and_keywords(index)
and away we go.
But if you *don't* add that clause, your `handle_a_single_index` will start to receive keyword_objects instead of whatever you expect. That may lead to an easily diagnosed exception, but it may not. Obviously we're talking about library code here. In an application where the author of the class is also the caller of the class, keyword args in subscripts aren't going to magically appear in your own source code unless you put them there :-) But for libraries that expect a single index to be key or int or slice, say, Jonathan's proposal will mean they will receive unexpected keyword_objects as well, and we don't know how they will be handled. Jonathan's proposal would be grand if we all agreed that the primary use-case for this feature is "I want to use a collection of name:value pairs as a key in a dict": d[spam=True, eggs=False] = 42 assert d[spam=True, eggs=False] == 42 That's a reasonable use-case, I guess, but Python is 30 years old and we don't have a standard hashable mapping object or namespace object. So I don't think it's a common use-case. If people want it, they can provide their own frozen dict or frozen namespace and convert their `**kwargs` parameter into a key. But the primary use-cases for named keyword parameters is, I think, to use them as modifier options to the subscript, not as part of the subscript itself. And for that, we want keyword arguments to be automatically unpacked into named parameters, just as happens with function calls. -- Steve
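The "frozen mapping as a dict key" pattern Steven mentions can already be spelled by hand today, e.g. with a frozenset of items (a sketch of the workaround, not a proposal):

```python
# A hashable, order-insensitive stand-in for a frozen namespace key.
def frozen_key(**kwargs):
    return frozenset(kwargs.items())


d = {}
d[frozen_key(spam=True, eggs=False)] = 42
# Keyword order does not matter, as one would expect of keyword keys:
assert d[frozen_key(eggs=False, spam=True)] == 42
```

That this one-liner has not become a common idiom in 30 years is part of Steven's argument that name:value keys are not the primary use case.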
On Thu, Aug 20, 2020 at 5:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Aug 20, 2020 at 10:41:42AM -0700, Christopher Barker wrote:
Current indexing behavior is an oddball now:
The signature of __getitem__ is always:
def __getitem__(self, index):
Mostly correct -- you can still declare the method with additional parameters, so long as they have default values. Calling the method with subscript syntax will always bind the entire subscript to the first parameter, but if you call the dunder method directly you can do anything you like. It's just a method.
This is likely to be a rare and unusual case, but we don't want to break anyone who already has a dunder something like this:
def __getitem__(self, index, extra=None)
Is that an issue we care about? As I think about it, the dunders are special: they are reserved by Python to support particular things, and in this case, the "official" use will never pass anything other than the one argument. So isn't it OK to break code that might be "abusing" the __getitem__ dunder for some other use? Honestly, folks could have code that used any dunder in any way -- but I think it's OK to break that code. xarray is an example: it *could* have extended __getitem__ in a similar way, but it didn't, because that really would have been a "bad idea". But in any case the keyword_index object approach would be less likely to break this code than "regular" keywords would be.
So: if we want to maintain backward compatibility, we *can't* use the
regular argument-passing approach; it will have to be a slightly odd special case.
It's already a slightly odd special case, the question is whether we want to extend that oddness to the keyword arguments as well as the positional arguments?
yes, that was my point :-) The question is which new odd special case we go with, one that's closer to the current one, or one that's closer to "regular" argument passing.
def __getitem__(self, index): if isinstance(index, tuple): handle_the_tuple_of_indices(index) else: handle_a_single_index(index)
would still work as it does now.
I don't think it would, because that "single index" may turn into an unexpected keywords_object, as you describe here:
yes, but existing code doesn't accept any old thing as a single index. Each class has particular things it accepts. For example: sequences accept something with __index__ or a slice object. So if it got a keywords_index object it would raise a TypeError, just like they do when you pass in any number of other types. Mappings are more of a challenge, 'cause they accept any hashable type -- so if the keywords_index object were hashable, then it would just work -- which is what Jonathan wants, but I'm not sure that's a great idea. If it weren't hashable, then nothing would change for Mappings either. Of course, arbitrary objects can have arbitrary handling of the single index -- but at least most would expect to get a particular type or types, and would hopefully raise on some new object that has never existed before. With dynamic duck typing, it's possible that it would work but do the wrong thing, but that's got to be rare. But if anyone has an example of already existing code that would be broken by this, I'd love to see it. Also: I think that type hints use the [] operator as well -- though I don't know anything about them -- but do the semantics of [] for type hints need to be the same as for indexing? But if you *don't* add that clause, your `handle_a_single_index` will
start to receive keyword_objects instead of whatever you expect. That may lead to an easily diagnosed exception, but it may not.
sure -- though I would like to see an actual example of where it wouldn't be a TypeError (from otherwise robust code)
But for libraries that expect a single index to be key or int or slice,
say, Jonathan's proposal will mean they will receive unexpected keyword_objects as well, and we don't know how they will be handled.
We can already pass arbitrary objects in as an index, so any robust code would already handle that well. Unless it was designed to handle an object that happened to duck-type to the new keywords_index object -- which seems very unlikely to me.
Jonathan's proposal would be grand if we all agreed that the primary
use-case for this feature is "I want to use a collection of name:value pairs as a key in a dict":
I agree that that is not the primary use case for this feature, and that that is really a separate issue anyway. I think an ImmutableMapping would be nice to have, but as you say, you can write one yourself, and it's apparently not useful enough to be a commonly used thing. Which is why I'm making the case for this approach completely separately from how dicts might work. I'm kind of thinking out loud here; I'm still not sure which approach I prefer. -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
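The sequences-versus-mappings distinction Christopher draws can be checked with today's builtins:

```python
# Sequences reject unexpected index types loudly; mappings accept
# anything hashable, silently.
class Weird:
    pass  # user-defined classes are hashable by default (identity hash)


seq = [10, 20, 30]
rejected = False
try:
    seq[Weird()]          # lists require __index__ or a slice
except TypeError:
    rejected = True
assert rejected

m = {}
m[Weird()] = "fine"       # dict only needs hashability -- no complaint
assert len(m) == 1
```

So a hashable keywords_index object would "just work" in dicts (Jonathan's intent) but raise a clean TypeError in sequences, which is the behavior being weighed here.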
On Thu, Aug 20, 2020, at 20:09, Steven D'Aprano wrote:
This is likely to be a rare and unusual case, but we don't want to break anyone who already has a dunder something like this:
def __getitem__(self, index, extra=None)
People do similar things with regular arguments to regular functions now, trusting callers not to pass stuff in for no reason that will break the function has been adequate there.
On Thu, Aug 20, 2020 at 9:16 PM Random832 <random832@fastmail.com> wrote:
On Thu, Aug 20, 2020, at 20:09, Steven D'Aprano wrote:
This is likely to be a rare and unusual case, but we don't want to break anyone who already has a dunder something like this:
def __getitem__(self, index, extra=None)
People do similar things with regular arguments to regular functions now, trusting callers not to pass stuff in for no reason that will break the function has been adequate there.
Yes, but this is a little different in that the [] operator will never pass in anything else, but if we make this change, then it might -- so this is a breaking change in this case. And having keywords in the square brackets pass keyword args to the dunder could, in fact, break this use case, whereas passing a new keyword_index object would not. But as I said in an earlier note -- having extra keyword parameters on __getitem__ is an abuse of the system anyway, so I think it's OK to break that. -CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
The worst that could happen would be that some user figures out they can access the hidden functionality by writing a[key, extra=“stuff”], whereas previously they would have to call a.__getitem__(key, extra=“stuff”). Doesn’t sound like a big deal, and not breakage either. On Sat, Aug 22, 2020 at 11:09 Christopher Barker <pythonchb@gmail.com> wrote:
-- --Guido (mobile)
On Sat, Aug 22, 2020 at 11:22 AM Guido van Rossum <guido@python.org> wrote:
The worst that could happen would be that some user figures out they can access the hidden functionality by writing a[key, extra=“stuff”], whereas previously they would have to call a.__getitem__(key, extra=“stuff”).
Doesn’t sound like a big deal, and not breakage either.
well, breakage is arguable, but I agree, not a big deal. Now that I think about it, I don't think the keywords_index object idea would break this use case either: calling the dunder directly would still work the same way. So it's a non-argument either way. -CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Thu, Aug 20, 2020 at 12:54 PM Jonathan Fine <jfine2358@gmail.com> wrote:
As I understand it, xarray uses dimension names to slice data. Here's an example from
http://xarray.pydata.org/en/stable/indexing.html#indexing-with-dimension-nam... >>> da[dict(space=0, time=slice(None, 2))]
Presumably, this would be replaced by something like >>> da[space=0, time=:2]
Was the slicing notation already explicitly proposed for kwargs? I find it too similar to the walrus operator when the first argument is missing. I could only find an example in this section https://www.python.org/dev/peps/pep-0472/#use-cases, but the first argument is defined. rain[time=0:12, location=location]
Now, the commands

>>> da[space=0, time=:2]
>>> da[space=0, time=:2] = True
>>> del da[space=0, time=:2]

would at the beginning of the call, presumably, do the same processing on the keyword arguments. (Let this stand for a wide range of examples.)
It is arguable that making it easier for the implementer of type(da) to do all that processing in the same place would be a REDUCTION of complexity. Allowing the processing to produce an intermediate object, say >>> key = dict(space=0, time=slice(None, 2)) would help here.
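[Editor's example] A sketch of the "process everything in one place" idea, using a hypothetical array-like class (not the real xarray API) whose __getitem__ accepts a dict keyed by dimension names, in the style of xarray's dict-based indexing:

```python
# Hypothetical sketch: one place normalizes a dict-valued
# intermediate key into positional indexing, using today's syntax.
class FakeArray:
    dims = ('space', 'time')

    def __init__(self, data):
        self.data = data  # nested lists indexed [space][time]

    def __getitem__(self, key):
        # Accept an intermediate key object: a dict of dimension names.
        if isinstance(key, dict):
            space = key.get('space', slice(None))
            time = key.get('time', slice(None))
            rows = self.data[space]
            if isinstance(space, int):
                return rows[time]
            return [row[time] for row in rows]
        raise TypeError(key)

key = dict(space=0, time=slice(None, 2))
da = FakeArray([[1, 2, 3], [4, 5, 6]])
assert da[key] == [1, 2]
```

The point is that the dict key can be built, inspected, and reused independently of the subscription itself.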
Real world examples are required, I think, to ground any discussions of complexity and simplicity. We want to optimise for what people do, for the problems they face. And this is a new feature.
We have a perfectly good way of handling keywords, so it is up to you to
explain why we shouldn't use it.
The scheme you support does not distinguish

>>> d[1, 2, x=3, y=4]
>>> d[(1, 2), x=3, y=4]

I don't regard that as being perfectly good.
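[Editor's example] The ambiguity at issue is easy to see with a probe class under today's semantics, where the comma in a subscript already builds a tuple:

```python
# A class that simply reports the key object it receives.
class Probe:
    def __getitem__(self, key):
        return key

p = Probe()
# Today these two spellings deliver the same key object, so a scheme
# that keeps positional semantics unchanged cannot tell them apart.
assert p[1, 2] == (1, 2)
assert p[(1, 2)] == (1, 2)
```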
In addition, I would like

>>> d = dict()
>>> d[x=1, y=2] = 5

to work. It works out-of-the-box for my scheme. It can be made to work with a subclass of dict for the D'Aprano scheme.
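[Editor's example] One way d[x=1, y=2] = 5 could reach a plain dict is for the interpreter to pack the keywords into a single hashable key object. The class below is an illustrative sketch only, loosely in the spirit of the kwkey project but not its actual API:

```python
class K:
    """A hashable stand-in for the key built from d[x=1, y=2]."""
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = tuple(sorted(kwargs.items()))

    def __hash__(self):
        return hash((self.args, self.kwargs))

    def __eq__(self, other):
        return (isinstance(other, K)
                and (self.args, self.kwargs) == (other.args, other.kwargs))

# A plain dict then works unchanged, because K is just another key.
d = dict()
d[K(x=1, y=2)] = 5
assert d[K(x=1, y=2)] == 5
assert K(x=1) != K(y=1)
```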
I think that is enough for now.
I'd prefer to discuss this further by writing Python modules that contain code that can be tested. The testing should cover both the technical correctness and the user experience. To support this I intend now to focus on the next version of kwkey. https://pypi.org/project/kwkey/
-- Jonathan
-- Sebastian Kreft
On Thu, Aug 20, 2020 at 1:43 PM Sebastian Kreft <skreft@gmail.com> wrote:
On Thu, Aug 20, 2020 at 12:54 PM Jonathan Fine <jfine2358@gmail.com> wrote:
Todd wrote:
It has the same capabilities, the question is whether it has any
additional abilities that would justify the added complexity.
The most obvious additional ability is that always >>> d[SOME_EXPRESSION] is equivalent to >>> d[key] for a suitable key.
This is a capability that we already have, which would sometimes be lost under the scheme you support. Also lost would be the equivalence between
val = d[key]
getter = operator.itemgetter(key)
val = getter(d)
More exactly, sometimes it wouldn't be possible to find and use a key. Docs would have to be changed. See: https://docs.python.org/3/library/operator.html#operator.itemgetter
As I understand it, xarray uses dimension names to slice data. Here's an example from
http://xarray.pydata.org/en/stable/indexing.html#indexing-with-dimension-nam... >>> da[dict(space=0, time=slice(None, 2))]
Presumably, this would be replaced by something like >>> da[space=0, time=:2]
Was the slicing notation already explicitly proposed for kwargs? I find it too similar to the walrus operator when the first argument is missing.
I could only find an example in this section https://www.python.org/dev/peps/pep-0472/#use-cases, but the first argument is defined.
rain[time=0:12, location=location]
That is something I want to bring up, but I was waiting for the syntax discussion to get settled to avoid derailing it. I felt the conversation was already getting pulled in too many directions.
On Thu, Aug 20, 2020 at 12:54 PM Jonathan Fine <jfine2358@gmail.com> wrote:
Todd wrote:
It has the same capabilities, the question is whether it has any
additional abilities that would justify the added complexity.
The most obvious additional ability is that always >>> d[SOME_EXPRESSION] is equivalent to >>> d[key] for a suitable key.
This is a capability that we already have, which would sometimes be lost under the scheme you support. Also lost would be the equivalence between
val = d[key]
getter = operator.itemgetter(key)
val = getter(d)
Classes that want this could always support a tuple including a dict. For example, d[(1, 2, {'space': 0, 'time': 2})]. So this doesn't really help much, saving a few characters at most.
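[Editor's example] A sketch of the convention mentioned here, where the last element of the key tuple may be a dict of named indices; the class name and behaviour are hypothetical:

```python
class Grid:
    """Accepts keys like (1, 2, {'space': 0}): positional parts plus a dict."""
    def __getitem__(self, key):
        # Split a trailing dict of named indices off the positional parts.
        if isinstance(key, tuple) and key and isinstance(key[-1], dict):
            return key[:-1], key[-1]
        return key, {}

g = Grid()
assert g[1, 2, {'space': 0, 'time': 2}] == ((1, 2), {'space': 0, 'time': 2})
assert g[1, 2] == ((1, 2), {})
```

This works today because a dict inside the tuple never needs to be hashed; the class's __getitem__ receives the tuple directly.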
More exactly, sometimes it wouldn't be possible to find and use a key.
What do you mean by this?
Docs would have to be changed. See: https://docs.python.org/3/library/operator.html#operator.itemgetter
This would be the case either way. If itemgetter is made to support keyword arguments it would need to have its docs changed. Or are you suggesting that itemgetter be made to only support the "o" class but not keyword arguments directly? That would need to be documented too.
As I understand it, xarray uses dimension names to slice data. Here's an example from
http://xarray.pydata.org/en/stable/indexing.html#indexing-with-dimension-nam... >>> da[dict(space=0, time=slice(None, 2))]
Presumably, this would be replaced by something like >>> da[space=0, time=:2]
Now, the commands

>>> da[space=0, time=:2]
>>> da[space=0, time=:2] = True
>>> del da[space=0, time=:2]

would at the beginning of the call, presumably, do the same processing on the keyword arguments. (Let this stand for a wide range of examples.)
It is arguable that making it easier for the implementer of type(da) to do all that processing in the same place would be a REDUCTION of complexity. Allowing the processing to produce an intermediate object, say >>> key = dict(space=0, time=slice(None, 2)) would help here.
I don't see how. kwargs could be packed into a dict, which could then be processed identically to passing a dict directly. While in your approach there would need to be a test for a new class, and then an additional step to separate out the parts of it.
We have a perfectly good way of handling keywords, so it is up to you to
explain why we shouldn't use it.
The scheme you support does not distinguish

>>> d[1, 2, x=3, y=4]
>>> d[(1, 2), x=3, y=4]

I don't regard that as being perfectly good.
I think for backwards-compatibility it would have to be. Why should adding keyword arguments radically change the meaning of the positional arguments? That seems like an enormous trap.
In addition, I would like

>>> d = dict()
>>> d[x=1, y=2] = 5

to work. It works out-of-the-box for my scheme. It can be made to work with a subclass of dict for the D'Aprano scheme.
First, if it was desired it could be made to work with the normal dict. The dict class would just need to be modified to handle it. I think it is highly debatable whether it should, but there is no reason it couldn't. That is a separate discussion.

Having it work out-of-the-box like that in your case is actually a downside, in my opinion. It could lead to unexpected situations where classes SEEM to work with keyword indices, but really don't. For example someone could use it with an old version of xarray that doesn't support it, have it seem to work because it is accepting a hashable, but then have it silently do a completely different thing. So I think classes having to explicitly handle them appropriately for that class is a benefit, not a downside.
I'd prefer to discuss this further by writing Python modules that contain code that can be tested. The testing should cover both the technical correctness and the user experience. To support this I intend now to focus on the next version of kwkey.
I still don't see how testing it will help anything at this point. The behavior is easy to explain, so examples could be provided without actually needing to run it. So please let's discuss this and work through our thinking with examples before spending a lot of time writing code.
On Thu, Aug 20, 2020 at 2:13 PM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi Todd
I still don't see how testing it will help anything at this point.
Well, you and I have a difference of opinion here. I don't think it's worth discussing this further now. Perhaps next month it will be.
This is an important feature for me. It gets brought up periodically, gets some support, then gradually fades from focus and nothing happens. We are making some real progress for the first time in a long time, and I would really like to avoid another dead-end. If you can't explain exactly what you hope to learn from these tests, I don't see why we should put the discussion off for an indeterminate amount of time and risk it falling off the radar yet again for no real benefit.
Jonathan, you’re getting awfully close to “not knowing when to stop”.

On Thu, Aug 20, 2020 at 11:20 Jonathan Fine <jfine2358@gmail.com> wrote:
Hi Todd
I still don't see how testing it will help anything at this point.
Well, you and I have a difference of opinion here. I don't think it's worth discussing this further now. Perhaps next month it will be.
-- Jonathan
-- --Guido (mobile)
participants (16)
- Antoine Pitrou
- Caleb Donovick
- Christopher Barker
- David Mertz
- Greg Ewing
- Guido van Rossum
- Jonathan Fine
- Paul Moore
- Random832
- Ricky Teachey
- Sebastian Berg
- Sebastian Kreft
- Stefano Borini
- Stephan Hoyer
- Steven D'Aprano
- Todd