PEP 472 - slices in keyword indices, d[x=1:3]
I think it is worth directly discussing the availability of slices in PEP 472-style keyword indices, since we seem to have mostly converged on a dunder method signature. This is an issue that has been alluded to regarding keyword-based (labelled) indices but not directly addressed. The basic syntax would be something like d[x=1:3]. I am strongly in favor of having slices. The main motivating factor for me, labelled dimensions in xarray, would be much, much less useful without support for slices. In fact, as PEP 472 currently mentions, the big benefit of indexing over method calls is that indexing supports slice syntax while method calls don't. In a more general sense, I feel not allowing slices would create an artificial distinction between labelled and positional indices that I don't think is justified. They would work the same, except for slices where labelled indices behave differently. It would be a strange gotcha. So I think any revision to PEP 472 or new PEP should directly and explicitly support the use of slices.
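For comparison, here is a minimal sketch of how a labelled slice has to be spelled today, without keyword indices. The `LabelledGrid` class and its `sel` method are hypothetical stand-ins invented for this illustration; they are not xarray's API, and the sketch assumes a fixed pair of dimension names.

```python
class LabelledGrid:
    """Hypothetical 2-D container whose axes are named 'x' and 'y'."""

    dims = ("x", "y")

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __getitem__(self, key):
        # Positional indexing: the interpreter builds slice objects for us.
        if not isinstance(key, tuple):
            key = (key,)
        # Pad missing trailing axes with "take everything".
        key = key + (slice(None),) * (len(self.dims) - len(key))
        return self._take(*key)

    def sel(self, **labelled):
        # Method-call spelling: slices must be written out as slice(...).
        key = tuple(labelled.get(d, slice(None)) for d in self.dims)
        return self._take(*key)

    def _take(self, xs, ys):
        rows = self.rows[xs]
        if isinstance(xs, int):
            return rows[ys]
        return [r[ys] for r in rows]


grid = LabelledGrid([[0, 1, 2], [10, 11, 12], [20, 21, 22], [30, 31, 32]])
positional = grid[1:3]               # slice syntax works positionally
labelled = grid.sel(x=slice(1, 3))   # the spelling d[x=1:3] would replace
```

Both calls select the same rows; the only difference is that the labelled form cannot use the compact `1:3` notation.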
On Sun, Aug 23, 2020 at 6:42 PM Todd <toddrjen@gmail.com> wrote:
I think it is worth directly discussing the availability of slices in PEP 472-style keyword indices,
+1 on slices in all indexing.

But this brings up a broader question: Why not allow slice syntax as an expression everywhere? Everywhere I’ve tried, it’s a syntax error now, but is there any technical reason that it couldn’t be used pretty much anywhere?

-CHB

-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Mon, Aug 24, 2020, at 00:43, Christopher Barker wrote:
But this brings up a broader question:
Why not allow slice syntax as an expression everywhere? Everywhere I’ve tried, it’s a syntax error now, but is there any technical reason that it couldn’t be used pretty much anywhere?
is {a:b} a set containing a slice, or a dict? obviously it's a dict, but are there any other places that might be affected? what about other forms of slices, or if it's not the only element? should we support {a, b:c} as a set containing a slice? what about {a:b, c}? there may be other places it might be desirable to add new syntax that uses the colon character, allowing slices anywhere would foreclose that.
On Mon, Aug 24, 2020 at 12:54 AM Random832 <random832@fastmail.com> wrote:
Why not allow slice syntax as an expression everywhere?
is {a:b} a set containing a slice, or a dict? obviously it's a dict, but are there any other places that might be affected?
should we support {a, b:c} as a set containing a slice? what about {a:b, c}?
well, slices aren't hashable, so no. though technically hashability is not a syntax issue. So if a:b:c was a valid expression anywhere, then you should be able to *try* putting it in a set ... but {a:b:1} is now a syntax error, so we could make only the three-part form allowable.
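The distinction between a syntax question and a runtime question can be checked directly. The following is a minimal sketch using only the built-in `compile`; the helper names `parses` and `can_hash_slice` are made up for this illustration. Whether `hash(slice(...))` succeeds may differ between Python versions, so the sketch reports it rather than assuming an answer.

```python
def parses(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True


dict_display = parses("{a: b}")     # an ordinary dict display
three_part = parses("{a:b:1}")      # the three-part form discussed here
in_subscript = parses("d[a:b:1]")   # slice syntax inside a subscript


def can_hash_slice():
    # Hashability is a runtime property, not a grammar rule, so it is
    # probed with try/except instead of being decided by the parser.
    try:
        hash(slice(1, 2))
    except TypeError:
        return False
    return True
```

Only the subscript and the plain dict display compile; the three-part form inside braces is rejected by the parser before hashability ever comes into play.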
there may be other places it might be desirable to add new syntax that uses the colon character, allowing slices anywhere would foreclose that.
well, not sure I agree that that's a strong motivator. I do think the dict display issue is a big one though. But maybe we could expand where slices can be used, like function calls, for instance. Though *maybe* there would be less use for that if/when we get keywords in the [] operator. -CHB
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/CEF2MF... Code of Conduct: http://python.org/psf/codeofconduct/
On Mon, Aug 24, 2020 at 6:27 PM Christopher Barker <pythonchb@gmail.com> wrote:
but {a:b:1} is now a syntax error, so we could make only the three-part form allowable.
But is that a set containing a:b:1, or a dict mapping a to b:1, or a dict mapping a:b to 1? I don't like it. ChrisA
On Mon, Aug 24, 2020 at 9:54 AM Random832 <random832@fastmail.com> wrote:
On Mon, Aug 24, 2020, at 00:43, Christopher Barker wrote:
But this brings up a broader question:
Why not allow slice syntax as an expression everywhere? Everywhere I’ve tried, it’s a syntax error now, but is there any technical reason that it couldn’t be used pretty much anywhere?
is {a:b} a set containing a slice, or a dict? obviously it's a dict, but are there any other places that might be affected? what about other forms of slices, or if it's not the only element? should we support {a, b:c} as a set containing a slice? what about {a:b, c}?
there may be other places it might be desirable to add new syntax that uses the colon character, allowing slices anywhere would foreclose that.
{a:b} is a dict, {(a:b)} is a set containing one slice.
On Mon, 24 Aug 2020 at 09:59, Alex Hall <alex.mojaki@gmail.com> wrote:
On Mon, Aug 24, 2020 at 9:54 AM Random832 <random832@fastmail.com> wrote:
On Mon, Aug 24, 2020, at 00:43, Christopher Barker wrote:
But this brings up a broader question:
Why not allow slice syntax as an expression everywhere? Everywhere I’ve tried, it’s a syntax error now, but is there any technical reason that it couldn’t be used pretty much anywhere?
is {a:b} a set containing a slice, or a dict? obviously it's a dict, but are there any other places that might be affected? what about other forms of slices, or if it's not the only element? should we support {a, b:c} as a set containing a slice? what about {a:b, c}?
there may be other places it might be desirable to add new syntax that uses the colon character, allowing slices anywhere would foreclose that.
{a:b} is a dict, {(a:b)} is a set containing one slice.
Isn't the "broader question" actually "what is the justification for allowing slice syntax as an expression everywhere?" In other words, what are the use cases? "Why not do X" is *never* a sufficient reason for a language change - there is always a cost, and unless there's a corresponding benefit, it simply isn't going to happen. Paul
On Mon, Aug 24, 2020 at 10:58:13AM +0200, Alex Hall wrote:
{a:b} is a dict, {(a:b)} is a set containing one slice.
What's `{a: b: c: d, x: y}`? Typo or key with a slice value?

I know that any syntax can contain typos, but we generally try to avoid syntax which silently swallows such syntactic typos. The only one I can think of is implicit string concatenation:

    values = ['spam',
              'eggs'
              'NOBODY expects the Spanish Inquisition',
              'Ethel the Aardvark Goes Quantity Surveying']

is probably supposed to have four items, not three :-)

But implicit string concatenation is useful enough that (in my opinion) it is worth keeping it around despite the occasional whoops moment.

Admittedly slicing is already vulnerable to that:

    obj[a:b,c]  # two items or typo for an extended slice?

but I'm not sure that slice literals outside of subscripts are useful at all, let alone useful enough to allow this sort of silent error outside of subscripts.

-- Steve
My suggestion to solve this would be to use a similar rule to the walrus operator; only allow slice literals within either `()` brackets or `[]` square brackets; thus `{a: b: c: d, x: y}` becomes illegal and would need to be `{(a:b): (c:d), x: y}`
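The walrus precedent mentioned above can be observed with the built-in `compile`. This is a minimal sketch assuming Python 3.8 or later (where `:=` exists); the helper name `compiles` is made up for the illustration. The parenthesized slice form is only a proposal, so the sketch merely records that current interpreters reject it.

```python
def compiles(source):
    """Return True if `source` compiles as a Python statement."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


bare_walrus = compiles("x := 1")             # rejected at statement level
parenthesized_walrus = compiles("(x := 1)")  # accepted once parenthesized
proposed_slice = compiles("s = {(a:b)}")     # proposed form, not valid today
```

The proposal would give `(a:b)` the same treatment that `(x := 1)` already gets: legal only when the brackets make the extent of the expression explicit.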
On Mon, Aug 24, 2020, 00:43 Christopher Barker <pythonchb@gmail.com> wrote:
But this brings up a broader question:
Why not allow slice syntax as an expression everywhere? Everywhere I’ve tried, it’s a syntax error now, but is there any technical reason that it couldn’t be used pretty much anywhere?
That is a very different discussion, and not directly related to keyword indexes. Would it be possible to start a new email thread to discuss it?
Christopher wrote: Why not allow slice syntax as an expression everywhere?

In reply, Todd wrote: That is a very different discussion, and not directly related to keyword indexes. Would it be possible to start a new email thread to discuss it?

I think they are closely related matters, at least in terms of implementation. For details see rest of this message. I hope this helps our understanding, even if it shows difficulties lying ahead.

My non-expert understanding is that if

    >>> d[a=1:2:3]

is allowed by making a minimal change to Python's abstract grammar, then

    >>> f(a=1:2:3)

will also be allowed. (Extra work would be required to forbid it.)

It is also my non-expert understanding that

    >>> {0:1:2:3:4:5}

would then be equivalent to

    >>> {slice(0, 1, 2): slice(3, 4, 5)}

and further that

    >>> { :: :: : }

would become valid syntax!

My non-expert understanding is based on
https://docs.python.org/3/library/ast.html#abstract-grammar

To me it seems that in for example

    >>> d[::, ::]

the AST is constrained by

    slice = Slice(expr? lower, expr? upper, expr? step)
          | ExtSlice(slice* dims)
          | Index(expr value)

while in

    >>> f(x=SOMETHING)
    >>> f[x=SOMETHING]

the SOMETHING is an expr, and the AST is constrained by

    expr = BoolOp(boolop op, expr* values)
         | NamedExpr(expr target, expr value)
         | BinOp(expr left, operator op, expr right)
         | UnaryOp(unaryop op, expr operand)
         | Lambda(arguments args, expr body)
         | IfExp(expr test, expr body, expr orelse)
         | Dict(expr* keys, expr* values)
         ...
         | List(expr* elts, expr_context ctx)
         | Tuple(expr* elts, expr_context ctx)

If this is correct then adding Slice to the choices for expr would extend the AST to allow slices in keyword indices. And then the rest follows.

-- Jonathan
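The grammar point can be inspected with the standard `ast` module. This is a minimal sketch assuming Python 3.8 or later; the exact node classes used for subscripts have varied between versions, so it only relies on `ast.Slice` being the node produced for a simple `lower:upper:step` subscript. The helper name `parses` is made up for the illustration.

```python
import ast

# Parse a subscript and look at the node that represents "1:2:3".
subscript = ast.parse("d[1:2:3]", mode="eval").body
slice_node = subscript.slice
is_slice_node = isinstance(slice_node, ast.Slice)
bounds = (slice_node.lower.value, slice_node.upper.value, slice_node.step.value)


def parses(source):
    """Return True if `source` parses as a Python expression."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


# Outside a subscript the same text is not accepted by the parser.
call_with_slice = parses("f(a=1:2:3)")
```

Inside the brackets the parser yields a dedicated slice node carrying the three bounds, while the keyword-argument position expects an ordinary expression and rejects the colons.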
On Mon, Aug 24, 2020 at 12:23 PM Jonathan Fine <jfine2358@gmail.com> wrote:
Christopher wrote: Why not allow slice syntax as an expression everywhere?
In reply, Todd wrote: That is a very different discussion, and not directly related to keyword indexes. Would it be possible to start a new email thread to discuss it?
I think they are closely related matters, at least in terms of implementation. For details see rest of this message. I hope this helps our understanding, even if it shows difficulties lying ahead.
My non-expert understanding is that if
d[a=1:2:3] is allowed by making a minimal change to Python's abstract grammar, then f(a=1:2:3) will also be allowed. (Extra work would be required to forbid it.)
Why would that be the case? d[1:3] is allowed but d(1:3) isn't. The interpreter replaces "1:3" with "slice(1, 3)" behind-the-scenes.
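The "behind the scenes" replacement described above can be seen from ordinary code. A minimal sketch, using only the standard `operator` module:

```python
import operator

s = "hello"

via_syntax = s[1:3]                               # slice syntax in a subscript
via_object = s[slice(1, 3)]                       # explicit slice object
via_function = operator.getitem(s, slice(1, 3))   # plain function call
```

All three spellings hand the same `slice(1, 3)` object to `str.__getitem__`; only the first one is restricted to appearing inside square brackets.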
Hi Todd You wrote:
Why would that be the case? d[1:3] is allowed but d(1:3) isn't. The interpreter replaces "1:3" with "slice(1, 3)" behind-the-scenes.
    >>> s = 'hello'
    >>> s[1:3]
    'el'
    >>> s[(1:3)]
    SyntaxError: invalid syntax
    >>> f(x=1:3)
    SyntaxError: invalid syntax

My understanding is that the syntax errors arise because the parser is expecting, but not getting, an expression. (The conversion of "1:3" to a slice is semantics, not syntax.)

My understanding is that if we make "1:3" an expression, as described in my previous post, then it will also follow that

    >>> { :: :: :}

is valid syntax.

To allow

    >>> d[x=1:3]

while forbidding

    >>> { :: :: :}

will, I think, require changes in several places to Python.asdl. I think it reasonable to require the PEP to explicitly state what those changes are. If not you, then perhaps some supporter of this change can provide such changes, to be included in the PEP.

By the way, Python.asdl is included verbatim in the docs page
https://docs.python.org/3/library/ast.html#abstract-grammar
https://github.com/python/cpython/blob/3.8/Doc/library/ast.rst#abstract-gram...

-- Jonathan
On Mon, Aug 24, 2020 at 12:43 AM Christopher Barker <pythonchb@gmail.com> wrote:
Why not allow slice syntax as an expression everywhere? Everywhere I’ve tried, it’s a syntax error now, but is there any technical reason that it couldn’t be used pretty much anywhere?
How often do you do this?
    >>> class Slice:
    ...     def __getitem__(self, o):
    ...         return o
    ...
    >>> I = Slice()
    >>> print(I[1:100:3], I[999:888:-10])
    slice(1, 100, 3) slice(999, 888, -10)
Currently, it takes three extra characters to get a "slice anywhere." My answer is actually "more than never" since I actually use pandas.IndexSlice and numpy.s_ occasionally, both of which are the same as this (but as shown, no need to install/import either to get the functionality). But it's not "all the time" either. -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
On Sun, Aug 23, 2020 at 09:43:14PM -0700, Christopher Barker wrote:
Why not allow slice syntax as an expression everywhere? Everywhere I’ve tried, it’s a syntax error now, but is there any technical reason that it couldn’t be used pretty much anywhere?
When do you use slices outside of a subscript? More importantly, when do you need slices outside of a subscript where they would benefit from being written in compact slice syntax rather than function call syntax?

I think I've done something like this once or twice:

    chunk = slice(a, b, step)
    for seq in sequences:
        do_something_with(seq[chunk])

but I don't even remember why :-)

I'm not convinced that the first line would be better written as:

    chunk = a:b:step

But this definitely wouldn't be:

    chunk = ::

So apart from "but it looks cool" why do you want this? (I agree that slices look cool inside subscripts, I'm just not so sure about outside of them.)

-- Steve
On Sun, Aug 23, 2020 at 9:42 PM Todd <toddrjen@gmail.com> wrote:
I think it is worth directly discussing the availability of slices in PEP 472-style keyword indices, since we seem to have mostly converged on a dunder method signature. This is an issue that has been alluded to regarding keyword-based (labelled) indices but not directly addressed. The basic syntax would be something like d[x=1:3].
As I mentioned in another thread, I think the syntax in which the initial argument of the slice is missing may be visually confusing, as it is too similar to the walrus operator:

    d[x=:3] or d[x=:]

There's precedent for combinations of symbols that have different meanings when swapped. For example:

    x += 3 vs x =+ 3

However, the second case will usually be formatted as x = +3, as per PEP 8 <https://www.python.org/dev/peps/pep-0008/#other-recommendations>.

So unless PEP 8 is updated to require/suggest spaces around a keyword index (which I'm not proposing), I am -1 on the suggested feature, at least when the initial element is missing.
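Both halves of this comparison can be checked with a short sketch, assuming Python 3.8 or later so that the tokenizer knows `:=`. The helper name `op_tokens` is made up for the illustration; it only tokenizes the text and does not imply that `d[x=:3]` is valid syntax.

```python
import io
import tokenize

x = 10
x += 3   # augmented assignment: x becomes 13
y = 10
y =+ 3   # plain assignment of +3: y becomes 3


def op_tokens(source):
    """Return the operator tokens the tokenizer finds in `source`."""
    readline = io.StringIO(source + "\n").readline
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.OP]


walrus_ops = op_tokens("d[(x:=3)]")       # contains the single token ':='
keyword_slice_ops = op_tokens("d[x=:3]")  # '=' and ':' stay separate tokens
```

The tokenizer has no trouble telling `=:` apart from `:=`, so the concern here is about human readers rather than about ambiguity for the parser.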
-- Sebastian Kreft
On Mon, Aug 24, 2020 at 9:26 AM Sebastian Kreft <skreft@gmail.com> wrote:
As I mentioned in another thread, I think the syntax in which the initial argument of the slice is missing may be visually confusing, as it is too similar to the walrus operator.
d[x=:3] or d[x=:]
which I suppose is the point mentioned: "what if we want to use colons for other things"?

However, the second case will be usually formatted as x = +3, as per PEP 8 <https://www.python.org/dev/peps/pep-0008/#other-recommendations>
So unless PEP8 is updated to require/suggest spaces around a keyword index (which I'm not proposing), then I am -1 for the suggested feature, at least when the initial element is missing.
PEP 8 can't require anything, and of course it could be updated -- that seems a non-issue.

As for "why not" not being a motivator -- I agree. I posted it that way because this conversation has brought up a number of examples where slice syntax is nice to use. And as David Mertz pointed out, both numpy and pandas have a utility to make slices easier -- yes, that's an argument for why you don't need them, but it's also an argument for why there IS a need for slice objects outside of the square brackets, and if we need slice objects, then slice syntax is nice.

But rather than suggesting one or two places where we might use slice syntax, e.g. function calls:

    object.get_subset(x=a:b, y=c:d)

or

    itertools.islice(it, 2:20:2)

once you allow them anywhere outside [], then the question does become "why not?", because having an expression that is only valid in some small fraction of contexts gets even more confusing.

Of course, as has been pointed out here -- dict displays are one good reason NOT to support it. Which is too bad -- I'd really like to see it more broadly available -- I think it's a very nice syntax.

I am strongly in favor of having slices. The main motivating factor for me, labelled dimensions in xarray, would be much, much less useful without support for slices. In fact, as PEP 472 currently mentions, the big benefit of indexing over method calls is that indexing supports slice syntax while method calls don't.
so maybe the solution is to have method calls support slices :-) -CHB
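For reference, this is roughly what a call like `itertools.islice(it, 2:20:2)` has to look like today. A minimal sketch: `islice_by` is a hypothetical helper invented for this illustration, not part of `itertools`, and it inherits `islice`'s restriction to non-negative bounds.

```python
import itertools


def islice_by(iterable, sl):
    """Apply a slice object to any iterable via itertools.islice."""
    # islice accepts None for start and step, matching an omitted bound.
    return itertools.islice(iterable, sl.start, sl.stop, sl.step)


numbers = iter(range(100))
evens = list(islice_by(numbers, slice(2, 20, 2)))
```

With slice syntax allowed in calls, the last line could drop the explicit `slice(...)`; the behaviour of the helper itself would not change.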
On Tue, Aug 25, 2020 at 2:26 AM Christopher Barker <pythonchb@gmail.com> wrote:
As for "why not" not being a motivator -- I agree, I posted it that easy because this conversation has brought up a number of examples where slice syntax is nice to use. And David Mertz pointed out, both numpy and pandas have a utility to make easier slices -- yes, that's an argument for why you don't need them, but it's also an argument for why there IS a need for slice objects outside of the square brackets, and if we need slice objects, then slice syntax is nice.
Pandas is kinda special though. It semi-abuses Python syntax in quite a few places. For example, here is an example from the Pandas docs:
    >>> idx = pd.IndexSlice
    >>> dfmi.loc[idx[:, 'B0':'B1'], :]
There is a hierarchical index (MultiIndex) where we want to put slices as some of the "dimensions" of slice items. It definitely makes sense in the Pandas world, but I have never once encountered anywhere I would want "stand-alone" slice objects outside of Pandas. I know NumPy includes the same convenience, but it's not even coming to me immediately in what context you would want that in NumPy. Personally, I like the single letter `I` even more than the `idx` name in the example. But in general, having to use a somewhat special object to do something that really is rare and special doesn't feel like a big burden. Moreover, as others have noted, in dictionaries and elsewhere, allowing slices as generic expressions allows for very confusing looking code (I won't say it's *definitely* ambiguous, but certainly hard for humans to make sense of, even if the parser could). -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
On Tue, Aug 25, 2020 at 10:58 AM David Mertz <mertz@gnosis.cx> wrote:
There is a hierarchical index (MultiIndex) where we want to put slices as some of the "dimensions" of slice items. It definitely makes sense in the Pandas world, but I have never once encountered anywhere I would want "stand-alone" slice objects outside of Pandas. I know NumPy includes the same convenience, but it's not even coming to me immediately in what context you would want that in NumPy.
I had to do it in a situation where I needed slices but couldn't tell ahead of time how many dimensions would be sliced. So I had to construct a dynamically-sized tuple of slices on-the-fly. I doubt this is a common situation, though.
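A minimal sketch of that kind of dynamically built index, in plain Python. The names `axis_index` and `Probe` are hypothetical and exist only for this illustration; a real array library would receive the resulting tuple in its own `__getitem__`.

```python
def axis_index(ndim, axis, start, stop):
    """Build an index tuple that slices one axis and keeps the others whole."""
    index = [slice(None)] * ndim
    index[axis] = slice(start, stop)
    return tuple(index)


class Probe:
    """Return whatever key the subscript syntax produces."""

    def __getitem__(self, key):
        return key


built = axis_index(3, 1, 2, 5)   # number of dimensions known only at runtime
literal = Probe()[:, 2:5, :]     # the same index written with slice syntax
```

The two values compare equal, which is why explicit `slice` objects are needed whenever the number of dimensions is not known when the code is written.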
On Mon, 24 Aug 2020 at 02:42, Todd <toddrjen@gmail.com> wrote:
So I think any revision to PEP 472 or new PEP should directly and explicitly support the use of slices.
Duly noted (I am revisiting the PEP in light of all your comments starting from 2019) but yes, I fully agree that slicing should be supported for keyed arguments. -- Kind regards, Stefano Borini
participants (12)
- Alex Hall
- Chris Angelico
- Christopher Barker
- David Mertz
- Jonathan Fine
- Paul Moore
- Random832
- Sebastian Kreft
- Stefano Borini
- Steven D'Aprano
- tcphone93@gmail.com
- Todd