Where did we go wrong with negative stride?
In the comments of http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.... there were some complaints about the interpretation of the bounds for negative strides, and I have to admit it feels wrong. Where did we go wrong? For example,
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
"abcde"[:1:-1] == "edc"
"abcde"[:0:-1] == "edcb"
but
"abcde"[:-1:-1] == ""
I'm guessing it all comes from the semantics I assigned to negative stride for range() long ago, unthinkingly combined with the rules for negative indices.
Are we stuck with this forever? If we want to fix this in Python 4 we'd have to start deprecating negative stride with non-empty lower/upper bounds now. And we'd have to start deprecating negative step for range() altogether, recommending reversed(range(lower, upper)) instead.
Thoughts? Is NumPy also affected?
-- --Guido van Rossum (python.org/~guido)
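The behaviour described in this message can be checked directly. The following is a small sketch, not part of the original post; the names `s` and `candidates` are only illustrative.

```python
s = "abcde"

# Omitting both bounds reverses the whole string.
assert s[::-1] == "edcba"

# A non-negative stop always leaves out at least the first character...
assert s[:1:-1] == "edc"
assert s[:0:-1] == "edcb"

# ...and -1 is interpreted as "the last index", so the slice is empty.
assert s[:-1:-1] == ""

# No valid non-negative index, nor -1, reproduces s[::-1] as a stop value.
candidates = list(range(len(s))) + [-1]
assert all(s[:stop:-1] != s[::-1] for stop in candidates)
```

Every assertion passes on current Python 3, which is exactly the asymmetry the message complains about.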
I believe the problem is not about negative strides but about negative bounds. There should be a notion of "minus zero", something like "abcde"[:-0:-1] == "edcba". Here ":-" serves as a special syntax for negative stride; of course it is not a real proposal.
The same awkwardness results when you take a negative upper bound to the limit of 0:
"abcde"[:-2] == "abc"
"abcde"[:-1] == "abcd"
"abcde"[:-0] == ""
(I once filed a bug for it, which was of course correctly rejected: http://bugs.python.org/issue17287).
Elazar
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas
On 27/10/2013 5:04pm, Guido van Rossum wrote:
Are we stuck with this forever? If we want to fix this in Python 4 we'd have to start deprecating negative stride with non-empty lower/upper bounds now. And we'd have to start deprecating negative step for range() altogether, recommending reversed(range(lower, upper)) instead.
Or recommend using None?
>>> "abcde"[None:None:-1]
'edcba'
-- Richard
On 27/10/2013 17:04, Guido van Rossum wrote:
In the comments of http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.... there were some complaints about the interpretation of the bounds for negative strides, and I have to admit it feels wrong. Where did we go wrong? For example,
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
"abcde"[:1:-1] == "edc"
"abcde"[:0:-1] == "edcb"
but
"abcde"[:-1:-1] == ""
I'm guessing it all comes from the semantics I assigned to negative stride for range() long ago, unthinkingly combined with the rules for negative indices.
For a positive stride, omitting the second bound is equivalent to length + 1:
>>> "abcde"[:6:1]
'abcde'
For a negative stride, omitting the second bound is equivalent to -(length + 1):
>>> "abcde"[:-6:-1]
'edcba'
Are we stuck with this forever? If we want to fix this in Python 4 we'd have to start deprecating negative stride with non-empty lower/upper bounds now. And we'd have to start deprecating negative step for range() altogether, recommending reversed(range(lower, upper)) instead.
Thoughts? Is NumPy also affected?
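One way to see what an omitted bound actually resolves to is the standard `slice.indices()` method. This is a small check added for illustration, not something from the original message; the names `forward` and `backward` are made up for the example.

```python
s = "abcde"

# slice.indices(length) returns the concrete (start, stop, step)
# that a slice resolves to for a sequence of the given length.
forward = slice(None, None, 1).indices(len(s))
backward = slice(None, None, -1).indices(len(s))

print(forward)   # (0, 5, 1)
print(backward)  # (4, -1, -1)

# The resolved triple describes the same positions as range().
assert [s[i] for i in range(*backward)] == list(s[::-1])

# Out-of-range bounds are clamped, so these also span the whole string.
assert s[:6:1] == s[:5:1] == "abcde"
assert s[:-6:-1] == "edcba"
```

Note that the `-1` in the resolved backward triple means "one position before index 0" for `range()`; written literally as a slice bound it would be read as the last index instead.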
On Sun, Oct 27, 2013 at 10:40 AM, MRAB <python@mrabarnett.plus.com> wrote:
For a positive stride, omitting the second bound is equivalent to length + 1:
"abcde"[:6:1] 'abcde'
Actually, it is equivalent to length; "abcde"[:5:1] == "abcde" too.
For a negative stride, omitting the second bound is equivalent to -(length + 1):
"abcde"[:-6:-1] 'edcba'
Hm, so the idea is that with a negative stride you should use negative indices. Then at least you get a somewhat useful invariant:
if -len(a)-1 <= j <= i <= -1: len(a[i:j:-1]) == i-j
which at least somewhat resembles the invariant for positive indexes and stride:
if 0 <= i <= j <= len(a): len(a[i:j:1]) == j-i
For negative indices and stride, we now also get back this nice theorem about adjacent slices:
if -len(a)-1 <= i <= -1: a[:i:-1] + a[i::-1] == a[::-1]
Using negative indices also restores the observation that a[i:j:k] produces exactly the items corresponding to the values produced by range(i, j, k).
Still, the invariant for negative stride looks less attractive, and the need to use negative indices confuses the matter. Also we end up with -1 corresponding to the position at one end and -len(a)-1 corresponding to the position at the other end. The -1 offset feels really wrong here.
I wonder if it would have been simpler if we had defined a[i:j:-1] as the reverse of a[i:j]?
What are real use cases for negative strides?
-- --Guido van Rossum (python.org/~guido)
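The invariants stated in this message can be verified by brute force over a short string. This is only a sketch for illustration; the variable names are arbitrary.

```python
a = "abcde"
n = len(a)

# Negative indices with stride -1: -len(a)-1 <= j <= i <= -1
for i in range(-n - 1, 0):
    for j in range(-n - 1, i + 1):
        assert len(a[i:j:-1]) == i - j
        # The slice yields the same items as range(i, j, -1).
        assert [a[x] for x in range(i, j, -1)] == list(a[i:j:-1])

# Non-negative indices with stride 1: 0 <= i <= j <= len(a)
for i in range(0, n + 1):
    for j in range(i, n + 1):
        assert len(a[i:j:1]) == j - i

# Adjacent reversed slices concatenate to the full reversal.
for i in range(-n - 1, 0):
    assert a[:i:-1] + a[i::-1] == a[::-1]
```

All assertions hold, including the edge cases i == -1 and i == -len(a)-1.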
On 27 October 2013 18:32, Guido van Rossum <guido@python.org> wrote:
Hm, so the idea is that with a negative stride you should use negative indices.
The same problem arises when using negative indices and a positive stride, e.g.:
# Chop off last n elements
x_chopped = x[:-n] # Fails when n == 0
The solution is to use a positive end condition:
x_chopped = x[:len(x)-n]
Oscar
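A minimal sketch of the "chop off the last n elements" case follows. The helper name `chop` is hypothetical, and it assumes `len(x) - n` as the positive end bound and 0 <= n <= len(x).

```python
def chop(x, n):
    """Return x without its last n items, for 0 <= n <= len(x)."""
    return x[:len(x) - n]

x = [1, 2, 3, 4]

# The negative-index spelling works until n reaches 0.
assert x[:-2] == [1, 2]
assert x[:-0] == []           # -0 == 0, so this is x[:0]

# The positive end bound behaves uniformly.
assert chop(x, 2) == [1, 2]
assert chop(x, 0) == [1, 2, 3, 4]
```

For n larger than len(x) the computed bound becomes negative and wraps around again, so a real helper would probably want to clamp it.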
On 27/10/2013 18:32, Guido van Rossum wrote:
Hm, so the idea is that with a negative stride you should use negative indices. Then at least you get a somewhat useful invariant:
if -len(a)-1 <= j <= i <= -1: len(a[i:j:-1]) == i-j
which at least somewhat resembles the invariant for positive indexes and stride:
if 0 <= i <= j <= len(a): len(a[i:j:1]) == j-i
For negative indices and stride, we now also get back this nice theorem about adjacent slices:
if -len(a)-1 <= i <= -1: a[:i:-1] + a[i::-1] == a[::-1]
Using negative indices also restores the observation that a[i:j:k] produces exactly the items corresponding to the values produced by range(i, j, k).
Still, the invariant for negative stride looks less attractive, and the need to use negative indices confuses the matter. Also we end up with -1 corresponding to the position at one end and -len(a)-1 corresponding to the position at the other end. The -1 offset feels really wrong here.
The difference might be because the left end is at offset 0 but the right end is at offset -1.
I wonder if it would have been simpler if we had defined a[i:j:-1] as the reverse of a[i:j]?
'range' is defined as range(start, stop, stride). Some examples from other languages:
BASIC:
for i = start to stop step stride
Pascal:
for i := start to stop do
for i := start downto stop do
The order of start and stop is the same. If you're slicing in reverse order, then the current order of the start and stop positions seems reasonable to me.
What are real use cases for negative strides?
On Sun, Oct 27, 2013 at 10:04 AM, Guido van Rossum <guido@python.org> wrote:
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
"abcde"[:1:-1] == "edc" "abcde"[:0:-1] == "edcb"
This isn't really a negative stride issue. [x:y] is a half-open range, [x, y) in mathematical notation, and therefore you need a value for y that is one more. As others have pointed out, there is a number you can put in the second bound, but it's not a valid index:
'abcde'[:-6:-1] == 'edcba'
But the same thing applies to positive strides:
'abcde'[::1] == 'abcde'[:5:1] == 'abcde'
And the only values you can replace 5 with that work are out of bounds as well, or the special value None. None represents both the left edge and the right edge, and if we deem that confusing, slices could be modified to accept -inf as representing the left edge and inf as representing the right edge. Thus we'd have:
'abcde'[-inf:inf] == 'abcde'
'abcde'[inf:-inf] == ''
On Sun, Oct 27, 2013 at 12:23 PM, MRAB <python@mrabarnett.plus.com> wrote:
The difference might be because the left end is at offset 0 but the right end is at offset -1.
If the left end was offset 1 and the right end was offset -1 then some of the asymmetry goes away. On Sun, Oct 27, 2013 at 1:02 PM, Ron Adam <ron3200@gmail.com> wrote:
And I've never liked the property where when counting down, and you pass 0, it wraps around. (And the other case of counting up when passing 0.)
And then when you count down and it passes 0, you'd get an index error. I'm *not* proposing we change how strings are indexed. I think that might break a few programs. I'm just pointing out that you can't count from zero in both directions and that introduces some weirdness.
--- Bruce
On 10/27/2013 01:32 PM, Guido van Rossum wrote:
Still, the invariant for negative stride looks less attractive, and the need to use negative indices confuses the matter. Also we end up with -1 corresponding to the position at one end and -len(a)-1 corresponding to the position at the other end. The -1 offset feels really wrong here.
And I've never liked the property where when counting down, and you pass 0, it wraps around. (And the other case of counting up when passing 0.)
I wonder if it would have been simpler if we had defined a[i:j:-1] as the reverse of a[i:j]?
I think that would have been simpler. Could adding an __rgetitem__() improve things?
seq[i:j:k] --> __getitem__(slice(i, j, k))
seq-[i:j:k] --> __rgetitem__(slice(i, j, k))
Or the sign of k could determine whether __getitem__ or __rgetitem__ is used?
Ron
One thing I find unfortunate and does trip me up in practice, is that if you want to do a whole sequence up to k from the end: u[:-k] hits a singularity if k=0 Sorry, not exactly related to negative stride
Neal Becker wrote:
One thing I find unfortunate and does trip me up in practice, is that if you want to do a whole sequence up to k from the end:
u[:-k]
hits a singularity if k=0
I think the only way to really fix this cleanly is to have a different *syntax* for counting from the end, rather than trying to guess from the value of the argument. I can't remember ever needing to write code that switches dynamically between from-start and from-end indexing, or between forward and reverse iteration direction -- and if I ever did, I'd be happy to write two code branches. -- Greg
On Sun, Oct 27, 2013 at 4:45 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I think the only way to really fix this cleanly is to have a different *syntax* for counting from the end, rather than trying to guess from the value of the argument.
I was thinking the exact same thing today. Suppose the slice syntax was changed to:
[start:stop:stride:reverse]
where 0 or None or False for reverse leaves the slice in order while any True value reverses it. This would replace negative strides:
'abcde'[2:5] == 'cde'
'abcde'[2:5::True] == 'edc'
'abcde'[::-2] == 'abcde'[::2:True] == 'eca'
'abcdef'[::-2] == 'fdb'
'abcdef'[::2:True] == 'eca'
As the last three examples illustrate, sometimes the reverse is equivalent to a negative stride and sometimes it's not.
--- Bruce
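The proposed fourth slice field is not valid Python, but its intended meaning can be approximated with a plain function. The sketch below assumes "reverse" simply means "take the forward slice, then reverse the result"; the name `reversed_slice` is hypothetical.

```python
def reversed_slice(seq, start=None, stop=None, step=None, reverse=False):
    """Take seq[start:stop:step], then optionally reverse the result."""
    result = seq[start:stop:step]
    return result[::-1] if reverse else result

# Reversing a forward slice keeps the same items, in opposite order.
assert reversed_slice("abcde", 2, 5) == "cde"
assert reversed_slice("abcde", 2, 5, reverse=True) == "edc"

# With an odd length, reverse and a negative stride agree...
assert reversed_slice("abcde", step=2, reverse=True) == "abcde"[::-2] == "eca"

# ...but with an even length they select different items.
assert "abcdef"[::-2] == "fdb"
assert reversed_slice("abcdef", step=2, reverse=True) == "eca"
```

The difference in the last two lines comes from where the stride starts counting: a negative stride starts from the last item, while the reversed forward slice starts from the first.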
On Mon, Oct 28, 2013 at 10:45 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Neal Becker wrote:
One thing I find unfortunate and does trip me up in practice, is that if you want to do a whole sequence up to k from the end:
u[:-k]
hits a singularity if k=0
I think the only way to really fix this cleanly is to have a different *syntax* for counting from the end, rather than trying to guess from the value of the argument. I can't remember ever needing to write code that switches dynamically between from-start and from-end indexing, or between forward and reverse iteration direction -- and if I ever did, I'd be happy to write two code branches.
If it'd help, you could borrow Pike's syntax for counting-from-end ranges: <2 means 2 from the end, <0 means 0 from the end. So "abcdefg"[:<2] would be "abcde", and "abcdefg"[:<0] would be "abcdefg". Currently that's invalid syntax (putting a binary operator with no preceding operand), so it'd be safe and unambiguous. ChrisA
On 28 Oct 2013 16:34, "Chris Angelico" <rosuav@gmail.com> wrote:
If it'd help, you could borrow Pike's syntax for counting-from-end ranges: <2 means 2 from the end, <0 means 0 from the end. So "abcdefg"[:<2] would be "abcde", and "abcdefg"[:<0] would be "abcdefg". Currently that's invalid syntax (putting a binary operator with no preceding operand), so it'd be safe and unambiguous.
In this vein, I started wondering if it might be worth trying to come up with a syntax to control whether the ends of a slice were open or closed. Since mismatched paren types would be too confusing, perhaps abusing some binary operators as Chris suggested could help:
"[<i:" closed start of slice (default)
"[i<:" open start of slice
":>j]" open end of slice (default)
":j>]" closed end of slice
":>j:k]" open end of slice with step
":j>:k]" closed end of slice with step
Default slice: "[<0:-1>:1]"
Reversed slice: "[<-1:0>:-1]"
This makes it possible to cleanly include the final element as a closed range, rather than needing to add or subtract 1 (and avoids the zero trap when indexing from the end).
Cheers, Nick.
On Oct 28, 2013, at 11:00 PM, Nick Coghlan wrote:
In this vein, I started wondering if it might be worth trying to come up with a syntax to control whether the ends of a slice were open or closed.
Since mismatched paren types would be too confusing, perhaps abusing some binary operators as Chris suggested could help:
"[<i:" closed start of slice (default) "[i<:" open start of slice ":>j]" open end of slice (default) ":j>]" closed end of slice ":>j:k]" open end of slice with step ":j>:k]" closed end of slice with step
Default slice: "[<0:-1>:1]" Reversed slice: "[<-1:0>:-1]"
This makes it possible to cleanly include the final element as a closed range, rather than needing to add or subtract 1 (and avoids the zero trap when indexing from the end).
Sorry, I'm -1 here. I think it's already difficult enough to teach, read, and comprehend what's going on with slice notation when there are strides (especially negative ones). I don't think this syntax will make it easier to understand at a glance (or even upon some deeper inspection). -Barry
Barry Warsaw wrote:
On Oct 28, 2013, at 11:00 PM, Nick Coghlan wrote:
"[<i:" closed start of slice (default) "[i<:" open start of slice ":>j]" open end of slice (default) ":j>]" closed end of slice ":>j:k]" open end of slice with step ":j>:k]" closed end of slice with step
Sorry, I'm -1 here. ... I don't think this syntax will make it easier to understand at a glance (or even upon some deeper inspection).
I agree that this looks far too cluttered. Joshua's ~ idea shows that we don't need separate syntax for "from the start" and "from the end", just something that means "from the other end". Also we want a character that doesn't look too obtrusive and doesn't already have a meaning the way we're using it. How about: a[^i:j] a[i:^j] a[^i:^j] -- Greg
On 10/28/2013 08:00 AM, Nick Coghlan wrote:
In this vein, I started wondering if it might be worth trying to come up with a syntax to control whether the ends of a slice were open or closed.
Since mismatched paren types would be too confusing, perhaps abusing some binary operators as Chris suggested could help:
"[<i:" closed start of slice (default) "[i<:" open start of slice ":>j]" open end of slice (default) ":j>]" closed end of slice ":>j:k]" open end of slice with step ":j>:k]" closed end of slice with step
Default slice: "[<0:-1>:1]" Reversed slice: "[<-1:0>:-1]"
This makes it possible to cleanly include the final element as a closed range, rather than needing to add or subtract 1 (and avoids the zero trap when indexing from the end).
I think a reverse index object could be easier to understand. For now it could be just a subclass of int. Then 0 and rx(0) would be distinguishable from each other. (-i and rx(i) would be too.)
seq[0:rx(0)] Default slice.
seq[0:rx(0):-1] Reversed slice. (compare to above)
seq[rx(5): rx(0)] The last 5 items.
A syntax could be added later. (Insert preferred syntax below.)
seq[\5:\0] The last 5 items
How about this example, which would probably use names instead of the integers in real code.
>>> "abcdefg"[3:10] # 10 is past the end. (works fine)
'defg'
Sliding the range 5 to the left...
>>> "abcdefg"[-2:5] # -2 is before the beginning? (Nope)
'' # The wrap around gotcha!
The same situation happens when indexing from the right side [-i:-j], and sliding the range to the right. Once j >= 0, it breaks.
It would be nice if these worked the same on both ends. A reverse index object could fix both of these cases.
Cheers, Ron
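A rough sketch of how such a reverse index object might behave, using an ordinary function in place of real slicing support. Everything here is hypothetical: `rx`, `resolve` and `rx_slice` are illustrative names, only the default stride is handled, and plain negative ints are assumed to clamp to the left edge instead of wrapping.

```python
class rx(int):
    """Hypothetical reverse index: rx(n) means n items from the right end."""

def resolve(bound, length):
    """Turn None, an int, or an rx bound into a plain slice bound."""
    if bound is None:
        return None
    if isinstance(bound, rx):
        return max(length - int(bound), 0)
    # Plain ints never wrap around: clamp them into [0, length].
    return min(max(bound, 0), length)

def rx_slice(seq, start=None, stop=None):
    """Slice seq[start:stop] with rx-aware, non-wrapping bounds."""
    n = len(seq)
    return seq[resolve(start, n):resolve(stop, n)]

s = "abcdefg"
assert rx_slice(s, 0, rx(0)) == "abcdefg"    # default slice
assert rx_slice(s, rx(5), rx(0)) == "cdefg"  # the last 5 items
assert rx_slice(s, 3, 10) == "defg"          # past the right end
assert rx_slice(s, -2, 5) == "abcde"         # past the left end, no wrap
```

With this model, 0 and rx(0) are distinct, and sliding a range past either end just truncates it.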
On 29/10/2013 01:34, Ron Adam wrote:
I think a reverse index object could be easier to understand. For now it could be just a subclass of int. Then 0 and rx(0) would be distinguishable from each other. (-i and rx(i) would be too.)
seq[0:rx(0)] Default slice. seq[0:rx(0):-1] Reversed slice. (compare to above)
seq[rx(5): rx(0)] The last 5 items.
A syntax could be added later. (Insert preferred syntax below.)
seq[\5:\0] The last 5 items
If you're going to have a reverse index object, shouldn't you also have an index object?
I don't like the idea of counting from one end with one type and from the other end with another type.
But if you're really set on having different types of some kind, how about real counting from the left and imaginary counting from the right:
seq[5j : 0j] # The last 5 items
seq[1 : 1j] # From second to second-from-last
How about this example, which would probably use names instead of the integers in real code.
>>> "abcdefg"[3:10] # 10 is past the end. (works fine) 'defg'
Sliding the range 5 to the left...
>>> "abcdefg"[-2:5] # -2 is before the beginning? (Nope) '' # The wrap around gotcha!
The same situation happens when indexing from the right side [-i:-j], and sliding the range to the right. Once j >= 0, it breaks.
It would be nice if these worked the same on both ends. A reverse index object could fix both of these cases.
If you don't want a negative int to count from the right, then the clearest choice I've seen so far is, IMHO, 'end':
seq[end - 5 : end] # The last 5 items
seq[1 : end - 1] # From second to second-from-last
I don't know the best way to handle it, but here's an idea: do it in the syntax:
subscript: subscript_test | [subscript_test] ':' [subscript_test] [sliceop]
subscript_test: test | 'end' '-' test
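The 'end' spelling can be approximated without new grammar by a small marker object. This is only a sketch under assumptions: `_End`, `end` and `end_slice` are hypothetical names, only `end - n` is supported, and the default stride is assumed.

```python
class _End:
    """Hypothetical marker standing for len(seq) - offset."""

    def __init__(self, offset=0):
        self.offset = offset

    def __sub__(self, n):
        # end - n moves n positions further from the right edge.
        return _End(self.offset + n)

    def resolve(self, length):
        return max(length - self.offset, 0)

end = _End()

def end_slice(seq, start, stop):
    """Slice seq[start:stop], resolving any _End markers first."""
    n = len(seq)

    def fix(bound):
        return bound.resolve(n) if isinstance(bound, _End) else bound

    return seq[fix(start):fix(stop)]

seq = "abcdefg"
assert end_slice(seq, end - 5, end) == "cdefg"    # the last 5 items
assert end_slice(seq, 1, end - 1) == "bcdef"      # second to second-from-last
assert end_slice(seq, 0, end - 0) == "abcdefg"    # no zero trap
```

Because `end - 0` is still an `_End` marker and not the integer 0, the "whole sequence up to 0 from the end" case works without a special branch.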
On Tue, Oct 29, 2013 at 2:43 PM, MRAB <python@mrabarnett.plus.com> wrote:
But if you're really set on having different types of some kind, how about real counting from the left and imaginary counting from the right:
seq[5j : 0j] # The last 5 items
seq[1 : 1j] # From second to second-from-last
Interesting idea, but is the notion of indexing a list with a float going to be another huge can of worms? ChrisA
On 10/28/2013 10:43 PM, MRAB wrote:
If you're going to have a reverse index object, shouldn't you also have an index object?
I don't like the idea of counting from one end with one type and from the other end with another type.
It would be possible to make it work both ways by having a direction attribute on it which is set with a unary minus operation.
seq[-ix(5): -ix(0)]
Positive integers would work normally too. Negative ints would just be to the left of the first item rather than the left of the last item.
Just had a thought. In accounting, negative numbers are often represented as a positive number in parentheses.
seq[(5,):(0,)] Last 5 items.
Unfortunately we need the comma to define a single item tuple. :-/
But this would work without adding new syntax or a new type. And the ',' isn't that big of a deal. It would just take a bit of getting used to.
Cheers, Ron
But if you're really set on having different types of some kind, how about real counting from the left and imaginary counting from the right:
seq[5j : 0j] # The last 5 items
seq[1 : 1j] # From second to second-from-last
If you don't want a negative int to count from the right, then the clearest choice I've seen so far is, IMHO, 'end':
seq[end - 5 : end] # The last 5 items
seq[1 : end - 1] # From second to second-from-last
I don't know the best way to handle it, but here's an idea: do it in the syntax:
subscript: subscript_test | [subscript_test] ':' [subscript_test] [sliceop] subscript_test: test | 'end' '-' test
I think this would work too, but it's not any different from the [\5:\0] syntax example. Just a different spelling. Your example could be done without adding syntax by an end class, which is effectively the same as an index class. Cheers, Ron
On 29/10/2013 17:49, Ron Adam wrote:
It would be possible to make it work both ways by having a direction attribute on it which is set with a unary minus operation.
seq[-ix(5): -ix(0)]
Positive integers would work normally too. Negative ints would just be to the left of the first item rather than the left of the last item.
Just had a thought. In accounting, negative numbers are often represented as a positive number in parentheses.
seq[(5,):(0,)] Last 5 items.
Unfortunately we need the comma to define a single item tuple. :-/
But this would work without adding new syntax or a new type. And the ',' isn't that big of a deal. It would just take a bit of getting used to.
[snip] Suppose there were two new classes, "index" and "rindex". "index" counts from the left and "rindex" counts from the right.
You could also use unary ">" and "<":
>x == index(x)
<x == rindex(x)
Slicing would be like this:
seq[<5 : <0] # The last five items
seq[>1 : <1] # From the second to the second-from-last.
Strictly speaking, str.find and str.index should also return an index instance. In the case of str.find, if the string wasn't found it would return >-1 (i.e. index(-1)), which, when used as an index, would raise an IndexError (index(-1) isn't the same as -1).
In fact, index or rindex instances could end up spreading throughout the language, to wherever an int is actually an index. (You'd also have to handle addition and subtraction with indexes, e.g. pos + 1.)
All of which, I suspect, is taking it too far! :-)
On 10/29/2013 03:19 PM, MRAB wrote:
[snip] Suppose there were two new classes, "index" and "rindex". "index" counts from the left and "rindex" counts from the right.
You could also use unary ">" and "<":
>x == index(x) <x == rindex(x)
Slicing would be like this:
seq[<5 : <0] # The last five items seq[>1 : <1] # From the second to the second-from-last.
Strictly speaking, str.find and str.index should also return an index instance. In the case of str.find, if the string wasn't found it would return >-1 (i.e. index(-1)), which, when used as an index, would raise an IndexError (index(-1) isn't the same as -1).
In fact, index or rindex instances could end up spreading throughout the language, to wherever an int is actually an index. (You'd also have to handle addition and subtraction with indexes, e.g. pos + 1.)
All of which, I suspect, is taking it too far! :-)
I think it may be the only way to get a clean model of slicing from both directions with a 0 based index system. Cheers, Ron
On 30 Oct 2013 07:26, "Ron Adam" <ron3200@gmail.com> wrote:
On 10/29/2013 03:19 PM, MRAB wrote:
On 29/10/2013 17:49, Ron Adam wrote:
On 10/28/2013 10:43 PM, MRAB wrote:
I think a reverse index object could be easier to understand. For
could be just a subclass of int. Then 0 and rx(0) would be distinguishable from each other. (-i and rx(i) would be too.)
seq[0:rx(0)] Default slice. seq[0:rx(0):-1] Reversed slice. (compare to above)
seq[rx(5): rx(0)] The last 5 items.
A syntax could be added later. (Insert preferred syntax below.)
seq[\5:\0] The last 5 items
If you're going to have a reverse index object, shouldn't you also have an index object?
I don't like the idea of counting from one end with one type and from the other end with another type.
It would be possible to make it work both ways by having a direction attribute on it which is set with a unary minus opperation.
seq[-ix(5): -ix(0)]
Positive integers would work normally too. Negative ints would just be to the left of the first item rather than the left of the last item.
Just had a thought. In accounting negative numbers are often represented as a positive number in parenthes.
seq[(5,):(0,)] Last 5 items.
Unfortunately we need the comma to define a single item tuple. :-/
But this wuold work without adding new syntax or a new type. And the ',' isn't that big of a deal. It would just take a bit of getting used to it.
But if you're really set on having different types of some kind, how about real counting from the left and imaginary counting from the right:
seq[5j : 0j] # The last 5 items
seq[1 : 1j] # From second to second-from-last
[snip] Suppose there were two new classes, "index" and "rindex". "index" counts from the left and "rindex" counts from the right.
You could also use unary ">" and "<":
>x == index(x)
<x == rindex(x)
Slicing would be like this:
seq[<5 : <0] # The last five items
seq[>1 : <1] # From the second to the second-from-last.
Strictly speaking, str.find and str.index should also return an index instance. In the case of str.find, if the string wasn't found it would return >-1 (i.e. index(-1)), which, when used as an index, would raise an IndexError (index(-1) isn't the same as -1).
In fact, index or rindex instances could end up spreading throughout the language, to wherever an int is actually an index. (You'd also have to handle addition and subtraction with indexes, e.g. pos + 1.)
All of which, I suspect, is taking it too far! :-)
I think it may be the only way to get a clean model of slicing from both directions with a 0 based index system.
Isn't all that is needed to prevent the default wraparound behaviour clamping negative numbers to zero on input?
As in:
def clampleft(start, stop, step):
    if start is not None and start < 0:
        start = 0
    if stop is not None and stop < 0:
        stop = 0
    return slice(start, stop, step)
Similar to rslice and "reverse=False", this could be implemented as a "range=False" flag (the rationale for the flag name is that in "range", negative numbers are just negative numbers, without the wraparound behaviour normally exhibited by the indices calculation in slice objects).
I think there are two reasonable options that could conceivably be included in 3.4 at this late stage:
* Make slice subclassable and ensure the C API and stdlib respect an overridden indices() method
* add a "reverse" flag to both slice and range, and a "range" flag to slice.
Either way, if any changes are going to be made, a PEP should be written up summarising some of the ideas in this thread, including the clampleft() and rslice() recipes that work in current versions of Python.
Cheers, Nick.
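A minimal sketch of what the clampleft() recipe changes in practice. The default for step and the sample values are assumptions added for illustration; the function body follows the recipe as posted.

```python
def clampleft(start, stop, step=None):
    """Build a slice whose negative bounds are clamped to 0 (no wraparound)."""
    if start is not None and start < 0:
        start = 0
    if stop is not None and stop < 0:
        stop = 0
    return slice(start, stop, step)


s = "abcde"
i = 1
# A window of two items either side of position i.
plain = s[i - 2:i + 2]                # start is -1, which wraps to the last item: ''
clamped = s[clampleft(i - 2, i + 2)]  # start is clamped to 0: 'abc'
```

The plain slice silently comes back empty because -1 is read as "one from the end", while the clamped version treats it as "before the first item".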
Cheers, Ron
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas
On Oct 29, 2013, at 15:07, Nick Coghlan <ncoghlan@gmail.com> wrote:
Isn't all that is needed to prevent the default wraparound behaviour clamping negative numbers to zero on input?
As in:
def clampleft(start, stop, step):
    if start is not None and start < 0:
        start = 0
    if stop is not None and stop < 0:
        stop = 0
    return slice(start, stop, step)
Except many of the wraparound cases people complain about are the other way around, negative stop wrapping around to 0.
You could fix that almost as easily:
def clampright(start, stop, step):
    if start >= 0:
        start = ???
    if stop >= 0:
        stop = None
    return slice(start, stop, step)
Except... What do you set start to if you want to make sure it's past-end? You could force an empty slice (which is the main thing you want) with, e.g., stop=start=0; is that close enough?
On 30 October 2013 11:04, Andrew Barnert <abarnert@yahoo.com> wrote:
On Oct 29, 2013, at 15:07, Nick Coghlan <ncoghlan@gmail.com> wrote:
Isn't all that is needed to prevent the default wraparound behaviour clamping negative numbers to zero on input?
As in:
def clampleft(start, stop, step):
    if start is not None and start < 0:
        start = 0
    if stop is not None and stop < 0:
        stop = 0
    return slice(start, stop, step)
Except many of the wraparound cases people complain about are the other way around, negative stop wrapping around to 0.
You could fix that almost as easily:
def clampright(start, stop, step):
    if start >= 0:
        start = ???
    if stop >= 0:
        stop = None
    return slice(start, stop, step)
Except... What do you set start to if you want to make sure it's past-end? You could force an empty slice (which is the main thing you want) with, e.g., stop=start=0; is that close enough?
Yes, that's what I did in the rslice recipe - if it figured out an empty slice was needed when explicit bounds were involved, it always returned "slice(0, 0, step)" regardless of the original inputs. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
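One way the clampright() idea could be completed, using the slice(0, 0, step) fallback described above. This is a sketch under assumptions: bounds are meant as offsets from the end, the step is positive, and the None checks and the step default are additions that were not in the posted pseudocode.

```python
def clampright(start, stop, step=None):
    """Treat start/stop as offsets from the end; values >= 0 are past the end."""
    if start is not None and start >= 0:
        # Starting at or past the end: force an empty slice.
        return slice(0, 0, step)
    if stop is not None and stop >= 0:
        # Stopping at or past the end: run to the end.
        stop = None
    return slice(start, stop, step)


a = "abcde"
# -0 is just 0, so the last entry takes the "empty slice" branch.
tails = [a[clampright(-n, None)] for n in (3, 2, 1, 0)]
```

With this, "the last n items" gives an empty result for n == 0 instead of the whole sequence.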
On 10/29/2013 05:07 PM, Nick Coghlan wrote:
On 30 Oct 2013 07:26, "Ron Adam" <ron3200@gmail.com> wrote:
I think it may be the only way to get a clean model of slicing from both directions with a 0 based index system.
Isn't all that is needed to prevent the default wraparound behaviour clamping negative numbers to zero on input?
Well, you can't have indexing from the other end and clamp indexes to zero at the same time.
The three situations are...
    index both i and j from the front
    index both i and j from the end
    index i from the front + index j from the end. (?)
And then there is this one, which is what confuses everyone.
    index i from the end + index j from the front.
It's useful in the current slice semantics where I and J are swapped if K is negative. It works, but is not easy to think about clearly. It's meant to match up with start and stop concepts, rather than left and right.
Another issue is whether or not a slice should raise an IndexError if its range is outside the length of the sequence. Both behaviours are useful. Currently it doesn't on one end and gives the wrong output on the other. (When moving a slice left or right.) :-/
For that matter the wraparound behaviour is sometimes useful too. But not if it's only on the left side.
And then there's the idea of open and closed ends. Which you have an interest in. Assuming there are four combinations of that... both-closed, left-open, right-open, and both-open.
That's a lot of things to be trying to shove into one syntax!
So it seems (to me) we may be better off to just concentrate on writing some functions with the desired behaviour(s) and leaving the slice syntax question to later. (But I'm very glad these things are being addressed.)
A function that would cover nearly all of the use cases I can think of...
# Get items that are within slice range.
# indexes: l, r, rl, rr --> left, right, rev-left, rev-right
# The indexes always use positive numbers.
# step and width can be either positive or negative.
# width - chunk to take at each step. (If it can work cleanly.)
get_slice(obj, l=None, r=None, rl=None, rr=None, step=1, width=1)
Used as...
a = get_slice(s, l=i, r=j)    # index from left end
a = get_slice(s, rl=i, rr=j)  # index from right end
a = get_slice(s, l=i, rr=j)   # index from both ends
While that signature definition is long and not too pretty, it could be wrapped to make more specialised and nicer to use variations. Or it could be hidden away in __getitem__ methods.
def mid_slice(obj, i, j, k):
    """Slice obj i and j distance from ends."""
    return get_slice(obj, l=i, rr=j, step=k)
Instead of using flags for these... "closed" "open" "open-right" "open-left" "reversed" "raise-err" "wrap-around"
Would it be possible to have those applied with a context manager while the object is being indexed?
with index_mode(seq, "open", "reversed") as s:
    r = mid_slice(s, i, j)
That could work with any slice syntax we use later. And makes a nice building block for creating specialised slice functions.
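A rough sketch of how get_slice() might behave, to make the proposal concrete. Only the signature comes from the message; everything else here is an assumption: the width argument is left out, out-of-range bounds are clamped rather than raising, and a negative step simply reverses the selected range.

```python
def get_slice(obj, l=None, r=None, rl=None, rr=None, step=1):
    """Slice obj using only non-negative offsets from either end."""
    if (l is not None and rl is not None) or (r is not None and rr is not None):
        raise TypeError("give each bound from one end only")
    n = len(obj)
    # Left bound: from the front (l), from the end (rl), or the very start.
    start = l if l is not None else (n - rl if rl is not None else 0)
    # Right bound: from the front (r), from the end (rr), or the very end.
    stop = r if r is not None else (n - rr if rr is not None else n)
    # Clamp into [0, n] so nothing wraps around.
    start = min(max(start, 0), n)
    stop = min(max(stop, 0), n)
    # Select the range first, then apply the step (a negative step reverses it).
    return obj[start:stop][::step]


s = "abcde"
from_left = get_slice(s, l=1, r=3)    # 'bc'
from_right = get_slice(s, rl=2, rr=0) # 'de'
from_both = get_slice(s, l=1, rr=1)   # 'bcd'
```

Because every offset is non-negative and states which end it counts from, rl=0 and rr=0 both mean "the end" and there is no 0 versus -0 ambiguity.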
As in:
def clampleft(start, stop, step):
    if start is not None and start < 0:
        start = 0
    if stop is not None and stop < 0:
        stop = 0
    return slice(start, stop, step)
Similar to rslice and "reverse=False", this could be implemented as a "range=False" flag (the rationale for the flag name is that in "range", negative numbers are just negative numbers, without the wraparound behaviour normally exhibited by the indices calculation in slice objects).
I know some have mentioned unifying range and slice, even though they aren't the same thing... But it suggests doing...
seq[range(i,j,k)]
I'm not sure there is any real advantage to that other than testing that range and slice behave in similar ways.
I think there are two reasonable options that could conceivably be included in 3.4 at this late stage:
* Make slice subclassable and ensure the C API and stdlib respect an overridden indices() method
I think that would be good, it would allow some experimentation that may be helpful. Is there any reason to not allow it?
* add a "reverse" flag to both slice and range, and a "range" flag to slice.
Either way, if any changes are going to be made, a PEP should be written up summarising some of the ideas in this thread, including the clampleft() and rslice() recipes that work in current versions of Python.
I agree. :-) Cheers, Ron
On 30 October 2013 13:09, Ron Adam <ron3200@gmail.com> wrote:
That's a lot of things to be trying to shove into one syntax!
That's why I no longer think it should be handled as syntax. slice is a builtin, and keyword arguments are a thing. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 30 October 2013 07:39, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 30 October 2013 13:09, Ron Adam <ron3200@gmail.com> wrote:
That's a lot of things to be trying to shove into one syntax!
That's why I no longer think it should be handled as syntax. slice is a builtin, and keyword arguments are a thing.
I assume that you mean to add a reverse keyword argument to the slice constructor so that I can do:
b = a[slice(i, j, reverse=True)]
instead of
b = a[i-1, j-1, -1]
or
b = a[i:j][::-1]
Firstly would it not be better to add slice.__reversed__ so that it would be
b = a[reversed(slice(i, j))]
Secondly I don't think I would ever actually want to use this over the existing possibilities.
There are real problems with slicing and indexing in Python that lead to corner cases and bugs but this particular issue is not one of them. The real problems, including the motivating example at the start of this thread, are caused by the use of negative indices to mean from the end.
Subtracting 1 from the indices when using a negative stride isn't a big deal but correctly and robustly handling the wraparound behaviour is. EAFP only works if invalid inputs raise an error and this is very often not what happens with slicing and indexing.
Oscar
On 30 October 2013 09:52, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Firstly would it not be better to add slice.__reversed__ so that it would be
b = a[reversed(slice(i, j))]
This won't work, because reversed returns an iterator, not a slice object.
Secondly I don't think I would ever actually want to use this over the existing possibilities.
Agreed, while my usage is pretty trivial, I would definitely use b = a[::-1] over b = a[Slice(None, None, None, reversed=True)] I could probably omit some of those None arguments, but I probably wouldn't simply because I can't remember which are optional.
There are real problems with slicing and indexing in Python that lead to corner cases and bugs but this particular issue is not one of them. The real problems, including the motivating example at the start of this thread, are caused by the use of negative indices to mean from the end.
However, being able to write last_n = s[-n:] is extremely useful. I'm losing track of what is being proposed here, but I do not want to have to write that as s[len(s)-n:]. Particularly if "s" is actually a longer variable name, or worse still a calculated value (which I do a lot). Paul
On 30 October 2013 10:02, Paul Moore <p.f.moore@gmail.com> wrote:
There are real problems with slicing and indexing in Python that lead to corner cases and bugs but this particular issue is not one of them. The real problems, including the motivating example at the start of this thread, are caused by the use of negative indices to mean from the end.
However, being able to write
last_n = s[-n:]
is extremely useful.
Until you hit the bug where n is 0.
>>> a = 'abcde'
>>> for n in reversed(range(4)):
...     print(n, a[-n:])
...
3 cde
2 de
1 e
0 abcde
This is what I mean by the wraparound behaviour causing corner cases and bugs. I and others have reported that this is a bigger source of problems than the off-by-one negative stride issue which has never caused me any actual problems. Yes I need to think carefully when writing a negative stride slice but I generally need to think carefully every time I write any slice particularly a multidimensional one. The thing that really makes it difficult to reason about slices is working out whether or not your code is susceptible to wraparound bugs.
I'm losing track of what is being proposed here, but I do not want to have to write that as s[len(s)-n:]. Particularly if "s" is actually a longer variable name, or worse still a calculated value (which I do a lot).
But you currently need to write it that way to get the correct behaviour:
>>> for n in reversed(range(4)):
...     print(n, a[len(a)-n:])
...
3 cde
2 de
1 e
0
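The two sessions above can be condensed into a small helper. This is only a sketch: last_n is a hypothetical name, and the max() guard is an addition that also keeps an n larger than the sequence from wrapping around.

```python
def last_n(seq, n):
    """Return the last n items of seq; n == 0 gives an empty result."""
    # len(seq) - n goes negative when n > len(seq), so clamp it at 0.
    return seq[max(len(seq) - n, 0):]


a = "abcde"
wrapped = [a[-n:] for n in (2, 1, 0)]         # last entry is the whole string
explicit = [last_n(a, n) for n in (2, 1, 0)]  # last entry is empty
```

The difference only shows up at n == 0, which is exactly why it tends to slip past casual testing.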
Oscar
On 30 October 2013 20:13, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
But you currently need to write it that way to get the correct behaviour:
>>> for n in reversed(range(4)):
...     print(n, a[len(a)-n:])
...
3 cde
2 de
1 e
0
Regardless, my main point is this: slices are just objects. The syntax: s[i:j:k] is just syntactic sugar for: s[slice(i, j, k)] That means that until people have fully explored exactly the semantics they want in terms of the existing object model, just as I did for rslice(), then there are *zero* grounds to be discussing syntax changes that provide those new semantics. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
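The "syntactic sugar" point is easy to check with a throwaway class whose __getitem__ just returns whatever it is handed. ShowIndex is a hypothetical name used only for this illustration.

```python
class ShowIndex:
    """Return the raw object that the subscript syntax passes to __getitem__."""

    def __getitem__(self, index):
        return index


s = ShowIndex()
full = s[1:5:2]     # slice(1, 5, 2)
reverse = s[::-1]   # slice(None, None, -1)
single = s[-2]      # plain int -2, no slice involved
```

Omitted bounds arrive as None, which is how a sequence can tell s[::-1] apart from any slice with explicit numeric bounds.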
On 30 October 2013 20:22, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 30 October 2013 20:13, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
But you currently need to write it that way to get the correct behaviour:
>>> for n in reversed(range(4)):
...     print(n, a[len(a)-n:])
...
3 cde
2 de
1 e
0
Regardless, my main point is this: slices are just objects. The syntax:
s[i:j:k]
is just syntactic sugar for:
s[slice(i, j, k)]
That means that until people have fully explored exactly the semantics they want in terms of the existing object model, just as I did for rslice(), then there are *zero* grounds to be discussing syntax changes that provide those new semantics.
Hmm, looks like my rslice testing was broken. Anyway, I created an enhanced version using the "End - idx" notation for indexing from the end that actually passes more systematic testing: https://bitbucket.org/ncoghlan/misc/src/default/rslice.py?at=default
>>> from rslice import rslice, betterslice, End
>>> betterslice(-4, 5)
slice(0, 5, 1)
>>> betterslice(End-4, 5)
slice(-4, 5, 1)
>>> rslice(-4, 5).as_slice(10)
slice(4, -11, -1)
>>> rslice(End-4, 5).as_slice(10)
slice(4, -5, -1)
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 30 October 2013 13:45, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 30 October 2013 20:22, Nick Coghlan <ncoghlan@gmail.com> wrote:
That means that until people have fully explored exactly the semantics they want in terms of the existing object model, just as I did for rslice(), then there are *zero* grounds to be discussing syntax changes that provide those new semantics.
Hmm, looks like my rslice testing was broken. Anyway, I created an enhanced version using the "End - idx" notation for indexing from the end that actually passes more systematic testing:
https://bitbucket.org/ncoghlan/misc/src/default/rslice.py?at=default
It took me a while to get to that link. I think bitbucket may be having server problems.
>>> from rslice import rslice, betterslice, End
>>> betterslice(-4, 5)
slice(0, 5, 1)
>>> betterslice(End-4, 5)
slice(-4, 5, 1)
>>> rslice(-4, 5).as_slice(10)
slice(4, -11, -1)
>>> rslice(End-4, 5).as_slice(10)
slice(4, -5, -1)
I like the idea of a magic End object. I would be happy to see negative indexing deprecated in favour of that. For this to really be useful though it needs to apply to ordinary indexing as well as slicing. If it also becomes an error to use negative indices then you get proper bounds checking as well as an explicit way to show when you're indexing from the end which is a substantial improvement. Oscar
On 31 Oct 2013 00:43, "Oscar Benjamin" <oscar.j.benjamin@gmail.com> wrote:
On 30 October 2013 13:45, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 30 October 2013 20:22, Nick Coghlan <ncoghlan@gmail.com> wrote:
That means that until people have fully explored exactly the semantics they want in terms of the existing object model, just as I did for rslice(), then there are *zero* grounds to be discussing syntax changes that provide those new semantics.
Hmm, looks like my rslice testing was broken. Anyway, I created an enhanced version using the "End - idx" notation for indexing from the end that actually passes more systematic testing:
https://bitbucket.org/ncoghlan/misc/src/default/rslice.py?at=default
It took me a while to get to that link. I think bitbucket may be having server problems.
>>> from rslice import rslice, betterslice, End
>>> betterslice(-4, 5)
slice(0, 5, 1)
>>> betterslice(End-4, 5)
slice(-4, 5, 1)
>>> rslice(-4, 5).as_slice(10)
slice(4, -11, -1)
>>> rslice(End-4, 5).as_slice(10)
slice(4, -5, -1)
I like the idea of a magic End object. I would be happy to see negative indexing deprecated in favour of that. For this to really be useful though it needs to apply to ordinary indexing as well as slicing. If it also becomes an error to use negative indices then you get proper bounds checking as well as an explicit way to show when you're indexing from the end which is a substantial improvement.
That's much harder to do in a backwards compatible way without introducing both the index() and rindex() types Ron (I think?) suggested (the End object in my proof-of-concept is a stripped down rindex type), and even then it's hard to provide both clamping for slices and an index error for out of bounds item lookup. They both also have the problem that __index__ isn't allowed to return None. Regardless, the main thing I got out of writing that proof of concept is that I'd now be +1 on a patch to make it possible and practical to inherit from slice objects to override their construction and their indices() method. Previously I would have asked "What's the point?" Cheers, Nick.
Oscar
On 10/30/2013 11:25 AM, Nick Coghlan wrote:
On 31 Oct 2013 00:43, "Oscar Benjamin" <oscar.j.benjamin@gmail.com> wrote:
I like the idea of a magic End object. I would be happy to see negative indexing deprecated in favour of that. For this to really be useful though it needs to apply to ordinary indexing as well as slicing. If it also becomes an error to use negative indices then you get proper bounds checking as well as an explicit way to show when you're indexing from the end which is a substantial improvement.
I thought of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
That's much harder to do in a backwards compatible way without introducing both the index() and rindex() types Ron (I think?) suggested (the End object in my proof-of-concept is a stripped down rindex type), and even then it's hard to provide both clamping for slices and an index error for out of bounds item lookup. They both also have the problem that __index__ isn't allowed to return None.
Regardless, the main thing I got out of writing that proof of concept is that I'd now be +1 on a patch to make it possible and practical to inherit from slice objects to override their construction and their indices() method. Previously I would have asked "What's the point?"
Indeed you did ;-) From the fourth message of http://bugs.python.org/issue17279: "From the current [2013 Feb] python-ideas 'range' thread: Me: Would it be correct to say (now) that all 4 are intentional omissions? and not merely oversights? Nick: Yes, I think so. People will have to be *real* convincing to explain a case where composition isn't a more appropriate solution." I think one point is that if seq.__getitem__(ob) uses 'if isinstance(ob, slice):' instead of 'if type(ob) is slice:', subclass instances will work whereas wrapper instances would not. I would make range subclassable at the same time. -- Terry Jan Reedy
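For context on why subclassing came up at all, here is a small check of the two alternatives as they stand in the CPython versions I am aware of: slice refuses to be a base class, and a wrapper built by composition is rejected by list indexing. SliceWrapper is a hypothetical name for illustration.

```python
# Subclassing slice is rejected at class-creation time.
try:
    class MySlice(slice):
        pass
    subclass_error = None
except TypeError as exc:
    subclass_error = exc


class SliceWrapper:
    """Composition: holds a slice but is not itself a slice."""

    def __init__(self, *args):
        self.inner = slice(*args)


# list.__getitem__ accepts integers and slices only, so the wrapper fails.
try:
    [1, 2, 3][SliceWrapper(0, 2)]
    wrapper_error = None
except TypeError as exc:
    wrapper_error = exc
```

A wrapper therefore only helps if the caller unwraps it first, for example seq[w.inner], which is the limitation the isinstance remark above is getting at.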
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I though of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-) -eric
On 30/10/2013 23:00, Eric Snow wrote:
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I though of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-)
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
2013/10/31 MRAB <python@mrabarnett.plus.com>:
On 30/10/2013 23:00, Eric Snow wrote:
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I though of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-)
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
Can't it be done by adding a __sub__ method to len? a[:len-n] Readable and short.
On 31/10/2013 00:05, אלעזר wrote:
2013/10/31 MRAB <python@mrabarnett.plus.com>:
On 30/10/2013 23:00, Eric Snow wrote:
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I though of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-)
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
Can't it be done by adding a __sub__ method to len?
a[:len-n]
Readable and short.
-1 I don't like how it makes that function special. I'd much prefer "end" (or "End") instead.
On 10/30/2013 06:36 PM, MRAB wrote:
On 31/10/2013 00:05, אלעזר wrote:
2013/10/31 MRAB <python@mrabarnett.plus.com>:
On 30/10/2013 23:00, Eric Snow wrote:
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I though of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-)
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
Can't it be done by adding a __sub__ method to len?
a[:len-n]
Readable and short.
-1
I don't like how it makes that function special.
Not only that, but len wouldn't know what it was subtracting from. -- ~Ethan~
On 31/10/2013 01:43, Ethan Furman wrote:
On 10/30/2013 06:36 PM, MRAB wrote:
On 31/10/2013 00:05, אלעזר wrote:
2013/10/31 MRAB <python@mrabarnett.plus.com>:
On 30/10/2013 23:00, Eric Snow wrote:
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I though of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-)
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
Can't it be done by adding a __sub__ method to len?
a[:len-n]
Readable and short.
-1
I don't like how it makes that function special.
Not only that, but len wouldn't know what it was subtracting from.
But you could have an "End" class, something like this:
class End:
    def __init__(self, offset=0):
        self.offset = offset
    def __sub__(self, offset):
        return End(self.offset - offset)
    def __add__(self, offset):
        return End(self.offset + offset)
    def __str__(self):
        if self.offset < 0:
            return 'End - {}'.format(-self.offset)
        if self.offset > 0:
            return 'End + {}'.format(self.offset)
        return 'End'
Unfortunately, all those methods that expect an index would have to be modified. :-(
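A sketch of what "modifying the methods that expect an index" could look like without touching any builtins. The resolve() method and the getitem() helper are hypothetical additions for illustration; the End class itself is trimmed from the one above.

```python
class End:
    def __init__(self, offset=0):
        self.offset = offset

    def __sub__(self, offset):
        return End(self.offset - offset)

    def __add__(self, offset):
        return End(self.offset + offset)

    def resolve(self, length):
        """Hypothetical: turn the end-relative offset into an absolute index."""
        return length + self.offset


def getitem(seq, index):
    """Hypothetical stand-in for a __getitem__ that understands End objects."""
    n = len(seq)

    def fix(i):
        return i.resolve(n) if isinstance(i, End) else i

    if isinstance(index, slice):
        return seq[slice(fix(index.start), fix(index.stop), index.step)]
    return seq[fix(index)]


END = End()
last_two = getitem("abcde", slice(END - 2, END))  # 'de'
last_zero = getitem("abcde", slice(END - 0, END)) # ''
last_item = getitem("abcde", END - 1)             # 'e'
```

Note that END - 0 resolves to len(seq) rather than 0, which is the case plain negative indices cannot express.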
2013/10/31 Ethan Furman <ethan@stoneleaf.us>:
On 10/30/2013 06:36 PM, MRAB wrote:
On 31/10/2013 00:05, אלעזר wrote:
2013/10/31 MRAB <python@mrabarnett.plus.com>:
On 30/10/2013 23:00, Eric Snow wrote:
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I though of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-)
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
Can't it be done by adding a __sub__ method to len?
a[:len-n]
Readable and short.
-1
I don't like how it makes that function special.
Not only that, but len wouldn't know what it was subtracting from.
But that doesn't matter; the operation will return the same End object discussed here.
Perhaps we can get this End object by adding two tokens: ":-" and "[-". So
a[-3:-5] == a[slice(End-3, End-5, None)]
although it will turn a[-3] into a[End-3]. I don't think it's a problem if the latter will behave in the same way as the former (i.e. End-3 be a subtype of int).
Note that with an End object (regardless of whether it's called "End", "len-x" or ":-x") we can get End/5. I think that's a nice thing to have.
One more thing: End-5 should be callable, so it can be passed around.
(End-3)("hello") == len("hello")-3
(End-0)("hello") == len("hello")
This way End is a generalization of len, making len somewhat redundant.
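The callable part of this proposal can be tried out in a few lines. This is only a sketch: _End is a hypothetical class name, End is a single shared instance, and the int-subtype and End/5 ideas are not attempted here.

```python
class _End:
    """End-relative offset that can be called on a sequence to resolve it."""

    def __init__(self, offset=0):
        self.offset = offset

    def __sub__(self, n):
        return _End(self.offset - n)

    def __add__(self, n):
        return _End(self.offset + n)

    def __call__(self, seq):
        # Resolve against a concrete sequence, like len(seq) plus the offset.
        return len(seq) + self.offset


End = _End()
three_back = (End - 3)("hello")  # 2
same_as_len = (End - 0)("hello") # 5
```

In this form End - n is just a deferred len(seq) - n, so it can be passed around and resolved later against whichever sequence is at hand.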
On Thu, Oct 31, 2013 at 03:37:36PM +0200, אלעזר wrote:
Perhaps we can get this End object by adding two tokens: ":-" and "[-". So
a[-3:-5] == a[slice(End-3, End-5, None)]
That's ambiguous. Consider:
a[-2]
Is that a dict lookup with key -2, or a list indexed with End-2? Or worse, a dict lookup with key len(a)-2.
To break the ambiguity, we'd need a rule that End objects can only occur inside slices with at least one colon. But, I think that means that the parser would have to look ahead to see whether it was within a slice or not, and that might not be possible with Python's parser.
Even if it were, it's still a special case that -2 means something different inside a slice than outside a slice, and we know what the Zen of Python says about special cases.
although it will turn a[-3] into a[End-3]. I don't think it's a problem if the latter will behave in the same way as the former (i.e End-3 be a subtype of int).
So what would (End-3)*10 return? How about End & 0xF ?
Note that with an End object (regardless of whether it's called "End", "len-x" or ":-x") we can get End/5. I think that's a nice thing to have.
I presume that you expect a[End/5] to be equivalent to a[len(a)//5] ?
One more thing: End-5 should be callable, so it can be passed around.
(End-3)("hello") == len("hello")-3 (End-0)("hello") == len("hello")
This way End is a generalization of len, making len somewhat redundant.
All this seems very clever, but as far as I'm concerned, it's too clever. I don't like objects which are context-sensitive. Given:
x = End-2
then in this context, x behaves like 4:
"abcdef"[x:]
while in this context, x behaves like 0:
"abcd"[x:]
I really don't like that. That makes it hard to reason about code.
End seems to me to be an extremely sophisticated object, far too sophisticated for slicing, which really ought to be conceptually and practically a simple operation. I would not like to have to explain this to beginners to Python. I especially would not like to explain how it works. (How would it work?) I think the only answer is, "It's magic". I think it is something that would be right at home in PHP or Perl, and I don't mean that as an insult, but only that it's not a good fit to Python.
-- Steven
2013/11/1 Steven D'Aprano <steve@pearwood.info>:
On Thu, Oct 31, 2013 at 03:37:36PM +0200, אלעזר wrote:
Note that with an End object (regardless of whether it's called "End", "len-x" or ":-x") we can get End/5. I think that's a nice thing to have.
I presume that you expect a[End/5] to be equivalent to a[len(a)//5] ?
Yes. Perhaps End//5 is better of course.
One more thing: End-5 should be callable, so it can be passed around.
(End-3)("hello") == len("hello")-3 (End-0)("hello") == len("hello")
This way End is a generalization of len, making len somewhat redundant.
All this seems very clever, but as far as I'm concerned, it's too clever. I don't like objects which are context-sensitive. Given:
x = End-2
then in this context, x behaves like 4:
"abcdef"[x:]
while in this context, x behaves like 0:
"abcd"[x:]
Sure you mean "x behaves like 2":
I really don't like that. That makes it hard to reason about code.
Well, look at that:
x=-2
"abcdef"[x:] # x behaves like 4: 'ef'
"abcd"[x:] # x behaves like 2: 'cd'
Magic indeed. Not to mention what happens when it happens to be x=-0.
Unlike ordinary int, where -x is an abbreviation for 0-x, so -0 == 0-0 == 0, in the context of slicing and element access (in a list, not in a dict) -x is an abbreviation for len(this list)-x, so -0 == len(this list)-0 != 0.
Well, End is len(this list), or equivalently it's "0 (mod len this list)". I think it's natural.
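The context dependence of a plain negative index can be made visible with slice.indices(), which reports the concrete (start, stop, step) a slice resolves to for a given length. The variable names are just for illustration.

```python
neg_two = slice(-2, None)

on_six = neg_two.indices(6)   # -2 resolves to 4 on a length-6 sequence
on_four = neg_two.indices(4)  # -2 resolves to 2 on a length-4 sequence

# -0 is the integer 0, so it resolves to the start, not to the end.
neg_zero = slice(-0, None).indices(5)
```

So the same x = -2 already means different absolute positions on different sequences, and -0 is the one value for which the "count from the end" reading breaks down.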
End seems to me to be an extremely sophisticated object, far too sophisticated for slicing, which really ought to be conceptually and practically a simple operation. I would not like to have to explain this to beginners to Python. I especially would not like to explain how it works. (How would it work?)
Say lst[:end-0] becomes lst[slice(end-0)], and inside list you take stop(self) and continue from there as before. (Turns out slice does not take keyword arguments. Why is that?) There are other ways, of course.
Which makes it possible to pass any other callable, unless you want to explicitly forbid it. I agree that this:
"abcdef"[1:(lambda lst: lst[0])]
is a horrible idea. The reason it is hard to explain is that you don't want to explain functional programming to a beginner. But if the End object (which I don't think is "extremely" sophisticated) will be accessible only from within a slice expression, all you have to explain is that it is context-dependent, regardless of how it is actually implemented. It's just a readable "$", if you like.
Slice notation is already a domain specific sublanguage in its own right; for example, nowhere else is x:y:z legal at all, and nowhere else does -x have a similar meaning.
As for the negative stride, I would suggest a terminating minus for "reverse=True" or Nick's rslice. Possibly with a preceding comma:
"abcdef"[1:-1 -] == "edcb"
"abcdef"[-] == "fedcba"
"abcdef"[1:end-1:2, -] == ''.join(reversed("abcdef"[1:-1:2])) == "db"
(I don't understand why we can't have -[1,2,3] == list(reversed([1,2,3])). I'm sure that's a conscious decision but is it documented anywhere? Yes, it's not exactly a negation, but then 'ab'+'cd' is not exactly an addition: it is not commutative, nor does it have an inverse. It's just an intuitive notation, and I think "xyz" == -"zyx" is pretty intuitive; much more so than "zyx"[::-1], although not much more than "xyz"[-].
Taking it one step further, you can have things like
"abcde"[-(1:3)] == "abcde"[-slice(1,3)] == "cb"
Again, this looks intuitive to me. I admit that my niece failed to guess this meaning so I might be wrong; she is completely unfamiliar with Python though.)
Elazar
On 1 November 2013 10:17, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Oct 31, 2013 at 03:37:36PM +0200, אלעזר wrote:
Perhaps we can get this End object by adding two tokens: ":-" and "[-". So
a[-3:-5] == a[slice(End-3, End-5, None)]
That's ambiguous. Consider:
a[-2]
Is that a dict lookup with key -2, or a list indexed with End-2? Or worse, a dict lookup with key len(a)-2.
I've been thinking about that. The End object would have to be unhashable just like slice objects:
$ python3
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> d = {}
>>> d[1:2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'slice'
To break the ambiguity, we'd need a rule that End objects can only occur inside slices with at least one colon.
This would defeat much of the point of having a new notation. The problem with negative wraparound applies just as much to ordinary indexing as to slicing.
But, I think that means that the parser would have to look ahead to see whether it was within a slice or not, and that might not be possible with Python's parser.
If you've tried to write a parser for Python expressions you'll know that it takes a lot of special casing to handle slices anyway. (It would be simpler if slices were a valid expression in their own right).
Even if were, it's still a special case that -2 means something different inside a slice than outside a slice, and we know what the Zen of Python says about special cases.
although it will turn a[-3] into a[End-3]. I don't think it's a problem if the latter will behave in the same way as the former (i.e End-3 be a subtype of int).
So what would (End-3)*10 return? How about End & 0xF ?
I would expect End to just behave like an integer in the index/slice expression. [snip]
All this seems very clever, but as far as I'm concerned, it's too clever. I don't like objects which are context-sensitive. Given:
x = End-2
then in this context, x behaves like 4:
"abcdef"[x:]
while in this context, x behaves like 0:
"abcd"[x:]
I really don't like that. That makes it hard to reason about code.
I think that's a good argument for not having a magic object. Matlab doesn't allow you to use the keyword 'end' outside of an index/slice expression (well actually it does but it's used to signify the end of a block rather than as a magic object). Note that it's currently hard to reason about something like "abcde"[x:] because you need to know the sign of x to understand what it does.
End seems to me to be an extremely sophisticated object, far too sophisticated for slicing, which really ought to be conceptually and practically a simple operation. I would not like to have to explain this to beginners to Python. I especially would not like to explain how it works. (How would it work?) I think the only answer is, "It's magic". I think it is something that would be right at home in PHP or Perl, and I don't mean that as an insult, but only that it's not a good fit to Python.
I agree that the magic End object is probably a bad idea on the basis that it should never be used outside of a slice/index expression. I'm still thinking about what would be a good, backward-compatible way of implementing something that achieves the desired semantics. Oscar
2013/11/1 Oscar Benjamin <oscar.j.benjamin@gmail.com>:
On 1 November 2013 10:17, Steven D'Aprano <steve@pearwood.info> wrote:
To break the ambiguity, we'd need a rule that End objects can only occur inside slices with at least one colon.
This would defeat much of the point of having a new notation. The problem with negative wraparound applies just as much to ordinary indexing as to slicing.
Not exactly as much. Taking
x[:-0] # intention: x[:len(x)]
is a reasonable thing to write, which happens to fail in Python. While
x[-0] # intention: x[len(x)]
is an error in the first place, which happens not to raise an Exception in Python, but rather gives you a wrong result. Elazar
On 1 November 2013 13:26, אלעזר <elazarg@gmail.com> wrote:
2013/11/1 Oscar Benjamin <oscar.j.benjamin@gmail.com>:
On 1 November 2013 10:17, Steven D'Aprano <steve@pearwood.info> wrote:
To break the ambiguity, we'd need a rule that End objects can only occur inside slices with at least one colon.
This would defeat much of the point of having a new notation. The problem with negative wraparound applies just as much to ordinary indexing as to slicing.
Not exactly as much. taking
x[:-0] # intention: x[:len(x)]
is a reasonable thing to write, which happens to fail in Python. While
x[-0] # intention: x[len(x)]
is an error in the first place, which happens not to raise an Exception in Python, but rather gives you a wrong result.
I'm not really sure what you mean by this so I'll clarify what I mean: If I write x[-n] then my intention is that n should always be positive. If n happens to be zero or negative then I really want an IndexError but I won't get one because it coincidentally has an alternate meaning. I would rather be able to spell that as x[end-n] and have any negative index be an error. While I can write x[len(x)-n] right now it still does the wrong thing when n>len(x). Oscar
אלעזר wrote:
x[-0] # intention: x[len(x)]
is an error in the first place, which happens not to raise an Exception in Python, but rather gives you a wrong result.
Using x[End-n] would allow an exception to be properly raised when n is out of bounds instead of spuriously wrapping around. -- Greg
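The behaviour Greg describes can be prototyped in pure Python today. The following is only a sketch under stated assumptions: FromEnd, End, and StrictList are hypothetical names invented for illustration, not anything proposed verbatim in the thread, and negative steps are passed through without special handling.

```python
class FromEnd:
    """Hypothetical marker meaning 'len(seq) + offset' (not a real Python feature)."""

    def __init__(self, offset=0):
        self.offset = offset

    def __sub__(self, n):
        return FromEnd(self.offset - n)

    def __add__(self, n):
        return FromEnd(self.offset + n)


End = FromEnd()


class StrictList(list):
    """List that resolves End-relative positions and never wraps around."""

    def _resolve(self, pos):
        if pos is None:
            return None
        if isinstance(pos, FromEnd):
            pos = len(self) + pos.offset
        if pos < 0:
            # A plain list would silently count from the end here.
            raise IndexError("position before the start of the sequence")
        return pos

    def __getitem__(self, key):
        if isinstance(key, slice):
            start = self._resolve(key.start)
            stop = self._resolve(key.stop)
            return super().__getitem__(slice(start, stop, key.step))
        return super().__getitem__(self._resolve(key))


x = StrictList("abcde")
print(x[End - 1])    # last element
print(x[:End - 0])   # whole list, unlike x[:-0] on a plain list
```

With this sketch, x[End - 6] and x[-1] both raise IndexError instead of wrapping around, and x[End - 0] raises because it is one past the last element.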
On 31/10/2013 00:17, Alexander Belopolsky wrote:
On Wed, Oct 30, 2013 at 7:59 PM, MRAB <python@mrabarnett.plus.com> wrote:
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
Wasn't use of "`" dropped from Python 3? This makes it 4!
Wasn't one of the reasons it was dropped because it looked too much like "'"? Well, it still does! :-)
On 31/10/13 12:59, MRAB wrote:
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
And we also have ` in reserve if we get really desperate. Hmmm... backquote... backwards indexing... (Ducks as Tim Peters throws a bucket of grit that he's cleaned off his monitor.) -- Greg
On 10/30/2013 7:59 PM, MRAB wrote:
On 30/10/2013 23:00, Eric Snow wrote:
On Wed, Oct 30, 2013 at 4:47 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I thought of using a magic symbol, $, for that -- a[$-n]. But aside from the issue of using one of the 2 remaining unused ascii symbols for something that can already be done, it would not work in a slice call.
Is that like where you have 1 more shot on your camera and you don't want to use it for fear that something more spectacular might show up afterward? (and hope that you didn't leave your lens cap on when you finally take the picture!) :-)
I don't think it's that bad; I count 3: "!", "$" and "?". :-)
>>> 2 != 3
True
-- Terry Jan Reedy
On Oct 30, 2013, at 15:47, Terry Reedy <tjreedy@udel.edu> wrote:
I think one point is that if seq.__getitem__(ob) uses 'if isinstance(ob, slice):' instead of 'if type(ob) is slice:', subclass instances will work whereas wrapper instances would not.
Why would anyone use isinstance(ob, slice)? If you can write "start, stop, step = ob.indices(length)" without getting a TypeError, what else do you care about? (Code that _does_ care about the difference probably wouldn't work correctly with any custom slice object anyway.) If there really is an issue, we could easily add a collections.abc.Slice. Or... This may be a heretical and/or just stupid idea, but what about reviving the old names __getslice__ and friends (now taking a slice object instead of 2.x's start and stop)? Then the interpreter calls __getslice__ if you use slicing syntax, or if you use indexing syntax with an instance of abc.Slice (or anything but a numbers.Number even?). That way the code is only in one place instead of having to be written in each sequence class.
This reminded me of something related. Quasi-sequences—things that implement the implicit sequence protocol (being indexable with contiguous integers starting from 0, so they're good enough to be used as iterables even though they don't define __iter__), but aren't Sequences (e.g., because they're lazy and/or infinite and therefore can't be Sized) work with slices (as long as start and stop are nonnegative). But only because they ignore the indices method and use the start, stop, and step attributes directly. And that will break any meaningful subclass of slice (except those that only deviate from the base class when given negative indices). One option is to allow these quasi-sequences to call indices(None), which (in slice; subclasses could do something different if they wanted) would raise an IndexError if its start or stop were negative, otherwise act as if it were given an infinite length. (This would also make such quasi-sequences easier to write, and more consistent.) Here's an example (which may be kind of silly, but someone wrote it, and it works, and it's in a project I maintain…): a LazyList that wraps up an iterator and acts like a quasi-sequence—you can index it and slice it, and even mutate it; the first time you try to get/set/del an index higher than all that have been accessed so far, it moves an appropriate number of values from the stored iterator to a list, then just does the get/set/del on that list. For example, if Squares(i) is an iterator that's like (n*n for n in itertools.count(i)) but with a useful repr:
>>> ll = LazyList(Squares(0))
>>> ll
LazyList(Squares(0))
>>> ll[1:-1]
IndexError: LazyList indices cannot be negative
>>> ll[1]
1
>>> ll
LazyList(0, 1, Squares(2))
>>> del ll[2:6:2]
>>> ll
LazyList(0, 1, 9, Squares(5))
>>> ll[5:2:-1]
[49, 36, 25]
>>> ll
LazyList(0, 1, 9, 25, 36, 49, Squares(8))
For an even simpler example, here's an InfiniteRange class:
>>> r = InfiniteRange(2)
>>> r
InfiniteRange(2)
>>> r[0]
2
>>> r[11::2]
InfiniteRange(13, 2)
>>> r[11:15:2]
range(13, 17, 2)
>>> r[15:11:-2]
range(17, 13, -2)
>>> r[:-1]
IndexError: InfiniteRange indices cannot be negative
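For readers who want to see what "using the start, stop, and step attributes directly" looks like, here is a minimal sketch of such a class. It is not the InfiniteRange from the project mentioned above, whose source is not shown in this thread; the constructor signature, the repr, and the error messages are assumptions chosen to match the session output.

```python
class InfiniteRange:
    """Sketch of an unbounded arithmetic sequence: start, start+step, ..."""

    def __init__(self, start=0, step=1):
        self.start = start
        self.step = step

    def __repr__(self):
        if self.step == 1:
            return "InfiniteRange({})".format(self.start)
        return "InfiniteRange({}, {})".format(self.start, self.step)

    def __getitem__(self, index):
        if not isinstance(index, slice):
            if index < 0:
                raise IndexError("InfiniteRange indices cannot be negative")
            return self.start + index * self.step

        # There is no length, so slice.indices() cannot be called;
        # read the raw attributes instead.
        start, stop, step = index.start, index.stop, index.step
        step = 1 if step is None else step
        if step == 0:
            raise ValueError("slice step cannot be zero")
        if (start is not None and start < 0) or (stop is not None and stop < 0):
            raise IndexError("InfiniteRange indices cannot be negative")

        if step > 0:
            first = self.start + (start or 0) * self.step
            if stop is None:
                return InfiniteRange(first, self.step * step)
            return range(first, self.start + stop * self.step, self.step * step)

        # Negative step: a finite result needs an explicit starting index.
        if start is None:
            raise IndexError("cannot count back from the end of an InfiniteRange")
        first = self.start + start * self.step
        # An omitted stop means "down to and including index 0".
        last = self.start - self.step if stop is None else self.start + stop * self.step
        return range(first, last, self.step * step)


r = InfiniteRange(2)
print(r[11::2], r[11:15:2], r[15:11:-2])
```

Any slice subclass that changes what indices() returns would be ignored by this code, which is exactly the fragility described above.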
----- Original Message -----
From: Andrew Barnert <abarnert@yahoo.com> To: Terry Reedy <tjreedy@udel.edu> Cc: "python-ideas@python.org" <python-ideas@python.org> Sent: Wednesday, October 30, 2013 11:06 PM Subject: Re: [Python-ideas] Where did we go wrong with negative stride?
On Oct 30, 2013, at 15:47, Terry Reedy <tjreedy@udel.edu> wrote:
I think one point is that if seq.__getitem__(ob) uses 'if isinstance(ob, slice):' instead of 'if type(ob) is slice:', subclass instances will work whereas wrapper instances would not.
Why would anyone use isinstance(ob, slice)? If you can write "start, stop, step = ob.indices(length)" without getting a TypeError, what else do you care about? (Code that _does_ care about the difference probably wouldn't work correctly with any custom slice object anyway.)
If there really is an issue, we could easily add a collections.abc.Slice.
Or... This may be a heretical and/or just stupid idea, but what about reviving the old names __getslice__ and friends (now taking a slice object instead of 2.x's start and stop)? Then the interpreter calls __getslice__ if you use slicing syntax, or if you use indexing syntax with an instance of abc.Slice (or anything but a numbers.Number even?). That way the code is only in one place instead of having to be written in each sequence class. _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas
Nick Coghlan wrote:
That means that until people have fully explored exactly the semantics they want in terms of the existing object model ... then there are *zero* grounds to be discussing syntax changes that provide those new semantics.
I don't think it's possible to decouple the syntactic and semantic issues that easily. Consider the problem of how to specify from-the-end indexes without zero behaving incorrectly. There are a couple of ways this could be tackled. One would be to introduce a new type representing an index from the end. This wouldn't require any new syntax, but it would be verbose to spell out, so later we would probably want to consider a new syntax for constructing this type inside a slice expression. But if we're willing to consider new syntax, we don't need a new type -- we can just invent a syntax for specifying from-the-end indexing directly, and end up with a simpler design overall. There would obviously have to be some way of specifying the same thing that the new syntax specifies using arguments to slice(), but that would be mostly an implementation detail. It shouldn't be the driving force behind the design. -- Greg
On 31 Oct 2013 07:45, "Greg Ewing" <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
That means that until people have fully explored exactly the semantics they want in terms of the existing object model ... then there are *zero*
grounds to be discussing syntax changes that provide those new
semantics.
I don't think it's possible to decouple the syntactic and semantic issues that easily.
Consider the problem of how to specify from-the-end indexes without zero behaving incorrectly. There are a couple of ways this could be tackled.
One would be to introduce a new type representing an index from the end. This wouldn't require any new syntax, but it would be verbose to spell out, so later we would probably want to consider a new syntax for constructing this type inside a slice expression.
But if we're willing to consider new syntax, we don't need a new type -- we can just invent a syntax for specifying from-the-end indexing directly, and end up with a simpler design overall.
It isn't simpler though - since, as you note below, anything we can express in the syntax *must* be expressible in the slice() API. Slice notation is currently pure syntactic sugar and it should stay that way.
There would obviously have to be some way of specifying the same thing that the new syntax specifies using arguments to slice(), but that would be mostly an implementation detail. It shouldn't be the driving force behind the design.
Who said it was? But defining an object API first lets us define and test proposed semantics in pure Python, avoiding any reliance on abstract handwaving. Cheers, Nick.
-- Greg
On Wed, 30 Oct 2013 20:22:22 +1000, Nick Coghlan <ncoghlan@gmail.com> wrote:
Regardless, my main point is this: slices are just objects. The syntax:
s[i:j:k]
is just syntactic sugar for:
s[slice(i, j, k)]
That means that until people have fully explored exactly the semantics they want in terms of the existing object model, just as I did for rslice(), then there are *zero* grounds to be discussing syntax changes that provide those new semantics.
(Sending from gmane on limited tablet) I think it's sugar for a function that takes *args and, depending on its contents, makes a slice, or multiple slices, plus whatever is left over. In the case of a simple index it's just what's left over. Slice syntax is simple on purpose so that slices can pass through more than just ints. It is the responsibility of the object that uses them to make sense of the slices. So it may be possible to pass a callable index modifier through too.
def reversed(obj, slice_obj):
    ...
    return reversed_slice_obj

a[i:j:k, reversed]
If it's also passed the object to be sliced (self from the __getitem__ method), it could set end values and maybe do other things, like control whether exceptions are raised or not. It's kind of like decorating a slice. For example, it could do an infinite wrap-around slice by normalizing the indices of the slice object. I can't test it on this tablet, unfortunately. Cheers, Ron Adam.
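Both descriptions can be checked with a small probe object whose __getitem__ simply returns whatever key it receives. Probe is a hypothetical helper written for this illustration; the built-in reversed is used below only as an arbitrary callable to pass through.

```python
class Probe:
    """Returns the key passed to __getitem__ so it can be inspected."""

    def __getitem__(self, key):
        return key


p = Probe()

# A single slice expression arrives as one slice object.
print(p[1:2:3])

# A plain index arrives unchanged.
print(p[7])

# Comma-separated items arrive as one tuple; slices and other objects
# (here a callable) can be mixed freely.
print(p[1:5, reversed])
```

What a container does with a tuple key such as (slice(1, 5, None), reversed) is entirely up to its own __getitem__; built-in lists reject it with a TypeError.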
On 30 October 2013 10:13, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
However, being able to write
last_n = s[-n:]
is extremely useful.
Until you hit the bug where n is 0.
>>> a = 'abcde'
>>> for n in reversed(range(4)):
...     print(n, a[-n:])
...
3 cde
2 de
1 e
0 abcde
This is what I mean by the wraparound behaviour causing corner cases and bugs. I and others have reported that this is a bigger source of problems than the off-by-one negative stride issue which has never caused me any actual problems.
OK, fair enough. That has *never* been an issue to me, but nor have negative strides. So I'm in the same boat as Tim, that I never need any of this so I don't care how it's implemented :-) What I do care about is that functionality that I do use (s[:-n] where n is *not* zero) doesn't get removed because it leads to corner cases that I don't hit but others do. Adding extra functionality with better boundary conditions is one thing, removing something that people use *a lot* without issue is different. Most of my use cases tend to have constant n - something like "if filename.endswith('.py'): filename = filename[:-3]". Here, using -2 (or -4, or 3j, or len(filename)-3, or whatever ends up being proposed) isn't too hard, but it doesn't express the extent as clearly to me. Or I calculate n based on something that means that n will never be 0 (typically some sort of "does this case apply" check like the endswith above). And again, -n expresses my intent most clearly and won't trigger bugs. I'm not saying you don't have real-world use cases, and real bugs caused by this behaviour, but I am suggesting that it only bites in particular types of application. Paul
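For code where n can legitimately reach zero, one workaround that needs no language change is to compute the bound from len(). The helpers below are a sketch with hypothetical names (last_n, drop_last); clamping an oversized n to the whole sequence is a design choice made here, not something the thread agreed on.

```python
def last_n(seq, n):
    """Return the last n items of seq; n == 0 gives an empty result."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return seq[max(len(seq) - n, 0):]


def drop_last(seq, n):
    """Return seq without its last n items; n == 0 gives all of seq."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return seq[:max(len(seq) - n, 0)]


a = "abcde"
for n in reversed(range(4)):
    print(n, repr(last_n(a, n)), repr(drop_last(a, n)))
```

For constant, non-zero n, as in the filename[:-3] example, the plain negative-index spelling remains the clearest.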
On 30 October 2013 20:02, Paul Moore <p.f.moore@gmail.com> wrote:
On 30 October 2013 09:52, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Firstly would it not be better to add slice.__reversed__ so that it would be
b = a[reversed(slice(i, j))]
This won't work, because reversed returns an iterator, not a slice object.
Secondly I don't think I would ever actually want to use this over the existing possibilities.
Agreed, while my usage is pretty trivial, I would definitely use
b = a[::-1]
over
b = a[Slice(None, None, None, reversed=True)]
I could probably omit some of those None arguments, but I probably wouldn't simply because I can't remember which are optional.
Why does that give you trouble when it's identical to what you can omit from the normal slice syntax? (and from range)
There are real problems with slicing and indexing in Python that lead to corner cases and bugs but this particular issue is not one of them. The real problems, including the motivating example at the start of this thread, are caused by the use of negative indices to mean from the end.
And this is one of the things my rslice recipe handles correctly. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 30 October 2013 10:18, Nick Coghlan <ncoghlan@gmail.com> wrote:
b = a[::-1]
over
b = a[Slice(None, None, None, reversed=True)]
I could probably omit some of those None arguments, but I probably wouldn't simply because I can't remember which are optional.
Why does that give you trouble when it's identical to what you can omit from the normal slice syntax? (and from range)
slice(reversed=True)? I can omit all the arguments in the indexing case (OK, I enter a step of -1, but that's equivalent to reversed=True and a step of 1, which is default). And yet currently slice() fails as a minimum of 1 argument is needed. I'm not saying that it's ill-defined, just that I'd get confused fast. So "better to be explicit" (but verbose). And [::-1] is clear and simple (to me, at least). Paul
Paul Moore wrote:
I would definitely use
b = a[::-1]
over
b = a[Slice(None, None, None, reversed=True)]
Indeed, the whole reason for having slice syntax is that it's very concise. One of the things I like most about Python is that I get to write s[a:b] instead of something like s.substr(a, b). I would be very disappointed if I were forced to use the above monstrosity in some cases. -- Greg
On 31 Oct 2013 07:14, "Greg Ewing" <greg.ewing@canterbury.ac.nz> wrote:
Paul Moore wrote:
I would definitely use
b = a[::-1]
over
b = a[Slice(None, None, None, reversed=True)]
Indeed, the whole reason for having slice syntax is that it's very concise. One of the things I like most about Python is that I get to write s[a:b] instead of something like s.substr(a, b).
I would be very disappointed if I were forced to use the above monstrosity in some cases.
You can't have new syntax without defining the desired semantics for that syntax first. Since slices are just objects, it doesn't make sense to argue about syntactic details until the desired semantics are actually clear and demonstrated in an object based proof-of-concept. Cheers, Nick.
-- Greg
On 30/10/2013 21:52, Nick Coghlan wrote:
On 31 Oct 2013 07:14, "Greg Ewing" <greg.ewing@canterbury.ac.nz> wrote:
Paul Moore wrote:
I would definitely use
b = a[::-1]
over
b = a[Slice(None, None, None, reversed=True)]
Indeed, the whole reason for having slice syntax is that it's very concise. One of the things I like most about Python is that I get to write s[a:b] instead of something like s.substr(a, b).
I would be very disappointed if I were forced to use the above monstrosity in some cases.
You can't have new syntax without defining the desired semantics for that syntax first. Since slices are just objects, it doesn't make sense to argue about syntactic details until the desired semantics are actually clear and demonstrated in an object based proof-of-concept.
How about a new function "rev" which returns the reverse of its argument:

def rev(arg):
    if isinstance(arg, str):
        return ''.join(reversed(arg))
    return type(arg)(reversed(arg))

The disadvantage is that it would be slicing and then reversing, so 2 steps, which is less efficient.
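The two-step cost can be avoided by computing a single slice object that selects the same items in the opposite order. The function below is a sketch for illustration only; reversed_slice is a hypothetical name and this is not Nick's rslice recipe, which is not reproduced in this part of the thread.

```python
def reversed_slice(s, length):
    """Return a slice that picks the same items as s, in reverse order,
    from a sequence of the given length."""
    start, stop, step = s.indices(length)
    picked = range(start, stop, step)   # the concrete indices s selects
    if not picked:
        return slice(0, 0)              # nothing selected either way
    # Walk back from the last selected index to just past the first one.
    new_stop = picked[0] - step
    if new_stop < 0:
        new_stop = None                 # a negative stop would wrap around
    return slice(picked[-1], new_stop, -step)


a = "abcdef"
print(a[reversed_slice(slice(1, -1, 2), len(a))])   # same as a[1:-1:2][::-1]
```

The explicit None is needed for precisely the reason this thread exists: with a negative step there is no integer stop that means "run through index 0".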
28.10.13 08:33, Chris Angelico написав(ла):
If it'd help, you could borrow Pike's syntax for counting-from-end ranges: <2 means 2 from the end, <0 means 0 from the end. So "abcdefg"[:<2] would be "abcde", and "abcdefg"[:<0] would be "abcdefg". Currently that's invalid syntax (putting a binary operator with no preceding operand), so it'd be safe and unambiguous.
There are parallels with alignment. C-style formatting uses positive width for right-aligned formatting and negative width for left-aligned formatting. New-style formatting uses '>' for right-aligned formatting and '<' for left-aligned formatting. So '>' should indicate counting from the beginning (as a positive index does now) and '<' should indicate counting from the end (as a negative index does now). And '^' should indicate counting from the center.
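As a reminder of the existing alignment notation being referred to, here is a short illustration. It uses the "-" flag of %-formatting for left alignment and the '>', '<', and '^' alignment characters of format(); the width of 6 is arbitrary.

```python
word = "ab"

# C-style (%) formatting: plain width right-aligns, the '-' flag left-aligns.
right_c = "%6s" % word
left_c = "%-6s" % word

# New-style formatting: '>' right-aligns, '<' left-aligns, '^' centers.
right_new = format(word, ">6")
left_new = format(word, "<6")
centered = format(word, "^6")

for label, text in [("%6s", right_c), ("%-6s", left_c),
                    (">6", right_new), ("<6", left_new), ("^6", centered)]:
    print("{:>5} -> {!r}".format(label, text))
```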
On Tue, Oct 29, 2013 at 1:11 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
So '>' should indicate counting from begin (as positive index now) and '<' should indicate counting from end (as negative index now). And '^' should indicate counting from center.
Sounds good to me! Anywhere else we can index from? ||||||||| to indicate the Dewey Decimal System (index using floats rather than ints)? ChrisA
On 28 October 2013 06:33, Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 28, 2013 at 10:45 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Neal Becker wrote:
One thing I find unfortunate and does trip me up in practice, is that if you want to do a whole sequence up to k from the end:
u[:-k]
hits a singularity if k=0
I think the only way to really fix this cleanly is to have a different *syntax* for counting from the end, rather than trying to guess from the value of the argument. I can't remember ever needing to write code that switches dynamically between from-start and from-end indexing, or between forward and reverse iteration direction -- and if I ever did, I'd be happy to write two code branches.
If it'd help, you could borrow Pike's syntax for counting-from-end ranges: <2 means 2 from the end, <0 means 0 from the end. So "abcdefg"[:<2] would be "abcde", and "abcdefg"[:<0] would be "abcdefg". Currently that's invalid syntax (putting a binary operator with no preceding operand), so it'd be safe and unambiguous.
Agreed in entirety. I'm not sure that this is the best method, but it's way better than the status quo. "<" or ">" with negative strides should raise an error and should be the recommended method until negatives are phazed out. *BUT* there is another solution. It's harder to formulate but I think it's more deeply intuitive. The simple problem is this mapping: list: [x, x, x, x, x] index: 0 1 2 3 4 -5 -4 -3 -2 -1 Which is just odd, 'cause those sequences are off by one. But you can stop thinking about them as *negative* indexes and start thinking about NOT'd indexes: ~4 ~3 ~2 ~1 ~0 which you have to say looks OK. Then you design slices around that. To take the first N elements: #>>> "0123456789"[:4] #>>> '0123' To take the last three: #>>> "0123456789"[~4:] # Currently returns '56789' #>>> '6789' For slicing with a mixture: "0123456789"[1:~1] # Currently returns '1234567' #>>> '12345678' "0123456789"[~5:5] # Currently returns '4' #>>> '' So the basic idea is that, for X:Y, X is closed iff positive and Y is open iff positive. If you go over this in your head, it's quite simple. For ~6:7; START: Count 6 from the back, looking at the *signposts* between items, not the items. END: Count 3 forward, looking at the *signposts* between items, not the items. Thus you get, for "0123456789": "|0|1|2|3|4|5|6|7|8|9|" S E and thus, obviously, you get "456". And look, it matches our current negative form! "0123456789"[-6:7] #>>> '456' Woah! *BUT* it works without silly coherence problems if you have -N, because ~0 is -1! אלעזר said the problem was with negative indexes, not strides, so it's good that this solves it. So, how does this help with negative *strides*? Well, Guido's #>>> "abcde"[::-1] #>>> 'edcba'
Apologies for the terrible post above; here it is in full and not riddled with as many editing errors: On 28 October 2013 06:33, Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 28, 2013 at 10:45 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Neal Becker wrote:
One thing I find unfortunate and does trip me up in practice, is that if you want to do a whole sequence up to k from the end:
u[:-k]
hits a singularity if k=0
I think the only way to really fix this cleanly is to have a different *syntax* for counting from the end, rather than trying to guess from the value of the argument. I can't remember ever needing to write code that switches dynamically between from-start and from-end indexing, or between forward and reverse iteration direction -- and if I ever did, I'd be happy to write two code branches.
If it'd help, you could borrow Pike's syntax for counting-from-end ranges: <2 means 2 from the end, <0 means 0 from the end. So "abcdefg"[:<2] would be "abcde", and "abcdefg"[:<0] would be "abcdefg". Currently that's invalid syntax (putting a binary operator with no preceding operand), so it'd be safe and unambiguous.
Agreed in entirety. I'm not sure that this is the best method, but it's way better than the status quo. "<" or ">" with negative strides should raise an error and should be the recommended method until negatives are phased out.

*BUT* there is another solution. It's harder to formulate but I think it's more deeply intuitive. The simple problem is this mapping:

    list:  [ x,  x,  x,  x,  x]
    index:   0   1   2   3   4
            -5  -4  -3  -2  -1

Which is just odd. But you can stop thinking about them as *negative* indexes and start thinking about NOT'd indexes:

            ~4  ~3  ~2  ~1  ~0

which you have to say looks OK. Then you design slices around that.

To take the first four elements:

    "0123456789"[:4]
    #>>> '0123'

To take the last four:

    "0123456789"[~4:] # Currently returns '56789'
    #>>> '6789'

For slicing with a mixture:

    "0123456789"[1:~1] # Currently returns '1234567'
    #>>> '12345678'

    "0123456789"[~5:5] # Currently returns '4'
    #>>> ''

So the basic idea is that, for X:Y, X is closed iff positive and Y is open iff positive. If you go over this in your head, it's quite simple. For ~6:7:

START: Count 6 from the back, looking at the *signposts* between items, not the items.
END: Count 7 forward, looking at the *signposts* between items, not the items.

Thus you get, for "0123456789":

    "|0|1|2|3|4|5|6|7|8|9|"
             S     E

and thus, obviously, you get "456". And look, it matches our current negative form!

    "0123456789"[-6:7]
    #>>> '456'

Woah! *BUT* it works without silly coherence problems if you have -N, because ~0 is -1! אלעזר said the problem was with negative indexes, not strides, so it's good that this solves it.

So, how does this help with negative *strides*? Well, Guido's

    "abcde"[::-1]
    #>>> 'edcba'

would be hopefully solved by

    "abcde"[:0:-1] # Currently returns 'edcb'
    #>>> 'edcba'

because you can just *inverse* the "X is closed iff positive and Y is open iff positive" rule. Does this pan out nicely? Really, we want

    "abcde"[2:4][::-1] == "abcde"[4:2:-1]

which is exactly what happens.
I'm thinking I'll make a string subclass and try and "intuit" the answers, but I think this is the right choice. Anyone with me, even partially?
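One part of this already holds in current Python for plain element indexing, because ~i is defined as -i - 1. The check below is only an illustration on an ordinary list; the slicing rules proposed above are not implemented by it.

```python
a = list(range(10))

# ~i == -i - 1, so ~0 is -1, ~1 is -2, and so on.
identity_holds = all(~i == -i - 1 for i in range(len(a)))

# Indexing with ~i counts from the end starting at zero,
# mirroring how a[i] counts from the start.
from_start = [a[i] for i in range(len(a))]
from_end = [a[~i] for i in range(len(a))]

print(identity_holds)
print(from_start)
print(from_end)
```

So a[~0] is the last item and a[~i] mirrors a[i]; only slice bounds would need the new interpretation.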
On 28 October 2013 17:15, Joshua Landau <joshua@landau.ws> wrote:
<suggested using "~" instead of "-">
# Here's a quick mock-up of my idea.
class NotSliced(list):
    def __getitem__(self, itm):
        if isinstance(itm, slice):
            start, stop, step = itm.start, itm.stop, itm.step

            if start is None:
                start = 0
            if stop is None:
                stop = ~0
            if step is None:
                step = 1

            if start < 0:
                start += len(self) + 1
            if stop < 0:
                stop += len(self) + 1

            if step > 0:
                return NotSliced(super().__getitem__(slice(start, stop, step)))
            else:
                return NotSliced(super().__getitem__(slice(stop, start))[::step])
        else:
            return super().__getitem__(itm)

ns = NotSliced([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

[ns[i] for i in range(10)]
#>>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# See why this is a much better mapping?
[list(ns)[-i] for i in range(10)]
[ns[~i] for i in range(10)]
#>>> [0, 9, 8, 7, 6, 5, 4, 3, 2, 1]
#>>> [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

ns[~6:7]
list(ns)[-6:7]
#>>> [4, 5, 6]
#>>> [4, 5, 6]

ns[~4:~0][::-1]
ns[~0:~4:-1]
#>>> []
#>>> [9, 8, 7, 6]

ns[~4:~0][::-2]
ns[~0:~4:-2]
#>>> []
#>>> [9, 7]

# Here's something that makes me really feel this is natural.
ns[2:~2]
#>>> [2, 3, 4, 5, 6, 7]
ns[1:~1]
#>>> [1, 2, 3, 4, 5, 6, 7, 8]
ns[0:~0]
#>>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# VERSUS (!!!)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9][2:-2]
#>>> [2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9][1:-1]
#>>> [1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9][0:-0]
#>>> []

# And some more...
ns[~6:6:+1]
ns[6:~6:-1]
#>>> [4, 5]
#>>> [5, 4]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9][-6:6:+1]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9][6:-6:-1]
#>>> [4, 5]
#>>> [6, 5]

# Surely you agree this is much more intuitive.
# Another example from the thread
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

for n in reversed(range(7)):
    print(n, a[:-n])
#>>> 6 [0, 1, 2, 3]
#>>> 5 [0, 1, 2, 3, 4]
#>>> 4 [0, 1, 2, 3, 4, 5]
#>>> 3 [0, 1, 2, 3, 4, 5, 6]
#>>> 2 [0, 1, 2, 3, 4, 5, 6, 7]
#>>> 1 [0, 1, 2, 3, 4, 5, 6, 7, 8]
#>>> 0 []

for n in reversed(range(7)):
    print(n, ns[:~n])
#>>> 6 [0, 1, 2, 3]
#>>> 5 [0, 1, 2, 3, 4]
#>>> 4 [0, 1, 2, 3, 4, 5]
#>>> 3 [0, 1, 2, 3, 4, 5, 6]
#>>> 2 [0, 1, 2, 3, 4, 5, 6, 7]
#>>> 1 [0, 1, 2, 3, 4, 5, 6, 7, 8]
#>>> 0 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
On 28 October 2013 21:06, Joshua Landau <joshua@landau.ws> wrote:
On 28 October 2013 17:15, Joshua Landau <joshua@landau.ws> wrote:
<suggested using "~" instead of "-">
# Here's a quick mock-up of my idea.
class NotSliced(list): ...
# And a minor bugfix and correction:
class NotSliced(list):
    def __getitem__(self, itm):
        if isinstance(itm, slice):
            start, stop, step = itm.start, itm.stop, itm.step

            if step is None:
                step = 1
            if start is None:
                start = ~0 if step < 0 else 0
            if stop is None:
                stop = 0 if step < 0 else ~0

            if start < 0:
                start += len(self) + 1
            if stop < 0:
                stop += len(self) + 1

            if step > 0:
                return NotSliced(super().__getitem__(slice(start, stop, step)))
            else:
                return NotSliced(super().__getitem__(slice(stop, start))[::step])
        else:
            return super().__getitem__(itm)

ns = NotSliced([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# This came out wrong last time. I should be more careful...
ns[~4:~0][::-1]
ns[~0:~4:-1]
#>>> [9, 8, 7, 6]
#>>> [9, 8, 7, 6]

ns[~4:~0][::-2]
ns[~0:~4:-2]
#>>> [9, 7]
#>>> [9, 7]
Reversing strings and tuples easily. For a string, calling ''.join(reversed(mystr)) is just overkill. And tuples aren't quite as bad (tuple(reversed(mytuple))), but nonetheless odd.
If you were to take out negative strides, a reverse method should be added to strings and tuples:
'abcde'.reverse() => 'edcba'
Guido van Rossum <guido@python.org> wrote:
On Sun, Oct 27, 2013 at 10:40 AM, MRAB <python@mrabarnett.plus.com> wrote:
On 27/10/2013 17:04, Guido van Rossum wrote:
In the comments of http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.html there were some complaints about the interpretation of the bounds for negative strides, and I have to admit it feels wrong. Where did we go wrong? For example,
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
"abcde"[:1:-1] == "edc" "abcde"[:0:-1] == "edcb"
but
"abcde"[:-1:-1] == ""
I'm guessing it all comes from the semantics I assigned to negative stride for range() long ago, unthinkingly combined with the rules for negative indices.
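The puzzle in the original post can be reproduced in a few lines. This is only an illustrative sketch of current slicing behaviour; the variable names are made up for the example and do not come from the thread.

```python
s = "abcde"

# Omitting the stop bound reverses the whole string.
full_reverse = s[::-1]

# A stop of -1 is interpreted as "the last position", so nothing is selected.
empty = s[:-1:-1]

# None stands for the missing bound and behaves like the omitted form.
with_none = s[:None:-1]

# An integer stop at or below -len(s) - 1 is the only numeric way to reach index 0.
with_int = s[:-len(s) - 1:-1]

print(full_reverse, repr(empty), with_none, with_int)
```

The last spelling is the one discussed later in the thread as the "obscure" workaround.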
For a positive stride, omitting the second bound is equivalent to length + 1:
>>> "abcde"[:6:1]
'abcde'
Actually, it is equivalent to length; "abcde"[:5:1] == "abcde" too.
For a negative stride, omitting the second bound is equivalent to -(length + 1):
>>> "abcde"[:-6:-1]
'edcba'
Hm, so the idea is that with a negative stride you should use negative indices. Then at least you get a somewhat useful invariant:
if -len(a)-1 <= j <= i <= -1: len(a[i:j:-1]) == i-j
which at least somewhat resembles the invariant for positive indexes and stride:
if 0 <= i <= j <= len(a): len(a[i:j:1]) == j-i
For negative indices and stride, we now also get back this nice theorem about adjacent slices:
if -len(a)-1 <= i <= -1: a[:i:-1] + a[i::-1] == a[::-1]
Using negative indices also restores the observation that a[i:j:k] produces exactly the items corresponding to the values produced by range(i, j, k).
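The invariants stated above can be checked by brute force. The following is a rough sketch, not part of the original message; check_invariants is a hypothetical helper name, and it assumes a sequence type that supports slicing and concatenation (str, list, tuple).

```python
def check_invariants(a):
    """Return True if the slice invariants quoted above hold for sequence a."""
    n = len(a)

    # Negative indices with stride -1: length, range correspondence, adjacency.
    for i in range(-n - 1, 0):
        for j in range(-n - 1, i + 1):
            if len(a[i:j:-1]) != i - j:
                return False
            # a[i:j:-1] yields exactly the items at the indices in range(i, j, -1).
            if list(a[i:j:-1]) != [a[x] for x in range(i, j, -1)]:
                return False
        # Adjacent reversed slices concatenate to the full reversal.
        if a[:i:-1] + a[i::-1] != a[::-1]:
            return False

    # Non-negative indices with stride 1: the familiar length invariant.
    for i in range(0, n + 1):
        for j in range(i, n + 1):
            if len(a[i:j:1]) != j - i:
                return False

    return True


print(check_invariants("abcde"))
```

Note that the negative-stride checks only hold inside the stated bounds; mixing non-negative and negative indices with a negative stride is exactly where the surprises in this thread come from.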
Still, the invariant for negative stride looks less attractive, and the need to use negative indices confuses the matter. Also we end up with -1 corresponding to the position at one end and -len(a)-1 corresponding to the position at the other end. The -1 offset feels really wrong here.
I wonder if it would have been simpler if we had defined a[i:j:-1] as the reverse of a[i:j]?
What are real use cases for negative strides?
-- --Guido van Rossum (python.org/~guido)
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same, but a[-1:-6:-1] would be deprecated. Then we could triumphantly (re-)introduce upper and lower bounds in Python 4, with the meaning a[i:j:-1] == a[i:j][::-1]. On Sun, Oct 27, 2013 at 2:27 PM, Ryan <rymg19@gmail.com> wrote:
Reversing strings and tuples easily. For a string, calling ''.join(reversed(mystr)) is just overkill. And tuples aren't quite as bad(tuple(reversed(mytuple))), but nonetheless odd.
If you were to take out negative strides, a reverse method should be.added to strings and tuples:
'abcde'.reverse() => 'edcba'
-- --Guido van Rossum (python.org/~guido)
[Guido]
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same,
Happy idea.
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life. Where it comes up is on places like stackoverflow, when somebody (mistakenly) suggests using seq[:-1:-1] to do a reverse slice. Then it's pointed out that this doesn't work like range(len(seq), -1, -1). Then some wiseass with too much obscure knowledge of implementation details ;-) points out that seq[:-len(seq)-1:-1] does work (well, in CPython - I don't know whether all implementations follow this quirk - although the docs imply that they should).
Then we could triumphantly (re-)introduce upper and lower bounds in Python 4, with the meaning a[i:j:-1] == a[i:j][::-1].
+1.
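As a rough illustration of the proposed meaning a[i:j:-1] == a[i:j][::-1], the sketch below emulates it with an ordinary function. The name proposed_reverse_slice is hypothetical; nothing like it exists in Python, and the proposal itself was only under discussion.

```python
def proposed_reverse_slice(a, i=None, j=None):
    """Emulate the proposed a[i:j:-1]: take the forward slice, then reverse it."""
    return a[i:j][::-1]


a = "abcde"

# Under the proposal the bounds are the same as for the forward slice a[1:4].
proposed = proposed_reverse_slice(a, 1, 4)

# With today's semantics the same items need shifted bounds.
current = a[3:0:-1]

print(proposed, current)
```

The design point is that the bounds keep their forward-slice meaning, so an empty lower bound or a lower bound of 0 needs no special spelling.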
On Sun, 27 Oct 2013 16:56:34 -0500 Tim Peters <tim.peters@gmail.com> wrote:
[Guido]
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same,
Happy idea.
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life.
If it's never seen in real life, then there's probably no urge to deprecate it and later replace it with a new thing, IMHO. Also, I get the feeling it's a bit early to start talking about Python 4 (is that supposed to happen at all?). Regards Antoine.
On 10/27/2013 04:20 PM, Antoine Pitrou wrote:
On Sun, 27 Oct 2013 16:56:34 -0500 Tim Peters <tim.peters@gmail.com> wrote:
[Guido]
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same,
Happy idea.
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life.
If it's never seen in real life, then there's probably no urge to deprecate it and later replace it with a new thing, IMHO.
Also, I get the feeling it's a bit early to start talking about Python 4 (is that supposed to happen at all?).
Regards
Antoine.
If it's a misfeature and it's not currently being used, then this is the perfect time to deprecate it. -- Charles Hixson
On Sun, Oct 27, 2013 at 7:20 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sun, 27 Oct 2013 16:56:34 -0500 Tim Peters <tim.peters@gmail.com> wrote:
[Guido]
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same,
Happy idea.
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life.
If it's never seen in real life, then there's probably no urge to deprecate it and later replace it with a new thing, IMHO.
I think there is to minimize even the chance someone has done something like this since it's so wonky. We all know someone has somewhere in code out in the world. +1 on doing a deprecation in 3.4.
Also, I get the feeling it's a bit early to start talking about Python 4 (is that supposed to happen at all?).
Well, I'm sure there will be one after 3.9, but probably more along the lines of "everything previously deprecated has now been removed" rather than the 2->3 shift. -Brett
Regards
Antoine.
Le Mon, 28 Oct 2013 09:57:43 -0400, Brett Cannon <brett@python.org> a écrit :
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life.
If it's never seen in real life, then there's probably no urge to deprecate it and later replace it with a new thing, IMHO.
I think there is to minimize even the chance someone has done something like this since it's so wonky. We all know someone has somewhere in code out in the world.
But that code probably works anyway. It's slightly unintuitive to write but it works afterwards. I think there's a tension here between discouraging new uses of the feature, and breaking existing uses. Regards Antoine.
On 28 October 2013 13:57, Brett Cannon <brett@python.org> wrote:
On Sun, Oct 27, 2013 at 7:20 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sun, 27 Oct 2013 16:56:34 -0500 Tim Peters <tim.peters@gmail.com> wrote:
[Guido]
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same,
Happy idea.
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life.
If it's never seen in real life, then there's probably no urge to deprecate it and later replace it with a new thing, IMHO.
I think there is to minimize even the chance someone has done something like this since it's so wonky. We all know someone has somewhere in code out in the world.
Are you saying that any code that depends on the current behaviour is "wonky" and therefore doesn't properly deserve continued support? I know I have private (numpy-based) code that depends on this behaviour. There's nothing wonky about me choosing the limits that I currently need to in order to get the correct slice. I think that the numpy mailing lists should be consulted before any decisions are made. As Antoine says: if you've never noticed this behaviour before then it obviously doesn't matter to you that much so why the rush to deprecate it?
+1 on doing a deprecation in 3.4.
-1 on any deprecation without a clear plan for a better syntax. Simply changing the semantics of the current syntax would bring in who knows how many off-by-one errors for virtually no benefit.
Personally I think that negative slicing and indexing are both bad ideas. I've had many bugs from the wraparound behaviour of both and I've never had a situation where the wraparound was useful in itself (if it worked using modulo arithmetic then there would at least be some uses - but it does not).
Matlab has a much better way of handling this with the end keyword:
% chop last n elements off:
a_chopped = a(1:end-n)
This works even when n is zero because it's not conflating integer arithmetic with indexing relative to the end.
Oscar
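Python has no end keyword, but the same effect can be had today by doing the arithmetic on the length explicitly. A minimal sketch, with chop_last as a hypothetical helper name chosen for this example:

```python
def chop_last(seq, n):
    """Return seq without its last n items, where 0 <= n <= len(seq)."""
    if not 0 <= n <= len(seq):
        raise ValueError("n must be between 0 and len(seq)")
    # len(seq) - n is never negative here, so no wraparound can occur.
    return seq[:len(seq) - n]


a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(chop_last(a, 2))
print(chop_last(a, 0))
```

Unlike a[:-n], this keeps the whole sequence when n is zero, which is the behaviour the Matlab form provides.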
On Mon, Oct 28, 2013 at 10:49 AM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 28 October 2013 13:57, Brett Cannon <brett@python.org> wrote:
On Sun, Oct 27, 2013 at 7:20 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sun, 27 Oct 2013 16:56:34 -0500 Tim Peters <tim.peters@gmail.com> wrote:
[Guido]
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same,
Happy idea.
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life.
If it's never seen in real life, then there's probably no urge to deprecate it and later replace it with a new thing, IMHO.
I think there is to minimize even the chance someone has done something like this since it's so wonky. We all know someone has somewhere in code out in the world.
Are you saying that any code that depends on the current behaviour is "wonky" and therefore doesn't properly deserve continued support?
I'm saying the current semantics of how the strides work are wonky and we should fix it so valid code has to jump through less hoops to get the semantics they would expect/want.
I know I have private (numpy-based) code that depends on this behaviour. There's nothing wonky about me choosing the limits that I currently need to in order to get the correct slice.
Sure, all I'm saying is that you probably had to mentally work through more to get the right semantics you wanted compared to if this was changed the way Guido and Tim are suggesting.
I think that the numpy mailing lists should be consulted before any decisions are made. As Antoine says: if you've never noticed this behaviour before then it obviously doesn't matter to you that much so why the rush to deprecate it?
I'm not saying not to talk to them, but I also don't think we should necessarily not change it because no one uses it either. If it's wide spread then sure, we just live with it. It's always a balancing act of fixing for future code vs. pain of current code. I'm just saying we shouldn't dismiss changing this out of hand because you are so far the only person who has relied on this. As for the rush, it's because 3.4b1 is approaching and if this slips to 3.5 that's 1.5 years of deprecation time lost for something that doesn't have a syntactic break to help you discover the change in semantics.
+1 on doing a deprecation in 3.4.
-1 on any deprecation without a clear plan for a better syntax. Simply changing the semantics of the current syntax would bring in who knows how many off-by-one errors for virtually no benefit.
The deprecation would be in there from now until Python 4 so it wouldn't be sudden (remember that we are on a roughly 18 month release cycle, so if this went into 3.4 that's 7.5 years until this changes in Python 4). And there's already a future-compatible way to change your code to get the same results in the end that just require more explicit steps/code. -Brett
On 28 October 2013 15:04, Brett Cannon <brett@python.org> wrote:
On Mon, Oct 28, 2013 at 10:49 AM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 28 October 2013 13:57, Brett Cannon <brett@python.org> wrote:
I think that the numpy mailing lists should be consulted before any decisions are made. As Antoine says: if you've never noticed this behaviour before then it obviously doesn't matter to you that much so why the rush to deprecate it?
I'm not saying not to talk to them, but I also don't think we should necessarily not change it because no one uses it either. If it's wide spread then sure, we just live with it. It's always a balancing act of fixing for future code vs. pain of current code. I'm just saying we shouldn't dismiss changing this out of hand because you are so far the only person who has relied on this.
Anyone who has used negative strides and non-default start/stop is relying on it. It has been a core language feature since long before I started using Python. Also I'm not the only person to point out that a more common problem is with wraparound when doing something like a[:n]. That is a more significant problem with slicing and I don't understand why the emphasis here is all on the negative strides rather than negative indexing. Using negative indices to mean "from the end" is a mistake and it leads to things like this:
>>> a = 'abcdefg'
>>> for n in reversed(range(7)):
...     print(n, a[:-n])
...
6 a
5 ab
4 abc
3 abcd
2 abcde
1 abcdef
0
You can do something like a[:-n or None] to correctly handle zero but that still does the wrong thing when n is negative. Also why do you get an error when your index goes off one end of the array but not when it goes off the other?
>>> a[len(a)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> a[-1]
'g'
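The a[:-n or None] workaround mentioned above can be checked directly. This is only a sketch; chop_with_or_none is a made-up name used to show where the idiom helps and where it still misbehaves.

```python
def chop_with_or_none(seq, n):
    """Drop the last n items using the '-n or None' idiom."""
    # When n == 0, -n is falsy, so the stop bound becomes None (no bound).
    return seq[:-n or None]


a = "abcdefg"

zero_case = chop_with_or_none(a, 0)       # whole string, as intended
normal_case = chop_with_or_none(a, 2)     # last two characters removed
negative_case = chop_with_or_none(a, -2)  # silently becomes a[:2]

print(zero_case, normal_case, negative_case)
```

A negative n does not raise; it quietly turns into a slice from the front, which is the remaining failure mode described in the message.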
I have never been in a situation where I was writing code and didn't know whether I wanted to slice/index from the end or the beginning at coding time. I would much rather that a[-1] be an error and have an explicit syntax for indexing from the end. I have never found this wraparound to be useful and I think that if there is a proposal to change slicing in a backward incompatible way then it should be to something that is significantly better by solving these real problems. I have often had bugs or been forced to write awkward code because of these. The negative slicing indices may be a bit harder to reason about but it has never actually caused me any problems. Something like the matlab/pike syntax would fix these problems as well as making it possible to use the same indices for negative stride slices. That would be worth a deprecation process in my opinion. The specific suggestion so far does not have enough of an advantage to justify breaking anyone's code IMO.
As for the rush, it's because 3.4b1 is approaching and if this slips to 3.5 that's 1.5 years of deprecation time lost for something that doesn't have a syntactic break to help you discover the change in semantics.
+1 on doing a deprecation in 3.4.
-1 on any deprecation without a clear plan for a better syntax. Simply changing the semantics of the current syntax would bring in who knows how many off-by-one errors for virtually no benefit.
The deprecation would be in there from now until Python 4 so it wouldn't be sudden (remember that we are on a roughly 18 month release cycle, so if this went into 3.4 that's 7.5 years until this changes in Python 4). And there's already a future-compatible way to change your code to get the same results in the end that just require more explicit steps/code.
This argument would be more persuasive if you said: "the new better syntax that solves many of the slicing problems will be introduced *now*, and the old style/syntax will be deprecated later". Oscar
On 10/28/2013 01:20 PM, Oscar Benjamin wrote:
On 28 October 2013 15:04, Brett Cannon <brett@python.org> wrote:
I'm not saying not to talk to them, but I also don't think we should necessarily not change it because no one uses it either. If it's wide spread then sure, we just live with it. It's always a balancing act of fixing for future code vs. pain of current code. I'm just saying we shouldn't dismiss changing this out of hand because you are so far the only person who has relied on this.
Anyone who has used negative strides and non-default start/stop is relying on it. It has been a core language feature since long before I started using Python.
I have code that relies on this.
Also I'm not the only person to point out that a more common problem is with wraparound when doing something like a[:n]. That is a more significant problem with slicing and I don't understand why the emphasis here is all on the negative strides rather than negative indexing. Using negative indices to mean "from the end" is a mistake and it leads to things like this:
--> a = 'abcdefg'
--> for n in reversed(range(7)):
...     print(n, a[:-n])
...
6 a
5 ab
4 abc
3 abcd
2 abcde
1 abcdef
0
I've been bitten by this more than once. :(
Something like the matlab/pike syntax would fix these problems as well as making it possible to use the same indices for negative stride slices. That would be worth a deprecation process in my opinion. The specific suggestion so far does not have enough of an advantage to justify breaking anyone's code IMO.
+1
As for the rush, it's because 3.4b1 is approaching and if this slips to 3.5 that's 1.5 years of deprecation time lost for something that doesn't have a syntactic break to help you discover the change in semantics.
The difference between 7.5 years and 6.0 years of deprecation time doesn't seem that significant to me. Besides, can't we add the deprecation to the docs whenever? -- ~Ethan~
On 29 Oct 2013 06:21, "Oscar Benjamin" <oscar.j.benjamin@gmail.com> wrote:
This argument would be more persuasive if you said: "the new better syntax that solves many of the slicing problems will be introduced *now*, and the old style/syntax will be deprecated later".
Indeed. I like Terry's proposed semantics, so if we can give that a new syntax, we can just give range and slice appropriate "reverse=False" keyword arguments like sorted and list.sort, and never have to deprecate negative strides (although negative strides would be disallowed when reverse=True).
For example:
s[i:j:k] - normal forward slice
s[i:j:<<k] - reversed slice (with i and j as left/right rather than start/stop)
Reversing with unit stride could be: s[i:j:<<]
When reverse=True, start, stop and step for the range or slice would be calculated as follows:
start = len(s)-1 if j is None else j-1
stop = -1 if i is None else i-1
step = -k
It doesn't solve the -j vs len(s)-j problem for the end index, but I think it's still more intuitive for the reasons Tim and Terry gave.
Cheers, Nick.
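The start/stop/step calculation above can be tried out as a plain function returning a range of indices. This is a sketch under assumptions: reversed_indices is a hypothetical name, i and j are taken to be None or non-negative and within bounds, and k is a positive step, since the formulas as written do not cover negative bounds.

```python
def reversed_indices(length, i=None, j=None, k=1):
    """Indices visited by the sketched reversed slice s[i:j:<<k] for len(s) == length."""
    if k <= 0:
        raise ValueError("k must be positive when reversing")
    start = length - 1 if j is None else j - 1
    stop = -1 if i is None else i - 1
    step = -k
    # A range is used rather than a slice because a stop of -1 in a slice
    # would be read as "the last item" instead of "one before index 0".
    return range(start, stop, step)


s = "abcde"
print([s[x] for x in reversed_indices(len(s))])
print([s[x] for x in reversed_indices(len(s), 1, 4)])
print([s[x] for x in reversed_indices(len(s), 0, 4, 2)])
```

With a step other than 1 this walks from the right-hand bound, so it can select different items than reversing the forward slice s[i:j:k] would.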
I'm not sure I like new syntax. We'd still have to find a way to represent this with slice() and also with range(). On Mon, Oct 28, 2013 at 3:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 29 Oct 2013 06:21, "Oscar Benjamin" <oscar.j.benjamin@gmail.com> wrote:
This argument would be more persuasive if you said: "the new better syntax that solves many of the slicing problems will be introduced *now*, and the old style/syntax will be deprecated later".
Indeed. I like Terry's proposed semantics, so if we can give that a new syntax, we can just give range and slice appropriate "reverse=False" keyword arguments like sorted and list.sort, and never have to deprecate negative strides (although negative strides would be disallowed when reverse=True).
For example:
s[i:j:k] - normal forward slice
s[i:j:<<k] - reversed slice (with i and j as left/right rather than start/stop)
Reversing with unit stride could be: s[i:j:<<]
When reverse=True, start, stop and step for the range or slice would be calculated as follows:
start = len(s)-1 if j is None else j-1
stop = -1 if i is None else i-1
step = -k
It doesn't solve the -j vs len(s)-j problem for the end index, but I think it's still more intuitive for the reasons Tim and Terry gave.
Cheers, Nick.
-- --Guido van Rossum (python.org/~guido)
On 29 Oct 2013 08:41, "Guido van Rossum" <guido@python.org> wrote:
I'm not sure I like new syntax. We'd still have to find a way to represent this with slice() and also with range().
Those are much easier: we can just add a "reverse=False" keyword-only argument.
However, I realised that given the need to appropriately document these function signatures and the precedent set by sorted (where the reverse flag is essentially an optimisation trick that avoids a separate reversal operation) the cleaner interpretation of such an argument is for:
range(i, j, k, reverse=True)
to effectively mean:
range(i, j, k)[::-1]
and for:
s[slice(i, j, k, reverse=True)]
to effectively mean:
s[i:j:k][::-1]
range and slice would handle the appropriate start/stop/step calculations under the hood and hence be backwards compatible with existing container implementations and other code.
This approach also means we could avoid addressing the slice reversal syntax question for 3.4, and revisit it in the 3.5 time frame (and ditto for deprecating negative strides).
However, the idea of just allowing keyword args to be passed to the slice builtin in the slice syntax did occur to me:
s[i:j:k:reverse=True]
Cheers, Nick.
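The suggested meaning of reverse=True can be approximated today with ordinary functions, since Python 3 range objects already support slicing. The names range_reversed and slice_reversed below are hypothetical; range() and slice() do not accept a reverse argument.

```python
def range_reversed(i, j, k=1):
    """What range(i, j, k, reverse=True) is described as meaning."""
    # Slicing a range returns another range with adjusted start/stop/step.
    return range(i, j, k)[::-1]


def slice_reversed(s, i, j, k=1):
    """What s[slice(i, j, k, reverse=True)] is described as meaning."""
    return s[i:j:k][::-1]


s = "abcde"
print(list(range_reversed(0, 10, 3)))
print(slice_reversed(s, 0, 4, 2))
```

Because the reversal is applied after the forward selection, the same items are chosen as in the forward slice, just in the opposite order.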
On Mon, Oct 28, 2013 at 3:29 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 29 Oct 2013 06:21, "Oscar Benjamin" <oscar.j.benjamin@gmail.com> wrote:
On 28 October 2013 15:04, Brett Cannon <brett@python.org> wrote:
On Mon, Oct 28, 2013 at 10:49 AM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On 28 October 2013 13:57, Brett Cannon <brett@python.org> wrote:
I think that the numpy mailing lists should be consulted before any decisions are made. As Antoine says: if you've never noticed this behaviour before then it obviously doesn't matter to you that much so why the rush to deprecate it?
I'm not saying not to talk to them, but I also don't think we should necessarily not change it because no one uses it either. If it's wide spread then sure, we just live with it. It's always a balancing act of fixing for future code vs. pain of current code. I'm just saying we shouldn't dismiss changing this out of hand because you are so far the only person who has relied on this.
Anyone who has used negative strides and non-default start/stop is relying on it. It has been a core language feature since long before I started using Python.
Also I'm not the only person to point out that a more common problem is with wraparound when doing something like a[:n]. That is a more significant problem with slicing and I don't understand why the emphasis here is all on the negative strides rather than negative indexing. Using negative indices to mean "from the end" is a mistake and it leads to things like this:
>>> a = 'abcdefg'
>>> for n in reversed(range(7)):
...     print(n, a[:-n])
...
6 a
5 ab
4 abc
3 abcd
2 abcde
1 abcdef
0
You can do something like a[:-n or None] to correctly handle zero but that still does the wrong thing when n is negative.
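A minimal sketch of the pitfall described above. The helper name `drop_last` is hypothetical (it is not part of Python); it slices up to len(a) - n instead of -n and rejects negative counts, which is one way to get an error where the plain slice silently does something else:

```python
def drop_last(a, n):
    """Return a without its last n items (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # len(a) - n does not change meaning at n == 0 the way -n does;
    # max() keeps n > len(a) from turning into a negative stop.
    return a[:max(len(a) - n, 0)]


a = 'abcdefg'
print(repr(a[:-0]))           # -0 is just 0, so this is a[:0]
print(repr(a[:-0 or None]))   # the "or None" workaround from the message
print(repr(drop_last(a, 0)))  # explicit length arithmetic
```

The workaround and the helper agree for n >= 0; they differ only in how a negative n is treated.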
Also why do you get an error when your index goes off one end of the array but not when it goes off the other?
>>> a[len(a)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> a[-1]
'g'
I have never been in a situation where I was writing code and didn't know whether I wanted to slice/index from the end or the beginning at coding time. I would much rather that a[-1] be an error and have an explicit syntax for indexing from the end.
I have never found this wraparound to be useful and I think that if there is a proposal to change slicing in a backward incompatible way then it should be to something that is significantly better by solving these real problems. I have often had bugs or been forced to write awkward code because of these. The negative slicing indices may be a bit harder to reason about but it has never actually caused me any problems.
Something like the matlab/pike syntax would fix these problems as well as making it possible to use the same indices for negative stride slices. That would be worth a deprecation process in my opinion. The specific suggestion so far does not have enough of an advantage to justify breaking anyone's code IMO.
As for the rush, it's because 3.4b1 is approaching and if this slips to 3.5 that's 1.5 years of deprecation time lost for something that doesn't have a syntactic break to help you discover the change in semantics.
+1 on doing a deprecation in 3.4.
-1 on any deprecation without a clear plan for a better syntax.
Simply changing the semantics of the current syntax would bring in who knows how many off-by-one errors for virtually no benefit.
The deprecation would be in there from now until Python 4 so it wouldn't be sudden (remember that we are on a roughly 18 month release cycle, so if this went into 3.4 that's 7.5 years until this changes in Python 4). And there's already a future-compatible way to change your code to get the same results in the end that just requires more explicit steps/code.
This argument would be more persuasive if you said: "the new better syntax that solves many of the slicing problems will be introduced *now*, and the old style/syntax will be deprecated later".
Indeed. I like Terry's proposed semantics, so if we can give that a new syntax, we can just give range and slice appropriate "reverse=False" keyword arguments like sorted and list.sort, and never have to deprecate negative strides (although negative strides would be disallowed when reverse=True).
For example:
s[i:j:k] - normal forward slice
s[i:j:<<k] - reversed slice (with i and j as left/right rather than start/stop)
Reversing with unit stride could be: s[i:j:<<]
When reverse=True, start, stop and step for the range or slice would be calculated as follows:
start = len(s)-1 if j is None else j-1
stop = -1 if i is None else i-1
step = -k
It doesn't solve the -j vs len(s)-j problem for the end index, but I think it's still more intuitive for the reasons Tim and Terry gave.
Cheers, Nick.
Oscar
-- --Guido van Rossum (python.org/~guido)
On Mon, Oct 28, 2013 at 7:13 PM, Nick Coghlan <ncoghlan@gmail.com> wrote: ..
to effectively mean:
range(i, j, k)[::-1]
and for:
s[slice(i, j, k, reverse=True)]
to effectively mean:
s[i:j:k][::-1]
range and slice would handle the appropriate start/stop/step calculations under the hood and hence be backwards compatible with existing container implementations and other code.
This approach also means we could avoid addressing the slice reversal syntax question for 3.4, and revisit it in the 3.5 time frame (and ditto for deprecating negative strides). However, the idea of just allowing keyword args to be passed to the slice builtin in the slice syntax did occur to me:
s[i:j:k:reverse=True]
+1 In fact, I suggested the same before reading your e-mail.
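As a rough illustration of the semantics quoted above, the following sketch emulates the proposed flag with ordinary functions. The names `reversed_range` and `reversed_slice` are hypothetical stand-ins; no `reverse` argument exists on range() or slice() today. It relies only on the fact that slicing a range object already recomputes start/stop/step:

```python
def reversed_range(start, stop, step=1):
    """Hypothetical stand-in for range(start, stop, step, reverse=True)."""
    # Slicing a range returns another range with adjusted bounds.
    return range(start, stop, step)[::-1]


def reversed_slice(s, i, j, k=1):
    """Hypothetical stand-in for s[slice(i, j, k, reverse=True)]."""
    return s[i:j:k][::-1]


r = reversed_range(0, 10, 3)
print(r)        # still a lazy range object
print(list(r))  # [9, 6, 3, 0]
print(reversed_slice("abcdefghij", 3, 8, 2))  # 'hfd'
```

Note this is the "[i:j:k][::-1]" reading; the "[i:j][::-k]" reading discussed elsewhere in the thread gives different results for some non-unit steps.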
On Mon, Oct 28, 2013 at 6:41 PM, Guido van Rossum <guido@python.org> wrote:
I'm not sure I like new syntax.
Neither do I, but I've never liked the current extended slicing syntax either.
We'd still have to find a way to represent this with slice() and also with range().
These seem easy: slice(i, j, k, reverse=True) and range(i, j, k, reverse=True).
FWIW, I won't miss extended slicing syntax if it goes away in Python 4. I find a[slice(i, j, step=2, reverse=True)] more readable than a[i:j:-k].
Alternatively, we can allow keyword-argument-like syntax inside []: a[i:j, step=2, reverse=True] can become syntactic sugar for a[slice(i, j, step=2, reverse=True)].
Guido van Rossum wrote:
I'm not sure I like new syntax. We'd still have to find a way to represent this with slice() and also with range().
If we allowed slice[...] to create slice objects, any new indexing syntax would carry over to that. Similarly we could use range[...] to create ranges using slice syntax. -- Greg
On 28 October 2013 22:41, Guido van Rossum <guido@python.org> wrote:
I'm not sure I like new syntax. We'd still have to find a way to represent this with slice() and also with range().
It's a shame there isn't an indexing syntax where you can supply an iterator that produces the set of indexes you want and returns the subsequence - then we could experiment with alternative semantics in user code.
So, for example (silly example, because I don't have the time right now to define an indexing function that matches any of the proposed solutions):
>>> def PrimeSlice():
>>>     yield 2
>>>     yield 3
>>>     yield 5
>>>     yield 7
>>> 'abcdefgh'[[PrimeSlice()]]
'bceg'
But of course, to make this user-definable needs new syntax in the first place :-(
Paul
On 29 October 2013 20:23, Paul Moore <p.f.moore@gmail.com> wrote:
On 28 October 2013 22:41, Guido van Rossum <guido@python.org> wrote:
I'm not sure I like new syntax. We'd still have to find a way to represent this with slice() and also with range().
It's a shame there isn't an indexing syntax where you can supply an iterator that produces the set of indexes you want and returns the subsequence - then we could experiment with alternative semantics in user code.
So, for example (silly example, because I don't have the time right now to define an indexing function that matches any of the proposed solutions):
>>> def PrimeSlice():
>>>     yield 2
>>>     yield 3
>>>     yield 5
>>>     yield 7
>>> 'abcdefgh'[[PrimeSlice()]]
'bceg'
But of course, to make this user-definable needs new syntax in the first place :-(
Tangent: I thought of a list comprehension based syntax for that a while ago, but decided it wasn't particularly interesting since it's too hard to provide sensible fallback behaviour for existing containers:
'abcdefgh'[x for x in PrimeSlice()]
Back on the topic of slicing with negative steps, I did some experimentation to see what could be done in current Python using a callable that produces the appropriate slice objects, and it turns out you can create a quite usable "rslice" callable, provided you pass in the length when dealing with mismatched signs on the indices (that's the advantage of the "[i:j][::-k]" interpretation of the reversed slice - if you want to interpret it as "[i:j:k][::-1]" as I suggested previously, I believe you would need to know the length of the sequence in all cases):
def rslice(*slice_args, length=None):
    """For args (i, j, k) computes a slice equivalent to [i:j][::-k]
    (which is not the same as [i:j:-k]!)"""
    forward = slice(*slice_args) # Easiest way to emulate slice arg parsing!
    # Always negate the step
    step = -forward.step
    # Given slice args are closed on the left, open on the right,
    # simply negating the step and swapping left and right will introduce
    # an off-by-one error, so we need to adjust the endpoints to account
    # for the open/closed change
    left = forward.start
    right = forward.stop
    # Check for an empty slice before tinkering with offsets
    if left is not None and right is not None:
        if (left >= 0) != (right >= 0):
            if length is None:
                raise ValueError("Must supply length for indices of different signs")
            if left < 0:
                left += length
            else:
                right += length
        if left >= right:
            return slice(0, 0, 1)
    stop = left
    if stop is not None:
        # Closed on the left -> open stop value in the reversed slice
        if stop:
            stop -= 1
        else:
            # Converting a start offset of 0 to an end offset of -1 does
            # the wrong thing - need to convert it to None instead
            stop = None
    start = right
    if start is not None:
        # Open on the right -> closed start value in the reversed slice
        if start:
            start -= 1
        else:
            # Converting a stop offset of 0 to a start offset of -1 does
            # the wrong thing - need to convert it to None instead
            start = None
    return slice(start, stop, step)

# Test case
data = range(10)
for i in range(-10, 11):
    for j in range(-10, 11):
        for k in range(1, 11):
            expected = data[i:j][::-k]
            actual = data[rslice(i, j, k, length=len(data))]
            if actual != expected:
                print((i, j, k), actual, expected)

So, at this point, I still quite like the idea of adding a "reverse=True" keyword only arg to slice and range (with the semantics of rslice above), and then revisit the idea of offering syntax for it in Python 3.5.
Since slices are objects, they could store the "reverse" flag internally, and only apply it when the indices() method (or the C API equivalent) is called to convert the abstract indices to real ones for the cases where the info is needed - otherwise they'd do the calculation above to create a suitable "forward" definition for maximum compatibility with existing container implementations.
A separate keyword only arg like "addlen=True" would also make it possible to turn off the negative indexing support in a slice object's indices() method, and switch it to clamping to zero instead.
An alternative to both of those ideas would be to eliminate the restriction on subclassing slice objects in CPython, then you could implement slice objects with different indices() method behaviour (they'd still have to produce a start, stop, step triple though, so they wouldn't offer the full generality Paul was describing).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 29/10/2013 13:32, Nick Coghlan wrote:
So, at this point, I still quite like the idea of adding a "reverse=True" keyword only arg to slice and range (with the semantics of rslice above), and then revisit the idea of offering syntax for it in Python 3.5.
I sincerely hope that you meant reverse=False? :) -- Python is the second best programming language in the world. But the best has yet to be invented. Christian Tismer Mark Lawrence
On 29 October 2013 23:55, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
On 29/10/2013 13:32, Nick Coghlan wrote:
So, at this point, I still quite like the idea of adding a "reverse=True" keyword only arg to slice and range (with the semantics of rslice above), and then revisit the idea of offering syntax for it in Python 3.5.
I sincerely hope that you meant reverse=False? :)
I could claim that I was referring to the way you would call it (which is why the subclass idea is also attractive), but yes, that's really just a typo and the default would be the other way around :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 29/10/2013 10:23, Paul Moore wrote:
On 28 October 2013 22:41, Guido van Rossum <guido@python.org> wrote:
I'm not sure I like new syntax. We'd still have to find a way to represent this with slice() and also with range().
It's a shame there isn't an indexing syntax where you can supply an iterator that produces the set of indexes you want and returns the subsequence - then we could experiment with alternative semantics in user code.
So, for example (silly example, because I don't have the time right now to define an indexing function that matches any of the proposed solutions):
>>> def PrimeSlice():
>>>     yield 2
>>>     yield 3
>>>     yield 5
>>>     yield 7
>>> 'abcdefgh'[[PrimeSlice()]]
'bceg'
But of course, to make this user-definable needs new syntax in the first place :-(
We already use (...) and [...], which leaves {...}:
>>> 'abcdefgh'{PrimeSlice()}
'bceg'
On 29 October 2013 18:08, MRAB <python@mrabarnett.plus.com> wrote:
On 29/10/2013 10:23, Paul Moore wrote:
On 28 October 2013 22:41, Guido van Rossum <guido@python.org> wrote:
I'm not sure I like new syntax. We'd still have to find a way to represent this with slice() and also with range().
It's a shame there isn't an indexing syntax where you can supply an iterator that produces the set of indexes you want and returns the subsequence - then we could experiment with alternative semantics in user code.
So, for example (silly example, because I don't have the time right now to define an indexing function that matches any of the proposed solutions):
>>> def PrimeSlice():
>>>     yield 2
>>>     yield 3
>>>     yield 5
>>>     yield 7
>>> 'abcdefgh'[[PrimeSlice()]]
'bceg'
But of course, to make this user-definable needs new syntax in the first place :-(
We already use (...) and [...], which leaves {...}:
>>> 'abcdefgh'{PrimeSlice()}
'bceg'
You could probably do it by simply adding an extra case to __getitem__ on builtin types: check in order for integer/object with __index__ (single index), Slice object (traditional slice), iterable (series of arbitrary indices). User defined types would have to implement this in the same way that they currently have to implement Slice behaviour, and dictionaries would not behave the same (again, in the same way as for Slice). Basically an iterable doesn't need to be any more of a special case than Slice (in fact Slice is *more* special, because there is syntax that generates Slice objects). Paul
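The dispatch order described above can be tried out today on a user-defined type. This is only a sketch under assumptions: the wrapper class `IterIndexable` is hypothetical, and builtin str/list do not accept iterables as indices. It checks for an integer-like key first, then a slice, then an iterable of indices:

```python
import operator
from collections.abc import Iterable


class IterIndexable:
    """Hypothetical wrapper: also accepts an iterable of indices."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # 1. integer or object with __index__ (single index)
        try:
            index = operator.index(key)
        except TypeError:
            pass
        else:
            return self._data[index]
        # 2. slice object (traditional slice)
        if isinstance(key, slice):
            return self._data[key]
        # 3. iterable (series of arbitrary indices)
        if isinstance(key, Iterable):
            picked = [self._data[operator.index(i)] for i in key]
            if isinstance(self._data, str):
                return "".join(picked)
            return picked
        raise TypeError("unsupported index type: %r" % type(key).__name__)


def PrimeSlice():
    yield 2
    yield 3
    yield 5
    yield 7


print(IterIndexable('abcdefgh')[PrimeSlice()])
```

With Python's zero-based indices this selects 'cdfh'; the 'bceg' shown in the quoted example corresponds to counting positions from one.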
On 10/28/2013 4:20 PM, Oscar Benjamin wrote:
Also I'm not the only person to point out that a more common problem is with wraparound when doing something like a[:n].
I think it a mistake to think in terms of 'wraparound'. This implies to me that there is a mod len(s) operation applied, and there is not. A negative index or slice position, -n, is simply an abbreviation for len(s) - n. Besides being faster to write, the abbreviation runs about 3x as fast with 3.3.2 on my machine.
>>> timeit.timeit('pass', "s='abcde'")
0.02394336495171956
>>> timeit.timeit('pass', "s='abcde'")
0.02382040032352961
.024 timeit overhead

>>> timeit.timeit('s[-3]', "s='abcde'")
0.06969358444349899
>>> timeit.timeit('s[-3]', "s='abcde'")
0.06534832190172146
.068 - .024 = .044 net

>>> timeit.timeit('s[len(s)-3]', "s='abcde'")
0.15656133106750403
>>> timeit.timeit('s[len(s)-3]', "s='abcde'")
0.15518289758767878
.156 - .024 = .132 net
The trick works because Python, unlike some other languages, does not allow negative indexing from the start of the array. If Python had required an explicit end marker from the beginning, conditional code would be required if the sign were unknown. If Python gained one today, it would have to be optional for back compatibility. -- Terry Jan Reedy
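The "abbreviation for len(s) - n" reading can be checked directly. A small sketch, using the same sample string as the timings above; it also shows that no modulo is applied, since an index below -len(s) is an error rather than a second wrap:

```python
s = 'abcde'

# -n and len(s) - n name the same element and the same slice position.
for n in range(1, len(s) + 1):
    assert s[-n] == s[len(s) - n]
    assert s[-n:] == s[len(s) - n:]

# No "mod len(s)": -6 becomes -1 after adding len(s), which is out of range.
try:
    s[-6]
except IndexError:
    wrapped = False
else:
    wrapped = True

print(wrapped)
```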
On 28 October 2013 22:39, Terry Reedy <tjreedy@udel.edu> wrote:
On 10/28/2013 4:20 PM, Oscar Benjamin wrote:
Also I'm not the only person to point out that a more common problem is with wraparound when doing something like a[:n].
I think it a mistake to think in terms of 'wraparound'. This implies to me that there is a mod len(s) operation applied, and there is not. A negative index or slice position, -n, is simply an abbreviation for len(s) - n. Besides being faster to write, the abbreviation runs about 3x as fast with 3.3.2 on my machine.
I realise that it doesn't use modulo arithmetic. As I said earlier I would be able to find uses for the current behaviour if it did. However it does wraparound in some sense when the sign changes.
The trick works because Python, unlike some other languages, does not allow negative indexing from the start of the array. If Python had required an explicit end marker from the beginning, conditional code would be required if the sign were unknown. If Python gained one today, it would have to be optional for back compatibility.
(I don't know if I understand what you mean.) Have you ever written code where you didn't know if you wanted to index from the start or the end of a sequence? I haven't and I use slicing/indexing extensively. I have to write conditional code to handle the annoyingly permissive current behaviour. Things that should be an error such as passing in a negative index don't produce an error so I have to check for it myself. Oscar
On 28.10.2013 16:08, "Brett Cannon" <brett@python.org> wrote:
The deprecation would be in there from now until Python 4 so it wouldn't be sudden (remember that we are on a roughly 18 month release cycle, so if this went into 3.4 that's 7.5 years until this changes in Python 4).
I don't get your calculation: after 3.9 clearly follows 3.10, as versions aren't decimal numbers, but tuples of integers. So we have 1.5×X years, with X being any number from 1 to infinity that Guido deems suitable.
@proposal: -1 for explicit impliciticity in slicing syntax, as it's as complicated as it sounds (when phrased like I just did) and noisier than obfuscated C
+1 for deprecating negative slicing, and teaching people to use reversed.
But I think we should consider adding some sort of slice view function, since list[::2] already creates a copy, and reversed(list[::2]) creates two.
On Mon, Oct 28, 2013 at 7:17 PM, Philipp A. <flying-sheep@web.de> wrote:
On 28.10.2013 16:08, "Brett Cannon" <brett@python.org> wrote:
The deprecation would be in there from now until Python 4 so it wouldn't be sudden (remember that we are on a roughly 18 month release cycle, so if this went into 3.4 that's 7.5 years until this changes in Python 4).
I don't get your calculation: after 3.9 clearly follows 3.10, as versions aren't decimal numbers, but tuples of integers.
So we have 1.5×X years, with X being any number from 1 to infinity that Guido deems suitable.
Because Guido (and I as well) doesn't like minor version numbers that go past single digits, so the chances of 3.10 are very slim. That's why I put a cap on the possible number of years before something gets removed. -Brett
@proposal: -1 for explicit impliciticity in slicing syntax, as it's as complicated as it sounds (when phrased like I just did) and noisier than obfuscated C
+1 for deprecating negative slicing, and teaching people to use reversed.
But I think we should consider adding some sort of slice view function, since list[::2] already creates a copy, and reversed(list[::2]) creates two.
On Oct 29, 2013, at 09:39 AM, Brett Cannon wrote:
Because Guido (and I as well) doesn't like minor version numbers that go past single digits, so the chances of 3.10 are very slim. That's why I put a cap on the possible number of years before something gets removed.
And why 2.6.9 is the end of the line for 2.6. :) -Barry
On 27.10.2013 22:56, Tim Peters wrote:
[Guido]
I wouldn't take out negative strides completely, but I might consider deprecating lower and upper bounds other than None (== missing). So a[::-1] would still work, and a[None:None:-1] would be a verbose way of spelling the same,
Happy idea.
but a[-1:-6:-1] would be deprecated.
Not sure I've _ever_ seen that in real life. [...]
Before such a change is considered, I'd like to see the numpy community consulted; numpy users probably use slicing more than anyone else. At least it should be possible to integrate the new slicing without breaking too many other numpy behavior. As a datapoint, Matlab negative-stride slicing is similar to Python, but it is less confusing (IMO) since the slices are inclusive on both ends. Let "a" be a range from 1 to 10:
a(2:7) % slicing is done with parens; 1-based indexing
[2 3 4 5 6 7]
a(2:2:7) % the stride is the middle value
[2 4 6]
a(2:-1:7) % same as in Python
[]
a(7:-1:2) % a(i:-1:j) == reverse of a(j:1:i) due to end-inclusive
[7 6 5 4 3 2]
a(7:-2:2) % but obviously not so for non-unity stride
[7 5 3]
cheers, Georg
I wonder if it would have been simpler if we had defined a[i:j:-1] as the reverse of a[i:j]?
Maybe, I'm not venturing an opinion. But if so: What about negative strides other than -1? Should a[i:j:-2] always be the reverse of a[i:j:2]? My feeling is not, i.e. "abcdefghij"[3:8:2] == "dfh" but I feel that "abcdefghij"[3:8:-2] under this suggestion should be what "abcdefghij"[8:3:-2] is now, i.e. "ige", not "hfd". I.e. any non-empty start:stop:stride slice where 0 <= slice < length should always start with the character indexed by "start". Rob Cliffe
What are real use cases for negative strides?
-- --Guido van Rossum (python.org/~guido <http://python.org/%7Eguido>)
On 27/10/2013 23:34, Rob Cliffe wrote:
I wonder if it would have been simpler if we had defined a[i:j:-1] as the reverse of a[i:j]?
Maybe, I'm not venturing an opinion. But if so: What about negative strides other than -1? Should a[i:j:-2] always be the reverse of a[i:j:2]? My feeling is not, i.e. "abcdefghij"[3:8:2] == "dfh" but I feel that "abcdefghij"[3:8:-2] under this suggestion should be what "abcdefghij"[8:3:-2] is now, i.e. "ige", not "hfd". I.e. any non-empty start:stop:stride slice where 0 <= slice < length should always start with the character indexed by "start".
Correction: I meant "where 0 <= start < length". On reconsideration, please ignore this condition.
Rob Cliffe
What are real use cases for negative strides?
-- --Guido van Rossum (python.org/~guido <http://python.org/%7Eguido>)
On 2013-10-27 18:32, Guido van Rossum wrote:
What are real use cases for negative strides?
The main use case is numpy, I would wager. Slicing a numpy array returns a view on the original array; negative-stride views work just as well as positive strides in numpy's memory model. Most other sequences copy when sliced, so reversed() tends to work fine for them.
In my experience, the most common use of negative strides is a simple reversal of the whole array by leaving out the bounds:
a[::-stride]
I think I have done the following once (to clip the first `i` and last `j` elements and reverse, cleanly handling reasonable values of `i` and `j`):
a[-j+len(a)-1:i-len(a)-1:-stride]
But I think I tend to do this more often:
a[i:-j][::-stride]
(Though really, this needs to start with `a[i:len(a)-j]`, to handle `j==0`, as others have pointed out. I run into that problem more commonly.)
Implementation issues aside, the intention is just easier to read and reason about with the last option. It doesn't take much experience to get a good feeling for what each of those simple operations do and how they would compose together. Combining them into one operation, no matter what syntax you pick, is just going to be harder to learn.
I don't think the language needs to change. The latter uses are pretty rare in my experience, and the last option is a good one. The amount of documentation you would need for any new syntax would be about the same as just pointing to the last option.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
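To compare the two spellings above, here is a small sketch with two hypothetical helpers. The single-slice form is written with the stop bound as i - len(a) - 1, which is my assumption about the intended expression; it is only checked for 0 <= i, j < len(a), since j == len(a) would turn the start into -1:

```python
def clip_and_reverse(a, i, j, stride=1):
    """Two-step form: drop the first i and last j items, then reverse."""
    return a[i:len(a) - j][::-stride]


def clip_and_reverse_single(a, i, j, stride=1):
    """One-slice form (assumed bounds; valid for 0 <= i, j < len(a))."""
    return a[len(a) - j - 1:i - len(a) - 1:-stride]


a = list(range(10))
mismatches = [
    (i, j, k)
    for i in range(len(a))
    for j in range(len(a))
    for k in range(1, 4)
    if clip_and_reverse(a, i, j, k) != clip_and_reverse_single(a, i, j, k)
]
print(mismatches)
print(clip_and_reverse(a, 2, 3))
```

The two-step form needs no restriction on j and reads as two simple operations, which is the point made in the message.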
On 27.10.13 19:04, Guido van Rossum wrote:
In the comments of http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.... there were some complaints about the interpretation of the bounds for negative strides, and I have to admit it feels wrong. Where did we go wrong? For example,
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
But you can put None.
>>> "abcde"[:None:-1]
'edcba'
I may have a different slant on this. I've found that - by far - the most successful way to "teach slices" to newcomers is to invite them to view indices as being _between_ sequence elements.
<position 0> <element> <position 1> <element> <position 2> <element> <position 3> ...
Then i:j selects the elements between position i and position j.
>>> "abcde"[2:4]
'cd'
But for negative strides this is all screwed up ;-)
>>> "abcde"[4:2:-1]
'ed'
They're not getting the elements between "positions" 2 and 4 then, they're getting the elements between positions 3 and 5. Why? "Because that's how it works" - they have to switch from thinking about positions to thinking about array indexing.
So I would prefer that the i:j in s[i:j:k] _always_ specify the positions in play:
if i < 0: i += len(s) # same as now
if j < 0: j += len(s) # same as now
if i >= j: the slice is empty! # this is different - the sign of k is irrelevant
else:
    the slice indices selected will be i, i + abs(k), i + 2*abs(k), ... up to but not including j
    if k is negative, this index sequence will be taken in reverse order
Then "abcde"[4:2:-1] would be "", while "abcde"[2:4:-1] would be "dc", the reverse of "abcde"[2:4]. And s[0:len(s):-1] would be the same as reversed(s).
So it's always a semi-open range, inclusive "at the left" and exclusive "at the right". But that's more a detail: the _point_ is to preserve the mental model of selecting the elements "between position".
Of course I'd change range() similarly.
[Guido]
For example,
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
Actually, any integer <= -1-len("abcde") = -6 works. But, yes, that's bizarre ;-)
... Are we stuck with this forever?
Probably :-(
...
On Sun, Oct 27, 2013 at 11:38 AM, Tim Peters <tim.peters@gmail.com> wrote:
I may have a different slant on this.
Hardly -- I agree with everything you say here. :-)
I've found that - by far - the most successful way to "teach slices" to newcomers is to invite them to view indices as being _between_ sequence elements.
Yup.
<position 0> <element> <position 1> <element> <position 2> <element> <position 3> ...
Then i:j selects the elements between position i and position j.
>>> "abcde"[2:4]
'cd'
But for negative strides this is all screwed up ;-)
>>> "abcde"[4:2:-1]
'ed'
Right, that's the point of my post.
They're not getting the elements between "positions" 2 and 4 then, they're getting the elements between positions 3 and 5. Why? "Because that's how it works" - they have to switch from thinking about positions to thinking about array indexing.
So I would prefer that the i:j in s[i:j:k] _always_ specify the positions in play:
if i < 0: i += len(s) # same as now
if j < 0: j += len(s) # same as now
if i >= j: the slice is empty! # this is different - the sign of k is irrelevant
else:
    the slice indices selected will be i, i + abs(k), i + 2*abs(k), ... up to but not including j
    if k is negative, this index sequence will be taken in reverse order
Then "abcde"[4:2:-1] would be "", while "abcde"[2:4:-1] would be "dc", the reverse of "abcde"[2:4]. And s[0:len(s):-1] would be the same as reversed(s).
Except reversed() returns an iterator. But yes. This would also make a[i:j:k] == a[i:j][::k]. If I could do it over I would do it this way.
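The rule quoted above can be sketched as an ordinary function for experimentation. The name `proposed_slice` is hypothetical, and the clamping of out-of-range positions is my assumption, since the proposal does not spell that part out:

```python
def proposed_slice(s, i=None, j=None, k=1):
    """Slice s under the proposed rule: i:j always names positions."""
    if k == 0:
        raise ValueError("slice step cannot be zero")
    n = len(s)
    i = 0 if i is None else i
    j = n if j is None else j
    if i < 0:
        i += n  # same as now
    if j < 0:
        j += n  # same as now
    # Assumption: clamp positions into 0..n, as current slicing does.
    i = min(max(i, 0), n)
    j = min(max(j, 0), n)
    # Indices i, i + abs(k), ... up to but not including j (empty if i >= j).
    picked = s[i:j:abs(k)]
    # A negative k only reverses the order of the selected items.
    return picked[::-1] if k < 0 else picked


print(repr(proposed_slice("abcde", 4, 2, -1)))
print(repr(proposed_slice("abcde", 2, 4, -1)))
print(repr(proposed_slice("abcde", 0, 5, -1)))
```

For k == -1 this matches a[i:j][::-1]. For larger steps the rule as literally stated can differ from a[i:j][::k]: on "abcdef" with positions 0:6 and k == -2 it yields "eca", whereas "abcdef"[0:6][::-2] is "fdb".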
So it's always a semi-open range, inclusive "at the left" and exclusive "at the right". But that's more a detail: the _point_ is to preserve the mental model of selecting the elements "between position". Of course I'd change range() similarly.
Which probably would cause more backward incompatibility bugs, since by now many people have figured out that if you want [4, 3, 2, 1, 0] you have to write range(4, -1, -1). :-(
[Guido]
For example,
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
Actually, any integer <= -1-len("abcde") = -6 works. But, yes, that's bizarre ;-)
Yup, MRAB pointed this out too.
... Are we stuck with this forever?
Probably :-(
Sadly, I agree. If we wanted to change this in Python 4, we'd probably have to start deprecating range() with negative stride today to force people to replace their uses of that with reversed(range(...)). -- --Guido van Rossum (python.org/~guido)
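For reference, the replacement mentioned above already works in current Python. A minimal check that the negative-step spelling and the reversed() spelling produce the same numbers:

```python
# Counting down from 4 to 0, written both ways.
old_style = list(range(4, -1, -1))
new_style = list(reversed(range(5)))

print(old_style)
print(old_style == new_style)
```

reversed() on a range returns an iterator, so it stays lazy until list() consumes it.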
2013-10-27, 21:44, Guido van Rossum wrote:
On Sun, Oct 27, 2013 at 11:38 AM, Tim Peters <tim.peters@gmail.com> wrote: [...] If I could do it over I would do it this way.
So it's always a semi-open range, inclusive "at the left" and exclusive "at the right". But that's more a detail: the _point_ is to preserve the mental model of selecting the elements "between position". Of course I'd change range() similarly.
Which probably would cause more backward incompatibility bugs, since by now many people have figured out that if you want [4, 3, 2, 1, 0] you have to write range(4, -1, -1). :-(
Maybe introduction of a new builtin and deprecation of range() could be the remedy? The new builtin, named e.g. "scope", could even be a combination of today's range + slice?
>>> list(scope(0, 5, -1)) # Py 3.5+
[4, 3, 2, 1, 0]
>>> 'abcdef'[scope(0, 5, -1)] # Py 3.5+
'edcba'
>>> 'abcdef'[0:5:-1] # Py 4.0+
'edcba'
I'm just thinking out loud...
Cheers. *j
On Sun, Oct 27, 2013 at 01:38:05PM -0500, Tim Peters wrote:
I may have a different slant on this. I've found that - by far - the most successful way to "teach slices" to newcomers is to invite them to view indices as being _between_ sequence elements.
<position 0> <element> <position 1> <element> <position 2> <element> <position 3> ...
Then i:j selects the elements between position i and position j.
>>> "abcde"[2:4]
'cd'
I really like that view point, but it has a major problem. As beautifully elegant as the "cut between positions" model is for stride=1, it doesn't extend to non-unit strides. You cannot think about non-contiguous slices in terms of a pair of cuts at position <start> and <end>. I believe that the cleanest way to understand non-contiguous slices with stride > 1 is to think of array indices. That same model works for the negative stride case too. Further details below.
But for negative strides this is all screwed up ;-)
"abcde"[4:2:-1] 'ed'
They're not getting the elements between "positions" 2 and 4 then, they're getting the elements between positions 3 and 5.
As I suggested above, the "between elements" model doesn't work for non-unit strides. Consider this example:
py> s = "abcdefghi"
py> s[1:8:2]
'bdfh'
Here are the labelled between-element positions, best viewed with a monospaced font:
|a|b|c|d|e|f|g|h|i|
0 1 2 3 4 5 6 7 8 9
Since the slice here is non-contiguous, we don't have a single pair of cuts, but a series of them:
s[1:8:2] => s[1:2:1] + s[3:4:1] + s[5:6:1] + s[7:8:1] => 'bdfh'
that is, start at the <start> position and make a thin (one element) slice, advance forward by step and repeat until you reach the <end> position. But that's just a longer way of writing this:
s[1:8:2] => s[1] + s[3] + s[5] + s[7] => 'bdfh'
which I maintain is a cleaner way to think about non-unit step-sizes. It's certainly *shorter* to think of indexing rather than repeated thin slices, and it avoids the mistake (which I originally made) of thinking that each subslice has to be <stride> wide.
# Not this!
s[1:8:2] => s[1:3] + s[3:5] + s[5:7] + s[7:9] => 'bcdefghi'
# Or this!
s[1:8:2] => s[1:3] + s[4:6] + s[7:9] => 'bcefhi'
So I think that the cleanest way of thinking about *positive* non-unit strides is in terms of array indexing. *Negative* non-unit strides, including -1, are no different. First, here's an example with negative positions:
py> s[-1:-8:-2]
'igec'
which is just like
s[-1:-8:-2] => s[-1] + s[-3] + s[-5] + s[-7] => 'igec'
which is precisely the same as the positive step case: start at the <start>, continue until the end, stepping by <step> each time. If you insist on the "cut between positions" way of thinking, we can do that too:
s[-1:-8:-2] => s[-1:-2:-1] + s[-3:-4:-1] + s[-5:-6:-1] + s[-7:-8:-1] => 'igec'
10 9 8 7 6 5 4 3 2 1 # all negative positions
 |a|b|c|d|e|f|g|h|i|
The slice from -1 to -2 is "i", from -3 to -4 is "g", and so forth, exactly as for positive positions, except that negative positions are one-based instead of zero-based. (If ints could distinguish -0 from 0, we could fix that.)
Here's an example like the one that Tim described as "screwed up", with positive start and end positions and -1 stride:
py> s[6:2:-1]
'gfed'
This is not "take the slice from [2:6] and reverse it", which would give 'fedc'. That doesn't work, because slices are always closed at <start> and open at <end>, no matter which direction you go:
* [2:6:1] is closed at 2 (the start), open at 6 (the end)
* [6:2:-1] is closed at 6 (the start), open at 2 (the end)
If you are expecting differently, then (I believe) you are expecting that slices are closed on the *left* (lowest number), open on the *right* (highest number). But that's not what slices do. (Whether they *should* do it is another story.) However, the array index viewpoint works just fine here too:
s[6:2:-1] => s[6] + s[5] + s[4] + s[3]
Start at index 6, step down by -1 each time, stop at 2. As elegant as "cut between elements" is for the common case where the stride is 1, it doesn't work for stride != 1. I'm not concerned about the model breaking down for negative strides when it breaks down for positive non-unit strides too :-)
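The array-index view above can be checked against the interpreter. This is a minimal sketch using the built-in slice.indices() method; the helper name selected_indices is made up for illustration and is not part of the thread.

```python
def selected_indices(seq, start, stop, step):
    """Return the indices that seq[start:stop:step] visits under today's rules."""
    # slice.indices() normalises negative and omitted bounds for a given length.
    return list(range(*slice(start, stop, step).indices(len(seq))))


s = "abcdefghi"
print(selected_indices(s, 1, 8, 2))     # [1, 3, 5, 7]
print(selected_indices(s, -1, -8, -2))  # [8, 6, 4, 2]
print(selected_indices(s, 6, 2, -1))    # [6, 5, 4, 3]

# Joining the selected elements reproduces the slice itself.
print("".join(s[i] for i in selected_indices(s, 6, 2, -1)))  # gfed
```

In each case the first index is <start> and the walk stops before reaching <stop>, whichever direction the step points.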
Why? "Because that's how it works" - they have to switch from thinking about positions to thinking about array indexing.
But you already have to do this as soon as you allow non-unit strides! You even do it yourself, below: "the slice indices selected will be..." so you can't get away from indexes even if you try.
So I would prefer that the i:j in s[i:j:k] _always_ specify the positions in play:
if i < 0: i += len(s) # same as now
if j < 0: j += len(s) # same as now
if i >= j: the slice is empty! # this is different - the sign of k is irrelevant
else: the slice indices selected will be i, i + abs(k), i + 2*abs(k), ... up to but not including j
if k is negative, this index sequence will be taken in reverse order
In other words, you want negative strides to just mean "reverse the slice". Perhaps that would have been a good design. But we already have two good idioms for reversing slices:
reversed(seq[start:stop:step])
seq[start:stop:step][::-1]
Then "abcde"[4:2:-1] would be "", while "abcde"[2:4:-1] would be "dc", the reverse of "abcde"[2:4]. And s[0:len(s):-1] would be the same as reversed(s).
So it's always a semi-open range, inclusive "at the left" and exclusive "at the right". But that's more a detail:
It isn't a mere detail, it is the core of the change: changing from inclusive at the start to inclusive on the left, which are not the same thing. This is a significant semantic change. (Of course it is. You don't like the current semantics, since they trick you into off-by-one errors for negative strides. If the change was insignificant, it wouldn't help.) One consequence of this proposed change is that the <start> parameter is no longer always the first element returned. Sometimes <start> will be last rather than first. That disturbs me.
the _point_ is to preserve the mental model of selecting the elements "between positions". Of course I'd change range() similarly.
Currently, this is how you use range to count down from 10 to 1:
range(10, 0, -1) # 0 is excluded
To me, this makes perfect sense: I want to start counting at 10, so the first argument I give is 10 no matter whether I'm counting up or counting down. With your suggestion, we'd have:
range(1, 11, -1) # 11 is excluded
So here I have to put one more than the number I want to start with as the *second* argument, and the last number first, just because I'm counting down. I don't consider that an improvement. Certainly not an improvement worth breaking backwards compatibility for.
Are we stuck with this forever?
Probably :-(
Assuming we want to change -- and I'm not convinced we should -- there's always Python 4000, or if necessary from __future__ import negative_slices_reverse -- Steven
[Steven D'Aprano <steve@pearwood.info>]
... I really like that view point, but it has a major problem. As beautifully elegant as the "cut between positions" model is for stride=1, it doesn't extend to non-unit strides. You cannot think about non-contiguous slices in terms of a pair of cuts at position <start> and <end>. I believe that the cleanest way to understand non-contiguous slices with stride > 1 is to think of array indices. That same model works for the negative stride case too.
"Cut between positions" in my view has nothing to do with the stride. It's used to determine the portion of the sequence _to which_ the stride applies. From that portion, we take the first element, the first + |stride|'th element, the first + |stride|*2'th element, and so on. Finally, if stride < 0, we reverse that sequence. That's the proposal. It's simple and uniform.
Further details below.
Not really needed - I already know exactly how slicing works in Python today ;-)
... But that's just a longer way of writing this:
s[1:8:2] => s[1] + s[3] + s[5] + s[7] => 'bdfh'
which I maintain is a cleaner way to think about non-unit step-sizes.
As above, so do I. start:stop just delineates the subsequence to which the stride applies.
It's certainly *shorter* to think of indexing rather than repeated thin slices,
And I don't have "repeated thin slices" in mind at all.
.... If you are expecting differently, then (I believe) you are expecting that slices are closed on the *left* (lowest number), open on the *right* (highest number). But that's not what slices do. (Whether they *should* do it is another story.)
Guido started this thread precisely to ask what they should do. We already know what they _do_ do ;-)
So I would prefer that the i:j in s[i:j:k] _always_ specify the positions in play:
if i < 0: i += len(s) # same as now
if j < 0: j += len(s) # same as now
if i >= j: the slice is empty! # this is different - the sign of k is irrelevant
else: the slice indices selected will be i, i + abs(k), i + 2*abs(k), ... up to but not including j
if k is negative, this index sequence will be taken in reverse order
In other words, you want negative strides to just mean "reverse the slice"
If they're given a meaning at all.
. Perhaps that would have been a good design. But we already have two good idioms for reversing slices:
reversed(seq[start:stop:step])
I'm often annoyed by `reversed()`, since it returns an iterator and doesn't preserve the type of its argument.
reversed('abc') <reversed object at 0x00C722D0>
Oops! OK, let's turn it back into a string:
str(_) '<reversed object at 0x00C722D0>'
LOL! It's enough to make a guy give up ;-) Yes, I know ''.join(_) would have worked.
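A short sketch of the type difference being complained about here, run against current Python. Nothing in it is new API: reversed(), str.join() and slicing are all built in.

```python
s = "abc"

it = reversed(s)                  # an iterator, not a str
print(type(it).__name__)          # reversed
print("".join(reversed(s)))       # cba  (the string has to be rebuilt by hand)
print(s[::-1])                    # cba  (slicing keeps the argument's type)

items = [1, 2, 3]
print(list(reversed(items)))      # [3, 2, 1]
print(type(items[::-1]).__name__) # list
```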
seq[start:stop:step][::-1]
That's an improvement over seq[start:stop:-step]? Na.
... So it's always a semi-open range, inclusive "at the left" and exclusive "at the right". But that's more a detail:
It isn't a mere detail,
Not "mere", "more".
it is the core of the change: changing from inclusive at the start to inclusive on the left,
No, the proposal says a[i:j:anything] is _empty_ if (after normalizing negative i and/or negative j) i >= j. "The start" and "the left" are always the same thing under the proposal (where "the start" applies to the input subsequence - which may be "the end" of the output subsequence).
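The rule as restated here can be sketched in a few lines. This is only an illustration of the proposal as described in this thread, not anything Python implements; the function name proposed_slice is hypothetical.

```python
def proposed_slice(seq, i=None, j=None, k=1):
    """Sketch of the proposal: cut seq[i:j], stride by abs(k), reverse if k < 0."""
    if k == 0:
        raise ValueError("slice step cannot be zero")
    portion = seq[i:j]            # bounds always mean the left and right cuts
    picked = portion[::abs(k)]    # first element, then every abs(k)'th after it
    return picked[::-1] if k < 0 else picked


print(repr(proposed_slice("abcde", 4, 2, -1)))  # ''
print(proposed_slice("abcde", 2, 4, -1))        # dc
print(proposed_slice("abcde", 0, 5, -1))        # edcba
```

Because the contiguous cut happens first, i >= j gives an empty result whatever the sign of k.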
which are not the same thing. This is a significant semantic change.
Yes, it is.
(Of course it is. You don't like the current semantics, since they trick you into off-by-one errors for negative strides.
No, I dislike the current semantics for the same reason it appears Guido dislikes them: they're hard to teach, and hard for people to get right in practice.
If the change was insignificant, it wouldn't help.)
Bingo ;-)
One consequence of this proposed change is that the <start> parameter is no longer always the first element returned. Sometimes <start> will be last rather than first. That disturbs me.
? <start> is always the first element of the subsequence to which the stride is applied. If the stride is negative, then yes, of course the first element of the source subsequence would be the last element of the returned subsequence.
... Of course I'd change range() similarly.
Currently, this is how you use range to count down from 10 to 1:
range(10, 0, -1) # 0 is excluded
To me, this makes perfect sense: I want to start counting at 10, so the first argument I give is 10 no matter whether I'm counting up or counting down.
With your suggestion, we'd have:
range(1, 11, -1) # 11 is excluded
So here I have to put one more than the number I want to start with as the *second* argument, and the last number first, just because I'm counting down. I don't consider that an improvement. Certainly not an improvement worth breaking backwards compatibility for.
I agree this one is annoying. Not _more_ annoying than the current range(10, -1, -1) to count down from 10 through 0 - which I've seen people get wrong more often than I can recall - but _as_ annoying. reversed(range(1, 11)) would work for your case, and reversed(range(11)) for mine.
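For reference, the spellings compared above behave as follows in current Python. The variable names are only for illustration.

```python
# Current semantics: the first argument is where counting starts.
down_to_1 = list(range(10, 0, -1))
down_to_0 = list(range(10, -1, -1))
print(down_to_1)   # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(down_to_0)   # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# The same sequences without a negative step.
print(list(reversed(range(1, 11))) == down_to_1)   # True
print(list(reversed(range(11))) == down_to_0)      # True
```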
On Sun, Oct 27, 2013 at 10:05:11PM -0500, Tim Peters wrote:
[Steven D'Aprano <steve@pearwood.info>]
... I really like that view point, but it has a major problem. As beautifully elegant as the "cut between positions" model is for stride=1, it doesn't extend to non-unit strides. You cannot think about non-contiguous slices in terms of a pair of cuts at position <start> and <end>. I believe that the cleanest way to understand non-contiguous slices with stride > 1 is to think of array indices. That same model works for the negative stride case too.
"Cut between positions" in my view has nothing to do with the stride. It's used to determine the portion of the sequence _to which_ the stride applies. From that portion, we take the first element, the first + |stride|'th element, the first + |stride|*2'th element, and so on. Finally, if stride < 0, we reverse that sequence. That's the proposal. It's simple and uniform.
That's quite a nice model, and I can't really say I dislike it. But it fails to address your stated complaint about the current behaviour, namely that it forces the user to think about array indexing instead of cutting between elements. No matter what we do, we still have to think about array indexes.
But that's just a longer way of writing this:
s[1:8:2] => s[1] + s[3] + s[5] + s[7] => 'bdfh'
which I maintain is a cleaner way to think about non-unit step-sizes.
As above, so do I. start:stop just delineates the subsequence to which the stride applies.
That's the part that was unclear to me from your earlier post. [...]
In other words, you want negative strides to just mean "reverse the slice"
If they're given a meaning at all.
Is this a serious proposal to prohibit negative slices? [...]
One consequence of this proposed change is that the <start> parameter is no longer always the first element returned. Sometimes <start> will be last rather than first. That disturbs me.
? <start> is always the first element of the subsequence to which the stride is applied. If the stride is negative, then yes, of course the first element of the source subsequence would be the last element of the returned subsequence.
Right. And that's exactly what I dislike about the proposal. I have a couple of range-like or slice-like functions which take start/stop/stride arguments. I think I'll modify them to have your suggested semantics and see how well they work in practice. But in the meantime, here are my tentative votes:
-1 on prohibiting negative strides altogether. They're useful.
-1 on deprecating negative strides for temporary removal, then reintroducing them in the future with different semantics. If I'm going to be forced to change my code to deal with this, I want to only do it once, not twice.
+0 on introducing a __future__ directive to change the semantics of negative strides (presumably in Python 3.5, since 3.4 feature-freeze is so close), with the expectation that it will probably become the default in 3.6 or 3.7.
+0.5 on the status quo.
-- Steven
... [Tim]
"Cut between positions" in my view has nothing to do with the stride. It's used to determine the portion of the sequence _to which_ the stride applies. From that portion, we take the first element, the first + |stride|'th element, the first + |stride|*2'th element, and so on. Finally, if stride < 0, we reverse that sequence. That's the proposal. It's simple and uniform.
[Steven D'Aprano]
That's quite a nice model, and I can't really say I dislike it. But it fails to address your stated complaint about the current behaviour, namely that it forces the user to think about array indexing instead of cutting between elements. No matter what we do, we still have to think about array indexes.
As a Python implementer, _I_ do, but not as a user. As Guido noted, under the proposal we have:
s[i:j:k] == s[i:j][::k]
That should (finally?) make it crystal clear that applying the stride has nothing directly to do with the indices of the selected elements in the original sequence (`s`). In the RHS's "[::k]", all knowledge of s's indices has been lost. If _you_ want to think of it in terms of indices into `s`, that's fine, and the implementation obviously needs to index into `s` to produce the result, but a user can just think "OK! start at the start and take every k'th element thereafter". As in, e.g., the sieve of Eratosthenes: "search right until you find the first integer not yet crossed out. Call it p. Then cross out every p'th integer following. Repeat." There's no reference made to "indices" there, and none needed in the mental model. An implementation using an array or list will need to know the index of p, but that's it. If that index is `i`, then, e.g.,
array_or_list[i+p: :p] = [False] * len(range(i+p, len(array_or_list), p))
faithfully translates the rest. [...]
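The sieve mentioned above can be written out with that strided slice assignment. This is a minimal sketch, assuming the list index equals the integer it represents (so the index `i` of p is p itself); the function name primes_up_to is made up for illustration.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes using strided slice assignment."""
    is_prime = [True] * (n + 1)
    is_prime[:2] = [False] * min(2, n + 1)   # 0 and 1 are not prime
    for p in range(2, n + 1):
        if is_prime[p]:
            # "Cross out every p'th integer following": the slice starts at
            # p + p and the right-hand side must have exactly the same length.
            is_prime[p + p::p] = [False] * len(range(p + p, n + 1, p))
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The length computation with range() is needed because assignment to an extended slice requires the replacement to match the number of selected positions.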
In other words, you want negative strides to just mean "reverse the slice"
If they're given a meaning at all.
Is this a serious proposal to prohibit negative slices?
No. I do wonder whether negative strides have been "an attractive nuisance" overall, but guess they'd be more "attractive" than "nuisance" under the proposal. [...]
One consequence of this proposed change is that the <start> parameter is no longer always the first element returned. Sometimes <start> will be last rather than first. That disturbs me.
? <start> is always the first element of the subsequence to which the stride is applied. If the stride is negative, then yes, of course the first element of the source subsequence would be the last element of the returned subsequence.
Right. And that's exactly what I dislike about the proposal.
OK. Then use a positive stride ;-)
I have a couple of range-like or slice-like functions which take start/stop/stride arguments. I think I'll modify them to have your suggested semantics and see how well they work in practice.
Good idea! I made only one real use of non-trivial negative strides, that I can recall, in the last year:
# nw index is n-1+i-j
# in row i, that's n-1+i thru n-1+i-(n-1) = i
# the leftmost is irrelevant, so n-1+i-1 = n-2+i thru i
# ne index is i+j
# in row i, that's i thru i+n-1
# the rightmost is irrelevant, so i thru i+n-2
assert nw[n-1+i] == 0
assert ne[i+n-1] == 0
codes = [0] * (3*n - 2)
codes[0::3] = up
codes[1::3] = ne[i: n-1+i]
codes[2::3] = nw[n-2+i: i-1: -1] # here
That was excruciating to get right. Curiously, the required `ne` and `nw` index sets turned out to be exactly the same, but _end up_ looking different in that code because of the extra fiddling needed to deal with the fact that the required `nw` index _sequence_ (as opposed to the index set) is the reverse of the required `ne` index sequence. Under the proposal, the last line would be written:
codes[2::3] = nw[i: n-1+i: -1]
instead, making it obvious at a glance that the `nw` index sequence is the reverse of the `ne` index sequence. And that's all the empirical proof I need - LOL ;-)
...
On Mon, Oct 28, 2013 at 11:49 AM, Tim Peters <tim.peters@gmail.com> wrote:
As a Python implementer, _I_ do, but not as a user. As Guido noted, under the proposal we have:
s[i:j:k] == s[i:j][::k]
That should (finally?) make it crystal clear that applying the stride has nothing directly to do with the indices of the selected elements in the original sequence (`s`).
It's definitely not "finally clear" as it's a change in semantics. What about negative strides other than -1? Which of these is expected?
(A) '012345678'[::-2] == '86420' '0123456789'[::-2] == '97531'
or:
(B) '012345678'[::-2] == '86420' '0123456789'[::-2] == '86420'
If (A) I can get the (B) result by writing [::2][::-1] but if (B), I'm forced to write:
s[0 if len(s) % 2 == 1 else 1::2]
or something equally ugly.
--- Bruce
(Also, (A) is the current behavior and switching to (B) would break any existing use of strides < -1.)
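The two behaviours can be compared directly in current Python, where (A) is what a stride of -2 does and (B) has to be spelled as two slices. The variable names are only for illustration.

```python
odd_len, even_len = "012345678", "0123456789"

# Behaviour (A): what a stride of -2 does today.
print(odd_len[::-2], even_len[::-2])              # 86420 97531

# Behaviour (B), spelled with today's slices: stride first, then reverse.
print(odd_len[::2][::-1], even_len[::2][::-1])    # 86420 86420

# Reverse first, then stride: this reproduces (A).
print(odd_len[::-1][::2], even_len[::-1][::2])    # 86420 97531
```

The two orderings only differ when the length of the contiguous portion is not one more than a multiple of the stride, which is why the odd-length string gives the same answer both ways.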
[Tim]
As Guido noted, under the proposal we have:
s[i:j:k] == s[i:j][::k]
That should (finally?) make it crystal clear that applying the stride has nothing directly to do with the indices of the selected elements in the original sequence (`s`).
[Bruce Leban]
It's definitely not "finally clear" as it's a change in semantics.
Of course it is.
What about negative strides other than -1?
All strides start the same way, by (conceptually) selecting s[i:j] first. Then the stride is applied to that contiguous slice, starting with the first element and taking every abs(k) element thereafter. Finally, if k is negative, that sequence is reversed. k=1, k=-1, k=2, k=-2, ..., all the same.
Which of these is expected?
(A) '012345678'[::-2] == '86420' '0123456789'[::-2] == '97531' or:
(B) '012345678'[::-2] == '86420' '0123456789'[::-2] == '86420'
B.
If (A) I can get the (B) result by writing [::2][::-1] but if (B), I'm forced to write:
s[0 if len(s) % 2 == 1 else 1::2]
or something equally ugly.
You're assuming something here you haven't said. The easiest way to get '97531' is to type '97531' ;-) If your agenda is the general "return every 2nd element starting with the last element", then the obvious way to do that under the proposal is to write [::-1][::2]. You can't seriously claim that's harder than the "[::2][::-1]" you presented as the obvious way to "get the (B) result" given (A).
(Also, (A) is the current behavior and switching to (B) would break any existing use of strides < -1.)
Did you notice that Guido titled this thread "Where did we go wrong with negative stride?".;-) BTW, do you have use cases for negative strides other than -1? Not examples, use cases. There haven't been any in this thread yet.
On 10/28/2013 4:56 PM, Tim Peters wrote:
[Bruce Leban]
It's definitely not "finally clear" as it's a change in semantics.
Of course it is.
What about negative strides other than -1?
All strides start the same way, by (conceptually) selecting s[i:j] first. Then the stride is applied to that contiguous slice, starting with the first element and taking every abs(k) element thereafter. Finally, if k is negative, that sequence is reversed. k=1, k=-1, k=2, k=-2, ..., all the same.
I think this is wrong. Conceptually reverse first, if indicated by negative stride, then select: see my previous post for my rationale, and below.
Which of these is expected?
(A) '012345678'[::-2] == '86420' '0123456789'[::-2] == '97531'
This is the current behavior and I think that [::k] should continue to work as it does. I believe Guido only suggested deprecating negative strides with non-default endpoints, which implies negative strides *with* default endpoints should continue as are. We should not break more than necessary.
(B) '012345678'[::-2] == '86420' '0123456789'[::-2] == '86420'
B.
Aside from all else, I find A) more intuitive. It certainly strikes me as more likely to be wanted, though either is obviously rare.
If (A) I can get the (B) result by writing [::2][::-1] but if (B), I'm forced to write:
s[0 if len(s) % 2 == 1 else 1::2]
or something equally ugly.
No, just reverse the slices.
'0123456789'[::-1][::2] '97531' '012345678'[::-1][::2] '86420'
If we were to make the change, I think the docs, at least the tutorial, should say that s[i:j:-k] could mean either s[i:j:-1][::k] or s[i:j:k][::-1] and that it does mean the former, so if one wants the latter, spell it out.
You're assuming something here you haven't said. The easiest way to get '97531' is to type '97531' ;-) If your agenda is the general "return every 2nd element starting with the last element", then the obvious way to do that under the proposal is to write [::-1][::2]. You can't seriously claim that's harder than the "[::2][::-1]" you presented as the obvious way to "get the (B) result" given (A).
(Also, (A) is the current behavior and switching to (B) would break any existing use of strides < -1.)
Did you notice that Guido titled this thread "Where did we go wrong with negative stride?".;-)
I did, and I explained exactly where I think we went wrong, which was to make the interpretation of i and j depend on the sign of k. Undoing this does not mandate B instead of A. -- Terry Jan Reedy
[Tim]
All strides start the same way, by (conceptually) selecting s[i:j] first. Then the stride is applied to that contiguous slice, starting with the first element and taking every abs(k) element thereafter. Finally, if k is negative, that sequence is reversed. k=1, k=-1, k=2, k=-2, ..., all the same.
[Terry]
I think this is wrong. Conceptually reverse first, if indicated by negative stride, then select: see my previous post for my rationale, and below.
I already replied to your previous post, and agreed with you :-)
... I think that [::k] should continue to work as it does. I believe Guido only suggested deprecating negative strides with non-default endpoints, which implies negative strides *with* default endpoints should continue as are. We should not break more than necessary.
Take "yes" for an answer ;-)
... Did you notice that Guido titled this thread "Where did we go wrong with negative stride?".;-)
I did,
And did you notice that I posed that question to someone else? ;-)
and I explained exactly where I think we went wrong, which was to make the interpretation of i and j depend on the sign of k. Undoing this does not mandate B instead of A.
Agreed.
On 10/28/2013 2:49 PM, Tim Peters wrote:
under the proposal we have:
s[i:j:k] == s[i:j][::k]
I think where we went wrong with strides was to have the sign of the stride affect the interpretation of i and j (in analogy with ranges). The change is to correct this by decoupling steps 1. and 2. below. The result is that i and j would mean left and right ends of the slice, rather than 'start' and 'stop' ends of the slice. I presume s[::-k], k a count, would continue to mean 'reverse and take every kth' (ie, take every kth item from the right instead of the left):
s[i:j:-k] == s[i:j:-1][::k]
(And one would continue to write the alternative, 'take every kth and reverse' explicitly as s[i:j:k][::-1].)
Whether selecting or replacing, this proposal makes the rule for indicating an arithmetic subsequence to be:
1. indicate the contiguous slice to work on with left and right endpoints (left end i, right end j, i <= j after normalization with same rules as at present);
2. indicate the starting end and direction of movement, left to right (default) or right to left (negate k);
3. indicate whether to pick every member of the slice (k=1, default) or every kth (k > 1), starting with the first item at the indicated end (if there is one) and moving in the appropriate direction.
---
My quick take on slicing versus indexing. The slice positions of a single item are i:(i+1). The average is i.5. Some languages (0-based, like Python) round this down to i, others (1-based) round up to i+1. String 'indexing' is really unit slicing: s[i] == s[i:i+1]. Any sequence can be sliced. True indexing requires that the members of the sequence either be Python objects (tuples, lists) or usefully convert to such (bytes, other arrays, which convert integer members to Python ints).
-- Terry Jan Reedy
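The three steps above can be sketched as a function for the selecting case. This illustrates the rule as described in this message and is not existing Python behaviour; the name amended_slice is hypothetical.

```python
def amended_slice(seq, i=None, j=None, k=1):
    """Sketch of the three-step rule: cut, choose a direction, then stride."""
    if k == 0:
        raise ValueError("slice step cannot be zero")
    portion = seq[i:j]               # step 1: left and right endpoints
    if k < 0:
        portion = portion[::-1]      # step 2: start from the right-hand end
    return portion[::abs(k)]         # step 3: every abs(k)'th item from there


print(amended_slice("abcdefg", 1, 5, -2))     # ec
print(amended_slice("0123456789", k=-2))      # 97531 (same as today's [::-2])
print(amended_slice("abcde", 2, 4, -1))       # dc

# The closing remark about strings: indexing a str is the same as a unit slice,
# which is not true for lists.
print("abc"[1] == "abc"[1:2])                 # True
print([1, 2, 3][1] == [1, 2, 3][1:2])         # False
```

With default endpoints this matches what negative strides already do, so only slices with explicit bounds would change meaning.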
under the proposal we have:
s[i:j:k] == s[i:j][::k]
[Terry Reedy]
I think where we went wrong with strides was to have the sign of the stride affect the interpretation of i and j (in analogy with ranges).
I think that's right :-)
The change is to correct this by decoupling steps 1. and 2. below. The result is that i and j would mean left and right ends of the slice, rather than 'start' and 'stop' ends of the slice.
Right.
I presume s[::-k], k a count, would continue to mean 'reverse and take every kth' (ie, take every kth item from the right instead of the left):
That's not what the proposal said, but it's a reasonable alternative. Maybe that's better because it's "more compatible" with what happens today. The hangup for me is that I have no use cases for negative strides other than -1, so have no real basis for picking one over the other. OTOH, since I _don't_ have any use cases, I don't care either what happens then ;-)
s[i:j:-k] == s[i:j:-1][::k]
That's a nice form of symmetry too. Sold ;-)
(And one would continue to write the alternative, 'take every kth and reverse' explicitly as s[i:j:k][::-1].)
Whether selecting or replacing, this proposal makes the rule for indicating an arithmetic subsequence to be:
1. indicate the contiguous slice to work on with left and right endpoints (left end i, right end j, i <= j after normalization with same rules as at present);
2. indicate the starting end and direction of movement, left to right (default) or right to left (negate k);
3. indicate whether to pick every member of the slice (k=1, default) or every kth (k > 1), starting with the first item at the indicated end (if there is one) and moving in the appropriate direction.
Yup!
--- My quick take on slicing versus indexing. The slice positions of a single item are i:(i+1). The average is i.5. Some languages (0-based, like Python) round this down to i, others (1-based) round up to i+1.
I think they all round down. For example, Icon uses 1-based indexing, and supports slicing. "abc"[1] is "a" in Icon, and so is "abc"[1:2]. 0 isn't a valid index in Icon, but can be used in slicing, where it means "the position just after the last element".
String 'indexing' is really unit slicing: s[i] == s[i:i+1]. Any sequence can be sliced. True indexing requires that the members of the sequence either be Python objects (tuples, lists) or usefully convert to such (bytes, other arrays, which convert integer members to Python ints).
On 10/28/2013 04:31 PM, Tim Peters wrote:
That's not what the proposal said, but it's a reasonable alternative. Maybe that's better because it's "more compatible" with what happens today. The hangup for me is that I have no use cases for negative strides other than -1, so have no real basis for picking one over the other. OTOH, since I _don't_ have any use cases, I don't care either what happens then ;-)
s[i:j:-k] == s[i:j:-1][::k]
That's a nice form of symmetry too. Sold ;-)
+1 Looks good to me. We could add a new_slice object without any problems now. You just need to be explicit when using it. (This works now)
"abcdefg"[slice(1,5,1)] 'bcde'
And this could work and not cause any compatibility issues.
"abcdefg"[new_slice(1, 5, -2)]
"ec"
It would offer an alternative for those who need or want it now and help with the changeover when/if the time comes. We also need to remember that slicing is also used for inserting things.
a = list("python") b = list("PYTHON") a[::2] = b[::2] a ['P', 'y', 'T', 'h', 'O', 'n']
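One way the new_slice idea could be prototyped today is to translate left/right bounds into an ordinary slice object. This is a rough sketch under stated assumptions: unlike the three-argument form shown above, this hypothetical helper also takes the sequence length, because a plain function cannot see the sequence it will be applied to.

```python
def new_slice(start, stop, step, length):
    """Hypothetical helper: turn left/right bounds plus a signed step into
    an ordinary slice object with today's semantics."""
    if step == 0:
        raise ValueError("slice step cannot be zero")
    lo, hi, _ = slice(start, stop).indices(length)   # normalise the two cuts
    if lo >= hi:
        return slice(0, 0)                           # empty, whatever the sign of step
    if step > 0:
        return slice(lo, hi, step)
    # Negative step: begin at the right-hand end of the portion and walk left.
    return slice(hi - 1, lo - 1 if lo > 0 else None, step)


word = "abcdefg"
print(word[new_slice(1, 5, -2, len(word))])   # ec

# Because the result is a real slice object, it also works for assignment.
a = list("python")
a[new_slice(0, 4, -1, len(a))] = "ABCD"
print(a)                                      # ['D', 'C', 'B', 'A', 'o', 'n']
```

The `None` stop is needed when the left cut is 0, since a stop of -1 would be read as "one before the end" under the current rules.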
Cheers, Ron
On Oct 28, 2013, at 16:51, Ron Adam <ron3200@gmail.com> wrote:
We also need to remember that slicing is also used for inserting things.
a = list("python") b = list("PYTHON") a[::2] = b[::2] a ['P', 'y', 'T', 'h', 'O', 'n']
I was about to write the same thing. Half the mails so far have said things like "you don't need to do [i: j:k] because you can do [m:n:o][::p]". But that doesn't work with assignment; you're just assigning to the temporary copy of the first slice. And I think people will be bitten by that. People are _already_ bitten by that today, because they saw somewhere on StackOverflow that foo[i:j][::-1] is easier to understand than foo[j+1:i+1:-1] and tried to assign to it (presumably on the assumption that index and slice assignment must work like C++ and other languages that return "references" to the values that can be assigned into) and don't understand why it had no effect. Today, people who ask this question are opening a useful door. You can explain to them exactly what slicing does, including how __setitem__ works, and they come out of it knowing how to assign to the range that they wanted. Never mind that these people have no real need for what they're writing and are just screwing around with language features because it's neat; we don't want to say that anyone who learns that way shouldn't be learning Python, do we? Anyway, a change that makes it impossible to assign to the range looks like a hole in the language. All you can say is that if you wanted to get the slice you could rewrite it this way, but there's no way to rewrite it in terms of setting a slice, but don't worry, we're pretty sure you'll never need to.
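The pitfall described above is easy to reproduce in current Python. A small sketch with made-up data:

```python
a = list("python")

# Assigning through a chained slice only changes a temporary copy of a[1:5].
a[1:5][::-1] = list("WXYZ")
print(a)    # ['p', 'y', 't', 'h', 'o', 'n']

# A single extended slice goes through list.__setitem__ on `a` itself.
a[4:0:-1] = list("WXYZ")
print(a)    # ['p', 'Z', 'Y', 'X', 'W', 'n']
```

The first assignment raises no error, which is part of why it surprises people: the temporary list is updated and then discarded.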
[Ron Adam]
We also need to remember that slicing is also used for inserting things.
a = list("python") b = list("PYTHON") a[::2] = b[::2] a ['P', 'y', 'T', 'h', 'O', 'n']
[Andrew Barnert]
I was about to write the same thing. Half the mails so far have said things like "you don't need to do [i: j:k] because you can do [m:n:o][::p]".
You must be missing most of the messages, then ;-)
But that doesn't work with assignment; you're just assigning to the temporary copy of the first slice. ...
Do you have a specific example of a currently-working slice assignment that couldn't easily be done under proposed alternatives? I can't think of one, under "my" proposal as amended by Terry. Ron's example is no problem under any of them (because no proposal so far has suggested changing the current meaning of [::2]). When you see things like s[i:j:k] = s[i:j][::k] *nobody* is suggesting using the spelling on the RHS. They're pointing out a pleasant mathematical equivalence.
On Oct 28, 2013, at 17:41, Tim Peters <tim.peters@gmail.com> wrote:
[Ron Adam]
We also need to remember that slicing is also used for inserting things.
>>> a = list("python")
>>> b = list("PYTHON")
>>> a[::2] = b[::2]
>>> a
['P', 'y', 'T', 'h', 'O', 'n']
[Andrew Barnert]
I was about to write the same thing. Half the mails so far have said things like "you don't need to do [i: j:k] because you can do [m:n:o][::p]".
You must be missing most of the messages, then ;-)
For example, the whole sub discussion starting with Bruce Leban's post, which I'll quote here:
(A) '012345678'[::-2] == '86420' '0123456789'[::-2] == '97531'
or:
(B) '012345678'[::-2] == '86420' '0123456789'[::-2] == '86420'
If (A) I can get the (B) result by writing [::2][::-1] but if (B), I'm forced to write:
s[0 if len(s) % 2 == 1 else 1::2]
The idea is that A (today's behavior) is fine because you can get the B result (one of the proposals, which Bruce apparently didn't like) with two slices. Someone then pointed out that the proposal is equally fine because, despite what Bruce suggested, you can get the A result with two slices. But if you take away the ability to specify A in a single slice, you take away the ability to assign to A. Imagine I'd written s[:-5:-2]=1, 3 (or equivalently s[:5:-2]) and the language changed to make this now replace the 8 and 6 instead of the 9 and 7. How would I change my code to get the previous behavior back? It's quite possible no one has ever intentionally written such code. Or, even if they _have_, that they shouldn't have. (You can hardly call something readable if you have to sit down and work through what it would do to various sequences.) And correctly supporting such nonexistent code has been a burden on every custom sequence ever implemented. So, a proposal that makes strides < -1 with non-None end points into an error (as some of them have) seems reasonable. But I think a proposal that changes the meaning of such slices into something different (as some of them have) is a lot riskier. (Especially since many custom sequences that didn't implement slice assignment in the clever way would have to be rewritten.)
But that doesn't work with assignment; you're just assigning to the temporary copy of the first slice. ...
Do you have a specific example of a currently-working slice assignment that couldn't easily be done under proposed alternatives?
s[:-4:-2]=1, 2 This replaces the last and antepenultimate elements, whether s is even or odd. I suppose you could mechanically convert it to this: s[-mid+2::2]=reversed((1,2)) But I don't know that I'd call that "easy". The question is whether this is realistic code anyone would ever intentionally write.
Sorry, accidentally hit Send in mid-edit... Please see the fixed version of the last segment below. Sent from a random iPhone On Oct 28, 2013, at 21:15, Andrew Barnert <abarnert@yahoo.com> wrote:
On Oct 28, 2013, at 17:41, Tim Peters <tim.peters@gmail.com> wrote:
[Ron Adam]
We also need to remember that slicing is also used for inserting things.
>>> a = list("python")
>>> b = list("PYTHON")
>>> a[::2] = b[::2]
>>> a
['P', 'y', 'T', 'h', 'O', 'n']
[Andrew Barnert]
I was about to write the same thing. Half the mails so far have said things like "you don't need to do [i: j:k] because you can do [m:n:o][::p]".
You must be missing most of the messages, then ;-)
For example, the whole sub discussion starting with Bruce Leban's post, which I'll quote here:
(A) '012345678'[::-2] == '86420' '0123456789'[::-2] == '97531'
or:
(B) '012345678'[::-2] == '86420' '0123456789'[::-2] == '86420'
If (A) I can get the (B) result by writing [::2][::-1] but if (B), I'm forced to write:
s[0 if len(s) % 2 == 1 else 1::2]
The idea is that A (today's behavior) is fine because you can get the B result (one of the proposals, which Bruce apparently didn't like) with two slices.
Someone then pointed out that the proposal is equally fine because, despite what Bruce suggested, you can get the A result with two slices.
But if you take away the ability to specify A in a single slice, you take away the ability to assign to A.
Imagine I'd written s[:-5:-2]=1, 3 (or equivalently s[:5:-2]) and the language changed to make this now replace the 8 and 6 instead of the 9 and 7. How would I change my code to get the previous behavior back?
It's quite possible no one has ever intentionally written such code. Or, even if they _have_, that they shouldn't have. (You can hardly call something readable if you have to sit down and work through what it would do to various sequences.) And correctly supporting such nonexistent code has been a burden on every custom sequence ever implemented.
So, a proposal that makes strides < -1 with non-None end points into an error (as some of them have) seems reasonable.
But I think a proposal that changes the meaning of such slices into something different (as some of them have) is a lot riskier. (Especially since many custom sequences that didn't implement slice assignment in the clever way would have to be rewritten.)
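To make the (A) versus (B) discussion concrete, here is a small sketch. The sample strings come from Bruce Leban's quoted post; the replacement values "X" and "Y" are placeholders chosen for illustration. It shows that (B) can be read today with two slices, while only the single-slice form can be used as an assignment target.

```python
even = list("0123456789")
odd = list("012345678")

# (A) today's behaviour: a negative stride starts from the last element.
assert "".join(even[::-2]) == "97531"
assert "".join(odd[::-2]) == "86420"

# (B) can be *read* with two slices today...
assert "".join(even[::2][::-1]) == "86420"

# ...but only a single slice can be assigned through.
s = list(even)
s[:-5:-2] = "X", "Y"  # same targets as s[:5:-2]: indices 9 and 7
result = "".join(s)
print(result)         # 0123456Y8X
```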
But that doesn't work with assignment; you're just assigning to the temporary copy of the first slice. ...
Do you have a specific example of a currently-working slice assignment that couldn't easily be done under proposed alternatives?
s[:-4:-2]=1, 2
This replaces the last and antepenultimate elements, whether s is even or odd.
I suppose you could mechanically convert it to this:
s[-3::2]=reversed((1, 2))
But I don't know that I'd call that "easy".
The question is whether this is realistic code anyone would ever intentionally write.
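As a quick check of the mechanical conversion above, the following sketch applies both spellings to lists of several lengths and confirms they hit the same elements. The helper names `current_spelling` and `forward_spelling` are hypothetical, and lengths below 3 are skipped because the slice then selects fewer than two elements.

```python
def current_spelling(seq):
    s = list(seq)
    s[:-4:-2] = 1, 2                # last element gets 1, antepenultimate gets 2
    return s

def forward_spelling(seq):
    s = list(seq)
    s[-3::2] = reversed((1, 2))     # same two elements, visited left to right
    return s

for n in range(3, 9):               # covers both even and odd lengths
    seq = ["."] * n
    out = current_spelling(seq)
    assert out == forward_spelling(seq)
    assert out[-1] == 1 and out[-3] == 2
```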
... [Tim]
Do you have a specific example of a currently-working slice assignment that couldn't easily be done under proposed alternatives?
[Andrew Barnert]
s[:-4:-2]=1, 2
This replaces the last and antepenultimate elements, whether s is even or odd.
I suppose you could mechanically convert it to this:
s[-mid+2::2]=reversed((1,2))
But I don't know that I'd call that "easy".
Under my & Terry's proposal, it would be written
s[-3::-2] = 1, 2
And, at least to me, it's far more obvious this way that it affects (all and only) s[-1] and s[-3]. It's immediate from "OK, s[-3:] is the last three elements, so only those can possibly be affected. Then the stride -2 skips the one in the middle, and picks on the last element first."
Analyzing the current spelling is a royal PITA. "OK, umm, ah! The stride is negative, so the empty part at the start refers to the last element of the sequence. Then the second bit is -4, which is one larger than we'll actually go. Oops! No, the stride is negative here, so -4 is one *smaller* than we'll actually go. It will stop at -4+1 = -3. I think." ;-)
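A rough sketch of how the two spellings compare, assuming the proposal means exactly s[i:j:k] == s[i:j][::k]. The helpers `proposed_indices` and `current_indices` are hypothetical names introduced here; they use slicing of `range` objects only to compute which positions would be touched.

```python
def proposed_indices(n, i, j, k):
    """Positions selected by s[i:j:k] if it meant s[i:j][::k] (assumed reading)."""
    contiguous = range(n)[i:j]       # bounds interpreted as for a plain slice
    return list(contiguous[::k])     # then direction and stride applied

def current_indices(n, i, j, k):
    """Positions selected by s[i:j:k] under today's rules."""
    return list(range(n)[i:j:k])

for n in range(3, 9):
    # Proposed s[-3::-2] and current s[:-4:-2] touch the same two positions,
    # in the same order: the last element first, then the antepenultimate.
    assert proposed_indices(n, -3, None, -2) == [n - 1, n - 3]
    assert current_indices(n, None, -4, -2) == [n - 1, n - 3]
```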
The question is whether this is realistic code anyone would ever intentionally write.
Obviously so, whenever they need to replace the last element with 1 and the antepenultimate element with 2 ;-)
BTW, do you know of any real code that uses a negative stride other than -1? Still looking for a real example of that.
Tim Peters wrote:
BTW, do you know of any real code that uses a negative stride other than -1? Still looking for a real example of that.
I tend to scrupulously avoid writing any such code, precisely because it's so hard to figure out what it will mean! I'm even a bit wary of using -1, preferring to use reversed() if possible. -- Greg
On Oct 28, 2013, at 21:39, Tim Peters <tim.peters@gmail.com> wrote:
BTW, do you know of any real code that uses a negative stride other than -1? Still looking for a real example of that.
No. Which is exactly why I suggested just making it illegal instead of giving it a new (somewhat less confusing, but still not obvious) meaning. It doesn't much matter which one is "right" if neither one is useful, does it?
On 2013-10-29 00:41, Tim Peters wrote:
When you see things like
s[i:j:k] == s[i:j][::k]
*nobody* is suggesting using the spelling on the RHS. They're pointing out a pleasant mathematical equivalence.
Actually, I am suggesting that, in the case of negative k. It is much easier to learn, read, and reason about the composition of those two operations than any [i:j:k] construction, no matter what semantics you apply to that syntax. That said, I always use this in numpy where slices are practically free.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Mon, Oct 28, 2013 at 05:06:09PM -0400, Terry Reedy wrote:
On 10/28/2013 2:49 PM, Tim Peters wrote:
under the proposal we have:
s[i:j:k] == s[i:j][::k]
I think where we went wrong with strides was to have the sign of the stride affect the interpretation of i and j (in analogy with ranges). The change is to correct this by decoupling steps 1. and 2. below. The result is that i and j would mean left and right ends of the slice, rather than 'start' and 'stop' ends of the slice.
Sorry Terry, your paragraph above is ambiguous to me. It sounds like you are saying that having slices work by analogy with range was a mistake. Are you suggesting to break the analogy between slicing and range? That is, range continues to work the way it currently does, but change slice?
Whether selecting or replacing, this proposal makes the rule for indicating an arithmetic subsequence to be:
1. indicate the contiguous slice to work on with left and right endpoints (left end i, right end j, i <= j after normalization with same rules as at present);
2. indicate the starting end and direction of movement, left to right (default) or right to left (negate k);
3. indicate whether to pick every member of the slice (k=1, default) or every kth (k > 1), starting with the first item at the indicated end (if there is one) and moving in the appropriate direction.
"pick every kth element" works for k=1 as well as k > 1, no need for a special case here. Every 1th element is every element :-)
My quick take on slicing versus indexing. The slice positions of a single item are i:(i+1). The average is i.5. Some languages (0-based, like Python) round this down to i, others (1-based) round up to i+1.
I don't think it's helpful to talk about averaging or rounding the indexes. Better to talk about whether indexes are included or excluded, or whether the interval is open (end points are excluded) or closed (end points are included).
Ruby provides both closed and half-open ranges:
2..5 => 2, 3, 4, 5
2...5 => 2, 3, 4
(If you think that the choice of .. versus ... is backwards, you're not alone.) Ruby has no syntax for stride, but range objects have a step(k) method that returns every kth value.
http://www.ruby-doc.org/core-1.9.3/Range.html
-- Steven
On 10/28/2013 10:25 PM, Steven D'Aprano wrote:
On Mon, Oct 28, 2013 at 05:06:09PM -0400, Terry Reedy wrote:
I think where we went wrong with strides was to have the sign of the stride affect the interpretation of i and j (in analogy with ranges). The change is to correct this by decoupling steps 1. and 2. below. The result is that i and j would mean left and right ends of the slice, rather than 'start' and 'stop' ends of the slice.
Sorry Terry, your paragraph above is ambiguous to me. It sounds like you are saying that having slices work by analogy with range was a mistake.
I once suggested that slice and range should be consolidated into one class. I was told (correctly, I see now) that no, slices and ranges are related but different. I have forgotten whatever explanation was given, but their parameters have different meanings. For slices, start and stop are symmetrical and both mark boundaries between what is included and what is excluded. For ranges, start and stop are asymmetrical; one is included and the other excluded.
What I said above is that a negative stride is enough to say 'reverse the direction of selection (or replacement)'. It is not actually necessary to also switch the endpoint arguments.
Are you suggesting to break the analogy between slicing and range?
It was already broken, more than I really noticed until today.
That is, range continues to work the way it currently does, but change slice?
Perhaps, but I read Guido's post less than 12 hours ago. Thinking about ranges is for another day.
Whether selecting or replacing, this proposal makes the rule for indicating an arithmetic subsequence to be:
1. indicate the contiguous slice to work on with left and right endpoints (left end i, right end j, i <= j after normalization with same rules as at present);
2. indicate the starting end and direction of movement, left to right (default) or right to left (negate k);
3. indicate whether to pick every member of the slice (k=1, default) or every kth (k > 1), starting with the first item at the indicated end (if there is one) and moving in the appropriate direction.
"pick every kth element" works for k=1 as well as k > 1, no need for a special case here. Every 1th element is every element :-)
Right. The reason I special-cased 1 is that -1 is an unambiguous special case that can be used to define the -k for k>1 case. Currently, s[::-k] == s[::-1][::k]. It has been proposed to change that to s[::k][::-1], first by Tim (who changed his mind, at least for now) and perhaps by Nick.
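The two candidate definitions of s[::-k] can be compared directly. This is only a sketch with hypothetical helper names; it checks that today's behaviour matches "reverse, then stride", and that the alternative "stride, then reverse" gives the same result exactly when (len(s) - 1) is a multiple of k.

```python
def reversed_then_strided(s, k):
    return s[::-1][::k]   # matches today's s[::-k]

def strided_then_reversed(s, k):
    return s[::k][::-1]   # the alternative definition that was floated

for n in range(1, 12):
    s = list(range(n))
    for k in range(1, 5):
        assert s[::-k] == reversed_then_strided(s, k)
        same = s[::-k] == strided_then_reversed(s, k)
        # The two readings coincide only when the last index is a multiple of k.
        assert same == ((n - 1) % k == 0)
```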
My quick take on slicing versus indexing. The slice positions of a single item are i:(i+1). The average is i.5. Some languages (0-based, like Python) round this down to i, others (1-based) round up to i+1.
Because of my experience drawing graphs with dots representing objects and with axes with labelled tick marks, I think of slicing in terms of labelled tick marks with objects (or object references) 'centered' between the tick marks.
|_|_|
0 1 2
If a character is centered between the tick marks labelled 0 and 1, its 'coordinate' would be .5. I agree that one could get to the same result by simply dropping one of the slice endpoints.
Another reason to think of the objects as being at half coordinates is that it explains the count correctly without introducing a spurious asymmetry. A slice from i to j includes (j-i)+1 slice positions if you include i and j or (j-i)-1 if you do not. It includes j-i half coordinates, which is exactly how many items are included in the slice.
I don't think it's helpful to talk about averaging or rounding the indexes. Better to talk about whether indexes are included or excluded, or whether the interval is open (end points are excluded) or closed (end points are included).
s[i:j] includes all items between slice positions i and j. I do not think that the concept open/closed really applies to slices, as opposed to arithmetic interval (whether discrete or continuous). Slicing uses slice coordinates, but does not include them in the slice. If one forces the concept on slices, the slice interval would be either open or closed at both ends. They are definitely not asymmetric, half one, half the other. However, the 'length' of a slice is the same as a half-open interval. -- Terry Jan Reedy
Terry Reedy wrote:
I have forgotten whatever explanation was given, but their parameters have different meanings. For slices, start and stop are symmetrical and both mark boundaries between what is included and what is excluded. For ranges, start and stop are asymmetrical; one is included and the other excluded.
I don't think that's the difference; you can equally well think of the parameters to range as delineating a slice of an infinite list of integers. The real difference is in how negative numbers are interpreted: range(-2, 3) gives [-2, -1, 0, 1, 2], whereas slicing does something special with the -2. -- Greg
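The difference Greg describes can be seen in a few lines. Note that in Python 3 range() returns a lazy object, so list() is used here to show its contents.

```python
# range treats -2 as the integer -2.
range_values = list(range(-2, 3))
assert range_values == [-2, -1, 0, 1, 2]

# Slicing treats -2 as "two from the end", so on a length-5 string
# the slice [-2:3] runs from position 3 to position 3 and is empty.
assert "abcde"[-2:3] == ""
assert "abcde"[-2:] == "de"

# slice.indices() shows the normalization slicing applies for a given length.
normalized = slice(-2, 3).indices(5)
assert normalized == (3, 3, 1)
```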
I'd like to throw in an idea. Not sure how serious it is (prepared to be shot down in flames :-) ), just want to be sure that all possibilities are examined.
With positive strides, "start" is inclusive, "end" is exclusive. Suppose that with negative strides, "start" were exclusive and "end" were inclusive. (I.e. the "lower" bound was always inclusive and the "upper" bound was always exclusive.)
Then "abcde"[:2:-1] would be "edc", not "ed".
Then "abcde"[:1:-1] would be "edcb", not "edc".
Then "abcde"[:0:-1] would be "edcba".
I think this fits in with Tim Peters' concept of characters between positions, e.g. "abcde"[3:0:-1] would be "cba" (not "dcb" as at present), i.e. the characters between positions 0 and 3.
Rob Cliffe
On 27/10/2013 17:04, Guido van Rossum wrote:
In the comments of http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.... there were some complaints about the interpretation of the bounds for negative strides, and I have to admit it feels wrong. Where did we go wrong? For example,
"abcde"[::-1] == "edcba"
as you'd expect, but there is no number you can put as the second bound to get the same result:
"abcde"[:1:-1] == "edc"
"abcde"[:0:-1] == "edcb"
but
"abcde"[:-1:-1] == ""
I'm guessing it all comes from the semantics I assigned to negative stride for range() long ago, unthinkingly combined with the rules for negative indices.
Are we stuck with this forever? If we want to fix this in Python 4 we'd have to start deprecating negative stride with non-empty lower/upper bounds now. And we'd have to start deprecating negative step for range() altogether, recommending reversed(range(lower, upper)) instead.
Thoughts? Is NumPy also affected?
-- --Guido van Rossum (python.org/~guido)
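Rob Cliffe's suggestion can be sketched as follows, under the assumption that for a negative stride the two bounds simply name the positions between which characters are taken. The function `rob_slice` is a hypothetical helper written for this illustration and handles only non-negative bounds (or None).

```python
def rob_slice(s, start, stop, step):
    """Sketch of the suggested rule for negative strides only (assumed reading)."""
    if step >= 0:
        raise ValueError("this sketch only covers negative strides")
    hi = len(s) if start is None else start   # upper bound, exclusive
    lo = 0 if stop is None else stop          # lower bound, inclusive
    return s[lo:hi][::step]

# Results listed in the message above.
assert rob_slice("abcde", None, 2, -1) == "edc"
assert rob_slice("abcde", None, 1, -1) == "edcb"
assert rob_slice("abcde", None, 0, -1) == "edcba"
assert rob_slice("abcde", 3, 0, -1) == "cba"

# Today's results, for comparison.
assert "abcde"[:2:-1] == "ed"
assert "abcde"[3:0:-1] == "dcb"
```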
On 10/27/2013 07:03 PM, Rob Cliffe wrote:
I'd like to throw in an idea. Not sure how serious it is (prepared to be shot down in flames :-) ), just want to be sure that all possibilities are examined.
I think you're safe. ;-)
With positive strides, "start" is inclusive, "end" is exclusive. Suppose that with negative strides, "start" were exclusive and "end" was inclusive.
With the proposed behaviour (that you and Tim described), it will be easier to think in terms of [left:right:step]. And it fits with Guido's ...
s[left:right:-step] == s[left:right][::-step]
One of the nice properties is that you can switch directions by just changing the step. With the current slices, you need to change the start and stop as well, and also recalculate those if you want the same range.
BTW, a negative step will only be the exact reversed sequence if the last item is also (j-1) for that step value.
Cheers, Ron
(I.e. the "lower" bound was always inclusive and the "upper" bound was always exclusive.) Then "abcde"[:2:-1] would be "edc", not "ed". Then "abcde"[:1:-1] would be "edcb", not "edc". Then "abcde"[:0:-1] would be "edcba". I think this fits in with Tim Peters' concept of characters between positions, e.g. "abcde"[3:0:-1] would be "cba" (not "dcb" as at present), i.e. the characters between positions 0 and 3. Rob Cliffe
On 10/27/2013 10:04 AM, Guido van Rossum wrote:
Thoughts? Is NumPy also affected?
It seems to me that the issue is not with negative strides, but with negative indexing:
- off by one errors because the end starts at -1 and not -0
- calculation errors because the end is -1 and not -0
-- ~Ethan~
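The "-0" problem shows up whenever a negative bound is computed rather than written as a literal. The sketch below uses a hypothetical helper `drop_last` and assumes n >= 0; the `or None` idiom is one common workaround, not something proposed in the thread.

```python
def drop_last(s, n):
    """Drop the last n items; -n or None turns the n == 0 case into 'no bound'."""
    return s[:-n or None]

# The naive spelling breaks at n == 0, because -0 is just 0.
assert "abcde"[:-2] == "abc"
assert "abcde"[:-1] == "abcd"
assert "abcde"[:-0] == ""

assert drop_last("abcde", 2) == "abc"
assert drop_last("abcde", 0) == "abcde"

# The same wrap-around is why no integer stop reaches index 0 with stride -1.
assert "abcde"[:-1:-1] == ""
assert "abcde"[:None:-1] == "edcba"
```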
On 10/27/2013 12:04 PM, Guido van Rossum wrote:
Are we stuck with this forever? If we want to fix this in Python 4 we'd have to start deprecating negative stride with non-empty lower/upper bounds now. And we'd have to start deprecating negative step for range() altogether, recommending reversed(range(lower, upper)) instead.
Thoughts? Is NumPy also affected?
I found this very short but interesting page that explains a little bit about how NumPy uses Python's slices.
http://ilan.schnell-web.net/prog/slicing/
It looks like as long as we don't change the semantics of how slice objects get passed it won't affect NumPy.
Using the example from the web page above, we can see how the slice syntax already accepts more than one set of indices and/or values separated by commas. ;-)
>>> class foo:
...     def __getitem__(self, *args):
...         print(args)
...
>>> x = foo()
>>> x[2:3, 4:5]
((slice(2, 3, None), slice(4, 5, None)),)
It looks like it's the __getitem__ and __setitem__ methods that complain if you send them more than one set of indices or values. It's not a syntax limitation.
If the left and right indices are to be considered separate from the step, we can use this existing legal syntax, and just pass the step after a comma.
a[i:j, k]
And teach __getitem__ and __setitem__ to take the extra value. Then your proposed relationship becomes the following and it's even clearer that (i and j) are separate and not affected by k.
a[i:j, k] == a[i:j][:, k]
>>> i, j, k = 0, 9, -1
>>> x[i:j, k]
((slice(0, 9, None), -1),)
>>> x[i:j]
(slice(0, 9, None),)
>>> x[:, k]
((slice(None, None, None), -1),)
Cheers, Ron
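For illustration only, here is one way the a[i:j, k] spelling could behave on a list subclass. `StepList` is a hypothetical class invented for this sketch; built-in lists reject tuple keys, and only reading is covered here, not __setitem__.

```python
class StepList(list):
    """Hypothetical list subclass that accepts a[i:j, k] (reading only)."""

    def __getitem__(self, key):
        if isinstance(key, tuple):
            bounds, step = key                       # e.g. (slice(0, 9, None), -1)
            picked = list.__getitem__(self, bounds)  # contiguous part first
            return StepList(picked[::step])          # then direction and stride
        result = list.__getitem__(self, key)
        # Keep plain slices as StepList so a[i:j][:, k] also works.
        return StepList(result) if isinstance(key, slice) else result

a = StepList("abcdefghij")
i, j, k = 0, 9, -1
assert a[i:j, k] == a[i:j][:, k] == list("ihgfedcba")
assert a[2:8, -2] == list("hfd")
```

Because the bounds and the step travel as separate items in the key, the step never changes how i and j are interpreted, which is the property described above.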
Meant to send this to the list...
On 10/29/2013 12:07 AM, Greg Ewing wrote:
Ron Adam wrote:
If the left and right indices are to be considered separate from the step, we can use this existing legal syntax, and just pass the step after a comma.
a[i:j, k]
No, we can't do that, because NumPy uses that for indexing into a 2-dimensional array.
Can you explain why it's an issue? Currently lists won't accept that, so NumPy isn't using that spelling with lists, and it only requires changing the __getitem__ and __setitem__ methods on lists (which NumPy can't be using in this way because it currently doesn't work), and it doesn't change slice objects, or the slice syntax at all. I can't see how it will affect them. It's a much smaller change than many of the other suggestions.
Cheers, Ron
participants (33)
- Alexander Belopolsky
- Andrew Barnert
- Antoine Pitrou
- Barry Warsaw
- Brett Cannon
- Bruce Leban
- Charles Hixson
- Chris Angelico
- Eric Snow
- Ethan Furman
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Jan Kaliszewski
- Joshua Landau
- Mark Lawrence
- MRAB
- Neal Becker
- Nick Coghlan
- Oscar Benjamin
- Paul Moore
- Philipp A.
- Richard Oudkerk
- Rob Cliffe
- Robert Kern
- ron adam
- Ron Adam
- Ryan
- Serhiy Storchaka
- Steven D'Aprano
- Terry Reedy
- Tim Peters
- אלעזר