[Python-ideas] Where did we go wrong with negative stride?

Mon Oct 28 18:15:19 CET 2013

Apologies for the terrible post above; here it is in full and not
riddled with as many editing errors:

On 28 October 2013 06:33, Chris Angelico <rosuav at gmail.com> wrote:
> On Mon, Oct 28, 2013 at 10:45 AM, Greg Ewing
> <greg.ewing at canterbury.ac.nz> wrote:
>> Neal Becker wrote:
>>>
>>> One thing I find unfortunate and does trip me up in practice, is that
>>> if you want to do a whole sequence up to k from the end:
>>>
>>> u[:-k]
>>>
>>> hits a singularity if k=0
>>
>>
>> I think the only way to really fix this cleanly is to have
>> a different *syntax* for counting from the end, rather than
>> trying to guess from the value of the argument. I can't
>> remember ever needing to write code that switches dynamically
>> between from-start and from-end indexing, or between
>> forward and reverse iteration direction -- and if I ever
>> did, I'd be happy to write two code branches.
>
> If it'd help, you could borrow Pike's syntax for counting-from-end
> ranges: <2 means 2 from the end, <0 means 0 from the end. So
> "abcdefg"[:<2] would be "abcde", and "abcdefg"[:<0] would be
> "abcdefg". Currently that's invalid syntax (putting a binary operator
> with no preceding operand), so it'd be safe and unambiguous.

Agreed in entirety. I'm not sure that this is the best method, but
it's way better than the status quo. "<" or ">" with negative strides
should raise an error, and they should be the recommended method until
negatives are phased out.

*BUT* there is another solution. It's harder to formulate but I think
it's more deeply intuitive.

The simple problem is this mapping:

list:  [x,  x,  x,  x,  x]
index:  0   1   2   3   4
       -5  -4  -3  -2  -1

Which is just odd. But you can stop thinking about them as *negative*
indexes and start thinking about NOT'd indexes:

       ~4  ~3  ~2  ~1  ~0

which you have to say looks OK.
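
For anyone who wants to check that mapping, here is a quick sketch in
current Python. The names `items`, `n` and `from_end` are mine, purely
for illustration. It relies only on the fact that ~i is -i - 1 for
ints, so ~0 already addresses the last item:

```python
# Illustration only: `items`, `n` and `from_end` are made-up names.
items = ["a", "b", "c", "d", "e"]
n = len(items)

# For Python ints, ~i is defined as -i - 1.
assert all(~i == -i - 1 for i in range(n))

# So ~i picks the item i places from the end, starting at ~0.
from_end = [items[~i] for i in range(n)]
print(from_end)  # ['e', 'd', 'c', 'b', 'a']
```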

Then you design slices around that.

To take the first four elements:

#>>>
"0123456789"[:4]
#>>> '0123'

To take the last four:

#>>>
"0123456789"[~4:] # Currently returns '56789'
#>>> '6789'

For slicing with a mixture:

"0123456789"[1:~1] # Currently returns '1234567'
#>>> '12345678'

"0123456789"[~5:5] # Currently returns '4'
#>>> ''

So the basic idea is that, for X:Y, X is closed iff positive and Y is
open iff positive. If you go over this in your head, it's quite
simple.

For ~6:7:
START: Count 6 from the back, looking at the *signposts* between
items, not the items.
END: Count 7 forward, looking at the *signposts* between items, not the items.

Thus you get, for "0123456789":

"|0|1|2|3|4|5|6|7|8|9|"
         S     E

and thus, obviously, you get "456".

And look, it matches our current negative form!

"0123456789"[-6:7]
#>>> '456'

Woah! *BUT* the ~ form works without the silly coherence problems that
-N has when N is 0, because ~0 is -1 while -0 is just 0!
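
To make the signpost reading concrete, here is a minimal sketch of how
I read the rule for stride 1. It is not a real implementation: the
functions `resolve` and `proposed_slice` are hypothetical names, and I
am assuming that a negative bound n is always read as ~k, i.e. "k
signposts back from the end", and that out-of-range positions are
clamped the way slices clamp them today:

```python
def resolve(n, length, default):
    """Map a slice bound to a signpost position in 0..length.

    Hypothetical helper. A negative n is read as ~k, meaning
    'k signposts back from the end', so ~0 is the very end.
    """
    if n is None:
        return default
    pos = length - ~n if n < 0 else n
    # Assumption: clamp out-of-range positions, as slices do today.
    return max(0, min(length, pos))


def proposed_slice(seq, start=None, stop=None):
    """Slice seq under the signpost reading (stride 1 only)."""
    length = len(seq)
    lo = resolve(start, length, 0)
    hi = resolve(stop, length, length)
    return seq[lo:hi]


digits = "0123456789"
print(proposed_slice(digits, ~4))        # '6789'
print(proposed_slice(digits, 1, ~1))     # '12345678'
print(proposed_slice(digits, ~5, 5))     # ''
print(proposed_slice(digits, ~6, 7))     # '456'

# The k=0 case from earlier in the thread: no singularity.
k = 0
print(proposed_slice(digits, None, ~k))  # '0123456789'
print(digits[:-k])                       # '' in current Python
```

The last two lines are the point: "everything up to k from the end"
keeps working when k is 0, whereas u[:-k] collapses to an empty slice.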

אלעזר said the problem was with negative indexes, not strides, so it's
good that this solves it. So, how does this help with negative
*strides*? Well, Guido's

#>>>
"abcde"[::-1]
#>>> 'edcba'

would be hopefully solved by

"abcde"[:0:-1] # Currently returns 'edcb'
#>>> 'edcba'

because you can just *invert* the "X is closed iff positive and Y is
open iff positive" rule.

Does this pan out nicely?

Really, we want

"abcde"[2:4][::-1] == "abcde"[4:2:-1]

which is exactly what happens under the inverted rule (currently the
right-hand side returns 'ed', not 'dc').
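
Here is one possible reading of the inverted rule, for stride -1 only.
It is a guess at the semantics, not a settled design: `resolve` and
`proposed_reversed_slice` are hypothetical names, and I am assuming
that both bounds are signposts and that you walk from the start
signpost back to the stop signpost, yielding the items in between:

```python
def resolve(n, length, default):
    """Map a slice bound to a signpost position in 0..length.

    Hypothetical helper. A negative n is read as ~k, meaning
    'k signposts back from the end'.
    """
    if n is None:
        return default
    pos = length - ~n if n < 0 else n
    return max(0, min(length, pos))


def proposed_reversed_slice(seq, start=None, stop=None):
    """Stride -1 under the assumed signpost reading.

    Walks from signpost `start` (default: the end) back to
    signpost `stop` (default: the beginning).
    """
    length = len(seq)
    hi = resolve(start, length, length)
    lo = resolve(stop, length, 0)
    return seq[lo:hi][::-1]


s = "abcde"
print(proposed_reversed_slice(s))           # 'edcba'
print(proposed_reversed_slice(s, None, 0))  # 'edcba'
print(proposed_reversed_slice(s, 4, 2))     # 'dc'
print(s[2:4][::-1])                         # 'dc'
print(s[4:2:-1])                            # 'ed' in current Python
```

Under this reading a reversed slice is just the forward slice between
the same two signposts, reversed, which is the identity asked for above.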

I'm thinking I'll make a string subclass and try and "intuit" the
answers, but I think this is the right choice.

Anyone with me, even partially?