inclusive-lower-bound, exclusive-upper-bound (was Re: Range Operation pre-PEP)

Mon May 14 20:49:36 EDT 2001

On Fri, 11 May 2001 15:46:09 -0400, Tim Peters <tim.one at home.com> wrote:
>> I'm not just being difficult; I'm trying to design my own language,
>> and this is one of the things I have different to Python.  If I've
>> missed something where the Python way is superior, then I might want
>> to change my mind.
>
>You can make it work either way, although (as above) there's reason to favor
>0-based indexing if ease of talking between Pixy and C is interesting to you.
>Icon is a good example of a language with the same basic "indices point
>*between* elements" (== "indices are offsets") approach, but where indices
>start at 1.  In two respects this can be nicer:
>
>1. The last element of a non-empty Icon list (or string) is (using
>   Python spelling) list[len(list)].  In Python, at the start, the
>   only way to spell it was list[len(list)-1].  That created its own
>   breed of "off by 1" errors.  But Python later grew meaning for
>   negative list indices too, and since then list[-1] is the best
>   way to get at the last list element.
>
>2. Spelling "the point just beyond the end of the sequence" is
>   easier in Icon:  in Python that's index len(list), in Icon it's
>   index 0 (or *its* breed of off-by-1 temptation, len(list)+1).
>   That is, indices in the 0-based Python look like:
>
>      x[0] x[1] x[3]
>     0    1    2    3  positive flavor
>    -3   -2   -1    3  negative flavor
>
>   but in the 1-based Icon they're:
>
>      x[1] x[2] x[3]
>     1    2    3    4  positive flavor
>    -3   -2   -1    0  negative flavor
>
>   If only 1's-complement integer arithmetic had caught on, Python
>   could get rid of the "3 wart" in the lower-right corner by using
>   -0 instead <wink>.

I've been implementing a library for eiffel and come across the same issue.  I
chose 1-based indexing because this is standard in eiffel-land.  One thing
that is standard eiffel which I didn't keep is that (in the standard) ranges
are inclusive at both ends.  This makes the "zero-length" slice rather ugly
("n to n-1 is the spot before n" was my interpretation).

Here's my random notes-to-myself on the subject:

---

Slices:
`lower' and `upper' indices in ranges always indicate the position before the
indexed element.  So a slice from 1 to 3 consists of elements 1 and 2.
ELKS specifies that ranges are inclusive, which means that the lower bound
indicates the position before the indexed element and the upper bound indicates
the position after the indexed element.  Both conventions are consistent and
"more" intuitive in their own way, but "half inclusive" ranges have some
nice properties:

                half-inclusive                      inclusive
length:         upper - lower                       upper - lower + 1
next n elts:    i, i + n                            i, i + n - 1
before i:       i, i                                i, i - 1

    on the other hand...

i to end:       i, count + 1                        i, count
    (or, in general, the "end" is count + 1, while in "normal" indexing it's
    count)

---

It seems it's impossible to avoid off-by-one errors, but half-inclusive looks
easier in the common cases to me.  I'm not supporting negative indexing
(unless 'a.item(a.count - x)' becomes really annoying), but if I did I'd do it
the Icon way.

There are all sorts of other interesting questions and tradeoffs:

Does searching using reference equality or value equality?  Python uses value
equality, but only on objects define __eq__ or __cmp__ methods, but eiffel
has standard '=', 'is_equal', and 'is_deep_equal' methods.  Which to use?

Do methods that take an index and a value use the order 'index, value', or
'value, index'?  Standard eiffel is 'value, index', but standard python is
'index, value'.  I prefer python, but can't articulate why.