Slice confusion : a[n:p] is a list exclude the last element p

Mon Apr 28 10:32:20 EDT 2003

nguyen khanh wrote:

> a=range(10)
> b=a[3:7]
> b=[3,4,5,6]
> Why the design exclude the last element ?

It's an important design principle, which I first saw proposed in print
by Andrew Koenig in his landmark book "C Traps and Pitfalls": *all bounded
loops should be first-bound-included, last-bound-excluded* -- and which I
will therefore call "Koenig's Principle" although I have never seen the
principle *named* in the literature (I'm CC'ing Andrew because of this --
I know there are already other things named after him, such as the "Koenig
Lookup" in C++, that he might like to get his name AWAY from, so it seems
only fair to allow him to disclaim THIS one too ASAP if he dislikes it!-).

Koenig wasn't talking about "slices", of course (there are no slices in C!),
but it works just as well for a C for-loop, "for(i=first; i<last; i++)" i.e.
using < and NOT <= for the terminating-condition.
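
The same convention shows up in Python's own range(), which is also
first-bound-included, last-bound-excluded.  A minimal sketch (the names
"first" and "last" and their values are only illustrative):

```python
# Illustrative values; any integers with first <= last behave the same way.
first, last = 3, 7

# range(first, last) visits first, first+1, ..., last-1: the same
# indices as the C loop "for(i=first; i<last; i++)".
visited = [i for i in range(first, last)]

# The loop body runs exactly last - first times, with no "+ 1" needed.
count = len(visited)
print(visited, count)
```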

It's an important design principle because of human-factors considerations,
which Koenig expresses better than I can (I really wish he would one day
write a "Python Traps and Pitfalls" book, but he seems too busy writing
great C++ books instead... perhaps the Python one would be too thin?-).  
But, for example, consider a question such as:

"How many items does XX[a:b] have?"

With Python's design (following Koenig's principle), the normal
answer (when a and b are both within XX's boundaries) is:
        b - a

If the last-bound had been deemed to be included, it would have been:
        b - a + 1

Now, that innocuous-looking "+ 1" there is the cause of untold misery
in programming -- the root of most "off-by-one" errors.  By removing
the need for it, Koenig's principle makes you more productive.
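
To make the comparison concrete, here is a small sketch.  Note that
inclusive_slice is a hypothetical helper written just for this
illustration, NOT anything Python provides; it only emulates what a
"last-bound-included" design would return:

```python
XX = list(range(10))
a, b = 3, 7

# Python's actual (half-open) behaviour: the length is simply b - a.
half_open = XX[a:b]
print(len(half_open) == b - a)

# Hypothetical helper, not a Python feature: emulates a
# "last-bound-included" slice to show where the "+ 1" creeps in.
def inclusive_slice(seq, lo, hi):
    return seq[lo:hi + 1]

closed = inclusive_slice(XX, a, b)
print(len(closed) == b - a + 1)
```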

Thanks to the principle, all sorts of pleasant invariants follow.

For example, for all XX and all indices 0 <= x <= y <= z,
XX[x:y]+XX[y:z] == XX[x:z].  This kind of regularity is exactly what
facilitates reasoning about programs, and in particular modifying programs.
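
A quick way to convince yourself is a brute-force check.  This sketch
assumes non-negative indices with x <= y <= z (the case the invariant is
about) and a sample list chosen only for illustration:

```python
XX = list('abcdefgh')
n = len(XX)

# Check every split point with 0 <= x <= y <= z, including a few
# indices past the end (slicing quietly clamps those).
holds = all(
    XX[x:y] + XX[y:z] == XX[x:z]
    for x in range(n + 3)
    for y in range(x, n + 3)
    for z in range(y, n + 3)
)
print(holds)

# A practical consequence: cutting a list into consecutive pieces and
# gluing them back together loses and duplicates nothing.
pieces = [XX[i:i + 3] for i in range(0, n, 3)]
rejoined = [item for piece in pieces for item in piece]
print(rejoined == XX)
```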

> I's because the distinction of "element and list-element" ?
> a[3]=3        # element
> a[3:4]=[3]   # list-element
> 
> Why the designer d'nt use a[3:3]=[3] instead of a[p:p]=[] ?

Not sure at all what you mean by "the distinction of element and
list-element" (the operations are called INDEXING and SLICING in
Python).  But I hope I have started clarifying the design decision for you.

> The result a[p:p]=[] for all p d'nt add new information ,

Wrong!!!  Are you kidding...?

>>> a = list('ciao')
>>> a[3:3] = list('pe')
>>> a
['c', 'i', 'a', 'p', 'e', 'o']

vs

>>> a = list('ciao')
>>> a[2:2] = list('pe')
>>> a
['c', 'i', 'p', 'e', 'a', 'o']

Of COURSE it "adds new information": it matters WHICH exact empty slice
you are taking -- i.e., exactly which p you're using for a[p:p] -- even
though you can "notice" this information only when you ASSIGN to the
particular slice rather than just READ it (but of course, for reasons of
regularity, a[x:y] must indicate the same slice, for all x and y,
whether it's being assigned to or read).
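
As a sketch of how many distinct empty slices a list really has, the
loop below inserts a marker at every p from 0 to len(word) inclusive
(the word and the '*' marker are arbitrary choices for illustration):

```python
word = 'ciao'

# Insert a marker at every possible empty slice letters[p:p].
results = []
for p in range(len(word) + 1):
    letters = list(word)
    letters[p:p] = ['*']      # assigning to an empty slice inserts
    results.append(''.join(letters))

print(results)
# ['*ciao', 'c*iao', 'ci*ao', 'cia*o', 'ciao*']
```

A list of n items thus has n + 1 distinct insertion points, each named
by its own a[p:p].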

> The design a[n:p] clarify the mind to include the last element p .

I deeply believe, on the contrary, that a systematic "last element
included" design would seriously damage Python, while its current
systematic "last element excluded" design enhances it.

> and a[p:p]=[p] is very clear and dinctint from a[p]=p .

In the current design, it is -- the first assignment INSERTS p
at the p-th position, the second assignment REPLACES the p-th
position with p, so the two are indeed "very clear AND DISTINCT".
If Python were to use the "last-element-included" design, then
the two assignments wouldn't be really distinct any more (except
perhaps in that slicing is more tolerant than indexing -- a[p]
raises IndexError unless -len(a) <= p < len(a), a[p:p] doesn't).
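
A short sketch of both points, in the current design (the list contents
and the value of p are arbitrary, chosen only so the inserted value is
easy to spot):

```python
p = 2

inserted = [10, 11, 12, 13]
inserted[p:p] = [p]       # slice assignment: the list grows by one item
print(inserted)           # [10, 11, 2, 12, 13]

replaced = [10, 11, 12, 13]
replaced[p] = p           # index assignment: the length is unchanged
print(replaced)           # [10, 11, 2, 13]

# Slicing tolerates out-of-range bounds; indexing does not.
short = [10, 11]
tolerant = short[5:5]     # just an empty list, no error
try:
    short[5]
    raised = False
except IndexError:
    raised = True
print(tolerant, raised)
```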

Alex