List slices

Alex Martelli aleaxit at yahoo.com
Tue Oct 31 11:30:33 EST 2000


"Daniel Klein" <DanielK at aracnet.com> wrote in message
news:YdBL5.7502$Qz2.188666 at typhoon.aracnet.com...
> Given the list
>
>     x = ['first','second','third','forth']
>
> and
>
>     x[-1]
>
> returns ['forth'], it seems inconsistent that
>
>     x[1:-1]
>
> returns ['second',third'] instead of the expected
> ['second',third','forth'].
> Can someone provide an explanation for this please?

Sure!  Let's forget negative indices for the moment -- in a slice they
mean exactly what they mean in an indexing: applied to a sequence whose
len is L, -1 is equivalent to L-1, -2 to L-2, and so on, whether in a
slice or in an indexing.
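A quick sketch of that equivalence, using a list like the one in the
question (the name L is just my shorthand for len(x), and I've spelled
the last item 'fourth'):

```python
x = ['first', 'second', 'third', 'fourth']
L = len(x)

# In an indexing, a negative index -k means L-k.
assert x[-1] == x[L - 1] == 'fourth'
assert x[-2] == x[L - 2] == 'third'

# The very same translation applies to slice bounds.
assert x[1:-1] == x[1:L - 1] == ['second', 'third']
assert x[:-3] == x[:L - 3] == ['first']
```

Note in passing that x[-1] is the item itself ('fourth'), not a
one-item list; only slicing gives back a list.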

In general, sequence[a:b], for 0 <= a <= b, returns (or lets you set)
the slice that includes items from sequence[a], INCLUDED, to
sequence[b], EXCLUDED.  If a >= len(sequence), or b <= a, we have
'empty slices', but "correctly placed" ones: at position a, or at
the end of the sequence if a >= len(sequence).  This matters when
you assign something to the slice.
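A small sketch of the included/excluded rule and of "correctly placed"
empty slices; the lists seq, middle and tail are made up purely for
illustration:

```python
seq = ['a', 'b', 'c', 'd']

# Lower bound included, upper bound excluded.
assert seq[1:3] == ['b', 'c']

# Empty slices: b <= a, or a >= len(seq).
assert seq[2:2] == []
assert seq[3:1] == []
assert seq[10:20] == []

# Assigning to an empty slice inserts at the place the slice denotes.
middle = list(seq)
middle[2:2] = ['X']
assert middle == ['a', 'b', 'X', 'c', 'd']

# An out-of-range empty slice sits at the end, so assigning appends.
tail = list(seq)
tail[10:20] = ['Y']
assert tail == ['a', 'b', 'c', 'd', 'Y']
```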

The EXCLUDED part is what seems "inconsistent" with indexing to you.

The motivation for it is that this style -- _always_ denote intervals
as 'half-open': lower-bound-included, upper-bound-excluded -- has
proven with experience to be an excellent way to reduce the dreaded
"off-by-one" errors.  Note that the range() built-in function is
quite consistent with that -- it, too, returns numbers _up to but
excluding_ the "upperbound" argument that you pass to it!
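To see the parallel with range() concretely, here is a short sketch
(the values of a and b and the list seq are arbitrary choices of mine):

```python
a, b = 2, 6

# range(a, b) stops just short of b, exactly as a slice does.
numbers = list(range(a, b))
assert numbers == [2, 3, 4, 5]
assert len(numbers) == b - a

# Indexing with each number from range(a, b) picks out the same
# items as the slice seq[a:b].
seq = ['zero', 'one', 'two', 'three', 'four', 'five', 'six']
assert [seq[i] for i in range(a, b)] == seq[a:b]
```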

You shall find at:
http://www.deja.com/getdoc.xp?AN=593933994
a nice article by Kragen Sitaker on the subject (with a funny
Dijkstra parable as a bonus, though the relevance to the
specific argument is suspect:-).

Say that you want "all the sequence except the last 3 items".
With the "Python convention", that's seq[:-3] -- idiomatic,
readable, clear, and quite symmetric with "all except the
first 3 items" which is seq[3:].  The number of items in
seq[a:b] (with 0 <= a <= b <= len(seq)) is b-a -- again,
readable and clear.  "All but the first and last" is
seq[1:-1].  To insert item x in list l at any position n, one
writes l[n:n]=[x].  Etc, etc; you won't believe how many
little fussy '+1's and '-1's this saves you for common list
processing tasks (and each is one less opportunity for an
off-by-one error...), particularly as it plays well with
the 0-based array indices.
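These idioms can be checked with a short sketch; the sample list and
the values of a, b and n are arbitrary choices of mine:

```python
seq = [10, 20, 30, 40, 50, 60]

assert seq[:-3] == [10, 20, 30]        # all except the last 3 items
assert seq[3:] == [40, 50, 60]         # all except the first 3 items
assert seq[1:-1] == [20, 30, 40, 50]   # all but the first and last

# With 0 <= a <= b <= len(seq), the slice holds exactly b-a items.
a, b = 1, 4
assert len(seq[a:b]) == b - a

# Adjacent half-open slices meet with no gap and no overlap.
n = 2
assert seq[:n] + seq[n:] == seq

# Insertion at position n: assign a one-item list to the empty slice.
# (The right-hand side is a list; a bare string would be treated as
# a sequence of characters.)
l = list(seq)
l[n:n] = ['x']
assert l == [10, 20, 'x', 30, 40, 50, 60]
```

Not a single '+1' or '-1' shows up anywhere in it.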

AFAIK, Koenig (in "C Traps and Pitfalls") was the first to
explain this technique in detail, in print.  It's in a C
context, of course, thus focusing on loops such as
    for(i=a; i<b; ++i)
rather than 'list-slicing', but the key idea of the
"half-open interval" translates quite well.


Alex





