[Tutor] Slicing Tuples

Steven D'Aprano steve at pearwood.info
Sun Dec 12 00:39:03 CET 2010


John Russell wrote:

> So, my question is this, and I realize that this is *very* basic - what is
> going on with the last element? Why is it returning one less than I think it
> logically should? Am I missing something here? There is not much of an
> explanation in the book, but I would really like to understand what's going
> on here.


If the book really doesn't explain this (as opposed to you just having 
missed it), that's a fairly serious lack!

Slicing in Python uses what are called "half-open intervals". There are 
four obvious ways to treat the two endpoints of a slice:

(1) Closed interval:
     both the start index and the end index are included.

(2) Open interval:
     both the start index and end index are excluded.

(3) Half-open interval:
     the start index is included, and the end index excluded.

(4) Half-open interval (reversed sense):
     the start index is excluded, and the end index included.


Python uses #3 because it is generally the simplest and least 
error-prone. Essentially, you should consider a slice like seq[a:b] 
to be equivalent to "take a slice of items from seq, start at index=a 
and stop when you reach b". Because you stop *at* b, b is excluded.
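
Here is a small sketch of that rule. The list and the values of a and b 
are made up purely for illustration:

```python
seq = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

a, b = 2, 5
piece = seq[a:b]

# The item at index a is included, the item at index b is not.
print(piece)            # ['c', 'd', 'e']
print(seq[a] in piece)  # True
print(seq[b] in piece)  # False
```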

Indexes should be considered as falling *between* items, not on them:

0.1.2.3.4.5.6.7.8
|a|b|c|d|e|f|g|h|

Run an imaginary knife along the line marked "4", and you divide the 
sequence abcdefgh into two pieces: abcd and efgh.
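
The same picture in code, as a rough sketch (the string s is just the 
example sequence from the diagram):

```python
s = "abcdefgh"

# Cut at the boundary marked 4: four items fall to the left of it.
left, right = s[:4], s[4:]
print(left, right)    # abcd efgh

# A slice runs from one boundary to another.
print(s[2:6])         # cdef

# The last boundary, 8, is len(s). It is valid in a slice
# even though s[8] itself would raise IndexError.
print(s[4:8])         # efgh
```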

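Why are half-open intervals less error-prone? Because you have to adjust 
by 1 less often, and you have fewer off-by-one errors. Compare what 
Python's half-open slices look like with what a hypothetical 
closed-interval slice would need: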

* Take n items starting from index i:

   Half-open: seq[i:i+n]
   Closed:    seq[i:i+n-1]

* Length of a slice [i:j]:

   Half-open: j-i
   Closed:    j-i+1

* Dividing a sequence into two slices with no overlap:

   Half-open:
     s = seq[:i]
     t = seq[i:]

   Closed:
     s = seq[:i-1]
     t = seq[i:]

* Splitting a string at a delimiter:
   s = "abcd:efgh"

   We want to split the string into two parts, everything before
   the colon and everything after the colon.

   Half-open:
     p = s.find(':')
     before = s[:p]
     after = s[p+1:]

   Closed:
     p = s.find(':')
     before = s[:p-1]
     after = s[p+1:]

* Empty slice:

   Half-open: seq[i:i] is empty
   Closed:    seq[i:i] has a single item
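
You can check the half-open column above at the interactive interpreter. 
This is only a sketch: the values of i, j and n are arbitrary, and the 
"Closed" column can't be run, because Python has no closed-interval 
slices.

```python
seq = list("abcdefgh")
i, j, n = 2, 6, 3

# Take n items starting from index i.
assert seq[i:i+n] == ['c', 'd', 'e']

# Length of a slice is j - i (for 0 <= i <= j <= len(seq)).
assert len(seq[i:j]) == j - i

# Two slices at the same index split the sequence
# with no overlap and no gap.
assert seq[:i] + seq[i:] == seq

# Split a string at a delimiter.
s = "abcd:efgh"
p = s.find(':')   # assumes the colon is present; find() returns -1 if not
before, after = s[:p], s[p+1:]
assert (before, after) == ("abcd", "efgh")

# Empty slice.
assert seq[i:i] == []
```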


So half-open (start included, end excluded) has many desirable 
properties. Unfortunately, it has two weaknesses:

- most people intuitively expect #1;
- when slicing in reverse, you get more off-by-one errors.

There's not much you can do about people's intuition, but since reverse 
slices are much rarer than forward slices, that's a reasonable price to pay.
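
To see where the off-by-one errors creep in when slicing in reverse, here 
is a short sketch with a negative step (the string is just an example):

```python
s = "abcdefgh"

# Forward: items at indexes 2, 3, 4.
print(s[2:5])             # cde

# The same items in reverse order: both bounds shift by one,
# because the start is still included and the stop still excluded.
print(s[4:1:-1])          # edc

# Going all the way back to index 0 is the trap: a stop of -1
# means "the last item", so this slice is empty.
print(repr(s[4:-1:-1]))   # ''

# Leave the stop out instead.
print(s[4::-1])           # edcba

# Often the easiest way out is to slice forward, then reverse.
print(s[2:5][::-1])       # edc
```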



-- 
Steven

