[Tutor] s.insert(i, x) explanation in docs for Python 3.4 confusing to me
Steven D'Aprano
steve at pearwood.info
Sat Jan 16 06:54:11 EST 2016
On Fri, Jan 15, 2016 at 10:20:41PM -0600, boB Stepp wrote:
> At https://docs.python.org/3.4/library/stdtypes.html#sequence-types-list-tuple-range
> it states:
>
> "s.insert(i, x) inserts x into s at the index given by i (same as s[i:i] = [x])"
>
> I find this confusing.
That's because it is confusing, unless you get the missing picture. The
missing picture is how indexes are treated in Python. For the following,
you will need to read my email using a fixed-width (monospaced) font,
like Courier, otherwise the ASCII diagrams won't make any sense.
First, understand that indexes are treated two ways by Python. When you
just give a single index, like mylist[3], the indexes line up with the
items:

    mylist = [ 100, 200, 300, 400, 500 ]
    indexes:    ^    ^    ^    ^    ^
                0    1    2    3    4

This makes perfect sense: mylist[3] is the 3rd item in the list
(starting from zero), namely 400.
But slices are slightly different. When you provide two indexes in a
slice, they mark the gaps BETWEEN items:

    mylist = [ 100, 200, 300, 400, 500 ]
    indexes:  ^    ^    ^    ^    ^    ^
              0    1    2    3    4    5

and the slice cuts in the gaps, so for example, mylist[1:3] cuts in the
space *just before* 200 and *just before* 400, giving you [200, 300] as
the result.
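
If you want to check this in the interpreter, here is a small sketch (the names "middle", "left" and "right" are just mine, for illustration). It shows that a slice from gap a to gap b holds b - a items, and that cutting the list at any single gap and gluing the halves back together gives you the original list.

```python
mylist = [100, 200, 300, 400, 500]

# Cut at gap 1 (just before 200) and gap 3 (just before 400).
middle = mylist[1:3]
print(middle)        # [200, 300]

# A slice between gap a and gap b holds b - a items.
print(len(middle))   # 2

# Cutting at a single gap splits the list into two halves
# that join back into the original.
for gap in range(len(mylist) + 1):
    left, right = mylist[:gap], mylist[gap:]
    assert left + right == mylist
```
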
I stress that these gaps aren't "real". It's not like Python allocates a
list, and leaves a physical chunk of memory between each item. That
would be just silly. You should consider these gaps to be infinitely
thin spaces between the consecutive items.
The important thing here is that when you slice, the list is cut
*between* items, so slice index 1 comes immediately after the 0th item
and immediately before the 1st item.
Now, if you're paying attention, you will realise that earlier I told a
little fib. There's no need to say that Python treats the index
differently for regular indexing (like mylist[3]). We just need a slight
change in understanding:

    mylist = [ 100, 200, 300, 400, 500 ]
    indexes:  ^    ^    ^    ^    ^    ^
              0    1    2    3    4    5

Now the rule becomes:
- if you have mylist[3], return the next item starting at index 3
(in this case, 400);
- but if you have a slice like mylist[1:3], return the items
starting at the first index (1) and ending at the second (3),
namely [200, 300].
So mylist[1] is similar to mylist[1:2] except that the first returns the
item itself, namely 200, while the second returns a one-element list
containing that item, namely [200].
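
Here is a short sketch of that difference, assuming the same five-item list as above (the names "item" and "piece" are mine). Indexing hands you the item; slicing the same spot hands you a list wrapped around it.

```python
mylist = [100, 200, 300, 400, 500]

item = mylist[1]       # the item itself
piece = mylist[1:2]    # a one-element list containing the item

print(item)    # 200
print(piece)   # [200]

# The same relationship holds at every valid index.
for i in range(len(mylist)):
    assert mylist[i:i + 1] == [mylist[i]]
```
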
The picture is a bit more complicated once you introduce a third value
in the slice, like mylist[1:3:2], but the important thing to remember is
that indexes mark the gaps BETWEEN items, not the items themselves.
Now, what happens with *negative* indexes?
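
    mylist = [ 100, 200, 300, 400, 500 ]
    indexes:  ^    ^    ^    ^    ^    ^
             -5   -4   -3   -2   -1

mylist[-5:-2] will be [100, 200, 300]. Easy. (The gap after the last
item has no negative index, since -0 is just 0. To cut there, leave the
end of the slice out, as in mylist[-2:].)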
Now, keeping in mind the rule that slices cut *between* items, you
should be able to explain why mylist[1:1] is the empty list [].
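
One way to check negative slice indexes for yourself is to convert them to ordinary ones by adding len(mylist). The sketch below does that (the name "n" is mine) and also shows the empty slice you get when both cuts land in the same gap.

```python
mylist = [100, 200, 300, 400, 500]
n = len(mylist)

# A negative index i marks the same gap as i + len(mylist).
print(mylist[-4:-1])            # [200, 300, 400]
print(mylist[-4 + n:-1 + n])    # the same slice, written as mylist[1:4]

# This holds for every pair of negative indexes.
for i in range(-n, 0):
    for j in range(-n, 0):
        assert mylist[i:j] == mylist[i + n:j + n]

# Both cuts land in the same gap, so nothing lies between them.
print(mylist[1:1])              # []
```
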
What happens when you assign to a slice? The rule is the same, except
instead of returning the slice as a new list, you *replace* that slice
with the list given. So to understand
mylist[1:3] = [777, 888, 999]
we first mark the gaps between items and underline the slice being
replaced:

    mylist = [ 100, 200, 300, 400, 500 ]
    indexes:  ^    ^    ^    ^    ^    ^
              0    1    2    3    4    5
                   -----------

and insert the new slice in its place (moving everything to the right
over):

    mylist = [ 100, 777, 888, 999, 400, 500 ]
    indexes:  ^    ^    ^    ^    ^    ^    ^
              0    1    2    3    4    5    6

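
The same slice assignment, as something you can run. Nothing here is beyond what the diagrams show; it is only a quick sketch to confirm the result.

```python
mylist = [100, 200, 300, 400, 500]

# Replace the two items between gap 1 and gap 3 with three new items.
mylist[1:3] = [777, 888, 999]
print(mylist)        # [100, 777, 888, 999, 400, 500]

# The replacement need not be the same length as the slice it
# replaces, so the list has grown from 5 items to 6.
print(len(mylist))   # 6
```
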
Now let's look at an insertion. We're told that
mylist.insert(1, x)
is the same as mylist[1:1] = [x]. Let's see:

    mylist = [ 100, 200, 300, 400, 500 ]
    indexes:  ^    ^    ^    ^    ^    ^
              0    1    2    3    4    5
                   -

Note the tiny underline that extends from index 1 to index 1. Since that
doesn't extend to the next index, no items are removed. Now insert the
new list [x] into that slice:

    mylist = [ 100, x, 200, 300, 400, 500 ]
    indexes:  ^    ^  ^    ^    ^    ^    ^
              0    1  2    3    4    5    6

And that's an insertion! The insertion takes place at position one,
which means that x ends up just after the gap at position one.
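
To convince yourself that the two spellings really do the same thing, here is a small sketch. The string "x" is just a placeholder value and the names "via_insert" and "via_slice" are mine.

```python
x = "x"   # placeholder value, just for illustration

via_insert = [100, 200, 300, 400, 500]
via_slice = [100, 200, 300, 400, 500]

via_insert.insert(1, x)
via_slice[1:1] = [x]

print(via_insert)   # [100, 'x', 200, 300, 400, 500]
print(via_slice)    # [100, 'x', 200, 300, 400, 500]

# Try every gap, from 0 (the very front) to 5 (the very end).
for i in range(6):
    a = [100, 200, 300, 400, 500]
    b = [100, 200, 300, 400, 500]
    a.insert(i, x)
    b[i:i] = [x]
    assert a == b
```
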
> The second thing I find puzzling is the docs say x is inserted at
> position i, while in the interpreter:
>
> >>> help(list.insert)
> Help on method_descriptor:
>
> insert(...)
> L.insert(index, object) -- insert object before index
>
> The "...insert object before index" makes sense to me, but "...inserts
> x into s at the index given by i..." does not because:
The beauty of thinking about indexes as the gap between items is that it
doesn't matter whether you insert "after" the gap or "before" the gap or
"into" the gap, you get the same thing:

    mylist = [ 100, x, 200, 300, 400, 500 ]
    indexes:  ^    ^  ^    ^    ^    ^    ^
              0    1  2    3    4    5    6

whereas if you consider the indexes to point to the items themselves,
then insert "before" position 1, "after" position 1 and "into" position
1 are three different things:

    mylist = [ 100, 200, 300, 400, 500 ]
    indexes:    ^    ^    ^    ^    ^
                0    1    2    3    4

    Insert *before* position 1:  mylist = [ 100, x, 200, 300, 400, 500 ]
    Insert *after* position 1:   mylist = [ 100, 200, x, 300, 400, 500 ]
    Insert *into* position 1:    mylist = [ 100, x, 300, 400, 500 ]

This ambiguity doesn't occur when you think about indexes pointing at
the gaps between items instead of the items themselves.
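
For comparison, here is how you might spell each of those three readings in actual Python. This is only my sketch of the idea: the names "before", "after" and "into" are labels I made up, and "x" is a placeholder value.

```python
x = "x"   # placeholder value, just for illustration
original = [100, 200, 300, 400, 500]

before = original.copy()
before.insert(1, x)     # what list.insert(1, x) actually does

after = original.copy()
after.insert(2, x)      # "after item 1" is the same as "before item 2"

into = original.copy()
into[1] = x             # overwrite item 1; the list keeps its length

print(before)   # [100, 'x', 200, 300, 400, 500]
print(after)    # [100, 200, 'x', 300, 400, 500]
print(into)     # [100, 'x', 300, 400, 500]
```
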
--
Steve