Bug in string.find

Fredrik Lundh fredrik at pythonware.com
Fri Sep 2 09:59:09 CEST 2005


Ron Adam wrote:

>> indices point to the "gap" between items, not to the items themselves.
>
> So how do I express a -0?  Which should point to the gap after the last
> item.

that item doesn't exist when you're doing plain indexing, so being able
to express -0 would be pointless.

when you're doing slicing, you express it by leaving the value out, or by
using len(seq) or (in recent versions) None.
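
a small sketch of those spellings, assuming a seven-character sample string (the value "abcdefg" is an assumption, picked to match the outputs quoted in this thread).  it also shows why -0 can't mean "the end": -0 is just the integer 0, so it names the gap before the first item.

```python
# assumed sample data; any sequence behaves the same way
a = "abcdefg"

# three ways to say "up to the gap after the last item"
tail_omitted = a[4:]        # leave the stop value out
tail_len = a[4:len(a)]      # spell out the gap number
tail_none = a[4:None]       # None means "not given"

# -0 is the integer 0, so this stops at the gap *before* the first item
tail_minus_zero = a[4:-0]

print(tail_omitted, tail_len, tail_none, repr(tail_minus_zero))
```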

>> straight indexing returns the item just to the right of the given gap (this is
>> what gives you the perceived asymmetry), slices return all items between
>> the given gaps.
>
> If this were symmetrical, then positive indexes would return the value
> to the right and negative indexes would return the value to the left.

the gap addressing is symmetrical, but indexing always picks the item to
the right.
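
to make that concrete, here is a minimal sketch.  item_right_of_gap is a hypothetical helper written for this illustration (it is not part of Python), and the sample string is an assumption.  it treats an index as a gap number, normalizes negative gaps, and returns the one-item slice that starts at that gap.

```python
a = "abcdefg"  # assumed sample data

def item_right_of_gap(seq, gap):
    """hypothetical helper: return the item just to the right of a gap."""
    if gap < 0:
        gap += len(seq)  # negative gaps count from the end
    # the one-item slice between this gap and the next one
    return seq[gap:gap + 1]

# for a string, a one-item slice compares equal to the item itself,
# so positive and negative gap numbers can both be checked this way
for gap in (3, -4, 0, -1):
    assert item_right_of_gap(a, gap) == a[gap]
```

gap 3 and gap -4 are the same gap in a seven-item sequence, and in both cases you get the item to its right.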

> Have you looked at negative steps?  They also are not symmetrical.

> print a[3:2:-1]    # d   These are symmetric?!

the gap addressing works as before, but to understand exactly what characters
you'll get, you have to realize that the slice is really a gap index generator.
when you use step=1, you can view the slice as a "cut here and cut there, and
return what's in between" operation.  for other step sizes, you have to think in
gap indexes (to which the plain indexing rules apply).

and if you know range(), you already know how the indexes are generated for
various step sizes.

from the range documentation:

    ... returns a list of plain integers [start, start + step, start + 2 * step, ...].
    If step is positive, the last element is the largest start + i * step less than
    stop; if step is negative, the last element is the smallest start + i * step
    greater than stop.

or, in sequence terms (see http://docs.python.org/lib/typesseq.html )

    (3) If i or j is negative, the index is relative to the end of the string: len(s) + i
    or len(s) + j is substituted.

    ...

    (5) The slice of s from i to j with step k is defined as the sequence of items
    with index x = i + n*k for n in the range(0,(j-i)/k). In other words, the
    indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached
    (but never including j).

so in this case, you get

>>> 3 + 0*-1
3
>>> 3 + 1*-1
2 # which is your stop condition

so a[3:2:-1] is the same as a[3].
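
the same reasoning as a rough sketch, written for Python 3 (where range() is lazy, hence the list() call).  slice_indexes and slice_by_indexing are hypothetical helpers, and the sample string is an assumption.  the sketch only covers explicit, in-range values: it skips the clamping Python applies to out-of-range values and doesn't handle omitted ones.

```python
a = "abcdefg"  # assumed sample data

def slice_indexes(i, j, k, length):
    """hypothetical helper: indexes picked by seq[i:j:k], in-range i and j only."""
    if i < 0:
        i += length  # rule 3: negative values are relative to the end
    if j < 0:
        j += length
    # rule 5: i, i+k, i+2*k, ... stopping before j, which is what range() does
    return list(range(i, j, k))

def slice_by_indexing(seq, i, j, k):
    """hypothetical helper: rebuild a string slice using plain indexing."""
    return "".join(seq[x] for x in slice_indexes(i, j, k, len(seq)))

print(slice_indexes(3, 2, -1, len(a)), slice_by_indexing(a, 3, 2, -1))
```

for the slice discussed above, the generator yields the single index 3, so plain indexing gives the same character as the slice does.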

> print a[-4:-5:-1]  # d

same as a[-4]

> print a[3:-5:-1]   # d

now you're mixing addressing modes, which is a great way to confuse
yourself.  if you normalize the gap indexes (rule 3 above), you'll get
a[3:2:-1] which is the same as your earlier example.  you can use the
"indices" method to let Python do this for you:

>>> slice(3,-5,-1).indices(len(a))
(3, 2, -1)
>>> range(*slice(3,-5,-1).indices(len(a)))
[3]

> print a[-4:2:-1]   # d

same problem here; indices will tell you what that really means:

>>> slice(-4,2,-1).indices(len(a))
(3, 2, -1)
>>> range(*slice(-4,2,-1).indices(len(a)))
[3]

same example again, in other words.  and same result.
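
if you want to check all four spellings in one go, something like the following should do (a sketch for Python 3, where range() needs a list() call to show its contents; the sample string and the variable names are assumptions made for this illustration):

```python
a = "abcdefg"  # assumed sample data

# the four spellings from the quoted examples, as (start, stop, step)
spellings = [(3, 2, -1), (-4, -5, -1), (3, -5, -1), (-4, 2, -1)]

# let Python normalize each one against the length of the sequence
normalized = [slice(*s).indices(len(a)) for s in spellings]

# the indexes each normalized slice generates
picked = [list(range(*n)) for n in normalized]

# what slicing actually returns for each spelling
results = [a[slice(*s)] for s in spellings]

for s, n, p, r in zip(spellings, normalized, picked, results):
    print(s, n, p, r)
```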

> This is why it confuses so many people.  It's a shame too, because slice
> objects could be so much more useful for indirectly accessing list
> ranges. But I think this needs to be fixed first.

as with everything else in Python, if you use the wrong mental model, things
may look "asymmetrical" or "confusing" or "inconsistent".  if you look at
how things really work, it's usually extremely simple and more often than
not internally consistent (since the designers have the "big picture", and
know what they tried to be consistent with; when slice steps were
added, existing slicing rules and range() were the obvious references).

it's of course pretty common that people who didn't read the documentation
very carefully, and therefore adopted the wrong model, will insist that Python
uses a buggy implementation of their model, rather than a perfectly consistent
implementation of the actual model.  slices with non-standard step sizes are
obviously one such thing; immutable/mutable objects and the exact behaviour
of for-else, while-else, and try-else are others.  as usual, being able to reset
your brain is the only thing that helps.

</F> 





