list index()

Tue Sep 4 08:15:59 EDT 2007

On 2007-09-04, Campbell Barton <cbarton at metavr.com> wrote:
> Jason wrote:
>> Returning -1 is not a good return value to indicate an error.
>> After all, -1 is a valid index in most Python lists.
>> (Negative numbers index from the tail of the list.)
>
> Agree in general tho its a bit inconsistent how...
>
> "string".find(val)
>
> ...can return -1 but .index() cant, though the inconsistency is
> probably with string since most other areas of the py api raise
> errors in cases like this.

The two functions have different ranges, but also different
domains.

sequence.index 
  domain: any item '==' to an item in sequence
  range: all non-negative indexes of sequence

string.find 
  domain: any string
  range: -1 for not found, and all non-negative indexes of string.
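
To make the domain/range difference concrete, here is a small sketch. The names items, index_result, find_result and last_item are made up for illustration; only list.index and str.find come from the discussion above.

```python
items = ["a", "b", "c"]

# list.index only accepts values that are present; anything else raises.
try:
    items.index("z")
    index_result = "no error"
except ValueError:
    index_result = "ValueError"

# str.find accepts any string and reports a miss as -1.
find_result = "abc".find("z")

# The catch raised earlier in the thread: -1 is itself a usable index,
# so an unchecked miss silently selects the last element.
last_item = items[find_result]

print(index_result, find_result, last_item)
```

A caller of find has to remember to compare the result against -1, whereas a caller of index cannot overlook the miss.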

If you want to search for subsequences in a sequence, a la
string.find, you can use something like the following (which I
naively assume will be reasonably efficient):

def find(seq, subseq):
    """ Find a subsequence of seq in seq, and return its index. Return -1 if
    subseq is not found.

    >>> seq = [0, 1, 2, 3, 4, 5]
    >>> find(seq, [0, 1, 2])
    0
    >>> find(seq, [])
    0
    >>> find(seq, [3, 4])
    3
    >>> find(seq, [3, 2])
    -1
    >>> find(seq, [5, 6])
    -1
    >>> find(seq, [3])
    3
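    >>> find(seq, [0, 2])
    -1
    >>> find([0, 0, 1], [0, 1])
    1

    """
    i = 0
    j = 0
    while i < len(seq) and j < len(subseq):
        if seq[i] == subseq[j]:
            j += 1
        else:
            # Mismatch: back up to the start of this attempt so that the
            # increment below retries from the next position in seq.
            i -= j
            j = 0
        i += 1
    if j == len(subseq):
        return i - j
    else:
        return -1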

It's probable that a simpler implementation using slice
operations will be faster for shortish lengths of subseq. It was
certainly easier to get it working correctly. ;)

def find(seq, subseq):
    n = len(subseq)
    # The last window that can hold subseq starts at len(seq) - n, hence
    # the + 1; without it a match at the very end of seq is missed.
    for i in range(len(seq) - n + 1):
        if subseq == seq[i:i + n]:
            return i
    return -1
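
Since strings are sequences too, str.find can serve as a reference when checking a hand-written find. The following is only a sketch under that assumption: find_by_slices and mismatches are hypothetical names, and the search is repeated here so the block runs on its own. It compares the two on every short string over a two-letter alphabet, which is enough to exercise repeated prefixes and matches at the very end.

```python
import itertools

def find_by_slices(seq, subseq):
    """Slice-comparison search (hypothetical name); returns -1 on a miss."""
    n = len(subseq)
    for i in range(len(seq) - n + 1):
        if seq[i:i + n] == subseq:
            return i
    return -1

def mismatches(alphabet="ab", max_len=4):
    """Return every (text, pattern) pair where the search and str.find disagree."""
    words = [""]
    for length in range(1, max_len + 1):
        words.extend("".join(p)
                     for p in itertools.product(alphabet, repeat=length))
    return [(t, p) for t in words for p in words
            if find_by_slices(t, p) != t.find(p)]

bad = mismatches()
print(bad)
```

An empty list means the two agree on every pair tried. That is evidence, not proof, and it says nothing about sequences whose slices do not compare equal to the pattern (a tuple pattern against a list, for instance).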

-- 
Neil Cerutti
I pulled away from the side of the road, glanced at my mother-in-law and
headed over the embankment. --Insurance Claim Blooper