[Tutor] Algorithm for sequence matching

Peter Otten __peter__ at web.de
Sun Jul 3 00:29:46 CEST 2011


ANKUR AGGARWAL wrote:

> Hey
> I am looking for an algo for the largest sequence search in the two list.
> 
> Example : list a accepts some say 'm' numbers. list b accept says 'n'
> numbers. I want to look for the largest same sequence between the two list
> and then display it. I tried out but failed to do so.
> Say A=[11,23,45,21,63,56,78,32]
> B=[56,78,11,23,45,21,111,234,56543]
> 
> There are two  similar sequence matching over here [11,23] and [23,45,21]
> i want to display second sequence because its larger in number. Plz help
> Thanks in Advance :)

The following is not particular efficient, but may be good enough if the 
sequences are not too long:

def index(items, value):
    """Generate all occurences of `value` in `items`"""
    start = 0
    while True:
        try:
            pos = items.index(value, start)
        except ValueError:
            break
        yield pos
        start = pos + 1

def matches(a, b):
    for i, x in enumerate(a):
        for k in index(b, x):
            for p, (s, t) in enumerate(zip(a[i:], b[k:])):
                if s != t:
                    break
            yield a[i:i+p]

if __name__ == "__main__":
    a = [11, 23, 45, 21, 63, 56, 78, 32]
    b = [56, 78, 11, 23, 45, 21, 111, 234, 56543]

    print(max(matches(a, b), key=len))
    print(sorted(matches(a, b)))




More information about the Tutor mailing list