#elements of seq A in seq B

Wed Aug 19 22:12:14 EDT 2009

On Aug 19, 8:17 pm, "Jan Kaliszewski" <z... at chopin.edu.pl> wrote:
> 20-08-2009 o 01:19:24 Neal Becker <ndbeck... at gmail.com> wrote:
>
> > What would be a time efficient way to count the number of occurrences of
> > elements of sequence A in sequence B?  (in this particular case, these
> > sequences are strings, if that matters).
>
> If you mean: to count occurences of each element of A (separately) in B...
>
> Hm, maybe something like this:
>
>    # result as a dict {<element of A>: <how many occurences in B>, ...}
>    dict((element, B.count(element)) for element in A)
>
> If you mean: to count non overlaping occurences of string A in B
> -- simply:
>
>    B.count(A)
>
> Regards,
> *j
>
> --
> Jan Kaliszewski (zuo) <z... at chopin.edu.pl>

You don't want to use count() in a case like this because it iterates
through B len(A) times, i.e. this algorithm is O(A x B).

If all you want is the total you can do it like this:

def f(a, b):
    a = set(a)
    return sum(item in a for item in b)

A = 'some string'
B = 'another string'

print f(A, B) # prints 12

If you want to know the count for each element you can use this:

from collections import defaultdict

def g(a, b):
    a = set(a)
    d = defaultdict(int)
    for item in b:
        if item in a:
            d[item] += 1
    return d

print g(A, B)

# prints defaultdict(<type 'int'>, {' ': 1, 'e': 1, 'g': 1, 'i': 1,
'o': 1, 'n': 2, 's': 1, 'r': 2, 't': 2})

HTH,
~Simon