#elements of seq A in seq B

Simon Forman sajmikins at gmail.com
Thu Aug 20 01:44:02 EDT 2009


On Aug 19, 11:34 pm, John Machin <sjmac... at lexicon.net> wrote:
> On Aug 20, 12:12 pm, Simon Forman <sajmik... at gmail.com> wrote:
>
> > On Aug 19, 8:17 pm, "Jan Kaliszewski" <z... at chopin.edu.pl> wrote:
> > > If you mean: to count non overlaping occurences of string A in B
> > > -- simply:
>
> > >    B.count(A)
>
> > You don't want to use count() in a case like this because it iterates
> > through B len(A) times, i.e. this algorithm is O(A x B).
>
> There's seems to be multiple confusion here. Jan was talking about
> counting non-overlapping occurrences of string A in [string] B. In
> that case B.count(A) is nowhere near O(A*B)... in fact in reasonable
> cases it is sub-linear i.e. O(B/A)
>
> Seehttp://svn.python.org/view/python/trunk/Objects/stringlib/fastsearch....
>
> And even a naive implementation of str.count() would not "iterate
> through B len(A) times".
>
> and list.count() can't be used on its own to solve Neal's problem; its
> arg is a single element so it would have to be wrapped in a loop: sum
> (B.count(a) for a in A) which would of course be O(A*B)

First, thanks for pointing out str.count().  I didn't know of it and I
didn't notice that that was what was being used in Jan's code. (It's
blindingly obvious-- now that you've pointed it out. :] )


Second, I was referring to this:

   dict((element, B.count(element)) for element in A)

not B.count(A). Sorry for not interspersing my comment better.

Regards,
~Simon



More information about the Python-list mailing list