[python-win32] Identify unique data from sequence array

Tim Roberts timr at probo.com
Wed Dec 22 20:37:53 CET 2010


otrov wrote:
>
> Thanks for your reply, but perhaps there is misunderstanding:
>
> I don't want unique values, but unique sequence (block) of data that is repeated in array:
>
> A B C D D D A B C D D D A B C D D D
> |_________| |_________| |_________|
>      |           |           |
>    unique      unique      unique
>   sequence    sequence    sequence
>     data        data        data
>
> I tested your approach and won't say it's slow. It works great but that's not what I'm after. Thanks anyway

In general that is a computationally intensive task, unless you are
guaranteed that the entire sequence consists of one block repeated over
and over.  If you KNOW that the larger sequence starts with a shorter
sequence that will repeat, then it's not so bad.

1. Start with the first element (call it L)
2. Scan downwind for a matching element (call it R)
3. Compare L+1 and R+1 until you find a mismatch -- that's the current
"largest" match.
4. Repeat from 2 to see if you can find a longer match.
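
As a rough illustration, here is one way those four steps might look in
Python.  This is only a sketch of one reading of them: it assumes the
whole sequence is the block repeated back to back, and it treats a match
that runs all the way to the end of the data as the answer.  The name
find_repeating_block is made up for this example.

```python
def find_repeating_block(seq):
    """Return the shortest leading block whose repetition reproduces seq.

    Assumes seq starts with the block and contains nothing else.
    If no repetition is found, a copy of the whole sequence is returned.
    """
    n = len(seq)
    # Step 1: L is the first element, index 0.
    for r in range(1, n):
        # Step 2: scan forward for an element R that matches L.
        if seq[r] != seq[0]:
            continue
        # Step 3: compare L+k and R+k until a mismatch or the end of data.
        k = 0
        while r + k < n and seq[k] == seq[r + k]:
            k += 1
        # If the match ran to the end, seq[:r] repeats through the whole
        # sequence, so it is the block.
        if r + k == n:
            return seq[:r]
        # Step 4: otherwise go back to scanning for the next candidate R.
    return seq[:]


data = "A B C D D D A B C D D D A B C D D D".split()
print(find_repeating_block(data))
```

With the sample data from the question, R first lands on index 6 and the
comparison runs to the end, so the block is the first six elements.  Note
that this sketch also accepts a final repetition that is cut short (for
example A B C A B gives A B C); add a length check if that is not wanted.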

-- 
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.


