Searching for uniqueness in a list of data
ptmcg at austin.rr._bogus_.com
Wed Mar 1 19:26:42 CET 2006
"rh0dium" <steven.klass at gmail.com> wrote in message
news:1141228140.331259.8060 at j33g2000cwa.googlegroups.com...
> Hi all,
> I am having a bit of difficulty in figuring out an efficient way to
> split up my data and identify the unique pieces of it.
> Now I want to split each item up on the "_" and compare it with all
> others on the list, if there is a difference I want to create a list of
> the possible choices, and ask the user which choice of the list they
Check out difflib.
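>>> data = ["1p2m_3.3-1.8v_sal_ms", "1p2m_3.3-1.8_sal_log"]
>>> data[0].split("_")
['1p2m', '3.3-1.8v', 'sal', 'ms']
>>> data[1].split("_")
['1p2m', '3.3-1.8', 'sal', 'log']
>>> from difflib import SequenceMatcher
>>> s = SequenceMatcher(None, data[0].split("_"), data[1].split("_"))
>>> s.get_matching_blocks()
[(0, 0, 1), (2, 2, 1), (4, 4, 0)]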
I believe one interprets the tuples in matching_blocks as:
(seq1 index, seq2 index, number of matching elements)
In your case, the sequences have a matching element 0 and matching element
2, each of length 1. I don't fully grok the meaning of the (4,4,0) tuple,
unless this is intended to show that both sequences have the same length.
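
For what it's worth, the difflib documentation describes the last triple as a dummy of the form (len(a), len(b), 0) that marks the end of the list, so it would appear even if the two sequences had different lengths. Here is a minimal sketch that checks this; the two sample strings are my assumption, rebuilt from the split output shown earlier.

```python
from difflib import SequenceMatcher

# Assumed sample strings, rebuilt from the split output shown earlier.
a = "1p2m_3.3-1.8v_sal_ms".split("_")
b = "1p2m_3.3-1.8_sal_log".split("_")

blocks = SequenceMatcher(None, a, b).get_matching_blocks()

# Each block is (i, j, n): the n elements starting at a[i]
# equal the n elements starting at b[j].
for i, j, n in blocks:
    assert a[i:i + n] == b[j:j + n]

# The final block is a sentinel with n == 0, positioned at the
# end of both sequences.
sentinel_ok = tuple(blocks[-1]) == (len(a), len(b), 0)
print(sentinel_ok)
```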
Perhaps from here, you could locate the gaps in the
SequenceMatcher.matching_blocks property, and prompt for the user's choice.
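
One possible way to locate those gaps is SequenceMatcher.get_opcodes(), which reports the non-matching ranges directly. The sketch below is only an illustration: the helper name differing_fields and the sample strings are my own assumptions, not anything from the original data.

```python
from difflib import SequenceMatcher


def differing_fields(first, second, sep="_"):
    """Hypothetical helper: return (position, choices) pairs where
    the two separator-delimited strings disagree."""
    a, b = first.split(sep), second.split(sep)
    matcher = SequenceMatcher(None, a, b)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # a[i1:i2] and b[j1:j2] are the competing pieces for this gap;
            # one side is empty when the opcode is an insert or a delete.
            diffs.append((i1, [sep.join(a[i1:i2]), sep.join(b[j1:j2])]))
    return diffs


# Assumed sample strings, rebuilt from the split output shown earlier.
diffs = differing_fields("1p2m_3.3-1.8v_sal_ms", "1p2m_3.3-1.8_sal_log")
for position, choices in diffs:
    print(position, choices)
```

Each choices list could then be shown to the user, for example through input(), to pick which value to keep for that position.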