# Searching for uniqueness in a list of data

Alexander Schmolck a.schmolck at gmail.com
Wed Mar 1 17:55:10 CET 2006

```
"rh0dium" <steven.klass at gmail.com> writes:

> Hi all,
>
> I am having a bit of difficulty in figuring out an efficient way to
> split up my data and identify the unique pieces of it.
>
> list=['1p2m_3.3-1.8v_sal_ms','1p2m_3.3-1.8_sal_log']
>
> Now I want to split each item up on the "_" and compare it with all
> others on the list, if there is a difference I want to create a list of
> the possible choices, and ask the user which choice of the list they
> want.  I have the questioning part under control.   I can't seem to get
> my hands around the logic - the list could be 2 items or 100 long.  The
> point of this is that I am trying to narrow a decision down for an end
> user.  In other words the end user needs to select one of the list
> items, and by breaking it down for them I hope to simplify this.
>
> list=['1p2m_3.3-1.8v_sal_ms','1p6m_3.3-1.8_sal_log']
>  would only question the first data set ['1p2m', '1p6m' ]
>
> list=['1p2m_3.3-1.8v_sal_ms','1p2m_3.3-1.8v_pol_ms','1p3m_3.3-18.v_sal_ms']
>  If on the list ['1p2m','1p2m','1p3m'] the user selected 1p2m then the
> next list would only be ['sal','pol']
>  but if the user initially only selected 1p3m they would be done..
>
> I hope this clarifies what I am trying to do.  I just can't seem to get
> my hands around this - so an explanation of the logic would really be
> helpful.  I picture a 2d list but I can't seem to get it..

The easiest way to do this is to build a nested dictionary of prefixes: each
piece of a split item becomes a key whose value is a nested dictionary built
from the rest of the split, or an empty dict if nothing is left. Looking up
the user's input in the current dict then gives you all the possible next
choices.

Spoiler Warning -- sample implementation follows below.

(mostly untested)

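def addSplit(d, split):
    # Insert one split item into the nested dict d, one piece per level.
    if len(split):
        if split[0] not in d:
            d[split[0]] = addSplit({}, split[1:])
        else:
            # Prefix already present: merge the rest into its sub-dict.
            addSplit(d[split[0]], split[1:])
    return d

def queryUser(chosen, choices):
    # Ask for one key at this level; return the selection so far and the sub-dict.
    next = raw_input('So far: %s\nNow type one of %s: ' %
                     (chosen, choices.keys()))
    return chosen+next, choices[next]

wordList=['1p2m_3.3-1.8v_sal_ms','1p2m_3.3-1.8v_pol_ms','1p3m_3.3-18.v_sal_ms']
choices = reduce(addSplit,(s.split('_') for s in wordList),  {})
chosen = ""
while choices:
    chosen, choices = queryUser(chosen, choices)
print "You chose:", chosen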

'as

```
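The nested-dictionary idea from the reply can also be sketched for Python 3. This is only an illustration and not the poster's code: the names `add_split`, `build_tree`, `choose` and `pick` are made up here, and the scripted `answers` iterator stands in for a real prompt such as `input()`. It assumes the user should only be asked at levels where more than one option exists, which is how the original question reads, and it joins the chosen pieces with "_" again.

```python
from functools import reduce


def add_split(tree, parts):
    """Insert one already-split item into the nested dict, one piece per level."""
    if parts:
        # setdefault creates the sub-dict on first sight and reuses it afterwards.
        add_split(tree.setdefault(parts[0], {}), parts[1:])
    return tree


def build_tree(words, sep="_"):
    """Fold every item into a single nested dict of prefixes."""
    return reduce(add_split, (w.split(sep) for w in words), {})


def choose(tree, pick, sep="_"):
    """Walk the tree, calling pick(options) only where there is a real choice."""
    chosen = []
    while tree:
        options = sorted(tree)
        # A level with a single option is taken without asking.
        part = options[0] if len(options) == 1 else pick(options)
        chosen.append(part)
        tree = tree[part]
    return sep.join(chosen)


word_list = ['1p2m_3.3-1.8v_sal_ms', '1p2m_3.3-1.8v_pol_ms', '1p3m_3.3-18.v_sal_ms']
tree = build_tree(word_list)

# Scripted answers replace interactive input so the sketch runs unattended.
answers = iter(['1p2m', 'pol'])
result = choose(tree, lambda options: next(answers))
print(result)
```

With the sample list only two questions are asked: first `['1p2m', '1p3m']`, then `['pol', 'sal']`. Like the original, this sketch cannot tell a complete item from a prefix of a longer one (for example `'a'` next to `'a_b'`); handling that would need an explicit end marker in the dict.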