[Tutor] Arbitrary-argument set function
Steven D'Aprano
steve at pearwood.info
Wed Oct 2 03:29:13 CEST 2013
On Tue, Oct 01, 2013 at 11:43:09PM +0100, Spyros Charonis wrote:
> Dear Pythoners,
>
>
> I am trying to extract from a set of about 20 sequences, the characters
> which are unique to each sequence. For simplicity, imagine I have only 3
> "sequences" (words in this example) such as:
>
>
> s1='spam'; s2='scam'; s3='slam'
>
>
> I would like the character that is unique to each sequence, i.e. I need my
> function to return the list [ 'p', 'c', 'l' ]. This function I am using is
> as follows:

What happens if there is more than one unique character?

    s1 = 'spam'; s2 = 'eggs'

Or none?

    s1 = 'spam'; s2 = 'pams'

What happens if two sequences have the same unique character?

    s1 = 'spam'; s2 = 'slam'; s3 = 'slam'

Do you get 'l' twice or only once?

Do you need the unique characters in the order seen? Or all mixed in
together?

Your requirements are not detailed enough to solve this question. For
instance, if all you want is a single collection of globally unique
characters, in no particular order, this is probably the simplest and
fastest way to do it:

def uniq(*strings):
    result = set()
    result.update(*strings)
    return list(result)

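As a quick check of what that recipe gives for the three example words, here is a small sketch. It repeats the function so it can be run on its own; the sorted() call is my addition, there only because a set has no defined order and the list comes back in arbitrary order.

```python
def uniq(*strings):
    # set.update() accepts any number of iterables; each string
    # contributes its individual characters.
    result = set()
    result.update(*strings)
    return list(result)

# Sort before displaying, since the order of the returned list is arbitrary.
letters = sorted(uniq('spam', 'scam', 'slam'))
print(letters)  # ['a', 'c', 'l', 'm', 'p', 's']
```

Every character that appears anywhere is reported once, no matter how many of the words contain it.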
If you want to extract the unique characters from each string, keeping
the results for each string separate and in the order seen, you can do
this:
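
# untested
def uniq(*strings):
    # Every character that occurs in any of the strings.
    globally_unique = set()
    globally_unique.update(*strings)
    result = []
    for s in strings:
        tmp = []  # characters of s already kept, in the order seen
        for c in s:
            # c always comes from one of the strings, so the first test
            # always passes; only the second test filters anything.
            if c in globally_unique and c not in tmp:
                tmp.append(c)
        result.append(''.join(tmp))
    return result
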
If you want something else, you should be able to adapt the above two
recipes to do whatever you like.
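
As one possible adaptation, here is a sketch that assumes "unique" means "occurs in exactly one of the strings", which is only my guess at the requirement. The name unique_to_each is made up for this example, and collections.Counter is used to count how many strings contain each character.

```python
from collections import Counter

def unique_to_each(*strings):
    # Count how many of the strings contain each character.
    # set(s) makes repeats within a single string count only once.
    owners = Counter(c for s in strings for c in set(s))
    result = []
    for s in strings:
        kept = []
        for c in s:
            # Keep c if no other string contains it and we have not
            # already kept it for this string.
            if owners[c] == 1 and c not in kept:
                kept.append(c)
        result.append(''.join(kept))
    return result

print(unique_to_each('spam', 'scam', 'slam'))  # ['p', 'c', 'l']
print(unique_to_each('spam', 'eggs'))          # ['pam', 'eg']
print(unique_to_each('spam', 'pams'))          # ['', '']
```

Under this reading a string with no character of its own gives an empty string, and a character shared by two strings (such as the 'l' in two copies of 'slam') is reported for neither. If you want different behaviour in those cases, the test on owners[c] is the place to change.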
--
Steven