[Tutor] Arbitrary-argument set function

Steven D'Aprano steve at pearwood.info
Wed Oct 2 03:29:13 CEST 2013


On Tue, Oct 01, 2013 at 11:43:09PM +0100, Spyros Charonis wrote:
> Dear Pythoners,
> 
> 
> I am trying to extract from a set of about 20 sequences, the characters
> which are unique to each sequence. For simplicity, imagine I have only 3
> "sequences" (words in this example) such as:
> 
> 
> s1='spam'; s2='scam'; s3='slam'
> 
> 
> I would like the character that is unique to each sequence, i.e. I need my
> function to return the list [ 'p', 'c', 'l' ]. The function I am using is
> as follows:

What happens if there is more than one unique character?

s1 = 'spam'; s2 = 'eggs'

Or none?

s1 = 'spam'; s2 = 'pams'


What happens if two sequences have the same unique character?

s1 = 'spam'; s2 = 'slam'; s3 = 'slam'

Do you get 'l' twice or only once?

Do you need the unique characters in the order seen? Or all mixed in 
together?

Your requirements are not detailed enough to answer this question. For 
instance, if all you want is a single collection of every distinct 
character that appears in any of the strings, in no particular order, 
this is probably the simplest and fastest way to do it:

def uniq(*strings):
    result = set()
    result.update(*strings)
    return list(result)
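
As a quick check, here is what that returns for the three words from the 
question. The function is repeated so the snippet runs on its own, and 
sorted() is there only because a set has no defined order:

```python
def uniq(*strings):
    result = set()
    # update() accepts any number of iterables; each string adds its characters.
    result.update(*strings)
    return list(result)

# Sort before printing so the display is stable from run to run.
print(sorted(uniq('spam', 'scam', 'slam')))
# ['a', 'c', 'l', 'm', 'p', 's']
```

Note that this is every character seen anywhere, not the characters 
that are peculiar to one word.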


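If you want the characters that appear in only one of the strings, 
kept separate for each string and in the order seen, you can do this:

# untested
from collections import Counter

def uniq(*strings):
    # Count how many of the strings each character occurs in.
    counts = Counter()
    for s in strings:
        counts.update(set(s))
    result = []
    for s in strings:
        tmp = []
        for c in s:
            # Keep a character only if no other string contains it,
            # and only the first time it is seen in this string.
            if counts[c] == 1 and c not in tmp:
                tmp.append(c)
        result.append(''.join(tmp))
    return result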
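
To see how the edge cases raised earlier play out under one possible 
reading of the question (keep only the characters that occur in exactly 
one of the strings), here is a compact sketch. The name only_in_one is 
made up for illustration, and the code assumes Python 3.7 or later, 
where dicts preserve insertion order:

```python
from collections import Counter

def only_in_one(*strings):
    # Count, for each character, how many of the strings contain it.
    seen_in = Counter(c for s in strings for c in set(s))
    # dict.fromkeys drops repeats within a string, keeping first-seen order.
    return [''.join(c for c in dict.fromkeys(s) if seen_in[c] == 1)
            for s in strings]

print(only_in_one('spam', 'scam', 'slam'))  # ['p', 'c', 'l']
print(only_in_one('spam', 'eggs'))          # ['pam', 'eg']
print(only_in_one('spam', 'pams'))          # ['', '']
print(only_in_one('spam', 'slam', 'slam'))  # ['p', '', '']
```

Under this reading, a string with nothing of its own yields an empty 
string, and a character shared by two identical strings is not reported 
for either of them.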


If you want something else, you should be able to adapt the above two 
recipes to do whatever you like.



-- 
Steven

