Comparing modified elements in Sets

Asun Friere afriere at
Tue Jul 10 04:44:11 EDT 2007

On Jul 10, 5:57 am, ChrisEdge... at wrote:
> I'd like to be able to compare set 1 with set 2 and have it match
> filename1 and filename3, or compare set 1 with 3 and get back
> filename1, filename2.  etc.
> Is there a way for me to do this inside the compare function, rather
> than having to make duplicate copies of each set?

Is there a will?
Inevitably there is a way! Whether you should take it is another
question entirely. ;)

Assuming by 'compare' function you mean such methods as 'difference',
'symmetric_difference', 'intersection' and the like... here's a nasty
little hack (using the old-school Set from the sets module).  It's not to
spec (you get the tails back in the result, but that's easily fixed), and
it only implements a replacement method for 'difference' (called
'tailess_difference').

I apologise if the google groups mailer kludges the indentation ...
from sets import Set
from itertools import ifilterfalse
from os.path import splitext

class BodgySet (Set) :

    def tailess_difference (self, other) :
        """Return, as a new BodgySet, the difference of two
        sets, where element identity ignores all characters
        from the last stop (period).

        NOTE: As currently implemented all elements of said
        sets must be strings (fix this in self.has_key)!!!
        """

        assert other.__class__ is self.__class__
        result = self.__class__()
        data = result._data
        value = True
        for elt in ifilterfalse(other.has_key, self) :
            data[elt] = value
        return result

    def has_key (self, target) :
        thead, ttail = splitext(target)
        for key in self._data.keys() :
            khead, ktail = splitext(key)
            if thead == khead :
                return True
        # no key shares the target's head
        return False

Using this hacked set:
>>> a = BodgySet(['a1.txt', 'a2.txt'])
>>> b = BodgySet(['a1.xml', 'a2.xml', 'a3.xml'])
>>> b.tailess_difference(a)
BodgySet(['a3.xml'])

Is that the kind of thing you had in mind?
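
For comparison, here is a minimal sketch of the same idea on a current Python 3, where the sets module and ifilterfalse no longer exist. The function name tailless_difference and its two arguments are illustrative assumptions, not part of the class above. It works on the built-in set and precomputes the heads of the second set once, instead of scanning every key on each lookup as has_key does:

```python
from os.path import splitext


def tailless_difference(first, second):
    """Return the names in `first` whose head (name minus extension)
    does not occur among the heads of `second`."""
    # Build the heads of `second` once, so each membership test is a
    # hash lookup rather than a scan over all of its elements.
    second_heads = {splitext(name)[0] for name in second}
    return {name for name in first if splitext(name)[0] not in second_heads}


a = {'a1.txt', 'a2.txt'}
b = {'a1.xml', 'a2.xml', 'a3.xml'}
print(tailless_difference(b, a))  # {'a3.xml'}
```

As in the hack above, the tails are kept in the result, and all elements are assumed to be strings.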

While it can be done, I would prefer to make copies of the sets, with
a list comprehension cast to a set, something like:
set([os.path.splitext(x)[0] for x in orig_set]).  Much better
readability and probably greater efficiency (I haven't bothered timing
or dissing it, mind).
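
A rough sketch of that copying approach, written for Python 3. It assumes the original question wants the full filenames back after the comparison, so it keeps a dict from head to filename rather than a bare set of heads. The helper name by_head and the sample filenames with their extensions are hypothetical:

```python
from os.path import splitext


def by_head(filenames):
    """Map each head (name minus extension) to its full filename."""
    # Assumes heads are unique within one set; if two names share a
    # head, only one of them survives in the mapping.
    return {splitext(name)[0]: name for name in filenames}


set1 = by_head({'filename1.txt', 'filename2.txt', 'filename3.txt'})
set2 = by_head({'filename1.xml', 'filename3.xml'})

# dict key views support set operations, so the heads compare directly.
common = set1.keys() & set2.keys()
only_in_set1 = set1.keys() - set2.keys()

print(sorted(common))                         # ['filename1', 'filename3']
print(sorted(set1[h] for h in only_in_set1))  # ['filename2.txt']
```

The originals are never modified, and looking a head up in either dict recovers the filename it came from.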
