[Python-ideas] exclusively1, common, exclusively2 = set1 - set2, set1 & set2, set2 - set1

Steven D'Aprano steve at pearwood.info
Thu Jul 4 04:10:28 CEST 2013


Hi Paddy, long time since I've read something from you, welcome back.

On 04/07/13 06:50, Paddy3118 wrote:
> I found myself repeating something that I know I have used before, several
> times: I get two sets of results, may be sets of the passing tests when a
> design has changed, and I need to work out what has changed so work out
>
> 1. What passed first time round
> 2. What passed both times.
> 3. What passed only the second time round.
>
> I usually use something like the set equations in the title to do this but
> I recognise that this requires both sets to be traversed at least three
> times which seems wasteful.
>
> I wondered if their was am algorithm to partition the two sets of data into
> three as above, but cutting down on the number of set traversals?

You could go from:

exclusively1 = set1 - set2
common = set1 & set2
exclusively2 = set2 - set1

(which I expect ends up traversing each set twice) to this:


common, exclusively1, exclusively2 = set(), set(), set()
for item in set1:
     if item in set2: common.add(item)
     else: exclusively1.add(item)
for item in set2:
     if item in set1: common.add(item)
     else: exclusively2.add(item)


which only does two set traversals. But I would expect that using the set operators would be faster. (Iterating over the sets 3 or 4 times in C is likely to be faster than iterating over them 2 times in pure Python.)


> I also wondered that if such an algorithm existed, would it be useful
> enough to be worth incorporating into the Python library?
>
> Maybe defined as:
>
> exclusively1, common, exclusively2 = set1.partition(set2)

[bikeshed]
I would expect common, only1, only2 in that order.


Does any other language with sets offer this as a set primitive? I too have needed it often enough that I'd be +1 on adding it.



-- 
Steven


More information about the Python-ideas mailing list