(Thanks Steven for the welcome back. I've been advocating Python through the Rosetta Code site). I couldn't think of Oscar and your solution but now that we have a Python algorithm, I understand that it could only be really tested if there were an implementation in C to test against the incumbent method. To give a bit more info on how much it is used: I have even incorporated the partition-type functionality in Python in throw-away bash shell scripts of the form: munge.sh data1 | sort > results1.txt munge.sh data2 | sort > results2 .txt python -c ' set1 = set(file("results1.txt")) set2 = set(file("results2.txt")) print([len(p) for p in (set1 - set2, set1 & set2, set2 - set1)]) ' I have used the above several times when my boss touches base with me and asks the informal "hi, and how are you getting on"? The above isn't my only use. I have used the construct enough times for me to think of it as a "design pattern". Before I posted, I looked up set partition on Wikipedia. It has a different meaning for a partition of a set, but doesn't apply any meaning to the partition of two sets leaving the dyadic meaning of partition vacant :-) - Paddy. On Thursday, 4 July 2013 03:10:28 UTC+1, Steven D'Aprano wrote:
Hi Paddy, long time since I've read something from you, welcome back.
On 04/07/13 06:50, Paddy3118 wrote:
I found myself repeating something that I know I have used before, several times: I get two sets of results, may be sets of the passing tests when a design has changed, and I need to work out what has changed so work out
1. What passed first time round 2. What passed both times. 3. What passed only the second time round.
I usually use something like the set equations in the title to do this but I recognise that this requires both sets to be traversed at least three times which seems wasteful.
I wondered if their was am algorithm to partition the two sets of data into three as above, but cutting down on the number of set traversals?
You could go from:
exclusively1 = set1 - set2 common = set1 & set2 exclusively2 = set2 - set1
(which I expect ends up traversing each set twice) to this:
common, exclusively1, exclusively2 = set(), set(), set() for item in set1: if item in set2: common.add(item) else: exclusively1.add(item) for item in set2: if item in set1: common.add(item) else: exclusively2.add(item)
which only does two set traversals. But I would expect that using the set operators would be faster. (Iterating over the sets 3 or 4 times in C is likely to be faster than iterating over them 2 times in pure Python.)
I also wondered that if such an algorithm existed, would it be useful enough to be worth incorporating into the Python library?
Maybe defined as:
exclusively1, common, exclusively2 = set1.partition(set2)
[bikeshed] I would expect common, only1, only2 in that order.
Does any other language with sets offer this as a set primitive? I too have needed it often enough that I'd be +1 on adding it.
-- Steven _______________________________________________ Python-ideas mailing list Python...@python.org <javascript:> http://mail.python.org/mailman/listinfo/python-ideas