Hi Sergey, I thought that set partitioning might be a worthy candidate for implementation in C if an algorithm that cut down on the set traversals could be found. Yep, I didn't think such an algorithm in Python would likely be any quicker. I guess there are two parts to my question:<div><ol><li><span style="line-height: normal;">Does an algorithm using less passes exist? Yes, thanks.</span></li><li><span style="line-height: normal;">Would a C implementation be a worthwhile addition to Python? - Ongoing...</span></li></ol>- Paddy.<br><br>On Thursday, 4 July 2013 13:33:35 UTC+1, Sergey wrote:<blockquote class="gmail_quote" style="margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">On Jul 4, 2013 Oscar Benjamin wrote:
<br>
<br>>> I usually use something like the set equations in the title to do this but I
<br>>> recognise that this requires both sets to be traversed at least three times
<br>>> which seems wasteful.
<br>>>
<br>>> I wondered if their was am algorithm to partition the two sets of data into
<br>>> three as above, but cutting down on the number of set traversals?
<br>>
<br>> You can do it in one traversal of each set:
<br>>
<br>> def partition(setx, sety):
<br>> xonly, xandy, yonly = set(), set(), set()
<br>> for set1, set2, setn in [(setx, sety, xonly), (sety, setx, yonly)]:
<br>> for val in set1:
<br>> if val in set2:
<br>> xandy.add(val)
<br>> else:
<br>> setn.add(val)
<br>> return xonly, xandy, yonly
<br>
<br>JFYI, despite using two passes this long partition() is twice slower
<br>than simple:
<br> def partition(set1, set2):
<br> common = set1 & set2
<br> return set1 - common, common, set2 - common
<br>
<br>That's because both functions are O(n) but the short one is just
<br>a few native calls, while long one uses lots of small calls, and
<br>thus overhead of so many function calls *significantly* exceeds
<br>time of one additional pass.
<br>
<br>Simple is better than complex. Readability counts. :)
<br></blockquote></div>