[Python-ideas] exclusively1, common, exclusively2 = set1 - set2, set1 & set2, set2 - set1

Paddy3118 paddy3118 at gmail.com
Thu Jul 4 16:33:34 CEST 2013


Hi Sergey, I thought that set partitioning might be a worthy candidate for 
implementation in C if an algorithm that cut down on the set traversals 
could be found. Yep, I didn't think such an algorithm in Python would 
likely be any quicker. I guess there are two parts to my question:

   1. Does an algorithm using less passes exist? Yes, thanks.
   2. Would a C implementation be a worthwhile addition to Python? - 
   Ongoing...

- Paddy.

On Thursday, 4 July 2013 13:33:35 UTC+1, Sergey wrote:
>
> On Jul 4, 2013 Oscar Benjamin wrote: 
>
> >> I usually use something like the set equations in the title to do this 
> but I 
> >> recognise that this requires both sets to be traversed at least three 
> times 
> >> which seems wasteful. 
> >> 
> >> I wondered if their was am algorithm to partition the two sets of data 
> into 
> >> three as above, but cutting down on the number of set traversals? 
> > 
> > You can do it in one traversal of each set: 
> > 
> > def partition(setx, sety): 
> >     xonly, xandy, yonly = set(), set(), set() 
> >     for set1, set2, setn in [(setx, sety, xonly), (sety, setx, yonly)]: 
> >         for val in set1: 
> >             if val in set2: 
> >                 xandy.add(val) 
> >             else: 
> >                 setn.add(val) 
> >     return xonly, xandy, yonly 
>
> JFYI, despite using two passes this long partition() is twice slower 
> than simple: 
>     def partition(set1, set2): 
>         common = set1 & set2 
>         return set1 - common, common, set2 - common 
>
> That's because both functions are O(n) but the short one is just 
> a few native calls, while long one uses lots of small calls, and 
> thus overhead of so many function calls *significantly* exceeds 
> time of one additional pass. 
>
> Simple is better than complex. Readability counts. :) 
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20130704/b6ec8fe7/attachment.html>


More information about the Python-ideas mailing list