[Python-ideas] exclusively1, common, exclusively2 = set1 - set2, set1 & set2, set2 - set1
Paddy3118
paddy3118 at gmail.com
Thu Jul 4 07:22:36 CEST 2013
(Thanks Steven for the welcome back. I've been advocating Python through
the Rosetta Code site).
I hadn't thought of Oscar's and your solution, but now that we have a Python
algorithm, I understand that it could only really be tested against the
incumbent method if there were an implementation in C.
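As a rough sketch only, this is how the pure-Python side of such a comparison might be timed. The function names partition_ops and partition_loop are made up for illustration, the loop is a simplified variant of the two-pass idea and not necessarily Oscar's or Steven's exact code, and the test data sizes are arbitrary. It says nothing about how a C implementation would perform.

```python
import timeit


def partition_ops(set1, set2):
    """Incumbent approach: three set operations, each running in C."""
    return set1 - set2, set1 & set2, set2 - set1


def partition_loop(set1, set2):
    """Pure-Python variant (hypothetical): one explicit pass over each set."""
    only1, common, only2 = set(), set(), set()
    for item in set1:
        if item in set2:
            common.add(item)
        else:
            only1.add(item)
    for item in set2:
        if item not in set1:
            only2.add(item)
    return only1, common, only2


if __name__ == "__main__":
    # Arbitrary overlapping test data; real workloads may behave differently.
    a = set(range(0, 100000))
    b = set(range(50000, 150000))
    assert partition_ops(a, b) == partition_loop(a, b)
    for func in (partition_ops, partition_loop):
        seconds = timeit.timeit(lambda: func(a, b), number=10)
        print(func.__name__, seconds)
```

Both functions return the triple in the order used in the subject line (exclusively1, common, exclusively2), so their results can be compared directly before timing.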
To give a bit more info on how much it is used: I have even incorporated
the partition-type functionality in Python in throw-away bash shell
scripts of the form:
munge.sh data1 | sort > results1.txt
munge.sh data2 | sort > results2.txt
python -c '
set1 = set(open("results1.txt"))
set2 = set(open("results2.txt"))
print([len(p) for p in (set1 - set2, set1 & set2, set2 - set1)])
'
I have used the above several times when my boss touches base with me and
asks the informal "Hi, and how are you getting on?"
The above isn't my only use. I have used the construct enough times for me
to think of it as a "design pattern".
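For what it's worth, the pattern could be wrapped up roughly as below. This is only a sketch: the names partition and report are invented for illustration and are not an existing Python API, and the file names passed to report are whatever the shell step produced.

```python
def partition(iterable1, iterable2):
    """Return (only in first, in both, only in second) as three sets."""
    set1, set2 = set(iterable1), set(iterable2)
    return set1 - set2, set1 & set2, set2 - set1


def report(path1, path2):
    """Return the sizes of the three parts for the lines of two text files."""
    with open(path1) as file1, open(path2) as file2:
        return [len(part) for part in partition(file1, file2)]
```

Calling report("results1.txt", "results2.txt") would then give the same three counts as the one-liner in the shell script above.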
Before I posted, I looked up set partition on Wikipedia. It gives a different
meaning for the partition of a single set, but assigns no meaning to the
partition of two sets, leaving the dyadic meaning of partition vacant :-)
- Paddy.
On Thursday, 4 July 2013 03:10:28 UTC+1, Steven D'Aprano wrote:
>
> Hi Paddy, long time since I've read something from you, welcome back.
>
> On 04/07/13 06:50, Paddy3118 wrote:
> > I found myself repeating something that I know I have used before,
> > several times: I get two sets of results, maybe sets of the passing
> > tests when a design has changed, and I need to work out what has
> > changed, so work out
> >
> > 1. What passed only the first time round.
> > 2. What passed both times.
> > 3. What passed only the second time round.
> >
> > I usually use something like the set equations in the title to do this,
> > but I recognise that this requires both sets to be traversed at least
> > three times, which seems wasteful.
> >
> > I wondered if there was an algorithm to partition the two sets of data
> > into three as above, but cutting down on the number of set traversals?
>
> You could go from:
>
> exclusively1 = set1 - set2
> common = set1 & set2
> exclusively2 = set2 - set1
>
> (which I expect ends up traversing each set twice) to this:
>
>
> common, exclusively1, exclusively2 = set(), set(), set()
> for item in set1:
>     if item in set2: common.add(item)
>     else: exclusively1.add(item)
> for item in set2:
>     if item in set1: common.add(item)
>     else: exclusively2.add(item)
>
>
> which only does two set traversals. But I would expect that using the set
> operators would be faster. (Iterating over the sets 3 or 4 times in C is
> likely to be faster than iterating over them 2 times in pure Python.)
>
>
> > I also wondered that if such an algorithm existed, would it be useful
> > enough to be worth incorporating into the Python library?
> >
> > Maybe defined as:
> >
> > exclusively1, common, exclusively2 = set1.partition(set2)
>
> [bikeshed]
> I would expect common, only1, only2 in that order.
>
>
> Does any other language with sets offer this as a set primitive? I too
> have needed it often enough that I'd be +1 on adding it.
>
>
>
> --
> Steven