[Patches] [ python-Patches-625513 ] sets.BaseSet.isdisjointset(other)

Thu, 12 Dec 2002 15:17:52 -0800

Patches item #625513, was opened at 2002-10-19 00:20
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=625513&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Shane Holloway (shane_holloway)
Assigned to: Nobody/Anonymous (nobody)
Summary: sets.BaseSet.isdisjointset(other)

Initial Comment:
Provides an optimized test for disjoint sets without 
creating an intersection, and returning after finding first 
shared element.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2002-12-12 18:17

Message:
Logged In: YES 
user_id=80475

Here is an alternative implementation:

>>> def isdisjoint(a,b):
...     return len(a) != len(a|b)

The underlying dict.copy() and dict.update() help give this 
a speed advantage.

----------------------------------------------------------------------

Comment By: Shane Holloway (shane_holloway)
Date: 2002-10-22 13:32

Message:
Logged In: YES 
user_id=283742

I definitely agree that this disjoint set test and an intersection 
both have linear time complexity, but the intersection also 
has a memory complexity that a disjoint test does not have.  
Secondly, in the case of an intersecting set, it short-circuts 
the rest of the loop since it already has the answer, halving 
the time complexity on average.  However, disjoint test 
doesn't benefit from the C-loop of filter that the intersection 
does.

When I wrote this patch, I was creating some large sets 
doing reachability searches, and I wanted to merge the 
intersecting ones.  I started by examining the length of the 
intersection, but the time to allocate the complete result was 
bottlenecking things.  (400K entries.)  Thus I wrote the 
disjoint test and things went much faster.  But, as you said, 
this is neither common nor compelling.  

So I assumed it was a common enough function that others 
would profit from having it the standard library.  It was 
certainly all over my set theory classes a few years ago.  ;)  
Perhaps what we need is more c-loop functions to add to 
map, filter and reduce?  A function called 'first' would work 
wonders here and other places.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-10-22 00:54

Message:
Logged In: YES 
user_id=80475

In general, I'm a fan of early-out routines, but I agree with 
Tim, unless this patch solves a compelling common case 
and is clear about when and whether it can be used to 
advantage, the filter based itersection method remains the 
best solution for the general case.

Unless a compelling, common use case can be shown, I 
recommend that we don't fatten the interface further.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-10-19 01:33

Message:
Logged In: YES 
user_id=31435

In what sense is this optimized?  People usually 
mean "speed" when they use that word without qualification, 
but this surely runs much slower (most of the time) if the 
sets are in fact disjoint.  OTOH, I'm sure it does run much 
faster (most of the time) if the sets have the same 
elements.

If a method is merely provided for speed in *some* cases, 
it's generally a hard sell.  It would help your case if the 
docs here spelled out exactly when and why someone 
would want to use the new method; else it appears 
mysteriously redundant.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=625513&group_id=5470