[Python-ideas] exclusively1, common, exclusively2 = set1 - set2, set1 & set2, set2 - set1

Sat Jul 6 19:26:29 CEST 2013

On 07/07/13 01:34, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
>
>   > "partition" is not the ideal name for this method,
>
> +1
>
> It would completely confuse anybody who did want a partition.
>
>   > but the actual operation itself is very useful. I have often done
>   > this, mostly on dict views rather than sets. In my head, it's an
>   > obvious operation, to split a pair of sets into three, spoiled only
>   > by lack of a good name.
>   >
>   > Perhaps "split" is a reasonable name?
>   >
>   > only1, both, only2 = set1.split(set2)
>
> -1
>
> Set splitting is an intractable problem.
> https://en.wikipedia.org/wiki/Set_splitting_problem

Paddy's suggested method is a concrete, conceptually simple operation on two finite, discrete sets, not some theoretical problem[1] from complexity theory. If we're going to reject method names because they have some vague relation to some obscure corner of mathematics that 99% of programmers will never have heard of, let alone care about, I think we're soon going to run out of good names.

> That wouldn't bother me all that much except that I can imagine all
> kinds of ways to split sets that have little to do with boolean
> algebra (starting with Dedekind cuts, you see how messy this will
> get).

The string split method implements *one specific way* of splitting strings, by partitioning on some given delimiter. There are other ways of splitting, say by keeping the delimiter, or by partitioning the string into groups of N characters, or between pairs of brackets, etc. We don't reject the name "split" for strings just because there are alternative ways to split, and we shouldn't reject a simple, descriptive, understandable name "split" for sets just because there are other ways to split sets.
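To make the point about strings concrete, here is a rough sketch of three of those ways of splitting. Only the first is what str.split does; the sample string, the variable names, and the use of re.split for the second case are my own choices for illustration.

```python
import re

text = "a,b,c"

# What str.split does: partition on a delimiter and discard it.
on_delimiter = text.split(",")

# One alternative: keep the delimiter. A capturing group in the
# pattern makes re.split include the matched delimiters in the result.
keeping_delimiter = re.split(r"(,)", text)

# Another alternative: fixed-size groups of N characters.
n = 2
groups_of_n = [text[i:i + n] for i in range(0, len(text), n)]
```

All three are reasonable things to call "splitting a string", yet nobody is confused that str.split means only the first.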

> I propose "join".[1]
...
> [1]  Or maybe it's the meet?  I never can keep the two straight....

I don't think that either is appropriate. Join and meet are operations on a single set, not a pair of them: the join of a set is the least upper bound (effectively, the maximum) and the meet is the greatest lower bound (effectively, the minimum) of the set. They are not operations on two sets.

http://en.wikipedia.org/wiki/Join_and_meet

Alternatively, join and meet can be defined as binary operations on elements of the set, rather than on the set itself.

But in any case, I don't think that a method that takes two sets as input and returns three sets should be called a "join". In plain English, when you join two things you get one thing, not three. And if we're going to reject set.partition because it doesn't behave quite the same as str.partition, then we should reject set.join because it doesn't behave even slightly like str.join, which is *far* better known than str.partition.
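For comparison, a quick sketch of what the two existing string methods return (the sample values are made up for illustration):

```python
# str.join takes many pieces and returns ONE string.
joined = "-".join(["a", "b", "c"])

# str.partition takes one string and returns THREE pieces:
# the text before the first separator, the separator, and the rest.
before, sep, after = "a-b-c".partition("-")
```

Of the two, the three-way result of str.partition is at least the same shape as what is being proposed for sets; str.join goes in the opposite direction.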

This is clearly a convenience method. There's already a fast way to calculate the result; it just takes three calls instead of one. This would be a little faster and more convenient, but it wouldn't change what we can do with sets. I already have a utility function in my toolbox to calculate this, so it would be a Nice To Have if it were a built-in set method, but not if it means spending three weeks arguing about the method name :-)
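A minimal sketch of such a utility function, built from the three operations in the subject line. The name split_sets is a placeholder I've made up here, not a proposal and not an existing API:

```python
def split_sets(set1, set2):
    """Return (items only in set1, items in both, items only in set2).

    Hypothetical helper name. It simply performs the three existing
    set operations, so it adds convenience rather than new capability.
    """
    return set1 - set2, set1 & set2, set2 - set1


only1, both, only2 = split_sets({1, 2, 3}, {3, 4})

# dict key views support the same operators, so they work unchanged.
old = {"a": 1, "b": 2}
new = {"b": 20, "c": 30}
removed, kept, added = split_sets(old.keys(), new.keys())
```

Note that with dict views the three results come back as ordinary sets, since that is what the view operators return.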

[1] Not that I mean to imply that there is necessarily no concrete application for this problem. But being intractable, it is unlikely to be proposed or accepted as a set method.

-- 
Steven