[Python-ideas] Implement __add__ for set and frozenset

Brandon Mintern bmintern at gmail.com
Wed Jun 4 05:37:28 CEST 2008


Just realized that I failed to send this to the list as well:

On Tue, Jun 3, 2008 at 3:48 PM, Raymond Hettinger <python at rcn.com> wrote:
> That's silly.  Lots of functions do odd things with random argument
> ordering:
>
> >>> s = set([9, 3])
> >>> int.__sub__(*s)
> 6
>
> Besides, you can already run the sample fragment in Py2.5:
>
> >>> sos = set([frozenset([1, 2]), frozenset([2, 3])])
> >>> frozenset.difference(*sos)
> frozenset([1])
>
>
> Raymond

Right, but that's why these functions do not accept more than two
arguments. They are intended to be used as instance methods only. If
we're promoting the idea of a set.method(*args) usage, however, the
usage should probably be intuitive. Because they are associative and
commutative, union, intersection, and symmetric_difference all give
the same result no matter how the arguments are ordered. That is not
true of set-difference.
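
To make the ordering point concrete, here is a rough sketch in current
Python 3 (not the Py2.5 discussed in this thread). The helper name
distinct_results and the sample sets are made up for illustration; the
idea is just to fold each binary operator over every ordering of the
arguments and see how many different answers come out.

```python
from functools import reduce
from itertools import permutations
import operator


def distinct_results(op, sets):
    """Fold `op` left to right over every ordering of `sets`
    and collect the distinct results (hypothetical helper)."""
    return {frozenset(reduce(op, order)) for order in permutations(sets)}


# Sample data, chosen only for illustration.
sets = [frozenset([1, 2]), frozenset([2, 3]), frozenset([3, 4])]

union_results = distinct_results(operator.or_, sets)          # |
intersection_results = distinct_results(operator.and_, sets)  # &
symdiff_results = distinct_results(operator.xor, sets)        # ^
difference_results = distinct_results(operator.sub, sets)     # -

# The first three collapse to a single answer; difference does not.
print(len(union_results), len(intersection_results),
      len(symdiff_results), len(difference_results))
```

With these particular sets, only the difference fold produces more than
one answer, one per choice of first argument.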

In other words, it looks to me like set.method(*args) is trying to say
"Take all of the elements in these iterables and make one set out of
them," or more simply, "Throw all this crap together." Intuitively, it
_shouldn't_ matter what order the arguments are in. What is the
meaning of taking the set-difference of a bunch of sets? Should we
promote an operation that doesn't make any sense?

In mathematics, there are symbols for set.union(*args) (big-U) and
set.intersection(*args) (big-upside-down-U), because they actually
come up in common usage. I'm not aware of any such symbols for other
set operations. Now that doesn't necessarily mean we shouldn't support
them, but it is certainly something to think about.

To take it from another angle, it is easy to define:

set.union(*args) - the set of all the elements appearing in at least
one of the args

set.intersection(*args) - the set of all the elements appearing in every arg

set.symmetric_difference(*args) - the set of all the elements
appearing in an odd number of arguments

but:

set.difference(*args) - the set of all elements appearing in the first
arg but not any of the rest

is fundamentally different. When using set operations, ordering
shouldn't even be a consideration. However,

A.difference(*args) - the set of all elements in A that do not appear
in any of the args

is well-defined. For that reason, I say that we should support *args
for all set operations, but we should only promote the use of
set.method syntax for intersection and union. The result of
set.difference(*args) depends on which argument happens to come first,
and set.symmetric_difference(*args) doesn't seem very useful (and
promoting it could lead people to use set.difference the same way).
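
As a sanity check on the definitions above, here is a small sketch in
current Python 3. It assumes nothing beyond the standard library;
membership_counts and the sample data are hypothetical names used only
for illustration. Since today's symmetric_difference method takes a
single argument, the multi-argument version is emulated by folding the
^ operator.

```python
from collections import Counter
from functools import reduce
import operator


def membership_counts(args):
    """For each element, count how many of the argument sets contain it
    (hypothetical helper)."""
    return Counter(x for s in args for x in set(s))


# Sample data, chosen only for illustration.
args = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
counts = membership_counts(args)

# The three order-independent definitions, written as counting rules.
in_at_least_one = {x for x, n in counts.items() if n >= 1}
in_every = {x for x, n in counts.items() if n == len(args)}
in_odd_number = {x for x, n in counts.items() if n % 2 == 1}

# The same results via the set operations themselves.
union_all = set().union(*args)
intersection_all = set.intersection(*args)
symdiff_all = reduce(operator.xor, args)  # emulates a multi-arg symmetric difference

# The instance-method form of difference: A is named explicitly,
# so there is no question about which argument comes first.
A = {1, 2, 3}
remaining = A.difference({2}, {3, 9})
```

The counting rules never look at argument order, which is why they have
no natural counterpart for a difference whose first argument is picked
implicitly.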

So...

+1 supporting *args for all set operations
+1 documenting the usage of set.union(*args) and
set.intersection(*args) as unioning/intersecting all of the arguments
-1 even mentioning set.difference or set.symmetric_difference in static usage

That's my 2c,
Brandon


