[Python-ideas] universal set object for use in set manipulation

David Borowitz ddborowitz at gmail.com
Thu Jul 23 18:19:13 CEST 2009


Terry Jones briefly mentioned support for very large sets via complements,
which I think would solve many of the issues you bring up.

On Thu, Jul 23, 2009 at 08:26, Steven D'Aprano <steve at pearwood.info> wrote:

> On Thu, 23 Jul 2009 04:28:33 pm Andy Kish wrote:
> > Hi all,
> >
> > I think the addition of a universal set object would be a nice touch
> > for python sets.
>
> But you can't have a universal set until you know what the problem
> domain is. If the problem domain is citrus fruits, then the universal
> set is: {lemons, limes, grapefruits, oranges, mandarins, tangelos,
> kumquats, ... }
>
> If the problem domain is Presidents of the USA, then the universal set
> is: {Washington, Lincoln, Bush, Clinton, Obama, Wilson, ... }
>
> If the problem domain is integers, then it is {0, 1, -1, 2, -2,
> 3, -3, ...}
>
> The first two are finite, the third is infinite.
>

The domain is inherently all objects that can be put into a Python set, so
the universal set is all Python objects. (I think for the purposes of Python
we could gloss over the set-theoretical paradoxiness of "set.universal() in
set.universal()" returning True :)

> Manipulation of collections of sets becomes much
> > easier with it. Right now, certain operations are easy because empty
> > sets are easy to create while the dual operation requires special
> > cases:
> >
> >     set_union = set()
> >     for s in sets:
> >         set_union |= s
>
> Better written as:
>
> >>> reduce(operator.or_, [set([1, 2, 3]), set([2, 4, 5])])
> set([1, 2, 3, 4, 5])
>
>
> >     # this doesn't work
> >     set_intersection = set(???)
> >     for s in sets:
> >         set_intersection &= s
>
> But this does:
>
> >>> reduce(operator.and_, [set([1, 2, 3]), set([2, 4, 5])])
> set([2])
>
>
> > Implementation of a universal set would be pretty trivial.
>
> You think so? I can think of a number of problems.
>
> (1) What elements should go into the universal set? Strings, ints,
> floats, presidents... ?
>

There are two ways of defining what is "in" a set. One is what
iter(set.universal()) iterates through, which there is no good answer for.
But the other (arguably just as common) way to define what is "in" a set s
is "the set of Python objects p such that "p in s" returns True." This
latter definition is trivial to implement for the universal set. If we keep
track of the complement of a nearly-universal set, that also gives us an
easy way to implement in.

(2) What is len(set.universal())?
>

inf. Perhaps there is a complication since inf is technically a float type?
I don't see it being a huge deal. ("Real" universal sets have uncountably
many elements, whereas AFAICT floating-point inf is countable; since we have
finite-memory systems I can't imagine it makes a difference.)

(3) What does set.universal() print?
>

Some special-case string; maybe "set(U)" for the universal set, or
"set(U-[elements,in,complement])" for general nearly-infinite sets.

(4) What does set.universal().clear() do?
>

set.universal() doesn't have to be a singleton; it could return a brand new,
mutable universal set. So set.universal() returns a new set, and .clear()
clears a set regardless of what it contains.

(5) For that matter, union(), copy(), difference(), issubset(), etc?
>

I think these all fall pretty easily out of the definition of a universal
set and a complement representation of nearly infinite sets.

(The universal set for strings of length one is a subset of the
> universal set for all strings.)
>

The universal set is a superset of all of these: it is the set of all Python
objects. (Yes, that's paradoxical, but again, I don't think that matters for
most use cases in Python.)

I don't think there is a suitable solution for all of these issues. That
> would mean that set.universal() can't be a real set object, it has to
> be some sort of not-quite-a-set object that doesn't provide the entire
> set interface.
>
>
Storing the complement of an infinite set is pretty straightforward way of
implementing sets that are universal except for a few (maybe zero) elements,
and can satisfy the entire set interface except for iteration. As you
suggest, it would be far trickier to provide general infinite sets, like the
set of all integers, but that's not what's being asked for here.

I'm not saying it won't be tricky to implement; it requires changing pretty
much all set methods, I've never touched CPython code. I'm also roughly +0
on the issue (not that my vote counts since I lurk far more than I
contribute). I'm just pointing out it's possible.


>
>
> --
> Steven D'Aprano
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
It is better to be quotable than to be honest.
   -Tom Stoppard

Borowitz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20090723/4078d427/attachment.html>


More information about the Python-ideas mailing list