
On Fri, 24 Jul 2009 02:22:47 am Georg Brandl wrote:
(1) What elements should go into the universal set? Strings, ints, floats, presidents... ?
Everything. Every possible (i.e. hashable) object is a (virtual) member of the universal set. You're thinking too mathematically here. Python has no notion of a category of elements for ordinary sets as well -- you can put a president in a set together with integers without problems.
No, I'm not thinking mathematically, I'm thinking concretely. A set that contains "everything" is an abstract concept. For concrete problem solving, the universe is (nearly always) smaller than "everything". It's everything within the problem domain, not everything imaginable.
(2) What is len(set.universal())?
Since it was apparently decided that len() can "lie", sys.maxsize would be a nice value. Otherwise, an exception.
Both are problematic. Firstly, if it raises an exception, then it means all set handling code needs to be written (e.g.): try: len(s) except Exception: # handle universal set print "What do I do here?" instead of the simple: len(s) The same for anything which iterates over a set, or calls pop() or remove(). That's bad -- we're complicating all set-handling code, just to make *one* special case a tiny bit not-really-simpler (actually more complicated -- the solution with reduce is simpler than the suggested idiom). Secondly, if it returns sys.maxint, that's not as bad, but it's still wrong -- sys.maxint is being used as a sentinel, a "magic value", rather than actually being the length. If you count the elements that are in the set, you get more than the length: n = 0 U = set.universal() for i in xrange(-sys.maxint, sys.maxint): # this may take a while... if i in U: n += 1 assert n > len(U) To me, having len() be accurate is *far* more important than being able to supposedly simplify (complexify?) a one-liner using reduce into a three-liner using set.universal(). Having a built-in type lie about its length gives me the willies. [...]
Of course, it could not be an ordinary set object (i.e. one actually containing references to its members) -- that would require an infinite amount of memory. However, it *can* implement much of the set interface without a problem.
Fair enough, but it's the parts that it can't implement which are critical. It's not a set, it's something which is nearly a set but is supposed to be used in combination with real sets. Any time you write code that expects sets, you have to allow for the risk that you might be given a not-really-a-set universal set instead. -- Steven D'Aprano