[Python-Dev] Efficient set complement and operation on large/infinite sets.
terry at jon.es
Fri May 12 09:57:39 CEST 2006
>>>>> "Raymond" == Raymond Hettinger <rhettinger at ewtllc.com> writes:
Raymond> There's room in the world for alternate implementations of sets,
Raymond> each with its own strengths and weaknesses.
Raymond> Alternatve implementations will most likely start-off as
Raymond> third-party extension modules
Raymond> As for the built-in set types, I recommend leaving those alone and
Raymond> keeping a API as simple as possible.
Agreed. I implemented most of what I need last night - not surprisingly in
less time than it took to write the original mail.
Raymond> The __rand__ idea is interesting but I don't see how to implement
Raymond> an equiprobable hash table selection in O(1) time -- keep in mind
Raymond> that a set may be very sparse at the time of the selection.
I think the rand issue is enough to make the proposal questionable.
Another thing I realized which I think adds to the argument that this
should be separate, is that it's not so straightforward to deal with
operations on sets that have different universes. You can do it, but it's
messy, not elegant, and not fun (and no-one is asking for it, not even me).
This leads to a much nicer approach in which you have a separate module
that provides a UniversalSet class. When you call the constructor, you
tell it whatever you can about what the universe looks like: how big it is,
how to generate random elements, how to exclude things, etc. This class
provides a simple (a la Python set.__init__) method that hands back a set
within this universe. You're then free to operate on that set, and other
instances provided by UniversalSet, as usual, and you get to do complement.
It's easy to imagine subclasses such as InfiniteSet, FiniteSet, etc.
Python's sets are the case where nothing is known about the universe (aside
from implementation details, like elements being hashable). The various
different kinds of UniversalSet subclasses would provide methods according
to what was known about their universes.
That would clearly be a 3rd party extension.
More information about the Python-Dev