[Python-Dev] Efficient set complement and operation on large/infinite sets.

Raymond Hettinger rhettinger at ewtllc.com
Fri May 12 02:34:16 CEST 2006


>Guido> Hm...  Without reading though all this, I expect that you'd be
>Guido> better off implementing this for yourself without attempting to pull
>Guido> the standard library sets into the picture (especially since sets.py
>Guido> is obsolete as of 2.4; set and frozenset are now built-in types).
>Guido> You're really after rather specialized set representations. I've
>Guido> done this myself and as long as you stick to solving *just* the
>Guido> problem at hand, it's easy. If you want a general solution, it's
>Guido> hard...
>
>I certainly would be better off in the short term, and probably the long
>term too. It's likely what I'll do in any case as it's much, much quicker,
>I only need a handful of the set operations, and I don't need to talk to
>anyone :-)
>
>I don't think I'm proposing something specialized. Set complement is
>something one learns in primary school. It's just difficult to provide in
>general, as you say.
>
>Aside from my own itch, which I know how to scratch, my question is whether
>it's worth trying to work this into Python itself. Sounds like not.
>  
>
There's room in the world for alternate implementations of sets, each 
with its own strengths and weaknesses. For example, bitsets potentially 
offer some speed and space economies for certain apps. Alternatve 
implementations will most likely start-off as third-party extension 
modules, and if they prove essential, they may make it to the 
collections module.

As for the built-in set types, I recommend leaving those alone and 
keeping a API as simple as possible.

The __rand__ idea is interesting but I don't see how to implement an 
equiprobable hash table selection in O(1) time -- keep in mind that a 
set may be very sparse at the time of the selection.


Raymond


More information about the Python-Dev mailing list