[Python-Dev] PEP 218 (sets); moving set.py to Lib
Guido van Rossum
Tue, 20 Aug 2002 16:49:25 -0400
> > I am still perplexed that I received *no* feedback on the sets module
> > except on this issue of sort order (which I consider solved by adding
> > a method _repr() that takes an optional 'sorted' argument).
> I haven't read the entire thread, but I was puzzled by the implementation
> approach. Did you consider kjbuckets for the standard Python distribution?
No. I think that would be the wrong idea at this point for two
reasons: (1) never change two variables at the same time; (2) let's
gather some experience with the new set API first, before we start
worrying about implementation speed.
I also believe that kjbuckets maintains its data in a sorted order,
which is unnecessary for sets -- a hash table is much faster. After
all, we use a very fast hash table implementation to represent sets.
(The only improvement would be that we could save maybe 4 bytes per
hash table entry because we don't need a value pointer.)
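To illustrate the point about representing sets with a hash table, here is a minimal sketch, not the actual sets module code. The class name DictSet and its method bodies are assumptions made up for illustration; only the idea of a _repr() method with an optional 'sorted' argument comes from the discussion above.

```python
class DictSet:
    """Hypothetical dict-backed set, for illustration only."""

    def __init__(self, iterable=()):
        # Elements are stored as dict keys; the value is an unused
        # placeholder (the "value pointer" a dedicated set would not need).
        self._data = {}
        for element in iterable:
            self._data[element] = True

    def __contains__(self, element):
        # Membership is a single hash lookup, no ordering required.
        return element in self._data

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)

    def union(self, other):
        result = DictSet(self)
        result._data.update(other._data)
        return result

    def intersection(self, other):
        # Iterate over the smaller operand and probe the larger one.
        if len(self) <= len(other):
            small, large = self, other
        else:
            small, large = other, self
        return DictSet(e for e in small if e in large)

    def _repr(self, sorted=False):
        # Sorting is applied only on request, for display purposes.
        elements = list(self._data)
        if sorted:
            elements.sort()
        return "%s(%r)" % (type(self).__name__, elements)


a = DictSet([3, 1, 2])
b = DictSet([2, 3, 4])
print(a.union(b)._repr(sorted=True))
print(a.intersection(b)._repr(sorted=True))
```

In this sketch the dict does nearly all the work: membership, union and intersection are hash lookups and updates, and sorted order only matters when a readable repr is wanted.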
> While the claim is rather old, the following quote from Aaron's
> intro to the module suggests it might improve performance:
> For suitably large compute intensive uses these types should
> provide up to an order of magnitude speedup versus an
> implementation that uses analogous operations implemented
> directly in Python.
The sets module does not implement analogous operations directly in
Python. Almost all the implementation work is done by the dict
> Adding the gadfly SQL database to the standard library would also be
> useful, but since it is back under development it would be best for
> gadfly to live on a separate release cycle. The kjbuckets software,
> however, doesn't seem to be changing.
Because nobody is maintaining it any more.
> One more reason for adding kjbuckets: Tim Berners-Lee might find the
> kjGraphs class useful for the semantic web work.
>  http://starship.python.net/crew/aaron_watters/kjbuckets/kjbuckets.html
kjbuckets may be nice, but adding it to the core would add a serious
new maintenance burden for the core developers. I don't see anyone
raising their hand to help out here.
--Guido van Rossum (home page: http://www.python.org/~guido/)