[Python-Dev] I think my set module is ready for prime time; comments?
Greg Wilson
gvwilson@nevex.com
Tue, 23 Jan 2001 15:26:22 -0500
Greg Wilson:
Meta-question: do people want to continue to discuss sets on the
general python-dev list, or take it out-of-line (e.g. to an egroups
list)? I'm finding all of the discussion very useful, but I realize
that many readers might prefer to concentrate on the 2.1 release...
> Jeremy Hylton <jeremy@alum.mit.edu>:
> > The tests showed that dictionary-based sets were always faster.
> > small tests (3 operations), the difference was about 10 percent.
> > larger tests (88 operations), the difference ranged from
> > 180 to almost 700 percent.
> Eric Raymond <esr@thyrsus.com>:
> Not surprising. 88 elements is getting pretty large.
Greg Wilson:
Really? I was testing my implementation with sets of email addresses
grep'd out of old mail folders --- typical sizes were several thousand
elements.
> From: Christopher Petrilli <petrilli@amber.org>
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academicaly. Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.
Greg Wilson:
I had been expecting to implement this in C, not in pure Python, for
performance.
> From: Christopher Petrilli <petrilli@amber.org>
> In the "scripting" problem domain, I would agree that Sets would
> rarely reach large sizes,
> and so a algorithm which performed in quadratic time might be fine,
Greg Wilson:
I strongly disagree (see the email address example above --- it was
the first thing that occurred to me to try). I am still hoping to
find a sub-quadratic (preferably sub-linear) implementation. I can
do it in C++ with observer/observable (contained items notify containers
of changes in value, sets store all equivalent items in the same bucket),
but that doesn't really help...
> From: Ka-Ping Yee <ping@lfw.org>
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries...
and:
> From: Neil Schemenauer <nas@arctrix.com>
> ...if sets are added to the core...we would
> use the implementation of PyDict but drop the values.
Unfortunately, if values are required to be immutable, then sets of
sets aren't possible... :-(
Thanks, everyone,
Greg