[Python-Dev] I think my set module is ready for prime time; comments?
Christopher Petrilli
petrilli@amber.org
Tue, 23 Jan 2001 14:06:05 -0500
Eric S. Raymond [esr@thyrsus.com] wrote:
> I believe most uses of sets are small sets. That reduces the speed ouch
> of using a list representation and increases the proportional memory
> ouch of a dictionary implementation.
The problem is that there are a lot of uses for large sets, especially
when you begin to introduce intersections and unions. If an
implementation is only useful for a few dozen (or a hundered) items in
the set, that eliminates a lot of places where the real use of set
types is useful---optimizing large scale manipulations.
Zope for example, manipulates sets with 10,000 items in it on a
regular basis when doing text index manipulation. The data structures
are heavily optimized for this kind of behaviour, without a major
sacrifice in space. I think Jim perhaps can talk to this.
Unfortunately, for me, a Python implementation of Sets is only
interesting academicaly. Any time I've needed to work with them at a
large scale, I've needed them *much* faster than Python could achieve
without a C extension.
Perhaps the difference is in problem domain. In the "scripting"
problem domain, I would agree that Setswould rarely reach large sizes,
and so a algorithm which performed in quadratic time might be fine,
because the actual resultant time is small. However, in more
full-blown applications, this would be counter productive, and the
user would be forced implement their own (or use Aaron's excellent
kjBuckets).
Just my opinion, of course.
Chris
--
| Christopher Petrilli
| petrilli@amber.org