Py2.3: Feedback on Sets

Jimmy Retzlaff jimmy at
Tue Aug 12 13:15:50 CEST 2003

Raymond Hettinger (vze4rx4y at wrote:
>* Are you overjoyed/outraged by the choice of | and &
>   as set operators (instead of + and *)?

I never even thought about it until I saw this question. Either set of
operators makes sense to me.
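
For anyone weighing the two spellings, here is a minimal sketch of the operators in question. It assumes the built-in set type of later Python versions as a stand-in for sets.Set from 2.3, so that the snippet runs today; the names evens and primes are made up for illustration.

```python
# Assumption: the built-in set stands in for sets.Set from Python 2.3.
evens = {2, 4, 6, 8}
primes = {2, 3, 5, 7}

union = evens | primes          # elements in either set
intersection = evens & primes   # elements in both sets
difference = evens - primes     # elements only in evens
symmetric = evens ^ primes      # elements in exactly one of the two

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(symmetric))
```

The | and & spellings mirror the bitwise or/and operators, which is one way to remember which is union and which is intersection.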

>* Is the support for sets of sets necessary for your work
>   and, if so, then is the implementation sufficiently
>   powerful?

I have not used a set of sets.

>* Is there a compelling need for additional set methods like
>   Set.powerset() and Set.isdisjoint(s) or are the current
>   offerings sufficient?

It's been sufficient for my needs so far.

>* Does the performance meet your expectations?

My expectations have been met for everything I've thrown at it so far.
The biggest set I've dealt with so far was several hundred thousand
tuples, each containing 2 floats. For massive data sets (tens of
millions) requiring high performance I've used kjBuckets in the past. I
haven't had a chance to compare the performance of kjBuckets and sets
yet, but kjBuckets will be going away for me if the performance of sets
is even close.

>* Do you care that sets can only contain hashable elements?

Nope, everything I tend to use is either numeric/string types or tuples
of those types.
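
A short sketch of what the hashability constraint means in practice, again assuming the built-in set as a stand-in for sets.Set. The variable names and values are hypothetical.

```python
# Assumption: the built-in set stands in for sets.Set from Python 2.3.
points = set()
points.add((1.5, 2.0))   # a tuple of two floats is hashable
points.add((1.5, 2.0))   # an equal tuple is a duplicate and is not added again
points.add((3.0, 4.5))

# A list is mutable and therefore unhashable, so it is rejected.
try:
    points.add([5.0, 6.0])
    list_accepted = True
except TypeError:
    list_accepted = False

print(len(points), list_accepted)
```

Converting a list to a tuple before adding it is the usual way around the restriction.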

>* How about the design constraint that the argument to most
>   set methods must be another Set (as opposed to any iterable)?

This caught me the first time or two, but it immediately seemed

>* Are the docs clear?  Can you suggest improvements?

I haven't read them. I pulled the sets module out of CVS a while ago for
use with Python 2.2.x, so I used the source as docs. I haven't even
looked at the source for quite a while, as things seem pretty obvious to
me at this point.

>* Are sets helpful in your daily work or does the need arise
>   only rarely?

The need for them arises a couple of times a week for me. I do a fair
amount of data manipulation in SQL databases with Python. There are
things that are natural in SQL and other things that are natural in
Python. Having an easy-to-use sets type affords a greater overlap
between SQL and Python, which gives me more choices when deciding how to
manipulate my data.
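
As a rough illustration of that overlap, the sketch below does the same filtering once with Python sets and once in SQL. It is only an example under stated assumptions: it uses the standard sqlite3 module with an in-memory database and the built-in set type (neither is what was available with 2.2.x), and the orders and refunds tables and their contents are hypothetical.

```python
import sqlite3

# Hypothetical in-memory tables, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT)")
conn.execute("CREATE TABLE refunds (customer TEXT)")
conn.executemany("INSERT INTO orders VALUES (?)", [("ann",), ("bob",), ("cy",)])
conn.executemany("INSERT INTO refunds VALUES (?)", [("bob",), ("dee",)])

# Pull each column into a Python set.
ordered = {row[0] for row in conn.execute("SELECT customer FROM orders")}
refunded = {row[0] for row in conn.execute("SELECT customer FROM refunds")}

# The same questions asked on the Python side...
kept = ordered - refunded   # counterpart of SQL EXCEPT
both = ordered & refunded   # counterpart of SQL INTERSECT

# ...and on the SQL side, for comparison.
sql_kept = {
    row[0]
    for row in conn.execute(
        "SELECT customer FROM orders EXCEPT SELECT customer FROM refunds"
    )
}
conn.close()

print(sorted(kept), sorted(both), kept == sql_kept)
```

Which side to do the work on then becomes a matter of convenience: small result sets are easy to combine in Python, while large ones may be better left to the database.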

Thanks to everyone involved for the good work.

