Py2.3: Feedback on Sets
ialbert at mailblocks.com
Tue Aug 12 18:24:40 CEST 2003
Raymond Hettinger wrote:
First of all, thanks for the work on it. I need to use sets
in my work all the time. I had written my own
(simplistic) implementation but that adds another layer
of headaches when distributing programs since then
I have to distribute multiple modules.
Sometimes I ended up with a little set function in every
big module. Pretty silly. For me sets are a greatly useful
> * Is the support for sets of sets necessary for your work
> and, if so, then is the implementation sufficiently
One pattern that I constantly need is to remove duplicates from
a sequence. I don't know if this is a common enough pattern to
warrant an API change, but for me it would be most useful if I could
get the contents of a set as a sequence right away, without having to
explicitly code it.
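
For reference, a minimal sketch of the duplicate-removal pattern. It uses the built-in set type of later Python versions as a stand-in for the Python 2.3 sets.Set class discussed here (an assumption), and unique() is a hypothetical helper name, not part of any module.

```python
def unique(seq):
    """Return the distinct items of seq in first-seen order (hypothetical helper)."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

words = ["a", "b", "a", "c", "b"]

# Shortest form: round-trip through a set. The resulting order is not guaranteed.
unordered = list(set(words))

# Order-preserving form, using the set only for membership tests.
ordered = unique(words)
```

The one-liner is enough when order does not matter; the helper is for when it does.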
> * Are you overjoyed/outraged by the choice of | and & as
> set operators (instead of + and *)?
I think that since you have - as a difference operator it
would make sense to also have + as a union operator. Takes nothing
away from |. The & operator is the right one, * would not be appropriate
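
A short sketch of the operators in question, again assuming the built-in set type of later Python versions rather than sets.Set. In that type + is not defined for sets, which the last part checks.

```python
a = set([1, 2, 3])
b = set([3, 4])

union = a | b    # elements in either set
common = a & b   # elements in both sets
only_a = a - b   # elements in a but not in b

# The built-in set type does not define +, so this raises TypeError.
try:
    a + b
    plus_supported = True
except TypeError:
    plus_supported = False
```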
> * Do you care that sets can only contain hashable elements?
I don't really care. On the other hand, it might be better to call the
class HashSet, so that it conveys right away that it uses hashing
to store the elements.
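
To make the hashing restriction concrete, here is a small sketch using the built-in set and frozenset types of later Python versions (an assumption; in the sets module the immutable counterpart is ImmutableSet). A list cannot be an element, while an immutable set can, which is what makes sets of sets possible.

```python
# A list is mutable and unhashable, so it cannot be a set element.
try:
    set([[1, 2], [3]])
    list_allowed = True
except TypeError:
    list_allowed = False

# An immutable set is hashable, so it can be nested inside another set.
outer = set([frozenset([1, 2]), frozenset([3])])

# Membership depends on contents, not on construction order.
found = frozenset([2, 1]) in outer
```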
> * Are the docs clear? Can you suggest improvements?
I wondered whether it would be better to specify the immutability
of the class at the constructor level.
Then there is the update() method. It feels a little bit redundant,
since there is an add() method that seems to do the same thing,
except that add() adds only one element at a time.
Would it be possible to have add() handle all additions, iterable or
not, and then scrap update() altogether?
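
One complication with merging the two, sketched with the built-in set type of later Python versions (an assumption; the post is about sets.Set): an iterable such as a tuple or a string can itself be a legitimate element, so a single add() would have to guess which meaning was intended.

```python
s = set()
s.add((1, 2))       # adds one element: the tuple (1, 2)

t = set()
t.update((1, 2))    # iterates the tuple and adds 1 and 2

u = set()
u.add("ab")         # adds one element: the string "ab"

v = set()
v.update("ab")      # iterates the string and adds "a" and "b"
```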
Then just by looking at the docs, it feels a little bit confusing to
have discard() and remove() do essentially the same thing but only one
of them raising an exception. Which one? I already forgot. I don't know
which one I would prefer though.
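
As a reminder of which is which, a minimal sketch with the built-in set type of later Python versions (assumed here as a stand-in for sets.Set): remove() raises KeyError for a missing element, discard() does nothing.

```python
s = set([1, 2])

# discard() silently ignores an element that is not present.
s.discard(99)

# remove() raises KeyError for an element that is not present.
try:
    s.remove(99)
    raised = False
except KeyError:
    raised = True
```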
Another aspect that I did not understand: what is the difference between
update() and union_update()?
The long-winded method names, such as difference_update(), also feel
redundant when one can achieve the same thing with the -= operator. I
would drop these and instead show in the docs how to accomplish the
same with the operators. That would considerably cut down on the
documentation and the apparent complexity.
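
For comparison, a sketch of the method and operator forms side by side, assuming the built-in set type of later Python versions rather than sets.Set. In that type the two are not fully interchangeable: the method accepts any iterable, while the operator requires a set on the right-hand side.

```python
s = set([1, 2, 3, 4])

s -= set([1, 2])            # operator form, both operands are sets
s.difference_update([3])    # method form, accepts any iterable

# The operator form rejects a plain list.
try:
    s -= [4]
    list_accepted = True
except TypeError:
    list_accepted = False
```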
I'm a big fan of having the minimal number of methods as long as it is
easy to obtain the result.
For example, a method like x.issubset(y) is the same as not (x - y), so it may
not be all that necessary. Just a thought.
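
A quick check of how x.issubset(y) relates to the difference x - y, using the built-in set type of later Python versions (an assumption). The difference is empty, and therefore false, exactly when x is a subset of y, so the equivalent expression is the negation of bool(x - y).

```python
x = set([1, 2])
y = set([1, 2, 3])

subset_method = x.issubset(y)      # True: every element of x is in y
difference_truth = bool(x - y)     # False: x - y is empty
subset_via_difference = not (x - y)
subset_operator = x <= y           # operator spelling of issubset()

# The reverse direction: y is not a subset of x.
reverse_method = y.issubset(x)
reverse_via_difference = not (y - x)
```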
> * Are sets helpful in your daily work or does the need arise
> only rarely?
I use them very often and they are extremely useful.