[Python-Dev] PEP 218 (sets); moving set.py to Lib
Raymond Hettinger
python@rcn.com
Tue, 20 Aug 2002 19:23:30 -0400
[GvR]
> > > I am still perplexed that I received *no* feedback on the sets module
> > > except on this issue of sort order (which I consider solved by adding
> > > a method _repr() that takes an optional 'sorted' argument).
[RH]
> > P.S. More comments are on the way as we play with, profile, review,
> > optimize, and document the module ;)
[GvR]
> Didn't you submit a SF patch/bug? I think I replied to that.
Yes. I've now revised the patch accordingly.
More thoughts:
1. Rename .remove() to __del__(). Its usage is inconsistent with list.remove(element) which can leave other instances of element
in the list. It is more consistent with 'del adict[element]'.
2. discard() looks like a useful standard API. Perhaps it should be added to the dictionary API.
3. Should we add .as_temporarily_immutable to dictionaries and lists so that they will also be potential elements of a set?
4. remove(), update(), add(), and __contains__() all work hard to check for .as_temporarily_immutable(). Should this be propagated
to other methods that add set members (i.e., replace all instances of data[element] = value with self.add(element) or use
self.update() in the code for __init__())?
The answer is tough because it causes an enormous slowdown in the common use case of uniquifying a sequence. OTOH, why check in
some places but not others -- why is .add(aSetInstance) okay but not Set([aSetInstance])?
If the answer is yes, then the code for update() should be super-optimized by moving the try/except outside the for-loop
and wrapping the whole thing in a while 1. Also, we could bypass the slower .add() method when the incoming source of elements is
known to be an instance of BaseSet.
5. Add a quick pre-check to issubset() and issuperset() along the lines of:

    def issubset(self, other):
        """Report whether another set contains this set."""
        self._binary_sanity_check(other)
        if len(self) > len(other): return False  # Fast check for the obvious case
        for elt in self:
            if elt not in other:
                return False
        return True
6. For clarity and foolish consistency, replace all occurrences of 'elt' with 'element'.
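
A rough sketch of the update() restructuring from point 4, in case it helps the discussion. This is not the module's code: update_data(), ListElement and the as_immutable() hook are made-up names standing in for the real method and protocol, and a plain dict stands in for the set's internal storage. The idea is that the fast path (hashable elements) runs inside a bare for-loop, and the try/except is only entered again after an element fails to hash.

```python
def update_data(data, iterable):
    """Add every element of iterable as a key of the dict `data`.

    Hypothetical helper: the try/except sits outside the for-loop, so
    hashable elements pay no per-element exception-handling setup.
    """
    it = iter(iterable)
    element = None  # keeps the except clause safe if the iterator itself fails
    while True:
        try:
            for element in it:
                data[element] = True
            return  # iterator exhausted without error
        except TypeError:
            # Unhashable element: ask it for an immutable stand-in.
            # 'as_immutable' is an assumed hook name for this sketch.
            transform = getattr(element, "as_immutable", None)
            if transform is None:
                raise  # genuinely unusable element
            data[transform()] = True
            # Loop again; `it` resumes after the offending element.


class ListElement(list):
    """Toy unhashable element that can supply a hashable copy of itself."""

    def as_immutable(self):
        return tuple(self)


storage = {}
update_data(storage, [1, ListElement([2, 3]), 4])
```

Because the same iterator object is reused across passes of the while loop, each element is consumed exactly once, and the cost of the immutability check is paid only for elements that actually need it.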
Raymond Hettinger