1. Rename .remove() to __del__(). Its usage is inconsistent with list.remove(element) which can leave other instances of element in the list. It is more consistent with 'del adict[element]'.
(You mean __delete__.) -1. Sets don't support the x[y] notation for accessing or setting elements, so it would be weird to use that for deleting. You're not deleting the value corresponding to the key or index y (like you are when using del x[y] on a list or dict), you're deleting y itself. That's more like x.remove(y) for lists.
2. discard() looks like a useful standard API. Perhaps it shoulds be added to the dictionary API.
Perhaps. But the dict API already has so many things. And why not to the list API? I'm -0 on this.
3. Should we add .as_temporarily_immutable to dictionaries and lists so that they will also be potential elements of a set?
Hm, I think this is premature. I'd like to see a use case for a set of dicts and a set of lists first. Then you can code list and dict subclasses in Python that implement _as_temporarily_immutable (and _as_immutable too, I suppose). Then we'll see how often this ends up being used. For now, I'd say YAGNI. (I've said YAGNI to sets for years, but the realization that as a result lots of people independently invented using the keys of a dict to represent a set made me change my mind. Sort of like how I changed my mind on a Boolean type, too, after 12 years of thinking it wasn't needed. :-)
4. remove(), update(), add(), and __contains__() all work hard to check for .as_temporarily_immutable(). Should this propagated to other methods that add set members(i.e. replace all instances of data[element] = value with self.add(element) or use self.update() in the code for __init__())?
I've been thinking the same thing. I think that the only case where this could apply is __init__(), by the way.
The answer is tough because it causes an enormous slowdown in the common use cases of uniquifying a sequence. OTOH, why check in some places but not others -- why is .add(aSetInstance) okay but not Set([aSetInstance]).
Really? Why the slowdown? I was thinking of simply changing __init__ into if seq is not None: self.update(seq) If that's too slow, perhaps update() could be changed to the following: it = iter(seq) try: for elt in it: data[elt] = value except TypeError: transform = getattr(elt, '_as_immutable', None) if transform is None: raise data[transform()] = value self.update(it) That is, if there are no elements that require transformation, the added cost is a single try/except setup (plus an extra call to iter()). If any element requires transformation, the rest of the elements are dealt with as fast as update() can. Hm, maybe this could be applied to update() too (except it shouldn't call itself recursively but simply write the loop out a second time, with a try/except around each element).
If the answer is yes, then the code for update() should be super-optimized by taking moving the try/except outside the for-loop and wrapping the whole thing in a while 1.
That's a similar idea as I just sketched. Can you email me a proposed patch? (Let's skip SF for this.)
Also, we could bypass the slower .add() method when incoming source of elements is known to be an instance of BaseSet.
Huh? Nobody calls add() internally AFAIK.
5. Add a quick pre-check to issubset() and issuperset() along the lines of:
def issubset(self, other): """Report whether another set contains this set.""" self._binary_sanity_check(other) if len(self) > len(other): return False # Fast check for the obvious case for elt in self: if elt not in other: return False return True
Sure. Check it in.
6. For clarity and foolish consistency, replace all occurrences of 'elt' with 'element'.
Hm, no. 'element' for a loop control variable seems too long (I'd be happy with 'x' but Greg Wilson used 'element'). However I like 'element' as the argument name because it can be used as a keyword argument and then it's better spelled out in full. I think Greg Wilson used 'item' most of the time; I prefer to be consistent and say 'element' all the time since that's the accepted set terminology. --Guido van Rossum (home page: http://www.python.org/~guido/)