[Python-Dev] Re: PEP 218 (sets); moving set.py to Lib

Raymond Hettinger python@rcn.com
Thu, 29 Aug 2002 10:53:07 -0400

> > 2.  Eliminate the sanity checks in the binary operators that verify 'other' is a BaseSet.  If 'other' isn't a BaseSet, try
> > using it as an iterable, either directly or by coercing it to a set:
> >
> > >>> Set('abracadabra') | 'alacazam'
> > Set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
> >
> > This improves usability because the second argument did not have to be pre-wrapped with Set.  It improves speed, for some
> > operations, by using the iterable directly and not having to build an equivalent dictionary.
> No.  This has been proposed before.  I think it's a bad idea, just as
>    [1,2,3] + "abc"
> is a bad idea.

I see the wisdom in preventing weirdness.  The real motivation was to get sets.py to play nicely with other set implementations.
Right now, it can only interact with instances of BaseSet.  And, even if someone subclasses BaseSet, they currently *must*
have a self._data attribute that is a dictionary.  This prevents non-dictionary-based extensions.
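
A rough sketch of the interoperability point, not the actual sets.py code: SketchSet and ListBackedSet are hypothetical names
invented for illustration.  The union operator relies only on 'other' being iterable, so a set implementation with no
dictionary-valued _data can still take part.

```python
# Hypothetical sketch; SketchSet and ListBackedSet are illustrative names,
# not classes from sets.py.

class SketchSet:
    """Dictionary-backed set whose union accepts any iterable."""

    def __init__(self, iterable=()):
        # The dictionary keys are the set elements; the values are unused.
        self._data = dict.fromkeys(iterable, True)

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __or__(self, other):
        # No isinstance check: only require that 'other' can be iterated,
        # so it does not need a dictionary-valued _data attribute.
        result = SketchSet(self)
        for element in other:
            result._data[element] = True
        return result


class ListBackedSet:
    """A set implementation that keeps its elements in a plain list."""

    def __init__(self, iterable=()):
        self._items = []
        for element in iterable:
            if element not in self._items:
                self._items.append(element)

    def __iter__(self):
        return iter(self._items)


combined = SketchSet('abc') | ListBackedSet('cde')
print(sorted(combined))
```

The same lack of a type check is what lets SketchSet('abc') | 'cde' through, which is the weirdness objected to above; a
narrower design would need some other way to recognize foreign set implementations.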

> > 3.  Have ImmutableSet keep a reference to the original iterable.  Add an ImmutableSet.refresh() method that rebuilds ._data
> > from the iterable.  Add a Set.refresh() method that triggers ImmutableSet.refresh() where possible.  The goal is to improve the
> > usability of sets of sets where the inner sets have been updated after the outer set was created.
> >
> > >>> inner = Set('abracadabra')
> > >>> outer = Set([inner])
> > >>> inner.add('z')                 # now the outer set is out-of-date
> > >>> outer.refresh()               # now it is current
> > >>> outer
> > Set([ImmutableSet(['a', 'c', 'r', 'z', 'b', 'd'])])
> >
> > This would only work for restartable iterables -- a file object would not be so easily refreshed.
> This *appears* to be messing with the immutability.  If I wrote:
>   a = range(3)
>   s1 = ImmutableSet(a)
>   s2 = Set([s1])
>   a.append(4)
>   s2.refresh()
> What would the value of s1 be?

Hmm, I intended to have s1.refresh() return a new object for use in s2 while leaving s1 alone (being immutable and all).  Now, I
wonder if that was the right thing to do.  The answer lies in use cases for algorithms that need sets of sets.  If anyone knows
of any off the top of their head, that would be great; otherwise, I seem to remember that some of that business was found in compiler
algorithms and graph packages.
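
A minimal sketch of the behavior described above, where refresh() on the inner set returns a new object and leaves the
original alone.  FrozenSketch and OuterSketch are hypothetical stand-ins for ImmutableSet and Set, and the built-in
frozenset and set types are used here only as convenient storage; none of this is the sets.py implementation.

```python
# Hypothetical sketch; FrozenSketch and OuterSketch are illustrative names.

class FrozenSketch:
    """Immutable set that remembers the iterable it was built from."""

    def __init__(self, source):
        self._source = source            # must be a restartable iterable
        self._data = frozenset(source)   # snapshot taken at creation time

    def refresh(self):
        # Build a new object from the current contents of the source.
        # self is not modified, so it stays immutable.
        return FrozenSketch(self._source)

    def __iter__(self):
        return iter(self._data)

    def __hash__(self):
        return hash(self._data)

    def __eq__(self, other):
        return isinstance(other, FrozenSketch) and self._data == other._data


class OuterSketch:
    """Mutable set of FrozenSketch members."""

    def __init__(self, members):
        self._members = set(members)

    def refresh(self):
        # Replace each member with a freshly rebuilt copy.
        self._members = set(member.refresh() for member in self._members)

    def __iter__(self):
        return iter(self._members)


a = list(range(3))
s1 = FrozenSketch(a)
s2 = OuterSketch([s1])
a.append(4)
s2.refresh()
print(sorted(s1))                         # s1 keeps its original snapshot
print([sorted(member) for member in s2])  # s2 holds the rebuilt member
```

Under this reading, s1 still contains 0, 1 and 2 after the refresh, while s2 holds a different object containing 0, 1, 2
and 4, so s1 is no longer a member of s2.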