[Python-Dev] Re: PEP 218 (sets); moving set.py to Lib

Raymond Hettinger python@rcn.com
Thu, 29 Aug 2002 01:23:07 -0400


Ideas for the day:

1. Optimize BaseSet._update(iterable) by checking for two special cases where a C-speed update method is already available and the
entries are known in advance to be immutable:

            . . .
            if isinstance(iterable, BaseSet):
                self._data.update(iterable._data)
                return
            if isinstance(iterable, dict):
                self._data.update(iterable)
                return
            . . .
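
For illustration, here is a minimal, self-contained sketch of how those two fast paths might sit inside an update method. It is not the actual set.py code: SketchSet is a hypothetical stand-in for BaseSet, and it assumes, as the snippet above does, that elements are stored as the keys of a dict called _data.

```python
class SketchSet:
    """Hypothetical stand-in for BaseSet; elements are the keys of self._data."""

    def __init__(self, iterable=None):
        self._data = {}
        if iterable is not None:
            self._update(iterable)

    def _update(self, iterable):
        data = self._data
        # Fast path 1: another set already stores its elements as dict keys,
        # so a single dict.update() call copies them all.
        if isinstance(iterable, SketchSet):
            data.update(iterable._data)
            return
        # Fast path 2: the keys of a dict are already known to be hashable.
        # (The dict's values are copied too, but only the keys matter here.)
        if isinstance(iterable, dict):
            data.update(iterable)
            return
        # General path: insert one element at a time.
        for element in iterable:
            data[element] = True

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
```

Both fast paths hand the whole job to dict.update(), so the per-element loop is only needed for arbitrary iterables.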


2.  Eliminate the sanity checks in the binary operators that verify that 'other' is a BaseSet. If 'other' isn't a BaseSet, try using
it as an iterable, either directly or by coercing it to a set:

>>> Set('abracadabra') | 'alacazam'
Set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])

This improves usability because the second argument does not have to be pre-wrapped with Set.  It also improves speed for some
operations, because the iterable is used directly and no equivalent dictionary has to be built.
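
A rough sketch of what such relaxed operators could look like, assuming the same dict-backed storage. LooseSet is a made-up name for illustration only; the real operators in set.py may be organized differently.

```python
class LooseSet:
    """Hypothetical set whose binary operators accept any iterable."""

    def __init__(self, iterable=()):
        self._data = dict.fromkeys(iterable, True)

    def __iter__(self):
        return iter(self._data)

    def __or__(self, other):
        # Union: copy self, then add each element of `other`.
        # There is no isinstance() check; `other` only needs to be iterable.
        result = LooseSet(self)
        for element in other:
            result._data[element] = True
        return result

    def __and__(self, other):
        # Intersection: walk `other` directly and test membership in self,
        # so no dictionary is ever built for `other`.
        return LooseSet(e for e in other if e in self._data)


print(sorted(LooseSet('abracadabra') | 'alacazam'))
# ['a', 'b', 'c', 'd', 'l', 'm', 'r', 'z']
print(sorted(LooseSet('abracadabra') & 'alacazam'))
# ['a', 'c']
```

Note that this sketch only covers the case where the set is the left operand; handling 'alacazam' | LooseSet(...) would need reflected methods such as __ror__.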


3.  Have ImmutableSet keep a reference to the original iterable.  Add an ImmutableSet.refresh() method that rebuilds ._data from
the iterable.  Add a Set.refresh() method that triggers ImmutableSet.refresh() where possible.  The goal is to improve the
usability of sets of sets where the inner sets have been updated after the outer set was created.

>>> inner = Set('abracadabra')
>>> outer = Set([inner])
>>> inner.add('z')                 # now the outer set is out-of-date
>>> outer.refresh()               # now it is current
>>> outer
Set([ImmutableSet(['a', 'c', 'r', 'z', 'b', 'd'])])

This would only work for restartable iterables -- a file object would not be so easily refreshed.
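
One possible shape for the refresh idea, sketched under several assumptions: FrozenSketch and OuterSketch are hypothetical names, the inner set keeps its source in an attribute called _source, and the outer set rebuilds its own dictionary after refreshing because the hashes of its members may have changed.

```python
class FrozenSketch:
    """Hypothetical immutable set that remembers the iterable it was built from."""

    def __init__(self, iterable):
        self._source = iterable              # kept so refresh() can re-iterate it
        self._data = frozenset(iterable)

    def refresh(self):
        # Rebuild the contents from the original iterable.
        self._data = frozenset(self._source)

    def __hash__(self):
        return hash(self._data)

    def __eq__(self, other):
        return isinstance(other, FrozenSketch) and self._data == other._data

    def __iter__(self):
        return iter(self._data)


class OuterSketch:
    """Hypothetical mutable set that can refresh the members it holds."""

    def __init__(self, members=()):
        self._data = dict.fromkeys(members, True)

    def refresh(self):
        members = list(self._data)
        for member in members:
            if hasattr(member, "refresh"):
                member.refresh()
        # Member hashes may have changed, so the dict must be rebuilt.
        self._data = dict.fromkeys(members, True)

    def __iter__(self):
        return iter(self._data)


source = ['a', 'b']                  # a list is restartable
frozen = FrozenSketch(source)
outer = OuterSketch([frozen])
source.append('z')                   # the outer set is now out of date
outer.refresh()
print(sorted(next(iter(outer))))     # ['a', 'b', 'z']
```

With a one-shot iterator as the source, refresh() in this sketch would find nothing left to iterate and leave the inner set empty, which is the limitation described above.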


Raymond Hettinger