[Python-Dev] Re: PEP 218 (sets); moving set.py to Lib
Raymond Hettinger
python@rcn.com
Thu, 29 Aug 2002 01:23:07 -0400
Ideas for the day:
1. Optimize BaseSet._update(iterable) by checking for two special cases where a C-speed update method is already available and the
entries are known in advance to be immutable:
        . . .
        if isinstance(iterable, BaseSet):
            self._data.update(iterable._data)
            return
        if isinstance(iterable, dict):
            self._data.update(iterable)
            return
        . . .
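For context, here is a minimal, self-contained sketch of how those two fast paths might sit inside a dict-backed set. MiniSet is a hypothetical stand-in written for illustration, not the actual BaseSet code, and the general-path loop is an assumption about what the elided parts do.

```python
class MiniSet:
    """Toy dict-backed set; a hypothetical stand-in for BaseSet."""

    def __init__(self, iterable=None):
        self._data = {}
        if iterable is not None:
            self._update(iterable)

    def _update(self, iterable):
        # Fast path 1: another MiniSet already holds valid keys,
        # so the whole dict can be merged in one dict.update() call.
        if isinstance(iterable, MiniSet):
            self._data.update(iterable._data)
            return
        # Fast path 2: dict keys are hashable by construction.
        # The dict's own values are copied along; only the keys matter.
        if isinstance(iterable, dict):
            self._data.update(iterable)
            return
        # General path (assumed): insert one element at a time.
        for element in iterable:
            self._data[element] = True

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


if __name__ == "__main__":
    a = MiniSet("abracadabra")
    b = MiniSet(a)                  # takes fast path 1
    c = MiniSet({"x": 1, "y": 2})   # takes fast path 2
    print(sorted(b), sorted(c))
```

Both fast paths hand the whole merge to dict.update(), so no per-element Python loop runs.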
2. Eliminate the sanity checks in the binary operators that verify 'other' is a BaseSet. If 'other' isn't a BaseSet, try using
it as an iterable, either directly or by coercing it to a set:
>>> Set('abracadabra') | 'alacazam'
Set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
This improves usability because the second argument does not have to be pre-wrapped with Set. For some operations it also
improves speed, because the iterable is used directly and no equivalent dictionary has to be built.
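A rough sketch of what operators without the type check could look like, again on a hypothetical MiniSet rather than the real Set class. Union and intersection both walk 'other' directly, so no intermediate dictionary is built for it.

```python
class MiniSet:
    """Toy dict-backed set; a hypothetical stand-in for Set."""

    def __init__(self, iterable=()):
        self._data = dict.fromkeys(iterable, True)

    def __or__(self, other):
        # No isinstance check: any iterable of hashable items is accepted.
        result = MiniSet(self._data)
        for element in other:
            result._data[element] = True
        return result

    def __and__(self, other):
        # Walk 'other' once and keep only the elements already in self.
        return MiniSet(e for e in other if e in self._data)

    def __iter__(self):
        return iter(self._data)


if __name__ == "__main__":
    print(sorted(MiniSet("abracadabra") | "alacazam"))
    print(sorted(MiniSet("abracadabra") & "alacazam"))
```

One trade-off: a non-iterable right-hand operand now fails inside the loop with a generic TypeError instead of at an explicit check. For comparison, the built-in set type keeps its operators strict and accepts arbitrary iterables only through named methods such as union().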
3. Have ImmutableSet keep a reference to the original iterable. Add an ImmutableSet.refresh() method that rebuilds ._data from
the iterable. Add a Set.refresh() method that triggers ImmutableSet.refresh() where possible. The goal is to improve the
usability of sets of sets where the inner sets have been updated after the outer set was created.
>>> inner = Set('abracadabra')
>>> outer = Set([inner])
>>> inner.add('z') # now the outer set is out-of-date
>>> outer.refresh() # now it is current
>>> outer
Set(['a', 'c', 'r', 'z', 'b', 'd'])
This would only work for restartable iterables -- a file object would not be so easily refreshed.
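One way the refresh idea might be sketched, under stated assumptions: FrozenSnapshot and RefreshableSet are hypothetical names standing in for ImmutableSet and Set, and the built-in frozenset is used for storage. Because a refreshed snapshot can hash differently, the outer container rebuilds its dictionary after refreshing its members.

```python
class FrozenSnapshot:
    """Hypothetical stand-in for ImmutableSet that remembers its source."""

    def __init__(self, source):
        self._source = source           # kept so refresh() can re-read it
        self._data = frozenset(source)

    def refresh(self):
        # Only meaningful when the source can be iterated again.
        self._data = frozenset(self._source)

    def __hash__(self):
        return hash(self._data)

    def __eq__(self, other):
        return isinstance(other, FrozenSnapshot) and self._data == other._data

    def __iter__(self):
        return iter(self._data)


class RefreshableSet:
    """Hypothetical outer set whose members are stored as snapshots."""

    def __init__(self, members=()):
        self._data = {}
        for member in members:
            self.add(member)

    def add(self, member):
        self._data[FrozenSnapshot(member)] = True

    def refresh(self):
        snapshots = list(self._data)
        for snapshot in snapshots:
            snapshot.refresh()
        # Hashes may have changed, so the dictionary is rebuilt.
        self._data = dict.fromkeys(snapshots, True)

    def __iter__(self):
        return iter(self._data)


if __name__ == "__main__":
    inner = set("abracadabra")
    outer = RefreshableSet([inner])
    inner.add("z")                        # outer is now out of date
    outer.refresh()                       # outer is current again
    print([sorted(s) for s in outer])
```

The restartability caveat shows up directly in this sketch: a snapshot built from a one-shot iterator such as iter('abc') comes back empty after refresh(), because the iterator was already consumed when the snapshot was first built.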
Raymond Hettinger