[Python-checkins] r54881 - peps/trunk/pep-3119.txt
guido.van.rossum
python-checkins at python.org
Fri Apr 20 00:47:11 CEST 2007
Author: guido.van.rossum
Date: Fri Apr 20 00:47:08 2007
New Revision: 54881
Modified:
peps/trunk/pep-3119.txt
Log:
Fix a blatant bug in the definition of hash().
Add ComposableSet, which defines union etc.
Modified: peps/trunk/pep-3119.txt
==============================================================================
--- peps/trunk/pep-3119.txt (original)
+++ peps/trunk/pep-3119.txt Fri Apr 20 00:47:08 2007
@@ -208,8 +208,17 @@
below). The abstract ``__hash__`` method always returns 0, which
is a valid (albeit inefficient) implementation. **Invariant:** If
classes ``C1`` and ``C2`` both derive from ``Hashable``, the
- invariant ``hash(o1) == hash(o2)`` must imply ``o1 == o2`` for all
+ condition ``o1 == o2`` must imply ``hash(o1) == hash(o2)`` for all
instances ``o1`` of ``C1`` and all instances ``o2`` of ``C2``.
+ IOW, two objects shouldn't compare equal but have different hash
+ values.
+
+ Another constraint is that hashable objects, once created, should
+ never change their value (as compared by ``==``) or their hash
+ value. If a class cannot guarantee this, it should not derive
+ from ``Hashable``; if it cannot guarantee this for certain
+ instances only, ``__hash__`` for those instances should raise an
+ exception.
Note: being an instance of this class does not imply that an
object is immutable; e.g. a tuple containing a list as a member is
@@ -287,8 +296,8 @@
``Finite``, ``Iterable`` and ``Container``. Not every subset of
those three classes is a set though! Sets have the additional
invariant that each element occurs only once (as can be determined
- by iteration), and in addition sets implement rich comparisons
- defined as subclass/superclass tests.
+ by iteration), and in addition sets define concrete operators that
+ implement rich comparisons defined as subclass/superclass tests.
Sets with different implementations can be compared safely,
efficiently and correctly. Because ``Set`` derives from
@@ -302,29 +311,33 @@
to mappings or sequences, but they can be compared for equality
(and then they always compare unequal).
- **Open issue:** Should we also implement the ``issubset`` and
- ``issuperset`` methods found on the set type in Python 2? As
- these are just aliases for ``__le__`` and ``__ge__``, I'm tempted
- to leave these out.
-
- **Open issue:** Should this class also implement the operators
- and/or methods that compute union, intersection, symmetric and
- asymmetric difference? The problem with those (unlike the
- comparison operators) is what should be the type of the return
- value. I'm tentatively leaving these out -- if you need them, you
- can test for a ``Set`` instance that implements e.g. ``__and__``.
- Some alternatives: make these abstract methods (even though the
- semantics apart from the type are well-defined); or make them
- concrete methods that return a specific concrete set type; or make
- them concrete methods that assume the class constructor always
- accepts an iterable of elements; or add a new class method that
- accepts an iterable of elements and that creates a new instance.
- (I originally considered a "view" alternative, but the problem is
- that computing ``len(a&b)`` requires iterating over ``a`` or
- ``b``, and that pretty much kills the idea.)
+ Note: the ``issubset`` and ``issuperset`` methods found on the set
+ type in Python 2 are not supported, as these are mostly just
+ aliases for ``__le__`` and ``__ge__``.
+
+ **Open Issues:** Should I spell out the invariants and method
+ definitions?
+
+``ComposableSet``
+ This is a subclass of ``Set`` that defines abstract operators to
+ compute union, intersection, symmetric and asymmetric difference,
+ respectively ``__or__``, ``__and__``, ``__xor__`` and ``__sub__``.
+ These operators should return instances of ``ComposableSet``. The
+ abstract implementations return no meaningful values but raise
+ ``NotImplementedError``; this is because any generic
+ implementation would have to create new instances but the ABCs
+ don't (and shouldn't, IMO) provide an API for creating new
+ instances. **Invariants:** The implementations of these operators
+ should ensure that the results match the mathematical definition
+ of set composition.
+
+ **Open Issues:** Should I spell out the invariants? Should we
+ define an API for creating new instances (e.g. a class method or a
+ fixed constructor signature)? Should we just pick a concrete
+ return type (e.g. ``set``)? Should we add the ``copy`` method?
``HashableSet``
- This is a subclass of both ``Set`` and ``Hashable``. It
+ This is a subclass of both ``ComposableSet`` and ``Hashable``. It
implements a concrete ``__hash__`` method that subclasses should
not override; or if they do, the subclass should compute the same
hash value. This is so that sets with different implementations
@@ -332,17 +345,21 @@
as dictionary keys. (A similar constraint exists on the hash
values for different types of numbers and strings.)
+ **Open Issues:** Should I spell out the hash algorithm? Should
+ there be another ABC that derives from Set and Hashable (but not
+ from Composable)?
+
``MutableSet``
- This is a subclass of ``Set`` implementing additional operations
- to add and remove elements. The supported methods have the
- semantics known from the ``set`` type in Python 2:
+
+ This is a subclass of ``ComposableSet`` implementing additional
+ operations to add and remove elements. The supported methods have
+ the semantics known from the ``set`` type in Python 2:
``.add(x)``
Abstract method that adds the element ``x``, if it isn't
already in the set.
``.remove(x)``
-
Abstract method that removes the element ``x``; raises
``KeyError`` if ``x`` is not in the set.
@@ -355,16 +372,18 @@
Abstract method that empties the set. (Making this concrete
would just add a slow, cumbersome default implementation.)
- **Open issue:** Should we support all the operations implemented
- by the Python 2 ``set`` type? I.e. union, update, __or__,
- __ror__, __ior__, intersection, intersection_update, __and__,
- __rand__, __iand__, difference, difference_update, __xor__,
- __rxor__, __ixor__, symmetric_difference,
- symmetric_difference_update, __sub__, __rsub__, __isub__. Note
- that in Python 2, ``a.update(b)`` is not exactly the same as ``a
- |= b``, since ``update()`` takes any iterable for an argument,
+ **Open issues:** Should we support more operations implemented by
+ the Python 2 ``set`` type? E.g. pop, update, __ior__,
+ intersection_update, __iand__, difference_update, __ixor__,
+ symmetric_difference_update, __isub__. Should we unify ``remove``
+ and ``discard``, a la Java (which has a single method returning
+ a boolean indicating whether it was removed or not)?
+
+ Note that in Python 2, ``a.update(b)`` is not exactly the same as
+ ``a |= b``, since ``update()`` takes any iterable for an argument,
while ``|=`` requires another set; similar for the other
- operators.
+ operators. What to do about this? Do we really want the method
+ explosion that comes from this distinction?
Mappings
More information about the Python-checkins
mailing list