[Python-checkins] r54881 - peps/trunk/pep-3119.txt

guido.van.rossum python-checkins at python.org
Fri Apr 20 00:47:11 CEST 2007


Author: guido.van.rossum
Date: Fri Apr 20 00:47:08 2007
New Revision: 54881

Modified:
   peps/trunk/pep-3119.txt
Log:
Fix a blatant bug in the definition of hash().
Add ComposableSet, which defines union etc.


Modified: peps/trunk/pep-3119.txt
==============================================================================
--- peps/trunk/pep-3119.txt	(original)
+++ peps/trunk/pep-3119.txt	Fri Apr 20 00:47:08 2007
@@ -208,8 +208,17 @@
     below).  The abstract ``__hash__`` method always returns 0, which
     is a valid (albeit inefficient) implementation.  **Invariant:** If
     classes ``C1`` and ``C2`` both derive from ``Hashable``, the
-    invariant ``hash(o1) == hash(o2)`` must imply ``o1 == o2`` for all
+    condition ``o1 == o2`` must imply ``hash(o1) == hash(o2)`` for all
     instances ``o1`` of ``C1`` and all instances ``o2`` of ``C2``.
+    IOW, two objects shouldn't compare equal but have different hash
+    values.
+
+    Another constraint is that hashable objects, once created, should
+    never change their value (as compared by ``==``) or their hash
+    value.  If a class cannot guarantee this, it should not derive
+    from ``Hashable``; if it cannot guarantee this for certain
+    instances only, ``__hash__`` for those instances should raise an
+    exception.
 
     Note: being an instance of this class does not imply that an
     object is immutable; e.g. a tuple containing a list as a member is
@@ -287,8 +296,8 @@
     ``Finite``, ``Iterable`` and ``Container``.  Not every subset of
     those three classes is a set though!  Sets have the additional
     invariant that each element occurs only once (as can be determined
-    by iteration), and in addition sets implement rich comparisons
-    defined as subclass/superclass tests.
+    by iteration), and in addition sets define concrete operators that
+    implement rich comparisons defined as subclass/superclass tests.
 
     Sets with different implementations can be compared safely,
     efficiently and correctly.  Because ``Set`` derives from
@@ -302,29 +311,33 @@
     to mappings or sequences, but they can be compared for equality
     (and then they always compare unequal).
 
-    **Open issue:** Should we also implement the ``issubset`` and
-    ``issuperset`` methods found on the set type in Python 2?  As
-    these are just aliases for ``__le__`` and ``__ge__``, I'm tempted
-    to leave these out.
-
-    **Open issue:** Should this class also implement the operators
-    and/or methods that compute union, intersection, symmetric and
-    asymmetric difference?  The problem with those (unlike the
-    comparison operators) is what should be the type of the return
-    value.  I'm tentatively leaving these out -- if you need them, you
-    can test for a ``Set`` instance that implements e.g. ``__and__``.
-    Some alternatives: make these abstract methods (even though the
-    semantics apart from the type are well-defined); or make them
-    concrete methods that return a specific concrete set type; or make
-    them concrete methods that assume the class constructor always
-    accepts an iterable of elements; or add a new class method that
-    accepts an iterable of elements and that creates a new instance.
-    (I originally considered a "view" alternative, but the problem is
-    that computing ``len(a&b)`` requires iterating over ``a`` or
-    ``b``, and that pretty much kills the idea.)
+    Note: the ``issubset`` and ``issuperset`` methods found on the set
+    type in Python 2 are not supported, as these are mostly just
+    aliases for ``__le__`` and ``__ge__``.
+
+    **Open Issues:** Should I spell out the invariants and method
+    definitions?
+
+``ComposableSet``
+    This is a subclass of ``Set`` that defines abstract operators to
+    compute union, intersection, symmetric and asymmetric difference,
+    respectively ``__or__``, ``__and__``, ``__xor__`` and ``__sub__``.
+    These operators should return instances of ``ComposableSet``.  The
+    abstract implementations return no meaningful values but raise
+    ``NotImplementedError``; this is because any generic
+    implementation would have to create new instances but the ABCs
+    don't (and shouldn't, IMO) provide an API for creating new
+    instances.  **Invariants:** The implementations of these operators
+    should ensure that the results match the mathematical definition
+    of set composition.
+
+    **Open Issues:** Should I spell out the invariants?  Should we
+    define an API for creating new instances (e.g. a class method or a
+    fixed constructor signature)?  Should we just pick a concrete
+    return type (e.g. ``set``)?  Should we add the ``copy`` method?
 
 ``HashableSet``
-    This is a subclass of both ``Set`` and ``Hashable``.  It
+    This is a subclass of both ``ComposableSet`` and ``Hashable``.  It
     implements a concrete ``__hash__`` method that subclasses should
     not override; or if they do, the subclass should compute the same
     hash value.  This is so that sets with different implementations
@@ -332,17 +345,21 @@
     as dictionary keys.  (A similar constraint exists on the hash
     values for different types of numbers and strings.)
 
+    **Open Issues:** Should I spell out the hash algorithm?  Should
+    there be another ABC that derives from Set and Hashable (but not
+    from Composable)?
+
 ``MutableSet``
-    This is a subclass of ``Set`` implementing additional operations
-    to add and remove elements.  The supported methods have the
-    semantics known from the ``set`` type in Python 2:
+
+    This is a subclass of ``ComposableSet`` implementing additional
+    operations to add and remove elements.  The supported methods have
+    the semantics known from the ``set`` type in Python 2:
 
     ``.add(x)``
         Abstract method that adds the element ``x``, if it isn't
         already in the set.
 
     ``.remove(x)``
-
         Abstract method that removes the element ``x``; raises
         ``KeyError`` if ``x`` is not in the set.
 
@@ -355,16 +372,18 @@
         Abstract method that empties the set.  (Making this concrete
         would just add a slow, cumbersome default implementation.)
 
-    **Open issue:** Should we support all the operations implemented
-    by the Python 2 ``set`` type?  I.e. union, update, __or__,
-    __ror__, __ior__, intersection, intersection_update, __and__,
-    __rand__, __iand__, difference, difference_update, __xor__,
-    __rxor__, __ixor__, symmetric_difference,
-    symmetric_difference_update, __sub__, __rsub__, __isub__.  Note
-    that in Python 2, ``a.update(b)`` is not exactly the same as ``a
-    |= b``, since ``update()`` takes any iterable for an argument,
+    **Open issues:** Should we support more operations implemented by
+    the Python 2 ``set`` type?  E.g. pop, update, __ior__,
+    intersection_update, __iand__, difference_update, __ixor__,
+    symmetric_difference_update, __isub__.  Should we unify ``remove``
+    and ``discard``, a la Java (which has a single method returning
+    a boolean indicating whether it was removed or not)?
+
+    Note that in Python 2, ``a.update(b)`` is not exactly the same as
+    ``a |= b``, since ``update()`` takes any iterable for an argument,
     while ``|=`` requires another set; similar for the other
-    operators.
+    operators.  What to do about this?  Do we really want the method
+    explosion that comes from this distinction?
 
 
 Mappings


More information about the Python-checkins mailing list