[Python-3000] PEP 3119 - Introducing Abstract Base Classes

Fri Apr 27 19:05:29 CEST 2007

On 4/26/07, Guido van Rossum <guido at python.org> wrote:

> - Where should PartiallyOrdered and TotallyOrdered live?

The abc module seems reasonable, even though the rest are containers.

> - Should we support comparison of different concrete set types?
> - Ditto for mapping types?
> - Ditto for sequence types?
> - Should Sequence derive from TotallyOrdered?

I think either choice would have been reasonable once upon a time, but
it is too late to make "abc" == ("a", "b", "c"), because both can
already appear as distinct keys in the same dict.  I don't know whether
this constraint applies to mappings or sets.
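
To make the dict constraint concrete, here is a small sketch of today's
behaviour (the variable names are only illustrative).  If the two
sequences were ever made to compare equal, the two entries below would
have to collapse into one:

```python
# A str and a tuple of its characters are different keys today.
s = "abc"
t = ("a", "b", "c")

d = {s: "string", t: "tuple"}

# Both entries survive because s != t; making them equal would
# silently merge existing dict entries like these.
distinct_keys = len(d) == 2
```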

> - Do we need a non-composable hashable set type?
> - Ditto for a non-composable mutable set type?

Given iterability, a non-composable set type is at best rare.  So long
as it is possible to override the isinstance and issubclass checks, I
think they can be left out of the stdlib.  (Though I think you use one
as a test case, which is a good idea.)
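
As a rough sketch of what "overriding the checks" could look like, the
following uses the __subclasshook__ hook of the abc module as it exists
in current Python; that the final mechanism looks like this is an
assumption on my part, and SetLike and Bag are hypothetical names:

```python
from abc import ABC


class SetLike(ABC):
    """Hypothetical ABC that recognizes classes structurally."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is SetLike:
            # Accept any class whose MRO defines both methods.
            needed = ("__contains__", "__iter__")
            if all(any(name in B.__dict__ for B in C.__mro__)
                   for name in needed):
                return True
        return NotImplemented


class Bag:
    """Third-party type that never inherits from SetLike."""

    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)
```

With this, a third-party type passes isinstance and issubclass without
the stdlib having to ship an ABC for it.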

> No ABCs override ``__init__``, ``__new__``, ``__str__`` or
> ``__repr__``.  Defining a standard constructor signature would
> unnecessarily constrain custom container types, for example Patricia
> trees or gdbm files.  Defining a specific string representation for a
> collection is similarly left up to individual implementations.

Is this a style guide, or an actual constraint that should apply to
user-created ABCs?

> ``Hashable``
>     The base class for classes defining ``__hash__``.  The
>     ``__hash__`` method should return an ``Integer`` (see "Numbers"
>     below).  The abstract ``__hash__`` method always returns 0, which
>     is a valid (albeit inefficient) implementation.  **Invariant:** If
>     classes ``C1`` and ``C2`` both derive from ``Hashable``, the
>     condition ``o1 == o2`` must imply ``hash(o1) == hash(o2)`` for all
>     instances ``o1`` of ``C1`` and all instances ``o2`` of ``C2``.
>     IOW, two objects shouldn't compare equal but have different hash
>     values.

Is this really restricted to classes implementing Hashable?  I thought
it was an existing general Python rule that for all non-buggy objects
a and b

    (a == b)  ==> (hash(a) == hash(b))

unless at least one of hash(a), hash(b) returned -1 or raised an error.
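
A quick illustration of that rule with existing types, none of which
share a Hashable base class (a sketch only; the variable names are
mine):

```python
from fractions import Fraction

# Equal values of unrelated numeric types must hash equal,
# otherwise they could not act as the same dict key.
a, b, c = 1, 1.0, Fraction(1, 1)
equal_values = (a == b == c)
equal_hashes = (hash(a) == hash(b) == hash(c))

d = {a: "int"}
d[b] = "float"   # same key as far as the dict is concerned
same_slot = len(d) == 1
```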

>     Another constraint is that hashable objects, once created, should
>     never change their value (as compared by ``==``) or their hash
>     value.

I've had a use case where objects were placed in a dict before they
were finalized.  Their value would change only in that certain
attributes could go (once) from undefined to defined.  Several
different objects might pass through the same indeterminate stage,
but only one would be in that state at a time.  (Yes, there was
probably a better architecture if I weren't dealing with legacy
formats.)

Basing hash on only the attributes always known from the start meant
that the hash would never change, and there weren't any equality tests
whose results could change, though the value as a whole (for complete
ordering) could change.
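
Roughly what I mean, as a minimal sketch; Record, key and payload are
hypothetical names, not the legacy code itself:

```python
class Record:
    """Hypothetical record whose hash uses only start-of-life data."""

    def __init__(self, key):
        self.key = key        # known from the start, never changes
        self.payload = None   # may go (once) from undefined to defined

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        # Equality ignores payload, so its result cannot change later.
        return self.key == other.key

    def __hash__(self):
        return hash(self.key)


r = Record("header-17")
index = {r: "pending"}        # stored before the object is finalized
hash_before = hash(r)

r.payload = b"late data"      # finalize after insertion
found = index[r]              # lookup still works
```

The object as a whole has changed, but neither its hash nor any
equality result has, so the dict entry stays reachable.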

>     **Note:** the ``issubset`` and ``issuperset`` methods found on the
>     set type in Python 2 are not supported, as these are mostly just
>     aliases for ``__le__`` and ``__ge__``.

And I'll again lobby to include them, if only as concrete
implementations that, say, forward to self.__le__(other).
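
Something along these lines, as a sketch only; TinySet is a
hypothetical stand-in for whatever the set ABC ends up being:

```python
class TinySet:
    """Hypothetical set type with named aliases for the operators."""

    def __init__(self, items):
        self._items = frozenset(items)

    def __le__(self, other):
        return self._items <= other._items

    def __ge__(self, other):
        return self._items >= other._items

    # Concrete methods that just forward to the comparison operators.
    def issubset(self, other):
        return self.__le__(other)

    def issuperset(self, other):
        return self.__ge__(other)


small = TinySet([1])
big = TinySet([1, 2])
```

Subclasses that override __le__ or __ge__ get matching named methods
for free.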

> Strings
> -------
>
> Python 3000 has two built-in string types: byte strings (``bytes``),
> deriving from ``MutableSequence``, and (Unicode) character strings
> (``str``), deriving from ``HashableSequence``.  They also derive from
> ``TotallyOrdered``.  If we were to introduce ``Searchable``, they
> would also derive from that.
>
> **Open issues:** define the base interfaces for these so alternative
> implementations and subclasses know what they are in for.  This may be
> the subject of a new PEP or PEPs (PEP 358 should be co-opted for the
> ``bytes`` type).

By Monday...?

-jJ