[Python-3000] Need help completing ABC pep

Fri Apr 20 09:48:28 CEST 2007

On 4/19/07, Guido van Rossum <guido at python.org> wrote:
> I've started a PEP on Abstract Base Classes (ABCs), PEP 3119:
>
>     http://www.python.org/dev/peps/pep-3119/
>
> While I'm not ready yet to answer tough questions about this compared
> to alternative proposals, I *am* ready for feedback on the various
> open issues sprinkled throughout the current text, especially
> concerning decisions regarding exactly which operations to include in
> the various ABCs, and also regarding the level of detail required in
> the PEP.

I have a bunch of comments that aren't all related to the open issues.
I'm more familiar with statically typed languages, so sorry if some of
these comments don't make sense in a Python context.

Hashable might want to specify the exception that should be thrown for
mutable objects.

I wouldn't have guessed what Finite was from the name, but I don't
mind it now that I know. I do prefer 'Sized'.

A possible name for the __contains__ ABC for sequences and strings is
'Searchable'. On the other hand, an annotation of some sort for the
running time of the method might do as well, but I don't know of any
prior art for that.

On Sets:
Do you have an example of a data structure that is a Set but not a
ComposableSet?

Set should inherit from a PartiallyOrdered ABC that defines the
comparison operators.

The final specification of Set should spell out the invariants, but
they're clear enough that the PEP probably doesn't need to. It's odd,
though, that you mention optimizations to __eq__ and __le__ but not
their semantics.

ComposableSet's open issues:
> Should we define an API for creating new instances (e.g. a class method or a fixed constructor signature)?

There are some set types that can't be constructed from arbitrary
sequences. For example, Patricia Tries
(http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-IntSet.html)
can be used to make very efficient sets of integers, but only
integers.

> Should we just pick a concrete return type (e.g. set)?

For set compositions, you should say whether the arguments need to be
uniform. (Is (IntSet(1,2,3) | set('a','b','c')) allowed?) If they do,
then you don't need a concrete return type; if they don't, then you
do, and the default implementations should use it. Even this may be a
problem if someone writes a TreeSet which requires
TotallyOrdered-but-not-Hashable elements. In any case, the ABC should
encourage implementors to return an instance of the same type as the
arguments when the arguments are of the same type.

> Should we add the copy method?
Not on ComposableSet, but maybe in a Copyable ABC, which it might inherit from.

MutableSet:
> Should we support more operations implemented by the Python 2 set type?

You have groups of equivalent operations (add, update, __ior__),
(discard, difference_update, __isub__), (??, intersection_update,
__iand__), and (??, symmetric_difference_update, __ixor__)

It would be nice to make these all uniform. Intersection is hard to
support one element at a time, but symmetric difference would work
with a toggle() method. The operations that work a group at a time are
likely be more efficient on Sized arguments, and much more efficient
on arguments of the same type. How much are you trying to support
these kinds of optimizations to implementations?

> Should we unify remove and discard, a la Java (which has a single method returning a boolean indicating whether it was removed or not)?

Yes. :)

Mappings:
Set-like unions, intersections, and differences are useful on maps
too, but you usually have to provide a function to say what to do with
colliding values. This is probably a different PEP though.

Sequence:
I guess you've already dealt with this, but allowing __getitem__ with
integer indices on str objects forces you to use UTF-32 or play
interesting games with caching. Most string algorithms are quite happy
with (bidirectional? multipass?) iterators, but programmers aren't
used to using them.

Numbers:
Haskell might be good to use for inspiration on the numeric classes:
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#7.
Note that Haskell's "Real" doesn't imply that the number is
fractional, just that it's not complex. Some differences I see from
Bill Janssen's proposal:
 * A Haskell Num is constructible from any Integer (python's long).
 * A Haskell Fractional (including Complex) is constructible from any
Rational (which can be constructed from a float)
 * The operations are spread into more classes. I think that primarily
helps Complex and Fixed-point types fit into the hierarchy better. The
Haskell folks actually complain that the operations aren't spread into
_enough_ classes. They mostly want more group-theory related classes
below Num.
 * The bitwise operators are segregated into a Bits class, but I don't
see any benefit to that.
 * In a possible use for Cardinal, Integral**Cardinal=>Integral,
Fractional**Integral=>Fractional, and Floating**Floating=>Floating.
 * Complex isn't Orderable

Fortress (http://research.sun.com/projects/plrg/fortress.pdf#page=270)
has an extremely detailed hierarchy of algebraic traits, and it looks
like they plan a similarly detailed hierarchy of numbers, but the
numbers aren't yet specified beyond integers and rationals.

Cardinal is also useful as the argument to (Sequence * Cardinal).

Index (Ix) is used in Haskell primarily in the form of tuples of Ints
for multi-dimensional arrays.

-- 
Namasté,
Jeffrey Yasskin