[Python-3000] Need help completing ABC pep

Guido van Rossum guido at python.org
Sat Apr 21 17:01:41 CEST 2007


On 4/21/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> On 4/20/07, Guido van Rossum <guido at python.org> wrote:
> > On 4/20/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> > > On Sets:
> > > Do you have an example of a data structure that is a Set but not a
> > > ComposableSet?
> >
> > Take for example a set that is a contiguous range of integers,
> > represented by a (lo, hi) tuple. In many cases the return type of
> > union and symmetric difference cannot be represented as an object of
> > the same type. I don't want to force the implementer of such a set to
> > provide implementations for operations they don't need.
>
> Ah, good example. I think that PEPs that propose ABCs should be
> encouraged to describe a few possible implementations, and should be
> particularly encouraged to give examples of the implementations that
> made them split two classes, or omit a method from a particular class.

Yeah, the PEP is short on examples.

> > > Set should inherit from a PartiallyOrdered ABC that defines the
> > > comparison operators.
> >
> > I guess then there would also be a TotallyOrdered ABC deriving from
> > PartiallyOrdered which adds no operations but does add a constraint.
> > I've added both, and made Set derive from PartiallyOrdered.
>
> Cool. Here's some language for their invariants (checking myself with
> the Fortress spec, and stealing its properties nearly verbatim):
>
> PartiallyOrdered:
> This ABC defines the 5 inequality operations <, <=, >=, >, and cmp().

Actually, I'm hoping to eliminate cmp() completely. Also, even if it
doesn't get eliminated, the existence of cmp() suggests a total
ordering.

> (Note that == and != are defined by object.) Classes deriving from
> this ABC should ensure that '<' and '>' form strict partial orders,
> and '<=' and '>=' form weak partial orders, in the sense of
> http://en.wikipedia.org/wiki/Partially_ordered_set. These should be
> compatible with each other and with ==:
>
> Where PO is a subtype of PartiallyOrdered, and a and b are instances of PO,
> cmp(a, b) < 0 ↔ cmp(b, a) > 0
> cmp(a, b) == 0 ↔ cmp(b, a) == 0
> cmp(a, b) == None ↔ cmp(b, a) == None
> a < b ↔ cmp(a, b) < 0
> a <= b ↔ cmp(a, b) <= 0
> a == b ↔ cmp(a, b) == 0
> a >=b ↔ cmp(a, b) >= 0
> a > b ↔ cmp(a, b) > 0

> Subclasses of PartiallyOrdered must define one of cmp() or <=.

Actually, Python typically uses < as the single comparison operator
that must be defined (e.g. list.sort() uses < exclusively).

> The other operators are defaulted as follows:
>   def <=(a,b): return cmp(a,b) <= 0
>   def cmp(a,b):
>     if a<=b:
>       if b<=a: return 0
>       else: return -1
>     else:
>       if b<=a: return 1
>       else: return None
>   def ==(a,b): return a<=b and b<=a
>   def <(a,b): return a<=b and not b<=a
>   def >=(a,b): return b<=a
>   def >(a,b): return b<a
>
> Open Issues: 1) Is it reasonable to let cmp() return None for
> unordered pairs, or should we instead define a 4-value enumeration for
> it to return?

As I said, I'd rather kill it or at least ignore it for this part.

> 2) The @abstractmethod decorator isn't enough to specify
> that subclasses must override at least one of a group of methods, and
> there are classes with even more complex requirements like "subclasses
> must override a or (b and (c or d))".

I guess those requirements will have to be expressed in English. I
don't see the point to invent a decorator mini-language or other
mechanism that's powerful enough to express such possibilities. If
someone *really* wants they can implement a new metaclass that checks
these, but IMO it's a rather silly waste of time since the metaclass
can't check that any of the methods are implemented correcttly, so the
exercise is kind of pointless (super-strict in one area while 100%
lenient in another).

> 3) http://docs.python.org/ref/customization.html#l2h-187 still implies
> that < defines a total order over all python objects. Has that changed
> for py3k? 4) How should we handle the reflected forms of these
> operators?

Yes, in Py3k < <= > >= are not defined by default.  We may not have
updated the docs though.

> TotallyOrdered:
> This ABC derives from PartiallyOrdered. It adds no new operations but
> implies a promise that <, <=, >, and >= are total orders in the sense
> of http://en.wikipedia.org/wiki/Totally_ordered_set, and that cmp(a,b)
> never returns None. Given this additional restriction, subclasses only
> need to define one of <, <=, or cmp(), and the rest of the methods can
> be defaulted.
>
> Open Issue: How does the TotallyOrdered property inherit? If Real (an
> abstract class) inherits from TotallyOrdered, do different subclasses
> need to be totally comparable with each other? If long (a concrete
> class) inherits from TotallyOrdered, and I subclass long, do I have to
> ensure that my instances are totally comparable with other longs? I
> think 'yes' to the second, to obey the Liskov substitution principle.
> I don't know about the first: for Real it seems fine, but it might
> inhibit refactoring the ABC hierarchy.

Perhaps this is more appropriate for the Numeric ABCs PEP. IMO the
only concern is the realities of floating point arithmetic (whether
binary or decimal); I would be fine with some language that explains
that numeric types computed/represented with limited precision may
violate the rules. The only alternative I see is to force float and
decimal (and hence Real, Complex and Number) to inherit from
PartiallyOrdered instead of TotallyOrdered, which sounds like a shame
-- or is it?

> > > The final specification of Set should spell out the invariants, but
> > > they're clear enough that the PEP probably doesn't need to. It's odd,
> > > though, that you mention optimizations to __eq__ and __le__ but not
> > > their semantics.
> >
> > The semantics are kind of obvious. :-)
> >
> > Text spelling out the invariants in more detail that I can just paste
> > into the PEP is most welcome!
>
> Bah! ;)
>
> Set:
> Sets should implement an efficient (no worse than O(log N))
> __contains__ operation. Sets are defined by their extensions.

What's an extension?

> Subclasses may override the following default definitions of __eq__
> and __le__ with more efficient versions, but they must respect the
> same semantics:
>   def __eq__(a, b): return len(a) == len(b) and all(x in a for x in b)
>   def __le__(a, b): return len(a) <= len(b) and all(x in a for x in b)
>
> ComposableSet:
> ... mathematical meaning of set composition:
> Where a and b are instances of ComposableSet and __op__(a,b) is defined:
>   x in (a | b) ↔ x in a or x in b
>   x in (a & b) ↔ x in a and x in b
>   x in (a ^ b) ↔ x in a != x in b
>   x in (a - b) ↔ x in a and not x in b
>
> > > ComposableSet's open issues:
> > > > Should we just pick a concrete return type (e.g. set)?
> > >
> > > For set compositions, you should say whether the arguments need to be
> > > uniform. (Is (IntSet(1,2,3) | set('a','b','c')) allowed?) If they do,
> > > then you don't need a concrete return type; if they don't, then you
> > > do, and the default implementations should use it.
> >
> > You may be misunderstanding what I meant. I certainly don't want to
> > constrain the potential return type of the composition operations
> > (except that they should themselves be composable sets). The question
> > about picking a specific return type was meant as a way out to
> > providing a concrete default implementation. I'm still on the fence of
> > this, as it could be a big waste of time (I imagine that composing
> > Patricia trees can be done more efficiently than iterating over the
> > elements).
>
> Borrowing some syntax from Java generics, which of:
>   ComposableSet __or__(ComposableSet a, ComposableSet b);
>   <CS extends ComposableSet> ComposableSet __or__(CS a, CS b);
>   <CS extends ComposableSet> CS __or__(CS a, CS b);
> do you mean to define?

Can't say I understand that. The minimum requirement is that the
return type is a subclass of ComposableSet. *Some* subclasses CS may
themselves guarantee that the result is always a subclass of CS, but I
don't want to force that. Returning a 'set' in the default
implementation satisfies the former but not the latter. This is in
itself fine (CS can override __or__ to return a CS). But it could be
expensive -- even if the author of CS doesn't care about the return
type being a CS itself, they might still care about th efficiency.

> I think they have different implications about
> what defaults you need to provide. The second is probably out because
> it would mean (a|b|c) would likely break. This gets even more complex
> if we consider the types of the contents of the sets... Here's a
> possible overloaded signature for __or__. (Do you already have a
> better notation for this?)

I think this is more appropriate for other languages.

>   <C super A, C super B, C extends Hashable> ComposableSet<C>
> __or__(ComposableSet<A> a, ComposableSet<B> b);
>   <C super A, C super B, (for x in A,B,C: CS<x> extends
> ComposableSet<x>)> CS<C> __or__(CS<A> a, CS<B> b);

It's Saturday AM, I'm not even going to try to parse that. :-)

> The first would probably be optimized when its arguments are of the
> same type, but could fall back on the abstract methods from
> ComposableSet, which would return a built-in (frozen?)set.
>
> > > Numbers:
> >
> > Would you mind helping to write a PEP for numbers? I think I'm running
> > out of round tuits just for the collections and the ABC
> > infrastructure.
>
> Bah! again. Yes, I'll try to throw something together with some help
> from Neal. Who else has been thinking about the Numbers PEP?
>
> > > Cardinal is also useful as the argument to (Sequence * Cardinal).
> >
> > I think there's little value for this in a dynamically typed language;
> > the implementation can raise ValueError or treat negative numbers as
> > zero (which is what Python currently does).
>
> True, and any hypothetical type checkers that want to use such
> information can define their own subclasses of Integer.
>
> --
> Namasté,
> Jeffrey Yasskin
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


More information about the Python-3000 mailing list