[Types-sig] RE: PRE-PROPOSAL: Python Interfaces

Tim Peters tim_one@email.msn.com
Sun, 22 Nov 1998 17:00:25 -0500


[Paul Prescod]
> I don't entirely understand how there can be people who really love
> Python and yet are against the idea of interfaces. What is the
> "sequence protocol" if not a language-builtin interface? What is the
> "mapping protocol"? etc. I don't understand why someone would object
> to generalizing this fantastically successful idea.

[Gordon McMillan]
> I don't object, but I worry. For example, the "sequence protocol" is
> a very fuzzy thing.

Indeed it is!  And e.g. I have any number of "sequence types" that implement
only __getitem__, and even then only for purposes of the for/in protocol:
ignoring __getitem__'s argument, just pumping out "the next one" each time
__getitem__ is called.

Abstractly these types are really just ForInImplementers, but because we
have no formal rules (neither names known to Python nor even accepted
conventions) now, your generic function is likely to try

    hasattr(timsThing, "__getitem__")

and draw wrong conclusions from the result.

> If you start with the abstract sequence protocol (as defined by the C
> API), only List is an implementation of which B. Meyer would approve,
> (SetItem is in the abstract interface, for example).
>
> OK, so lets pretend that's cleaned up. Now we're getting methods on
> List so it can be treated like a Stack. Stack is certainly not
> another name for List, nor for Sequence (whatever that means). It's
> another protocol, right? But it's one that is automatically
> implemented if you implement the List protocol.

Under the proposal, though, conformance isn't a matter of checking
structural equivalence of method collections, it's a matter of looking for
specific interface objects in constrained places.  So if you have

    class Stack(Interface):
        def push(self, thing): MysteryGoesHere
        def pop(self): MysteryGoesHere

then it's simply consequence-free coincidence if "class List(Interface)"
contains methods with matching names and signatures -- unless
List.__implements__ contains specifically Stack.

IOW, Lists do *not* implement the Stack protocol just because "it looks
like" they do -- if they do at all, it's only by virtue of List explicitly
saying it implements specifically Stack.  Today, though, there's no way to
draw the distinction at all.

OTOH, unlike e.g. JimF I have nothing against stuffing some default
implementation into interfaces.  To the contrary, if I *wanted* to say that
all List implementers implement Stack too, I think

   class List(Interface, Stack):  #??? spelling of interface
      __implements__ = Stack      #??? ... hierarchy is unclear
      def pop(self): raiseDeferredError()
      def append(self, thing): raiseDeferredError()
	...
      def push(self, thing): self.append(thing)

is the obvious way to do it.  This doesn't preclude a different
implementation of push, it simply reflects the truth that List *can* give
you an implementation of push guaranteed to meet push's spec, assuming only
that the implementations of List.append and List.pop meet *their* specs.

For a simpler example, after rich comparisons are in

    class SaneOrdering(Interface):
        def __eq__(self, other): raiseDeferredError()
        def __ne__(self, other): return not self.__eq__(other)

would save 99% of the SaneOrderings ever created the pointless and
error-prone bother of implementing __ne__ themselves.  That is, if the
post-condition for SaneOrdering.__ne__(self, other) is

    assert Result == not self.__eq__(other)

what better way to encourage correct software than to provide a default
implementation that guarantees it without further ado?  Many methods in rich
interfaces exist simply for user convenience, having obvious (but clumsy to
type out) implementations in terms of other methods.

> How about FileInput? That's a sequenceish protocol. A read-only,
> forward-only, sequential-only sequence. How do we (sensibly and
> comprehensibly) capture the fact that most sequenceish protocols will
> suffice where a FileInput protocol is required, with the exception of
> those stack-ish style sequences?
>
> In fact (as Barry was supposed to point out at the conference for me,
> harrumph, harrumph), the set of all interfaces / protocols is the set
> of all subsets of the set of all signatures - without taking behavior
> into account. That is, the set of all interfaces has a cardinality
> that I don't particularly want to contemplate.

I don't think the proposal is rich enough to capture everything people do in
Python -- and don't think it needs to be.  I'll attach my original essaylet
on this topic, one of the 95 Fece^H^H^H^HTheses I nailed to the egroups
door.  The point of all this to me is to make the simple things simpler --
wrestling with anonymous folklore protocols is a real barrier in Python
practice, and a *needless* one much of the time.

> At the moment "sequence protocol" means 2 different things. It means
> Py_SequenceCheck(o) == TRUE (which I hope the type / class
> unification fixes), and it means that if you squint a little and wave
> your hands alot, this thing could be considered a sequence. My worry
> is that if you attempt to make the latter definition more precise,
> you in fact increase complexity.

I think you *do* increase complexity for experts (see below), but reduce it
for everyone else.  "Sequence" can be defined in an "overly restrictive" way
and still leave 99% of people happy 99% of the time.

> I know, worrying is Tim's job, not mine.

I can't make enough time for it, though, so appreciate your help <wink>.  On
c.l.py you raised the specter of suddenly needing to mark all your classes
as implementing other peoples' interfaces in order to use their modules at
all, and that's something I worry about more now than I did when I wrote the
attached.  Python derives much power from "folklore protocols" now, and
indeed that cannot be captured in full short of (as you say) giving at least
one name to each member of the powerset of all signatures.

good-thing-python-doesn't-restrict-identifier-length<wink>-ly y'rs  - tim


[point 14 from Timmy's egroups manifesto]

14. [folklore is purged from the core] Enough sensible builtin "marker
classes" are provided so that the documentation for every builtin function
and builtin method can, for every argument, say that (at least) instances of
such-and-such a builtin class are accepted.

Long note:  I expect this to be disliked by someone who matters <0.7 wink>,
but I think it's important.  The "don't check the type, check the methods"
philosophy of Python is very powerful, yet rarely needed in most peoples'
code -- but gets in the way when it's not needed because there's no other
choice but to use it.

People simply don't know what "pass something that acts like a sequence" or
"acts like a file" etc *means*, and it turns out an accurate answer is
sometimes darned hard to give.

In my own code I'd almost always be happy e.g. checking isinstance(x,
Sequence) at the start of a function -- except lists, tuples and strings
won't cooperate.  If they have classes down the road, and I can change their
classes' bases, I can insert my own "Sequence" marker in the builtins.  But
I believe enough other people would do likewise that we'd end up with a
Tower of Babel.

Python can cut that off, and simultaneously build a foundation for formal
interfaces and optional "type" declarations, by taking a stand and treating
classes in the core as if they meant something.

This does not break current, or preclude people from writing new,
hyper-general methods and functions using ad hoc mixtures of isinstance,
hasattr, and "just try it and see whether it blows up".  In fact, the C
implementation still needs to be written that way.  It does send a message
that such convolutions support advanced capabilities, though, and usually
aren't needed in everyday code.  That in turn makes Python friendlier for
all save a tiny elite.

Short note:  Peeking ahead in other position papers, I see that "execution
context" (something suitable to pass to eval/exec) is yet another piece of
folklore in the making.  Boo!  Good idea,  but let's use classes to give it
a real name (== one Python understands too).  In fact, brand new features
like that can *require* derivation from a specific builtin marker class, and
so enjoy simpler & clearer documentation (albeit not necessarily simpler
implementation).