[Types-sig] Interface PEP

Sverker Nilsson sverker.is@home.se
Wed, 21 Mar 2001 17:55:57 +0100


Marcin 'Qrczak' Kowalczyk wrote:
> 
> Thu, 15 Mar 2001 01:31:22 +0100, Sverker Nilsson <sverker.is@home.se> pisze:
> 
> > What not only many but perhaps all people would agree on, including
> > me, is that the builtin types or classes should be unified with
> > user-defined classes in this way: So that user-defined classes can
> > inherit from the classes that the objects with builtin types have.
> 
> I agree.

Ok.

> 
> > That doesn't necessary mean unifying types with classes. It could just
> > as well mean that we defined the classes of the builtin objects.
> 
> Why not to unify then?

I'm not sure how that would work or what is really meant. But I suppose
that the unification you are thinking of would make types and classes
the same: in every context where you say type you could likewise say
class, and vice versa.

I don't see how this could be done or how useful it is.

How about the type() builtin? I assume we can't remove it or rename it
to class() because that would break too much code. What should it return
for an instance of a user-defined class then? If it still returned
InstanceType, I don't see that types and classes have been
logically unified. Because what would InstanceType be? If all types
were classes, InstanceType would also be a class. But it would be
separate from the class of the object itself.

That is in contrast with type() of a builtin object, which would be
its class. For other objects it would be InstanceType. So there is
no unification in this respect.

On the other hand, if type() of a user-defined object returned its
class, it would break an unacceptable amount of code, I (strongly) believe.

Therefore, it seems to me that type and class cannot really be unified.

However, we can define the classes of builtin objects/types, without
the above suggested conceptual problems. 

I said previously we could have e.g. BuiltinClasses.List etc. Now I
have seen that Guido has previously suggested something similar in a
mail on Python-Dev.

http://mail.python.org/pipermail/python-dev/2000-November/010417.html

Guido:
"""

As long as we're proposing hacks like this that don't allow smooth
subclassing yet but let you get at least some of the desired effects,
I'd rather propose to introduce some kind of metaclass that will allow
you to use a class statement to define this.  Thinking aloud:


import types
filemetaclass = metaclass(types.FileType)

class myfile(filemetaclass):
	
"""

Granted, he seems to regard this mostly as a temporary solution, but I
don't see why it can't be part of the permanent solution.

> 
> I understand that there are technical reasons, like the speed
> resulting from avoiding looking up operations by name, or the fact

[snip yes I am not referring to technical implementation]

> But from the point of users of the language there is no reason why
> files should be objects of FileType and not objects of InstanceType
> with class File. It's an artifact of the implementation.

Are you saying that type() should return InstanceType for ALL types,
builtin or otherwise? That would break a lot of code! Totally
unacceptable, I would say.

> 
> > Why would it be cleaner to unify type and class, given the many
> > variants that exist, that you describe (some of) yourself below?
> 
> Because concepts for which many other languages use terms "type"
> and "class" are more different than concepts for which Python uses
> these terms (except C++, but let's not follow this crazy language).
> 
> These words have different meanings in those languages and whether
> types and classes can be unified in Python is independent from whether
> they can be unified in Haskell. It's just a terminology clash.

Not going into this much deeper for now. Interface, type and
class are all overloaded terms. Builtin Python types happen to specify
specific implementations. I still don't see why we can't have
something more that we call types, user-defined types, that happen to
specify another implementation (such as the implementation shared by
all InstanceType objects) but are mostly used to specify a
protocol/interface.

It's about 'type checking', after all.

[snip]

> [ About Haskell static type system, I wrote: ]
> > And I read some paper that showed how impossible it was to define
> > a natural join... in some context.
> 
> I don't know what is natural join.

I'll try to explain the problem - maybe it is of some interest for the
Python efforts too. [ Disclaimer: I don't have the original paper
handy so these are guesses from memory; I may be missing their point
;-/]

From p.82 in "Database System Concepts", Silberschatz et al, 3rd
edition:

   The natural-join operation forms a Cartesian product of its two
   arguments, performs a selection forcing equality on those
   attributes that appear in both relation schemas, and finally removes
   duplicate attributes.


I suppose it's the "removes duplicate attributes" that would create
some of the (worst) problems... but on second thought it seems more
complicated than that. To illustrate... a Python natural-join
procedure might look like:

def natural_join((schema_a, relation_a), (schema_b, relation_b)):
    # schema_[ab] are tuples of attribute names 
    # relation_[ab] are lists of tuples of attribute values
    # each tuple in relation_[ab] has same length as schema_[ab]

    # The types of attribute values can be different for each attribute.

    # The types of attribute values should support comparison at
    # least for equality; support for ordering (<) or hashing will
    # allow for faster algorithms to be applied. This, however, only
    # really applies to the attribute values that are really compared,
    # that is, the attributes that have names that occur in both
    # schema_a and schema_b.

    # The type of attribute names should support comparison for
    # equality. (Specifying anything more would not speed up anything
    # significantly except for pathological cases.)

    # Return: (schema_ret, relation_ret), where:
    #
    # schema_ret is a tuple of all unique attribute names from
    # schema_[ab]
    # relation_ret is a list of tuples with values corresponding to
    # schema_ret
    #
    # Typecheck and implementation left as an exercise <wink>

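For what it's worth, here is a minimal sketch of how the body might be
filled in. It is not the only way and certainly not a fast one. It
assumes attribute names are unique within each schema, takes the two
(schema, relation) pairs as ordinary arguments and unpacks them in the
body, and relies on nothing but equality comparison, so it is a plain
nested loop. Unlike a strict relational join it does not remove
duplicate rows.

```python
def natural_join(a, b):
    """Natural join of two (schema, relation) pairs.

    A schema is a tuple of attribute names; a relation is a list of
    tuples of attribute values, one value per name in the schema.
    """
    schema_a, relation_a = a
    schema_b, relation_b = b

    # Attribute names that occur in both schemas, in schema_a order.
    common = [name for name in schema_a if name in schema_b]
    # Positions of those shared attributes in each schema.
    idx_a = [schema_a.index(name) for name in common]
    idx_b = [schema_b.index(name) for name in common]
    # Positions in schema_b of attributes that are not duplicates.
    rest_b = [i for i, name in enumerate(schema_b) if name not in schema_a]

    schema_ret = tuple(schema_a) + tuple(schema_b[i] for i in rest_b)

    relation_ret = []
    for row_a in relation_a:          # Cartesian product ...
        for row_b in relation_b:
            # ... with a selection forcing equality on shared attributes.
            if all(row_a[i] == row_b[j] for i, j in zip(idx_a, idx_b)):
                relation_ret.append(
                    tuple(row_a) + tuple(row_b[i] for i in rest_b))
    return schema_ret, relation_ret


employees = (("name", "dept"), [("ann", 1), ("bob", 2)])
depts = (("dept", "city"), [(1, "Oslo"), (3, "Lund")])
schema, rows = natural_join(employees, depts)
# schema is ("name", "dept", "city"); rows is [("ann", 1, "Oslo")]
```

Note that nothing in the signature says which positions of row_a must be
comparable with which positions of row_b; that only falls out of the
two schemas taken together.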

The types of the parameter objects depend on each other in complicated ways.
If we want to 'automate' the dynamic type check, it won't suffice to
give a type after each parameter. We need to give a type for all the
parameters together, or suitable combinations of them.

Paul Prescod's system has a dict with types of parameters separately
specified. When the parameters are dependent, however, it seems we
need something more general. We might want type check functions that
take any combination of parameters.

Doing the check dynamically should then be possible, 'automated' or
not, given suitable user-defined type-check functions. But statically
I don't know if the compiler can deduce anything useful in complicated
cases like this.
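
The idea of a check function that sees all the parameters at once could
be sketched roughly as follows. The names checked, check_join_args and
count_pairs are made up for illustration, and "values under a shared
attribute name must all have the same type" is only my stand-in
assumption for "comparable"; a real check might well be looser.

```python
import functools


def checked(check):
    """Decorator: run check(*args, **kwargs) on all arguments together
    before calling the decorated function."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            check(*args, **kwargs)
            return func(*args, **kwargs)
        return wrapper
    return decorate


def check_join_args(a, b):
    """Joint check of two (schema, relation) pairs; raises TypeError."""
    for schema, relation in (a, b):
        for row in relation:
            if len(row) != len(schema):
                raise TypeError(
                    "row %r does not fit schema %r" % (row, schema))
    schema_a, relation_a = a
    schema_b, relation_b = b
    # The dependency between the parameters: only attributes named in
    # BOTH schemas constrain each other (assumption: same type).
    for name in schema_a:
        if name in schema_b:
            i, j = schema_a.index(name), schema_b.index(name)
            seen = {type(row[i]) for row in relation_a}
            seen |= {type(row[j]) for row in relation_b}
            if len(seen) > 1:
                raise TypeError(
                    "attribute %r has mixed types %r" % (name, seen))


@checked(check_join_args)
def count_pairs(a, b):
    # Stand-in body; only the checking is of interest here.
    return len(a[1]) * len(b[1])
```

A per-parameter declaration could express the row-length check, but not
the second loop, since that needs both schemas and both relations at
hand.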

This is not a critique of Haskell per se - this check may well be
impossible to do statically at all... Although on the other hand that
is not the impression I remember from the original paper that was
about Haskell. Maybe I misunderstood something, maybe there could be
some practical way to do typing like this statically.

[snip]

> Types define the representation. Interfaces define the usage. Classes
> are somewhat between.

Not necessarily. I have seen some other views expressed.

> In dynamically typed languages they are much
> closer to types, because usage is not checked wrt. declared classes -
> in Python subclassing is only used to inherit behavior, not to create
> subtypes.
> 
> Sorry, unifying types and interfaces while leaving classes alone is
> complete nonsense.

Please, I think blanket statements like that don't contribute well to a
discussion where we are considering various ideas in an open manner...
with various degrees of expertise of course but trying our best.

That said, I believe it would be good to have a single way to refer to
what we put as the type-indication. If we have this syntax for
example:

def f(x:y)

What's the y? Most languages call it a type, I think.

However, as I found myself indicating in the natural-join example, it
will not suffice to specify the types separately for each parameter.
Something more general is needed. Just because it's more general
doesn't mean we can't consider it to be a type though, IMHO.

In Haskell you could specify interfaces AND types, when defining the
type of an object. They are specified in different ways. You can say,
if I remember correctly, for example:

f :: (Integral a, Integral b) => (a->b)->Tree a->Tree b

where Integral is a class (interface) and Tree is a parameterized
type.

Type specifications can get long. But you can't define a type that
contains all the above - this is not allowed, IIRC:

type atype = (Integral a, Integral b) => (a->b)->Tree a->Tree b

You have to repeat all of
'(Integral a, Integral b) => (a->b)->Tree a->Tree b'
for each definition of functions of the same type.

That's an unnecessary restriction, I think. Even though it may be
necessary in the context of the specific Haskell system of types,
it should not be necessary in general to keep interfaces and types
that rigidly separated.
		
tongue-in-cheek-ly yours,

Sverker