[Types-sig] expression-based type assertions

Edward Welbourne Edward Welbourne <eddyw@lsl.co.uk>
Wed, 15 Dec 1999 19:59:10 +0000


[Due to confusion on my part, this begins by echoing other stuff]
[First: something I sent to Greg while forgetting it wasn't to all]

A reason why I want not-just-dotted-names as the type checking object
... a generator for type-checking tuples (say), which takes some
parameters (checkers for the individual members of a tuple) and returns
a checker for that flavour of tuple.
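By way of illustration, such a generator might look like this in plain Python (all names hypothetical; a checker here is simply a callable returning true on acceptable values):

```python
# Hypothetical sketch: TupleOf builds a checker for one flavour of
# tuple out of checkers for its individual members.
def TupleOf(*member_checkers):
    def check(value):
        return (isinstance(value, tuple)
                and len(value) == len(member_checkers)
                and all(c(v) for c, v in zip(member_checkers, value)))
    return check

# a checker for (int, str) pairs
IntStrPair = TupleOf(lambda v: isinstance(v, int),
                     lambda v: isinstance(v, str))
```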

Of course, a compiler can only *exploit* this if it `knows' what the
relevant type checker `really will be'.  However, I see no reason
against *allowing* the checker construct to be applied using checkers
that no compiler could hope to recognise - these will buy you 
robustness but no speed - while the spec for checking can say that
compilers are at liberty to exploit any checkers they *do* recognise.

Even dotted names, if the compiler doesn't `know' the relevant
checker, won't prevent this: and parameterised checkers, like the
tuple example above, can be known (by some compilers) despite 
involving more than just dotted names.

Note: one common type for functions is the whatever-or-None used
for default arguments, notably in the case where the default for 
the argument is an empty dict (or list), but the standard gotcha
about using {} or [] as default obliges one to use None and begin
the function with 

    if arg is None: arg = { } # or [ ] as may be

This is going to provide irritation for the syntax of checking
unless something sensible is dreamed up (or it's not done in the
argument list).  A possibility would be `if a default value is
given, this value is also tolerated for the argument, even if it
fails the type check' subject to a presumption that the function
must begin with something which copes with the default, replacing
it with something that matches the type spec ...
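Spelled out longhand with today's machinery (OrNone and the checkers it wraps are invented names, not part of any proposal), the pattern is:

```python
# Hypothetical sketch: wrap a checker so None - the conventional
# stand-in for a mutable default - is also tolerated.
def OrNone(checker):
    return lambda value: value is None or checker(value)

DictOrNone = OrNone(lambda v: isinstance(v, dict))

def f(arg=None):
    assert DictOrNone(arg)    # the type check, tolerating the default
    if arg is None: arg = {}  # replace it with what the spec expects
    return arg
```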

[To which Greg replied (I've shrunk my excerpts here):]

EW> A reason why I want ... returns a checker for that flavour of tuple.

Sure. Syntactical type declarators are fine. But arbitrary expressions
would prevent the compiler from understanding what was going on.

In fact, I proposed this exact kind of thing in the "GFS proposal". Did you
read that?  For example:

x = foo() ! Int
x, y = foo() ! (Int, String)

In the first case, you have a dotted name. In the second, the parser and
compiler understand that the parens mean tuple-of-these-types.

EW> Of course, a compiler ... exploit any checkers they *do* recognise.

A dotted name allows the following construct:

MyChecker = SomeCheckerGenerator(...)
x = foo() ! MyChecker

Again, this was in the GFS proposal. Since you can always do an assignment
such as above, I felt it was quite fine to say "only dotted names."

EW> Even dotted names, ... involving more than just dotted names.

Well... it gets pretty tough for the compiler, the further you move from
simple dotted names. Worst case, the compiler can issue a warning saying
it doesn't understand how to do the compile-time check, and then insert a
runtime check.

EW> Note: ... whatever-or-None ... something that matches the type spec ...

True. Syntactic markers can be created and used to state "<type> or None".

[OK, sorry about that, back to the present]

Two issues are developing in the list: one is name-checking, the other
is value-checking.  The two are mostly separable - however, the
parameters of a function provide a clash: value-checking says that the
parameter is tagged with the type of value the *caller* may supply,
name-checking tags it with a constraint that stops the function
subsequently using that name to hold any other type (while incidentally
doing the value-checker's check when the function is invoked).  This is
an area of difficulties.

I wish to state explicitly that:

  A quite natural consolidation of the existing python object model will
  leave us in a position where attribute modification is *always* done
  by thing.__setattr__(key, val) or thing.__delattr__(key) for any thing
  you care to mention (albeit with some subtleties).

  The resulting framework *will* make it easy to:

    * set up a name-space such that the suite executed to initialise it
      performs sensible attribute modification (without any fancy syntax
or fuss within the suite), yet: once it is initialised it has no
      attribute modification methods - if you really want, access to its
      __dict__ can be unavailable.

    * police any restrictions you can specify on what values may be
      stored against which names in a given namespace: you do this with
      a __setattr__ hack.

    * have your locals and globals just be namespaces, so there's
      nothing special about them - i.e. you can do the above with them.
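The second bullet, for instance, needs nothing more than a __setattr__ along these lines (class and names purely illustrative):

```python
# Illustrative only: a namespace that polices which values may be
# stored against which names, via a __setattr__ hack.
class Policed:
    _specs = {'count': lambda v: isinstance(v, int)}

    def __setattr__(self, key, val):
        check = self._specs.get(key)
        if check is not None and not check(val):
            raise TypeError('%r rejected for attribute %r' % (val, key))
        object.__setattr__(self, key, val)
```

Assignments to `count` that fail the check raise TypeError; names with no registered spec are left alone.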

It is unnecessary to add syntax to the language for the purpose of
specifying what you can arrange to have __setattr__ do.  In particular,
use of setattr hacks is the right way to implement any fascist policy a
name-space wants to use in controlling modifications to itself - not new
grammar.

However, that only affects the robustness side of matters - it doesn't
provide for the compiler to know it can presume the relevant hacks are
in place: but value-checking can be used here too, with care.  (Have a
checker which corresponds to the assertion: this object has a namespace
containing the name foo, with value matching bah-spec; optionally
specifying that this can't be changed - which the checker can check by
trying to, or by looking for the attribute modification methods.)  So
I'm falling into the purely value-checked camp.
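Such an assertion is easy to express as a plain value-checker (names hypothetical):

```python
# Hypothetical sketch: a checker asserting that an object's namespace
# contains the name 'foo' with a value matching bah_spec.
def HasFoo(bah_spec):
    def check(thing):
        return hasattr(thing, 'foo') and bah_spec(thing.foo)
    return check

IntFoo = HasFoo(lambda v: isinstance(v, int))
```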


Type-specifications on values (of which function argument/return
type-specs are examples) are valuable: my one reservation about them is
that they are syntactic sugar for assertions - but I accept they are a
good scheme which will genuinely increase the extent to which folk will
make assertions: which improves robustness and maintainability.

I will suppose that ! is the operator to be used for this (since : is so
widely used already that another use would be irksome), but I don't
really care what the symbol is (though: the type assertion makes sense
as an exclamation, e.g. `7 + (x ! IntType)' reads quite neatly as
   7 plus x, which *is* an int !
as it were).  I have some sympathy with someone's suggestion that the
return-type of a function gets a different spelling for !, e.g. ->
I *really* like TimP's Haskellish spelling of type-specifiers.


The function of inline assertions is then to ensure that the compiler
can perform (rudimentary) type inference based on that which is
asserted.  There are two ways a compiler can respond to !-assertions
(indeed, it has the same choice with assert):

  * exploit - generate faster code, presuming the asserted truth
  * check - verify the assertion

The latter is obvious - perform the test, raise the exception on
failure.  One of the easiest ways to exploit an assertion is to infer
that some subsequent assertion is guaranteed true and elide its check.

The inferencer has other sources of data than the !-assertions - for
instance, immediately after the expression `x+7' has been evaluated, we
know that x's value supports addition, at least with integers, at least
this way round - either that, or we've raised an exception and aren't
executing the code which followed.

I believe it is philosophically valid for the inferencer to presume the
truth of anything that is asserted (by existing assert statements) even
if __debug__ is false.  How much umbrage does that raise ?

The inferencer takes all these sources of information and infers what it
will, the compiler exploits the inferred truths in hunting for efficient
ways to implement the given code.  Any part of the code that the
inferencer doesn't know how to exploit, it simply (silently) doesn't
exploit.  Whether it bothers to check will depend on optimise/debug
flags it's been given and how anal the compiler-writer is.


On the dotted-names issue: so long as it's valid to say

> MyChecker = SomeCheckerGenerator(...)
> x = foo() ! MyChecker

there is no mileage in forbidding

x = foo() ! SomeCheckerGenerator(...)

it just obliges me to pollute my namespace (unless, of course, I intend
to re-use the checker).  The objection that the compiler has a hard time
coping with this is without substance:

  * the difficulties involved in recognising SomeCheckerGenerator(...)
    are present in both of the above and in no way reduced by storing
the result in a variable in the meantime: the compiler's knowledge
    of what `! MyChecker' buys it is entirely dependent on making sense
    of the code it's just seen which gives MyChecker a value.

  * each compiler is only going to recognise a sub-set of the
    type-specifiers deployed, even with the dotted-name constraint.

  * when the compiler recognises a checker, it can generate code which
    exploits the truth asserted.

  * the right thing for a compiler to do about unrecognised checks is to
    not exploit them.  After all, it has nothing to exploit.

and, in any case, dotted names (may) involve function calls:
__getattr__.
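To see why a bare dotted name can still conceal a function call, consider (class hypothetical):

```python
# A dotted name may invoke __getattr__, i.e. arbitrary code: the
# compiler is no better off than with an explicit call.
class CheckerShelf:
    def __getattr__(self, name):
        # could compute anything here before answering
        return lambda value: 1

checkers = CheckerShelf()
found = checkers.MyChecker  # looks like a name, runs code
```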

> I think altering isinstance() to accept a callable is preferable to
> introducing a __check__ method.

Ah, have I mis-understood you ... I thought you said that isinstance
would take a third argument which is callable ... were you saying that
it accepts a third kind of thing as its second argument ?  In which case
I see where you're going and that sounds great.

def isinstance(thing, *modes):
    for mode in modes:
        if type(mode) is TypeType:
            if type(thing) is mode: return 1
        elif type(mode) is ClassType:
            try:
                if issubclass(thing.__class__, mode): return 1
            except AttributeError: pass
        else:
            # mode is presumed to be a callable
            if mode(thing): return 1
    return 0


Associativity:

7 ! IntType ! (Prime ! TypeChecker)

7 is an integer, indeed it's a prime (oh, and Prime is a type-checker).
So ! groups to the left (I think that's what + et al. do).

Paul said:
> There is also something deeply elegant and useful about a separation
> of interface from implementation.

Conceptual separation - yes.
Physical separation - quite the opposite.
When programs and documentation live apart, they drift apart.

The only way the interface spec can live apart from the implementation
is when the implementation can be checked for compatibility with the
interface (so that the implementation change which changes the interface
gets recognised as such the first time the changed code is used).  If we
can check the implementation for compatibility with its interface spec,
we're already specifying the interface in the implementation, so why
bother having a second copy in a separate file ?


This list is too busy.
Time I went home, before I make it any worse.

	Eddy.
--
Yes, I did read the GFS proposal.