[Types-sig] Type Inference I

Greg Stein gstein@lyra.org
Sat, 18 Dec 1999 00:43:45 -0800 (PST)

Whatever you want to call it: inference or deduction or type analysis. I
think we will be doing "bottom up type analysis" to use your phrasing. I
don't think we need any top-down inferencing at all.

In a function, we get our initial type inputs from the arguments and from
function return values. With those, we compute the types of each item.
Those are the types we pass to functions or return from the function.
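That bottom-up computation can be sketched roughly like this. It is a toy illustration over Python's ast module, not anybody's actual proposal; the type names and the table of allowed additions are hypothetical stand-ins for whatever rules we'd really adopt:

```python
import ast

# Hypothetical table of permitted additions: (left, right) -> result.
BINOP_RULES = {
    ("int", "int"): "int",
    ("float", "int"): "float",
    ("int", "float"): "float",
    ("str", "str"): "str",
}

def infer(node, env):
    """Synthesize the type of an expression node, leaves up.
    env maps argument names to their declared types."""
    if isinstance(node, ast.Constant):
        return type(node.value).__name__
    if isinstance(node, ast.Name):
        return env[node.id]            # type flows in from the arguments
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        left = infer(node.left, env)
        right = infer(node.right, env)
        try:
            return BINOP_RULES[(left, right)]
        except KeyError:
            raise TypeError(f"cannot add {left} and {right}")
    raise NotImplementedError(ast.dump(node))

expr = ast.parse("x + 1", mode="eval").body
print(infer(expr, {"x": "int"}))       # int
```

Given "x: file", the same lookup fails and we report the error at compile time instead of waiting for the runtime exception.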

I still don't see your point. You've gone on at length about exceptions,
semantics, and what kinds of inference can or can't be done. What is the
point? At the end of this you say something about adding "static-y" stuff
to Python? What do you mean? Honestly, the previous email and this one
just seem to be a lot of gobbledygook: long words, short on applicable,
useful content. Again: call me small-minded, but I think you're being
overly obtuse.

Comments below...

On Sat, 18 Dec 1999, skaller wrote:
> Greg Stein wrote:
> > On Sat, 18 Dec 1999, skaller wrote:
> > > ... long post about exceptions and semantic definition ...
> > 
> > Sorry John, call me dense, but I really don't see what you're talking
> > about. :-(
> 	It takes a while to understand the impact of conformance
> and specifications on semantics .. and that this is not just a
> matter of language lawyering, but a real, pragmatic, issue.

I'm not an idiot. If it takes me a while, then it is going to take
everybody a while. Phrase your discussion so that you're actually saying
something, rather than speaking so much in the abstract. You bring up
points about boundary cases and how they throw exceptions: great, but
nobody cares about those boundary cases (I'm never going to feed my .emacs
file into Python).

> > I don't see a problem with exceptions. That is part of Python. I don't see
> > that it causes any problems with type inference, either (it just
> > introduces interesting items into the control/data flow graph).
> 	The problem arises roughly as follows:
> type inference works by examining an expression like:
> 	x + 1
> and _deducing_ that x MUST (**) be an integer. It cannot
> be a file, because it isn't allowed to add a file to an integer.
> But in Python you CAN add a file to an integer. It is perfectly
> legal, it just throws an exception.

You can't deduce/infer anything from x+1. x could be a class instance, in
which case you're totally screwed. Otherwise, it could be any numeric
type. But even then: as you point out, it could be a string.

This is entirely the wrong direction. We aren't trying to figure out what
x *should* be. We're trying to say "x is <this>. will it cause an error?"

> Do you see? This means we cannot deduce ANYTHING about 'x'
> in the example snippet given above.
> Of course, the _expression_ x+1 can only be an integer,
> we _can_ deduce that. But that isn't enough. Python

You can't deduce that at all.

class foo:
  def __add__(self, value):
    return "hello"   # so foo() + 1 evaluates to a string

> is too dynamic. We need more constraints to be able
> to do effective inference.

Not at all. As I mentioned: we'll be doing bottom-up analysis. We don't
need constraints: we just need some type annotations on the input values
(e.g. arguments and return values).

> (**) This example ignores class instances with __add__ methods,
> to make the argument easier to follow.

Your argument is easy to follow, but I don't see *why* you're making the
argument. I don't care what x is from "x+1". I know what x is from where
it got assigned a value.

> > This whole tangent about feeding an email to Python and claiming it is a
> > valid Python program with defined semantics (raise SyntaxError). I
> > understand your explanation, but I totally miss the point. So what?
> 	See above. We cannot infer anything, unless there are rules.
> That is, there MUST be set of permitted signatures for
> functions/operators,
> in order to do inference at all.
> 	It is possible to do synthetic (bottom up) type analysis, such as:
> 	x = 1 + 1
> Here, we know that Int + Int -> Int, and so x (at least at this
> point in the program) must be an Int. But that is only
> the 'deductive' part of inference, the 'inferential' part
> infers the types of _arguments_ from the set of allowable
> signatures of functions. That is, we must do  the inference
> top down, not just bottom up.

No. We're only going to do bottom up, as far as I know. Nobody has even
ventured a request to have any kind of top-down inference.

In fact, most people don't want anything beyond simple statement-level
inference (achievable by declaring the types of all names used). I'd
rather see no local declarations because we can infer/deduce the types of
all local names.
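Inferring the locals from the assignments is straightforward in the simple cases. Here is a rough sketch of that statement-level pass; the helper names and the int-only addition rule are hypothetical, just enough to show types propagating through a function body with no local declarations:

```python
import ast

def infer_expr(node, env):
    """Bottom-up type of one expression (toy rules: int + int only)."""
    if isinstance(node, ast.Constant):
        return type(node.value).__name__
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        l, r = infer_expr(node.left, env), infer_expr(node.right, env)
        if l == r == "int":
            return "int"
        raise TypeError(f"cannot add {l} and {r}")
    raise NotImplementedError(ast.dump(node))

def infer_locals(source, arg_types):
    """Walk a function body in order, typing each local as it is
    assigned. Only plain 'name = expr' assignments are handled."""
    env = dict(arg_types)              # seeded from declared arguments
    func = ast.parse(source).body[0]
    for stmt in func.body:
        if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
            env[stmt.targets[0].id] = infer_expr(stmt.value, env)
    return env

src = "def f(n):\n    x = n + 1\n    y = x + x\n    return y\n"
print(infer_locals(src, {"n": "int"}))   # {'n': 'int', 'x': 'int', 'y': 'int'}
```

The only annotation needed is on the argument; x and y fall out of the assignments.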

> > Type inferencing for the "1 + file" case is easy. You know the two types,
> > and you know they can't be added. Bam. Error.
> 	but you're wrong, the result of applying the addition
> operation is, in fact, well defined: it is NOT an error in the program,
> it just throws an exception, rather than returning a value.
> If you throw an exception deliberately, that is hardly an error, is it?

All right. Now you're just being silly. The entire purpose of this
discussion is to locate those exceptions at compile time. THAT IS THE
PURPOSE HERE. By definition, we are saying it is wrong.

Argue semantics all you want about what is correct or not, but raising an
exception is exactly what we want to avoid. We want to know about it
before we run the program.

> > It was a long email, but what exactly were you trying to say? "Define the
> > semantics" isn't very clear. I feel Python has very clear semantics. What
> > exactly is wrong with them?
> 	There is no distinction made between 'incorrect' code,
> and 'correct' code for which an exception is thrown.

What?! Of course there is a distinction. We want to filter out the
incorrect code (i.e. that which uses types incorrectly and would throw
errors at runtime).

> 	In compiled code, we need the distinction, because there
> is a lot of overhead in doing the dynamic type checking required
> to throw the exception. The whole point of compilation is to
> eliminate the overhead of run time type checking.

In Python, the whole point of compilation is to transform source code into
something that the PVM can execute. Done.

>... more stuff ...

> Hope this makes sense: to compile python code effectively,
> we need to add some reasonable 'static-y' restrictions.
> Where, 'reasonable' means 'suitably pythonic', but not
> quite as dynamic as the current CPython 1.5.2 implementation
> allows.

No, it doesn't make sense. I see that we can do this with declarations and
without the need for restrictions.

I'm getting the feeling that you are trying to solve an entirely different
problem from what we've been discussing over the past week. Your
discussions about what is correct and incorrect just don't seem to have
any basis in the problem being worked on. We want to detect incorrect code
before runtime, where "incorrect" is defined as throwing an (unexpected)
exception. And it is actually pretty easy to tell whether something is
expected or not: did the developer put in a try/except?
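Mechanically, "expected" could be approximated by asking whether the operation sits inside a try block whose handler names the exception. A crude lexical sketch, assuming we only care about TypeError; the helper and its rules are hypothetical:

```python
import ast

def expected_errors(source):
    """Return the line numbers whose code is guarded by an
    'except TypeError' handler -- places where the developer has
    said "I expect this may raise". Purely lexical, no flow analysis."""
    tree = ast.parse(source)
    guarded = set()
    for node in ast.walk(tree):
        if not isinstance(node, ast.Try):
            continue
        catches = {h.type.id for h in node.handlers
                   if isinstance(h.type, ast.Name)}
        if "TypeError" in catches:
            for stmt in node.body:             # only the try body is guarded
                for sub in ast.walk(stmt):
                    if hasattr(sub, "lineno"):
                        guarded.add(sub.lineno)
    return guarded

src = '''try:
    open("f") + 1
except TypeError:
    pass
x = "a" + 1
'''
print(sorted(expected_errors(src)))   # [2]
```

The addition on line 2 is guarded, so the checker stays quiet; the one on line 5 is not, so it gets flagged.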

Your discussion seems to be saying something about removing exceptions. But
honestly, I really don't know what you're advocating.

I'm sorry, but I'm obviously a little bit tweaked. As Guido said, maybe
too much sugar lately :-). More likely, not enough sleep. How about if you
write a short email with a concrete suggestion for a change? That may help
to define what exactly you're suggesting should happen. All this
background "theory" is just noise to me.


Greg Stein, http://www.lyra.org/