Determining types of variables and function params by parsing the source code

Sun Aug 25 12:08:24 EDT 2002

Stefan,

Maybe the terminology I used was unclear. I will give some examples
below to illustrate my thoughts. My view is that, for determining
types, it doesn't matter whether objects are referenced or copied, or
whether they are class instances or primitive types. All that matters
is that an object has some type information at creation and that, as
the object is copied or referenced, the same type information applies
to the copy and/or reference.

> > Variables (objects) are instantiated in Python by using one of the
> > following mechanisms (am I missing something?):
> > 1) Explicit instantiation of basic types (numbers, strings, ...)
a=1
mystring="hello world"

> > 2) Construction by the object constructor
myapp = MyApplication()

> > 3) Assignment of another variable
referencetoapp = myapp

> > 4) Assignment of the return value of a function
newstring = replace(oldstring, "old", "new")
=> We can assume that replace() returns a string here.

> > 5) Assignment of the result of an operator action (e.g. list
> > operators)
partofstring = mystring[5:10]
=> Because mystring is a string, partofstring will be a string, too.
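
The five cases above can be sketched roughly as follows. This is only
a minimal illustration, assuming a modern Python 3 with the standard
`ast` module. The names `KNOWN_RETURNS`, `LITERAL_NAMES`, `infer` and
`infer_module` are made up for this example, the "capitalized name
means constructor" rule is a naming-convention guess, and "a slice has
the type of the sliced object" holds for strings and lists but not
necessarily for arbitrary classes.

```python
import ast

# Hypothetical table of known return types; a real tool would have to
# build this by parsing library sources or documentation.
KNOWN_RETURNS = {"replace": "string"}

# Type names reported for literals of basic types.
LITERAL_NAMES = {int: "int", float: "float", str: "string"}


def infer(node, env):
    """Return a type name for an expression node, or None if unknown."""
    if isinstance(node, ast.Constant):          # 1) literal of a basic type
        return LITERAL_NAMES.get(type(node.value))
    if isinstance(node, ast.List):              # 1) list literal
        return "list"
    if isinstance(node, ast.Name):              # 3) another variable
        return env.get(node.id)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        name = node.func.id
        if name in KNOWN_RETURNS:               # 4) known return type
            return KNOWN_RETURNS[name]
        if name[:1].isupper():                  # 2) assumed constructor call
            return name
        return None
    if isinstance(node, ast.Subscript) and isinstance(node.slice, ast.Slice):
        return infer(node.value, env)           # 5) slice keeps the type
    return None


def infer_module(source):
    """Map each module-level variable to the type inferred for it."""
    env = {}
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign):
            inferred = infer(stmt.value, env)
            for target in stmt.targets:
                if isinstance(target, ast.Name) and inferred is not None:
                    env[target.id] = inferred
    return env
```

Fed the example assignments from this mail, infer_module() would
report "string" for mystring, newstring and partofstring, and
"MyApplication" for both myapp and referencetoapp. Anything it cannot
resolve is simply left out of the result.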

> Perhaps I don't understand you correctly but assignments per se don't create
> new objects but build new references to existing objects.
I understand that; see my remark above on why I don't think this
matters here.

> On the other hand, no one disallows this:
> 
> def print13(s):
>     t = s[1:3]
>     print t
> 
> s = 'hello'
> print13(s)
> 
> print13( [1, 2, 3, 4] )
> 
> In the second invocation, s in print13 is a list.

So the parser would accept that this function can be called with
objects of more than one type. For example, if the parser were used to
display auto-complete (tooltip) information, it would say something
like
"print13(s [string, list])".
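
One way such a tooltip could be assembled is to record the argument
types seen at every call site. The following is only a sketch under
several assumptions: it uses Python 3's standard `ast` module (so the
quoted example has to be written with print(t)), it looks only at
module-level statements and positional arguments, and the helper names
`simple_type`, `parameter_types` and `tooltip` are invented for this
illustration.

```python
import ast

LITERAL_NAMES = {int: "int", float: "float", str: "string"}


def simple_type(node, env):
    """Type name for a literal or a known variable, else None."""
    if isinstance(node, ast.Constant):
        return LITERAL_NAMES.get(type(node.value))
    if isinstance(node, ast.List):
        return "list"
    if isinstance(node, ast.Name):
        return env.get(node.id)
    return None


def parameter_types(source):
    """Collect the argument types observed at module-level call sites."""
    params = {}  # function name -> list of parameter names
    env = {}     # module-level variable -> type name
    seen = {}    # (function, parameter) -> type names, in first-seen order
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.FunctionDef):
            params[stmt.name] = [a.arg for a in stmt.args.args]
        elif isinstance(stmt, ast.Assign):
            inferred = simple_type(stmt.value, env)
            for target in stmt.targets:
                if isinstance(target, ast.Name) and inferred is not None:
                    env[target.id] = inferred
        elif isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
            call = stmt.value
            if isinstance(call.func, ast.Name) and call.func.id in params:
                func = call.func.id
                for name, arg in zip(params[func], call.args):
                    inferred = simple_type(arg, env)
                    types = seen.setdefault((func, name), [])
                    if inferred is not None and inferred not in types:
                        types.append(inferred)
    return params, seen


def tooltip(func, params, seen):
    """Format 'func(param [type, ...])' from the collected information."""
    parts = []
    for p in params[func]:
        types = seen.get((func, p), [])
        parts.append(f"{p} [{', '.join(types)}]" if types else p)
    return f"{func}({', '.join(parts)})"


EXAMPLE = """
def print13(s):
    t = s[1:3]
    print(t)

s = 'hello'
print13(s)

print13([1, 2, 3, 4])
"""

params, seen = parameter_types(EXAMPLE)
print(tooltip("print13", params, seen))  # print13(s [string, list])
```

The result is a list of observed types, not a proof: a call site the
parser never sees can still pass something else.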

> I think you can get some type information, but it will rather be a good guess than
> a safe bet. Also consider dynamic source/code generation via exec, eval and
> execfile. You can't safely determine the type by static analysis, even without
> exec etc., e. g.

I'm fully aware of that. The described system isn't intended to
determine types safely, anyway. Just imagine a library like wxPython.
If you could parse the whole library plus the demo programs supplied
with it, and were able to obtain type information for, say, about 80%
of the classes and functions in that library, this information would
be invaluable for auto-completion in tooltips and automatic sanity
checking of code.

PyChecker already does some of that to some extent, but AFAIK it is
far from complete (maybe someone with much experience with PyChecker
could add a remark here?).

Markus