From jeremy@cnri.reston.va.us  Sun Mar  5 17:58:12 2000
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Sun, 5 Mar 2000 12:58:12 -0500 (EST)
Subject: [Compiler-sig] copyright/license BS (was: P2C stuff)
In-Reply-To: <Pine.LNX.4.10.10002241521520.28177-100000@nebula.lyra.org>
References: <ECEPKNMJLHAPFFJHDOJBGEGDCFAA.mhammond@skippinet.com.au>
 <Pine.LNX.4.10.10002241521520.28177-100000@nebula.lyra.org>
Message-ID: <14530.41012.487618.285485@bitdiddle.cnri.reston.va.us>

Getting back to this thread a little late...

>>>>> "GS" == Greg Stein <gstein@lyra.org> writes:

  GS> No... this is saying "do whatever. I don't care." In no way do I
  GS> believe anybody *is* trying to claim ownership. I'm simply
  GS> saying that Jeremy (and/or whoever) can do what they want. Do
  GS> whatever. No need to check with me.

I guess I'm not clear on the status of p2c.  The main reason I asked
was to avoid having two out-of-sync copies of transformer.py in the
world.

  GS> Heck... many of the modules that I've written, I call Public
  GS> Domain. In other words: I'm not even asserting a copyright!

Except that P2C has a copyright notice, and is not in the public
domain. That's the other reason I asked (though a secondary reason). 

Jeremy






From jeremy@cnri.reston.va.us  Mon Mar  6 19:28:12 2000
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Mon, 6 Mar 2000 14:28:12 -0500 (EST)
Subject: [Compiler-sig] example checkers based on compiler package
Message-ID: <14532.1740.90292.440395@goon.cnri.reston.va.us>

There was some discussion on python-dev over the weekend about
generating warnings, and Moshe Zadka posted a selfnanny that warned
about methods that didn't have self as the first argument.

I think these kinds of warnings are useful, and I'd like to see a more
general framework for them built around Python abstract syntax originally
from P2C.  Ideally, they would be available as command line tools and
integrated into GUIs like IDLE in some useful way.

I've included a couple of quick examples I coded up last night based
on the compiler package (recently re-factored) that is resident in
python/nondist/src/Compiler.  The analysis on the one that checks for
name errors is a bit of a mess, but the overall structure seems right.

I'm hoping to collect a few more examples of checkers and generalize
from them to develop a framework for checking for errors and reporting
them.

Jeremy

------------ checkself.py ------------
"""Check for methods that do not have self as the first argument"""

from compiler import parseFile, walk, ast, misc

class Warning:
    def __init__(self, filename, klass, method, lineno, msg):
        self.filename = filename
        self.klass = klass
        self.method = method
        self.lineno = lineno
        self.msg = msg

    _template = "%(filename)s:%(lineno)s %(klass)s.%(method)s: %(msg)s"

    def __str__(self):
        return  self._template % self.__dict__

class NoArgsWarning(Warning):
    super_init = Warning.__init__
    
    def __init__(self, filename, klass, method, lineno):
        self.super_init(filename, klass, method, lineno,
                        "no arguments")

class NotSelfWarning(Warning):
    super_init = Warning.__init__
    
    def __init__(self, filename, klass, method, lineno, argname):
        self.super_init(filename, klass, method, lineno,
                        "self slot is named %s" % argname)

class CheckSelf:
    def __init__(self, filename):
        self.filename = filename
        self.warnings = []
        self.scope = misc.Stack()

    def inClass(self):
        if self.scope:
            return isinstance(self.scope.top(), ast.Class)
        return 0        

    def visitClass(self, klass):
        self.scope.push(klass)
        self.visit(klass.code)
        self.scope.pop()
        return 1

    def visitFunction(self, func):
        if self.inClass():
            classname = self.scope.top().name
            if len(func.argnames) == 0:
                w = NoArgsWarning(self.filename, classname, func.name,
                                  func.lineno)
                self.warnings.append(w)
            elif func.argnames[0] != "self":
                w = NotSelfWarning(self.filename, classname, func.name,
                                   func.lineno, func.argnames[0])
                self.warnings.append(w)
        self.scope.push(func)
        self.visit(func.code)
        self.scope.pop()
        return 1

def check(filename):
    # keep the checker in its own global so it does not rebind check()
    global p, checker
    p = parseFile(filename)
    checker = CheckSelf(filename)
    walk(p, checker)
    for w in checker.warnings:
        print w

if __name__ == "__main__":
    import sys

    # XXX need to do real arg processing
    check(sys.argv[1])

------------ badself.py ------------
def foo():
    return 12

class Foo:
    def __init__():
        pass

    def foo(self, foo):
        pass

    def bar(this, that):
        def baz(this=that):
            return this
        return baz

def bar():
    class Quux:
        def __init__(self):
            self.sum = 1
        def quam(x, y):
            self.sum = self.sum + (x * y)
    return Quux()

------------ checknames.py ------------
"""Check for NameErrors"""

from compiler import parseFile, walk
from compiler.misc import Stack, Set

import __builtin__
from UserDict import UserDict

class Warning:
    def __init__(self, filename, funcname, lineno):
        self.filename = filename
        self.funcname = funcname
        self.lineno = lineno

    def __str__(self):
        return self._template % self.__dict__

class UndefinedLocal(Warning):
    super_init = Warning.__init__
    
    def __init__(self, filename, funcname, lineno, name):
        self.super_init(filename, funcname, lineno)
        self.name = name

    _template = "%(filename)s:%(lineno)s  %(funcname)s undefined local %(name)s"

class NameError(UndefinedLocal):
    _template = "%(filename)s:%(lineno)s  %(funcname)s undefined name %(name)s"

class NameSet(UserDict):
    """Track names and the line numbers where they are referenced"""
    def __init__(self):
        self.data = self.names = {}

    def add(self, name, lineno):
        l = self.names.get(name, [])
        l.append(lineno)
        self.names[name] = l

class CheckNames:
    def __init__(self, filename):
        self.filename = filename
        self.warnings = []
        self.scope = Stack()
        self.gUse = NameSet()
        self.gDef = NameSet()
        # _locals is the stack of local namespaces
        # locals is the top of the stack
        self._locals = Stack()
        self.lUse = None
        self.lDef = None
        self.lGlobals = None # var declared global
        # holds scope,def,use,global triples for later analysis
        self.todo = []

    def enterNamespace(self, node):
##        print node.name
        self.scope.push(node)
        self.lUse = use = NameSet()
        self.lDef = _def = NameSet()
        self.lGlobals = gbl = NameSet()
        self._locals.push((use, _def, gbl))

    def exitNamespace(self):
##        print
        self.todo.append((self.scope.top(), self.lDef, self.lUse,
                          self.lGlobals))
        self.scope.pop()
        self._locals.pop()
        if self._locals:
            self.lUse, self.lDef, self.lGlobals = self._locals.top()
        else:
            self.lUse = self.lDef = self.lGlobals = None

    def warn(self, warning, funcname, lineno, *args):
        args = (self.filename, funcname, lineno) + args
        self.warnings.append(apply(warning, args))

    def defName(self, name, lineno, local=1):
##        print "defName(%s, %s, local=%s)" % (name, lineno, local)
        if self.lUse is None:
            self.gDef.add(name, lineno)
        elif local == 0:
            self.gDef.add(name, lineno)
            self.lGlobals.add(name, lineno)
        else:
            self.lDef.add(name, lineno)

    def useName(self, name, lineno, local=1):
##        print "useName(%s, %s, local=%s)" % (name, lineno, local)
        if self.lUse is None:
            self.gUse.add(name, lineno)
        elif local == 0:
            self.gUse.add(name, lineno)
            self.lUse.add(name, lineno)            
        else:
            self.lUse.add(name, lineno)

    def check(self):
        for s, d, u, g in self.todo:
            self._check(s, d, u, g, self.gDef)
        # XXX then check the globals

    def _check(self, scope, _def, use, gbl, globals):
        # check for NameError
        # a name is defined iff it is in def.keys()
        # a name is global iff it is in gdefs.keys()
        gdefs = UserDict()
        gdefs.update(globals)
        gdefs.update(__builtin__.__dict__)
        defs = UserDict()
        defs.update(gdefs)
        defs.update(_def)
        errors = Set()
        for name in use.keys():
            if not defs.has_key(name):
                firstuse = use[name][0]
                self.warn(NameError, scope.name, firstuse, name)
                errors.add(name)

        # check for UndefinedLocalNameError
        # order == use & def sorted by lineno
        # elements are lineno, flag, name
        # flag = 0 if use, flag = 1 if def
        order = []
        for name, lines in use.items():
            if gdefs.has_key(name) and not _def.has_key(name):
                # this is a global ref, we can skip it
                continue
            for lineno in lines:
                # append takes one argument, so pass the triple as a tuple
                order.append((lineno, 0, name))
        for name, lines in _def.items():
            for lineno in lines:
                order.append((lineno, 1, name))
        order.sort()
        # ready contains names that have been defined or warned about
        ready = Set()
        for lineno, flag, name in order:
            if flag == 0: # use
                if not ready.has_elt(name) and not errors.has_elt(name):
                    self.warn(UndefinedLocal, scope.name, lineno, name)
                    ready.add(name) # don't warn again
            else:
                ready.add(name)

    # below are visitor methods
        

    def visitFunction(self, node, noname=0):
        for expr in node.defaults:
            self.visit(expr)
        if not noname:
            self.defName(node.name, node.lineno)
        self.enterNamespace(node)
        for name in node.argnames:
            self.defName(name, node.lineno)
        self.visit(node.code)
        self.exitNamespace()
        return 1

    def visitLambda(self, node):
        return self.visitFunction(node, noname=1)

    def visitClass(self, node):
        for expr in node.bases:
            self.visit(expr)
        self.defName(node.name, node.lineno)
        self.enterNamespace(node)
        self.visit(node.code)
        self.exitNamespace()
        return 1

    def visitName(self, node):
        self.useName(node.name, node.lineno)

    def visitGlobal(self, node):
        for name in node.names:
            self.defName(name, node.lineno, local=0)

    def visitImport(self, node):
        for name in node.names:
            self.defName(name, node.lineno)

    visitFrom = visitImport

    def visitAssName(self, node):
        self.defName(node.name, node.lineno)
    
def check(filename):
    global p, checker
    p = parseFile(filename)
    checker = CheckNames(filename)
    walk(p, checker)
    checker.check()
    for w in checker.warnings:
        print w

if __name__ == "__main__":
    import sys

    # XXX need to do real arg processing
    check(sys.argv[1])

------------ badnames.py ------------
# XXX can we detect race conditions on accesses to global variables?
#     probably can (conservatively) by noting variables _created_ by
#     global decls in funcs
import string
import time

def foo(x):
    return x + y

def foo2(x):
    return x + z

a = 4

def foo3(x):
    a, b = x, a

def bar(x):
    z = x
    global z

def bar2(x):
    f = string.strip
    a = f(x)
    import string
    return string.lower(a)

def baz(x, y):
    return x + y + z

def outer(x):
    def inner(y):
        return x + y
    return inner




From Moshe Zadka <mzadka@geocities.com>  Tue Mar  7 05:25:43 2000
From: Moshe Zadka <mzadka@geocities.com> (Moshe Zadka)
Date: Tue, 7 Mar 2000 07:25:43 +0200 (IST)
Subject: [Compiler-sig] Re: example checkers based on compiler package
In-Reply-To: <14532.1740.90292.440395@goon.cnri.reston.va.us>
Message-ID: <Pine.GSO.4.10.10003070712480.4496-100000@sundial>

On Mon, 6 Mar 2000, Jeremy Hylton wrote:

> I think these kinds of warnings are useful, and I'd like to see a more
> general framework for them built around Python abstract syntax originally
> from P2C.  Ideally, they would be available as command line tools and
> integrated into GUIs like IDLE in some useful way.

Yes! Guido already suggested we have a standard API to them. One thing
I suggested was that the abstract API include not only the input (one form
or another of an AST), but the output: so IDEs wouldn't have to parse
strings, but get a warning class. Something like:

An output of a warning can be a subclass of GeneralWarning, and should
implement the following methods:

	1. line-no() -- returns an integer
	2. columns() -- returns either a pair of integers, or None
	3. message() -- returns a string containing a message
	4. __str__() -- comes for free if inheriting GeneralWarning,
	                and formats the warning message.
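
A minimal sketch of that interface, written in present-day Python. Only
the name GeneralWarning and the four methods come from the list above;
the constructor arguments, the output format and the NotSelfWarning
subclass are assumptions made for illustration, and the first method is
spelled lineno() because line-no is not a valid Python identifier.

```python
class GeneralWarning:
    """Base class for checker output (hypothetical layout)."""

    def __init__(self, lineno, message, columns=None):
        self._lineno = lineno
        self._message = message
        self._columns = columns  # (start, end) pair, or None if unknown

    def lineno(self):
        return self._lineno

    def columns(self):
        return self._columns

    def message(self):
        return self._message

    def __str__(self):
        # Subclasses inherit this formatting for free.
        if self._columns is None:
            where = "line %d" % self._lineno
        else:
            where = "line %d, columns %d-%d" % (
                self._lineno, self._columns[0], self._columns[1])
        return "%s: %s" % (where, self._message)


class NotSelfWarning(GeneralWarning):
    """Example subclass: only supplies the message text."""

    def __init__(self, lineno, argname):
        GeneralWarning.__init__(
            self, lineno, "self slot is named %s" % argname)
```

An IDE could then call w.lineno() and w.columns() directly, while a
command-line tool would simply print str(w).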

> I've included a couple of quick examples I coded up last night based
> on the compiler package (recently re-factored) that is resident in
> python/nondist/src/Compiler.  The analysis on the one that checks for
> name errors is a bit of a mess, but the overall structure seems right.

One thing I had trouble with is that in my implementation of selfnanny,
I used Python's stack for recursion while you used an explicit stack.
It's probably because of the visitor pattern, which is just another
argument for co-routines and generators.
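
For comparison, here is a rough sketch of the recursive style, where the
Python call stack carries the "am I inside a class?" state instead of an
explicit Stack object. It assumes the present-day standard-library ast
module, not the compiler package discussed in this thread, and the
function name check_self and the tuple layout of the results are
illustrative only.

```python
import ast


def check_self(source, filename="<string>"):
    """Return (lineno, 'Class.method', message) for suspicious methods."""
    warnings = []

    def visit(node, class_name):
        # class_name is the enclosing class, or None when the nearest
        # enclosing scope is a function or the module.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                visit(child, child.name)
            elif isinstance(child, ast.FunctionDef):
                if class_name is not None:
                    args = child.args.args
                    where = "%s.%s" % (class_name, child.name)
                    if not args:
                        warnings.append(
                            (child.lineno, where, "no arguments"))
                    elif args[0].arg != "self":
                        warnings.append(
                            (child.lineno, where,
                             "self slot is named %s" % args[0].arg))
                # Functions nested in a method are not methods themselves.
                visit(child, None)
            else:
                visit(child, class_name)

    visit(ast.parse(source, filename), None)
    return warnings
```

Each recursive call plays the role of one push/pop pair in the
visitor-based checker.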

> I'm hoping to collect a few more examples of checkers and generalize
> from them to develop a framework for checking for errors and reporting
> them.

Cool! 
Brainstorming: what kind of warnings would people find useful? In
selfnanny, I wanted to include checking for assignment to self, and
checking for "possible use before definition of local variables" sounds
good. Another check could be a CP4E "checking that no two identifiers
differ only by case". I might code up a few if I have the time...
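
The case-clash check could look roughly like the sketch below. It
assumes the present-day standard-library ast module rather than the
compiler package, looks only at plain names and function arguments, and
the function name case_clashes is made up for the example.

```python
import ast
from collections import defaultdict


def case_clashes(source):
    """Return groups of identifiers that differ only by case."""
    spellings = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            spellings[node.id.lower()].add(node.id)
        elif isinstance(node, ast.arg):
            spellings[node.arg.lower()].add(node.arg)
    # Keep only the lower-cased keys that were spelled more than one way.
    return sorted(sorted(names) for names in spellings.values()
                  if len(names) > 1)
```

A real checker would also want line numbers and would need to decide
whether attribute names, class names and imports count.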

What I'd really want (but it sounds really hard) is a framework for
partial ASTs: warning people as they write code.

--
Moshe Zadka <mzadka@geocities.com>. 
http://www.oreilly.com/news/prescod_0300.html



From mwh21@cam.ac.uk  Tue Mar  7 08:31:23 2000
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 07 Mar 2000 08:31:23 +0000
Subject: [Compiler-sig] Re: example checkers based on compiler package
In-Reply-To: Moshe Zadka's message of "Tue, 7 Mar 2000 07:25:43 +0200 (IST)"
References: <Pine.GSO.4.10.10003070712480.4496-100000@sundial>
Message-ID: <m3u2ij89lw.fsf@atrus.jesus.cam.ac.uk>

Moshe Zadka <moshez@math.huji.ac.il> writes:

> On Mon, 6 Mar 2000, Jeremy Hylton wrote:
> 
> > I think these kinds of warnings are useful, and I'd like to see a more
> > general framework for them built around Python abstract syntax originally
> > from P2C.  Ideally, they would be available as command line tools and
> > integrated into GUIs like IDLE in some useful way.
> 
> Yes! Guido already suggested we have a standard API to them. One thing
> I suggested was that the abstract API include not only the input (one form
> or another of an AST), but the output: so IDE's wouldn't have to parse
> strings, but get a warning class. 

That would be seriously cool.

> Something like:
> 
> An output of a warning can be a subclass of GeneralWarning, and should
> implement the following methods:
> 
> 	1. line-no() -- returns an integer
> 	2. columns() -- returns either a pair of integers, or None
>         3. message() -- returns a string containing a message
> 	4. __str__() -- comes for free if inheriting GeneralWarning,
> 	                and formats the warning message.

Wouldn't it make sense to include function/class name here too?  A
checker is likely to know, and it would save reparsing to find it out.

[little snip]
 
> > I'm hoping to collect a few more examples of checkers and generalize
> > from them to develop a framework for checking for errors and reporting
> > them.
> 
> Cool! 
> Brainstorming: what kind of warnings would people find useful? In
> selfnanny, I wanted to include checking for assignment to self, and
> checking for "possible use before definition of local variables" sounds
> good. Another check could be a CP4E "checking that no two identifiers
> differ only by case". I might code up a few if I have the time...

Is there stuff in the current Compiler code to do control flow
analysis?  You'd need that to check for use before definition in
meaningful cases, and also if you ever want to do any optimisation...
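
A small illustration of why line order alone is not enough. In the
function below (names are illustrative) the assignment to x appears on
an earlier line than its use, so a checker that only sorts definitions
and uses by line number sees nothing wrong, yet one path through the
function reaches the use without ever defining x.

```python
def f(flag):
    if flag:
        x = 1      # defined only on this branch
    return x       # used on every path


# The flag-is-true path works; the other path fails at run time.
try:
    f(False)
    failed = False
except UnboundLocalError:
    failed = True
```

Catching this statically needs at least a notion of which assignments
reach which uses along each branch.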

> What I'd really want (but it sounds really hard) is a framework for
> partial ASTs: warning people as they write code.

I agree (on both points).

Cheers,
M.

-- 
very few people approach me in real life and insist on proving they are
drooling idiots.                         -- Erik Naggum, comp.lang.lisp



From DavidA@ActiveState.com  Wed Mar  8 00:24:12 2000
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 7 Mar 2000 16:24:12 -0800
Subject: [Compiler-sig] FYI: python CVS snapshots now include nondist subtree
Message-ID: <NDBBJPNCJLKKIOBLDOMJMEGNCBAA.DavidA@ActiveState.com>

If I didn't screw up, the nightly CVS snapshots available at
http://starship.python.net/crew/da/pythondists/ should now include the
nondist subtree.

--david



From jstok@bluedog.apana.org.au  Mon Mar 13 10:43:20 2000
From: jstok@bluedog.apana.org.au (Jason Stokes)
Date: Mon, 13 Mar 2000 21:43:20 +1100
Subject: [Compiler-sig] Is this the place to talk about the CNRI Python implementation?
Message-ID: <000001bf8cdb$41417ac0$4be60ecb@jstok>

That is to say, the main Python interpreter?  I know this is the compiler
sig, but there doesn't appear to be a list for hacking on the main
implementation.  Yet there must be.  Can anyone help me out?



From guido@python.org  Mon Mar 13 14:47:23 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 13 Mar 2000 09:47:23 -0500
Subject: [Compiler-sig] Is this the place to talk about the CNRI Python implementation?
In-Reply-To: Your message of "Mon, 13 Mar 2000 21:43:20 +1100."
 <000001bf8cdb$41417ac0$4be60ecb@jstok>
References: <000001bf8cdb$41417ac0$4be60ecb@jstok>
Message-ID: <200003131447.JAA19202@eric.cnri.reston.va.us>

> That is to say, the main Python interpreter?  I know this is the compiler
> sig, but there doesn't appear to be a list for hacking on the main
> implementation.  Yet there must be.  Can anyone help me out?

The Python newsgroup is the best place to start.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mhammond@skippinet.com.au  Wed Mar 15 02:12:11 2000
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 15 Mar 2000 13:12:11 +1100
Subject: [Compiler-sig] Update for list.append change
Message-ID: <ECEPKNMJLHAPFFJHDOJBMEINCGAA.mhammond@skippinet.com.au>

The P2C file "transformer.py" was bitten by the list.append change.

Here is a diff for the version in the CVS tree of the compiler - Bill or
Greg will also need to update P2C itself...

Mark.

RCS file:
/projects/cvsroot/python/nondist/src/Compiler/compiler/transformer.py,v
retrieving revision 1.8
diff -r1.8 transformer.py
572c572
<       results.append(type, self.com_node(nodelist[i]))
---
>       results.append( (type, self.com_node(nodelist[i])) )
839c839
<         clauses.append(expr1, expr2, self.com_node(nodelist[i+2]))
---
>         clauses.append( (expr1, expr2, self.com_node(nodelist[i+2])) )
961c961
<       items.append(self.com_node(nodelist[i]), self.com_node(nodelist[i+2]))
---
>       items.append( (self.com_node(nodelist[i]), self.com_node(nodelist[i+2])) )
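
The change behind this diff can be shown in a few lines. Older
interpreters accepted several arguments to list.append and stored them
as one tuple; after the change append takes exactly one argument, so
the tuple has to be written out. The variable names below are
placeholders, not the ones in transformer.py, and the snippet is
written for a present-day Python.

```python
results = []
node_type, node = "name", "x"

# Old spelling: now rejected because append takes exactly one argument.
try:
    results.append(node_type, node)
    error = None
except TypeError as exc:
    error = exc

# New spelling: build the tuple explicitly, as the diff does.
results.append((node_type, node))
```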



From gstein@lyra.org  Thu Mar 16 12:17:30 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Mar 2000 04:17:30 -0800 (PST)
Subject: [Compiler-sig] Update for list.append change
In-Reply-To: <ECEPKNMJLHAPFFJHDOJBMEINCGAA.mhammond@skippinet.com.au>
Message-ID: <Pine.LNX.4.10.10003160417210.2258-100000@nebula.lyra.org>

Fixed and checked in. Thanx!

-g

On Wed, 15 Mar 2000, Mark Hammond wrote:

> The P2C file "transformer.py" was bitten by the list.append change.
> 
> Here is a diff for the version in the CVS tree of the compiler - Bill or
> Greg will also need to update P2C itself...
> 
> Mark.
> 
> RCS file:
> /projects/cvsroot/python/nondist/src/Compiler/compiler/transformer.py,v
> retrieving revision 1.8
> diff -r1.8 transformer.py
> 572c572
> <       results.append(type, self.com_node(nodelist[i]))
> ---
> >       results.append( (type, self.com_node(nodelist[i])) )
> 839c839
> <         clauses.append(expr1, expr2, self.com_node(nodelist[i+2]))
> ---
> >         clauses.append( (expr1, expr2, self.com_node(nodelist[i+2])) )
> 961c961
> <       items.append(self.com_node(nodelist[i]), self.com_node(nodelist[i+2]))
> ---
> >       items.append( (self.com_node(nodelist[i]), self.com_node(nodelist[i+2])) )
> 
> 

-- 
Greg Stein, http://www.lyra.org/



From ludvig.svenonius@excosoft.se  Fri Mar 31 17:17:39 2000
From: ludvig.svenonius@excosoft.se (Ludvig Svenonius)
Date: Fri, 31 Mar 2000 19:17:39 +0200
Subject: [Compiler-sig] __getattr__ inflexibility
Message-ID: <NDBBKBOHGLKCLICFELPAIECHCAAA.ludvig.svenonius@excosoft.se>

I was wondering about the __getattr__-built-in method. Currently it is
called only if the attribute could not be found in the instance dictionary.
Would it not be more flexible to -always- call it upon referencing an
attribute, thus allowing programmers to override the default behaviour of
simply returning the value matching the name (for example, the instance
could dispatch an event before returning the value, or update it from an
outside source). What I'm missing in Python is a feature to define derived
member fields that don't simply contain static values, but rather dynamic
ones (like method return values) but in every other respect behave like a
normal member (included in the dir() listing, but referenced without using
parentheses).

The reason I'm asking for this is that I am trying to create an API as
syntactically similar to the W3C XML DOM Core
(http://www.w3.org/TR/REC-DOM-Level-1/) as possible, but the actual objects
behind the interface are only stubs that reference a dynamic instance tree
inside a C++-application embedding the Python interpreter. Thus, I would
like to be able to reference DOM members such as Node.nodeValue using the
ordinary Python syntax (n.nodeValue) but the actual value cannot be
represented as a normal Python member, since it must be fetched from within
the C++ application (via a supplied extension). I could accomplish this by
defining methods to retrieve the value instead of members, but the DOM
standard defines that these values should be member fields, not methods, so
in order to achieve syntactical compliance with DOM, I have to somehow
intervene when the member is being referenced, and manually update its value
before it is returned. I tried using __getattr__ for this, but because of
the limitation that it is only called if the attribute is not found in the
instance dictionary, it didn't work. I could get the reference syntax to
work by just manually checking the attribute names in __getattr__, calling
the extension functions and returning the values, but then the dir()
function will not list the "simulated" members, because they are not
actually in the dictionary, and if I try to get around this by putting them
there, then __getattr__ won't be called.

Perhaps there is another way to do what I'm trying to accomplish, but if
not, would it not be a good idea to change the semantics of __getattr__ to
be more similar to __setattr__, so attribute referencing behaviour can be
overridden to allow things like composite and derived members?

==================================================================
  Ludvig Svenonius - Researcher
  Excosoft AB * Electrum 420 * Isafjordsgatan 32c
  SE-164 40 Kista * Sweden
  Phones: +46 8 633 29 58 * +46 70 789 16 85
  mailto:ludvig.svenonius@excosoft.se
==================================================================



From thomas.heller@ion-tof.com  Fri Mar 31 19:48:45 2000
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 31 Mar 2000 21:48:45 +0200
Subject: [Compiler-sig] __getattr__ inflexibility
References: <NDBBKBOHGLKCLICFELPAIECHCAAA.ludvig.svenonius@excosoft.se>
Message-ID: <046701bf9b4a$20919440$4500a8c0@thomasnotebook>

> I was wondering about the __getattr__-built-in method. Currently it is
> called only if the attribute could not be found in the instance dictionary.
> Would it not be more flexible to -always- call it upon referencing an
> attribute, thus allowing programmers to override the default behaviour of
> simply returning the value matching the name (for example, the instance
> could dispatch an event before returning the value, or update it from an
> outside source). What I'm missing in Python is a feature to define derived
> member fields that don't simply contain static values, but rather dynamic
> ones (like method return values) but in every other respect behave like a
> normal member (included in the dir() listing, but referenced without using
> parentheses).

This could be achieved by simply allowing mapping objects instead of only
dictionaries. As I pointed out in a post to python-dev,
(see http://www.python.org/pipermail/python-dev/2000-March/004448.html)
the changes to Objects/classobject.c would be very small and would have
nearly no impact on performance.

The requirements I have are somewhat similar to what you describe.

Thomas Heller