RE: [Patches] selfnanny.py: checking for "self" in every method

[Guido van Rossum]
Before we all start writing nannies and checkers, how about a standard API design first? I will want to call various nannies from a "Check" command that I plan to add to IDLE. I already did this with tabnanny, and found that it's barely possible -- it's really written to run like a script.
I like Moshe's suggestion fine, except with an abstract base class named Nanny with a virtual method named check_ast. Nannies should (of course) derive from that.
Since parsing is expensive, we probably want to share the parse tree.
What parse tree? Python's parser module produces an AST not nearly "A enough" for reasonably productive nanny writing. GregS & BillT have improved on that, but it's not in the std distrib. Other "problems" include the lack of original source lines in the trees, and lack of column-number info.
Note that by the time Python has produced a parse tree, all evidence of the very thing tabnanny is looking for has been removed. That's why she used the tokenize module to begin with.
God knows tokenize is too funky to use too when life gets harder (check out checkappend.py's tokeneater state machine for a preliminary taste of that). So the *only* solution is to adopt Christian's Stackless so I can rewrite tokenize as a coroutine like God intended <wink>.
Seriously, I don't know of anything that produces a reasonably usable (for nannies) parse tree now, except via modifying a Python grammar for use with John Aycock's SPARK; the latter also comes with very pleasant & powerful tree pattern-matching abilities. But it's probably too slow for everyday "just folks" use. Grabbing the GregS/BillT enhancement is probably the most practical thing we could build on right now (but tabnanny will have to remain a special case).
unsure-about-the-state-of-simpleparse-on-mxtexttools-for-this-ly y'rs - tim

On Sat, 4 Mar 2000, Tim Peters wrote:
I like Moshe's suggestion fine, except with an abstract base class named Nanny with a virtual method named check_ast. Nannies should (of course) derive from that.
Why? The C++ you're programming damaged your common sense cycles?
Since parsing is expensive, we probably want to share the parse tree.
What parse tree? Python's parser module produces an AST not nearly "A enough" for reasonably productive nanny writing.
As a note, selfnanny uses the parser module AST.
GregS & BillT have improved on that, but it's not in the std distrib. Other "problems" include the lack of original source lines in the trees,
The parser module has source lines.
and lack of column-number info.
Yes, that sucks.
Note that by the time Python has produced a parse tree, all evidence of the very thing tabnanny is looking for has been removed. That's why she used the tokenize module to begin with.
Well, it's one of the few nannies which would be in that position.
God knows tokenize is too funky to use too when life gets harder (check out checkappend.py's tokeneater state machine for a preliminary taste of that).
Why doesn't checkappend.py use the parser module?
Grabbing the GregS/BillT enhancement is probably the most practical thing we could build on right now
You got some pointers?
(but tabnanny will have to remain a special case).
tim-will-always-be-a-special-case-in-our-hearts-ly y'rs, Z.
--
Moshe Zadka <mzadka@geocities.com>.
http://www.oreilly.com/news/prescod_0300.html

[Tim]
[make Nanny a base class]
[Moshe Zadka]
Why?
Because it's an obvious application for OO design. A common base class formalizes the interface and can provide useful utilities for subclasses.
The C++ you're programming damaged your common sense cycles?
Yes, very, but that isn't relevant here <wink>. It's good Python sense too.
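The base-class idea, together with Guido's wish to parse once and share the tree, can be sketched roughly as follows. This is only an illustration: the modern ast module stands in for the parser module under discussion, and apart from Nanny and check_ast every name here (check_ast's arguments and return value, PrintNanny, run_nannies) is hypothetical and not part of any proposal in this thread.

```python
import ast


class Nanny:
    """Hypothetical base class: one overridable hook plus a shared name."""

    name = "nanny"

    def check_ast(self, tree, filename):
        """Return a list of (lineno, message) complaints for a parsed file."""
        raise NotImplementedError


class PrintNanny(Nanny):
    """Toy subclass (an assumption for illustration): flag calls to print()."""

    name = "printnanny"

    def check_ast(self, tree, filename):
        complaints = []
        for node in ast.walk(tree):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id == "print"):
                complaints.append((node.lineno, "call to print()"))
        return complaints


def run_nannies(source, nannies, filename="<string>"):
    """Parse the source once and hand the same tree to every nanny."""
    tree = ast.parse(source, filename)
    return {nanny.name: nanny.check_ast(tree, filename) for nanny in nannies}
```

A "Check" command could then call run_nannies() with whatever nannies are installed and display the combined report.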
[parser module produces trees far too concrete for comfort]
As a note, selfnanny uses the parser module AST.
Understood, but selfnanny has a relatively trivial task. Hassling with tuples nested dozens deep for even relatively simple stmts is both a PITA and a time sink.
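For comparison, here is roughly what a selfnanny-style check looks like against a more abstract tree. This is a sketch under assumptions, not selfnanny's actual code: it uses the modern ast module (the parser module discussed here is not available in current Python releases), and the function name and return shape are invented for the example.

```python
import ast


def methods_missing_self(source, filename="<string>"):
    """Return (lineno, class_name, method_name) for each method whose
    first positional parameter is not named 'self'."""
    problems = []
    tree = ast.parse(source, filename)
    for cls in ast.walk(tree):
        if not isinstance(cls, ast.ClassDef):
            continue
        for item in cls.body:
            if not isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            # staticmethod/classmethod legitimately lack 'self'; skip them.
            decorators = {d.id for d in item.decorator_list
                          if isinstance(d, ast.Name)}
            if decorators & {"staticmethod", "classmethod"}:
                continue
            params = item.args.posonlyargs + item.args.args
            if not params or params[0].arg != "self":
                problems.append((item.lineno, cls.name, item.name))
    return problems
```

The whole check is a couple of isinstance tests and attribute lookups, with no indexing into deeply nested tuples.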
[parser doesn't give source lines]
The parser module has source lines.
No, it does not (it only returns terminals, as isolated strings). The tokenize module does deliver original source lines in their entirety (as well as terminals, as isolated strings; and column numbers).
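To illustrate what tokenize hands back, here is a small sketch using the generator interface of the current tokenize module (not the tokeneater callback style mentioned earlier in the thread). Each token carries its string, its (row, column) start position, and the full source line it came from. The helper name find_name is made up for this example.

```python
import io
import tokenize


def find_name(source, name):
    """Yield (row, col, line) for each NAME token equal to `name`."""
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.NAME and tok.string == name:
            row, col = tok.start          # 1-based row, 0-based column
            yield row, col, tok.line      # tok.line is the whole source line
```

With that, a checker can print the offending line and point at the exact column without re-reading the file.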
and lack of column-number info.
Yes, that sucks.
... Why doesn't checkappend.py use the parser module?
Because it wanted to display the actual source line containing an offending "append" (which, again, the parser module does not supply). Besides, it was a trivial variation on tabnanny.py, of which I have approximately 300 copies on my disk <wink>.
Grabbing the GregS/BillT enhancement is probably the most practical thing we could build on right now
You got some pointers?
Download python2c (http://www.mudlib.org/~rassilon/p2c/) and grab transformer.py from the zip file. The latter supplies a very useful post-processing pass over the parser module's output, squashing it *way* down.

On Sun, 5 Mar 2000, Tim Peters wrote:
[Tim]
[make Nanny a base class]
[Moshe Zadka]
Why?
Because it's an obvious application for OO design. A common base class formalizes the interface and can provide useful utilities for subclasses.
The interface is just one function. You're welcome to have a do-nothing nanny that people *can* derive from: I see no point in making them derive from a base class.
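The "just one function" position might look something like this sketch. Everything in it is hypothetical (check_bare_except, DoNothingNanny, run_checks, and the (tree, filename) call signature), and the modern ast module again stands in for the parser module. The driver accepts either a bare function or any object that happens to have a check_ast method, so deriving from the do-nothing class is optional.

```python
import ast


def check_bare_except(tree, filename):
    """A nanny that is just a function: flag bare 'except:' clauses."""
    return [(node.lineno, "bare except")
            for node in ast.walk(tree)
            if isinstance(node, ast.ExceptHandler) and node.type is None]


class DoNothingNanny:
    """Optional convenience base; nothing requires deriving from it."""

    def check_ast(self, tree, filename):
        return []


def run_checks(source, checks, filename="<string>"):
    """Parse once, then call every check with the shared tree."""
    tree = ast.parse(source, filename)
    complaints = []
    for check in checks:
        # Use check.check_ast if present, else treat `check` as the function.
        func = getattr(check, "check_ast", check)
        complaints.extend(func(tree, filename))
    return sorted(complaints)
```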
As a note, selfnanny uses the parser module AST.
Understood, but selfnanny has a relatively trivial task.
That it does, and it was painful.
[parser doesn't give source lines]
The parser module has source lines.
No, it does not (it only returns terminals, as isolated strings).
Sorry, misunderstanding: it seemed obvious to me you wanted line numbers. For lines, use the linecache module...
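A minimal sketch of that suggestion: given only a filename and a line number from the tree, linecache.getline() fetches the text of the line. The show_complaint helper and its output format are assumptions made for the example.

```python
import linecache


def show_complaint(filename, lineno, message):
    """Format a complaint together with the source line fetched by number."""
    # getline() returns '' if the file or the line cannot be found.
    line = linecache.getline(filename, lineno).rstrip("\n")
    return "%s:%d: %s\n    %s" % (filename, lineno, message, line)
```

This recovers the line text but not the column, so it only addresses half of the complaint about the tree.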
You got some pointers?
Download python2c (http://www.mudlib.org/~rassilon/p2c/) and grab transformer.py from the zip file.
I'll have a look. Moshe Zadka <mzadka@geocities.com>. http://www.oreilly.com/news/prescod_0300.html

[parser doesn't give source lines]
The parser module has source lines.
No, it does not (it only returns terminals, as isolated strings). The tokenize module does deliver original source lines in their entirety (as well as terminals, as isolated strings; and column numbers).
Moshe meant line numbers -- it has those.
Why doesn't checkappend.py use the parser module?
Because it wanted to display the actual source line containing an offending "append" (which, again, the parser module does not supply). Besides, it was a trivial variation on tabnanny.py, of which I have approximately 300 copies on my disk <wink>.
Of course another argument for making things more OO. (The code used in tabnanny.py to process files and, recursively, directories from sys.argv is replicated a thousand times in various scripts of mine -- Tim took it from my now-defunct takpolice.py. This should be in the std library somehow...)
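The replicated argument-processing code could be factored into one helper along these lines. This is only a sketch using today's os.walk; the name iter_python_files and the ".py only" filter are assumptions, not what tabnanny.py actually does.

```python
import os


def iter_python_files(args):
    """Yield paths of .py files named in `args`, descending into directories."""
    for arg in args:
        if os.path.isdir(arg):
            for dirpath, dirnames, filenames in os.walk(arg):
                dirnames.sort()  # visit subdirectories in a stable order
                for fname in sorted(filenames):
                    if fname.endswith(".py"):
                        yield os.path.join(dirpath, fname)
        else:
            # Plain file arguments are passed through as given.
            yield arg
```

A script would then loop over iter_python_files(sys.argv[1:]) and hand each path to its checker.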
Grabbing the GregS/BillT enhancement is probably the most practical thing we could build on right now
You got some pointers?
Download python2c (http://www.mudlib.org/~rassilon/p2c/) and grab transformer.py from the zip file. The latter supplies a very useful post-processing pass over the parser module's output, squashing it *way* down.
Those of you who have seen the compiler-sig should know that Jeremy made an improvement which will find its way into p2c. It's currently on display in the Python CVS tree in the nondist branch: see http://www.python.org/pipermail/compiler-sig/2000-February/000011.html and the ensuing thread for more details. --Guido van Rossum (home page: http://www.python.org/~guido/)

"TP" == Tim Peters <tim_one@email.msn.com> writes:
Grabbing the GregS/BillT enhancement is probably the most practical thing we could build on right now
You got some pointers?
TP> Download python2c (http://www.mudlib.org/~rassilon/p2c/) and
TP> grab transformer.py from the zip file. The latter supplies a
TP> very useful post-processing pass over the parser module's output,
TP> squashing it *way* down.
The compiler tools in python/nondist/src/Compiler include Bill & Greg's transformer code, a class-based AST (each node is a subclass of the generic node), and a visitor framework for walking the AST. The APIs and organization are in a bit of flux; Mark Hammond suggested some reorganization that I've not finished yet. I may finish it up this evening.
The transformer module does a good job of including line numbers, but I've occasionally run into a node that didn't have a lineno attribute when I expected it would. I haven't taken the time to figure out if my expectation was unreasonable or if the transformer should be fixed.
The compiler-sig might be a good place to discuss this further. A warning framework was one of my original goals for the SIG. I imagine we could convince Guido to move warnings + compiler tools into the standard library if they end up being useful.
Jeremy
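As a rough illustration of the class-based AST plus visitor approach, here is a sketch built on ast.NodeVisitor from the current standard library. It is not the API of the compiler tools described above, which were still in flux. AppendChecker is a toy check invented for the example, and the getattr() lookup reflects the caveat that a node may lack a lineno attribute.

```python
import ast


class AppendChecker(ast.NodeVisitor):
    """Collect .append() calls that pass anything other than one argument."""

    def __init__(self):
        self.complaints = []

    def visit_Call(self, node):
        func = node.func
        if (isinstance(func, ast.Attribute) and func.attr == "append"
                and (len(node.args) != 1 or node.keywords)):
            # Not every node kind carries a line number; look it up defensively.
            lineno = getattr(node, "lineno", None)
            self.complaints.append(
                (lineno, "append() expects exactly one argument"))
        self.generic_visit(node)  # keep walking into nested calls


def check_append(source, filename="<string>"):
    """Parse `source` and return the checker's complaints."""
    checker = AppendChecker()
    checker.visit(ast.parse(source, filename))
    return checker.complaints
```

The visitor dispatches on node class name (visit_Call here), so each check only has to name the node types it cares about.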
participants (4): Guido van Rossum, Jeremy Hylton, Moshe Zadka, Tim Peters