Topy news - and a request for Python pretty-printing

Sun Apr 14 19:51:30 EDT 2002

[Tim Peters]

> [François Pinard]
> > ...
> > Would someone happen to know if there is a pretty-printer for Python
> > source _lines_, where I would find some inspiration maybe?

> The closest I've seen is the "block comment" reflowing facilities in
> various Python editors.

Comments are not so bad.  When Topy moves comments over from the
original source to the generated Python code, it does not make them longer.
The comments coming from the `-C' option, which asks for non-comment source
lines to be re-inserted as comments near the corresponding Python code, have
`#LINE: ' added at their beginning.  This causes overflow only for lines
which were long already, and I consider this bearable.  Diagnostic warnings,
when inserted as comments, currently carry a full FILE:LINE:COLUMN reference,
and when FILE is long, overflow may occur.  I take this as bearable too.

> For very long lines with clear structure (lists, tuples and dict
> constructors; function calls), I suppose you could do a lot worse than to
> feed them to Emacs, injecting a newline after every comma.  Then they'll
> line up the way Guido would type them by hand <wink>.

I am surely not going to call Emacs from Topy :-).  Does Guido use Emacs?
Just curious!  In Emacs Python mode, there are many good ideas which inspired
my own writing style, thanks to those who wrote it.  On the other hand,
some choices are not satisfactory, and I have seen cases where Emacs does
not properly reflect the nesting of structures.  I have no example in mind,
but when this occurs, I prefer rewriting the few affected parts so that
Emacs Python mode behaves more nicely.  It's a give and take, and that's OK!

Of course, I've been interested in the `pprint' module, at least
for structures, but not enough to dive in and extend it to statements and
expressions.  The main point is that `pprint' works on in-memory structures,
while I need to beautify surface, textual Python.  Or almost: at the time
Topy needs beautifying, the output surface line has already been generated
as a list of textual fragments, which only need to be conceptually joined
and written out.  For now, I merely kludged something quick to decide
how to split these fragments between continued lines, driven by simple
heuristics looking at the contents of each fragment (is it a string?
how many characters?  any parentheses?  any initial or trailing space?).
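
To make the idea concrete, here is a minimal sketch of such a splitter.
It is not Topy's actual code: the function names, the greedy packing
strategy and the particular heuristics (a fragment starting with a quote
is taken to be a string, brackets are counted to decide whether a
backslash is needed, spaces are dropped around a break) are assumptions
made for illustration only.

```python
def is_string(fragment):
    # Heuristic: a fragment starting with a quote is a string literal.
    return fragment[:1] in ('"', "'")


def depth_change(fragment):
    # Net number of brackets opened by this fragment.  String literals
    # are not inspected, so brackets inside them are ignored.
    if is_string(fragment):
        return 0
    opened = sum(fragment.count(c) for c in "([{")
    closed = sum(fragment.count(c) for c in ")]}")
    return opened - closed


def split_fragments(fragments, width=79, indent="    "):
    """Greedily pack fragments into physical lines of at most `width`.

    Breaks only happen between fragments, never inside one, so a single
    oversized fragment may still overflow.
    """
    limit = width - 2          # always keep room for a trailing " \"
    lines = []
    current = ""
    depth = 0                  # bracket depth at the end of `current`
    for fragment in fragments:
        if current.strip() and len(current) + len(fragment.rstrip()) > limit:
            text = current.rstrip()
            if depth == 0:
                # Outside any bracket, Python needs an explicit continuation.
                text += " \\"
            lines.append(text)
            current = indent
            fragment = fragment.lstrip()
        current += fragment
        depth += depth_change(fragment)
    if current.strip():
        lines.append(current.rstrip())
    return lines
```

For instance, `split_fragments(["result = ", "compute(", "alpha, ", "beta, ",
"gamma", ")"], width=20)` breaks after the opening parenthesis and again
after `beta,`, with no backslash since both breaks fall inside brackets.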

To do real beauty work, I should change the Topy back-end and many parts of
the front-ends to generate Python structures instead of textual fragments, add
a lot more specs to the tree language between front-ends and the back-end,
and probably add one more pass in the back-end, or nearly.  I'll postpone all
this, as I do not want to be overly distracted by pretty-printing now, and
there are much more pressing things to address before Topy exists for real.

Speaking of which, I started to retrofit my earlier, unfinished `perl2py'
as a new Topy front-end.  Currently, Topy uses SPARK, which I find easy
and satisfying.  But for Perl, I had two problems with SPARK: it was
inordinately slow (I might not be writing good grammars), and I did not
find how to express the asymmetrical precedence of Perl named operators
(they bind loosely to their left, and rather aggressively to their right).
I still have more retrofitting work to do before being really ready to
revisit these two areas, though.

For Scheme and Milou, the two languages which Topy is likely to handle
first, I tried something which might have been fruitful speedwise, though
I'm not sure.  Once the scanner has produced tokens for the whole source,
a quick and simple pass segments the tokens into capsules, where a capsule
merely represents a single definition or expression at the outer level of
the source.  Then, I call the parser on each capsule in turn.  My prejudice
is that I help the parser by giving it smaller chunks than the whole source.
Moreover, since the number of parse calls is known before they begin, Topy
may even entertain the user with progress counters :-).  Topy collects all
the parse results together, as this is useful in subsequent passes.  Those
passes are speedy enough over the whole thing.
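
The capsule pass can be sketched as follows for a Scheme-like token
stream.  This is only an illustration under simplifying assumptions:
tokens are plain strings, a capsule is either a lone atom or a balanced
parenthesized form, and prefix tokens such as a quote mark are ignored.
The names `capsules`, `parse_all`, `parse` and `report` are hypothetical,
and `parse` stands for whatever parser entry point is in use.

```python
def capsules(tokens):
    """Yield top-level capsules from a flat list of tokens.

    A capsule ends each time the parenthesis depth returns to zero.
    """
    capsule = []
    depth = 0
    for token in tokens:
        capsule.append(token)
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
            if depth < 0:
                raise SyntaxError("unbalanced ')'")
        if depth == 0:
            yield capsule
            capsule = []
    if capsule:
        raise SyntaxError("source ends inside an unfinished capsule")


def parse_all(tokens, parse, report=None):
    """Parse each capsule in turn and collect the results in order."""
    chunks = list(capsules(tokens))   # the total is known before parsing
    results = []
    for count, chunk in enumerate(chunks, 1):
        if report is not None:
            report(count, len(chunks))    # e.g. drive a progress counter
        results.append(parse(chunk))
    return results
```

Segmenting first costs one cheap linear scan, and it is what makes the
total count available to `report` before any parsing starts.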

The Topy grammar for Perl is significantly bigger, so I'm not sure what
will happen, even if I try to implement special parser help like above.
Another possibility would be to consider PLY instead of SPARK, or maybe
to use both at once depending on the front-end, though I would expect a
few intricacies at the nitty-gritty level.  I have not really looked at PLY
yet, but I've been told that if I like SPARK as a programmer (and I do!),
PLY might please me as well, as PLY reuses many of SPARK's good ideas from
the programmer's viewpoint.  PLY grammars are weaker than SPARK's, but
they might be much faster on Perl.  Sigh!  One sure thing is that I like
SPARK's generality, elegance, and unobtrusiveness.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard