[XML-SIG] Re: XML DTD for Python source?

Paul Prescod paul@prescod.net
Thu, 02 Mar 2000 21:49:15 -0800

I'm going to approach your problem philosophically because technically
the issue is pretty easy to solve using Neel's algorithm.

Summary: having an XML encoding cannot hurt, but IMHO it is mostly a hack
to get around flaws in various tools that in a perfect world would be
more directly addressed.

Greg Wilson wrote:
> ...
> I'm interested in exploring what would happen if I could do with
> programs what I do with hypertext:
> - apply a DTD to switch between Scheme-style parenthesizing, Python-style
>   indentation, or C-style bracing

I think you mean "apply a stylesheet". But you could easily write a
Python program that walks the AST and generates different styles. So the
first question is, what benefit do you get from using a stylesheet
language instead of Python?

Second question: what makes Python not a stylesheet language?

Third question: Why can't your stylesheet language walk a Python AST?
Python's object-within-object data model is pretty universal in the
computing world.

http://www.prescod.net/groves/shorttut See section 2.4
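
To make the first question concrete, here is a rough sketch of a Python
program that walks a syntax tree and emits two of the styles mentioned. It
assumes a modern Python (3.9 or later, for ast.unparse in the standard ast
module), which is not what was available when this was written. The names
c_style and scheme_style are made up for illustration, and only "if"
statements, calls, names, constants and "<" are handled.

```python
import ast


def c_style(node, indent=0):
    """Emit C-style bracing for a tiny subset of Python (hypothetical helper)."""
    pad = "    " * indent
    if isinstance(node, ast.Module):
        return "\n".join(c_style(stmt, indent) for stmt in node.body)
    if isinstance(node, ast.If):
        if node.orelse:
            raise NotImplementedError("else branches are not handled in this sketch")
        body = "\n".join(c_style(stmt, indent + 1) for stmt in node.body)
        return f"{pad}if ({ast.unparse(node.test)}) {{\n{body}\n{pad}}}"
    if isinstance(node, ast.Expr):
        return f"{pad}{ast.unparse(node.value)};"
    raise NotImplementedError(type(node).__name__)


def scheme_style(node):
    """Emit Scheme-style prefix notation for the same subset (hypothetical helper)."""
    if isinstance(node, ast.Module):
        return "\n".join(scheme_style(stmt) for stmt in node.body)
    if isinstance(node, ast.If):
        if node.orelse:
            raise NotImplementedError("else branches are not handled in this sketch")
        body = " ".join(scheme_style(stmt) for stmt in node.body)
        return f"(if {scheme_style(node.test)} {body})"
    if isinstance(node, ast.Expr):
        return scheme_style(node.value)
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Lt)):
        left = scheme_style(node.left)
        right = scheme_style(node.comparators[0])
        return f"(< {left} {right})"
    if isinstance(node, ast.Call):
        parts = [scheme_style(node.func)] + [scheme_style(a) for a in node.args]
        return "(" + " ".join(parts) + ")"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise NotImplementedError(type(node).__name__)


tree = ast.parse("if i < 10:\n    print(i)\n")
print(c_style(tree))
print(scheme_style(tree))
```

Each "stylesheet" here is just a function over the same tree, which is the
point of the question: the tree, not the surface syntax, is the shared model.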

> - embed arbitrary information (images, optimization hints for the compiler,
>   etc.) in

Sounds like structured comments to me.

>   a way that third-party browsers and processors can handle
>   (specially-formatted comments are *not* the answer)

Third party browsers and processors can handle a Python AST if you give
them a plug-in. And what is an XSLT stylesheet if not a plug-in in a
funny syntax? :)

Anyhow, if you are really interested in third-party browsers, the best
thing would be to convert into colorized *HTML*.
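
As a rough illustration of the colorized-HTML route, here is a minimal sketch
using the standard tokenize, keyword and html modules. The function name
colorize and the CSS class names "kw", "str" and "com" are assumptions made up
for this example; a real colorizer would style many more token kinds.

```python
import html
import io
import keyword
import tokenize


def colorize(source):
    """Wrap keywords, strings and comments in <span> tags (hypothetical helper)."""
    # Absolute offset of the start of each line, so that the (row, col)
    # positions reported by tokenize can be mapped to indices into `source`.
    starts = [0]
    for line in io.StringIO(source):
        starts.append(starts[-1] + len(line))

    out, pos = [], 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            css = "kw"
        elif tok.type == tokenize.STRING:
            css = "str"
        elif tok.type == tokenize.COMMENT:
            css = "com"
        else:
            continue  # everything else is copied through unstyled below
        begin = starts[tok.start[0] - 1] + tok.start[1]
        end = starts[tok.end[0] - 1] + tok.end[1]
        out.append(html.escape(source[pos:begin]))  # text since the last span
        out.append(f'<span class="{css}">{html.escape(source[begin:end])}</span>')
        pos = end
    out.append(html.escape(source[pos:]))  # trailing unstyled text
    return "<pre>" + "".join(out) + "</pre>"


print(colorize('if i < 10:  # check\n    print("hi")\n'))
```

Any browser can display the result with a three-rule stylesheet, and no
plug-in is needed.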

> I realize that human beings would not want to type:
>     if (i &lt; 10):
> instead of:
>     if (i < 10):
> but that's what editors are for.  

Dozens of people have tried and given up on programming in XML. Let's
start with the problem of verbosity. Your if statement above is not even
close to what a semantic "Python DTD" would require. It would be more like:

<if><less-than><variable name="i"/><integer val="10"/></less-than></if>

That's pretty tricky to read. So what we really need is for the editor to
compress it to something more readable -- like the original notation. So
now the editor needs detailed understanding of Python (perhaps as a
plugin or configuration file). Given this, why don't you use a
Python-smart *text* editor?
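
For a sense of how mechanical the verbose encoding is, here is a minimal
sketch that produces it from a parsed "if" statement using the standard ast
and xml.etree.ElementTree modules. The element and attribute names are the
ones from the example above; the function name to_xml is hypothetical, and
the body of the "if" is deliberately left out because the example above only
encodes the condition.

```python
import ast
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert a tiny subset of a Python AST to XML elements (hypothetical helper)."""
    if isinstance(node, ast.If):
        elem = ET.Element("if")
        elem.append(to_xml(node.test))  # condition only; the body is not encoded
        return elem
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Lt)):
        elem = ET.Element("less-than")
        elem.append(to_xml(node.left))
        elem.append(to_xml(node.comparators[0]))
        return elem
    if isinstance(node, ast.Name):
        return ET.Element("variable", name=node.id)
    if (isinstance(node, ast.Constant) and isinstance(node.value, int)
            and not isinstance(node.value, bool)):
        return ET.Element("integer", val=str(node.value))
    raise NotImplementedError(type(node).__name__)


if_stmt = ast.parse("if (i < 10):\n    pass\n").body[0]
print(ET.tostring(to_xml(if_stmt), encoding="unicode"))
```

Going from source to XML is the easy direction. The hard part, as argued
above, is the editor that has to present this tree in a readable form again.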

Seriously, I get paid to customize XMetaL and its ilk for document
types. It's hard enough making them usable for the document-world they
were designed for. Making them good programming environments would be
more work than adding multimedia hooks to Emacs pymode (IMHO).

> Yes, but not one that's easily accessible to (for example) XMetal, or the
> next generation of XML-aware 'diff' tools.

Why not look at the problem the other way? Trees are common in many
endeavors. What if the diff tools allowed plugins to adapt them to
Python syntax trees? Better that tools bend over backwards to accommodate
the data than vice versa.
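
A very rough sketch of what such an adapter could look like: it serializes
each syntax tree to indented text with ast.dump (the indent argument needs
Python 3.9 or later) and hands the result to the ordinary line-based difflib.
This is not a true tree diff, only a stand-in to show the plug-in idea; the
names ast_lines and tree_diff are made up for this example.

```python
import ast
import difflib


def ast_lines(source):
    """Serialize the syntax tree of `source` as one indented line per node or field."""
    return ast.dump(ast.parse(source), indent=2).splitlines()


def tree_diff(old_source, new_source):
    """Line-diff two programs by their syntax trees rather than their text."""
    return list(difflib.unified_diff(
        ast_lines(old_source), ast_lines(new_source),
        fromfile="old", tofile="new", lineterm=""))


# Layout, parentheses and comments do not reach the tree, so this diff is empty.
print(tree_diff("if (i < 10):\n    f(i)\n", "if i<10:  # same thing\n  f(i)\n"))

# A real change to the program does show up.
for line in tree_diff("if i < 10:\n    f(i)\n", "if i < 11:\n    f(i)\n"):
    print(line)
```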

> One of the big motivations for my interest is that I don't expect kids my
> niece's age to put up with glass typewriters as a programming environment.
> They're already building web pages with images, their choices of color, etc.
> Existing IDEs mostly just put lipstick on this particular toad...

Maybe Prograph? http://www.pictorius.com/PrographWindows.html

> ...and then define a standard for specially-formatted comments that contain
> URLs and other enriched information, and turn the little parser into a
> browser plug-in so that it can translate .py files on the fly, and... That
> might be the only way forward, but I am interested in exploring what happens
> if we finally do to our programs what we've done to everyone else's
> documents :-).

The reason we need XML for documents is that document types multiply
like bunnies and writing parsers for all of them gets out of hand
quickly. Programming languages multiply more slowly and readability of
the source is of paramount importance.

If it were possible to "plug in" grammars for arbitrary notations and it
were easy to write the grammars (e.g. no rules against left recursion
etc.) then XML would not exist. Instead, parser writing is a major pain
in the ass, so XML allows us to avoid it. Blame Noam Chomsky (or, more
appropriately, the prime mover; Noam is just the weatherman).

Paul Prescod - ISOGEN Consulting Engineer speaking for himself
"We still do not know why mathematics is true and whether it is
certain. But we know what we do not know in an immeasurably richer way
than we did. And learning this has been a remarkable achievement,
among the greatest and least known of the modern era." 
        - from "Advent of the Algorithm" David Berlinski