XML DTD for Python source?

Neel Krishnaswami neelk at brick.cswv.com
Sat Mar 4 02:54:23 CET 2000

Paul Prescod <paul at prescod.net> wrote:
> Second question: what makes Python not a stylesheet language?
> Third question: Why can't your stylesheet language walk a Python AST?
> Python's object-within-object data model is pretty universal in the
> computing world.
> http://www.prescod.net/groves/shorttut See section 2.4

Thanks for this article; it helped me understand an experience I had
last year much better.

One of my hobbies is running role-playing games, and for one of these
games I wanted to build a database of characters, with indexes of
character statistics, and cross-references to who knew whom, and so
on. Thinking that this might be a good chance to learn about XML, I
designed a DTD to describe characters and then tried writing a
program to read the character sheets and emit output documents in the
formats I wanted. I didn't finish, because I grew frustrated with the
amount of work it took to simply read the character sheets and build a
nice representation of them as Python objects.

So I gave up and turned the character sheets into Common Lisp s-exps,
basically by turning every instance of "<tag>...</tag>" into "(tag
...)". I then used the Lisp reader to read the sheets as lists, and
then wrote a tiny, brainless little program to munge the character
lists into the output formats I wanted. At the time, I chalked it up
to Lisp having a 40-year head start in the whole symbolic manipulation
game, but having read your article I now believe it's because I was
able to define my own abstractions much more easily than I was able to
with SAX and the XML DOM.
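
For readers who want to see the shape of that trick, here is a minimal
sketch in Python rather than Lisp. It is not the original program: the
function names and the sample character sheet are made up for
illustration, and it assumes atoms never contain spaces or parentheses
(the real Lisp reader also handles strings, numbers, and quoting).

```python
def tokenize(text):
    # Pad parentheses with spaces so split() separates them from atoms.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read_sexp(tokens):
    """Consume one expression from the front of the token list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_sexp(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token  # an atom, kept as a plain string


def parse(text):
    tokens = tokenize(text)
    tree = read_sexp(tokens)
    if tokens:
        raise SyntaxError("trailing input after expression")
    return tree


# Hypothetical character sheet, i.e. "<name>Aldric</name>" became "(name Aldric)".
sheet = "(character (name Aldric) (strength 14) (knows Mira Tobin))"
tree = parse(sheet)

# The first element of each list is the tag; the rest are its contents.
fields = {child[0]: child[1:] for child in tree[1:]}
```

Once the sheet is an ordinary nested list, indexes and cross-references
are just list and dictionary manipulation, which is roughly why the
munging program could stay tiny.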

Thanks: now I'll have to go learn more about groves, to see if there
are any more insights I can steal^H^H^H glean.

> If it were possible to "plug in" grammars for arbitrary notations
> and it was easy to write the grammars (e.g. no rules against left
> recursion etc.) then XML would not exist. Instead, parser writing is
> a major pain in the ass, so XML allows us to avoid it. Blame Noam
> Chomsky (or, more appropriately, the prime mover, Noam is just the
> weatherman).

I think I'm not as defeatist as you are, yet. :) Building a general
parsing toolkit with a Tomita-style (GLR) parser would be a very worthy
activity, imo, and would neatly avoid all the problems with left-recursion
and most of the problems with ambiguous grammars. (The monadic parser
combinator papers in the FPL world show a way to write parsers that
are barely more complicated than the grammar specification, too.)
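
To give a flavour of the combinator style, here is a rough sketch in
Python. It is an illustration under assumptions, not code from those
papers: the names (unit, bind, satisfy, choice, many) and the toy
grammar are chosen for this example, and a parser is assumed to be a
function taking (text, pos) and returning (value, new_pos) or None on
failure. Note that this simple form is plain recursive descent, so it
would still loop on left-recursive rules; that is the problem a
Tomita-style parser addresses.

```python
def unit(value):
    """Parser that consumes nothing and yields value."""
    return lambda text, pos: (value, pos)


def bind(parser, make_next):
    """Run parser, then pass its result to make_next to get the next parser."""
    def run(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, pos = result
        return make_next(value)(text, pos)
    return run


def satisfy(predicate):
    """Parser for a single character accepted by predicate."""
    def run(text, pos):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return run


def choice(first, second):
    """Try first; if it fails, try second at the same position."""
    def run(text, pos):
        result = first(text, pos)
        return result if result is not None else second(text, pos)
    return run


def many(parser):
    """Zero or more repetitions, collected into a list."""
    def run(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            # Stop on failure, or if nothing was consumed (avoids looping forever).
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)
    return run


# Toy grammar:
#   number ::= digit digit*
#   expr   ::= number ("+" number)*
digit = satisfy(str.isdigit)
number = bind(digit, lambda d:
         bind(many(digit), lambda ds:
         unit(int(d + "".join(ds)))))
plus = satisfy(lambda c: c == "+")
plus_number = bind(plus, lambda _: number)
expr = bind(number, lambda first:
       bind(many(plus_number), lambda rest:
       unit(first + sum(rest))))

result = expr("12+30+4", 0)
```

The definitions of number and expr read almost line for line like the
grammar in the comment, which is the property being praised above.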

