[Doc-SIG] DPS DTDs

Tony J Ibbs (Tibs) tony@lsl.co.uk
Mon, 10 Sep 2001 10:35:29 +0100


David Goodger wrote (in reply to me):
> > (but I'd appreciate waiting until I've done more of my work first,
> > as that's a significant contribution to the Python specific DTD)
>
> Could you explain the contribution to the DTD? (I haven't
> read your pydps modules yet, so maybe I'd best be quiet.)

Heh, asking questions is a Good Thing!

OK - pydps has three phases (broadly speaking):

1. Acquire a parse tree from a package/module and shove
   it into Jeremy's `compiler` tree structure.

2. Take the information from that tree and create a
   DPS nodes tree therefrom.

3. Take the information from *that* tree and produce
   HTML from it (currently in a rather naff manner for
   speed and to prove the principle).
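
As a rough illustration of those three phases, here is a minimal sketch. It is not the pydps code: the standard `ast` module stands in for Jeremy's `compiler` package (which no longer exists in Python 3), `xml.etree.ElementTree` stands in for the DPS nodes tree, and the function names `acquire_tree`, `build_nodes` and `render_html` are invented for the example.

```python
import ast
import xml.etree.ElementTree as ET
from html import escape

SOURCE = '''
class Fred:
    """A silly demonstration."""
    def __init__(self, b=1):
        self.list = b
'''

def acquire_tree(source):
    """Phase 1: parse the source (`ast` is a stand-in for `compiler`)."""
    return ast.parse(source)

def build_nodes(tree, module_name):
    """Phase 2: build a documentation node tree from the syntax tree."""
    module = ET.Element("py_module", name=module_name)
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            cls = ET.SubElement(module, "py_class", name=node.name)
            doc = ast.get_docstring(node)
            if doc:
                ET.SubElement(cls, "py_docstring").text = doc
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    ET.SubElement(cls, "py_method", name=item.name)
    return module

def render_html(module):
    """Phase 3: produce (deliberately crude) HTML from the node tree."""
    lines = ["<h1>Module %s</h1>" % escape(module.get("name"))]
    for cls in module.findall("py_class"):
        lines.append("<h2>Class %s</h2>" % escape(cls.get("name")))
        doc = cls.find("py_docstring")
        if doc is not None:
            lines.append("<p>%s</p>" % escape(doc.text))
        for method in cls.findall("py_method"):
            lines.append("<h3>Method %s</h3>" % escape(method.get("name")))
    return "\n".join(lines)

html = render_html(build_nodes(acquire_tree(SOURCE), "testsimp"))
print(html)
```

Keeping the phases as separate functions means the HTML writer only ever sees the node tree, so another output format could be added without touching the parsing step.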

Obviously the structure of tree 1 is fixed (by the `compiler` module).
The structure of tree 2 is fixed *for the docstrings*, so that bit is
easy. The structure for the "Python" parts of the tree is not fixed.
That is, you wrote a DTD for it, but I'm afraid I've rather strayed from
it (erm, yes, well), and am instead building a new ad-hoc structure
that seems to fit three broad criteria (isn't "3" such a good number!):

i. It vaguely builds outward from the DPS node tree structure.

ii. Its XML representation seems to me to be vaguely reasonable
    (to the extent that I understand such things). This isn't
    a *requirement* of the DPS system, but it has obviously
    informed your design of the DPS node tree as well.

iii. It's not too hard to generate HTML (and, of course, in the
     back of my mind, LaTeX, reST, and any other odd formats one
     might want).

(i) is the vaguest of these, and you'll (eventually) have to be the
ultimate judge on that. (ii) is based to some extent on my experiences
in the GML world, and mainly comes down to [a] not being scared to have
nested elements and [b] trying to decide when an attribute is sensible
instead of an element. (iii) is mostly a matter of deciding when
something should be a list element containing "atomic" elements. All
three pull in much the same direction.
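
To make [a] and [b] concrete, here is a small sketch using `xml.etree.ElementTree` (my choice for illustration, not what pydps uses). The element names are taken from the example XML later in this message. An atomic value such as a name sits happily in an attribute, while content with internal structure, such as a docstring with inline markup, wants nested elements.

```python
import xml.etree.ElementTree as ET

# [b] An atomic value (a plain name) fits comfortably in an attribute.
namelist = ET.Element("py_namelist")
ET.SubElement(namelist, "py_name", name="a")

# [a] Structured content (a docstring with inline markup) cannot be
# squeezed into an attribute, so it becomes nested elements instead.
docstring = ET.Element("py_docstring")
para = ET.SubElement(docstring, "paragraph")
para.text = "A "
emph = ET.SubElement(para, "emphasis")
emph.text = "silly"
emph.tail = " demonstration."  # text that follows the closing tag

print(ET.tostring(namelist, encoding="unicode"))
print(ET.tostring(docstring, encoding="unicode"))
```

The list-of-atomic-elements shape from (iii) is what `py_namelist` already is: a container whose children each carry exactly one simple value, which makes it easy to turn into an HTML list.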

What this means is that the schema is *not* written down anywhere except
implicitly in the code (and in my head), and at some point I need to
write the appropriate XML schema and generate a DTD from it.

As a simple example, here is something that shows what I was working on
last night. Given the Python::

    a = 'b'
    class Fred:
        """A *silly* demonstration."""
        def __init__(self, b=1, c='jim', d=None, f={'a':1,a:1},
                     g=[x for x in [1,2,3] if x > 2]):
            self.list = g

we can produce the XML tree (using the normal way of serialising a DPS
nodes tree)::

  <?xml version="1.0" ?>
  <document>
    <py_module filename="U:\reST\pydps\testsimp.py"
               fullname="testsimp" name="testsimp">
      <py_namelist>
        <py_name name="a"/>
      </py_namelist>
      <py_class fullname="testsimp.Fred" name="Fred">
        <py_docstring>
          <paragraph>
            A
            <emphasis>
              silly
            </emphasis>
             demonstration.
          </paragraph>
        </py_docstring>
        <py_method fullname="testsimp.Fred.__init__"
                   name="__init__">
          <py_namelist>
            <py_name name="list"/>
          </py_namelist>
          <py_param_list>
            <py_param>
              self
            </py_param>
            <py_param>
              b=1
            </py_param>
            <py_param>
              c='jim'
            </py_param>
            <py_param>
              d=None
            </py_param>
            <py_param>
              f={'a':1,a:1}
            </py_param>
            <py_param>
              g=[x for x in [1,2,3] if x &gt; 2]
            </py_param>
          </py_param_list>
        </py_method>
      </py_class>
    </py_module>
  </document>
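
For illustration, a parameter list like the one above could be produced along the following lines. This is a sketch under assumptions, not the pydps implementation: it uses the modern `ast` module rather than `compiler`, the helper name `param_list` is invented, and it only handles ordinary positional parameters (no `*args`, `**kwargs` or keyword-only arguments). It needs Python 3.8 or later for `ast.get_source_segment`.

```python
import ast
import xml.etree.ElementTree as ET

SOURCE = (
    "def __init__(self, b=1, c='jim', g=[x for x in [1,2,3] if x > 2]):\n"
    "    pass\n"
)

def param_list(func, source):
    """Build a <py_param_list> element for a function definition node."""
    plist = ET.Element("py_param_list")
    args = func.args.args
    defaults = func.args.defaults
    # Defaults belong to the *last* len(defaults) positional parameters.
    first_default = len(args) - len(defaults)
    for index, arg in enumerate(args):
        text = arg.arg
        if index >= first_default:
            default = defaults[index - first_default]
            # Reuse the original source text of the default expression.
            text += "=" + ast.get_source_segment(source, default)
        ET.SubElement(plist, "py_param").text = text
    return plist

func = ast.parse(SOURCE).body[0]
plist = param_list(func, SOURCE)
params = [p.text for p in plist]
xml_text = ET.tostring(plist, encoding="unicode")
print(xml_text)
```

Note that the serialiser takes care of escaping, which is why the `>` in the list comprehension turns up as `&gt;` in the XML.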

Typically verbose (this *is* XML), but I think it makes sense. As you
might guess, this weekend has been spent working on representation of
the RHS of assignments (I made the mistake of trying to represent
``restructuredtext/states.py``, which has a Getattr node in one of the
argument lists). That work isn't finished yet (it copes with list
comprehensions, but not with, for instance, multiplication!), but it's
actually a good way of getting a better understanding of how the
compiler module works.
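
The "add node types one at a time" approach looks roughly like the sketch below. Again this is an illustration, not the pydps code: it uses the modern `ast` module, where the old `Getattr` node corresponds to `ast.Attribute` and multiplication is an `ast.BinOp`, and the names `render` and `render_rhs` are invented. Node types that have not been handled yet are flagged rather than guessed at.

```python
import ast

def render(node):
    """Render a small subset of expression nodes back to source text."""
    if isinstance(node, ast.Constant):
        return repr(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):  # `Getattr` in the old compiler module
        return render(node.value) + "." + node.attr
    if isinstance(node, ast.List):
        return "[" + ", ".join(render(e) for e in node.elts) + "]"
    # Not handled yet: say so explicitly instead of producing wrong output.
    return "<unsupported %s>" % type(node).__name__

def render_rhs(source):
    """Render the right-hand side of a single assignment statement."""
    stmt = ast.parse(source).body[0]
    return render(stmt.value)

print(render_rhs("x = self.list"))
print(render_rhs("x = [1, 'a', y]"))
print(render_rhs("x = 2 * 3"))
```

Running real modules through something like this quickly shows which node types still need a case of their own.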

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
You said "run as root" and "securely" in the same sentence relating to
CGI. You're funny! -- Ignacio Vazquez-Abrams, on the Python list
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)