Re: [Python-Dev] ConfigParser shootout, preliminary entry

21 Oct 2004

      On Tue, 2004-10-19 at 11:00, Guido van Rossum wrote:
...
2. We're handling modest amounts of XML, all using home-grown DTDs and
with no specific requirements to interface to other apps or XML tools.
I wrote a metaclass which lets me specify the DTD using Python syntax.
Sounds like my recent situation.  I've done enough custom XML-ing lately
that I've been thinking along similar lines as you.  Note that most of
what I've written lately uses minidom, although I do have one particular
application that uses sax.  Both are powerful enough to do the job, but
neither is that intuitive, IMO.
...
Again, my approach is slightly lower-level than previous proposals
here but has the advantage of letting you be explicit about the
mapping between Python and XML names, both for attributes and for
subelements. The metaclass handles reading and writing. It supports
elements containing text (is that CDATA? I never know)
I'm no XML guru, but I think they're different.  In the one case you
have something like:

<node>text for the node</node>

and in the other you have:

<node><![CDATA[cdata, er, um, data]]></node>

The differences being that the CDATA stuff shows up in a subnode of
<node> and has less restriction on what data can be included within the
delimiters.

My applications use both.
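
For what it's worth, here is a minimal sketch of how the two forms look
through minidom.  The helper name describe_children is made up for
illustration.  As far as I can tell, minidom hands back a child node of
<node> in both cases; what differs is the node type, and the CDATA form
can carry characters like < and & unescaped.

```python
from xml.dom import minidom, Node


def describe_children(xml_text):
    """Return (kind, data) for each child node of the root element."""
    kinds = {Node.TEXT_NODE: "TEXT", Node.CDATA_SECTION_NODE: "CDATA"}
    root = minidom.parseString(xml_text).documentElement
    return [(kinds.get(child.nodeType, "OTHER"), getattr(child, "data", None))
            for child in root.childNodes]


# Plain character data: a single Text child.
plain = describe_children("<node>text for the node</node>")

# CDATA section: a single CDATASection child, markup characters unescaped.
cdata = describe_children("<node><![CDATA[a < b & c]]></node>")

print(plain)
print(cdata)
```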
...
or
sub-elements, but not both. For sub-elements, it supports cases where
one element has any number of sub-elements of a certain type, which
are then collected in a list, so you can refer to them using Python
sequence indexing/slicing notation. It also supports elements that
have zero or one sub-element of a certain type; absence is indicated
by setting the corresponding attribute to None. I don't support
namespaces, although I expect it would be easy enough to add them. I
don't support unrecognized elements or attributes: while everything
can be omitted (and defaults to None), unrecognized attributes or
elements are always rejected. (I suppose that could be fixed too if
desired.)
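
The behavior described above can be sketched roughly as follows.  This is
not the code under discussion (that example was deleted from this reply);
it is a read-only illustration that uses plain class attributes and
xml.etree.ElementTree instead of a metaclass, and every name in it
(Element, from_xml, attributes, optional, repeated, Catalog, Item, Title)
is hypothetical.

```python
import xml.etree.ElementTree as ET


class Element:
    """Hypothetical base class; subclasses declare their "DTD" below."""
    tag = None
    attributes = {}   # Python name -> XML attribute name
    optional = {}     # Python name -> Element subclass (zero or one)
    repeated = {}     # Python name -> Element subclass (any number)

    @classmethod
    def from_xml(cls, node):
        if node.tag != cls.tag:
            raise ValueError("expected <%s>, got <%s>" % (cls.tag, node.tag))
        obj = cls()
        # Unrecognized attributes are always rejected.
        unknown = set(node.attrib) - set(cls.attributes.values())
        if unknown:
            raise ValueError("unrecognized attributes: %s" % sorted(unknown))
        for py_name, xml_name in cls.attributes.items():
            setattr(obj, py_name, node.get(xml_name))  # None when omitted
        slots = {}
        for py_name, sub in cls.optional.items():
            setattr(obj, py_name, None)                # absence is None
            slots[sub.tag] = (py_name, sub, False)
        for py_name, sub in cls.repeated.items():
            setattr(obj, py_name, [])                  # collected in a list
            slots[sub.tag] = (py_name, sub, True)
        for child in node:
            if child.tag not in slots:
                raise ValueError("unrecognized element <%s>" % child.tag)
            py_name, sub, many = slots[child.tag]
            value = sub.from_xml(child)
            if many:
                getattr(obj, py_name).append(value)
            elif getattr(obj, py_name) is not None:
                raise ValueError("duplicate element <%s>" % child.tag)
            else:
                setattr(obj, py_name, value)
        # Text or sub-elements, but not both.
        obj.text = None if slots else node.text
        return obj


class Title(Element):
    tag = "title"


class Item(Element):
    tag = "item"
    attributes = {"ident": "id"}   # explicit Python name <-> XML name


class Catalog(Element):
    tag = "catalog"
    optional = {"title": Title}
    repeated = {"items": Item}


catalog = Catalog.from_xml(ET.fromstring(
    '<catalog><title>Parts</title><item id="a"/><item id="b"/></catalog>'))
print(catalog.title.text, [item.ident for item in catalog.items[1:]])
```

Writing the document back out, and any ordering of attributes on
rendering, are left out of the sketch.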
I have use cases for both behaviors.  OT1H, I generally want to reject
unknown elements or attributes, reject duplicate elements where my "DTD"
doesn't allow them, etc.  OTOH, in at least one case I'm doing something
that's probably evil, where sub-elements name email headers and the text
inside provides the data for the header.  I'm sure XML experts cringe at
that and suggest I use something like:

<header name="to">value</header>

or somesuch instead.
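
To make the two layouts concrete, here is a small minidom sketch that
reads each one into a dict.  The <headers> wrapper element, the sample
values, and the function names are all invented for illustration; this is
not lifted from my application.

```python
from xml.dom import minidom, Node


def text_of(element):
    """Concatenate the text and CDATA children of an element."""
    wanted = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(child.data for child in element.childNodes
                   if child.nodeType in wanted)


def headers_from_element_names(xml_text):
    """The "evil" layout: the element name is the header name."""
    root = minidom.parseString(xml_text).documentElement
    return {child.tagName: text_of(child)
            for child in root.childNodes
            if child.nodeType == Node.ELEMENT_NODE}


def headers_from_name_attribute(xml_text):
    """The suggested layout: <header name="...">value</header>."""
    root = minidom.parseString(xml_text).documentElement
    return {el.getAttribute("name"): text_of(el)
            for el in root.getElementsByTagName("header")}


by_element = headers_from_element_names(
    "<headers><to>a@example.com</to><subject>hi</subject></headers>")
by_attribute = headers_from_name_attribute(
    '<headers><header name="to">a@example.com</header>'
    '<header name="subject">hi</header></headers>')
print(by_element == by_attribute)
```

With the first layout the set of element names is open-ended, which is
why a reject-everything-unknown rule doesn't fit that case.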
...
Here's an example:
[deleted]

That actually doesn't look too bad.  Do you think you'll be able to
release your stuff?  I don't have anything generic enough to be useful
yet, but I probably could release stuff if/when I do.
...
I'm undecided on whether I like the approach with lists of (name,
type) tuples better than the approach with property factories like in
the first example; the list approach allows me to order the attributes
and sub-elements consistently upon rendering, but I'm not particularly
keen on typing string quotes around Python identifiers.
The property factories are nice, and I have the same aversion to string
quoting Python identifiers.  I personally have not had a use case for
retaining sub-element order.

I may play with my own implementation of your spec and see how far I can
get.  I definitely would like to see /something/ at a higher abstraction
than minidom though.

-Barry

Barry Warsaw