[Tutor] Injecting Data into XML Files

Dave Kuhlman dkuhlman at rexx.com
Mon Sep 11 18:57:28 CEST 2006


On Mon, Sep 11, 2006 at 12:11:37PM -0400, William O'Higgins Witteman wrote:
> I am wrestling with the incredibly vast array of XML parsing and writing
> documentation, and I'm not seeing (or perhaps not understanding) what
> I'm looking for.  Here's the situation:
> 
> I have a large number of XML documents to add data to.  They are
> currently skeletal documents, looking like this:
> 
> <?xml version="1.0" ?>
> <!DOCTYPE rdf:RDF SYSTEM "local.dtd">
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>    <rdf:Description rdf:about="local_file">
>       <tagname></tagname>
>       <anothertagname></anothertagname>
>       ...
> 
> What I want is to open each document and inject some data between
> specific sets of tags.  I've been able to parse these documents, but I am
> not seeing how to inject data between tags so I can write it back to the
> file.  Any pointers are appreciated.  Thanks.

*How* did you parse your XML document?  If you parsed it and
produced a minidom tree or, better yet, an ElementTree tree,
you can modify the DOM tree, and then you can write that tree out
to disk.

Here is a bit of code to give you the idea with ElementTree (or
lxml, which uses the same API as ElementTree):

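    # Python 2.5 and later ship ElementTree as xml.etree; with the older
    # standalone package, use: from elementtree import ElementTree as etree
    from xml.etree import ElementTree as etree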
    doc = etree.parse('content.xml')
    root = doc.getroot()
    # Do something with the DOM tree here.
        o
        o
        o
    # Now write the tree back to disk.
    # write() accepts a filename and opens and closes the file itself.
    doc.write('tmp.xml')
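
For the skeletal document in the question, the "do something" step
might look roughly like the sketch below. It is only an illustration:
the inject() helper, the sample values, and the in-memory copy of the
document are my own assumptions, not taken from the real files. The
DOCTYPE line is left out of the sample; as far as I know ElementTree
does not write a DOCTYPE back out, so check the result if the DTD
reference matters to you.

```python
import io
from xml.etree import ElementTree as etree

# In-memory stand-in for one of the skeletal files (hypothetical content,
# modelled on the snippet in the question, with the DOCTYPE omitted).
SKELETON = """<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="local_file">
      <tagname></tagname>
      <anothertagname></anothertagname>
   </rdf:Description>
</rdf:RDF>
"""


def inject(source, values):
    """Parse source (a filename or file object) and set the text of every
    element whose tag is a key in values. Returns the ElementTree."""
    doc = etree.parse(source)
    for elem in doc.iter():
        if elem.tag in values:
            elem.text = values[elem.tag]
    return doc


# Hypothetical data to place between the tags.
doc = inject(io.StringIO(SKELETON),
             {"tagname": "some data", "anothertagname": "more data"})

# Serialize to a string for inspection; doc.write('tmp.xml') would save
# the same tree to disk instead.
result = etree.tostring(doc.getroot(), encoding="unicode")
print(result)
```

The un-prefixed tags can be matched by plain name; elements in the rdf
namespace would have to be matched by their full
"{namespace-uri}localname" form.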

Here is info on ElementTree -- Scroll down and look at the example
in the section titled "Usage", which seems to do something very
similar to what you ask about:

    http://effbot.org/zone/element-index.htm


And lxml -- same API as ElementTree plus additional capabilities,
but requires installation of libxml2 and libxslt:

    http://codespeak.net/lxml/

Also, minidom:

    http://docs.python.org/lib/module-xml.dom.minidom.html
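
If you already have a minidom tree, the same kind of change can be
sketched as follows. Again this is a rough illustration under my own
assumptions: set_text() is a hypothetical helper, the document is an
in-memory sample without the DOCTYPE line, and the injected value is
made up.

```python
from xml.dom import minidom

# In-memory stand-in for one of the skeletal files (hypothetical content).
SKELETON = """<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="local_file">
      <tagname></tagname>
      <anothertagname></anothertagname>
   </rdf:Description>
</rdf:RDF>
"""


def set_text(dom, tag_name, text):
    """Append a text node to every element named tag_name.
    Returns the number of elements that were changed."""
    elements = dom.getElementsByTagName(tag_name)
    for elem in elements:
        elem.appendChild(dom.createTextNode(text))
    return len(elements)


dom = minidom.parseString(SKELETON)
count = set_text(dom, "tagname", "some data")

# toxml() returns the whole document as a string; write that string to a
# file to save the change.
output = dom.toxml()
print(output)
```

With minidom the text is a child node of the element, so it is added
with createTextNode() and appendChild() rather than assigned to an
attribute as in ElementTree.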

Dave

-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman

