help needed creating xml document with xml.dom

Brian Lalor blalor at insight.com
Fri Sep 6 11:39:58 EDT 2002


Good morning, all.  Several times, now, I've used xml.dom.minidom to create
raw XML documents, but I don't think I've ever really done it the "right"
way.  I blame that on my seeming inability to find documentation on the
"right" way to do it in Python, or in any language, for that matter.  Pointers
to said documentation are welcomed. :-)

That out of the way, here's my current problem.  I'm trying to write a script
that will spin through a set of images taken with my digital camera, pull out
the EXIF data and save it to an XML document, and also extract the JPEG
thumbnail from each image.  Gene Cash's excellent EXIF.py[1] does the dirty
work of getting the EXIF data.  I also want to create an XHTML document that
is the index of all of these thumbnails, and that's where things started going
downhill.  A "proper" XHTML document requires a well-formed XML document along
with a document type declaration (DOCTYPE) below the <?xml ?> declaration and
an xmlns attribute on the <html/> tag.  Using minidom, I cannot seem to get
either the namespace or the doctype to print out.

Here's the relevant code snippet:

    import xml.dom
    
    # is this the right way, instead of directly importing xml.dom.minidom?
    DOM = xml.dom.getDOMImplementation()
    
    # now create the doctype and the document element
    doctype = DOM.createDocumentType('html',
                                     '-//W3C//DTD XHTML 1.0 Strict//EN',
                                     'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd')
    
    index_page = DOM.createDocument(xml.dom.XHTML_NAMESPACE, 'html', doctype)

    index_page.documentElement.setAttribute('xml:lang', 'en')
    index_page.documentElement.setAttribute('lang', 'en')


So now, if I call index_page.toprettyxml(), I only get
    <?xml version="1.0" ?>
    <html xml:lang="en" lang="en"/>

... but I *want*
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html
            PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"/>

Is this a bug in minidom, or (more likely) am I doing something wrong?  I'm
using Python 2.2.1 but I have PyXML 0.7 installed, so that's where minidom.py
is coming from.

Thanks,
B

[1] http://home.cfl.rr.com/genecash/digital_camera.html

--
      Brian Lalor                 |    http://introducingthelalors.org/
  blalor at ithacabands.org (email)  |  blalor at jabber.ithacabands.org (jabber)
                       N33°27.369' W111°56.304' (Earth)



