[XML-SIG] DOCTYPE problem loading XML file.
Brendon Costa
brendon at christian.net
Mon Apr 16 00:27:16 CEST 2007
Thanks, that worked great (with a few minor modifications). For reference,
the resulting script was:
import sys
import amara
import commands

doc = amara.parse(sys.argv[1])
for pl in doc.xml_xpath(u'//programlisting[@id]'):
    # The part of the id after "script_" is the command to run.
    if pl.id[:7] == 'script_':
        value = commands.getoutput(unicode(pl.id[7:]))
        # Replace the node's content with the command output.
        pl.xml_clear()
        pl.xml_append(unicode(value))
print doc.xml()
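
For anyone without amara installed, the same idea can be sketched with only
the standard library. This is a rough Python 3 sketch, not the script I
actually ran: the function name fill_listings and the prefix parameter are
made up for illustration, and ElementTree stands in for amara. Note that
ElementTree does not write the DOCTYPE back out when serializing, so it would
have to be re-added by hand.

```python
import subprocess
import xml.etree.ElementTree as ET


def fill_listings(root, prefix="script_"):
    """Replace the content of each matching programlisting with command output.

    fill_listings and prefix are hypothetical names used for this sketch.
    """
    # Materialize the list first so the tree can be modified safely.
    for pl in list(root.iter("programlisting")):
        ident = pl.get("id", "")
        if not ident.startswith(prefix):
            continue
        # subprocess.getoutput runs the command through the shell, like
        # commands.getoutput did, so only use it on trusted documents.
        output = subprocess.getoutput(ident[len(prefix):])
        # Drop any child elements, then set the new text content.
        for child in list(pl):
            pl.remove(child)
        pl.text = output
    return root
```

It would be used along the lines of tree = ET.parse(sys.argv[1]), then
fill_listings(tree.getroot()), then tree.write(...) to save the result.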
Luis Miguel Morillas wrote:
> 2007/4/14, Brendon Costa <brendon at christian.net>:
>> Hi all,
>>
>> I am writing a manual in DocBook format for a project I have been
>> developing. This manual contains "programlisting" nodes that show
>> output generated from some scripts.
>>
>> I want to write a small application using Python XML libraries that will
>> load this DocBook file and, for each programlisting node with an id that
>> starts with script_..., execute the script ... and replace the
>> programlisting node's value with the resulting output.
>>
>>
> try this quick example (using amara lib):
>
> {{{
> import sys
> import cStringIO
> import amara
> doc = amara.parse('doc.xml')
>
> fout_old = sys.stdout
> sys.stdout = cStringIO.StringIO()
> for pl in doc.xml_xpath(u'//programlisting[@id]'):
>     if pl.id[:7]=='script_':
>         exec(unicode(pl))
>         pl.xml_clear()
>         pl.xml_append_fragment(sys.stdout.getvalue())
> sys.stdout = fout_old
>
> print doc.xml()
> }}}
>
>
>
>>
>> Firstly does anyone know of an existing tool that could do this for me
>> (I haven't been successful in finding one)?
>>
>>
>>
>>
>> Otherwise, I have been trying to create my own tool in Python. The first
>> stage is loading the DocBook XML file into Python using the DOM
>> parser. This is my first time dealing with Python and XML.
>>
>> The code is so far VERY simple:
>>
>> import sys
>> from xml.dom.ext.reader import Sax2
>> reader = Sax2.Reader()
>> doc = reader.fromStream(sys.argv[1])
>>
>> Running that using:
>> python update_docbook.py manual.xml
>>
>> fails to load the manual.xml file. The XML file has a DOCTYPE. For
>> my needs in modifying the document, I don't care about the DOCTYPE; I
>> just want to keep it intact as it is. Is there any way to tell the DOM
>> parser that I don't care about the DOCTYPE?
>>
>>
>> If this is not possible, the following are the errors I get when trying
>> to load the DocBook XML file.
>>
>> Firstly without a DTD available at all:
>> ValueError: unknown url type: docbookx.dtd
>>
>>
>> If I then copy my DTD data into the current directory (the DOCTYPE
>> references a file in the current directory at the moment to avoid having
>> to go to the internet all the time), it seems to find it as I would
>> expect, but there are still other errors:
>> xml.Sax._exceptions.SAXParseException: dbnotnx.mod:60:80: error in
>> processing external entity reference
>>
>> And if I change the DOCTYPE back to the correct URL, I get the same
>> error, but with the full URL:
>> xml.Sax._exceptions.SAXParseException:
>> http://www.oasis-open.org/docbook/xml/4.5/dbnotnx.mod:60:80: error in
>> processing external entity reference
>>
>>
>> So how would I go about loading this DocBook XML file in Python using
>> DOM so I can then manipulate it? Would you recommend that I change to
>> a SAX parser, and if so, can it be used to ignore the DOCTYPE?
>>
>>
>> Thanks for any info.
>> Brendon.
>>
>
>
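
On the original DOCTYPE question quoted above: one option that may work is
the standard library's xml.dom.minidom. As far as I can tell it does not
fetch the external DTD, and it keeps the DOCTYPE declaration on the document
so that it is written back out. The sketch below is for Python 3; the SAMPLE
document and the helper name replace_listing_text are made up for
illustration. Entities that are defined only in the external DTD may not be
resolved this way.

```python
from xml.dom import minidom

# Made-up sample; docbookx.dtd does not need to exist for this to parse.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "docbookx.dtd">\n'
    '<book><programlisting id="script_demo">old output</programlisting></book>'
)


def replace_listing_text(doc, listing_id, new_text):
    """Swap the content of the programlisting with the given id.

    Hypothetical helper for this sketch; returns True if a node was found.
    """
    for pl in doc.getElementsByTagName("programlisting"):
        if pl.getAttribute("id") == listing_id:
            # Remove all existing children, then add a single text node.
            while pl.firstChild is not None:
                pl.removeChild(pl.firstChild)
            pl.appendChild(doc.createTextNode(new_text))
            return True
    return False


doc = minidom.parseString(SAMPLE)
replace_listing_text(doc, "script_demo", "new output")
xml_out = doc.toxml()  # serialized document, DOCTYPE included
```

For a real file, minidom.parse(sys.argv[1]) would take the place of
parseString(SAMPLE).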