[XML-SIG] DOCTYPE problem loading XML file.

Sun Apr 15 19:25:26 CEST 2007

2007/4/14, Brendon Costa <brendon at christian.net>:
> Hi all,
>
> I am writing a manual, in DocBook format, for a project I have been
> developing. This manual contains "programlisting" nodes that show
> output generated from some scripts.
>
> I want to write a small application using Python XML libraries that will
> load this DocBook file and, for each programlisting node with an id that
> starts with script_..., execute the script ... and replace the
> programlisting node's value with the resulting output.
>
>
Try this quick example (using the Amara library):

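{{{
import sys
import cStringIO
import amara

doc = amara.parse('doc.xml')

fout_old = sys.stdout
try:
    for pl in doc.xml_xpath(u'//programlisting[@id]'):
        if pl.id.startswith(u'script_'):
            # Use a fresh buffer per listing so that output from earlier
            # scripts does not leak into later ones.
            sys.stdout = cStringIO.StringIO()
            exec(unicode(pl))
            output = sys.stdout.getvalue()
            pl.xml_clear()
            # The fragment is parsed as XML, so output containing '<' or '&'
            # would need escaping first.
            pl.xml_append_fragment(output)
finally:
    # Always restore the real stdout, even if a script raises.
    sys.stdout = fout_old

print doc.xml()
}}}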
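
The capture step on its own might look like the sketch below on Python 3 (an assumption; the example above targets Python 2). run_and_capture is a made-up helper name, and exec should only ever see listings you trust.

```python
import contextlib
import io


def run_and_capture(source):
    """Execute Python source text and return whatever it printed."""
    buf = io.StringIO()
    # redirect_stdout restores the real stdout on exit, even on error.
    with contextlib.redirect_stdout(buf):
        exec(source, {})  # fresh namespace per run; trusted input only
    return buf.getvalue()


captured = run_and_capture("print('hello from a listing')")
```

Each call gets its own buffer, so the output of one listing cannot end up in another.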


>
> Firstly does anyone know of an existing tool that could do this for me
> (I haven't been successful in finding one)?
>
>
>
>
> Otherwise, I have been trying to create my own tool in Python. The first
> stage is loading the DocBook XML file into Python using the DOM
> parser. This is my first time dealing with Python and XML.
>
> The code is so far VERY simple:
>
> import sys
> from xml.dom.ext.reader import Sax2
> reader = Sax2.Reader()
> doc = reader.fromStream(sys.argv[1])
>
> Running that using:
> python update_docbook.py manual.xml
>
> fails to load the manual.xml file. The XML file has a DOCTYPE. For
> my needs in modifying the document, I don't care about the DOCTYPE; I
> just want to keep it intact as it is. Is there any way to tell the DOM
> parser that I don't care about the DOCTYPE?
>
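One possible route, sketched here with the standard library's xml.dom.minidom on Python 3 rather than the Sax2 reader: minidom does not fetch the external DTD, and it keeps the DOCTYPE as a node that is written back out on serialization. SAMPLE is a made-up stand-in for manual.xml, and load_keep_doctype is a hypothetical helper name. How entity references defined only in the DTD behave is something to check against the real manual.

```python
from xml.dom import minidom

# Hypothetical stand-in for manual.xml; the DTD named here is never fetched.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "docbookx.dtd">
<book><programlisting id="script_demo">old output</programlisting></book>
"""


def load_keep_doctype(xml_text):
    """Parse without resolving the external DTD; the DOCTYPE node is kept."""
    return minidom.parseString(xml_text)


doc = load_keep_doctype(SAMPLE)
public_id = doc.doctype.publicId   # identifiers copied from the declaration
system_id = doc.doctype.systemId
serialized = doc.toxml()           # the DOCTYPE is emitted again here
```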
>
> If this is not possible, the following are the errors I get when trying
> to load the DocBook XML file.
>
> Firstly without a DTD available at all:
> ValueError: unknown url type: docbookx.dtd
>
>
> If I then copy my DTD files into the current directory (the DOCTYPE
> references a file in the current directory at the moment, to avoid having
> to go to the internet all the time), it seems to find the DTD as I would
> expect, but there are still other errors:
> xml.Sax._exceptions.SAXParseException: dbnotnx.mod:60:80: error in
> processing external entity reference
>
> and if I change the DOCTYPE back to the correct URL, I get the same
> error, but reported against the URL:
> xml.Sax._exceptions.SAXParseException:
> http://www.oasis-open.org/docbook/xml/4.5/dbnotnx.mod:60:80: error in
> processing external entity reference
>
>
> So how would I go about loading this DocBook XML file in Python using
> DOM so I can then manipulate it? Would you recommend that I change to
> a SAX parser, and if so, can it be used to ignore the DOCTYPE?
>
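On the manipulation side, a plain DOM version could look roughly like this (a sketch on Python 3 with xml.dom.minidom, not tested against a real DocBook manual). replace_listings and produce_output are hypothetical names: produce_output stands for whatever runs the script named by the id and returns its output as text.

```python
from xml.dom import minidom


def replace_listings(doc, produce_output):
    """Swap the content of every script_* programlisting for fresh output."""
    for node in doc.getElementsByTagName("programlisting"):
        node_id = node.getAttribute("id")
        if not node_id.startswith("script_"):
            continue
        # Drop the old children, then add the new output as one text node.
        while node.firstChild is not None:
            node.removeChild(node.firstChild)
        node.appendChild(doc.createTextNode(produce_output(node_id)))
    return doc


doc = minidom.parseString(
    '<book><programlisting id="script_demo">stale</programlisting>'
    '<programlisting id="other">keep me</programlisting></book>'
)
replace_listings(doc, lambda node_id: "output of " + node_id[len("script_"):])
result = doc.documentElement.toxml()
```

Because the output goes in as a text node, characters such as '<' and '&' are escaped when the document is written out.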
>
> Thanks for any info.
> Brendon.
>


-- 
Regards,

--

Luis Miguel