But JBoss does care, and fails to load the XML document when the DOCTYPE is changed from <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd"> to <!DOCTYPE configuration SYSTEM "log4j.dtd">, because “configuration” no longer matches the log4j:configuration root.

If I modify the XML document with Perl's XML::LibXML, the DOCTYPE isn’t changed.
But because we are using Ansible for application deployment, we need to use lxml, unless I do some ugly stuff.
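
One possible form of that "ugly stuff", as a minimal sketch rather than a recommended fix: rewrite the name in the DOCTYPE string reported by lxml and hand it back to etree.tostring() through its doctype argument. The function names doctype_with_qname and serialize_with_prefixed_doctype are hypothetical, and the sketch assumes the installed lxml accepts the doctype keyword. It serializes the root element rather than the tree, so comments or processing instructions placed before the root element are not written.

```python
import re


def doctype_with_qname(doctype, qname):
    """Return the DOCTYPE string with its root name replaced by qname.

    doctype is a string such as '<!DOCTYPE configuration SYSTEM "log4j.dtd">'.
    Only the name directly after '<!DOCTYPE' is rewritten.
    """
    # The replacement is a function so that qname is inserted literally.
    return re.sub(r'^(<!DOCTYPE\s+)[^\s>\[]+',
                  lambda m: m.group(1) + qname,
                  doctype, count=1)


def serialize_with_prefixed_doctype(tree):
    """Serialize an lxml tree with the prefixed root name in the DOCTYPE."""
    # Imported here so the string helper above has no lxml dependency.
    from lxml import etree

    root = tree.getroot()
    local = etree.QName(root).localname
    # root.prefix is None when the root element has no prefix.
    qname = "{}:{}".format(root.prefix, local) if root.prefix else local
    doctype = doctype_with_qname(tree.docinfo.doctype, qname)
    # Pass the root element, not the tree, so the DOCTYPE stored in the
    # parsed document is not written in addition to the one given here.
    return etree.tostring(root, encoding="UTF-8", xml_declaration=True,
                          doctype=doctype or None)
```

With the document from my original post, serialize_with_prefixed_doctype(doc) should write <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd"> after the XML declaration, which is the form JBoss accepts.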


=:-) Kim Grønborg Nielsen
E kgn+lxml@network-it.dk


Begin forwarded message:

From: Stefan Behnel <stefan_ml@behnel.de>
Subject: Re: [lxml] docinfo.doctype don't return the original doctype
Date: 28 December 2017 at 08.35.43 CET
To: lxml@lxml.de

On 27 December 2017 19:35:20 CET, "Kim Grønborg Nielsen" wrote:
I’m using the following code to extract the DOCTYPE with Python 2.7 and lxml
4.1.1:
from lxml import etree
from StringIO import StringIO

if __name__ == '__main__':
  doc = etree.parse(StringIO('''<?xml version="1.0"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j = "http://jakarta.apache.org/log4j/"
debug="false">
 <a>tasty</a>
</log4j:configuration>'''))
  print "Type: {}\n".format(doc.docinfo.doctype)

But it returns: Type: <!DOCTYPE configuration SYSTEM "log4j.dtd">
And not, as I expected: Type: <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">

The DOCTYPE looks correct to me. DTDs are not namespace aware and do not know or care about prefixes.

Stefan
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml@lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml