External DTDs are overwritten when several distinct ones are specified
Hello lxml-ers! I have an existing document whose DTD includes several external DTDs. Each of these external DTDs resolves (populating 'docinfo.internalDTD') when specified one at a time, i.e. with only one external DTD in the DOCTYPE. However, if two or more external DTDs are specified, only the entities from the first are loaded into the internalDTD.
Am I doing it wrong or is this a bug?
For an example, see the Python program below. Note that I have local copies of xhtml-lat1.ent and xhtml-symbol.ent, which can be downloaded from
http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent and http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent respectively.
# EXAMPLE START
from io import BytesIO

from lxml import etree

# load_dtd=True: load and parse the DTD while parsing (no validation is performed)
parser = etree.XMLParser(load_dtd=True)
xml = b'''\
<?xml version='1.1' encoding='utf-8' ?>
<!DOCTYPE root [
<!ENTITY % HTMLlat1 PUBLIC
"-//W3C//ENTITIES Latin 1 for XHTML//EN"
"xhtml-lat1.ent">
%HTMLlat1;
<!ENTITY % HTMLsymbol PUBLIC
"-//W3C//ENTITIES Symbols for XHTML//EN"
"xhtml-symbol.ent">
%HTMLsymbol;
]>
<root></root>'''
# Parse the in-memory document, then list every entity declared in its internal DTD
tree = etree.parse(BytesIO(xml), parser)
print(tree.docinfo.internalDTD.entities())
# EXAMPLE END
In the output I see all the entities that would be expected from xhtml-lat1.ent but xhtml-symbol.ent resolves to "
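To make it easier to compare what was loaded against what each .ent file should contribute, a small helper along these lines might help. This is only a sketch: the function name entity_names and the two inline sample entities are hypothetical stand-ins, and it relies on the same docinfo.internalDTD.entities() call as the program above, reading the name attribute of each declaration.

```python
from io import BytesIO

from lxml import etree


def entity_names(xml_bytes, parser=None):
    """Return the sorted names of all entities declared in the internal DTD."""
    tree = etree.parse(BytesIO(xml_bytes), parser)
    dtd = tree.docinfo.internalDTD
    if dtd is None:  # the document has no DOCTYPE at all
        return []
    return sorted(entity.name for entity in dtd.entities())


# Stand-in document with two inline entities (hypothetical names), so the
# helper can be tried without any external .ent files present.
sample = b'''<!DOCTYPE root [
<!ENTITY first "1">
<!ENTITY second "2">
]>
<root/>'''

names = entity_names(sample)
print(names)
```

Calling entity_names(xml, parser) with the xml and parser objects from the program above should then give a plain list of names that can be checked against the entities declared in each of the two files.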
Ewan Willis