
Hi Keith,
(I have not actually tried any of this, but) here is what comes to mind:

- Parse with resolve_entities=False and then handle the unresolved entities yourself. You should be able to get at the entity references in the tree, e.g. using tree.iter(etree.Entity). The entity definitions can be accessed through the tree's docinfo attribute: tree.docinfo.internalDTD.entities()

- If you have control over the XML file, maybe switch to XInclude instead of entities to include separate content? Again, instead of processing the inclusions automatically (through tree.xinclude()), iterate over the include nodes yourself for full control over what you want to achieve.

- lxml supports custom resolvers. Both the DTD and the XInclude approach might be combinable with such custom resolvers to hook into the regular mechanics and get more control over the actual resolution (see https://lxml.de/resolvers.html).

Note: you're probably aware that resolve_entities=True can be a security risk if applied to untrusted XML input. Handle with care.

Best regards,
Holger
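
As an illustration of the first suggestion, here is a minimal sketch of parsing with resolve_entities=False and substituting the entity references by hand. The sample document, the entity name "greeting" and the helper replace_entity() are made up for this example, and the sketch assumes the entity's replacement text is plain text without markup; treat it as a starting point, not a tested recipe for your files.

```python
from lxml import etree

# Hypothetical sample document with one internal entity.
XML = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY greeting "Hello, world">
]>
<doc><p>&greeting;</p></doc>
"""

# Keep entity references in the tree instead of expanding them.
parser = etree.XMLParser(resolve_entities=False)
root = etree.fromstring(XML, parser)
tree = root.getroottree()

# Entity definitions from the internal DTD subset: name -> replacement text.
definitions = {
    ent.name: ent.content for ent in tree.docinfo.internalDTD.entities()
}

# Unresolved references appear as entity nodes in the tree.
references = [node.name for node in tree.iter(etree.Entity)]


def replace_entity(node, text):
    """Replace an entity node with plain text (hypothetical helper)."""
    parent = node.getparent()
    # lxml removes the tail together with the node, so carry it over.
    text = text + (node.tail or "")
    previous = node.getprevious()
    if previous is not None:
        previous.tail = (previous.tail or "") + text
    else:
        parent.text = (parent.text or "") + text
    parent.remove(node)


# Apply your own policy here; this one simply looks up the definition.
for node in list(tree.iter(etree.Entity)):
    replace_entity(node, definitions.get(node.name, ""))
```

The loop is the place to hook in whatever custom handling you need, e.g. skipping certain entity names or loading the content from somewhere else.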
Holger.Joukl@LBBW.de