[lxml-dev] etree.cleanup_namespaces() removes namespaces that are used in ns-prefixed attribute values

Hi, would that be a bug or is it just not supported/sanely supportable?
>>> from lxml import etree
>>> root = etree.fromstring('<root xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><x xsi:type="xsd:integer">42</x></root>')
>>> print etree.tostring(root, pretty_print=True)
<root xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <x xsi:type="xsd:integer">42</x>
</root>

>>> etree.cleanup_namespaces(root)
>>> print etree.tostring(root, pretty_print=True)
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <x xsi:type="xsd:integer">42</x>
</root>

Note how the XML Schema namespace, which is referenced by the attribute value xsd:integer, has been removed.

>>> print etree.__version__  # oldschool ;-)
3.7.2
Holger

Landesbank Baden-Wuerttemberg, public-law institution. Head offices: Stuttgart, Karlsruhe, Mannheim, Mainz. HRA 12704 Stuttgart District Court; HRA 4356, HRA 104 440 Mannheim District Court; HRA 40687 Mainz District Court.
LBBW processes your personal data in accordance with the requirements of the GDPR. Information is available at https://www.lbbw.de/datenschutz.

Hi Holger,

since the xsd: prefix is only part of a value (instead of a name), I think this is expected.

--dirk
_________________________________________________________________ Mailing list for the lxml Python XML toolkit - http://lxml.de/ lxml@lxml.de https://mailman-mail5.webfaction.com/listinfo/lxml
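If the behaviour is indeed expected, one possible workaround is to tell cleanup_namespaces() which prefixes to keep. The following is a minimal sketch, assuming lxml 3.5 or later (where the keep_ns_prefixes argument is available) and assuming that xsi:type is the only attribute carrying prefixed values. The names XSI_TYPE, used_prefixes and cleaned are illustrative and not part of lxml.

```python
from lxml import etree

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

root = etree.fromstring(
    '<root xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<x xsi:type="xsd:integer">42</x></root>'
)

# Collect the prefixes referenced inside xsi:type values.
# Assumption: no other attribute or text content carries prefixed names.
used_prefixes = {
    el.get(XSI_TYPE).split(":", 1)[0]
    for el in root.iter("*")
    if ":" in el.get(XSI_TYPE, "")
}

# Drop unused namespace declarations, but keep the prefixes found above
# even though no element or attribute *name* uses them.
etree.cleanup_namespaces(root, keep_ns_prefixes=sorted(used_prefixes))

cleaned = etree.tostring(root).decode()
print(cleaned)
```

This only covers values that can be found by a known attribute name. Prefixed names hidden in element text or in other attributes would need their own collection step.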

Holger Joukl wrote on 02.05.19 at 17:02:
> would that be a bug or is it just not supported/sanely supportable?
> [...]
> Note how the XML Schema namespace as used in xsd:integer has been removed.

A missing feature, I'd say. C14N 2.0 (which will be in lxml 4.4) allows defining a set of elements and attributes in whose text or values qualified names should be detected (and adapted). That might be a good approach here, too.

https://github.com/lxml/lxml/blob/3f0db5d57940eebd418fe86bcbdad39ffe23211d/s...

> print etree.__version__ # oldschool
> 3.7.2

Sorry, new feature, new release. :)

Stefan
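The C14N 2.0 option mentioned in this reply can be tried out with the standard library's xml.etree.ElementTree.canonicalize() (Python 3.8+), which accepts a qname_aware_attrs option. This is a sketch only: it uses the standard library rather than lxml, on the assumption that lxml 4.4+ offers the same option through its own C14N 2.0 support. The names XSI_TYPE, doc, plain and aware are illustrative.

```python
from xml.etree.ElementTree import canonicalize

# Attribute names are given in Clark notation: {namespace-uri}localname
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

doc = (
    '<root xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<x xsi:type="xsd:integer">42</x></root>'
)

# Default C14N 2.0: only namespaces used by element/attribute names are
# declared, so the xsd declaration is dropped.
plain = canonicalize(doc)

# With xsi:type marked as QName-aware, the prefix inside the attribute
# value counts as a use of the xsd namespace, so its declaration is kept.
aware = canonicalize(doc, qname_aware_attrs=[XSI_TYPE])

print(plain)
print(aware)
```

Note that canonicalization also moves each declaration to the element where the prefix is first used, so the result is not a drop-in replacement for cleanup_namespaces().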
participants (3)
- Dirk Rothe
- Holger Joukl
- Stefan Behnel