[lxml-dev] lxml namespaces

Hi All!

I have a little problem with XML namespaces. In my application I have two XML processors that process the same document, one after the other. The first one looks for nodes in the 'ns1' namespace and substitutes them according to some algorithm. After this processor has finished, it is guaranteed that there are no more 'ns1' nodes left in the tree. The 'ns1' namespace declaration is still there, in the root node (well, I put it there manually).

Now that this namespace is no longer needed, I want to get rid of it, because it confuses some other processors (namely, my browser). So, the question is, how do I do that?

    del tree.getroot().nsmap['ns1']

does not seem to do the trick :(

Thanks in advance and Happy Holidays!

-- Maxim Sloyko
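For readers wondering why the `del` above has no visible effect, here is a small sketch. It assumes the lxml behaviour documented for the `.nsmap` property, namely that every access builds a new dict, so modifying the returned dict does not touch the element. The document string and the `http://example.com/ns1` URI are made up for illustration.

```python
from lxml import etree

# Hypothetical document with an unused 'ns1' declaration on the root.
root = etree.fromstring(
    '<root xmlns:ns1="http://example.com/ns1"><child/></root>'
)

# Each access to .nsmap returns a freshly built dict, so this only
# edits a temporary copy and never reaches the underlying tree.
nsmap_copy = root.nsmap
del nsmap_copy["ns1"]

still_declared = "ns1" in root.nsmap
serialized = etree.tostring(root).decode()

print(still_declared)
print(serialized)
```

The declaration therefore has to be removed some other way, for example by rebuilding the root element or by running the tree through XSLT.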

Hi, On 2007-01-07 16:24:15 +0100, "Maxim Sloyko" <m.sloyko@gmail.com> said:
Actually, I'm curious too about how to remove namespaces, and I'm not sure at all how to do that. Clueless right now. -- Christian Zagrodnick gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891

Sounds a bit like a case for XSLT to me.
Hmmm, I think the easiest way to remove unused namespaces from a document is:

    # keep every prefix mapping except the one for the unwanted namespace URI
    new_nsmap = dict((p, n) for p, n in root.nsmap.items() if n != NS_TO_REMOVE)
    new_root = etree.Element(root.tag, root.attrib, new_nsmap)
    new_root.text = root.text
    new_root.tail = root.tail
    new_root[:] = root[:]   # moves the children over to the new root
    root = new_root

That's somewhat costly, but it's a rare use case anyway... or use XSLT.

Honestly, assuring tree correctness if ".nsmap" were writable is not at all trivial. You'd have to

- check which namespaces are being added and which are removed (incl. parental inheritance, prefix override issues, ...)
- verify that removed namespaces are no longer used anywhere in the subtree
- replace the namespace declarations on the node, keeping pointers to the old ones
- fix all namespace references in the subtree
- free the now-unused namespace declarations

The "fix all namespaces" bit is easy (should just work with the usual moveNodeToDocument() dance), but I don't feel like implementing the first steps right now...

Stefan
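The root-rebuilding approach above can be tried out with a self-contained sketch like the following. The helper name `rebuild_root_without_ns`, the sample document and the `http://example.com/ns1` URI are assumptions made for illustration; they are not part of lxml or of the original post.

```python
from lxml import etree

NS_TO_REMOVE = "http://example.com/ns1"  # hypothetical namespace URI


def rebuild_root_without_ns(root, ns_uri):
    """Create a new root element without the declaration for ns_uri.

    The children are moved (not copied) from the old root to the new one.
    """
    new_nsmap = {p: n for p, n in root.nsmap.items() if n != ns_uri}
    new_root = etree.Element(root.tag, dict(root.attrib), nsmap=new_nsmap)
    new_root.text = root.text
    new_root.tail = root.tail
    new_root[:] = root[:]  # slice assignment moves the child elements
    return new_root


old_root = etree.fromstring(
    '<root xmlns:ns1="http://example.com/ns1" id="r">text'
    "<child>a</child><child>b</child></root>"
)
new_root = rebuild_root_without_ns(old_root, NS_TO_REMOVE)
result = etree.tostring(new_root).decode()
print(result)
```

Note that the old root is left empty afterwards, and that anything still holding a reference to the old root (such as an ElementTree wrapping it) will not see the new element automatically. lxml also provides `etree.cleanup_namespaces()`, which may be a simpler option where it is available.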

On 2008-02-19 17:24:14 +0100, "Stefan Behnel" <stefan_ml@behnel.de> said:
Yeah. http://cocoon.apache.org/2.0/faq/faq-xslt.html#faq-5 :)
Nono, the XSLT bit is fine. I don't quite understand why this works, but then I'm not that much into XSLT. Actually, seen that: http://www.patentstorm.us/patents/7120864.html :/ -- Christian Zagrodnick gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891

Christian Zagrodnick wrote:
XSLT is (mostly) about copying XML trees selectively. Things you do not copy will not appear in the result. In this case, you only copy plain (i.e. local) element names, not their namespaces (which may or may not be what you want).
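As a rough illustration of the "copy only local names" idea, here is a sketch that runs such a stylesheet through `etree.XSLT`. The stylesheet is written for this example and is not necessarily the one from the linked FAQ; the sample document and namespace URI are made up. It drops the namespace of every element and attribute, which, as said above, may or may not be what you want.

```python
from lxml import etree

# Stylesheet that recreates every element and attribute under its local
# name only, so no namespace (or declaration) is carried into the result.
STRIP_NS_XSLT = etree.XML(b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
  <xsl:template match="text()|comment()|processing-instruction()">
    <xsl:copy/>
  </xsl:template>
</xsl:stylesheet>
""")

strip_namespaces = etree.XSLT(STRIP_NS_XSLT)

# Hypothetical input: one namespaced element and one plain element.
doc = etree.fromstring(
    '<root xmlns:ns1="http://example.com/ns1">'
    "<ns1:item>a</ns1:item><child>b</child></root>"
)
result_tree = strip_namespaces(doc)
result_root = result_tree.getroot()
output = etree.tostring(result_root).decode()
print(output)
```

The result is a new tree; the input document is left unchanged.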
Actually, seen that: http://www.patentstorm.us/patents/7120864.html :/
Hehe, read the title: "Eliminating superfluous namespace declarations and undeclaring default namespaces *in XML serialization processing*". This is about serialisation only. The (intermediate) result of an XSLT is a tree, not a byte stream, so the namespace fixing is not part of the serialisation process. :] (Not that I would agree that this is worth a patent...) Stefan

participants (3): Christian Zagrodnick, Maxim Sloyko, Stefan Behnel