[XML-SIG] CDATA sections still not handled

Martin v. Loewis martin@mira.cs.tu-berlin.de
Wed, 17 Jan 2001 08:40:53 +0100


> Since one has NO interest in parsing the content, rendering, or
> interpreting it, but does have an interest in locating a particular
> node and adding a new fragment to it, then saving the modified
> document, via ext.PrettyPrint(which I am using), to file again,

I understand you are not interested in parsing the document; if you
build a DOM tree, parsing of the document will happen as a side
effect. You cannot avoid this: this is the only way to get a DOM tree
from a document. So while you are not interested in the parsing, you
should accept that it is done.

> then one obviously does not want CDATA markers to be removed,
> because, 1) they may have not written the first document, and 2)
> they are not trying to interpret it,

Who is "they" here? The CDATA markers, or the users of your tool?

So somebody has not written the document, and that same
person/entity/whatever is not trying to interpret it. Why does it
follow that this person/entity does not want the CDATA markers to be
removed? If that person does not even look at the document, why is
there any harm done by removing the CDATA markers? They have *no*
meaning in the document.

> You missed the point entirely in that I don't care where they are in
> the document.

I assume "they" is the CDATA markers, here. If you don't care where
they are in the document, why is it a problem if there is no CDATA
marker in the output of PrettyPrint?

> maybe the following will explain why it is useful ..... which is the
> hack I use to get CDATA back into the file again.  Presumably you
> would think that if you opened an xml file into a DOM tree, then
> saved it again, then it would still be the same "kind" of document,

That I would think. It should still be the same "kind" of document,
i.e. have the same elements, the elements should have the same
attributes, and elements containing text should still contain the same
text.

> i.e. CDATA nodes would STILL be CDATA nodes.

No, I would not think that. Changing CDATA nodes to text does not
change the document; it is still the same one. Replacing CDATA
fragments with text is the same kind of transformation as replacing
&lt; with &#60; - this does not change the document.
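
To illustrate the equivalence, here is a minimal sketch using the
xml.etree.ElementTree module from the Python standard library (an
assumption for illustration; it is not the DOM implementation discussed
in this thread). It parses the same character data written as a CDATA
section, as the entity reference &lt;, and as the character reference
&#60;, and compares the text the application sees. The element name "a"
is made up.

```python
import xml.etree.ElementTree as ET

# Three spellings of the same content: "x < y" inside element <a>.
variants = [
    "<a>x <![CDATA[<]]> y</a>",   # CDATA section
    "<a>x &lt; y</a>",            # predefined entity reference
    "<a>x &#60; y</a>",           # numeric character reference
]

# After parsing, only the character data remains; the spelling is gone.
texts = [ET.fromstring(v).text for v in variants]
all_same = len(set(texts)) == 1

print(texts[0], all_same)
```

All three inputs yield the same text, which is why a serializer is free
to pick whichever spelling it likes on output.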

> Yes I assume 1) the node name is unique and 2) that its first child is a
> text node ......
> 
> def convertTextNodeToCDataNodeByName(doc,name):
>     node_list = doc.getElementsByTagNameNS('',name)
>     text_node = node_list[0].firstChild
>     text_data = retPrettyPrint(text_node)
>     new_cdata_node = makeCDataSection(doc,text_data)
>     text_node.parentNode.replaceChild(new_cdata_node,text_node)

That means you know in advance that the original document contains
only a single CDATA fragment, and that you want to produce one in the
output in the same location (i.e. inside the same element as in the
original input).

What if there is more than one CDATA section in the original document?
What if there was none?
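
One way to cope with both cases is to name the elements whose text
should be written as CDATA and convert every match, however many there
are. The following is only a sketch under assumptions: it uses
xml.dom.minidom from the Python standard library rather than the
functions quoted above, the helper name text_to_cdata and the element
names "script" and "p" are hypothetical, and it does not handle text
containing "]]>", which cannot appear inside a single CDATA section.

```python
from xml.dom import minidom

def text_to_cdata(doc, tag_names):
    """Replace every text child of the named elements with a CDATA section.

    Returns the number of text nodes converted (0 if nothing matched).
    """
    converted = 0
    for name in tag_names:
        for element in doc.getElementsByTagName(name):
            # Copy the child list, since we modify it while iterating.
            for child in list(element.childNodes):
                if child.nodeType == child.TEXT_NODE:
                    cdata = doc.createCDATASection(child.data)
                    element.replaceChild(cdata, child)
                    converted += 1
    return converted

source = (
    "<root>"
    "<script>if (a &lt; b) go();</script>"
    "<p>plain</p>"
    "<script>x &amp; y</script>"
    "</root>"
)
doc = minidom.parseString(source)
count = text_to_cdata(doc, ["script"])   # two matching elements
none = text_to_cdata(doc, ["missing"])   # no matching element
output = doc.documentElement.toxml()
print(count, none)
print(output)
```

Because the choice is driven by element name rather than by what the
input happened to contain, the output is the same whether the original
document used CDATA sections, entity references, or a mixture.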

Regards,
Martin