[firstname.lastname@example.org: Re: [email@example.com: Re: [XML-SIG] Thought it was a bug, maybe XML is weirder than I thought]]
Sun, 1 Oct 2000 19:17:54 -0700
From: "Martin v. Loewis" <firstname.lastname@example.org>
>I think you have a point on splitting a text fragment into multiple
>Text nodes; the DOM spec says about the interface Text:
># If there is no markup inside an element's content, the text is
># contained in a single object implementing the Text interface that is
># the only child of the element. If there is markup, it is parsed into
># a list of elements and Text nodes that form the list of children of
># the element.
>I'm not sure what that means for parsing &lt;hallo&gt; - is it
>permitted that these are split into three Text nodes, is it required
>that they are split, or is it disallowed?
>Section 2.4 of XML 1.0 [REC-xml-19980210] says that an
>entity reference is markup; 4.1 says that &gt; is an entity reference
>(*not* a character reference) - so it appears permitted that multiple
>Text nodes are created.
Thanks, Martin. (And please accept my apologies for posting from a
state of abysmal ignorance regarding XML. Being a person who actually
enjoys reading standards documents, I'm going to read through the
document you referenced.)
>You *should* be able to merge them by calling normalize() on the tree;
>I'm not sure whether that worked in 0.5.5.1, it does work with 4DOM in
>PyXML 0.6. Please note that normalize won't merge CDATA sections.
It does work, at least on my test data.
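
For anyone who wants to see the merge in isolation, here is a minimal
sketch. It assumes the standard-library xml.dom.minidom rather than the
PyXML 0.5.5.1 or 4DOM implementations discussed above, and it builds the
three Text nodes by hand instead of relying on a parser to split
&lt;hallo&gt;, since whether a given parser splits the text is exactly
the open question. The element name "doc" and the strings are made up
for illustration.

```python
from xml.dom.minidom import Document

doc = Document()
root = doc.createElement("doc")  # hypothetical element name
doc.appendChild(root)

# Simulate a parser that reported "&lt;hallo&gt;" as three Text nodes.
for piece in ("<", "hallo", ">"):
    root.appendChild(doc.createTextNode(piece))
before = len(root.childNodes)

# normalize() merges adjacent Text nodes in place.
root.normalize()
after = len(root.childNodes)
merged = root.firstChild.data

# A CDATA section is a different node type, so it is left alone and
# also keeps the Text nodes on either side of it from being merged.
root.appendChild(doc.createCDATASection("raw"))
root.appendChild(doc.createTextNode("tail"))
root.normalize()
kinds = [node.nodeType for node in root.childNodes]

print(before, after, repr(merged), kinds)
```

With minidom, the element goes from three children to one after the
first normalize(), and the CDATA section stays a separate child after
the second. Other DOM implementations may behave differently.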