[XML-SIG] DOM normalize() broken? entity refs lost?
A.M. Kuchling
Tue, 27 Apr 1999 22:41:53 -0400
Jeff.Johnson@icn.siemens.com writes:
> XmlWriter does not define .doOtherNode()
> so nothing gets written.
Eek! You're right. Try this patch:
Index: writer.py
RCS file: /home/cvsroot/xml/dom/writer.py,v
retrieving revision 1.8
diff -C2 -r1.8 writer.py
*** writer.py 1999/04/08 00:14:29 1.8
--- writer.py 1999/04/28 02:29:42
*** 119,123 ****
class XmlLineariser(XmlWriter):
--- 119,125 ----
!     def doOtherNode(self, node):
!         self.stream.write( node.toxml() )
class XmlLineariser(XmlWriter):
> <P>Text on multiple
> lines and with extra white space in the
> raw HTML doesn't change when dom.get_documentElement().normalize()
Careful; that isn't what normalize() does. Add another Text
node as a child of the TITLE element, to produce two Text nodes next
to each other. dom.dump() will then output:
<DOM Document; root=<Element 'HTML'> >
<Element 'TITLE'>
<Text node 'test'>
<Text node 'ADDED TEXT'>
<Text node '\012'>
After calling normalize:
<DOM Document; root=<Element 'HTML'> >
<Element 'TITLE'>
<Text node 'testADDED TEXT'>
<Text node '\012'>
See how the two text nodes have been merged? It doesn't do anything
about whitespace.
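
The same merging behaviour can be reproduced in a few lines. This is only
a rough sketch, and it assumes the standard library's xml.dom.minidom
rather than the xml.dom package discussed in this thread, so the method
names (documentElement, createTextNode) differ from the get_*() style
used above. The sample markup is made up for illustration.

```python
from xml.dom import minidom

# Assumption: xml.dom.minidom stands in for the DOM package in this thread.
doc = minidom.parseString(
    "<HTML><TITLE>test</TITLE><P>Text  on\n   multiple lines</P></HTML>"
)

title = doc.getElementsByTagName("TITLE")[0]
para = doc.getElementsByTagName("P")[0]

# Append a second Text node so TITLE has two adjacent Text children.
title.appendChild(doc.createTextNode("ADDED TEXT"))

title_before = [n.data for n in title.childNodes]
para_before = para.firstChild.data

# normalize() merges adjacent Text nodes throughout the subtree.
doc.documentElement.normalize()

title_after = [n.data for n in title.childNodes]
para_after = para.firstChild.data

print(title_before)
print(title_after)
# The whitespace inside <P> is left exactly as it was.
print(para_before == para_after)
```

The TITLE element goes from two Text children to one, while the text
inside P keeps its extra spaces and newline.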
To strip out whitespace, look at strip_whitespace or
collapse_whitespace in xml.dom.utils; after collapse_whitespace(dom,
WS_INTERNAL), runs of whitespace are collapsed down to a single space.
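
To illustrate what "collapsed down to a single space" means, here is a
minimal sketch. It is not the xml.dom.utils implementation:
collapse_text_whitespace is a hypothetical helper written for this
example, and it again assumes xml.dom.minidom from the standard library.

```python
import re
from xml.dom import minidom


def collapse_text_whitespace(node):
    """Hypothetical helper, not part of xml.dom.utils.

    Walks the tree below `node` and replaces every run of whitespace
    inside a Text node with a single space.
    """
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            child.data = re.sub(r"\s+", " ", child.data)
        else:
            collapse_text_whitespace(child)


doc = minidom.parseString("<P>Text   on multiple\n   lines</P>")
collapse_text_whitespace(doc)

collapsed = doc.documentElement.firstChild.data
print(repr(collapsed))
```

Unlike normalize(), this changes the character data itself and leaves the
number of Text nodes alone.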
A.M. Kuchling http://starship.python.net/crew/amk/
Guards! Guards! Stop this madman! He's turning everyone into monkeys!
-- A sudden intrusion, in ZOT! #1