[XML-SIG] DOM normalize() broken? entity refs lost?

A.M. Kuchling akuchlin@cnri.reston.va.us
Tue, 27 Apr 1999 22:41:53 -0400

Jeff.Johnson@icn.siemens.com writes:
 > XmlWriter does not define .doOtherNode()
 > so nothing gets written.  

	Eek! You're right.  Try this patch:

Index: writer.py
RCS file: /home/cvsroot/xml/dom/writer.py,v
retrieving revision 1.8
diff -C2 -r1.8 writer.py
*** writer.py	1999/04/08 00:14:29	1.8
--- writer.py	1999/04/28 02:29:42
*** 119,123 ****
  class XmlLineariser(XmlWriter):
--- 119,125 ----
!     def doOtherNode(self, node):
!         self.stream.write( node.toxml() )
  class XmlLineariser(XmlWriter):

 > <P>Text on multiple
 > lines and with extra white         space in the
 > raw HTML doesn't change when dom.get_documentElement().normalize()

	Careful; that isn't what normalize() does.  Add another Text
node as a child of the TITLE element, to produce two Text nodes next
to each other.  dom.dump() will then output:

<DOM Document; root=<Element 'HTML'> >
   <Element 'TITLE'>
    <Text node 'test'>
    <Text node 'ADDED TEXT'>
   <Text node '\012'>

After calling normalize():

<DOM Document; root=<Element 'HTML'> >
   <Element 'TITLE'>
    <Text node 'testADDED TEXT'>
   <Text node '\012'>

See how the two text nodes have been merged?  It doesn't do anything
about whitespace.
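
For anyone who wants to reproduce the merge, here is a minimal sketch.
It assumes the standard library's xml.dom.minidom rather than the
xml.dom package discussed in this thread, so the names used
(parseString, documentElement, createTextNode) are minidom's and not
the get_documentElement() style API above.  The sample document is
made up for illustration.

```python
from xml.dom.minidom import parseString

# Illustrative document (an assumption, not the poster's actual file).
doc = parseString(
    "<HTML><TITLE>test</TITLE>"
    "<P>Text on multiple\nlines and with extra white         space</P>"
    "</HTML>"
)

title = doc.getElementsByTagName("TITLE")[0]
para = doc.getElementsByTagName("P")[0]

# Append a second Text node so TITLE has two adjacent Text children.
title.appendChild(doc.createTextNode("ADDED TEXT"))
before = [node.data for node in title.childNodes]
p_text_before = para.firstChild.data

# normalize() merges adjacent Text nodes throughout the subtree.
doc.documentElement.normalize()
after = [node.data for node in title.childNodes]
p_text_after = para.firstChild.data

print(before)   # two separate Text nodes
print(after)    # one merged Text node
print(p_text_before == p_text_after)  # whitespace inside P is untouched
```

The text inside the P element comes out exactly as it went in, which
is the behaviour described in the original question.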

To strip out whitespace, look at strip_whitespace or
collapse_whitespace in xml.dom.utils; after collapse_whitespace(dom,
WS_INTERNAL), runs of whitespace are collapsed down to a single space.
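
As a rough illustration of that kind of collapsing, the sketch below
walks a tree and squeezes each run of whitespace in a Text node down
to one space.  It is not the xml.dom.utils implementation:
collapse_internal_whitespace is a hypothetical helper written against
the standard library's xml.dom.minidom, and it does not try to
reproduce the exact meaning of the WS_INTERNAL flag.

```python
import re
from xml.dom.minidom import parseString


def collapse_internal_whitespace(node):
    """Hypothetical stand-in for collapse_whitespace: replace every run
    of whitespace in each Text node below `node` with a single space."""
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            child.data = re.sub(r"\s+", " ", child.data)
        else:
            # Recurse into elements; other node types have no children.
            collapse_internal_whitespace(child)


# Illustrative input modelled on the quoted HTML fragment.
doc = parseString(
    "<HTML><P>Text on multiple\n"
    "lines and with extra white         space</P></HTML>"
)
collapse_internal_whitespace(doc.documentElement)
collapsed = doc.getElementsByTagName("P")[0].firstChild.data
print(collapsed)
```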

A.M. Kuchling			http://starship.python.net/crew/amk/
Guards! Guards! Stop this madman! He's turning everyone into monkeys!
    -- A sudden intrusion, in ZOT! #1