[XML-SIG] PrettyPrint

Tue, 17 Dec 2002 09:26:23 -0700

> On Tue, 17 Dec 2002, Anders Bruun Olsen wrote:
> 
> > On Tue, Dec 17, 2002 at 04:54:30PM +0100, JS wrote:
> > > I've got the following problem:
> > > If I save the following xml file using PrettyPrint()
> > > <?xml version="1.0"?>
> > > <doc><raw><![CDATA[foo]]></raw><raw>bar</raw></doc>
> > >  I get:
> > > ##################PrettyPrint
> > > <?xml version='1.0' encoding='UTF-8'?>
> > > <doc>
> > >   <raw>
> > >     <![CDATA[foo]]>
> > >   </raw>
> > >   <raw>bar</raw>
> > > </doc>
> > > Now the next time I read in the XML, a new text node is created in the
> > > DOM tree because of the newline and indent after the first <raw> element.
> > > I could use Print, since it doesn't add any additional data. But Pretty is
> > > more pretty ;-)
> > > I'm not sure about ignorable whitespace in XML, but I don't think this
> > > output is correct.
> > > Shouldn't the output be ...<raw><![CDATA[foo]]></raw>...?
> >
> > I haven't tried this myself, but it sounds like what you are looking for
> > is a normalize function - that way you should be able to rid yourself of
> > the extra whitespace and linebreaks.
> >
> > Please correct me if I am wrong :)
> 
> normalize will not get rid of extra white space and line breaks; it will
> only combine adjacent text nodes into a single text node.  This won't fix
> this problem as there still will be a text node before the "raw" element.
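
A minimal sketch of that point. It uses the standard library's
xml.dom.minidom as a stand-in, which is an assumption: the thread concerns
the PyXML/4Suite DOM, whose behaviour may differ in detail. The pretty-printed
document is the one from the original message.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# The PrettyPrint output quoted earlier in the thread.
pretty = """<?xml version='1.0' encoding='UTF-8'?>
<doc>
  <raw>
    <![CDATA[foo]]>
  </raw>
  <raw>bar</raw>
</doc>"""

doc = minidom.parseString(pretty)
first_raw = doc.getElementsByTagName("raw")[0]

# Child count of the first <raw> before and after DOM normalize().
count_before = len(first_raw.childNodes)
doc.normalize()
count_after = len(first_raw.childNodes)

# The whitespace-only text node in front of the CDATA section is still there.
leading = first_raw.firstChild
print(count_before, count_after, repr(leading.data))
```

With minidom, normalize() only merges adjacent text nodes and drops empty
ones, so the whitespace node introduced by the indentation survives.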

I think he means XPath normalize-space(), which removes leading & trailing
white space, as well as collapsing internal runs of whitespace.
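
For illustration, here is a rough pure-Python equivalent of what XPath's
normalize-space() does to a string. The helper name normalize_space is
hypothetical, not a 4Suite or PyXML API, and it assumes the XML definition
of whitespace (space, tab, CR, LF).

```python
import re

# XML whitespace characters: space, tab, carriage return, line feed.
_XML_WS = " \t\r\n"


def normalize_space(text):
    """Strip leading/trailing XML whitespace and collapse internal runs
    of whitespace to a single space (hypothetical helper)."""
    stripped = text.strip(_XML_WS)
    return re.sub("[" + _XML_WS + "]+", " ", stripped)


# The character data of the first <raw> element after pretty-printing.
padded = "\n    foo\n  "
print(repr(normalize_space(padded)))
```

Applied to the text read back from the pretty-printed <raw> element, this
recovers the original "foo", though it would also alter any whitespace that
was significant in the original data.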

This used to be a stand-alone function in Ft/Xml/Xslt/XmlWriter.py as well, 
but I think Jeremy or I merged it into TranslateCdataAttr.

-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
A Python & XML Companion - http://www.xml.com/pub/a/2002/12/11/py-xml.html
XML class warfare - http://www.adtmag.com/article.asp?id=6965
MusicBrainz metadata - http://www-106.ibm.com/developerworks/xml/library/x-think14.html