Tue, 17 Dec 2002 09:26:23 -0700
> On Tue, 17 Dec 2002, Anders Bruun Olsen wrote:
> > On Tue, Dec 17, 2002 at 04:54:30PM +0100, JS wrote:
> > > I've got the following problem:
> > > If I save the following xml file using PrettyPrint()
> > > <?xml version="1.0"?>
> > > <doc><raw><![CDATA[foo]]></raw><raw>bar</raw></doc>
> > > I get:
> > > ##################PrettyPrint
> > > <?xml version='1.0' encoding='UTF-8'?>
> > > <doc>
> > > <raw>
> > > <![CDATA[foo]]>
> > > </raw>
> > > <raw>bar</raw>
> > > </doc>
> > > Now the next time I read in the XML, a new text node is created in the DOM tree
> > > because of the indentation and newline after the first <raw> start tag.
> > > I could use Print since it doesn't add any additional data. But Pretty is more
> > > pretty ;-)
> > > I'm not sure how ignorable whitespace works in XML, but I don't think this
> > > output is correct.
> > > Shouldn't the output be ...<raw><![CDATA[foo]]></raw>...?
> > I haven't tried this myself, but it sounds like what you are looking for
> > is a normalize function - that way you should be able to rid yourself of
> > the extra whitespace and linebreaks.
> > Please correct me if I am wrong :)
> normalize will not get rid of extra whitespace and line breaks; it only
> combines adjacent text nodes into a single text node. This won't fix this
> problem, as there will still be a text node before the "raw" element.
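To make that point about DOM normalize concrete, here is a minimal sketch. It uses the standard library's xml.dom.minidom rather than 4Suite's own DOM, on the assumption that normalize() behaves the same way in both. The two-space indent in the sample document and the helper name whitespace_text_children are made up for illustration.

```python
from xml.dom import minidom

# Pretty-printed form of the document from the original post.
# The two-space indent is an assumption; the post only shows the line breaks.
PRETTY = (
    "<doc>\n"
    "  <raw>\n"
    "    <![CDATA[foo]]>\n"
    "  </raw>\n"
    "  <raw>bar</raw>\n"
    "</doc>"
)


def whitespace_text_children(node):
    """Return the whitespace-only text nodes that are direct children of node."""
    return [
        child
        for child in node.childNodes
        if child.nodeType == child.TEXT_NODE and not child.data.strip()
    ]


doc = minidom.parseString(PRETTY)
first_raw = doc.getElementsByTagName("raw")[0]

before = len(whitespace_text_children(first_raw))
doc.normalize()  # merges adjacent text nodes and drops empty ones, nothing more
after = len(whitespace_text_children(first_raw))

# The whitespace-only text nodes around the CDATA section are still there.
print(before, after)
```

The counts do not change, because the whitespace nodes are neither empty nor adjacent to another plain text node.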
I think he means XPath's normalize-space(), which strips leading and trailing
whitespace and collapses internal runs of whitespace into a single space.
This used to be a stand-alone function in Ft/Xml/Xslt/XmlWriter.py as well,
but I think Jeremy or I merged it into TranslateCdataAttr.
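For reference, the XPath-style whitespace normalization described above can be approximated in a few lines of plain Python. This is only a sketch of the semantics, not the code that lived in XmlWriter.py or TranslateCdataAttr; the function name normalize_space is chosen here for illustration.

```python
import re

# Whitespace as defined by the XML spec: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(text):
    """Approximate XPath normalize-space(): collapse each run of XML
    whitespace to one space, then strip leading and trailing spaces."""
    return _XML_WS.sub(" ", text).strip(" ")


# Text content of the first <raw> element once the pretty-printed file
# has been read back in (indentation assumed for illustration).
reparsed = "\n    foo\n  "
print(repr(normalize_space(reparsed)))
```

Note that applying this to every text node would also change whitespace that is meant to be significant, so it only helps when the application knows the content is whitespace-insensitive.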
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
A Python & XML Companion - http://www.xml.com/pub/a/2002/12/11/py-xml.html
XML class warfare - http://www.adtmag.com/article.asp?id=6965
MusicBrainz metadata - http://www-106.ibm.com/developerworks/xml/library/x-thi