XML and newlines

Hans Nowak wurmy at earthlink.net
Sat Feb 16 12:17:06 EST 2002


"Martin v. Loewis" wrote:
> 
> Hans Nowak <wurmy at earthlink.net> writes:
> 
> > I manage to store that string just fine; only when I retrieve
> > the record, all the newlines are gone.
> 
> I can hardly believe this. Can you post a small script that
> demonstrates this problem?

That will be a bit difficult, because the XML comes back from
the server, but I'll try to emulate what it's doing:

#---begin---

from xml.dom.minidom import *   # who cares, it's only a test :-)

xml = """\
<root>
  <ticket foo="1" bar="2" image="
         one
         two
         three
         four" baz="4" />
</root>
"""

parsed_xml = parseString(xml)

# Show the re-serialized document, then the raw attribute values.
print(repr(parsed_xml.toxml()))

for node in parsed_xml.getElementsByTagName("ticket"):
    print(repr(node.getAttribute("foo")))
    print(repr(node.getAttribute("image")))

#---end---

The multiline XML string is in roughly the same format as the
raw XML the server returns. That is, the value of the image
attribute does contain newlines after 'one', 'two', and 'three'
(as it did in the string I originally stored). As soon as the
XML is parsed, these newlines disappear. Maybe this is normal and
to be expected when using xml.dom.minidom.parseString; I don't
know. Either way, my question is: is there a way to parse the XML
*and* somehow preserve the newlines in the string?
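
A minimal sketch of what appears to be going on, assuming the
parser applies the XML specification's attribute-value
normalization: a literal newline inside an attribute value is
turned into a space, while a newline written as the character
reference &#10; survives parsing. The helper name image_attr and
the shortened sample documents are made up for illustration.

```python
from xml.dom.minidom import parseString

# Literal newlines inside the attribute value (what the server seems to send).
raw = '<root><ticket image="one\ntwo\nthree"/></root>'

# The same value with each newline written as a character reference.
escaped = '<root><ticket image="one&#10;two&#10;three"/></root>'


def image_attr(xml_text):
    """Parse xml_text and return the image attribute of the first ticket."""
    doc = parseString(xml_text)
    return doc.getElementsByTagName("ticket")[0].getAttribute("image")


print(repr(image_attr(raw)))      # 'one two three'
print(repr(image_attr(escaped)))  # 'one\ntwo\nthree'
```

If that assumption holds, the newlines can only be kept when
whoever writes the XML emits them as character references; the
parser itself cannot give them back once they are literal.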

I have to do a comparison of the record data before posting
and the record retrieved from the database, and therefore the
data must be exactly the same. Unfortunately I don't have much
control over how the server returns its XML. :-(
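
One possible workaround, sketched under the assumption that the
only difference introduced by the round trip is attribute-value
normalization: apply the same normalization to the original
string before comparing. The function normalize_attr_whitespace
and the sample value are hypothetical, not part of any library.
Note that this comparison is lossy, since it can no longer tell
a newline from a space.

```python
import re
from xml.dom.minidom import parseString


def normalize_attr_whitespace(value):
    """Mimic XML attribute-value normalization for an untyped attribute."""
    # Line ends are folded first, so CRLF counts as a single newline.
    value = value.replace("\r\n", "\n")
    # Each remaining literal tab, CR, or LF becomes exactly one space.
    return re.sub(r"[\t\r\n]", " ", value)


# Hypothetical value stored before posting (no characters needing escapes).
original = "\n         one\n         two\n         three\n         four"

# Emulate the server's reply by embedding the value literally.
xml_text = '<root><ticket image="%s"/></root>' % original
doc = parseString(xml_text)
retrieved = doc.getElementsByTagName("ticket")[0].getAttribute("image")

print(retrieved == original)                             # False
print(retrieved == normalize_attr_whitespace(original))  # True
```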

-- 
Hans (base64.decodestring('d3VybXlAZWFydGhsaW5rLm5ldA==')) 
# decode for email address ;-)
The Pythonic Quarter:: http://www.awaretek.com/nowak/


