[XML-SIG] Parsing XML file with Minidom has problem with cr/lf

Peterson, Wayne WaynePeterson at SierraSystems.com
Mon May 10 16:04:05 CEST 2010


That's what I thought as well. I was expecting the parser to ignore all
forms of linefeed.

I believe I am accessing my files as text files. The documentation for
minidom.parse says you can pass it a file name or a file object and I
have tried it both ways with the same result. Here is the open statement
I am using.

infile = open(in_path_file, 'r')
in_xmldoc = minidom.parse(infile)

The input file contains cr/lf line endings (x'0d0a').
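
For reference, a minimal sketch of the two ways of calling minidom.parse mentioned above, written for current Python 3 rather than 2.6. The sample content and temporary file are made up for illustration. Opening the file in binary mode ('rb') leaves both encoding detection and line-ending handling to the XML parser.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical sample file with Windows (CR LF) line endings.
content = b"<form>\r\n<survey/>\r\n</form>"
with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(content)
    in_path_file = tmp.name

try:
    # 1) Pass the file name and let minidom open the file itself.
    doc_from_path = minidom.parse(in_path_file)

    # 2) Pass a file object, opened in binary mode so the parser
    #    sees the raw bytes and normalizes the line endings itself.
    with open(in_path_file, "rb") as infile:
        doc_from_file = minidom.parse(infile)
finally:
    os.remove(in_path_file)

# Both routes should build the same tree.
same_result = doc_from_path.toxml() == doc_from_file.toxml()
print(same_result)
```

Either way the text nodes in the resulting tree should contain only "\n", since an XML parser is expected to normalize cr/lf to a single lf before handing the data over.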

When I do something like,

surveys = form.childNodes

the first node, surveys[0] (that is, form.firstChild), will contain x'0a', which I have to ignore.
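
A small sketch of how those extra nodes can be skipped, assuming a made-up document with "form" and "survey" elements (the names and the helper element_children are illustrative only, not part of minidom). As far as I can tell, the x'0a' node is the whitespace between the tags, which minidom keeps as a text node, so filtering on nodeType is one way around it.

```python
from xml.dom import minidom

# Hypothetical sample document with CR LF between the elements.
SAMPLE = b"<form>\r\n<survey id='1'/>\r\n<survey id='2'/>\r\n</form>"


def element_children(node):
    """Return only the element children of node, skipping text nodes."""
    return [child for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE]


doc = minidom.parseString(SAMPLE)
form = doc.documentElement

# The first child is the whitespace between <form> and the first <survey>.
first = form.firstChild
is_whitespace_text = (first.nodeType == first.TEXT_NODE
                      and first.data.strip() == "")

# Keep just the elements and read an attribute from each one.
surveys = element_children(form)
survey_ids = [s.getAttribute("id") for s in surveys]
print(is_whitespace_text, survey_ids)
```

The same whitespace nodes should show up for a file with plain lf line endings too, so the filter is needed regardless of platform.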

Wayne  

-----Original Message-----
From: Dieter Maurer [mailto:dieter at handshake.de] 
Sent: Sunday, May 09, 2010 11:50 PM
To: Peterson, Wayne
Cc: xml-sig at python.org
Subject: Re: [XML-SIG] Parsing XML file with Minidom has problem with cr/lf

Peterson, Wayne wrote at 2010-5-8 23:43 -0700:
>I am parsing an XML file with Python 2.6.5 minidom in Windows and it is
>mostly working but minidom seems to have problems dealing with Windows
>cr/lf characters. It creates an extra textnode that needs to be ignored
>instead of just returning the xml elements. I have tried different
>methods of opening the file but it doesn't seem to make a difference. It
>is happiest when reading a file in Unix format.

The parser should not see these "cr/lf" characters at all.

Python strings themselves use only "\n" (aka "lf") to delimit lines.
The "\r" (aka "cr") should only be introduced when those lines
are written to text files, and it should be removed when
those lines are read in again.

Are you sure that you access your files as "text" files?



--
Dieter



