[XML-SIG] Parsing XML file with Minidom has problem with cr/lf
stefan_ml at behnel.de
Mon May 10 08:57:43 CEST 2010
Dieter Maurer, 10.05.2010 07:50:
> Peterson, Wayne wrote at 2010-5-8 23:43 -0700:
>> I am parsing an XML file with Python 2.6.5 minidom in Windows and it is
>> mostly working but minidom seems to have problems dealing with Windows
>> cr/lf characters. It creates an extra textnode that needs to be ignored
>> instead of just returning the xml elements. I have tried different
>> methods of opening the file but it doesn't seem to make a difference. It
>> is happiest when reading a file in Unix format.
> The parser should not see these "cr/lf" characters at all.
> Python strings themselves use only "\n" (aka "lf") to delimit lines.
> The "\r" (aka "cr") should only be introduced when those lines
> are written to text files. And they should be removed when
> those lines are read in again.
> Are you sure that you access your files as "text" files?
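The correct way to parse XML files is as binary data: open the file in
binary mode and let the parser handle the encoding declaration and the
line-ending normalisation itself.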
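
As a rough sketch of what that could look like with minidom (written in
Python 3 syntax, so not exactly what 2.6.5 would run): the names SAMPLE,
parse_xml_file and child_elements below are made up for illustration and
are not part of minidom. The sample document is an assumption too; it
just stands in for a file saved with Windows line endings.

```python
import io
from xml.dom import minidom

# Hypothetical sample: a small document with Windows (cr/lf) line endings,
# held as bytes, the way it would come out of a file opened in binary mode.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\r\n'
    b'<root>\r\n'
    b'  <item>a</item>\r\n'
    b'  <item>b</item>\r\n'
    b'</root>\r\n'
)


def parse_xml_file(path):
    """Parse an XML file from raw bytes (hypothetical helper)."""
    # "rb" hands the parser the undecoded bytes, so it can read the
    # encoding declaration and normalise line endings on its own.
    with open(path, "rb") as f:
        return minidom.parse(f)


def child_elements(node):
    """Return only the element children, skipping text nodes (hypothetical helper)."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


# minidom.parse() also accepts a file-like object, so BytesIO stands in
# for a real file here.
doc = minidom.parse(io.BytesIO(SAMPLE))
root = doc.documentElement

# The indentation between elements shows up as whitespace-only text nodes.
text_nodes = [c for c in root.childNodes if c.nodeType == c.TEXT_NODE]

for element in child_elements(root):
    print(element.tagName, element.firstChild.data)
```

Note that the whitespace-only text nodes between the elements are part of
the document as far as minidom is concerned, so they show up whether the
file uses cr/lf or plain lf. Filtering on nodeType, as child_elements()
does here, is one way to skip them.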