[New-bugs-announce] [issue9521] xml.etree.ElementTree strips XML declaration and procesing instructions

Mark Summerfield report at bugs.python.org
Thu Aug 5 11:25:43 CEST 2010


New submission from Mark Summerfield <mark at qtrac.eu>:

If you read in an XML file using xml.etree.ElementTree.parse() and then write it out again using xml.etree.ElementTree.write() what is written may not be the same as what was read. In particular any XML declaration and processing instructions are stripped.

It seems to me that the parser should at least preserve any declaration and processing instructions so that reading and writing match up.

Here's an example:

Python 3.1.2 (r312:79147, Jul 15 2010, 10:56:05) 
[GCC 4.4.4] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> file = "control-center.xml"
>>> open(file).read()[:500]
'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [\n<!ENTITY VERSION "1.5.7">\n]>\n<article id="index" lang="en_GB">\n  \n  <articleinfo>\n    <abstract role="description">\n      <para>The GNOME Control Centre provides a central place for the user to setup their GNOME experience. It can let you configure anything from the behaviour of your window borders to the default font type.</para>\n   '
>>> import xml.etree.ElementTree as etree
>>> xml = etree.parse(file)
>>> temp = "temp.xml"
>>> xml.write("temp.xml", encoding="utf-8")
>>> open(temp).read()[:500]
'<article id="index" lang="en_GB">\n  \n  <articleinfo>\n    <abstract role="description">\n      <para>The GNOME Control Centre provides a central place for the user to setup their GNOME experience. It can let you configure anything from the behaviour of your window borders to the default font type.</para>\n    </abstract>\n    <title>Control Centre</title>\n    <authorgroup>\n      <author>\n\t<firstname>Kevin</firstname><surname>Breit</surname>\n      </author>\n    </authorgroup>\n    <copyright>\n      <y'
>>>

----------
components: Library (Lib)
messages: 112961
nosy: mark
priority: normal
severity: normal
status: open
title: xml.etree.ElementTree strips XML declaration and procesing instructions
type: behavior
versions: Python 3.1

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9521>
_______________________________________


More information about the New-bugs-announce mailing list