Minimally intrusive XML editing using Python

Chris Rebert clp2 at rebertia.com
Wed Nov 18 08:23:54 EST 2009


On Wed, Nov 18, 2009 at 4:55 AM, Thomas Lotze <thomas at thomas-lotze.de> wrote:
> I wonder what Python XML library is best for writing a program that makes
> small modifications to an XML file in a minimally intrusive way. By that I
> mean that information the program doesn't recognize is kept, as are
> comments and whitespace, the order of attributes and even whitespace
> around attributes. In short, I want to be able to change an XML file while
> producing minimal textual diffs.
>
> Most libraries don't allow controlling the order of and the whitespace
> around attributes, so what's generally left to do is store snippets of
> original text along with the model objects and re-use that for writing the
> edited XML if the model wasn't modified by the program. Does a library
> exist that helps with this? Does any XML library at all allow structured
> access to the text representation of a tag with its attributes?
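
For what it's worth, the "re-use the original text" idea can be sketched
without any XML library at all. This is only a rough illustration, not an
existing library's API: set_attribute_in_place is a made-up helper, it
locates the start tag with naive regular expressions (it assumes no '>'
inside attribute values and no look-alike tags inside comments or CDATA),
and it only handles the first matching tag.

```python
import re
from xml.sax.saxutils import escape


def set_attribute_in_place(text, tag, attr, new_value):
    """Hypothetical helper: change one attribute value and leave every
    other character of the document (whitespace, comments, attribute
    order, quoting style) exactly as it was."""
    # Find the first start tag with this name. Naive: assumes that no
    # attribute value contains a '>' character.
    tag_match = re.search("<" + re.escape(tag) + r"(?=[\s/>])[^>]*>", text)
    if tag_match is None:
        raise ValueError("tag not found: " + tag)

    # Inside that start tag, find attr="value" or attr='value'.
    attr_match = re.search(
        r"(\s" + re.escape(attr) + r"""\s*=\s*)(["'])(.*?)\2""",
        tag_match.group(0),
    )
    if attr_match is None:
        raise ValueError("attribute not found: " + attr)

    # Escape the new value for the quote style the author already used.
    quote = attr_match.group(2)
    entity = "&quot;" if quote == '"' else "&apos;"
    escaped = escape(new_value, {quote: entity})

    # Splice the new value into the original text; nothing else changes.
    start = tag_match.start() + attr_match.start(3)
    end = tag_match.start() + attr_match.end(3)
    return text[:start] + escaped + text[end:]


if __name__ == "__main__":
    source = '<root>\n  <!-- keep me -->\n  <item  b = "2"   a=\'1\' />\n</root>\n'
    print(set_attribute_in_place(source, "item", "a", "9"))
```

A real tool would want a proper tokenizer that records source offsets
instead of regular expressions, but the splice-into-the-original-text step
would look much the same.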

Have you considered using an XML-specific diff tool such as:
* One from this list:
http://www.manageability.org/blog/stuff/open-source-xml-diff-in-java
* xmldiff (it's in Python even): http://www.logilab.org/859
* diffxml: http://diffxml.sourceforge.net/

[Note: I haven't actually used any of these.]
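
To illustrate what an XML-aware comparison buys you, here is a minimal
sketch. It does not use any of the tools listed above; it swaps in the
standard library's xml.etree.ElementTree.canonicalize (C14N 2.0, available
from Python 3.8) as a stand-in, and same_xml_content is a made-up helper
name. Canonicalization sorts attributes and normalizes quoting and
empty-element syntax, so two files that differ only in those respects
compare as equal.

```python
from xml.etree.ElementTree import canonicalize  # requires Python 3.8+


def same_xml_content(xml_a, xml_b):
    """Hypothetical helper: compare two XML strings while ignoring
    attribute order, quoting style, and whitespace inside tags."""
    return canonicalize(xml_a) == canonicalize(xml_b)


if __name__ == "__main__":
    original = '<root><item b="2"   a=\'1\' /></root>'
    rewritten = '<root><item a="1" b="2"></item></root>'
    edited = '<root><item a="1" b="3"/></root>'

    print(same_xml_content(original, rewritten))  # only the formatting differs
    print(same_xml_content(original, edited))     # an attribute value differs
```

Note that canonicalize drops comments unless you pass with_comments=True,
and it keeps whitespace between elements unless you pass strip_text=True.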

Cheers,
Chris
--
http://blog.rebertia.com


