I am trying to diff some files in a company-internal XML format with lxml and write the diff back to disk as XML. I do a double diff: file1 against file2, and file3 against file2. Then I diff the two resulting diffs against each other and write that result to disk as well. I'm not using the HTML diff; I do the work myself.
In the end I do:

    etree.ElementTree(diff1).write("diff1.xml", pretty_print=True, xml_declaration=True)
    etree.ElementTree(diff2).write("diff2.xml", pretty_print=True, xml_declaration=True)
    result.addprevious(etree.PI('xml-stylesheet', 'type="text/xsl" href="uberdiff.xsl"'))
    resultTree = etree.ElementTree(result)
    resultTree.write("Udiff.xml", pretty_print=True, xml_declaration=True)

On inspection, the root of diff2 contains the children of the root of diff1 plus its own children, and Udiff contains the children of diff2 (including those of diff1!) plus its own. I have been trying to debug this for days and haven't found the cause yet.
I'm wondering whether I'm leaking these elements myself, or whether there is a bug or some internal behaviour in lxml that I don't understand yet. Please inspect my code at: https://github.com/Y3PP3R/difftooling
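For reference, this is the behaviour I expect, as a minimal sketch rather than my real code. The `build_diff` helper and the element names are hypothetical stand-ins for illustration only. As far as I understand, lxml moves an element to the new parent when it is appended, so the sketch creates a fresh root per diff and appends deep copies:

```python
import copy
from lxml import etree


def build_diff(name, source):
    # Hypothetical stand-in for my diff routine: a fresh root per call,
    # filled with deep copies so nothing is moved out of `source`.
    root = etree.Element(name)
    for child in source:
        root.append(copy.deepcopy(child))
    return root


# Made-up input documents, not my real files.
file_a = etree.fromstring("<doc><a/><b/></doc>")
file_b = etree.fromstring("<doc><c/></doc>")

diff1 = build_diff("diff1", file_a)
diff2 = build_diff("diff2", file_b)

print([c.tag for c in diff1])  # ['a', 'b']
print([c.tag for c in diff2])  # ['c']
print(len(file_a))             # 2, the source tree is left untouched
```

In my real code diff2 ends up with the children of diff1 as well, which is not what this sketch produces.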
I'm working with Python 2.7.3 (32-bit) on Windows 7 x64 and the lxml 2.3 binaries from http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml
Thanks in advance for any pointers in the right direction.