James Housden, 30.10.2012 20:35:
I have a large xml file that I need to modify and then store as a new xml file.
The file has a structure similar to:

    <root>
      <header>
        <txt>header txt</txt>
      </header>
      <record>
        <field1>1.0</field1>
        <subrecord>
          <field2>A1</field2>
          <field3>C1</field3>
        </subrecord>
      </record>
      <record>
        <field1>1.0</field1>
        <subrecord>
          <field2>A2</field2>
          <field3>C3</field3>
        </subrecord>
      </record>
      <record>
        <field1>1.0</field1>
        <subrecord>
          <field2>A4</field2>
          <field3>B</field3>
        </subrecord>
      </record>
    </root>
I would like to modify the contents of the field3 tags.
Due to the file size, I cannot load the complete document into memory, so I intend to use 'iterparse'. Traversing the document and updating the fields is no problem. What I am not sure about is how to write the modified data to a new XML file, because the root element is only complete once the whole file has been processed.
What I need to do is write the start of the root tag (<root>), then write the header and the records, and finally write the end of the root tag (</root>). Is there functionality in lxml to do this, or should I use standard Python writes for the initial <root> and the final </root>?
It's certainly easiest to just write out the root tag yourself. Take care of encodings in that case: as long as you only use UTF-8, you should be fine. Otherwise, you also have to write out an appropriate XML declaration before the root element and make sure the serialised XML elements are written to the file in that same encoding.
Stefan