Python script to optimize XML text

Gabriel Genellina gagsl-py2 at
Tue Sep 25 04:19:13 CEST 2007

On Mon, 24 Sep 2007 17:36:05 -0300, Robert Dailey <rcdailey at> wrote:

> I'm currently seeking a python script that provides a way of optimizing out
> useless characters in an XML document to provide the optimal size for the
> file. For example, assume the following XML script:
> <root>
>     <Test></Test>
>     <!-- <CommentedOutElement/> -->
>     <!-- Do Something Else -->
> </root>
> By running this through an XML optimizer, the file would appear as:
> <root><Test/></root>

ElementTree does that almost for free. I've just posted an example.

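source = """<root>
      <Test></Test>
      <!-- <CommentedOutElement/> -->

      <!-- Do Something Else -->

</root>"""

import xml.etree.ElementTree as ET
tree = ET.XML(source)      # the default parser drops comments
print ET.tostring(tree)

Output (comments are gone, whitespace between elements is kept):

<root>
      <Test />


</root>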

If you still want to remove all whitespace:

def stripws(node):
     if node.text:
         node.text = node.text.strip()
     if node.tail:
         node.tail = node.tail.strip()
     for child in node.getchildren():
         stripws(child)   # recurse into every child element

stripws(tree)
print ET.tostring(tree)

<root><Test /></root>
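
For anyone running this on Python 3, here is a minimal sketch of the same idea. It is an adaptation, not the code posted above: print is a function, the children are reached by iterating the element (getchildren() was removed in Python 3.9), and the names strip_whitespace and minify are made up for illustration. It also differs in one deliberate way: it only drops text and tail strings that are entirely whitespace, so real text content such as "<a> x </a>" is left untouched, whereas stripws() above trims it.

```python
import xml.etree.ElementTree as ET

SOURCE = """<root>
    <Test></Test>
    <!-- <CommentedOutElement/> -->
    <!-- Do Something Else -->
</root>"""


def strip_whitespace(node):
    """Recursively drop whitespace-only text and tail strings, in place."""
    if node.text is not None and not node.text.strip():
        node.text = None
    if node.tail is not None and not node.tail.strip():
        node.tail = None
    for child in node:  # iterating an Element yields its direct children
        strip_whitespace(child)


def minify(xml_text):
    """Parse xml_text and return it as a compact string.

    The default ElementTree parser discards comments while parsing.
    """
    root = ET.fromstring(xml_text)
    strip_whitespace(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(minify(SOURCE))  # prints <root><Test /></root>
```

Whether to trim whitespace around real text, as stripws() does, depends on the document: for data-style XML it is usually harmless, for mixed content it can join words together.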

Gabriel Genellina
