Problem with processing XML
jpcc at nowhere.org
Tue Jan 22 15:11:54 CET 2008
I'm new to Python and trying to use it to solve a specific problem. I
have an XML file in which I need to locate a specific text node and
replace the contents with some other text. The text in question is
actually about 70k of base64 encoded data.
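The task described above can be sketched with xml.dom.minidom roughly as follows. This is only an illustration, not the script from this post: the element name "payload", the sample document, and the helper replace_text are hypothetical, and the code is written for current Python 3 rather than 2.5.1.

```python
import base64
import xml.dom.minidom

# Hypothetical sample document; "payload" is an assumed element name.
SAMPLE = "<root><payload>b2xk</payload></root>"

def replace_text(doc, tag_name, new_text):
    """Replace the text content of the first element named tag_name."""
    element = doc.getElementsByTagName(tag_name)[0]
    # Remove every existing child (the old text may be split across
    # several text nodes), then append a single new text node.
    while element.firstChild is not None:
        element.removeChild(element.firstChild)
    element.appendChild(doc.createTextNode(new_text))

doc = xml.dom.minidom.parseString(SAMPLE)
encoded = base64.b64encode(b"new binary data").decode("ascii")
replace_text(doc, "payload", encoded)
result = doc.documentElement.toxml()
```

For a real file, xml.dom.minidom.parse(input_file) would take the place of parseString, and the document would be written back out afterwards.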
I wrote some code that works on my Linux box using xml.dom.minidom, but
it will not run on the Windows box that I really need it on. Both run
Python 2.5.1.
The Windows machine has a clean install of the Python .msi from
python.org. The Linux box is Ubuntu 7.10, which has some Python XML
packages installed that can't easily be removed (namely python-libxml2
I have boiled the code down to its simplest form which shows the problem:-
import sys
import xml.dom.minidom
input_file = sys.argv[1]
output_file = sys.argv[2]
doc = xml.dom.minidom.parse(input_file)
file = open(output_file, "w")
doc.writexml(file)  # the call that fails on Windows
The error is:-
$ python test2.py input2.xml output.xml
Traceback (most recent call last):
File "test2.py", line 9, in <module>
File "c:\Python25\lib\xml\dom\minidom.py", line 1744, in writexml
node.writexml(writer, indent, addindent, newl)
File "c:\Python25\lib\xml\dom\minidom.py", line 814, in writexml
File "c:\Python25\lib\xml\dom\minidom.py", line 809, in writexml
File "c:\Python25\lib\xml\dom\minidom.py", line 299, in _write_data
data = data.replace("&", "&amp;").replace("<", "&lt;")
AttributeError: 'NoneType' object has no attribute 'replace'
As I said, this code runs fine on the Ubuntu box. If I could work out
why the code runs on that box, that would help, because then I could set
up the Windows box the same way.
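One way to narrow down an error like this is to look for nodes whose data is None before serializing. The sketch below assumes that the None reaches _write_data through an attribute value; that is a guess from the traceback, not something confirmed in this post. The helper find_none_attributes and the sample document are hypothetical, and the code targets current Python 3.

```python
import xml.dom.minidom

def find_none_attributes(node, found=None):
    """Collect (element name, attribute name) pairs whose value is None."""
    if found is None:
        found = []
    if node.nodeType == node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.value is None:
                found.append((node.tagName, attr.name))
    # Recurse into children; text nodes simply have no children.
    for child in node.childNodes:
        find_none_attributes(child, found)
    return found

# Hypothetical sample; a real run would use xml.dom.minidom.parse(input_file).
doc = xml.dom.minidom.parseString('<root a="1"><child b="2"/></root>')
suspects = find_none_attributes(doc)
```

An empty list means no attribute value is None, in which case the assumption is wrong and the None comes from somewhere else.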
The input file contains an <xsd:schema> block which is what actually
causes the problem. If you remove that node and subnodes, it works
fine. For a while at least, you can view the input file at
Someone suggested that I should try xml.etree.ElementTree, however
writing the same type of simple code to import and then write the file
mangles the xsd:schema stuff because ElementTree does not understand
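The kind of prefix rewriting ElementTree performs can be reproduced with a small sketch. The namespace URI, the "ex" prefix, and the sample document are made up for illustration, and the code is for current Python 3. ElementTree stores names as {uri}local and picks its own prefixes when writing, while prefixes that appear inside attribute values are left as plain text. Note that register_namespace was added in later Python versions, so it would not be available in 2.5.1.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the URI and the "ex" prefix are illustrative only.
SAMPLE = (
    '<root xmlns:ex="http://example.com/ns">'
    '<ex:item type="ex:thing"/>'
    '</root>'
)

root = ET.fromstring(SAMPLE)

# Without help, ElementTree invents a prefix such as ns0 on output,
# but the "ex:" inside the attribute value is kept as plain text.
before = ET.tostring(root, encoding="unicode")

# Registering the prefix (a global setting) makes the writer use it again.
ET.register_namespace("ex", "http://example.com/ns")
after = ET.tostring(root, encoding="unicode")
```

In the first output the element prefix and the prefix inside the attribute value no longer match, which is one way a schema block can end up broken after a round trip.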
By the way, is PyXML a live project or not? Should it still be used?
It's odd that if you go to http://www.python.org/ and click the link
"Using python for..." XML, it leads you to
If you then follow the download links to
http://sourceforge.net/project/showfiles.php?group_id=6473 you see that
the latest file is from 2004, and there are no versions for newer Pythons.
It also says "PyXML is no longer maintained". Shouldn't the link be
removed from python.org?
Thanks in advance!