[Tutor] trying to parse an xml file
Steven D'Aprano
steve at pearwood.info
Sat Dec 14 23:22:09 CET 2013
On Sat, Dec 14, 2013 at 09:29:00AM -0500, bruce wrote:
> Hi.
>
> Looking at a file -->>
> http://www.marquette.edu/mucentral/registrar/snapshot/fall13/xml/BIOL_bysubject.xml
>
> The file is generated via online/web url, and appears to be XML.
>
> However, when I use elementtree:
> document = ElementTree.parse( '/apps/parseapp2/testxml.xml' )
>
> I get an invalid error : not well-formed (invalid token):
I cannot reproduce that error. Perhaps you have inadvertently corrupted
the file when downloading it? What did you use to download the file?
I used the wget command under Linux:
wget http://www.marquette.edu/mucentral/registrar/snapshot/fall13/xml/BIOL_bysubject.xml
And then I tried parsing it using ElementTree two different ways, both
of which succeeded with no errors:
py> import xml.etree.cElementTree as ET
py> tree = ET.ElementTree(file='BIOL_bysubject.xml')
py> root = tree.getroot()
py> for node in root:
...     print node.tag, node.attrib
...
STAMP {}
RECORD {}
RECORD {}
RECORD {}
[... snip lots more output for brevity ...]
py> tree = ET.parse('BIOL_bysubject.xml')
py> for node in tree.iter():
...     print node.tag, node.attrib
...
[... snip even more output ...]
Both worked fine and gave no errors. I'm using Python 2.7. If you need
additional help, I'm afraid that you're going to have to give more
detail on what you actually did. Please show how you downloaded the
file, what code you used to parse it, and the full error you receive.
Copy and paste the entire traceback.
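If the error persists on your copy of the file, something along these
lines may help pin down where the parser gives up. This is only a rough
sketch: locate_parse_error is a made-up helper name, not part of
ElementTree, and it assumes the whole file fits comfortably in memory.
It relies on the position attribute (line, column) that
ElementTree.ParseError carries in Python 2.7 and later.

```python
import xml.etree.ElementTree as ET


def locate_parse_error(data):
    """Try to parse raw XML bytes.

    Returns None if the data parses cleanly. Otherwise returns a tuple
    (line_number, column, offending_line) taken from the ParseError.
    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # err.position is (line, column); the line number is 1-based.
        line_no, column = err.position
        lines = data.splitlines()
        if 0 < line_no <= len(lines):
            bad_line = lines[line_no - 1]
        else:
            bad_line = b''
        return line_no, column, bad_line
    return None


# Possible usage, assuming the path from the original question:
#
#     with open('/apps/parseapp2/testxml.xml', 'rb') as f:
#         print(locate_parse_error(f.read()))
```

Looking at the reported line in its raw form (note the file is opened
in binary mode) should show whether it contains a stray control
character, an unescaped "&", or something else the parser rejects.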
--
Steven