[Tutor] Using Beautiful Soup to extract tag names
Kent Johnson
kent37 at tds.net
Tue Mar 14 16:38:46 CET 2006
Ed Singleton wrote:
> I have (unfortunately) received some data in XML format. I need to
> use it in Python, preferably as a list of dictionaries. The data is a
> flat representation of a table, in the style:
>
> <tablename>
> <fieldname1>Some Data</fieldname1>
> <fieldname2>Some Data</fieldname2>
> ...
> </tablename>
> <tablename>
> <fieldname1>Some Data</fieldname1>
> <fieldname2>Some Data</fieldname2>
> ...
>
> and so on (where tablename is always the same in one file).
ElementTree makes short work of this:
from elementtree import ElementTree
xml = '''
<data><tablename>
<fieldname1>Some Data1</fieldname1>
<fieldname2>Some Data2</fieldname2>
</tablename>
<tablename>
<fieldname3>Some Data3</fieldname3>
<fieldname4>Some Data4</fieldname4>
</tablename>
</data>'''
doc = ElementTree.fromstring(xml)
# use ElementTree.parse() to parse a file
for table in doc.findall('tablename'):
    for field in table.getchildren():
        print field.tag, field.text
prints:
fieldname1 Some Data1
fieldname2 Some Data2
fieldname3 Some Data3
fieldname4 Some Data4
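
Since the original question asked for a list of dictionaries, here is a minimal sketch of how the same loop could build one. It is not part of the code above: it assumes the standard-library xml.etree.ElementTree module and Python 3 syntax, and the name rows is only illustrative.

```python
# Assumption: the standard-library module, not the standalone elementtree package.
import xml.etree.ElementTree as ET

xml = '''
<data><tablename>
<fieldname1>Some Data1</fieldname1>
<fieldname2>Some Data2</fieldname2>
</tablename>
<tablename>
<fieldname3>Some Data3</fieldname3>
<fieldname4>Some Data4</fieldname4>
</tablename>
</data>'''

doc = ET.fromstring(xml)

# One dictionary per <tablename> element, mapping each child tag to its text.
# Iterating over an element yields its direct children.
rows = [
    {field.tag: field.text for field in table}
    for table in doc.findall('tablename')
]

for row in rows:
    print(row)
```

Each dictionary corresponds to one record, so rows[0]['fieldname1'] gives the first field of the first record. If a field name repeats within one record, the later value overwrites the earlier one.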
If speed is an issue, look at cElementTree, a C implementation with the
same interface that is considerably faster.
http://effbot.org/zone/element.htm
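
Because the two modules share an interface, one common way to use whichever is available is a guarded import. This is only a sketch: the alias ET and the final fallback to the standard-library xml.etree.ElementTree (included with Python since 2.5) are assumptions, not something from the code above.

```python
# Prefer the C implementation, then fall back to the pure-Python versions.
try:
    import cElementTree as ET  # standalone C implementation, if installed
except ImportError:
    try:
        from elementtree import ElementTree as ET  # standalone pure-Python package
    except ImportError:
        import xml.etree.ElementTree as ET  # assumption: standard-library copy

doc = ET.fromstring(
    '<data><tablename><fieldname1>Some Data1</fieldname1></tablename></data>'
)

# The rest of the code is the same whichever module was imported.
fields = [(field.tag, field.text) for field in doc.find('tablename')]
print(fields)
```

The rest of the program only refers to ET, so nothing else has to change when the faster module is installed.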
Kent