[Tutor] identifying and parsing string in text file
Kent Johnson
kent37 at tds.net
Sat Mar 8 22:35:28 CET 2008
Bryan Fodness wrote:
> I have a large file that has many lines like this,
>
> <element tag="300a,0014" vr="CS" vm="1" len="4"
> name="DoseReferenceStructureType">SITE</element>
> I would like to identify the line by the tag (300a,0014) and then grab
> the name (DoseReferenceStructureType) and value (SITE).
>
> I would like to create a file that would have the structure,
>
> DoseReferenceStructureType = Site
Presuming that your source file is XML, I think I would use
ElementTree.iterparse() to process this.
http://effbot.org/zone/element-iterparse.htm
http://docs.python.org/lib/elementtree-functions.html
Something like this (untested):
from xml.etree.ElementTree import iterparse

# iterparse() takes a filename or file object, not the file's contents
out = open('myoutput.txt', 'w')
for event, elem in iterparse('mydata.xml'):
    if elem.tag == "element":
        # Attributes are read with get(); indexing an element returns children
        name = elem.get('name')
        text = elem.text
        out.write('%s = %s\n' % (name, text))
        elem.clear()
out.close()
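
The question also asked to pick out the line by its tag attribute (300a,0014). Here is a rough sketch of that filter, assuming the file is well-formed XML with the <element> entries under a single root. The function name extract_by_tag, the <root> wrapper, and the second sample element are made up for illustration; they are not from the original file.

```python
import io
from xml.etree.ElementTree import iterparse


def extract_by_tag(source, wanted_tag):
    """Yield (name, value) for each <element> whose 'tag' attribute matches.

    source is a filename or a file object opened in binary mode.
    """
    for event, elem in iterparse(source):
        if elem.tag == "element" and elem.get("tag") == wanted_tag:
            # elem.text is None for an empty element, so fall back to ""
            yield elem.get("name"), (elem.text or "").strip()
        # Free the element's contents once it has been handled
        elem.clear()


# Hypothetical sample data: the <root> wrapper and the second element
# are assumptions made only so the example can run on its own.
sample = b"""<root>
<element tag="300a,0014" vr="CS" vm="1" len="4" name="DoseReferenceStructureType">SITE</element>
<element tag="0000,0000" vr="CS" vm="1" len="5" name="SomeOtherElement">OTHER</element>
</root>"""

results = list(extract_by_tag(io.BytesIO(sample), "300a,0014"))
for name, value in results:
    print("%s = %s" % (name, value))
```

With a real file, pass the filename instead of the BytesIO object and write each pair to the output file rather than printing it.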