[Tutor] Re: txt to xml using dom
Tom Brownlee
tompol@hotmail.com
Fri Jun 13 17:47:01 2003
Hello all, I've posted before but it was mentioned that my message was vague, so here is more detail.
I'm a tertiary student from New Zealand doing Python programming at the
moment. We have an assignment that involves using DOM (minidom). If
possible, can anybody help me?
I have to take a course outline in .txt format and pass it through a Python
program that writes the document back out marked up with XML tags. That is to
say: .txt document in, XML out (saved in a .txt file).
The course outline has headings and some subheadings that must be 'tagged',
and the text in between must be left as is in the output file, between the tags.
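The tagging rule described above can be sketched as a small line classifier. This is only a sketch under assumptions taken from the sample outline: a heading is taken to be a number followed by whitespace and a title, a subheading is taken to be a letter in parentheses followed by a title, and everything else is plain text. The names HEADING, SUBHEADING and classify are made up for illustration.

```python
import re

# Assumed line formats, inferred from the sample course outline:
#   heading:    "<digits><whitespace><title>"    e.g. "2 MATERIALS"
#   subheading: "(<letter>)<whitespace><title>"  e.g. "(b) Programme Leader"
HEADING = re.compile(r"^(\d+)\s+(.+)$")
SUBHEADING = re.compile(r"^\(([a-z])\)\s+(.+)$")


def classify(line):
    """Return (kind, label, text) for one line of the outline."""
    stripped = line.strip()
    match = HEADING.match(stripped)
    if match:
        return ("heading", match.group(1), match.group(2))
    match = SUBHEADING.match(stripped)
    if match:
        return ("subheading", match.group(1), match.group(2))
    # Anything else is body text and is passed through unchanged.
    return ("text", None, stripped)
```

Matching on `\s+` rather than on a tab means the classifier accepts either spaces or tabs after the heading number. A body line that happens to start with a number would be mistaken for a heading, so the patterns may need tightening for other outlines.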
I have written the program but it doesn't quite work. I have included it
below to see if you can make sense of why it doesn't work.
Thank you very much for your help.
The output I get is this:
<?xml version="1.0" ?>
<2003 Course Outline/>
...and that's it.
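One thing worth noting about that output: an XML element name cannot contain spaces or start with a digit, so <2003 Course Outline/> is not well-formed XML, even though minidom's createElement accepts the name without complaint. A minimal sketch of the difference follows. It assumes the outline title is moved into an attribute; the names build, is_well_formed, "CourseOutline" and "title" are made up for illustration.

```python
from xml.dom.minidom import Document, parseString
from xml.parsers.expat import ExpatError


def build(root_name, title=None):
    """Create a document with a single root element and serialise it."""
    doc = Document()
    root = doc.createElement(root_name)  # minidom does not validate the name
    if title is not None:
        root.setAttribute("title", title)
    doc.appendChild(root)
    return doc.toxml()


def is_well_formed(xml_text):
    """Return True if an XML parser accepts the text."""
    try:
        parseString(xml_text)
        return True
    except ExpatError:
        return False


# The root name used in the posted program: serialises, but cannot be parsed.
bad = build("2003 Course Outline")
# A valid name, with the human-readable title kept as an attribute value.
good = build("CourseOutline", title="2003 Course Outline")
```

Here is_well_formed(bad) is False and is_well_formed(good) is True, since attribute values may contain spaces and digits freely.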
I've also included below the small course outline to be parsed by the
program.
P.S.
If possible, can I have the corrected code? This may seem demanding and lazy
on my part, but I have spent many hours on this problem and am utterly
frustrated that it doesn't work, and as a beginner I'm losing faith in Python
altogether :(
<start of code>
import re
from xml.dom.minidom import *

def main(arg):
    try:
        f = open(arg)
    except:
        print "cannot open file"

    newdocument = Document()
    rootElement = newdocument.createElement("2003 Course Outline")
    newdocument.appendChild(rootElement)

    tagSequence = re.compile("(^\d+)\t+")

    while 1:
        line = f.readline()
        if len(line) == 0:
            break
        s = line
        target = tagSequence.search(s)
        if target:
            s2 = re.search("\t", s)
            result = s[s2.span()[1]:]
            newElement = newdocument.createElement(result)
            rootElement.appendChild(newElement)

    x = newdocument.toxml()
    f=open('CourseOutlineInXml.txt', 'w')
    f.write(x)
    print x

if __name__ == '__main__':
    main("CourseOutline.txt")
<end of code>
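For comparison, here is a hedged sketch of one way the conversion could be written in current Python 3. It is not the assignment's required layout: the element names courseOutline, section and subsection, the attribute names number, label and title, and the function names are all assumptions made for illustration. A likely reason the posted program emits only the root element is that its pattern requires a tab straight after the leading number, so heading lines that use spaces never match; that cannot be confirmed from the post alone. The sketch therefore matches any whitespace, keeps heading text in attributes rather than in tag names, and puts the body text between the tags as text nodes.

```python
import re
from xml.dom.minidom import Document

# Assumed line formats, inferred from the sample outline.
HEADING = re.compile(r"^(\d+)\s+(.+)$")            # e.g. "2 MATERIALS"
SUBHEADING = re.compile(r"^\(([a-z])\)\s+(.+)$")   # e.g. "(b) Programme Leader"


def outline_to_xml(lines):
    """Build a minidom Document from an iterable of outline lines."""
    doc = Document()
    root = doc.createElement("courseOutline")  # hypothetical root name
    doc.appendChild(root)

    section = None   # most recent <section>, if any
    current = root   # element that receives plain text lines

    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        heading = HEADING.match(line)
        sub = SUBHEADING.match(line)
        if heading:
            section = doc.createElement("section")
            section.setAttribute("number", heading.group(1))
            section.setAttribute("title", heading.group(2))
            root.appendChild(section)
            current = section
        elif sub and section is not None:
            current = doc.createElement("subsection")
            current.setAttribute("label", sub.group(1))
            current.setAttribute("title", sub.group(2))
            section.appendChild(current)
        else:
            # Body text is kept as is, between the enclosing tags.
            current.appendChild(doc.createTextNode(line + "\n"))
    return doc


def main(in_path="CourseOutline.txt", out_path="CourseOutlineInXml.txt"):
    """Read the outline file and write the XML next to it."""
    with open(in_path) as src:
        doc = outline_to_xml(src)
    with open(out_path, "w") as dst:
        dst.write(doc.toxml())
```

Calling main() with the default arguments reads CourseOutline.txt and writes CourseOutlineInXml.txt, the same file names the posted program uses. Using with blocks also closes both files, and an unreadable input file raises an error at once instead of failing later on an undefined f.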
<start course document>
1 COURSE STAFF MEMBERS
(a) Course Academic Staff Member
Rob Oliver - Room number S662. Contact number 940 8556
Email: oliverr@cpit.ac.nz
(b) Programme Leader
Trevor Nesbit, Room number N215. Contact number 940 8703
Email: nesbitt@cpit.ac.nz
(c) Course Co-ordinator
Dr Mike Lance - Room number S661 Contact number 940 8318
Email: lancem@cpit.ac.nz
(d) Head of School (Acting)
Janne Ross, Room number S176, Contact number 940 8537
Email: rossj@cpit.ac.nz
2 MATERIALS
NIL
3 CLASS HOURS AND TIMES
Day       Time           Room
Tuesday   10:00 - 12:00  X307
Thursday  10:00 - 12:00  L249
4 REFERENCE TO STUDENT HANDBOOKS
Students should obtain a copy of the following
Christchurch Polytechnic Student Handbook
Faculty of Commerce Student Handbook
Programme Handbook
Each of these contains information to students about a range of policies and
procedures.
<end of course document>