<table cellspacing="0" cellpadding="0" border="0" ><tr><td valign="top" style="font: inherit;"><pre>>><i> I'm having problems with creating XML-documents, <br></i>>><i> because I don't seem to write it to a document correctly. <br></i><br>>Is that because you don't understand XML or because the <br>>output is not what you expect? How is the data being generated? <br>>Are you parsing an existing XML source or creating the XML <br>>from scratch? I'm not sure I understand your problem.<br><br>I know the theory of XML but have never really used it and<br>I'm a bit unsure about it.<br><br>Basically I'm doing the following:<br><br>1. Retrieve data from a database ( instance in q )<br>2. Pass the data to an external Java-program that requires file input <br>3. The Java-program modifies the input file and creates an output file based on the input file<br>4. I read the output file and try to parse it.<br><br>1 to 3 are performed by a
separate program that creates the XML<br>4 is a program that tries to parse it (and then performs other<br>modifications using Python)<br><br>When I try to parse the output file it raises various errors such as:<br> * ExpatError: not well-formed (invalid token):<br><br>Basically it usually has something to do with XML that is not well-formed. <br>Unfortunately the Java-program also alters the content in essential <br>places, such as inserting spaces in tags (e.g. id="value" to id = " value " ),<br>which makes it even harder. The Java is really a b&%$#!, but I have<br>no alternatives because it is custom-made (but very poorly imho).<br><br>Sorry, I've not been clear, but it's very difficult and frustrating for <br>me to troubleshoot this properly because the Java-program is quite huge and <br>takes a long time to load before doing its actions, and when running<br>it also requires a lot of time. The minimum is about 10 minutes per run.<br><br>This means trying
a few little things takes hours.<br>Because of the long load and processing time of the Java-program I'm forced<br>to store the output in a single file instead of processing it record by record.<br><br><br>Also, each time I have to change something I have to modify functions in <br>several different libraries, each of which performs a specific task. This probably means<br>that I've not done it the right way in the first place.<br><br><br><br>>><i> text = str('<record id="' + str(instance.id)+ '">\n' + \<br></i>' <date>' + str(instance.datetime) + ' </date>\n' + \<br>' <order>' + instance.order + ' </order>\n' + \<br>'</record>\n')<br><br>>You can simplify this quite a lot. You almost certainly don't need <br>>the outer str() and you probably don't need the \ characters either.<br><br>I use a very simplified text-variable here. In reality I also include <br>other fields which contain numeric values as well. I use the
\ to<br>keep each XML-tag on a separate line so the code stays readable.<br><br><br>>Also it might be easier to use a triple quoted string and format <br>>characters to insert the data values.<br><br>>><i> When I try to parse it, it keeps giving errors. <br></i><br>>Why do you need to parse it if you are creating it?<br>>Or is this after you read it back later? I don't understand the <br>>sequence of processing here.<br><br>>><i> So I tried to use an external library jaxml, <br></i><br>>Did you try to use the standard library tools that come with Python, <br>>like ElementTree or even sax?<br><br>I've been trying to do this with minidom, but I'm not sure if this <br>is the right solution because I'm not very familiar with XML writing/parsing.<br><br>At the moment I'm tempted to do a line-by-line parse and trigger on<br>an identifier-string that identifies the end and start of a record. <br>But that way I'll never learn
XML.<br><br><br>>I think we need a few more pointers to the root cause here.<br><br></pre></td></tr></table><br>
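For reference, here is a minimal sketch of the writing and reading steps using the standard library's xml.etree.ElementTree instead of string concatenation. It is only an illustration under assumptions: the field names (id, date, order) come from the quoted snippet, while the enclosing "records" root element, the helper function names and the sample row are hypothetical. It also does not cover what the Java-program does to the file in between.

```python
import xml.etree.ElementTree as ET


def build_record(record_id, timestamp, order):
    """Build one <record> element; ElementTree escapes special characters itself."""
    record = ET.Element("record", id=str(record_id))
    ET.SubElement(record, "date").text = str(timestamp)
    ET.SubElement(record, "order").text = str(order)
    return record


def records_to_xml(rows):
    """Wrap all records in a single root element ("records" is an assumed name)."""
    root = ET.Element("records")
    for record_id, timestamp, order in rows:
        root.append(build_record(record_id, timestamp, order))
    return ET.tostring(root, encoding="unicode")


def parse_records(xml_text):
    """Read the records back as (id, date, order) tuples of strings."""
    root = ET.fromstring(xml_text)
    return [
        (rec.get("id"), rec.findtext("date"), rec.findtext("order"))
        for rec in root.iter("record")
    ]


# Hypothetical sample row; the "&" and "<" would break hand-built XML.
sample_rows = [(1, "2008-05-01 12:00:00", "nuts & bolts <M8>")]
xml_text = records_to_xml(sample_rows)
parsed = parse_records(xml_text)
print(xml_text)
print(parsed)
```

Two points in this sketch are worth noting. Characters such as & and < in the data are escaped on output, and a file holding several records needs one enclosing root element, because a well-formed XML document has exactly one. Either of these being missing in hand-concatenated output can make the parser reject the file.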