xml parsing with sax

Steve Holden sholden at holdenweb.com
Fri Apr 27 07:55:53 EDT 2001


"Harald Kirsch" <kirschh at lionbioscience.com> wrote in message
news:yv28zkm63gi.fsf at lionsp093.lion-ag.de...
>
> I have more (or less) XML data, however it is missing a root element
> or, put another way, it is like a sequence of individual XML documents
> similar to
>
>   <bla>...</bla>
>   <bla>...</bla>
>   <bla>...</bla>
>   <bla>...</bla>
>
> Trying to parse this with python's inbuilt sax results in an error
> message about 'junk after document element', i.e. after the first
> </bla>. In a way this is correct because a surrounding root element is
> missing. However, is there a way to instruct sax to keep going. Or is
> it possible to push a pseudo root element in front of a stream parsed
> with
>
>   xml.sax.parse(sys.stdin, ...)
>
Unless your XML is huge (always a possibility...) you could use something
like:

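import sys, xml.sax
myXML = sys.stdin.read()
# parseString() requires a handler; substitute your own ContentHandler subclass.
xml.sax.parseString("<root>\n" + myXML + "</root>\n", xml.sax.ContentHandler())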

The added <root> ... </root> wrapper supplies the missing root element, so
the parser sees a single well-formed document. Obviously, you could use more
complex content around the <bla> ... </bla> if you needed it.
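
If the input really is too large to read into memory, one possible approach
is to feed the parser incrementally and push the pseudo root element in
front of the stream, as the question suggests. The following is only a
sketch, assuming the parser returned by xml.sax.make_parser() supports the
incremental feed()/close() interface (the default expat-based reader does).
The names parse_fragments and TagCounter, the "root" element name, and the
chunk size are made up for illustration.

```python
import io
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start tags by element name."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def parse_fragments(stream, handler, root="root", chunk_size=64 * 1024):
    """Parse rootless XML from a text stream by wrapping it in a synthetic root."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.feed("<%s>" % root)           # push the pseudo root in front
    while True:
        chunk = stream.read(chunk_size)  # read the input piece by piece
        if not chunk:
            break
        parser.feed(chunk)
    parser.feed("</%s>" % root)          # close the pseudo root
    parser.close()                       # signal end of document
    return handler


# Stand-in for sys.stdin so the example runs on its own.
sample = io.StringIO("<bla>a</bla>\n<bla>b</bla>\n")
handler = parse_fragments(sample, TagCounter())
print(handler.counts)
```

In real use you would pass sys.stdin instead of the StringIO object. Note
that the handler also receives the start and end events for the synthetic
root element, so it may need to ignore them.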

regards
 steVE




