sgmllib & parsing problem
Harvest T. Moon
h4rv3st at web.de
Thu Aug 30 09:05:21 EDT 2001
i'm writing a client for the JammerIM system in Python (for BeOS, if anyone
cares) and the system is based on XML-pieces.
i have subclassed SGMLParser from sgmllib and everything is working fine,
but _one_ thing screws up the whole parser:
mostly the pieces come as an ordinary tag structure like
<message from="..">
<body>Test</body>
<blabla>asd</blabla>
<nothing>important</nothing>
<really>stupid</really>
</message>
that works fine and i get myself a nice structured MsgObject with
.SubElements() etc, all right.
but sometimes a tag carries no content, only attributes, so it comes as
<strange id="0815" thread="123" />
it is clear to me that no closing tag will follow, but SGMLParser
doesn't see the trailing '/' and assumes it's an ordinary start tag, so it
never gets closed and the whole object is down the drain, because "strange"
stays on the stack forever.
how can i make SGMLParser see that '/', or handle a whole tag ending in '/'
as a standalone tag?
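
one workaround i could imagine, only as a rough sketch: rewrite the
standalone tags into an explicit open/close pair before the data is fed to
the parser, so the parser sees a normal end tag and pops "strange" off the
stack. the helper name expand_empty_tags and the regex are my own
assumptions, not something sgmllib provides, and the pattern assumes that
attribute values never contain '<', '>' or the sequence '/>'.

```python
import re

# Matches a standalone tag such as <strange id="0815" thread="123" />.
# Group 1 is the tag name, group 2 the attribute text (possibly empty).
# Assumption: attribute values contain no '<', '>' or '/>'.
EMPTY_TAG = re.compile(r'<([A-Za-z_][\w.\-]*)((?:\s+[^<>]*?)?)\s*/>')


def expand_empty_tags(data):
    """Rewrite <tag attrs /> as <tag attrs></tag> (hypothetical helper)."""
    return EMPTY_TAG.sub(r'<\1\2></\1>', data)


if __name__ == "__main__":
    piece = '<message from=".."><strange id="0815" thread="123" /></message>'
    # The result would then be passed to the parser's feed() method.
    print(expand_empty_tags(piece))
```

this only works if a complete tag is inside the string being rewritten; if
the pieces arrive in chunks that can split a tag in the middle, the data
would have to be buffered up to the next '>' first.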
regards,
Harvest T. Moon