[Expat-discuss] parsing error
Nikolai Koudelia
nikoudel at gmail.com
Mon Oct 29 11:43:48 CET 2007
Hi!
I am trying to parse an "xml" document with Python and Expat. I need to
scan through the XML and collect the values that match a pattern. Example:
pattern:
<tr option1="GROUP1"><td>GROUP2</td></tr>
With the pattern above I need to fetch "asdf" and "qwerty" from the material below:
<table>
<tr option1="asdf"><td>qwerty</td></tr>
</table>
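For the well-formed case, the extraction could be sketched roughly as follows. This is only an illustration, assuming Python 3 and the standard xml.parsers.expat module; the names collect_rows, state and sample are hypothetical and chosen just for this example.

```python
import xml.parsers.expat


def collect_rows(xml_text):
    """Return (option1, td_text) pairs for each <tr option1="..."><td>...</td></tr>."""
    rows = []
    # Hypothetical bookkeeping: the current row's attribute and the <td> text pieces.
    state = {"option": None, "in_td": False, "text": []}

    def start(name, attrs):
        if name == "tr":
            state["option"] = attrs.get("option1")
        elif name == "td" and state["option"] is not None:
            state["in_td"] = True
            state["text"] = []

    def chars(data):
        # Expat may deliver text in several chunks, so collect and join later.
        if state["in_td"]:
            state["text"].append(data)

    def end(name):
        if name == "td" and state["in_td"]:
            state["in_td"] = False
            rows.append((state["option"], "".join(state["text"])))
        elif name == "tr":
            state["option"] = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return rows


sample = '<table>\n<tr option1="asdf"><td>qwerty</td></tr>\n</table>'
print(collect_rows(sample))  # [('asdf', 'qwerty')]
```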
The problem is that the material may not be well-formed. It may look like this:
<table>
<tr option1="asdf"><td>qwerty</td></tr>
</brokentag>
<tr option1="rtyu"><td>fgh 16</td></tr>
</table>
When the Expat parser reaches </brokentag>, it raises an exception and
stops parsing. Is there a way to handle a situation like that? Is there
some option telling Expat to skip broken closing tags? Or should I repair
the material before parsing? The latter could be quite tricky, because
Expat itself could not be used for that... Any ideas?
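One possible shape for the "repair before parsing" idea is sketched below. It is an assumption-laden illustration, not a known Expat feature: a regex-based pass (the hypothetical drop_stray_closing_tags) removes any closing tag that does not match the innermost open element, and the result is then handed to Expat. It only targets stray closing tags like the one above and ignores comments, CDATA sections and attribute values containing ">".

```python
import re
import xml.parsers.expat

# Rough tag matcher (an assumption): optional "/", a name, anything up to ">".
TAG_RE = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^<>]*?(/?)>")


def drop_stray_closing_tags(text):
    """Remove closing tags that do not match the innermost open element."""
    out, stack, pos = [], [], 0
    for m in TAG_RE.finditer(text):
        out.append(text[pos:m.start()])
        pos = m.end()
        closing, name, self_closing = m.group(1), m.group(2), m.group(3)
        if closing:
            if stack and stack[-1] == name:
                stack.pop()
                out.append(m.group(0))
            # otherwise the closing tag is stray and is dropped
        else:
            if not self_closing:
                stack.append(name)
            out.append(m.group(0))
    out.append(text[pos:])
    return "".join(out)


def collect_rows(xml_text):
    """Collect (option1, td_text) pairs with Expat; raises ExpatError on bad input."""
    rows, cur = [], {"option": None, "text": None}

    def start(name, attrs):
        if name == "tr":
            cur["option"] = attrs.get("option1")
        elif name == "td":
            cur["text"] = []

    def chars(data):
        if cur["text"] is not None:
            cur["text"].append(data)

    def end(name):
        if name == "td" and cur["text"] is not None:
            rows.append((cur["option"], "".join(cur["text"])))
            cur["text"] = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return rows


broken = (
    "<table>\n"
    '<tr option1="asdf"><td>qwerty</td></tr>\n'
    "</brokentag>\n"
    '<tr option1="rtyu"><td>fgh 16</td></tr>\n'
    "</table>"
)
repaired = drop_stray_closing_tags(broken)
print(collect_rows(repaired))  # [('asdf', 'qwerty'), ('rtyu', 'fgh 16')]
```

Feeding the unrepaired text to collect_rows raises xml.parsers.expat.ExpatError at the stray tag, which is the behaviour described above.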
-NK