XML validation / exception.

dieter dieter at handshake.de
Fri Jan 25 09:20:05 CET 2013


Andrew Robinson <andrew3 at r3dsolutions.com> writes:
> On xml.etree,
> When I scan in a handwritten XML file, and there are mismatched tags -- 
> it will throw an exception.
> and the exception will contain a line number of the closing tag which
> does not have a mate of the same kind.
>
> Is there a way to get the line number of the earlier tag which caused
> the XML parser to know the closing tag was mismatched, so I can narrow
> down the location of the mismatches for a manual repair?

This is parser dependent -- and likely not supported by the
standard parsers.

In order to check for the correspondence between opening and
closing tags, the parser must maintain a stack of open tags.
Your request can be fulfilled when the parser keeps the associated
line numbers in this stack. I expect that most parsers will not do that.
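
As a rough illustration of the stack idea, here is a minimal sketch that
keeps such a stack outside the parser, using the standard
"xml.parsers.expat" module. The function name "find_mismatch" and the
shape of its return value are my own invention for this example. It
assumes that the innermost tag still open when the error is raised is the
one you want to look at, which holds for simple mismatches but not
necessarily for every kind of error.

```python
import xml.parsers.expat


def find_mismatch(xml_text):
    """Return None for well-formed input, otherwise a tuple
    (error message, error line, innermost open tag, line of that tag)."""
    parser = xml.parsers.expat.ParserCreate()
    open_tags = []  # stack of (tag name, line number of the opening tag)

    def start(name, attrs):
        # CurrentLineNumber refers to the event being reported,
        # i.e. the opening tag that triggered this callback.
        open_tags.append((name, parser.CurrentLineNumber))

    def end(name):
        # Only called for closing tags that expat accepted as matching.
        open_tags.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError as err:
        tag, line = open_tags[-1] if open_tags else (None, None)
        return (str(err), err.lineno, tag, line)
    return None
```

For a document such as "<root>\n<a>\n<b>\n</a>\n</root>\n" the error is
reported on line 4 (the closing "a" tag), while the top of the stack
points to the unclosed "b" tag opened on line 3.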

Python's "xml" framework is highly modularized, with each component
having only a minimal task. In particular, the parser is responsible
for parsing only: it parses and generates events for what it sees
(opening tag, closing tag, text, comment, entity, error, ...).
The events are consumed by a separate component (if I remember
correctly, a so-called "handler"). Such a component is responsible
for creating the "ETree" during parsing.
You might be able to provide an alternative for this component
which captures (in addition) line information for opening tags.
Alternatively, you could provide an alternative parser.
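
A sketch of what such an alternative component could look like, under
some assumptions: it drives "xml.parsers.expat" directly and forwards the
events to "xml.etree.ElementTree.TreeBuilder", recording the line of each
opening tag on the way. The function name "parse_with_lines" and the side
dictionary are hypothetical choices of mine, and namespaces, comments and
processing instructions are ignored here, so this is not a drop-in
replacement for the standard ETree parser.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


def parse_with_lines(xml_text):
    """Return (root element, dict mapping each element to the line
    number of its opening tag)."""
    parser = xml.parsers.expat.ParserCreate()
    builder = ET.TreeBuilder()
    start_lines = {}  # element -> line number of its opening tag

    def start(tag, attrs):
        # Read the position first, then let the builder create the element.
        line = parser.CurrentLineNumber
        element = builder.start(tag, attrs)
        start_lines[element] = line

    parser.StartElementHandler = start
    parser.EndElementHandler = builder.end
    parser.CharacterDataHandler = builder.data
    parser.Parse(xml_text, True)
    return builder.close(), start_lines
```

The line numbers are kept in a separate dictionary rather than on the
elements themselves, because the element objects do not necessarily
accept extra attributes.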



