Newbie XML SAX Parsing: How do I ignore an invalid token?
"Martin v. Löwis"
martin at v.loewis.de
Sat Jan 6 09:30:12 CET 2007
scott at crybabymaternity.com schrieb:
> My original posting has a funky line break character (it appears as an
> ascii square) that blows up my program, but it may or may not show up
> when you view my message.
Looking at your document, it seems that this "funky line break
character" is character \x1E, which, in latin-1, means "record
separator". It's indeed ill-formed to use it in XML.
> Is there a way to account for the invalid token in the error handler?
Not with a standard XML parser, no. The error you describe is a "fatal
error", and that's not something parsing can recover from. I recommend
that you filter this character out before passing it to the XML parser.
You can use the IncrementalParser interface to do so.
More information about the Python-list