[Expat-discuss] & symbol workaround

Brad Causey bradcausey at gmail.com
Tue Feb 24 01:33:21 CET 2009


cu,

> This will make trouble if you get some escaped symbol (e.g. &amp;).
> So, you'll have to find the &'s, check what comes after and then
> decide whether to fixup or let it pass.
>

Agreed. I further tuned the code later, but mainly wanted to give the list
an idea of my work-around. Also, I could be particularly lax in my
find/replace because all fields in my case that could contain ampersands
were getting thrown out of the report anyway. Lucky me! ;-p
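
For anyone who can't be as lax as I was, the check described in the quoted text (find each &, look at what follows, then fix it or let it pass) could be sketched roughly like this in Python. This is only an illustration, not the code I actually ran: the name escape_bare_ampersands is made up, and it assumes anything shaped like &name;, &#123; or &#x1F; is a real reference that should be left alone.

```python
import re

# Matches an "&" that does NOT start an entity or character reference.
# Assumption: anything shaped like &name; , &#123; or &#x1F; is left alone,
# whether or not that entity name is actually declared anywhere.
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace stray '&' with '&amp;' and leave existing references alone."""
    return _BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    print(escape_bare_ampersands("<name>AT&T &amp; friends</name>"))
```

Note that a plain regex pass like this also rewrites ampersands inside CDATA sections and comments, where they are legal, so it only suits input that doesn't rely on those.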


>
> BTW: is there any way for hooking into the parser (some callback)
> to catch those errors and then continue parsing ?
> That would allow building an auto-fixing parser, especially for
> cases like Brad's.
>

Although Python allows you to 'modify' the instance of the object, and any
part of it, I think it's just easier to make a one-time 'workaround' UDF and
move on. I guess it depends on how heavily you depend on performance and
other variables.
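
To make the 'workaround UDF' idea concrete, here is a minimal sketch using Python's xml.parsers.expat module. As far as I know, Expat treats a well-formedness error as fatal and the same parser can't pick up where it stopped, so the sketch simply retries with a fresh parser on fixed-up input. The function names and the bare-ampersand regex are my own assumptions for illustration, and it only collects element names to keep things short.

```python
import re
import xml.parsers.expat

# Hypothetical fixup: an "&" that is not followed by something shaped like
# an entity or character reference gets escaped as "&amp;".
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def parse_names(data):
    """Parse one document with a fresh parser and return its element names."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)  # True marks this as the final chunk
    return names


def parse_leniently(text):
    """Try a strict parse first; on an error, fix stray '&' and retry once."""
    try:
        return parse_names(text)
    except xml.parsers.expat.ExpatError:
        # Results are collected per attempt, so nothing from the failed
        # first pass leaks into the retry.
        return parse_names(_BARE_AMP.sub("&amp;", text))


if __name__ == "__main__":
    print(parse_leniently("<report><vendor>AT&T</vendor><status/></report>"))
```

The cost is a second full pass over the input whenever the first one fails, which is where the performance question comes in. If the fixed-up text is still broken, the second ExpatError propagates to the caller.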

Building from your idea....

I think the community could benefit greatly from a parser that is less
strict than the ones out there today. Although XML does have strict rules,
many companies/programs/tools adopt unusual implementations of it. I know, I
know, everyone is going to say 'well, they shouldn't do that' and 'then it's
not really XML', but they do, and it is closer to XML than any other text
format. Thoughts?

-Brad

