Re: [lxml-dev] [Question #65510]: How to set libxml:XML_PARSE_HUGE-option in lxml?
bol wrote:
Ok, that gives you a) the bit of structure that you need and b) safe and portable encoding support (which I assume is critical here), so that's fine with me. After all, XML is used for all sorts of things these days...
The option XML_PARSE_HUGE should be off by default, as it is in libxml.
That's what I was wondering about. It's (sort of) on by default if you use libxml2 2.6.x and 2.7.[012], but it's supposed to be off by default if you use libxml2 2.7.3 and later. That's outside of the control of lxml. So you would get one behaviour on one system and a different behaviour on another system, even with the same version of lxml.

However, the restrictions are meant as a security measure to prevent traps like the billion laughs attack. Therefore, I do understand that a) most people won't notice them and b) having them enabled by default seems like the right setting.

Is there any opposition to keeping the enforced parser restrictions (limited tree depth and text node length) enabled by default in newer libxml2 versions, and to providing a parser switch for disabling them? The alternative would be to disable them by default on all libxml2 versions, and to provide a switch that enables them if libxml2 supports it. But a safe default sounds a lot better.

Stefan
Dirk Rothe wrote:
There's a "huge_tree" option for now, defaulting to False. Let's see if it works out that way.

https://codespeak.net/viewvc/?view=rev&revision=63399

Stefan
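For readers who want to try the switch, here is a minimal sketch of how the "huge_tree" parser option might be used. The helper names and the nesting depth of 500 are made up for illustration. Whether the default parser rejects the document depends on the libxml2 version installed, as discussed above, so the sketch only reports what happens instead of assuming a result.

```python
from lxml import etree


def build_nested_xml(depth):
    """Return a well-formed document with `depth` nested <n> elements.

    Hypothetical helper, only used to produce an unusually deep tree.
    """
    return b"<n>" * depth + b"</n>" * depth


def try_parse(data, huge_tree=False):
    """Parse `data`; return the root element, or None if the parser rejects it."""
    # huge_tree=False keeps libxml2's restrictions in place where the
    # underlying libxml2 version enforces them; huge_tree=True lifts them.
    parser = etree.XMLParser(huge_tree=huge_tree)
    try:
        return etree.fromstring(data, parser)
    except etree.XMLSyntaxError:
        return None


# Depth 500 is an arbitrary illustrative value.
deep = build_nested_xml(500)

restricted = try_parse(deep)                    # default: huge_tree=False
unrestricted = try_parse(deep, huge_tree=True)  # restrictions disabled

print("default parser accepted the document:", restricted is not None)
print("huge_tree parser accepted the document:", unrestricted is not None)
```

Keeping the default and passing huge_tree=True only for input from a trusted source follows the "safe default" reasoning earlier in the thread.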
participants (2)
- Dirk Rothe
- Stefan Behnel