
What are the options in lxml to prevent the parser from processing DTDs, i.e. to reject any XML that contains a DTD (for security reasons)?

Best regards
Rainer

Rainer Hoerbe wrote on 07.06.2016 at 18:54:
What are the options in lxml to prevent the parser from processing DTDs, i.e. to reject any XML that contains a DTD (for security reasons)?

See https://pypi.python.org/pypi/defusedxml/

Stefan

Rainer Hoerbe wrote on 08.06.2016 at 07:52:
You don't have to use defusedxml, I posted the link because it has all the details in it.

lxml doesn't access any network resources by default, including DTDs. For internal subsets, libxml2 applies reasonable bounds on the content that a DTD is allowed to generate, which counters most attacks.

I'm not aware of a way to disable DTD processing completely. But you can disable entity resolution, use incremental parsing, and then check for the existence of a DTD right after the start event of the root element. That's not entirely the same as not allowing any DTD processing at all, but it's just as good when it comes to content generation.

For details, see the link above.

Stefan
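
The approach described above (entity resolution off, incremental parsing, DTD check at the root's start event) could look roughly like the following. This is only a sketch, not tested against every lxml version: the names `DTDForbidden` and `parse_rejecting_dtd` are made up for illustration, and it assumes that the document's `docinfo` is already populated when the first start event arrives.

```python
from io import BytesIO

from lxml import etree


class DTDForbidden(ValueError):
    """Hypothetical exception, raised when a document declares a DTD."""


def parse_rejecting_dtd(xml_bytes):
    """Parse incrementally and reject any document that has a DOCTYPE.

    Hypothetical helper: returns the root element on success.
    """
    events = etree.iterparse(
        BytesIO(xml_bytes),
        events=("start",),
        resolve_entities=False,  # do not expand entity references
        load_dtd=False,          # do not load an external DTD
        no_network=True,         # already lxml's default, stated explicitly
    )
    root = None
    for _event, element in events:
        if root is None:
            # First start event: this is the root element. The DOCTYPE, if
            # there is one, precedes it and has been seen by the parser.
            root = element
            docinfo = element.getroottree().docinfo
            if docinfo.doctype or docinfo.internalDTD is not None:
                raise DTDForbidden("document contains a DTD")
    return root
```

As noted above, the parser still reads the DTD before the check runs, so this rejects the document rather than preventing DTD processing altogether.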

participants (2): Rainer Hoerbe, Stefan Behnel