[Python-Dev] XML DoS vulnerabilities and exploits in Python

Thu Feb 21 18:23:27 CET 2013

Jesse Noller writes:

 > I guess someone need to write a proof of concept exploit for you
 > and release it into the wild.

This is a bit ridiculous.  This stuff looks easy enough that surely
Christian's post informed any malicious body who didn't already know
how to do it.  If the exploit matters, it's already in the wild.
("Hey, didja know that an XML processor that expands entities does so
recursively?"  "Uh-oh ....")
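
For anyone who wants the arithmetic spelled out, here is a toy sketch of
recursive expansion.  It is plain Python string substitution, not how
expat or any real XML processor works, and the names expand and
make_nested are made up for illustration.  Each level refers to the
previous one `fanout` times, so the output grows as fanout ** levels
while the declarations stay tiny.

```python
import re

# Matches simple general entity references such as &e1;
ENTITY_REF = re.compile(r"&(\w+);")


def expand(text, entities):
    """Recursively replace every &name; in `text` using `entities`."""
    return ENTITY_REF.sub(
        lambda m: expand(entities[m.group(1)], entities), text
    )


def make_nested(levels, fanout, payload="lol"):
    """Build e0..e<levels>; each level references the one below `fanout` times."""
    entities = {"e0": payload}
    for i in range(1, levels + 1):
        entities["e{}".format(i)] = "&e{};".format(i - 1) * fanout
    return entities


entities = make_nested(levels=3, fanout=10)
expanded = expand("&e3;", entities)
declared = sum(len(value) for value in entities.values())

# 123 characters of declarations expand to 3000 characters of output.
print(declared, len(expanded))
```

Bump levels to 9 and the same handful of declarations asks for a billion
copies of the payload, which is the whole trick.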

Yeah, there's a problem here.  But ... as far as I can see, all the
exploits suggested (including those Christian provided in
python-external.py) require either blindly processing text from
requests received off the Internet as XML, or an attacker capable of
doing something equivalent to replacing a Python library.

I certainly think defusedxml is a valuable contribution, and not just
for security nuts.  But to quote from Christian's own README (warning:
taken out of context to make *my* point):

    7. These are features but they may introduce exploitable holes, see
       `Other things to consider`_

I'd like to see a little (well, to be honest, a *lot*) more analysis
of the kind Fred Drake implicitly suggests:

    Doing so *will* be backward incompatible, and I'm not sure there's
    a good way to gauge the extent of the breakage.

before making these restrictions the default.  E.g., 40 entity
indirections in a single expansion (defusedxml's default maximum) may
seem like a lot, but I've seen some pretty complex expressions built
as entities that recurse three or four levels.  Of course, that was a
while ago, and today most of those entities would be replaced by actual
characters.  Nevertheless, I bet those legacy expressions break the
40-indirection limit, or, rather, the limit would break them.
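
To put a rough number on that worry, here is a sketch that counts
indirections for a made-up set of entities.  It assumes the limit counts
every entity reference resolved while expanding one top-level reference;
that is my reading, and I have not checked how defusedxml actually
counts.  The entity names and the count_indirections helper are
hypothetical.

```python
import re

# Matches simple general entity references such as &term;
ENTITY_REF = re.compile(r"&(\w+);")


def count_indirections(name, entities):
    """Count every entity reference resolved while fully expanding `name`."""
    total = 0
    for ref in ENTITY_REF.findall(entities[name]):
        # One indirection for this reference, plus whatever it pulls in.
        total += 1 + count_indirections(ref, entities)
    return total


# Hypothetical legacy-style entities: three levels, four references each.
entities = {
    "atom": "x",
    "term": "&atom;&atom;&atom;&atom;",
    "expr": "&term;&term;&term;&term;",
    "formula": "&expr;&expr;&expr;&expr;",
}

LIMIT = 40  # assumed to be a per-expansion cap on resolved references
used = count_indirections("formula", entities)
print(used, used > LIMIT)
```

Under that assumption, only three levels of nesting with four references
per level already come to 84 indirections, more than twice the limit.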