[Python-Dev] Need help to fix urllib(.parse) vulnerabilities

Victor Stinner victor.stinner at gmail.com
Fri Jul 21 06:02:39 EDT 2017


Recently, two security vulnerabilities were reported in the urllib module:

=> already fixed in Python 3.6.2

=> not fixed yet

I also proposed a more general protection: "Reject newline character
(U+000A) in URLs in urllib.parse":

The problem with the urllib module is how we handle invalid URLs. Right
now, we return the URL unmodified if we cannot parse it. Should we
raise an exception if a URL contains a newline, for example?
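
To make the "raise an exception" option concrete, here is a rough
sketch. The wrapper name strict_urlsplit is hypothetical: it is not
part of urllib.parse and it is not the patch proposed in the issue. It
only shows what rejecting U+000A before parsing could look like.

```python
from urllib.parse import SplitResult, urlsplit


def strict_urlsplit(url: str) -> SplitResult:
    """Hypothetical helper: refuse to parse a URL that contains U+000A."""
    if "\n" in url:
        # Fail loudly instead of handing back a URL we could not validate.
        raise ValueError("URL contains a newline character (U+000A): %r" % url)
    return urlsplit(url)


# A well-formed URL is parsed as usual.
print(strict_urlsplit("http://example.com/path").hostname)

# A URL carrying an embedded newline is rejected up front.
try:
    strict_urlsplit("http://example.com/\nX-Injected: 1")
except ValueError as exc:
    print("rejected:", exc)
```

Any caller that currently relies on getting such a URL back unmodified
would start seeing ValueError, which is exactly the backward
compatibility concern.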

It's very hard to harden the urllib module without breaking backward
compatibility. That's why it took 3 weeks to fix "urllib connects to a
wrong host": we had to find a way to fix the vulnerability without
breaking backward compatibility.

Another proposed approach is to reject invalid data earlier or later,
but not in urllib...

So if you understand URLs, HTTP, etc.: please join these issues to
help us fix them!

