[Python-Dev] Need help to fix urllib(.parse) vulnerabilities
storchaka at gmail.com
Sat Jul 22 02:01:57 EDT 2017
21.07.17 13:02, Victor Stinner wrote:
> Recently, two security vulnerabilities were reported in the urllib module:
> => already fixed in Python 3.6.2
> => not fixed yet
> I also proposed a more general protection: "Reject newline character
> (U+000A) in URLs in urllib.parse":
> The problem with the urllib module is how we handle an invalid URL. Right
> now, we return the URL unmodified if we cannot parse it. Should we
> raise an exception if a URL contains a newline for example?
> It's very hard to harden the urllib module without breaking backward
> compatibility. That's why it took 3 weeks to fix "urllib connects to a
> wrong host": find how to fix the vulnerability without breaking
> backward compatibility.
> Another proposed approach is to reject invalid data earlier or later,
> but not in urllib...
Checking a URL in urllib.parse is too early and not enough. The urllib
module is general, and different protocols have different limitations.
There are other ways besides urllib to pass invalid parameters to
low-level protocol implementations.
I think the only reliable way of fixing the vulnerability is rejecting
or escaping (as specified in RFC 2640) CR and LF inside sent lines.
Adding support for RFC 2640 is a new feature and can be added only in
3.7. This feature should also be optional, since not all servers support
RFC 2640. https://github.com/python/cpython/pull/1214 does the right thing.
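A minimal sketch of the "reject" option, for illustration only. The helper
names check_line() and build_command() are hypothetical and are not the code
from the pull request; the UTF-8 encoding is also an assumption.

```python
def check_line(line: str) -> str:
    """Return the line unchanged, or raise if it contains CR or LF.

    Hypothetical helper: a CR or LF inside a line would let the caller
    smuggle a second command into the control connection.
    """
    if "\r" in line or "\n" in line:
        raise ValueError("CR or LF is not allowed inside a protocol line")
    return line


def build_command(verb: str, argument: str) -> bytes:
    """Build one CRLF-terminated command, validating it first."""
    # The encoding is an assumption made for this sketch.
    return (check_line(f"{verb} {argument}") + "\r\n").encode("utf-8")
```

With this check at the point where a line is sent, build_command("CWD", "/pub")
yields a single command, while an argument such as "/pub\r\nDELE x" raises
ValueError no matter how the value reached the protocol layer.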
The other way of hardening the Python stdlib implementation of the FTP
server is to make it accept only CRLF as a line delimiter, not a sole CR.
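As a rough illustration of strict CRLF framing, here is a sketch of a line
splitter that treats only CRLF as a delimiter. The function name
split_crlf_lines() is hypothetical, and rejecting a bare LF as well as a bare
CR is an assumption of this sketch.

```python
def split_crlf_lines(buffer: bytes):
    """Split received bytes into complete lines delimited by CRLF only.

    Returns (lines, rest), where rest is the unterminated tail to keep
    until more data arrives. A bare CR or LF inside a complete line is
    treated as an error instead of as a line break.
    """
    *lines, rest = buffer.split(b"\r\n")
    for line in lines:
        if b"\r" in line or b"\n" in line:
            raise ValueError("bare CR or LF inside a line")
    return lines, rest
```

For example, split_crlf_lines(b"USER a\r\nPASS b\r\nQU") returns the two
complete lines and keeps b"QU" as the pending tail, while input such as
b"USER a\nPASS b\r\n" raises ValueError.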
Additional sanity checks can be added in FTP.login() to detect bad input
earlier and raise more specific errors.
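Such an early check could look roughly like the following sketch. The helper
check_login_fields() is hypothetical and not part of ftplib; its parameters
only mirror the user, passwd and acct arguments of FTP.login().

```python
def check_login_fields(user: str = "", passwd: str = "", acct: str = "") -> None:
    """Raise a specific error naming the login field that contains CR or LF."""
    for name, value in (("user", user), ("passwd", passwd), ("acct", acct)):
        if "\r" in value or "\n" in value:
            # Name the field but not its value, so a password is never
            # echoed into an error message or a log.
            raise ValueError(f"{name} must not contain CR or LF")
```

A caller would run check_login_fields(user, passwd, acct) before
ftp.login(user, passwd, acct). This complements the check on sent lines and
does not replace it.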
Every protocol (FTP, HTTP, telnet, SMTP, POP3, IMAP, etc) should be
fixed separately. If a protocol allows escaping special characters, its
implementation should escape them. Otherwise such characters should be
rejected.