[PATCH] Add an authorization header to the initial request.

Many websites (e.g. the GitHub API) intentionally do not follow the RFC with regard to Basic authentication: they require the Authorization header in the initial request and never return a 401 error. It is therefore not possible to authenticate with such websites using only the HTTPBasicAuthHandler from urllib2.py as described in the documentation. However, the end of section 2 of RFC 2617 allows pre-authorization in the initial request.
(The RFC also suggests pre-authorization of proxy requests; that is not part of this patch, but it could be trivially added.)

Also, generation of the HTTP Basic auth header has been refactored into a separate method of AbstractBasicAuthHandler.

Suggested fix for bug #19494.

This is my first attempt to contribute to Python itself, so please be gentle with me. Yes, I know that unit tests and ports to other branches of Python are missing (this is against 2.7), but I would first like some feedback to see what else I am missing.

Matěj Cepl
---
 Lib/urllib2.py | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/Lib/urllib2.py b/Lib/urllib2.py
index aadeb73..a5feb03 100644
--- a/Lib/urllib2.py
+++ b/Lib/urllib2.py
@@ -848,6 +848,18 @@ class AbstractBasicAuthHandler:
     def reset_retry_count(self):
         self.retried = 0
 
+    def generate_auth_header(self, host, req, realm):
+        user, pw = self.passwd.find_user_password(realm, host)
+        if pw is not None:
+            raw = "%s:%s" % (user, pw)
+            auth = 'Basic %s' % base64.b64encode(raw).strip()
+            if req.headers.get(self.auth_header, None) == auth:
+                return None
+            req.add_unredirected_header(self.auth_header, auth)
+            return req
+        else:
+            return None
+
     def http_error_auth_reqed(self, authreq, host, req, headers):
         # host may be an authority (without userinfo) or a URL with an
         # authority
@@ -875,14 +887,10 @@ class AbstractBasicAuthHandler:
                     return response
 
     def retry_http_basic_auth(self, host, req, realm):
-        user, pw = self.passwd.find_user_password(realm, host)
-        if pw is not None:
-            raw = "%s:%s" % (user, pw)
-            auth = 'Basic %s' % base64.b64encode(raw).strip()
-            if req.headers.get(self.auth_header, None) == auth:
-                return None
-            req.add_unredirected_header(self.auth_header, auth)
-            return self.parent.open(req, timeout=req.timeout)
+        req = self.generate_auth_header(host, req, realm)
+
+        if req is not None:
+            return self.parent.open(req, timeout=req.timeout)
         else:
             return None
 
@@ -898,6 +906,17 @@ class HTTPBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler):
         self.reset_retry_count()
         return response
 
+    def http_request(self, req):
+        host = req.get_host()
+
+        new_req = self.generate_auth_header(host, req, None)
+        if new_req is not None:
+            req = new_req
+
+        return req
+
+    https_request = http_request
+
 
 class ProxyBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler):
 
-- 
1.8.5.2.192.g7794a68
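
For readers who want to see the pre-authorization idea in isolation, here is a minimal sketch. It assumes Python 3 and urllib.request rather than the 2.7 urllib2 module the patch targets, and it is not the patched code itself. The names basic_auth_header and PreemptiveBasicAuthHandler, the credentials, and the URL are all hypothetical. The handler only uses the http_request/https_request pre-processing hooks, so the header is attached before the first request is sent, without waiting for a 401.

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Return the value of a Basic Authorization header for user/password."""
    raw = "{}:{}".format(user, password).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


class PreemptiveBasicAuthHandler(urllib.request.BaseHandler):
    """Hypothetical handler that adds credentials to the initial request."""

    def __init__(self, user, password):
        self.auth = basic_auth_header(user, password)

    def http_request(self, req):
        # Pre-processing hook: runs before the request goes out.
        # Do not overwrite a header the caller has already set.
        if not req.has_header("Authorization"):
            req.add_unredirected_header("Authorization", self.auth)
        return req

    https_request = http_request


# Exercise the hook directly; nothing is sent over the network here.
handler = PreemptiveBasicAuthHandler("alice", "secret")
req = handler.http_request(
    urllib.request.Request("https://api.example.invalid/user"))
header_value = req.get_header("Authorization")
```

To use such a handler for real requests, it would be passed to urllib.request.build_opener(handler) and the resulting opener used in place of urlopen. Unlike the patch, this sketch keeps a single fixed credential instead of looking one up in a password manager by host.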

On 2014-02-11, 12:27 GMT, you wrote:
It is there (http://bugs.python.org/file34031/0001-Add-an-authorization-header-to-the-ini...), but given that I am the only one on the nosy list of the bug, I thought it necessary to make some noise here.

Matěj

On 02/11/2014 08:04 AM, Matěj Cepl wrote:
For people looking for it, that's issue 19494: http://bugs.python.org/issue19494

participants (3): Eric V. Smith, Matěj Cepl, Terry Reedy