[Python-Dev] urllib2 doesn't grok URLs w/ user/passwd

Tue Dec 30 11:11:44 EST 2003

On Tuesday 30 December 2003 04:03 pm, Skip Montanaro wrote:
> SF seems to be down for some unscheduled reason.  Posting here just so I
> don't completely forget about it should I exit my web browser before SF is
> back up...
>
> urllib2.urlopen("http://foo@www.python.org/") fails (at least in part)
> because it fails to separate the username and password from the hostname.
> Trying to open http://foo:bar@www.python.org/ reveals other shortcomings in
> its url parsing.  It seems to me the syntactic bits shouldn't be difficult
> to resolve using urllib.splituser().  I'm much less clear what to do with
> the username and password once they've been separated from the hostname.
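
For the syntactic part, here is a minimal sketch of the separation. It
assumes a modern Python 3 standard library (urllib.parse.urlsplit) rather
than the urllib2 code under discussion, and split_credentials is a
hypothetical helper name, not anything in the library:

```python
from urllib.parse import urlsplit

def split_credentials(url):
    """Return (username, password, hostname) from url; absent parts are None.

    Hypothetical helper for illustration; relies on urlsplit's own
    handling of the user:pass@host form.
    """
    parts = urlsplit(url)
    return parts.username, parts.password, parts.hostname

if __name__ == "__main__":
    print(split_credentials("http://foo@www.python.org/"))
    print(split_credentials("http://foo:bar@www.python.org/"))
```

With only a username given, the password comes back as None, so callers
can tell "no password" apart from an empty one.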

Presumably they need to be kept somewhere and sent in the Authorization
header if the server returns a 401 error and challenge (or in the
Proxy-Authorization header if a proxy returns a 407 error and challenge).
Alternatively, the Authorization header (with the base64 encoding of
user:pass) could be sent as part of the first request to speed things up,
assuming an authorization scheme of Basic; see RFC 2617, I believe.  urllib2's
architecture delegates authorization to separate handler components, of
course, so I guess the userid and password should just be handed over to
those components when present, but I haven't looked into that in detail.
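
A rough sketch of both options, assuming the Basic scheme and written
against the Python 3 successor of urllib2 (urllib.request) rather than
urllib2 itself.  basic_auth_header and opener_with_credentials are
hypothetical helper names; the password manager and handler classes are
the library's own:

```python
import base64
import urllib.request

def basic_auth_header(user, password):
    """Build the value of a Basic Authorization header (hypothetical helper)."""
    raw = "{}:{}".format(user, password).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def opener_with_credentials(url, user, password):
    """Hand the credentials to the library's auth handler (hypothetical helper).

    The handler answers a 401 challenge with these credentials; nothing
    is sent over the network until the opener is actually used.
    """
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, url, user, password)  # None matches any realm
    handler = urllib.request.HTTPBasicAuthHandler(mgr)
    return urllib.request.build_opener(handler)

if __name__ == "__main__":
    # Option 1: send the header pre-emptively on the first request.
    req = urllib.request.Request("http://www.python.org/")
    req.add_header("Authorization", basic_auth_header("foo", "bar"))
    print(req.get_header("Authorization"))

    # Option 2: let the handler respond to a challenge if one arrives.
    opener = opener_with_credentials("http://www.python.org/", "foo", "bar")
```

The pre-emptive form saves a round trip but sends the password even to
servers that never asked for it, which is one reason to prefer waiting
for the challenge.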

Alex