[Python-bugs-list] [ python-Bugs-405939 ] HTTPConnection Host hdr wrong w/ proxy
nobody
nobody@sourceforge.net
Mon, 05 Mar 2001 12:31:39 -0800
Bugs #405939, was updated on 2001-03-04 21:44
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=405939&group_id=5470
Category: Python Library
Group: None
Status: Open
Priority: 5
Submitted By: Ernie Sasaki
Assigned to: Nobody/Anonymous
Summary: HTTPConnection Host hdr wrong w/ proxy
Initial Comment:
The HTTPConnection class' putrequest() method is
incorrect if self._http_vsn == 11 and a proxy is in
use.
Currently the following is done in httplib.py revision
1.33:
    if self.port == HTTP_PORT:
        self.putheader('Host', self.host)
    else:
        self.putheader('Host', "%s:%s" % (self.host, self.port))
However if a proxy is in use, self.host is the proxy
address, and url contains the "realhost" which should
be in the Host header. (urllib does the right thing
here but it uses the HTTP class and not
HTTPConnection. It doesn't see this problem because
then HTTP/1.0 is used and no Host header is sent
automatically.)
Instead the following is correct:
    match = httpRE.search(url)
    if match:
        self.putheader('Host', match.group(1))
    else:
        if self.port == HTTP_PORT:
            self.putheader('Host', self.host)
        else:
            self.putheader('Host', "%s:%s" % (self.host, self.port))
where:
httpRE = re.compile(r'(?i)http://([^/]+)')
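For illustration only, the proposed logic can be pulled out into a standalone function and exercised without a socket. The name host_header_value and its arguments are hypothetical and not part of httplib; the regular expression and the port test are the ones quoted above.

```python
import re

HTTP_PORT = 80  # default HTTP port, the same constant name httplib uses

# Pattern from the report: capture the host[:port] part of an absolute http URL.
httpRE = re.compile(r'(?i)http://([^/]+)')

def host_header_value(url, host, port):
    """Return the Host header value for a request (hypothetical helper).

    url  -- the Request-URI passed to putrequest()
    host -- the host the connection was opened to (the proxy, if one is used)
    port -- the port the connection was opened to
    """
    match = httpRE.search(url)
    if match:
        # Absolute URI: the real host is in the URL, not in the connection.
        return match.group(1)
    if port == HTTP_PORT:
        return host
    return "%s:%s" % (host, port)
```

With a proxy at proxy.local:8080 and url 'http://www.example.com/index.html', this returns 'www.example.com' instead of 'proxy.local:8080'. Note that group(1) is returned verbatim, so any user:password@ prefix in the URL would still end up in the header.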
----------------------------------------------------------------------
Comment By: Ernie Sasaki
Date: 2001-03-05 12:31
Message:
Logged In: YES
user_id=139439
Well, my not very good answers are (notwithstanding your
quote):
1). This is what Netscape 4.7 does.
2). This is what urllib's open_http does.
3). I'd rather you sent no Host header at all than a
wrong one. It just makes no sense to me to give the
origin server a Host header that relates to the proxy's
address. How would the virtual host mechanism (mentioned in
the section you quote) ever work thru a proxy then?? You
need the concept of a host different from what is specified
in the Request-URI.
4). I speculate (with only secondhand evidence) that a
proxy can change the absoluteURI to an absolute path when
passing it on to the origin server. In that case, the Host
header would indeed determine the host.
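A rough sketch of the rewrite speculated about in point 4, assuming a proxy that turns the absoluteURI into an absolute path before forwarding. The function forward_request_line is hypothetical, and it uses Python 3's urllib.parse rather than anything in httplib; it is not a description of what any particular proxy does.

```python
from urllib.parse import urlsplit

def forward_request_line(method, absolute_uri):
    """Return (request line, Host header) as a proxy might forward them.

    Hypothetical helper: the absolute URI is reduced to an absolute path,
    so the origin server can only learn the host from the Host header.
    """
    parts = urlsplit(absolute_uri)
    path = parts.path or '/'          # an empty path becomes "/"
    if parts.query:
        path += '?' + parts.query
    request_line = '%s %s HTTP/1.1' % (method, path)
    host_header = 'Host: %s' % parts.netloc
    return request_line, host_header
```

If the client had sent the proxy's address in Host and the proxy passed that header through unchanged, the origin server would have no way to select the right virtual host from such a request.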
As far as the patch being incomplete: In no part of httplib
does any special handling of an embedded user/password
appear. It is assumed that you'll take care of sending the
Authorization header yourself.
----------------------------------------------------------------------
Comment By: Martin v. Löwis
Date: 2001-03-05 00:39
Message:
Logged In: YES
user_id=21627
Why is that a bug? RFC 2616, section 5.2, states
# If Request-URI is an absoluteURI, the host is part of the
# Request-URI. Any Host header field value in the request
# MUST be ignored.
So in the presence of an absolute URI, the Host: field does
not matter. It is certainly nicer to fill in the right Host:
field, but I'd like to understand the problem before
applying a fix. Your patch is incomplete, IMO: it does not
deal with the user/password part in the URL.
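To make the user/password concern concrete, here is a minimal sketch of how the Host value could be taken from an absolute URL with any userinfo removed. The helper host_from_absolute_url is hypothetical, and it relies on Python 3's urllib.parse.urlsplit, which is an assumption of this sketch and not something the patch under discussion uses.

```python
from urllib.parse import urlsplit

def host_from_absolute_url(url):
    """Return host[:port] from an absolute URL, without user:password@.

    Hypothetical helper; returns '' when the URL has no network location
    (for example a plain absolute path such as '/index.html').
    """
    netloc = urlsplit(url).netloc
    # Userinfo, if present, is everything up to and including the last '@'.
    return netloc.rpartition('@')[2]
```

For 'http://user:secret@www.example.com:8080/a' this yields 'www.example.com:8080', whereas the pattern http://([^/]+) alone would capture 'user:secret@www.example.com:8080'.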
----------------------------------------------------------------------