[Patches] [ python-Patches-527518 ] urllib2.py: fix behavior with proxies
noreply@sourceforge.net
Sat, 06 Jul 2002 08:08:14 -0700
Patches item #527518, was opened at 2002-03-08 13:50
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=527518&group_id=5470
Category: Library (Lib)
Group: Python 2.1.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Chris Lawrence (lordsutch)
Assigned to: Moshe Zadka (moshez)
Summary: urllib2.py: fix behavior with proxies
Initial Comment:
The following patch against Python 2.1 fixes some problems with the
urllib2 module when used with proxies; in particular, if
$http_proxy="http://user:passwd@host:port/" is used. It also
generates the correct Host header for proxy requests (some proxies,
such as oops, get confused otherwise, despite RFC 2616 section 5.2
which says they are to ignore it in the case of a full URL on the
request line).
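To illustrate the Host header point, here is a minimal Python 3 sketch, not the code from the patch. It assumes the usual proxy convention that the request line carries the full URL while the Host header names the origin server rather than the proxy. The function name proxy_request_head is hypothetical, and urllib.parse stands in for the 2.1-era modules the patch targets.

```python
from urllib.parse import urlsplit


def proxy_request_head(url, method="GET"):
    """Build the request line and Host header for a request sent via a proxy.

    Hypothetical helper for illustration only; IPv6 literals are not handled.
    """
    parts = urlsplit(url)
    # The Host header names the origin server, never the proxy itself.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # A proxy expects the absolute URL on the request line.
    request_line = f"{method} {url} HTTP/1.0"
    return request_line, {"Host": host}
```

For example, proxy_request_head("http://www.python.org/doc/") yields a request line containing the full URL and a Host value of www.python.org, whichever proxy the connection is actually made to.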
----------------------------------------------------------------------
>Comment By: Chris Lawrence (lordsutch)
Date: 2002-07-06 10:08
Message:
Logged In: YES
user_id=6757
Moshe: The updated patch seems to be A-OK and fixes the
issue in urllib2.py. At some point I'll have to get back to
urllib.py.
Chris
----------------------------------------------------------------------
Comment By: Moshe Zadka (moshez)
Date: 2002-06-18 02:40
Message:
Logged In: YES
user_id=11645
I've looked at the patch, and it mixes cleanup with fixes. I
removed the cleanup parts, since I want an "obviously correct"
patch. Attached is a new patch I generated which fixes the two
problems:
* incorrect quoting of the user/password in the proxy code
* bad host headers when using proxies.
I am also curious about the logic in the latter fix. Can
"sel_host" ever be empty? When? Or can we just remove the
"or host" stuff?
Thanks.
----------------------------------------------------------------------
Comment By: Moshe Zadka (moshez)
Date: 2002-06-13 13:04
Message:
Logged In: YES
user_id=11645
Nope, no reason, except I need to properly test it
and check it in, and I won't have time for that until
the weekend.
----------------------------------------------------------------------
Comment By: Jeremy Hylton (jhylton)
Date: 2002-06-13 12:51
Message:
Logged In: YES
user_id=31392
This patch vs. CVS HEAD looks good to me.
Note that it would be better to get the Host header by
upgrading urllib2 to use HTTPConnection instead of HTTP, but
that's a much bigger project. Would it be a problem to
always send HTTP/1.1 requests -- even to 1.0 servers?
Any reason not to check it in, Moshe?
----------------------------------------------------------------------
Comment By: Chris Lawrence (lordsutch)
Date: 2002-06-13 09:24
Message:
Logged In: YES
user_id=6757
I'll try to make these changes sometime over the next few
days; of course, if someone else wants to do it sooner &
check it in, they're more than welcome.
----------------------------------------------------------------------
Comment By: Bastian Kleineidam (calvin)
Date: 2002-06-13 04:45
Message:
Logged In: YES
user_id=9205
I tested the urllib.py patches for 2.1 and 2.2; they work.
Some minor quibbles are left:
a) the user and/or password may be empty, so your test "if
proxypass and proxyuser" is not enough. You should test
against "is None".
b) in the urllib2 patches, you use unquote() for user and
pass, but in the urllib patches you don't. You should use
unquote in both modules.
c) in the urllib2 patch, you use encodestring() without strip().
Here is an example that catches the corner cases:
# http://@host.com (empty user and password)
# http://:@host.com (empty user and password)
# http://user@host.com (empty password)
# http://user:@host.com (empty password)
# http://:pass@host.com (empty user)
proxyuserpass, host = splituser(host)
if proxyuserpass is not None:
....# unquote
....proxyuserpass = unquote(proxyuserpass)
....# add empty password if missing
....if ":" not in proxyuserpass: proxyuserpass += ":"
....# base64
....proxyuserpass = base64.encodestring(proxyuserpass).strip()
....req.add_header("Proxy-Authorization", "Basic " + proxyuserpass)
Greetings, Bastian
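
The corner cases listed above can be exercised with a small Python 3 sketch. It is an illustration under stated assumptions, not the patch itself: proxy_auth_header is a hypothetical name, base64.b64encode replaces encodestring() (which no longer exists in current Python), and the user and password are split before unquoting so that an escaped ":" inside either part is not mistaken for the separator.

```python
import base64
from urllib.parse import unquote


def proxy_auth_header(netloc):
    """Return (header_value_or_None, host) for e.g. 'user:pass@host:3128'.

    Hypothetical helper; None means the proxy URL carried no credentials.
    """
    userinfo, sep, host = netloc.rpartition("@")
    if not sep:
        # No '@' at all: nothing to authenticate with.
        return None, host
    # A missing password is treated as an empty one.
    user, _, password = userinfo.partition(":")
    credentials = unquote(user) + ":" + unquote(password)
    # b64encode adds no trailing newline, so no strip() is needed.
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic " + token, host
```

With this sketch, "user@host.com" and "user:@host.com" produce the same header, and "@host.com" still produces a header (for an empty user and password) instead of being skipped.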
----------------------------------------------------------------------
Comment By: Chris Lawrence (lordsutch)
Date: 2002-06-12 22:17
Message:
Logged In: YES
user_id=6757
Ok, here's the patch for urllib.py; again, one patch for
each of 2.1, 2.2 and CVS HEAD. I also moved the Host header
to right after the GET/PUT request line; this should help
servers that have multiple virtual hosts handle requests
more efficiently.
----------------------------------------------------------------------
Comment By: Chris Lawrence (lordsutch)
Date: 2002-06-12 21:39
Message:
Logged In: YES
user_id=6757
Ok, I've cleaned up the patch a bit. I've got versions for
2.1, 2.2 and current CVS HEAD; they're all the same
substantively, but the 2.2 -> 2.3 jump changed things enough
that the 2.2 patch won't apply cleanly to CVS.
Note that the first big chunk fixes the proxy authentication
problem, while the second chunk fixes the incorrect Host
header problem. The changes to the import at the beginning
are necessary for either part to work.
I'll investigate urllib.py further. It looks like the
underlying problem is fixed in CVS HEAD already, but I'll
try to confirm after setting up some test code for urllib.
----------------------------------------------------------------------
Comment By: Chris Lawrence (lordsutch)
Date: 2002-06-12 19:54
Message:
Logged In: YES
user_id=6757
Moshe, Calvin:
I'll see about reworking the patch against current CVS and
using splituser etc. I can break it up into two bits if you
like, too; probably cleaner that way. (Have I mentioned how
much I hate fooling with SF.net's BTS... give me debbugs any
day :-)
Chris
----------------------------------------------------------------------
Comment By: Bastian Kleineidam (calvin)
Date: 2002-06-12 11:41
Message:
Logged In: YES
user_id=9205
Note that the proxy thing is also a bug in urllib.py.
Chris, can you supply a patch for urllib.py too?
And I don't like the attached patch because it does not use the
splituser and splitpasswd functions already in urllib. I would
suggest that you use something like
proxyuser, host = splituser(host)
if proxyuser is not None:
....proxyuser, proxypass = splitpasswd(proxyuser)
....[base64 encode and add header]
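
As a rough Python 3 illustration of this two-step split, the stand-ins below mimic the tuple order of urllib's helpers as I understand them: user information before host, and user name before password. The names split_user and split_passwd are hypothetical and are not the library functions themselves.

```python
def split_user(host):
    """'user:pw@host' -> ('user:pw', 'host'); without '@' -> (None, host)."""
    user, sep, rest = host.rpartition("@")
    return (user if sep else None), rest


def split_passwd(user):
    """'user:pw' -> ('user', 'pw'); without ':' -> (user, None)."""
    name, sep, passwd = user.partition(":")
    return name, (passwd if sep else None)
```

Note that split_user("@host.com") returns an empty string rather than None for the user part, which is why a test against "is not None" behaves differently from a plain truth test.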
Chris, if you are too busy, close this patch and I will open
a new bug with a revised patch.
So long, Bastian
----------------------------------------------------------------------
Comment By: Moshe Zadka (moshez)
Date: 2002-06-11 05:34
Message:
Logged In: YES
user_id=11645
I want to take a look at this... I'm not thrilled about the
patch, especially solving two unrelated problems and all, but
I do think there's a real problem, and I'll try to fix it.
----------------------------------------------------------------------