[Python-bugs-list] [ python-Bugs-588714 ] urllib.urlopen.geturl() and redirects
noreply@sourceforge.net
Wed, 31 Jul 2002 01:34:25 -0700
Bugs item #588714, was opened at 2002-07-30 19:11
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=588714&group_id=5470
Category: Python Library
Group: Python 2.2.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Matthias Klose (doko)
>Assigned to: Jeremy Hylton (jhylton)
Summary: urllib.urlopen.geturl() and redirects
Initial Comment:
[From http://bugs.debian.org/146408]
From: Matthew Vernon <matthew@pick.ucam.org>
Subject: python2.2: urllib.urlopen.geturl() fails to deal with redirects properly
urllib.urlopen.geturl() claims:
"The geturl() method returns the real URL of the page. In some cases,
the HTTP server redirects a client to another URL. The urlopen()
function handles this transparently, but in some cases the caller
needs to know which URL the client was redirected to. The geturl()
method can be used to get at this redirected URL."
But it appears not to:
>>> urllib.urlopen("http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky").geturl()
"http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky"
Doing the same by steam:
HEAD http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky HTTP/1.1
Host: www.google.com
HTTP/1.0 302 Moved Temporarily
Content-Length: 151
Server: GWS/2.0
Date: Thu, 09 May 2002 16:51:37 GMT
Location: http://www.toefl.org/
Content-Type: text/html
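
The manual check above comes down to reading the status code and the Location header of the response. As a rough illustration only, here is a minimal sketch in modern Python 3 (not the Python 2.2 urllib code under discussion) that does the same with the standard library. The helper name redirect_target and the set of status codes it treats as redirects are assumptions made for this example; the response text is copied from the report.

```python
from email.parser import Parser
from urllib.parse import urljoin

# Response headers copied from the bug report above.
RAW_RESPONSE = (
    "HTTP/1.0 302 Moved Temporarily\r\n"
    "Content-Length: 151\r\n"
    "Server: GWS/2.0\r\n"
    "Date: Thu, 09 May 2002 16:51:37 GMT\r\n"
    "Location: http://www.toefl.org/\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

REQUEST_URL = "http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky"


def redirect_target(request_url, raw_response):
    """Hypothetical helper: return the URL a redirect response points to.

    Returns None when the response is not a redirect or has no Location.
    """
    status_line, _, header_text = raw_response.partition("\r\n")
    # The status line looks like "HTTP/1.0 302 Moved Temporarily".
    status_code = int(status_line.split(None, 2)[1])
    # Assumption for this sketch: these codes count as redirects.
    if status_code not in (301, 302, 303, 307, 308):
        return None
    headers = Parser().parsestr(header_text, headersonly=True)
    location = headers.get("Location")
    if location is None:
        return None
    # A Location value may be relative, so resolve it against the request URL.
    return urljoin(request_url, location)


if __name__ == "__main__":
    print(redirect_target(REQUEST_URL, RAW_RESPONSE))
```

For the response quoted in the report this yields http://www.toefl.org/, which is the value the reporter expected geturl() to return.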
----------------------------------------------------------------------
>Comment By: Michael Hudson (mwh)
Date: 2002-07-31 08:34
Message:
Logged In: YES
user_id=6656
Something even weirder happens when I try urllib2:
>>> urllib2.urlopen("http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky").geturl()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 136, in urlopen
    return _opener.open(url, data)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 324, in open
    '_open', req)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 303, in _call_chain
    result = func(*args)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 792, in http_open
    return self.do_open(httplib.HTTP, req)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 786, in do_open
    return self.parent.error('http', req, fp, code, msg, hdrs)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 350, in error
    return self._call_chain(*args)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 303, in _call_chain
    result = func(*args)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 402, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
(sf is going to mangle that traceback, I can tell).
----------------------------------------------------------------------