[Python-bugs-list] [ python-Bugs-588714 ] urllib.urlopen.geturl() and redirects

noreply@sourceforge.net
Wed, 31 Jul 2002 01:34:25 -0700


Bugs item #588714, was opened at 2002-07-30 19:11
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=588714&group_id=5470

Category: Python Library
Group: Python 2.2.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Matthias Klose (doko)
>Assigned to: Jeremy Hylton (jhylton)
Summary: urllib.urlopen.geturl() and redirects

Initial Comment:
[From http://bugs.debian.org/146408]

From: Matthew Vernon <matthew@pick.ucam.org>
Subject: python2.2: urllib.urlopen.geturl() fails to deal with redirects properly

urllib.urlopen.geturl() claims:

"The geturl() method returns the real URL of the page. In some cases,
the HTTP server redirects a client to another URL. The urlopen()
function handles this transparently, but in some cases the caller
needs to know which URL the client was redirected to. The geturl()
method can be used to get at this redirected URL."

But it appears not to:

>>> urllib.urlopen("http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky").geturl()
"http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky"

Doing the same by steam:

HEAD http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky HTTP/1.1
Host: www.google.com

HTTP/1.0 302 Moved Temporarily
Content-Length: 151
Server: GWS/2.0
Date: Thu, 09 May 2002 16:51:37 GMT
Location: http://www.toefl.org/
Content-Type: text/html
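
The comparison above (a hand-written HEAD request versus what geturl() reports) can be reproduced without depending on an external site. The following is a minimal sketch, assuming Python 3, where urllib.urlopen became urllib.request.urlopen; it is not the Python 2.2 code path discussed in this report. The local server, its /start and /final paths, and the helper names are hypothetical and exist only for illustration.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /start answers 302, anything else 200."""

    def _respond(self, send_body):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            if send_body:
                self.wfile.write(body)

    def do_GET(self):
        self._respond(send_body=True)

    def do_HEAD(self):
        self._respond(send_body=False)

    def log_message(self, *args):
        pass  # keep the example quiet


def start_server():
    """Start the hypothetical server on a free local port."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def manual_head(port, path):
    """Issue a HEAD by hand; http.client does not follow redirects."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("HEAD", path)
    resp = conn.getresponse()
    status, location = resp.status, resp.getheader("Location")
    conn.close()
    return status, location


def final_url(port, path):
    """Let urllib follow the redirect and report what geturl() returns."""
    # An empty ProxyHandler keeps environment proxy settings from
    # intercepting the request to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open("http://127.0.0.1:%d%s" % (port, path)) as resp:
        return resp.geturl()
```

If geturl() behaves as documented, final_url() should return the /final address while manual_head() shows the raw 302 and its Location header.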



----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-07-31 08:34

Message:

Something even weirder happens when I try urllib2:

>>> urllib2.urlopen("http://www.google.com/search?q=test&btnI=I'm+Feeling+Lucky").geturl()

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 136, in urlopen
    return _opener.open(url, data)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 324, in open
    '_open', req)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 303, in _call_chain
    result = func(*args)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 792, in http_open
    return self.do_open(httplib.HTTP, req)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 786, in do_open
    return self.parent.error('http', req, fp, code, msg, hdrs)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 350, in error
    return self._call_chain(*args)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 303, in _call_chain
    result = func(*args)
  File "/home/mwh/src/python/dist/src/Lib/urllib2.py", line 402, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden

(sf is going to mangle that traceback, I can tell).
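
When urlopen raises as in the traceback above, the exception object still carries the status code and the URL of the failing request, which may help narrow down where things go wrong. A minimal sketch, assuming Python 3, where urllib2.HTTPError lives at urllib.error.HTTPError; describe_fetch is a hypothetical helper, not part of urllib, and it does not explain why a server answers 403.

```python
import urllib.error
import urllib.request


def describe_fetch(url, opener=None):
    """Return (status, url) for a fetch, catching HTTP errors.

    Hypothetical helper for illustration only.
    """
    opener = opener or urllib.request.build_opener()
    try:
        with opener.open(url) as resp:
            # Success: report the status and the post-redirect URL.
            return resp.getcode(), resp.geturl()
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object: it exposes the status
        # code and the URL of the request that failed.
        return err.code, err.geturl()
```

Calling describe_fetch() on the URL from this report would show whether the 403 is attached to the original address or to a redirect target.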

----------------------------------------------------------------------