Getting final url when original url redirects

Philip Semanchuk philip at
Thu Mar 12 21:45:32 CET 2009

On Mar 12, 2009, at 3:57 PM, IanR wrote:

> I'm processing RSS content from a number of given sources.  Most of the
> time the url given by the RSS feed redirects to the real URL (I'm
> guessing they do this for tracking purposes)
> For example.
> This is a url that I get from an RSS feed,
> It redirects to
> I want to record the final URL and not the URL I get from the RSS feed
> (However sometimes there is no redirect so I might want the original
> URL)
> I've tried sniffing the header and don't see any "Location:"... I
> think sites are using different ways to redirect.  Does anyone have
> any suggestions on how I might handle this?

Hi Ian,
Using Firefox's Live HTTP Headers extension, I see a 302 redirect with
a Location header (see session log below). Are you aware that urllib2
follows redirects for you automatically? That might be why you're not
seeing what you expect. If you want a record of each URL, you'll have
to implement an HTTPRedirectHandler.

GET /click.phdo?i=d22e9bc7641aab8a0566526f61806512 HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv: Gecko/2009021906 Firefox/3.0.7
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.7,sv;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 302 Found
Date: Thu, 12 Mar 2009 20:41:29 GMT
Server: Apache
X-Powered-By: PHP/5.2.3-1ubuntu6.3
Pragma: no-cache
Cache-Control: no-cache, must-revalidate
Set-Cookie: phdo=1-tst%3Aa8t5sELbkk9oy3pXsrohSnPslqQxQKIhVP%2F8Ots%3D; expires=Fri, 13-Mar-2009 20:41:29 GMT; path=/;
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 26
Connection: close
Content-Type: text/html

etc. etc.
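
To make the HTTPRedirectHandler suggestion above concrete, here is a minimal sketch, not a definitive implementation. It is written against Python 3's urllib.request, which is where urllib2's classes live now; under urllib2 the same class and method names apply. The names RecordingRedirectHandler and resolve_final_url are made up for this example. It assumes the site redirects with an HTTP 3xx status and a Location header, as in the log above; a meta-refresh or JavaScript redirect would not be seen by it.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Follow redirects as usual, but remember each hop (hypothetical helper)."""

    def __init__(self):
        # One (status_code, from_url, to_url) tuple per redirect followed.
        self.history = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler decide whether this redirect may be followed;
        # it raises HTTPError if not, in which case nothing is recorded.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is not None:
            self.history.append((code, req.full_url, newurl))
        return new_req


def resolve_final_url(url, timeout=10):
    """Return (final_url, history) for url; history is empty if no redirect."""
    handler = RecordingRedirectHandler()
    # build_opener swaps our subclass in for the default redirect handler.
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        # response.url is the URL actually fetched after all redirects.
        return response.url, handler.history
```

Calling resolve_final_url(feed_url), where feed_url is whatever link the RSS item supplies, gives back the final URL plus the list of hops. When there is no redirect the list is empty and the final URL is simply the original one, which covers the "sometimes there is no redirect" case. If only the final URL matters, the handler can be skipped entirely: the response object returned by urlopen already reports it (response.geturl() in urllib2).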
