urllib, urlretrieve method, how to get headers?

Fri Jul 1 03:43:03 EDT 2011

Даниил Рыжков wrote:

> How can I get headers with urlretrieve? I want to send request and get
> headers with necessary information before I execute urlretrieve(). Or
> are there any alternatives for urlretrieve()?

It's easy to do it manually:

>>> import urllib2

Connect to website and inspect headers:

>>> f = urllib2.urlopen("http://www.python.org")
>>> f.headers["Content-Type"]
'text/html'

Write page content to file:

>>> with open("tmp.html", "w") as dest:
...     dest.writelines(f)
...

Did we get what we expected?

>>> with open("tmp.html") as f: print f.read().split("title")[1]
...
>Python Programming Language – Official Website</
>>>