urllib, urlretrieve method, how to get headers?

Peter Otten __peter__ at web.de
Fri Jul 1 09:43:03 CEST 2011


Даниил Рыжков wrote:

> How can I get headers with urlretrieve? I want to send a request and get
> the headers with the necessary information before I execute urlretrieve().
> Or are there any alternatives to urlretrieve()?
 
It's easy to do it manually:

>>> import urllib2

Connect to website and inspect headers:

>>> f = urllib2.urlopen("http://www.python.org")
>>> f.headers["Content-Type"]
'text/html'
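
If you only need the headers and not the body, a HEAD request may be enough. Here is a rough sketch for Python 3, where urllib2 became urllib.request. The name fetch_headers and the timeout default are made up for illustration, and it assumes the server answers HEAD requests, which not every server does:

```python
import urllib.request


def fetch_headers(url, timeout=10):
    """Return the response headers for url without downloading the body."""
    # method="HEAD" asks the server for the headers only.
    # Assumption: the server supports HEAD; some respond with an error instead.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # response.headers supports case-insensitive lookup by header name.
        return response.headers
```

The returned object can be indexed like f.headers above, for example fetch_headers(url)["Content-Type"].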

Write page content to file:

>>> with open("tmp.html", "wb") as dest:
...     dest.writelines(f)
...
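
The two steps (look at the headers, then save the body) can be combined so that the download only happens when the headers look right. This is a sketch for Python 3 rather than a drop-in replacement for urlretrieve(); download_if_type and its expected_type parameter are hypothetical names, and checking the content type is just one example of a header test:

```python
import shutil
import urllib.request


def download_if_type(url, dest_path, expected_type="text/html"):
    """Save url to dest_path only if its Content-Type matches expected_type."""
    with urllib.request.urlopen(url) as response:
        # The headers are available as soon as the response is opened,
        # before any of the body has been read.
        if response.headers.get_content_type() != expected_type:
            return False
        # Binary mode keeps the bytes exactly as the server sent them.
        with open(dest_path, "wb") as dest:
            shutil.copyfileobj(response, dest)
    return True
```

get_content_type() returns only the type itself, without parameters such as the charset, so a header like "text/html; charset=utf-8" still matches "text/html".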

Did we get what we expected?

>>> with open("tmp.html") as f: print f.read().split("title")[1]
...
>Python Programming Language – Official Website</
>>>
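
For completeness: urlretrieve() itself hands back the headers, but only after the download has finished, so it does not help if you want to decide beforehand. In Python 2 it is urllib.urlretrieve(), in Python 3 urllib.request.urlretrieve(); both return a (filename, headers) tuple. A minimal Python 3 sketch, with retrieve_with_headers as a made-up wrapper name:

```python
import urllib.request


def retrieve_with_headers(url, dest_path):
    """Download url to dest_path and return the response headers."""
    # urlretrieve() returns (filename, headers) once the download is complete.
    filename, headers = urllib.request.urlretrieve(url, dest_path)
    return headers
```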



