Downloading TV listings with urllib
Gerhard Häring
gh at ghaering.de
Thu Apr 17 13:58:43 EDT 2003
Josh wrote:
> Can someone give me general pointers on how it could be done? I am trying
> to go to tvguide.com and download the TV listings for my area. The problem
> is how to specify my location and cable provider. I used Proxomitron to
> look at the traffic generated while downloading the listings with my
> browser. When I used httplib with:
>
> host = 'www.tvguide.com'
> pathn = '/Listings/index.asp'
>
> def tvg():
>     try:
>         h = httplib.HTTPConnection(host)
>         h.putrequest('GET', pathn)
>         h.putheader('Accept', 'text/html')
>         h.putheader('Accept', 'text/plain')
>         h.putheader('User-Agent:',
>                     'Mozilla/5.0 (Windows; U; Windows NT '
>                     '5.1; en-US; rv:1.3) Gecko/20030312')
>         h.putheader('Accept:',
>                     'application/x-shockwave-flash,text/xml,'
>                     'application/xml,application/xhtml+xml,'
>                     'text/html;q=0.9,text/plain;q=0.8,video/x-mng,'
>                     'image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1')
>         h.putheader('Accept-Language:','en-us,en;q=0.5')
>         h.putheader('Accept-Encoding:', 'gzip,deflate,compress;q=0.9')
>         h.putheader('Accept-Charset:', 'ISO-8859-1,utf-8;q=0.7,*;q=0.7')
>         h.putheader('Keep-Alive:', '300')
>         h.putheader('Cookie',
>                     'SITESERVER=ID=fd7d0ca25acdd2439ab3f4b48b0827b4; GBA=18; '
>                     'TVGID=9CE64992B85F4F9A82B57EBC092D11B2; nat=0; '
>                     'ServiceID=75347; zip=63130; ptfc=yfki')
>         h.endheaders()
>         err = h.getresponse()
>         return err
>     except socket.error, er:
>         print 'socket error ', er
>
> When I use err.read() to read the returned object, I get everything in hex.
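My guess is that you're getting exactly what you asked for: gzip-compressed
data. Your request sends an Accept-Encoding header listing gzip, deflate and
compress, so the server is allowed to compress the response body, and
err.read() hands you those compressed bytes as-is. Either stop sending the
Accept-Encoding header or decompress the body yourself.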
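
If the compressed response is wanted, the body can be decoded after reading it. The following is only a minimal sketch, written for a current Python 3 rather than the Python 2 of the quoted code. The helper name decode_body is made up for illustration, and the sample bytes stand in for what err.read() might return; against a real response, the encoding would come from something like err.getheader('Content-Encoding').

```python
import gzip
import zlib


def decode_body(raw, content_encoding):
    """Return the body bytes, decompressed according to Content-Encoding.

    decode_body is a hypothetical helper, not part of httplib/http.client.
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(raw)
    if encoding == "deflate":
        try:
            # "deflate" is normally a zlib-wrapped stream ...
            return zlib.decompress(raw)
        except zlib.error:
            # ... but some servers send a raw deflate stream instead.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # identity or an encoding not handled here: return the bytes unchanged
    return raw


# Stand-in for the bytes a gzip-encoding server might send back.
sample = gzip.compress(b"<html>listings</html>")
print(decode_body(sample, "gzip"))  # prints b'<html>listings</html>'
```

The simpler route is still to leave out the Accept-Encoding header, in which case the server should send the page uncompressed and no decoding step is needed.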
-- Gerhard