Re: Downloading TV listings with urllib

Patrick.Bussi at space.alcatel.fr Patrick.Bussi at space.alcatel.fr
Thu Apr 17 20:06:42 CEST 2003


I suggest you use urllib2, together with ClientCookie if you need to support
cookies on the client side. They are much more convenient than dealing with
the lower-level httplib, and they also let you set headers as you wish.

urllib2 is included in Python
ClientCookie is available at http://wwwsearch.sourceforge.net/ClientCookie/
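
As a rough sketch of the suggestion above, here is how it might look on a
current Python 3 interpreter, where urllib2 lives on as urllib.request and
cookie handling of the kind ClientCookie provides is in the standard library
as http.cookiejar. The URL and header values are taken from the original
post; build_opener_with_cookies, build_listing_request and fetch are
hypothetical helper names, and none of this has been tried against
tvguide.com.

```python
import http.cookiejar
import urllib.request

# Host and path from the original post, joined into one URL.
LISTINGS_URL = 'http://www.tvguide.com/Listings/index.asp'


def build_opener_with_cookies():
    """Return (opener, jar): the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_listing_request(url=LISTINGS_URL):
    """Build a GET request with browser-like headers (illustrative values)."""
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; '
                      'rv:1.3) Gecko/20030312',
        'Accept': 'text/html,text/plain;q=0.8',
        'Accept-Language': 'en-us,en;q=0.5',
    }
    return urllib.request.Request(url, headers=headers)


def fetch(opener, request, timeout=30):
    """Perform the request (network access) and return the raw body bytes."""
    with opener.open(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch(opener, build_listing_request()) would perform the download.
Any cookies the server sets end up in the jar and are sent back on later
requests made through the same opener, so if the site keeps the location in
cookies, as the Cookie header in the original post suggests, they should not
need to be copied by hand.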

Hope this helps. Patrick.

---
Patrick Bussi
patrick.bussi at space.alcatel.fr

Any opinions expressed are my own and not necessarily those of my Company.




Josh <mlsj at earthlink.net> on 17/04/2003 19:23:34

To:      python-list at python.org
cc:      (bcc: Patrick Bussi/ALCATEL-SPACE)
Subject: Downloading TV listings with urllib



Can someone give me general pointers on how it could be done? I am trying
to go to tvguide.com and download the TV listings for my area. The problem
is how to specify my location and cable provider. I used Proxomitron to
look at the traffic generated while downloading the listings with my
browser. When I used httplib with:

import httplib
import socket

host = 'www.tvguide.com'
pathn = '/Listings/index.asp'

def tvg():
    try:
        h = httplib.HTTPConnection(host)
        h.putrequest('GET', pathn)
        # putheader() adds the colon itself, so header names carry none.
        h.putheader('Accept', 'text/html')
        h.putheader('Accept', 'text/plain')
        h.putheader('User-Agent',
                    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3) '
                    'Gecko/20030312')
        h.putheader('Accept',
                    'application/x-shockwave-flash,text/xml,application/xml,'
                    'application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,'
                    'video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1')
        h.putheader('Accept-Language', 'en-us,en;q=0.5')
        h.putheader('Accept-Encoding', 'gzip,deflate,compress;q=0.9')
        h.putheader('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.7')
        h.putheader('Keep-Alive', '300')
        h.putheader('Cookie',
                    'SITESERVER=ID=fd7d0ca25acdd2439ab3f4b48b0827b4; GBA=18; '
                    'TVGID=9CE64992B85F4F9A82B57EBC092D11B2; nat=0; '
                    'ServiceID=75347; zip=63130; ptfc=yfki')
        h.endheaders()
        err = h.getresponse()
        return err
    except socket.error, er:
        print 'socket error ', er

When I use err.read() to read the returned object, I get everything in hex.
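
One possible explanation, offered only as a guess since the thread does not
confirm it: the request above sends Accept-Encoding: gzip,deflate,compress,
so the server may answer with a compressed body, which looks like binary
noise when printed. A minimal sketch, in current Python 3, of checking the
Content-Encoding response header and decompressing when needed. decode_body
is a hypothetical helper name and it handles gzip only.

```python
import gzip


def decode_body(raw, content_encoding):
    """Return the body bytes, gunzipping them if the server compressed them."""
    if content_encoding and content_encoding.strip().lower() == 'gzip':
        return gzip.decompress(raw)
    # No encoding, or one this sketch does not handle: return bytes unchanged.
    return raw
```

With an http.client response object this would be called as
decode_body(response.read(), response.getheader('Content-Encoding')).
Leaving the Accept-Encoding header out of the request is the simpler way to
ask for an uncompressed reply.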

I tried using urllib, but there I can't figure out a way to pass the
location information etc.

As I said, I just need general pointers on how to deal with this. Any
help would be greatly appreciated.

Thanks

Josh



--
http://mail.python.org/mailman/listinfo/python-list