HTTPLIB - Problem retrieving a page

Colin Meeks colinmeeks at rogers.com
Fri Nov 23 20:46:46 CET 2001


I am running the following code to retrieve pages and strip out some
details. However, I have noticed that some sites do not work, even though
the correct URL is given. I can verify that the URL works by testing it in
my browser, but the code below gets a 404 error.

Here is the code:

-------------------------------------

import urlparse, httplib, urllib
UseURL='http://www.meeks.ca/index.htm'

y=urlparse.urlparse(UseURL)
usesite=y[1]
useparameters=y[2]

if useparameters=='':
    useparameters='/'

h=httplib.HTTP(usesite)
h.putrequest('GET',useparameters)
h.putheader('Accept','text/html')
h.putheader('Accept', 'text/plain')
h.endheaders()
errcode, errmsg, headers=h.getreply()

x=h.getfile().read()

h.close()

---------------------------------

The site above is just one of many that do not work.  Can anybody tell me a
way to work around this?

Colin
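
One possible cause, offered as an assumption rather than a confirmed
diagnosis for this site: httplib.HTTP sends an HTTP/1.0 request and does not
add a Host header on its own, while servers that host several sites on one
IP address rely on that header to pick the site, and may answer 404 without
it. The sketch below uses httplib.HTTPConnection instead, which speaks
HTTP/1.1 and adds Host automatically. The helper names split_url and fetch
are hypothetical, and the import fallback is only there so the same sketch
also runs on later Python versions where the modules were renamed.

```python
try:                                   # Python 2 module names, as in the post
    import httplib
    from urlparse import urlparse
except ImportError:                    # later Python versions renamed them
    import http.client as httplib
    from urllib.parse import urlparse


def split_url(url):
    """Return (host, path) for url, defaulting the path to '/'."""
    parts = urlparse(url)
    host = parts[1]                    # network location, e.g. 'www.meeks.ca'
    path = parts[2] or '/'             # an empty path means the site root
    if parts[4]:                       # keep any query string on the request
        path = path + '?' + parts[4]
    return host, path


def fetch(url):
    """GET url and return (status, reason, body)."""
    host, path = split_url(url)
    # HTTPConnection sends HTTP/1.1 and adds the Host header itself.
    conn = httplib.HTTPConnection(host)
    try:
        # One Accept header listing both types, instead of two headers.
        conn.request('GET', path, headers={'Accept': 'text/html, text/plain'})
        response = conn.getresponse()
        return response.status, response.reason, response.read()
    finally:
        conn.close()
```

It could be called as status, reason, body = fetch(UseURL). If the original
httplib.HTTP code is kept instead, the equivalent change under the same
assumption would be one extra line after putrequest:
h.putheader('Host', usesite).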






