httplib problems.

Thu Apr 12 04:42:02 EDT 2001

Sammy Mannaert wrote:

> hi,
> 
> i'm trying to use httplib to fetch a file
> automatically. it's basically just the
> example from the 2.0 httplib GET example.
> 
> it works for all urls except for
> http://www.deathinjune.net/html/news/news.htm
> 
> does anyone know why it won't work for this
> url ? is there an easy way to fix it ?
> i tried surfing to the url in netscape and
> lynx. both worked fine.
>

If you just want to fetch files, you are better off with the higher-level 
urllib module, since it follows redirects and handles "virtual hosts" for you. 
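
As a rough sketch of that urllib route, here is a one-call fetch written 
for current Python 3, where the function lives at urllib.request.urlopen 
(in the 2.0 series it is urllib.urlopen). The helper name fetch_url and 
the timeout value are assumptions made for illustration; they are not part 
of the httplib example.

```python
import urllib.request


def fetch_url(url, timeout=10.0):
    """Fetch url and return (status, body).

    urlopen follows redirects and sends the Host header on its own,
    so none of that has to be spelled out by hand.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read()
```

Calling fetch_url("http://www.deathinjune.net/html/news/news.htm") should 
then return the status code and the page body, assuming the site is 
reachable.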

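To fetch a file from a name-based virtual host with low-level code like 
httplib, you have to do what all modern browsers do: send a "Host" header 
carrying the full name of the server you want to reach. Without it, a 
server that hosts several sites on one address cannot tell which site you 
are asking for: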

import httplib

def fetch(domain, path):
    # the old-style HTTP class talks HTTP/1.0, so Host is added by hand
    h = httplib.HTTP(domain)
    h.putrequest('GET', path)
    h.putheader('Accept', 'text/html')
    h.putheader('Accept', 'text/plain')
    h.putheader('Host', domain)   # <------------- add this line
    h.endheaders()
    errcode, errmsg, headers = h.getreply()
    print errcode                 # HTTP status code of the reply
    f = h.getfile()
    data = f.read()
    f.close()
    print len(data)               # length of the body that was read
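
To see why the Host header matters, here is a small self-contained sketch 
in current Python 3, where httplib is named http.client. It starts a toy 
local server that picks its reply from the Host header, the way a 
name-based virtual host does, and then sends the same GET with different 
Host values. The host names, the SITES table and the helper names are all 
made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical virtual hosts, all served from one address.
SITES = {
    "site-a.example": b"news from site A",
    "site-b.example": b"news from site B",
}


class VirtualHostHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Pick the site from the Host header, ignoring any ":port" suffix.
        host = (self.headers.get("Host") or "").split(":")[0]
        body = SITES.get(host)
        status = 200 if body is not None else 404
        if body is None:
            body = b"unknown host"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_with_host(address, port, path, host):
    """GET path from address:port, sending an explicit Host header."""
    conn = http.client.HTTPConnection(address, port, timeout=5)
    try:
        # skip_host=True stops http.client from adding its own Host header.
        conn.putrequest("GET", path, skip_host=True)
        conn.putheader("Host", host)
        conn.putheader("Accept", "text/html, text/plain")
        conn.endheaders()
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def demo():
    """Send the same request with three Host values to one local server."""
    server = HTTPServer(("127.0.0.1", 0), VirtualHostHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_port
        names = ("site-a.example", "site-b.example", "127.0.0.1")
        return [fetch_with_host("127.0.0.1", port, "/", name) for name in names]
    finally:
        server.shutdown()
        server.server_close()
```

In demo() the connection always goes to the same address and port; only 
the Host header changes. The two known names get their own page, while the 
bare address, which matches no configured site, gets a 404 from this toy 
server.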

-- 
Romuald Texier