How to grab HTML files behind authentication

Oleg Broytmann phd at phd.fep.ru
Thu Jun 28 05:51:12 EDT 2001


Thank you. But did you know that urllib can do the same thing even more simply?

urllib.urlretrieve("http://myName:myPassword@www.something.com/secret/index.html")
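
For anyone on Python 3, where urlretrieve lives in urllib.request: one way to
hand credentials to urllib is a password manager plus HTTPBasicAuthHandler.
A minimal sketch, not tested against a real server; make_opener is a name I
made up for illustration, and the URL and credentials are the placeholders
from the script quoted below.

```python
import urllib.request


def make_opener(top_url, user, password):
    """Return (opener, manager) that answer Basic auth challenges under top_url."""
    # realm=None means "use these credentials for any realm" under top_url
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, top_url, user, password)

    # The handler replies to a 401 / WWW-Authenticate challenge from the server
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    opener = urllib.request.build_opener(handler)
    return opener, manager
```

Calling opener.open("http://www.something.com/secret/index.html").read() on
the returned opener should then fetch the page, assuming the server really
uses Basic authentication.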

On 28 Jun 2001, Dirk Krause wrote:
>   I've put together some code the Python community might find useful.
> You can use this script to automatically spider web pages behind the
> WWW-Authenticate dialog box.
>
> ---snip---
> import httplib, string, base64
>
> # How to grab HTML files behind authentication
> # author: Dirk Krause, 06/28/2001
> # change these entries below!!
>
> base = 'www.something.com'   # host name only; httplib.HTTP() does not take a URL scheme
> path = '/secret/index.html'
>
> u_name = 'myName'
> u_pwd  = 'myPassword'
>
>
> # ok, here goes
>
> hlink = httplib.HTTP(base)
> hlink.putrequest('GET', path)   # putrequest() appends the HTTP version itself
> hlink.putheader('Host', base)
>
> hlink.putheader('Accept', 'text/html')
> hlink.putheader('Accept', 'text/plain')
>
> temp = "%s:%s" % (u_name,u_pwd)
> temp = base64.encodestring(temp)
> temp = "Basic %s" % string.strip(temp)
> hlink.putheader("Authorization",temp)
>
> hlink.endheaders()
>
> errcode, errmsg, header = hlink.getreply()
> content = hlink.getfile().read()
>
> print content
> print errcode, header
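
The same by-hand approach as the script above, as a rough Python 3 sketch:
base64.encodestring and the string module functions are gone there, so this
uses base64.b64encode instead. The helper names basic_auth_header and
build_request are mine, not part of any library, and nothing goes over the
network until the request is passed to urllib.request.urlopen.

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Return the value for an HTTP Basic 'Authorization' header."""
    credentials = "%s:%s" % (user, password)
    # b64encode works on bytes and, unlike encodestring, adds no trailing newline
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic " + token


def build_request(url, user, password):
    """Build (but do not send) a GET request that carries the credentials."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(user, password))
    return request


if __name__ == "__main__":
    req = build_request("http://www.something.com/secret/index.html",
                        "myName", "myPassword")
    # urllib.request.urlopen(req).read() would fetch the page itself
    print(req.get_method(), req.full_url)
```

Sending the header up front like this skips the first 401 round trip, at the
price of sending the password even to a server that never asked for it.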

Oleg.
----
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.




