[Tutor] password protection in httplib
Kent Johnson
kent37 at tds.net
Wed Mar 1 13:05:08 CET 2006
Andre Engels wrote:
> I am active in pywikipediabot, which is programmed in Python and is
> used to edit wikis (based on MediaWiki, such as Wikipedia). It uses
> httplib to connect to the site and get the HTML data.
>
> I now want to use it on another site, but this site is password
> protected (we want to first improve it before releasing it to the
> public). Is it possible with httplib to connect to password protected
> sites (given that I know the login and password of course), and if so,
> how is this done? If not, is there an alternative?
What kind of authentication is used? Basic and digest authentication
will pop up a dialog in the browser asking for your credentials. The
browser then remembers the credentials and includes them in subsequent
requests. With form-based authentication, a page displays in the browser
with a login form; the web site authenticates and usually sends a cookie
to the browser which must be included in subsequent requests.
urllib2 has good built-in support for basic and digest authentication of
web sites. For form-based authentication you have to do a bit more work:
install a cookie manager and post to the form yourself.
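A rough sketch of the basic/digest case follows. It is written against
Python 3, where the urllib2 classes live in urllib.request (in Python 2.4
the same class names are found on urllib2). The function name, URL and
credentials are placeholders of mine, not anything from the actual site.

```python
import urllib.request


def build_auth_opener(top_level_url, username, password):
    """Return an opener that answers basic and digest auth challenges.

    All three arguments are hypothetical placeholders supplied by the caller.
    """
    # A password manager with a default realm means we do not need to know
    # the realm name the server will announce.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, top_level_url, username, password)

    # Both handlers share the same password manager; whichever scheme the
    # server asks for gets used.
    basic = urllib.request.HTTPBasicAuthHandler(password_mgr)
    digest = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(basic, digest)
```

Something like opener = build_auth_opener("http://example.org/wiki/",
"login", "secret") followed by opener.open(page_url).read() should then
fetch protected pages under that URL, assuming the site really uses basic
or digest authentication.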
See http://www.voidspace.org.uk/python/articles/authentication.shtml for
examples of basic auth. Digest auth works pretty much the same way. Make
sure you read to the section "Doing It Properly" - the author likes to
show you the hard way first.
The article http://www.voidspace.org.uk/python/articles/cookielib.shtml
shows how to use cookies, though again the presentation makes it look
harder than it really is, at least in Python 2.4, which has cookielib
built in. You have to post to the login form yourself, but that is just
another urllib2 request.
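For the form-based case, a minimal sketch might look like the following.
Again this assumes Python 3, where cookielib is http.cookiejar and urllib2
is urllib.request. The helper names, the login URL and the form field names
are hypothetical; the real field names have to be read from the HTML of the
site's login form.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, fields):
    """Build a POST request carrying the login form fields.

    login_url and the keys of fields are placeholders; take the real
    values from the login form of the site in question.
    """
    data = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying data makes this a POST rather than a GET.
    return urllib.request.Request(login_url, data=data)
```

Usage would be along the lines of opener, jar = build_cookie_opener(),
then opener.open(build_login_request(login_url, {"user": "login",
"password": "secret"})). Later opener.open(page_url) calls on the same
opener send back whatever session cookie the site set at login.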
Kent