[Tutor] password protection in httplib

Kent Johnson kent37 at tds.net
Wed Mar 1 13:05:08 CET 2006


Andre Engels wrote:
> I am active in pywikipediabot, which is programmed in Python and is
> used to edit wikis (based on MediaWiki, such as Wikipedia). It uses
> httplib to connect to the site and get the HTML data.
> 
> I now want to use it on another site, but this site is password
> protected (we want to first improve it before releasing it to the
> public). Is it possible with httplib to connect to password protected
> sites (given that I know the login and password of course), and if so,
> how is this done? If not, is there an alternative?

What kind of authentication is used? Basic and digest authentication 
will pop up a dialog in the browser asking for your credentials. The 
browser then remembers the credentials and includes them in subsequent 
requests. With form-based authentication, the site displays a page with 
a login form; when you submit it, the site checks your credentials and 
usually sends back a cookie, which must be included in subsequent requests.
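
To make "includes them in subsequent requests" concrete: for basic 
authentication the credentials travel in an Authorization header, which is 
just "username:password" base64-encoded. Here is a minimal sketch of 
building that header by hand. The function name and the credentials are 
made up for illustration, and this only covers basic auth, not digest.

```python
import base64

def basic_auth_header(username, password):
    """Return the Authorization header value for HTTP basic auth."""
    # The credentials are joined with a colon and base64-encoded.
    # Note that this is an encoding, not encryption.
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

# Hypothetical credentials; pass this dict as the headers of each request.
headers = {"Authorization": basic_auth_header("andre", "secret")}
```

A dict like this can be passed as the headers argument of an httplib 
request, so code that already uses httplib would only need to add it to 
every request it sends.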

urllib2 has good built-in support for basic and digest authentication of 
web sites. For form-based authentication you have to do a bit more work: 
install a cookie manager and post to the form yourself.

See http://www.voidspace.org.uk/python/articles/authentication.shtml for 
examples of basic auth. Digest auth works pretty much the same way. Make 
sure you read to the section "Doing It Properly" - the author likes to 
show you the hard way first.
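
For reference, a rough sketch of the handler-based approach with urllib2 
might look like the following. The URL and credentials in the comments are 
placeholders, and build_auth_opener is just a helper name I made up. The 
import fallback is there because urllib2 was renamed urllib.request in 
Python 3.

```python
try:
    import urllib.request as urllib2  # Python 3 name for the same module
except ImportError:
    import urllib2                    # Python 2

def build_auth_opener(top_url, username, password):
    """Return an opener that answers basic and digest auth challenges."""
    # With realm None, the credentials apply to any realm under top_url.
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, top_url, username, password)
    basic = urllib2.HTTPBasicAuthHandler(password_mgr)
    digest = urllib2.HTTPDigestAuthHandler(password_mgr)
    return urllib2.build_opener(basic, digest)

# Usage (placeholder URL and credentials, not run here):
# opener = build_auth_opener("http://wiki.example.org/", "andre", "secret")
# html = opener.open("http://wiki.example.org/index.php").read()
```

The opener only sends the credentials when the server asks for them, and 
only for URLs under the one registered with the password manager.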

The article http://www.voidspace.org.uk/python/articles/cookielib.shtml 
shows how to use cookies, though again the presentation makes it look 
harder than it really is, at least in Python 2.4, which has cookielib 
built in. You have to post to the login form yourself, but that is just 
another urllib2 request.
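
A minimal sketch of the cookie approach, assuming a login form that takes 
a user name and password. The URL and the form field names below are 
hypothetical; you would need to look at the HTML of the real login page to 
find the actual ones. The import fallback covers Python 3, where cookielib 
became http.cookiejar and urllib2 became urllib.request.

```python
try:
    import urllib.request as urllib2        # Python 3 names
    import http.cookiejar as cookielib
    from urllib.parse import urlencode
except ImportError:
    import urllib2                          # Python 2 names
    import cookielib
    from urllib import urlencode

def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies."""
    jar = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
    return opener, jar

def build_login_request(login_url, fields):
    """Return a POST request carrying the url-encoded form fields."""
    data = urlencode(fields).encode("ascii")
    # Supplying data makes urllib2 send a POST instead of a GET.
    return urllib2.Request(login_url, data)

# Usage (hypothetical URL and field names, not run here):
# opener, jar = build_cookie_opener()
# req = build_login_request("http://wiki.example.org/login",
#                           {"username": "andre", "password": "secret"})
# opener.open(req)   # the session cookie lands in jar
# html = opener.open("http://wiki.example.org/index.php").read()
```

As long as every later request goes through the same opener, the cookie 
from the login response is sent along automatically.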

Kent
