Outbound HTML Authentication

Tyler Reguly ht at computerdefense.org
Thu Nov 29 21:42:16 CET 2007


You should probably read the HTTP RFC if you're going to write a screen
scraper... but either way.

A 401 tells you that authentication is required, and the server names the
schemes it accepts in the WWW-Authenticate response header. There are
several types of "server-based auth" (as distinct from form-based auth).
They include:

   - Basic
   - Digest
   - NTLM (or Negotiate, which selects between Kerberos and NTLM)
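
To see which of these a server is asking for, you can look at the scheme
name at the start of each WWW-Authenticate header value. The following is
a minimal sketch, not a full header parser: the function name
offered_schemes is made up for illustration, and it assumes each header
line carries a single challenge (which is common, but the HTTP spec also
allows several challenges on one line).

```python
def offered_schemes(www_authenticate_values):
    """Return the lowercase scheme name from each WWW-Authenticate value.

    Assumption: one challenge per header value, e.g. 'Basic realm="x"'.
    """
    schemes = []
    for value in www_authenticate_values:
        # The scheme is the first whitespace-delimited token of the value.
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    return schemes
```

With urllib.request, the header values would typically come from catching
urllib.error.HTTPError, checking that err.code is 401, and calling
err.headers.get_all("WWW-Authenticate").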

Basic is easy to implement... Digest is slightly more complex... NTLM
requires that you have an understanding of how NTLM works in general.
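
To give a feel for the difference in effort, here is a rough sketch of
the client side of Basic and Digest as described in RFC 2617. The
function names are hypothetical, the Digest part covers only the MD5,
qop=auth case, and a real client would also need to parse the challenge
and generate its own cnonce and nonce count.

```python
import base64
import hashlib


def basic_auth_header(username, password):
    """Build the Authorization header value for Basic auth."""
    # Basic is just base64("user:password"); it is encoding, not encryption.
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def _md5_hex(text):
    """Hex MD5 digest of a string (helper for the Digest sketch)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce, nc, cnonce):
    """Compute the Digest 'response' field for algorithm=MD5, qop=auth."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = _md5_hex(f"{method}:{uri}")                 # identifies the request
    # The server's nonce and the client's nc/cnonce tie both hashes together.
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:auth:{ha2}")
```

The value from digest_response goes into the response="..." field of the
Authorization: Digest header, alongside the username, realm, nonce, uri,
nc and cnonce that were used to compute it.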

For NTLM, there are a few things you can do...

   1. Find a public implementation of NTLM in python (I don't believe one
   exists... but if it does, I'd love if someone could point it out)
   2. Use the NTLM Authentication Proxy Server (
   http://www.geocities.com/rozmanov/ntlm/ )
   3. Follow Ronald Tschalär's write-up on NTLM over HTTP and implement
   it yourself ( http://www.innovation.ch/personal/ronald/ntlm.html )
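
For option 2, the script itself never speaks NTLM; it just sends its
requests through the local proxy, which does the handshake for it. A
minimal sketch with urllib.request follows. The address 127.0.0.1:5865
is an assumption for illustration (use whatever address and port your
proxy is configured to listen on), and make_opener is a hypothetical
helper name.

```python
import urllib.request

# Assumption: an NTLM-capable proxy is already running at this address.
LOCAL_NTLM_PROXY = "http://127.0.0.1:5865"


def make_opener(proxy_url=LOCAL_NTLM_PROXY):
    """Return an opener that routes HTTP and HTTPS requests via proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)
```

Something like make_opener().open("http://example.com/").read() would
then fetch the page through the proxy, assuming the proxy is up and has
been given your domain credentials.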

I actually did this recently for a project that I'm working on... and looked
fairly deeply at Ronald's write-up... It is fairly decent... and I may
actually implement it at some point in the future as a released Python
module... for now though you'll have to do it yourself.

Tyler Reguly

On 11/29/07, Mudcat <mnations at gmail.com> wrote:
> Hi,
> I was trying to do a simple web scraping tool, but the network they
> use at work does some type of internal authentication before it lets
> the request out of the network. As a result I'm getting the '401 -
> Authentication Error' from the application.
> I know when I use a web browser or other application that it uses the
> information from my Windows AD to validate my user before it accesses
> a website. I'm constantly getting asked to enter in this info before I
> use Firefox, and I assume that IE picks it up automatically.
> However I'm not sure how to tell the request that I'm building in my
> python script to either use the info in my AD account or enter in my
> user/pass automatically.
> Anyone know how to do this?
> Thanks