Outbound HTML Authentication
ht at computerdefense.org
Thu Nov 29 21:42:16 CET 2007
You should probably read the HTTP RFC if you're going to write a screen
scraper... but either way.
A 401 tells you that auth is required... there are several types of
"server-based auth" (different from form-based auth)... They include:
- Basic
- Digest
- NTLM (or Negotiate)
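
To tell which of these a server is asking for, you can look at the WWW-Authenticate header(s) on the 401 response. The sketch below is only an illustration: the function name offered_schemes is made up for this example, and it assumes one challenge per header line (typical when a server offers NTLM or Negotiate). A single header carrying several comma-separated challenges would need a fuller parser.

```python
def offered_schemes(www_authenticate_values):
    """Return the auth scheme names found in WWW-Authenticate header values.

    Assumption: each header value carries exactly one challenge, so the
    scheme is simply the first whitespace-delimited token.
    """
    schemes = []
    for value in www_authenticate_values:
        parts = value.strip().split(None, 1)
        if parts:
            # Scheme names are case-insensitive, so normalise them.
            schemes.append(parts[0].lower())
    return schemes


if __name__ == "__main__":
    headers = ["Negotiate", "NTLM", 'Basic realm="intranet"']
    print(offered_schemes(headers))
```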
Basic is easy to implement... Digest is slightly more complex... NTLM
requires that you have an understanding of how NTLM works in general.
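
To show the difference in effort between the first two, here is a minimal sketch of what the client has to compute for each. The function names are hypothetical, and the Digest part only covers the common MD5 / qop=auth case from RFC 2617; it is not a complete client.

```python
import base64
import hashlib


def basic_authorization(username, password):
    """Return an Authorization header value for Basic auth."""
    # Basic is just base64("user:password"); it is an encoding, not encryption.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def _md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, password, realm, nonce, method, uri,
                    nc, cnonce, qop="auth"):
    """Compute the 'response' field for Digest auth (MD5, qop=auth only)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")   # who you are
    ha2 = _md5_hex(f"{method}:{uri}")                  # what you are requesting
    # The server's nonce and the client's counter/cnonce tie the hash
    # to this particular exchange.
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


if __name__ == "__main__":
    print(basic_authorization("Aladdin", "open sesame"))
    print(digest_response("Mufasa", "Circle Of Life", "testrealm@host.com",
                          "dcd98b7102dd2f0e8b11d0f600bfb0c093",
                          "GET", "/dir/index.html", "00000001", "0a4f113b"))
```

The values in the usage lines are the sample credentials from the HTTP authentication RFCs, which makes them handy for checking an implementation.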
There are a couple things you can do...
1. Find a public implementation of NTLM in python (I don't believe one
exists... but if it does, I'd love if someone could point it out)
2. Use the NTLM Authentication Proxy Server (
3. Follow Ronald Tschalär's write-up on NTLM over HTTP and implement
it yourself ( http://www.innovation.ch/personal/ronald/ntlm.html )
I actually did this recently for a project that I'm working on... and looked
fairly deeply at Ronald's write-up... It is fairly decent... and I may
actually implement it at some point in the future as a released Python
module... for now though you'll have to do it yourself.
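
For a feel of what "doing it yourself" involves, here is a rough sketch of only the first step of the handshake: building the Type 1 (negotiate) message and wrapping it in an Authorization header. The field layout and the flag value 0xb203 follow my reading of the write-up linked above and should be checked against it; the function names are made up for this example.

```python
import base64
import struct

# Assumption: flag value taken from the NTLM-over-HTTP write-up; verify it there.
NTLM_TYPE1_FLAGS = 0xB203
HEADER_SIZE = 32  # fixed-size part of the Type 1 message


def ntlm_type1_message(host, domain):
    """Build an NTLM Type 1 message (sketch, ASCII host/domain only)."""
    host_b = host.upper().encode("ascii")
    dom_b = domain.upper().encode("ascii")
    header = struct.pack(
        "<8sIIHHIHHI",
        b"NTLMSSP\0",                 # protocol signature
        1,                            # message type 1
        NTLM_TYPE1_FLAGS,
        len(dom_b), len(dom_b),       # domain length (stated twice)
        HEADER_SIZE + len(host_b),    # domain offset: after the host string
        len(host_b), len(host_b),     # host length (stated twice)
        HEADER_SIZE,                  # host offset: right after the header
    )
    return header + host_b + dom_b


def ntlm_negotiate_header(host, domain):
    """Return the Authorization header value carrying the Type 1 message."""
    encoded = base64.b64encode(ntlm_type1_message(host, domain))
    return "NTLM " + encoded.decode("ascii")


if __name__ == "__main__":
    print(ntlm_negotiate_header("workstation", "domain"))
```

The server answers this with another 401 carrying a Type 2 challenge, and the client then has to send a Type 3 message with responses computed from the password hashes. That last step is where most of the work is, and it is not covered here.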
On 11/29/07, Mudcat <mnations at gmail.com> wrote:
> I was trying to do a simple web scraping tool, but the network they
> use at work does some type of internal authentication before it lets
> the request out of the network. As a result I'm getting the '401 -
> Authentication Error' from the application.
> I know when I use a web browser or other application that it uses the
> information from my Windows AD to validate my user before it accesses
> a website. I'm constantly getting asked to enter in this info before I
> use Firefox, and I assume that IE picks it up automatically.
> However I'm not sure how to tell the request that I'm building in my
> python script to either use the info in my AD account or enter in my
> user/pass automatically.
> Anyone know how to do this?