Web authentication urllib2
dunmer at dreams.sk
Sat Jan 24 12:59:22 CET 2009
First, thank you both.
I think this isn't basic auth, because the page has a login form.
I read the site's HTML source and used Wireshark to analyze the communication
between my browser and the website, and I did find out that I was ignoring a
parameter. I added it to the parameters, but it didn't help.
Maybe I'm still missing something.
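For checking which fields a login form really submits (hidden ones included), something like the following might help. It is only a sketch and assumes Python 3, where html.parser is in the standard library. The names FormFieldParser, list_form_fields and SAMPLE_HTML are made up for illustration, and the sample markup is hypothetical, not the real page.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the action URL and the named <input> fields of each form."""

    def __init__(self):
        super().__init__()
        self.forms = []  # one dict per <form>: {"action": ..., "fields": {...}}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({"action": attrs.get("action") or "", "fields": {}})
        elif tag == "input" and self.forms and attrs.get("name"):
            # Hidden inputs are recorded too: the server may expect them back.
            # Inputs are attached to the most recently opened form.
            self.forms[-1]["fields"][attrs["name"]] = attrs.get("value") or ""


def list_form_fields(html):
    """Return a list of forms found in the given HTML text."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.forms


# Hypothetical login page, for illustration only.
SAMPLE_HTML = """
<form action="/login" method="post">
  <input type="text" name="login">
  <input type="password" name="pwd">
  <input type="hidden" name="page" value="">
</form>
"""

for form in list_form_fields(SAMPLE_HTML):
    print(form["action"], form["fields"])
```

Comparing the printed field names and the form's action URL with what the script posts would show whether a field or the target URL is being missed.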
Here's the post packet:
and here's the code again, with a small change and the real web location added:
import urllib
import urllib2

# Keep cookies between requests so the session survives the login.
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
params = urllib.urlencode(dict(login='login', pwd='pass', page=''))
f = opener.open('https://www.orangeportal.sk/', params)
data = f.read()
The login and password are fake, of course.
Thank you in advance for any help.
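One way to see what such a script actually sends and receives, without Wireshark, is to turn on the debug output of the HTTP handlers. This is a sketch only and assumes Python 3, where the urllib2 classes live in urllib.request. The helper names build_debug_opener, encode_form and open_form are hypothetical, not part of any library.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_debug_opener():
    """Return (opener, cookie_jar); the opener prints HTTP traffic to stdout."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        # debuglevel=1 makes http.client print request and response headers.
        urllib.request.HTTPHandler(debuglevel=1),
        urllib.request.HTTPSHandler(debuglevel=1),
        urllib.request.HTTPCookieProcessor(jar),
    )
    return opener, jar


def encode_form(fields):
    """URL-encode form fields into the bytes expected for a POST body."""
    return urllib.parse.urlencode(fields).encode("ascii")


def open_form(opener, url, fields):
    """POST the fields to url; passing a body makes the request a POST."""
    return opener.open(url, encode_form(fields))


# Usage (performs a real network request, so it is left commented out):
# opener, jar = build_debug_opener()
# response = open_form(opener, "https://www.orangeportal.sk/",
#                      {"login": "login", "pwd": "pass", "page": ""})
# print(response.geturl())        # final URL after any redirects
# print([c.name for c in jar])    # cookies the site has set so far
```

Printing the final URL and the cookie names after the call gives a quick hint about whether the site accepted the login or sent the script back to the form.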
Steve Holden wrote:
> Gabriel wrote:
>> I'm new to Python and I would like to write a script which needs to log in
>> to a website. I'm experimenting with urllib2,
>> especially with something like this:
>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
>> params = urllib.urlencode(dict(username='user', password='pass'))
>> f = opener.open('https://web.com', params)
>> data = f.read()
>> The problem is that this code logs me in on some sites but not on
>> others, especially the one I really
>> need to log in to, and I don't know why. Is there some way to debug
>> this code and find out why the script cannot
>> log in on that specific site?
>> Sorry if this question is too lame, but I am really a beginner in both
>> Python and web programming .)
> That's actually pretty good code for a newcomer! There are a couple of
> issues you may be running into.
> First, not all sites use "application-based" authentication - they may
> use HTTP authentication of some kind instead. In that case you have to
> pass the username and password as a part of the HTTP headers. Michael
> Foord has done a fair write-up of the issues at
> and you will do well to read that if, indeed, you need to do basic
> authentication.
> Second, if it *is* the web application that's doing the authentication
> in the sites that are failing (in other words if the credentials are
> passed in a web form) then your code may need adjusting to use other
> field names, or to include other data as required by the login form. You
> can usually find out what's required by reading the HTML source of the
> page that contains the login form.
> Thirdly [nobody expects the Spanish Inquisition ...], it may be that
> some sites are extraordinarily sensitive to programmed login attempts
> (possibly due to spam), typically using a check of the "User-Agent:" HTTP
> header to "make sure" that the login attempt is coming from a browser
> and not a program. For sites like these you may need to emulate a
> browser response more fully.
> You can use a program like Wireshark to analyze the network traffic,
> though you can get add-ons for Firefox that will show you the HTTP
> headers on request and response.
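For the first and third points in the quoted reply (HTTP authentication via headers, and sites that check the User-Agent header), an opener could be set up roughly as below. This is a sketch under assumptions: it uses Python 3's urllib.request rather than urllib2, the function name build_browser_like_opener is hypothetical, and the URL, credentials and User-Agent string in the usage comment are placeholders.

```python
import urllib.request


def build_browser_like_opener(url, username, password, user_agent):
    """Build an opener that answers HTTP basic auth challenges for url,
    keeps cookies, and sends the given User-Agent header on every request."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None as the realm means "use these credentials for any realm at url".
    password_mgr.add_password(None, url, username, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(password_mgr),
        urllib.request.HTTPCookieProcessor(),
    )
    # Replace the default "Python-urllib/x.y" agent string.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


# Usage (placeholder values; performs a network request, so commented out):
# opener = build_browser_like_opener(
#     "https://web.com", "user", "pass",
#     "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0")
# data = opener.open("https://web.com").read()
```

The basic auth handler only comes into play if the server answers with a 401 challenge; for a form login it stays idle, so the User-Agent header and the cookie handling are the parts that matter in that case.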