Web authentication urllib2
dunmer at dreams.sk
Sat Jan 24 13:13:25 CET 2009
Oh, never mind, it's working.
> First, thank you both.
> I think this isn't basic auth, because this page has a form login.
> I read the site's HTML source and used Wireshark to analyze the communication
> between my browser and the website, and I found out that I was ignoring
> one field.
> I added it to the parameters but it didn't help.
> Maybe I'm still missing something.
> Here's the POST packet:
> And here's the code again, with a little change and the real web location added:
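> import urllib, urllib2
> # The cookie processor keeps the cookies set by the login response.
> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
> # Passing data as the second argument to open() makes this a POST.
> params = urllib.urlencode(dict(login='login', pwd='pass', page=''))
> f = opener.open('https://www.orangeportal.sk/', params)
> data = f.read()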
> Login and pass are fake, of course.
> Thank you in advance for any help.
> Steve Holden wrote:
>> Gabriel wrote:
>>> I'm new to Python and I would like to write a script which needs to log in
>>> to a website. I'm experimenting with urllib2,
>>> especially with something like this:
>>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
>>> params = urllib.urlencode(dict(username='user', password='pass'))
>>> f = opener.open('https://web.com', params)
>>> data = f.read()
>>> The problem is that this code logs me in on some sites but not on
>>> others, especially the one I really
>>> need to log in to. And I don't know why. So is there some way to debug
>>> this code and find out why the script cannot
>>> log in on that specific site?
>>> Sorry if this question is too lame, but I am really a beginner in both
>>> Python and web programming .)
>> That's actually pretty good code for a newcomer! There are a couple of
>> issues you may be running into.
>> First, not all sites use "application-based" authentication - they may
>> use HTTP authentication of some kind instead. In that case you have to
>> pass the username and password as a part of the HTTP headers. Michael
>> Foord has done a fair write-up of the issues,
>> and you will do well to read that if, indeed, you need to do basic
>> authentication.
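For the HTTP authentication case, a minimal sketch might look like the following. It assumes the server asks for basic authentication; the URL and credentials are placeholders, and the try/except import is only there so the same lines also run on Python 3, where the urllib2 classes live in urllib.request.

```python
try:
    import urllib2 as urlreq          # Python 2, as used in this thread
except ImportError:
    import urllib.request as urlreq   # Python 3 home of the same classes


def build_basic_auth_opener(url, user, password):
    """Return an opener that answers HTTP basic auth challenges for url."""
    mgr = urlreq.HTTPPasswordMgrWithDefaultRealm()
    # Realm None means: use these credentials for any realm under this URL.
    mgr.add_password(None, url, user, password)
    return urlreq.build_opener(urlreq.HTTPBasicAuthHandler(mgr))


# Placeholder URL and credentials, not a real site.
opener = build_basic_auth_opener('https://example.invalid/', 'user', 'pass')
# opener.open('https://example.invalid/protected') would retry with an
# Authorization header once the server answers 401.
```

Nothing is sent until opener.open() is called, so the opener can be built and inspected offline.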
>> Second, if it *is* the web application that's doing the authentication
>> in the sites that are failing (in other words if the credentials are
>> passed in a web form) then your code may need adjusting to use other
>> field names, or to include other data as required by the login form. You
>> can usually find out what's required by reading the HTML source of the
>> page that contains the login form.
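Reading the form by hand works, but the field names can also be listed with a few lines of code. This is only a sketch: the HTML below is a made-up login form that reuses the field names from the code earlier in the thread, not the markup of any real site, and FormFieldLister is a hypothetical helper name.

```python
try:
    from HTMLParser import HTMLParser      # Python 2
except ImportError:
    from html.parser import HTMLParser     # Python 3


class FormFieldLister(HTMLParser):
    """Collect (name, value) for every named <input> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag == 'input':
            attrs = dict(attrs)
            if 'name' in attrs:
                # A missing value attribute is reported as None.
                self.fields.append((attrs['name'], attrs.get('value') or ''))


# Hypothetical login page used only for illustration.
sample = '''
<form action="/login" method="post">
  <input type="text" name="login">
  <input type="password" name="pwd">
  <input type="hidden" name="page" value="">
  <input type="submit" value="Log in">
</form>
'''

lister = FormFieldLister()
lister.feed(sample)
print(lister.fields)
```

Hidden inputs such as the third one are easy to overlook in a browser, and every named field listed this way is a candidate for the urlencode() dictionary.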
>> Thirdly [nobody expects the Spanish Inquisition ...], it may be that
>> some sites are extraordinarily sensitive to programmed login attempts
>> (possibly due to spam), typically using a check of the "User-Agent:" HTTP
>> header to "make sure" that the login attempt is coming from a browser
>> and not a program. For sites like these you may need to emulate a
>> browser request more fully.
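If the site does check the browser identification header, one way to send a browser-like value is to build the request explicitly. A rough sketch, assuming that header is the only thing the site looks at; the URL, the field values and the User-Agent string are placeholders, and build_login_request is a hypothetical helper name.

```python
try:
    import urllib2 as urlreq                 # Python 2
    from urllib import urlencode
except ImportError:
    import urllib.request as urlreq          # Python 3
    from urllib.parse import urlencode


def build_login_request(url, fields, user_agent):
    """Build a POST request that carries a custom User-Agent header."""
    data = urlencode(fields).encode('ascii')
    return urlreq.Request(url, data, {'User-Agent': user_agent})


# Placeholder values; a real script would copy the string a browser sends.
UA = 'Mozilla/5.0 (compatible; example-script)'
req = build_login_request('https://example.invalid/login',
                          [('login', 'login'), ('pwd', 'pass'), ('page', '')],
                          UA)
# The request can then go through the cookie-aware opener: opener.open(req)
```

A list of pairs is used instead of a dict so the fields are encoded in a fixed order.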
>> You can use a program like Wireshark to analyze the network traffic,
>> though you can get add-ons for Firefox that will show you the HTTP
>> headers on request and response.
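The headers can also be dumped from inside the script by turning on the debug level of the HTTP handlers. A small sketch, assuming Python was built with SSL support so that HTTPSHandler exists; the URL in the comment is a placeholder.

```python
try:
    import urllib2 as urlreq          # Python 2
except ImportError:
    import urllib.request as urlreq   # Python 3

# debuglevel=1 makes the underlying HTTP connection print the request
# and the response headers to standard output.
opener = urlreq.build_opener(
    urlreq.HTTPHandler(debuglevel=1),
    urlreq.HTTPSHandler(debuglevel=1),
    urlreq.HTTPCookieProcessor(),
)
# opener.open('https://example.invalid/', params) would now show what is
# sent, which can be compared with the capture of the browser's login.
```

Comparing that output with the browser's POST, field by field and header by header, is usually the quickest way to spot what the script leaves out.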