Web client login with redirection and cookies
John Hunter
jdhunter at ace.bsd.uchicago.edu
Tue Aug 6 16:23:48 EDT 2002
>>>>> "John" == John <john_lewis at mindspring.com> writes:
John> Hi, I'm trying to login to an intranet site that uses
John> cookies and redirection for a web scraping script. Are
John> there any good examples of how to accomplish this in Python?
John> I recently managed to get this type of login working in
John> Perl, and am now playing around with this in Python.
John> I have only been working with Perl for 6 months casually for
John> a few database and web scraping applications for automating
John> reporting, and have been thinking about switching to Python
John> before I invest too much more time. I am already struggling
John> a bit in maintaining my fairly small amount of code as I
John> only work on it a few days out of a month and thought Python
John> might benefit me in this regard.
Without a site and the desired cookie values, I can't provide a working
example, but here is a low-level example where you can directly
manipulate the HTTP headers and set the cookie and/or referer values.
There are friendlier HTTP interfaces (see
http://groups.google.com/groups?q=FancyURLopener+cookie&ie=UTF-8&oe=UTF-8&hl=en&btnG=Google+Search)
but this should get you started.
import httplib
import socket

host = 'slashdot.org'
pathn = '/science/01/12/03/1630212.shtml'
try:
    h = httplib.HTTP(host)
    h.putrequest('GET', pathn)
    # Each putheader call adds one line to the request header
    h.putheader('Accept', 'text/html')
    h.putheader('Accept', 'text/plain')
    h.putheader('User-Agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.4) Gecko/20010913')
    h.putheader('Referer', 'http://migley.zko.dec.com/httpget.py')
    h.putheader('From', 'marshall at migley.zko.dec.com')
    # Replace with the cookie string the site expects, e.g. 'name=value'
    h.putheader('Cookie', 'mycookieval')
    # h.putheader('Referer', 'http://myserver.com')
    h.endheaders()
    errcode, errmsg, headers = h.getreply()
except socket.error, er:
    print 'socket error ', er
else:
    # Only read the body if the request went through
    print h.getfile().read()
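For the friendlier interfaces mentioned above, here is a minimal sketch, assuming a current Python 3 interpreter, where the standard library modules urllib.request and http.cookiejar can store cookies and follow redirects for you. The helper names make_opener and login are my own, and the URL and form field names in the usage note are hypothetical; this is untested against any real site, so treat it as a starting point rather than a working login.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Return (opener, jar). The opener saves cookies into jar and
    follows HTTP redirects through the default redirect handler."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    # Some sites refuse the default Python user agent string
    opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
    return opener, jar


def login(opener, login_url, fields):
    """POST the form fields to login_url (hypothetical URL and fields).

    Returns the final URL after any redirects, and the page body."""
    data = urllib.parse.urlencode(fields).encode('ascii')
    response = opener.open(login_url, data)
    return response.geturl(), response.read()
```

Usage would look something like the following, with the address and field names replaced by whatever the intranet login form actually uses:

    opener, jar = make_opener()
    url, body = login(opener, 'http://intranet.example/login',
                      {'user': 'me', 'password': 'secret'})

Later calls to opener.open() send back whatever cookies the login response set, since they are kept in jar.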
JDH