How to share session with IE
John J. Lee
jjl at pobox.com
Tue Oct 10 13:36:41 EDT 2006
"Bernard" <bernard.chhun at gmail.com> writes:
> zdp wrote:
[...]
> > However, now I need to process some pages by a python program. When I
> > use urllib.urlopen(theurl), I only get a page telling me I need to
> > log in. I think that's reasonable, because I wasn't in a logged-in
> > session as IE was.
> >
> > So how can I do my job? I want to get the right webpage from the URL. I
> > have searched the groups for answers but didn't find a clear one. Should
> > I use win32com or urllib? Any reply or information is appreciated. Hope
> > I have made this clear.
> You can do the same thing as IE on your forum using urllib2 and
> cookielib. In short you need to code a small webcrawler. I can give you
> my browser module if necessary.
> You might not have the time to fiddle with the coding part or my
> browser module, so you can also use this particularly useful module:
> http://wwwsearch.sourceforge.net/mechanize/
> The documentation is pretty clear for an initiated python programmer.
> If that's not your case, I'd recommend reading some ebooks on the python
> language first to get used to it.
In particular, if you're following the approach Bernard suggests, you
can either:

 1. Log in every time your program runs, by going through the sequence
    of clicks, pages, etc. that you would use in a browser to log in.

 2. Once only (or once a month, or whatever), log in by hand using IE
    with a "Remember me"-style feature (if the website offers that) --
    where the webapp asks the browser to save the cookie rather than
    just keeping it in memory until you close your browser.  Then your
    program can load the cookies from your real browser's cookie store
    using this:

    http://wwwsearch.sourceforge.net/mechanize/doc.html#browsers

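
Option 1 might look roughly like the sketch below. It uses the standard
library only and is not taken from mechanize. It assumes Python 3, where
urllib2 and cookielib are named urllib.request and http.cookiejar. The
form field names "username" and "password" and the helper names are
hypothetical; the real field names have to be read from the site's login
form, and some sites need extra hidden fields or a different sequence of
pages.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return (opener, jar); the opener stores and resends cookies like a browser."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, username, password):
    """Build the POST a browser would send when the login form is submitted.

    The field names "username" and "password" are hypothetical; read the
    real ones from the site's login form.
    """
    form = urllib.parse.urlencode({"username": username, "password": password})
    # Supplying data makes urllib send a POST instead of a GET.
    return urllib.request.Request(login_url, data=form.encode("ascii"))


def fetch_logged_in(login_url, page_url, username, password):
    """Log in, then fetch page_url within the same cookie session."""
    opener, _jar = make_session()
    # The login response sets the session cookie in the jar.
    opener.open(build_login_request(login_url, username, password)).close()
    with opener.open(page_url) as response:
        return response.read()
```

The point is that both requests go through the same opener, so whatever
cookie the server sets at login is sent back with the second request.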
There are other alternatives too, but they depend on knowing a little
bit more about how cookies and web apps work, and may or may not work
depending on what exactly the server does. I'm thinking specifically
here of saving *session* cookies (the kind that usually go away when
you close your browser) in a file -- but the server may not like them
when you send them back the next time, depending on how much time has
elapsed since the last run.  Of course, you can always detect the
"need to login" condition, and react accordingly.
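
A minimal sketch of that last idea, again with the Python 3 standard
library (http.cookiejar) rather than mechanize. The helper names and the
"Please log in" marker text are hypothetical; the marker should be
replaced with a string that really appears on the site's login page.

```python
import os
import urllib.request
from http.cookiejar import LWPCookieJar


def load_jar(path):
    """Load cookies saved by an earlier run, including session cookies."""
    jar = LWPCookieJar(path)
    if os.path.exists(path):
        # ignore_discard=True keeps session cookies, which are otherwise skipped.
        jar.load(ignore_discard=True)
    return jar


def save_jar(jar):
    """Write all cookies, session cookies included, back to the jar's file."""
    jar.save(ignore_discard=True)


def needs_login(html, marker="Please log in"):
    """Guess whether the server returned its login page instead of content.

    The marker text is hypothetical; use a string from the site's real
    login page.
    """
    return marker in html


def make_opener(jar):
    """Build an opener that sends and records cookies through the given jar."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

A run would then load the jar, fetch the page through make_opener(jar),
log in again only if needs_login() says the saved session was rejected,
and call save_jar() before exiting.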
John