How to share session with IE

zdp zhaodapu at gmail.com
Thu Oct 12 14:38:39 CEST 2006


I found some similar topics in the newsgroup and got some ideas from
them.
http://groups.google.com/group/comp.lang.python/browse_thread/thread/2fe0be6c386adce4
http://groups.google.com/group/comp.lang.python/browse_thread/thread/a51cec8747f64619

Based on all your suggestions, there are at least two ways to get the
result I want.

1. Use IE's cookies, so I don't need to write any login code. That
means I must use ClientCookie. I found some examples in the docs and
the newsgroup. Below is some code based on the ClientCookie docs. But
the page I get is still the one telling me I must log in (I CAN get the
right page in IE).

    import ClientCookie

    # the page I want to get
    url_string = "http://www.targetsite.com/bbs/viewthread.php?tid=12345"

    cj = ClientCookie.MSIECookieJar(delayload=True)
    cj.load_from_registry()   # locate IE's cookie files via the registry
    print cj                  # I want to know what I get

    opener = ClientCookie.build_opener(ClientCookie.HTTPCookieProcessor(cj))
    ClientCookie.install_opener(opener)
    f = ClientCookie.urlopen(url_string)
    print f.read()            # NOT the right page html
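
Since "print cj" only dumps the whole jar, a more targeted check is to
list just the cookies that would apply to the target host. The following
is a minimal sketch, not part of the original attempt: the helper names
cookies_for_domain and make_cookie are made up for illustration, the
domain matching is deliberately simplified, and a plain CookieJar filled
by hand stands in for MSIECookieJar (which I assume can be iterated the
same way). The import fallback is an assumption for newer Python
versions, where cookielib is named http.cookiejar.

```python
try:
    import cookielib                      # Python 2
except ImportError:
    import http.cookiejar as cookielib    # same module under its Python 3 name


def cookies_for_domain(jar, domain):
    """Return (name, value) pairs of cookies in `jar` that match `domain`.

    Simplified matching: exact host, or the cookie domain is a parent of it.
    """
    wanted = domain.lstrip(".").lower()
    found = []
    for cookie in jar:                    # CookieJar objects are iterable
        cookie_domain = cookie.domain.lstrip(".").lower()
        if wanted == cookie_domain or wanted.endswith("." + cookie_domain):
            found.append((cookie.name, cookie.value))
    return found


def make_cookie(name, value, domain):
    """Build a session cookie by hand, only to have demo data (hypothetical)."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


# Stand-in jar; in the real program this would be the jar loaded from IE.
jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("sid", "abc123", ".targetsite.com"))
jar.set_cookie(make_cookie("other", "x", "www.example.org"))

matches = cookies_for_domain(jar, "www.targetsite.com")
```

If the list comes back empty for the forum's host, the login cookie was
probably never written to disk (for example, a session-only cookie), and
no opener setup will help.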


2. Log in myself from Python. First, I access the login page and submit
the form with the username and password. The form has many fields other
than username and password, so the dict "data" contains all the fields,
including the hidden ones. Then, if the login succeeds, I can get my
page using the opener with the CookieJar.

    import urllib, urllib2, cookielib   # urllib is needed for urlencode

    # the page I want to get
    url_string = "http://www.targetsite.com/bbs/viewthread.php?tid=12345"
    # the login page
    url_login = "http://www.targetsite.com/bbs/logging.php?action=login"

    headers = {'User-agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

    urllib2.install_opener(opener)
    # all the fields of the login form, including the hidden ones
    data = {
        'formhash': '3bd8bc0a',
        "referer": "index.php",
        "loginfield": "username",
        'username': 'myname',
        'password': 'mypass',
        "questionid": 0,
        "answer": "",
        "cookietime": "315360000",
        "loginmode": "",
        "styleid": ""
        }
    req = urllib2.Request(url_login, urllib.urlencode(data), headers)
    f = opener.open(req)
    print req.get_data()
    print req.header_items()
    print f.info()
    print f.read()

    ## if login succeeds, I can get my page
    f = opener.open(url_string)
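
The hidden form fields in "data" were copied by hand from one view of
the login page. If the site issues a fresh value (such as formhash) on
each visit, which is only a guess and not something checked against the
real site, a hard-coded value would be rejected. A minimal sketch of
reading the hidden fields from the page instead is shown here. The
sample HTML is invented for illustration, and HiddenFieldParser and
hidden_fields are hypothetical names; in the real program the HTML
would come from opener.open(url_login).read().

```python
try:
    from HTMLParser import HTMLParser     # Python 2
except ImportError:
    from html.parser import HTMLParser    # same class under its Python 3 name


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)               # attribute names arrive lowercased
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in `html`."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Invented stand-in for the real login page.
SAMPLE_LOGIN_PAGE = """
<form method="post" action="logging.php?action=login">
  <input type="hidden" name="formhash" value="3bd8bc0a">
  <input type="hidden" name="referer" value="index.php">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

fields = hidden_fields(SAMPLE_LOGIN_PAGE)
# Add the visible fields on top of whatever the page supplied.
fields.update({"username": "myname", "password": "mypass"})
```

The resulting dict can be passed to urlencode in place of the
hand-written "data", using the same opener for both requests so that
any cookie set by the login page is sent back with the form.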


However, neither way worked for me. I don't know what's wrong. Is it
because the server checks the headers, or because the form submission
is wrong?

I haven't studied the mechanize module yet. I want a solution that is
as simple as possible, for distribution reasons.

John J. Lee wrote:

> "Bernard" <bernard.chhun at gmail.com> writes:
> > zdp wrote:
> [...]
> > > However, now I need to process some pages with a python program. When
> > > I use urllib.urlopen(theurl), I only get a page which tells me I need
> > > to log in. I think that's reasonable, because I wasn't in a logged-in
> > > session as IE was.
> > >
> > > So how can I do my job? I want to get the right webpage by the url. I
> > > have searched the groups for answers but didn't get a clear one. Should
> > > I use win32com or urllib? Any reply or information is appreciated. Hope
> > > I made it clear.
>
> > You can do the same thing as IE on your forum using urllib2 and
> > cookielib. In short you need to code a small webcrawler. I can give you
> > my browser module if necessary.
> > You might not have the time to fiddle with the coding part or my
> > browser module so you can also use this particularly useful module :
> > http://wwwsearch.sourceforge.net/mechanize/
> > The documentation is pretty clear for an initiated python programmer.
> > If it's not your case, I'd recommend reading some ebooks on the python
> > language first to get used to it.
>
> In particular, if you're following the approach Bernard suggests, you
> can either:
>
> 1. Log in every time your program runs, by going through the sequence
>    of clicks, pages, etc. that you would use in a browser to log in.
>
> 2. Once only (or once a month, or whatever), log in by hand using IE
>    with a "Remember me"-style feature (if the website offers that) --
>    where the webapp asks the browser to save the cookie rather than
>    just keeping it in memory until you close your browser.  Then your
>    program can load the cookies from your real browser's cookie store
>    using this:
>
> http://wwwsearch.sourceforge.net/mechanize/doc.html#browsers
>
>
> There are other alternatives too, but they depend on knowing a little
> bit more about how cookies and web apps work, and may or may not work
> depending on what exactly the server does.  I'm thinking specifically
> here of saving *session* cookies (the kind that usually go away when
> you close your browser) in a file -- but the server may not like them
> when you send them back the next time, depending how much time has
> elapsed since the last run.  Of course, you can always detect the
> "need to login" condition, and react accordingly.
> 
> 
> John
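
The last alternative in the quoted reply (keeping session cookies in a
file between runs and detecting the "need to login" condition) might
look roughly like the sketch below. It uses LWPCookieJar with
ignore_discard=True so that session cookies are written out too. The
function names, the temporary file, the hand-built cookie and the
login-page marker string are all assumptions made for illustration;
the server may still reject old cookies, as noted above.

```python
import os
import tempfile

try:
    import cookielib                      # Python 2
except ImportError:
    import http.cookiejar as cookielib    # same module under its Python 3 name


def save_session(jar, path):
    """Write all cookies to `path`, including session (discard) cookies."""
    jar.save(path, ignore_discard=True, ignore_expires=True)


def load_session(path):
    """Return an LWPCookieJar filled from `path`, or an empty one."""
    jar = cookielib.LWPCookieJar()
    if os.path.exists(path):
        jar.load(path, ignore_discard=True, ignore_expires=True)
    return jar


def needs_login(html, marker="logging.php?action=login"):
    """Guess whether `html` is the login prompt (the marker is an assumption)."""
    return marker in html


# Demo with a hand-built session cookie instead of a real login.
session_cookie = cookielib.Cookie(
    version=0, name="sid", value="abc123",
    port=None, port_specified=False,
    domain=".targetsite.com", domain_specified=True, domain_initial_dot=True,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={})

fd, cookie_file = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    first_run = cookielib.LWPCookieJar()
    first_run.set_cookie(session_cookie)
    save_session(first_run, cookie_file)      # end of the first run

    restored = load_session(cookie_file)      # start of the next run
finally:
    os.remove(cookie_file)
```

On each run the program would load the jar, fetch the page, and only go
through the login form again when needs_login() reports that the server
no longer accepts the saved cookies.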



