MSIE6 Python Question

Andy Baker andy at andybak.net
Mon May 24 08:55:57 EDT 2004


The site might be checking your User-Agent string. urllib should let you
choose which browser it identifies itself as. Simply match the User-Agent of
a known version of IE and see if that works.
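
Something along these lines might do it. This is only a rough sketch, written against the Python 3 urllib.request module rather than the urllib of this thread. The User-Agent value is an example of the kind of string IE6 on Windows XP sends, and build_request / fetch_source are names I made up for illustration.

```python
import urllib.request

# Assumed example value: a User-Agent string of the kind sent by IE6 on
# Windows XP. Substitute whatever your own copy of IE actually sends.
IE6_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_request(url, user_agent=IE6_USER_AGENT):
    """Build a request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_source(url, user_agent=IE6_USER_AGENT, timeout=10):
    """Fetch the raw page source (as bytes) using the spoofed User-Agent."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch_source('http://www.python.org') should then return the page as the server chooses to send it to that User-Agent. If the data is still blanked out, the site is probably checking something else as well.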

> -----Original Message-----
> From: python-list-bounces+andy=andybak.net at python.org 
> [mailto:python-list-bounces+andy=andybak.net at python.org] On 
> Behalf Of Ralph A. Gable
> Sent: 24 May 2004 12:25
> To: python-list at python.org
> Subject: Re: MSIE6 Python Question
> 
> "Kevin T. Ryan" <kevryan0701 at yahoo.com> wrote in message 
> news:<40b1697d$0$3131$61fed72c at news.rcn.com>...
> > Ralph A. Gable wrote:
> > 
> > > I'm a newbie at this but I need to control MSIE6 using Python. I
> > > have read the O'Reilly win32 python books and got some hints. But I
> > > need to Navigate to a site (which I know how to do) and then I need
> > > to get at the source code for that site inside Python (as when one
> > > used the View|Source drop down window). Can anyone point me to some
> > > URLs that would help out? Or just tell me how to do it? I would be
> > > very grateful.
> > 
> > I'm not sure why you need to go through IE, but maybe this will get 
> > you into the right direction:
> > 
> > >>> import urllib
> > >>> f = urllib.urlopen('http://www.python.org')
> > >>> f.readline()
> >  '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
> > >>> f.readline()
> >  '                      "http://www.w3.org/TR/html4/loose.dtd" >\n'
> > >>>
> > 
> > You could do:
> > for line in f:
> >         process(line)
> > 
> > just like you can with a file.  Check the urllib, urllib2, and other
> > related modules (maybe httplib).  Hope that helps.
> 
> 
> Sorry. I forgot to mention that I have tried that. The data I 
> want is being stripped out when I access the URL via urllib. 
> I CAN see the data when I go into IE and do view source but 
> when I use urllib the site intentionally blanks out the 
> information I want. For that reason, I would like to get it 
> using IE6 if I can. If there are other ways to fake out the 
> site, I would be interested in that also. I thought that 
> perhaps the site was detecting the fact that I was not 
> querying it using a browser. I tried putting that into 
> the HTTP messages but may not have done it right. At any rate 
> I couldn't get that to work. It may be that the site is using 
> cookies to be sure someone is not getting the data. I haven't 
> pursued that. Again that is another reason I wanted to use 
> IE6 (since I know it works). The data is on a site to which I 
> subscribe to a service. But the particular information is 
> available to anyone if he/she types in the url (as long as 
> you are using a browser).

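On the cookie theory mentioned in the quoted message: if the site hands out a cookie on one page and expects it back on the next, a plain fetch will not return it. Below is a minimal sketch, again assuming Python 3 (http.cookiejar and urllib.request). build_browser_like_opener is a hypothetical helper name, and you pass in whatever User-Agent string you want to present.

```python
import http.cookiejar
import urllib.request


def build_browser_like_opener(user_agent):
    """Return (opener, jar): an opener that keeps cookies between requests
    and sends the given User-Agent header on every request."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default "Python-urllib/x.y" header with the chosen one.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar
```

Open the site's entry page first with opener.open(...), so that any cookies it sets land in the jar, then open the page you actually want with the same opener. Whether that is enough depends on what the site really checks, which the thread does not tell us.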