MSIE6 Python Question
Andy Baker
andy at andybak.net
Mon May 24 08:55:57 EDT 2004
The site might be checking your User-Agent string. urllib should let you
choose which browser it identifies itself as. Try sending the User-Agent of
a known version of IE and see if that works.
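
A rough sketch of what that could look like follows. It assumes Python 3's
urllib.request rather than the older urllib module used in the quoted
example, and the helper names (build_ie6_request, fetch_source) are made up
for illustration. The User-Agent value is a typical IE6-on-Windows-XP string;
the site may check for something else entirely (cookies, for instance), so
treat this as something to try rather than a guaranteed fix.

```python
# Sketch: send an IE6-style User-Agent header with a urllib request.
# Assumes Python 3 (urllib.request); helper names are hypothetical.
import urllib.request

# A typical User-Agent string sent by IE6 on Windows XP.
IE6_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_ie6_request(url):
    """Return a Request for url that identifies itself as IE6."""
    return urllib.request.Request(url, headers={"User-Agent": IE6_USER_AGENT})


def fetch_source(url):
    """Fetch the page and return its raw bytes (makes a network call)."""
    with urllib.request.urlopen(build_ie6_request(url)) as response:
        return response.read()
```

Calling fetch_source('http://www.python.org') should return the page source
as bytes. If the data is still blanked out, the User-Agent is probably not
the only thing the site is checking.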
> -----Original Message-----
> From: python-list-bounces+andy=andybak.net at python.org
> [mailto:python-list-bounces+andy=andybak.net at python.org] On
> Behalf Of Ralph A. Gable
> Sent: 24 May 2004 12:25
> To: python-list at python.org
> Subject: Re: MSIE6 Python Question
>
> "Kevin T. Ryan" <kevryan0701 at yahoo.com> wrote in message
> news:<40b1697d$0$3131$61fed72c at news.rcn.com>...
> > Ralph A. Gable wrote:
> >
> > > I'm a newbie at this but I need to control MSIE6 using Python. I
> > > have read the O'Reilly win32 python books and got some hints. But I
> > > need to Navigate to a site (which I know how to do) and then I need
> > > to get at the source code for that site inside Python (as when one
> > > used the View|Source drop down window). Can anyone point me to some
> > > URLs that would help out? Or just tell me how to do it? I would be
> > > very grateful.
> >
> > I'm not sure why you need to go through IE, but maybe this will get
> > you going in the right direction:
> >
> > >>> import urllib
> > >>> f = urllib.urlopen('http://www.python.org')
> > >>> f.readline()
> > '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
> > >>> f.readline()
> > ' "http://www.w3.org/TR/html4/loose.dtd" >\n'
> > >>>
> >
> > You could do:
> >     for line in f:
> >         process(line)
> >
> > just like you can with a file. Check the urllib, urllib2, and other
> > related modules (maybe httplib). Hope that helps.
>
>
> Sorry. I forgot to mention that I have tried that. The data I
> want is being stripped out when I access the URL via urllib.
> I CAN see the data when I go into IE and do view source but
> when I use urllib the site intentionally blanks out the
> information I want. For that reason, I would like to get it
> using IE6 if I can. If there are other ways to fake out the
> site, I would be interested in that also. I thought that
> perhaps the site was detecting the fact that I was not
> querying it using a browser. I tried putting that into
> the HTTP messages but may not have done it right. At any rate
> I couldn't get that to work. It may be that the site is using
> cookies to be sure someone is not getting the data. I haven't
> pursued that. Again that is another reason I wanted to use
> IE6 (since I know it works). The data is on a site where I
> subscribe to a service. But the particular information is
> available to anyone if he/she types in the url (as long as
> you are using a browser).