urllib - changing the user agent

chitanom chantavong tonym1972 at club-internet.fr
Fri Jan 9 18:45:12 CET 2004


"Fuzzyman" <michael at foord.net> wrote in message
news:8089854e.0401090509.3dd74859 at posting.google.com...
> I'm writing a function that will query the comp.lang.python newsgroup
> via Google Groups. (I haven't got NNTP access from work.)
>
> I'm using urllib (for the first time), and Google doesn't seem very
> keen to let me search the group from within a program - the returned
> pages all tell me 'you're not allowed to do that' :-)
>
> I read in the urllib manual pages :
>
> class URLopener( [proxies[, **x509]])
>
> Base class for opening and reading URLs. Unless you need to support
> opening objects using schemes other than http:, ftp:, gopher: or
> file:, you probably want to use FancyURLopener.
> By default, the URLopener class sends a User-Agent: header of
> "urllib/VVV", where VVV is the urllib version number. Applications can
> define their own User-Agent: header by subclassing URLopener or
> FancyURLopener and setting the instance attribute version to an
> appropriate string value before the open() method is called.
>
>
> Could anyone tell me how to subclass this correctly with the version
> attribute set, and what text string I should use to mimic Internet
> Explorer and/or Mozilla?
>
>
> Ta
>
> Fuzzy

I normally use something like this for crawling webpages.

import urllib
urllib.URLopener.version = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; T312461)'
urllib.FancyURLopener.prompt_user_passwd = lambda self, host, realm: (None, None)
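
For readers on Python 3, where urllib was split into submodules, a roughly equivalent approach is to set the User-Agent header on each request instead of patching the opener class. This is a different technique from the one in the snippet above, sketched under the assumption of a newer Python than the one this thread was written for. The helper name make_request and the example.com URL are hypothetical placeholders, not part of the original post.

```python
import urllib.request

# Same User-Agent string as in the snippet above.
USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; T312461)'


def make_request(url, user_agent=USER_AGENT):
    """Build a Request carrying a custom User-Agent header (hypothetical helper)."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


# Building the request does not touch the network.
req = make_request('http://example.com/')  # placeholder URL

# To actually fetch the page, pass the request to urlopen:
#     html = urllib.request.urlopen(req).read()
```

Setting the header per request keeps the change local, whereas assigning to the class attribute affects every opener in the process.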

Anthony McDonald




