[Tutor] fetching wikipedia articles

Alan Gauld alan.gauld at btinternet.com
Fri Jan 23 12:23:11 CET 2009


"Andre Engels" <andreengels at gmail.com> wrote

> developers of Wikimedia why this is done, but for now you can resolve
> this by editing robotparser.py in the following way:
>
> In the __init__ of the class URLopener, add the following at the end:
>
> self.addheaders = [header for header in self.addheaders if header[0]
> != "User-Agent"] + [('User-Agent', '<whatever>')]

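Rather than editing the existing code and making it non-standard,
why not subclass the opener class from robotparser:

import robotparser

class WP_URLopener(robotparser.URLopener):
    def __init__(self, *args, **kwargs):
        # Let the original class set itself up, then adjust the headers
        robotparser.URLopener.__init__(self, *args, **kwargs)
        self.addheaders = .......blah....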

That's one of the advantages of OOP: you can change the way
classes work without modifying the original code, and thus without
breaking any code that relies on the original behaviour.
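
To make the idea concrete, here is a rough sketch of the same approach
for Python 3, where the module lives at urllib.robotparser and no longer
has a URLopener class, so the sketch overrides read() instead. The class
name UserAgentRobotParser, the build_request() helper, and the
user_agent parameter are all made up for illustration; treat this as one
possible way to do it, not the definitive one.

```python
import urllib.error
import urllib.request
import urllib.robotparser


class UserAgentRobotParser(urllib.robotparser.RobotFileParser):
    """RobotFileParser that fetches robots.txt with a custom User-Agent.

    Hypothetical subclass: the standard library class is left untouched.
    """

    def __init__(self, url="", user_agent="<whatever>"):
        super().__init__(url)
        self.user_agent = user_agent  # assumed attribute, not in the stdlib

    def build_request(self):
        # Hypothetical helper: wrap the robots.txt URL with our own header.
        return urllib.request.Request(
            self.url, headers={"User-Agent": self.user_agent}
        )

    def read(self):
        # Same overall flow as the stdlib read(), but using our request.
        try:
            response = urllib.request.urlopen(self.build_request())
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif 400 <= err.code < 500:
                self.allow_all = True
        else:
            raw = response.read()
            self.parse(raw.decode("utf-8").splitlines())
```

You would then create UserAgentRobotParser(url, user_agent="...") and
call read() and can_fetch() on it just as you would on the ordinary
parser, while any other code using the standard class is unaffected.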

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 



