How do I correctly download Wikipedia pages?
taskinoor.hasan at csebuet.org
Thu Nov 26 05:37:39 CET 2009
I faced a different problem. Whenever I tried to fetch any page from
Wikipedia, I received a 403. Then I found that Wikipedia doesn't accept the
default user-agent (might be Python-urllib/2.x or something like this). After
setting my own user-agent, it worked fine. You can try this if you receive
a 403 too.
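
The user-agent workaround described above can be sketched roughly as follows.
This is only a minimal illustration, assuming Python 3's urllib.request (the
successor to urllib2); the USER_AGENT string and the helper names
build_request and fetch are made up for this example and are not from the
original message.

```python
import urllib.request

# Hypothetical identifier; replace it with something that describes your own script.
USER_AGENT = "MyWikiFetcher/0.1 (contact: you@example.org)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that sends an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=USER_AGENT):
    """Download the page and return its raw bytes."""
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        return response.read()


if __name__ == "__main__":
    # Performs a real network request when run as a script.
    html = fetch("https://en.wikipedia.org/wiki/Python_(programming_language)")
    print(len(html), "bytes received")
```

If the server still answers with a 403, fetch raises urllib.error.HTTPError,
so it may be worth catching that and checking its code attribute.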
On Thu, Nov 26, 2009 at 10:04 AM, Stephen Hansen <apt.shansen at gmail.com> wrote:
> 2009/11/25 Steven D'Aprano <steven at remove.this.cybersource.com.au>
> I'm trying to scrape a Wikipedia page from Python. Following instructions
> Have you checked out http://meta.wikimedia.org/wiki/Pywikipediabot?
> It's not just via urllib, but I've scraped several MediaWiki-based sites
> with the software successfully.