How do I correctly download Wikipedia pages?
cousinstanley at gmail.com
Thu Nov 26 18:38:00 CET 2009
> I'm trying to scrape a Wikipedia page from Python.
On occasion I use a program under Debian Linux
called wikipedia2text, which is very handy
for downloading Wikipedia pages as plain text files.
Description: displays Wikipedia articles on the command line
This script fetches Wikipedia articles (currently supports
around 30 Wikipedia languages) and displays them as plain text
in a pager or just sends the text to standard out. Alternatively
it opens the Wikipedia article in a (possibly GUI) web browser
or just shows the URL of the appropriate Wikipedia article.
Example using lynx as the browser, with the output redirected to a file:
wp2t -b lynx gorilla > gorilla.txt
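Since the original question was about doing this from Python, here is a minimal sketch that runs the same command through the subprocess module. It assumes wp2t is installed and on the PATH. The function names build_wp2t_command and save_article are made up for illustration and are not part of wikipedia2text.

```python
import shutil
import subprocess


def build_wp2t_command(article, browser="lynx", executable="wp2t"):
    """Return the argument list equivalent to: wp2t -b <browser> <article>."""
    return [executable, "-b", browser, article]


def save_article(article, out_path, browser="lynx", executable="wp2t"):
    """Run wp2t and write its standard output to out_path.

    Assumes the wp2t script from the wikipedia2text package is installed.
    """
    if shutil.which(executable) is None:
        raise FileNotFoundError(executable + " not found on PATH")
    cmd = build_wp2t_command(article, browser, executable)
    # Same effect as the shell redirection "> gorilla.txt" above.
    with open(out_path, "w", encoding="utf-8") as fh:
        subprocess.run(cmd, stdout=fh, check=True)


# Usage (needs wp2t and network access):
# save_article("gorilla", "gorilla.txt")
```

Passing the arguments as a list avoids going through a shell, so article names with spaces or quotes need no extra escaping.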
Stanley C. Kitching