How do I correctly download Wikipedia pages?

Cousin Stanley cousinstanley at gmail.com
Thu Nov 26 12:38:00 EST 2009


> I'm trying to scrape a Wikipedia page from Python.
> ....

  On occasion I use a program under Debian Linux
  called wikipedia2text, which is very handy
  for downloading Wikipedia pages as plain text files ....

    Description: displays Wikipedia articles on the command line
 
    This script fetches Wikipedia articles (currently supports 
    around 30 Wikipedia languages) and displays them as plain text 
    in a pager or just sends the text to standard out. Alternatively 
    it opens the Wikipedia article in a (possibly GUI) web browser 
    or just shows the URL of the appropriate Wikipedia article.

  Example directed through the lynx browser .... 

    wp2t -b lynx gorilla > gorilla.txt
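
  Since the original question was about doing this from Python,
  the same command could be driven from a script. What follows is
  only a rough sketch, not a tested recipe: it assumes the wp2t
  command shown above is installed and on the PATH, and the
  function names build_command and save_article are made up
  for illustration.

```python
import shutil
import subprocess


def build_command(article, browser="lynx", program="wp2t"):
    """Return the argument list for: wp2t -b <browser> <article>."""
    # Mirrors the shell example above; no other flags are assumed.
    return [program, "-b", browser, article]


def save_article(article, out_path, browser="lynx", program="wp2t"):
    """Run the command and write whatever it prints to out_path."""
    # Fail early with a clear message if the program is not installed.
    if shutil.which(program) is None:
        raise FileNotFoundError(program + " not found on PATH")
    result = subprocess.run(
        build_command(article, browser, program),
        capture_output=True,   # collect stdout instead of using a shell redirect
        encoding="utf-8",      # assumption: the tool emits UTF-8 text
        errors="replace",
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(result.stdout)
    return out_path
```

  Calling save_article("gorilla", "gorilla.txt") would then be
  the rough equivalent of the shell redirect shown above.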


-- 
Stanley C. Kitching
Human Being
Phoenix, Arizona



