[Tutor] extract plain english words from html

bob bgailer at alum.rpi.edu
Sat Oct 15 02:06:48 CEST 2005

At 03:50 PM 10/14/2005, Marc Buehler wrote:
>i have a ton of html files from which i want to
>extract the plain english words, and then write
>those words into a single text file.

BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/) will parse the
HTML, let you step from tag to tag, and extract the text, with almost no
effort on your part.
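
For anyone who wants to see the tag-to-tag idea without installing anything,
here is a minimal sketch that uses the standard library's html.parser module
as a stand-in for BeautifulSoup. It is not BeautifulSoup's API. The names
TextExtractor, extract_words and write_words are made up for illustration,
and "plain english words" is assumed to mean runs of ASCII letters, with an
optional apostrophe part such as "it's".

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects text between tags, ignoring <script> and <style> bodies."""

    SKIP = {"script", "style"}  # assumption: their contents are never prose

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Called once for each run of text found between tags.
        if not self._skip_depth:
            self.chunks.append(data)


def extract_words(html):
    """Return the words found in the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join with spaces so text from adjacent tags does not run together.
    text = " ".join(parser.chunks)
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def write_words(html_paths, out_path):
    """Write the words from every HTML file into one text file, one per line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in html_paths:
            # errors="replace" is an assumption: the files' encoding is unknown.
            html = Path(path).read_text(encoding="utf-8", errors="replace")
            for word in extract_words(html):
                out.write(word + "\n")
```

Something like write_words(Path("pages").glob("*.html"), "words.txt") would
then cover a whole directory, where "pages" and "words.txt" are placeholder
names. Joining chunks with a space means a word split by inline markup, such
as <b>H</b>ello, comes out as two pieces; that seemed the safer trade-off
compared with gluing neighbouring paragraphs together.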

