[Tutor] Extract main text from HTML document
robertvstepp at gmail.com
Sat May 5 17:43:01 EDT 2018
On Sat, May 5, 2018 at 12:59 PM, Simon Connah <scopensource at gmail.com> wrote:
> I was wondering if there was a way in which I could download a web
> page and then just extract the main body of text without all of the
I do not have any experience with this, but I like to collect books.
One of them says on page 245:
"Beautiful Soup is a module for extracting information from an HTML
page (and is much better for this purpose than regular expressions)."
I believe this topic has come up before on this list as well as the
main Python list. You may want to check it out. It can be installed
"Automate the Boring Stuff with Python -- Practical Programming
for Total Beginners" by Al Sweigart.
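
For what it is worth, here is a minimal sketch of how Beautiful Soup might be used to pull the readable text out of a page. It is not taken from the book. It assumes the beautifulsoup4 package is installed, and the function name extract_main_text, the sample HTML, and the choice to look for a <main> or <article> element before falling back to <body> are all my own assumptions for illustration. Real pages vary a lot, so treat this as a starting point only.

```python
from bs4 import BeautifulSoup


def extract_main_text(html):
    """Return the visible text of an HTML document as one string.

    Hypothetical helper for illustration; the heuristics are assumptions.
    """
    # "html.parser" is the parser that ships with Python itself.
    soup = BeautifulSoup(html, "html.parser")

    # Drop elements whose contents are never displayed as page text.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()

    # Assumption: prefer <main> or <article> if present, else use <body>.
    container = soup.find("main") or soup.find("article") or soup.body or soup

    # Join the remaining text nodes, one per line, trimming whitespace.
    return container.get_text(separator="\n", strip=True)


# A small made-up page so the sketch runs without any network access.
sample = """
<html><head><title>Demo</title><style>p {color: red}</style></head>
<body><nav>Home | About</nav>
<main><h1>Heading</h1><p>First paragraph.</p><script>var x = 1;</script></main>
</body></html>
"""

print(extract_main_text(sample))
```

To run this against a downloaded page, the HTML string could come from something like urllib.request.urlopen(url).read() in place of the sample text.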