[Tutor] htmllib

Kent Johnson kent37 at tds.net
Wed Oct 5 13:52:36 CEST 2005


Ed Singleton wrote:
> I want to dump a html file into a python object.  Each nested tag
> would be a sub-object, attributes would be properties.  So that I can
> use Python in a similar way to the way I use JavaScript within a web
> page.

I don't know of a way to run Python from within a web page. But if you want to fetch an HTML page from a server and work with it (for example in a web-scraping app), many people use BeautifulSoup for this. If you have well-formed HTML or XHTML you can use an XML parser as well, but BeautifulSoup has the advantage of coping with badly formed HTML.
http://www.crummy.com/software/BeautifulSoup/
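
As a rough sketch of the XML-parser route, assuming the page really is well-formed XHTML: the example below uses xml.etree.ElementTree from the standard library, which is just one possible XML parser and my choice for illustration. The sample markup and the variable names are made up. Nested tags show up as child elements and attributes are read with .get(), which is close to the object-style access described in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page. It must be well-formed, or the XML parser
# will raise xml.etree.ElementTree.ParseError.
PAGE = """<html>
  <head><title>Example</title></head>
  <body>
    <p class="intro">Hello <a href="http://example.com/">world</a></p>
    <p>Second paragraph</p>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# Nested tags are child elements, reachable by a simple path.
title = root.find("head/title").text
first_p = root.find("body/p")
link = first_p.find("a")

print(title)
print(first_p.get("class"))   # attributes are looked up by name
print(link.get("href"))

# Walk every <p> element anywhere in the tree.
paragraphs = list(root.iter("p"))
print(len(paragraphs))
```

This only works when the markup parses as XML. For the badly formed HTML found on most real sites, the parser will stop with an error, and that is the case where BeautifulSoup is the better fit.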

Kent


