HTML parsing/scraping & python
fuzzyman at gmail.com
Thu Dec 1 09:46:05 CET 2005
The standard library module for fetching HTML is urllib2.
The best module for scraping the HTML is BeautifulSoup.
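
To give a feel for the fetch-then-scrape split, here is a rough sketch in current Python 3, where urllib2 lives on as urllib.request. It deliberately uses only the standard library, so html.parser stands in for BeautifulSoup, which would give you a friendlier API for the same job. The names LinkCollector, extract_links and fetch_html are made up for this example, and the UTF-8 default is an assumption about the page.

```python
# Python 3 sketch: urllib.request is the successor to urllib2.
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return every link target found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_html(url, encoding="utf-8"):
    """Fetch a page and decode it to text.

    Assumes the page is UTF-8 unless told otherwise; bytes that cannot
    be decoded are replaced rather than raising an error.
    """
    with urlopen(url) as response:
        return response.read().decode(encoding, errors="replace")
```

Typical use would be extract_links(fetch_html(some_url)); the parsing half also works on any HTML string you already have.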
There is a project called mechanize, built by John Lee on top of
urllib2 and other standard modules.
It will emulate a browser's behaviour - including history, cookies,
basic authentication, etc.
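
For a sense of the standard-library pieces involved, the sketch below wires cookie handling and basic authentication into an opener by hand. This is not mechanize's own API, only an illustration in Python 3 (urllib.request and http.cookiejar, the successors to urllib2 and cookielib). The function name make_opener, its parameters and the User-Agent string are invented for the example.

```python
from http.cookiejar import CookieJar
from urllib.request import (
    HTTPBasicAuthHandler,
    HTTPCookieProcessor,
    HTTPPasswordMgrWithDefaultRealm,
    build_opener,
)


def make_opener(base_url=None, username=None, password=None):
    """Build an opener that keeps cookies and, optionally, sends basic auth.

    Hypothetical helper: returns the opener together with its cookie jar
    so the caller can inspect cookies set by the server.
    """
    jar = CookieJar()
    handlers = [HTTPCookieProcessor(jar)]

    if username is not None:
        # realm=None means "use these credentials for any realm under base_url".
        passwords = HTTPPasswordMgrWithDefaultRealm()
        passwords.add_password(None, base_url, username, password)
        handlers.append(HTTPBasicAuthHandler(passwords))

    opener = build_opener(*handlers)
    # Placeholder User-Agent; replace with something that identifies your script.
    opener.addheaders = [("User-Agent", "example-scraper/0.1")]
    return opener, jar
```

Calling opener.open(url) then behaves like urlopen, except that cookies persist across requests made through the same opener.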
There are several modules for automated form filling - FormEncode being
All the best,