HTML parsing/scraping & python
Fuzzyman
fuzzyman at gmail.com
Thu Dec 1 03:46:05 EST 2005
The standard library module for fetching HTML is urllib2.
The best module for scraping the HTML is BeautifulSoup.
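A minimal sketch of the fetch-then-scrape pattern, under a few assumptions: in Python 3 the urllib2 functionality lives in urllib.request, and the standard library's html.parser stands in for BeautifulSoup here so the example runs without installing anything. The names fetch_html, extract_links, LinkCollector and SAMPLE_URL are made up for illustration, and the data: URL is only there so nothing touches the network.

```python
from html.parser import HTMLParser
from urllib.parse import quote
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Fetch a URL and decode the body using the declared charset, if any."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Illustrative page, served from a data: URL instead of a real site.
SAMPLE = ('<html><body><a href="/a.html">A</a> '
          '<a href="http://example.com/b">B</a></body></html>')
SAMPLE_URL = "data:text/html;charset=utf-8," + quote(SAMPLE)
```

Calling extract_links(fetch_html(SAMPLE_URL)) returns the two href values in document order. With BeautifulSoup installed, the LinkCollector class would be replaced by a soup object and a search for "a" tags, which also copes better with badly broken markup.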
There is a project called mechanize, built by John Lee on top of
urllib2 and other standard modules.
It will emulate a browser's behaviour - including history, cookies,
basic authentication, etc.
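For a rough idea of what such a layer adds, here is a sketch of cookie handling and basic authentication using only the standard library (Python 3 names). This is not mechanize's API; make_browser_like_opener, the User-Agent string and the example.com URL are all hypothetical.

```python
from http.cookiejar import CookieJar
from urllib.request import (
    HTTPBasicAuthHandler,
    HTTPCookieProcessor,
    HTTPPasswordMgrWithDefaultRealm,
    build_opener,
)


def make_browser_like_opener(base_url=None, user=None, password=None):
    """Build an opener that keeps cookies and, optionally, answers
    HTTP basic authentication challenges for URLs under base_url."""
    jar = CookieJar()
    handlers = [HTTPCookieProcessor(jar)]
    if base_url and user is not None:
        passwords = HTTPPasswordMgrWithDefaultRealm()
        # realm=None means "use these credentials for any realm".
        passwords.add_password(None, base_url, user, password)
        handlers.append(HTTPBasicAuthHandler(passwords))
    opener = build_opener(*handlers)
    # Hypothetical User-Agent; many sites treat the default one differently.
    opener.addheaders = [("User-Agent", "example-scraper/0.1")]
    return opener, jar


# Hypothetical site and credentials, for illustration only.
opener, jar = make_browser_like_opener("http://example.com/", "user", "secret")
```

Every request made with opener.open(...) then sends back whatever cookies earlier responses set, which is the part of browser behaviour that most scraping jobs need. History and form handling are not covered by this sketch.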
There are several modules for automated form filling - FormEncode being
one.
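Whichever module does the filling, submitting a form mostly comes down to URL-encoding the field values and sending them in a POST. A small standard-library sketch (Python 3 names), assuming a plain urlencoded form; build_form_request, the field names and the URL are invented for the example and this is not FormEncode's API.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_request(action_url, fields):
    """Build a POST request carrying the fields as an urlencoded form body."""
    body = urlencode(fields).encode("ascii")
    return Request(
        action_url,
        data=body,  # supplying data makes urllib send a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical form target and field values.
request = build_form_request(
    "http://example.com/login",
    {"user": "fuzzy", "lang": "python"},
)
```

The resulting request can be passed to urlopen or to an opener's open method. Forms with file uploads use multipart encoding instead and need more work than this.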
All the best,
Fuzzyman
http://www.voidspace.org.uk/python/index.shtml