HTML parsing/scraping & python

Fuzzyman fuzzyman at
Thu Dec 1 09:46:05 CET 2005

The standard library module for fetching HTML is urllib2.
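
As a rough sketch of the fetching step: on Python 3 the urllib2 functionality lives in urllib.request, so that is what is used here. The helper name fetch_html and the UTF-8 fallback are assumptions made for illustration, not anything from urllib2 itself.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Fetch a URL and return the body decoded to text.

    fetch_html is a hypothetical helper name. urllib.request is the
    Python 3 home of what used to be urllib2.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Assumption: fall back to UTF-8 when the server declares no charset.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")
```

Calling fetch_html("http://www.python.org/") would return the page source as a string, which can then be handed to a parser.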

The best module for scraping the HTML is BeautifulSoup.
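
A minimal sketch of what scraping with BeautifulSoup can look like, assuming the current bs4 package (pip install beautifulsoup4) rather than the 2005 release. The function name extract_links and the sample markup are made up for the example.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href.

    extract_links is a hypothetical helper name.
    """
    # "html.parser" is the parser bundled with Python, so no extra install.
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


# Made-up sample markup; the middle anchor has no href and is skipped.
SAMPLE_HTML = """
<p>
  <a href="https://www.python.org/">Python</a>
  <a name="top">no href here</a>
  <a href="/docs"> Docs </a>
</p>
"""

links = extract_links(SAMPLE_HTML)
```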

There is a project called mechanize, built by John Lee on top of
urllib2 and other standard modules.

It will emulate a browser's behaviour - including history, cookies,
basic authentication, etc.
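
To give an idea of the standard-library pieces a tool like this can build on, here is a sketch that wires cookie handling and basic authentication into an opener. This is not mechanize's own API. It uses urllib.request and http.cookiejar (the Python 3 names for urllib2 and cookielib), and the function name, URL and credentials are placeholders.

```python
import http.cookiejar
import urllib.request


def build_browser_like_opener(url_prefix, username, password):
    """Return (opener, cookie_jar) with cookie and basic-auth support.

    build_browser_like_opener is a hypothetical helper name.
    """
    # Cookies set by responses are stored here and sent back on later requests.
    cookie_jar = http.cookiejar.CookieJar()

    # Credentials offered when a server under url_prefix asks for basic auth.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url_prefix, username, password)

    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar),
        urllib.request.HTTPBasicAuthHandler(password_mgr),
    )
    return opener, cookie_jar


# Placeholder values; nothing is fetched until opener.open(...) is called.
opener, jar = build_browser_like_opener(
    "http://example.invalid/", "user", "secret"
)
```

Every request made with opener.open(url) then shares the same cookie jar, which is roughly the session behaviour a browser gives you.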

There are several modules for automated form filling - FormEncode being

All the best,


More information about the Python-list mailing list