HTML parsing/scraping & python
mwm at mired.org
Thu Dec 1 21:25:55 CET 2005
"Fuzzyman" <fuzzyman at gmail.com> writes:
> The standard library module for fetching HTML is urllib2.
Does urllib2 replace everything in urllib? I thought there was some
urllib functionality that urllib2 didn't do.
> There is a project called mechanize, built by John Lee on top of
> urllib2 and other standard modules.
> It will emulate a browser's behaviour - including history, cookies,
> basic authentication, etc.
urllib2 handles cookies and authentication. I use those features
daily. I'm not sure history would apply, unless you're also handling
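
For illustration, here is a minimal sketch of the cookie and basic
authentication handling mentioned above. It uses the Python 3 spelling,
where urllib2 was merged into urllib.request; under Python 2 the same
classes live in urllib2 and cookielib. The helper name make_opener, the
example.com URL, and the user name and password are placeholders, not
anything from this thread.

```python
# Python 3 spelling: urllib2 became urllib.request, cookielib became http.cookiejar.
import http.cookiejar
import urllib.request


def make_opener(url, user, password):
    """Build an opener that keeps cookies and answers basic-auth challenges.

    make_opener is a hypothetical helper name used only for this sketch.
    """
    # Cookies set by the server are stored here and sent back on later requests.
    jar = http.cookiejar.CookieJar()

    # Realm None means "use these credentials for any realm under this URL".
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, url, user, password)

    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar),
        urllib.request.HTTPBasicAuthHandler(passwords),
    )
    return opener, jar


# Placeholder URL and credentials; no request is sent until opener.open() is called.
opener, jar = make_opener("http://www.example.com/", "someuser", "secret")
```

Calling opener.open(...) on a URL under the registered prefix would then
reuse the stored cookies and reply to a 401 challenge with the registered
credentials.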
Mike Meyer <mwm at mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.