read all available pages on a Website

Leif K-Brooks eurleif at
Mon Sep 13 10:09:30 CEST 2004

Tim Roberts wrote:
> Brad Tilley <bradtilley at> wrote:
>>Is there a way to make urllib or urllib2 read all of the pages on a Web 
> By the way, there are many web sites for which this sort of behavior is not
> welcome.

Any site that didn't want to be crawled would most likely use a 
robots.txt file, so you could check that before doing the crawl.
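
A minimal sketch of that check, assuming a current Python 3 where the parser lives in urllib.robotparser (in 2004 it was the top-level robotparser module). The robots.txt content, the example.com URLs, and the "ExampleCrawler" user-agent name are made up for illustration. A real crawler would probably call set_url() and read() to fetch the site's actual file instead of parsing a string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used here so the example needs no network.
# For a live site you would instead do something like:
#     parser.set_url("http://example.com/robots.txt"); parser.read()
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_checker(robots_txt):
    """Build a RobotFileParser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, url, agent="ExampleCrawler"):
    """Return True if the rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


checker = make_checker(SAMPLE_ROBOTS_TXT)
print(allowed(checker, "http://example.com/index.html"))
print(allowed(checker, "http://example.com/private/data.html"))
```

The crawl loop would then call allowed() on each URL before requesting it and skip anything the site has disallowed.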

More information about the Python-list mailing list