read all available pages on a Website
Leif K-Brooks
eurleif at ecritters.biz
Mon Sep 13 04:09:30 EDT 2004
Tim Roberts wrote:
> Brad Tilley <bradtilley at usa.net> wrote:
>
>>Is there a way to make urllib or urllib2 read all of the pages on a Web
>>site?
> By the way, there are many web sites for which this sort of behavior is not
> welcome.
Any site that didn't want to be crawled would most likely use a
robots.txt file, so you could check that before doing the crawl.
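
The robots.txt check suggested above can be sketched with the standard library's robots.txt parser, which lives in urllib.robotparser in Python 3 (it was the top-level robotparser module in Python 2). The helper name allowed_by_robots, the sample rules, and the example.com URLs are made up for illustration. The rules are passed in as a list of lines so that the sketch runs without network access.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_lines, url, user_agent="*"):
    """Return True if the given robots.txt lines permit user_agent to fetch url.

    Hypothetical helper: robots_lines is the robots.txt content, already
    split into lines.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Made-up rules standing in for a real site's robots.txt.
sample_rules = [
    "User-agent: *",
    "Disallow: /private/",
]

print(allowed_by_robots(sample_rules, "http://example.com/index.html"))
print(allowed_by_robots(sample_rules, "http://example.com/private/page.html"))
```

Against a live site, the same parser can fetch the file itself: call set_url() with the address of the site's robots.txt, then read(), before asking can_fetch() about each page the crawler is about to request.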