web crawler help?
s_gherman at yahoo.com
Tue Sep 10 21:22:31 CEST 2002
"koko" <kokohh at hotmail.com> wrote in message news:<JARd9.6927$yt3.3340577 at newssrv26.news.prodigy.com>...
> is there any sample for basic web crawler, that ask for a starting url and
> log the url and extract the hyperlinks?
There's a very simple one in Mark Pilgrim's "Dive into Python" book,
whose text is freely available at: http://diveintopython.org/
Check the "HTML processing" chapter. It contains a 9-line program,
urllister.py, followed by a 7-line usage example which does just
that: given the URL of an HTML file, it lists the hyperlinks inside it.
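
For a rough idea of the approach, here is a minimal sketch of a link
lister. It is not the book's urllister.py (that one is built on sgmllib,
which was removed in Python 3); it assumes Python 3 and uses html.parser
from the standard library instead. The names LinkLister, list_links and
fetch_links are made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkLister(HTMLParser):
    """Collect the href of every <a> tag fed to the parser (hypothetical name)."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.urls.append(urljoin(self.base_url, value))


def list_links(html, base_url=""):
    """Return the hyperlinks found in an HTML string, in document order."""
    parser = LinkLister(base_url)
    parser.feed(html)
    parser.close()
    return parser.urls


def fetch_links(url):
    """Download a page and list its hyperlinks (assumes the URL is reachable)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return list_links(html, base_url=url)
```

Calling fetch_links("http://diveintopython.org/") would return the links
on that page as a list. To turn this into a basic crawler, one would
keep a queue of URLs still to visit and a set of URLs already seen,
logging each URL as it is fetched.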