spidering script

dubs wbornor at gmail.com
Thu Jan 18 20:25:59 CET 2007


Check out the Quick Start section of the Beautiful Soup documentation:
http://www.crummy.com/software/BeautifulSoup/
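
For a rough idea of what pulling the links out of a page looks like, here is a minimal sketch. It does not use Beautiful Soup's API; it uses the standard library's html.parser (the Python 3 stand-in for the sgmllib approach mentioned in the quoted message below) so it runs with no extra install. The names LinkCollector and extract_links and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser (hypothetical helper)."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names,
        # so <A HREF=...> and <a href=...> are handled alike.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url=""):
    """Return the links found in one HTML document, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Made-up sample page; a real script would fetch the HTML first.
    sample = '<p><A HREF="/news">News</A> <a href="http://example.org/">Ex</a></p>'
    for link in extract_links(sample, "http://www.example.com/"):
        print(link)
```

This only covers a single page. To get every link on a site you would still need to fetch each page and feed the same-host links you find back into a queue of pages to visit, keeping a set of already-seen URLs so you don't loop.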


Wes


Jonathan Curran wrote:
> On Thursday 18 January 2007 11:57, David Waizer wrote:
> > Hello..
> >
> > I'm looking for a script (perl, python, sh...) or program (such as wget)
> > that will help me get a list of ALL the links on a website.
> >
> > For example, ./magicscript.pl www.yahoo.com would output the links to a
> > file; it would be kind of like spidering software.
> >
> > Any suggestions would be appreciated.
> >
> > David
>
> David, this is a touchy topic but whatever :P Look into sgmllib, and you can
> filter on the "A" tag. The book 'Dive Into Python' covers it quite nicely:
> http://www.diveintopython.org/html_processing/index.html
> 
> Jonathan



