any suggestions for URL cataloging project?

Paul McGuire ptmcg at austin.rr._bogus_.com
Tue Sep 7 14:07:48 CEST 2004


"Matthew K Jensen" <matt.torment at gmail.com> wrote in message
news:a8dfce8c.0409062258.269b1b35 at posting.google.com...
> I've just come up with an idea to make a small-time record of web
> pages linking to other web pages. I don't want to download every page
> on the internet (I'll leave google to do that). I just want to know if
> anyone has any suggestions on how to acquire just the links from a web
> page using python. This is for a cataloging purpose. Is there some
> library or script out there that I haven't heard of?

One of the examples that comes with pyparsing is urlextractor.py.  Point it
at a web page and it lists the URLs along with the text of each link.

Download pyparsing at http://pyparsing.sourceforge.net.
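
If you would rather not install anything, here is a rough sketch of the same
idea using only the standard library. To be clear, this is not the
urlextractor.py example and it does not use pyparsing; it swaps in
html.parser from a current Python 3. The names LinkCollector and
extract_links, and the sample HTML and URLs, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (url, link_text) pairs from <a href="..."> tags.

    Hypothetical helper for illustration, not part of any library.
    """

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url   # used to resolve relative hrefs
        self.links = []            # finished (url, text) pairs
        self._href = None          # href of the <a> currently open, if any
        self._text = []            # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url=""):
    """Return a list of (url, link_text) tuples found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Made-up sample page; the URLs are placeholders.
    sample = ('<p><a href="/doc/">Docs</a> and '
              '<a href="http://example.org/x">X <b>bold</b></a></p>')
    for url, text in extract_links(sample, "http://example.com/index.html"):
        print(url, "->", text)
```

For a live page you could fetch the HTML with urllib.request.urlopen(),
decode it, and pass the page's own URL as base_url so relative links come
out absolute. For a catalog you would probably store each pair keyed by the
page it was found on.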

-- Paul




