WebScraping

Michael Torrie torriem at chem.byu.edu
Sat Nov 4 23:25:04 EST 2006


On Sun, 2006-11-05 at 13:40 +1100, Steven D'Aprano wrote:
> On Sun, 05 Nov 2006 08:09:52 +1000, Graham Feeley wrote:
> 
> > Can someone steer me to scripts / modules etc on webscraping please???
> 
> The definitive documentation on the built-in Python modules can be found
> here: http://docs.python.org/modindex.html
> 
> The ActiveState Python cookbook should be useful, e.g.
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/391929
> 
> Also see Beautiful Soup:
> http://www.crummy.com/software/BeautifulSoup/

Beautiful Soup is not always speedy, but it sure is the most flexible
scraper I've ever come across.  I hacked together a web-forum-to-NNTP
gateway using Beautiful Soup.  Worked very well.
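
To give a flavour of what the scraping half of a gateway like that
involves, here is a minimal sketch.  It is not the gateway code, and it
deliberately does not use Beautiful Soup: it uses the standard library's
html.parser so that it runs with nothing extra installed.  The forum
markup, the class names ("post", "author", "body") and the names
PostScraper and scrape_posts are all made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical forum markup; a real forum's HTML will differ.
# The second post leaves its <p> unclosed on purpose.
SAMPLE_PAGE = """
<div class="post"><span class="author">alice</span>
<p class="body">First post<br>with a line break</p></div>
<div class="post"><span class="author">bob</span>
<p class="body">A reply</div>
"""


class PostScraper(HTMLParser):
    """Collect (author, body) pairs from markup shaped like SAMPLE_PAGE."""

    def __init__(self):
        super().__init__()
        self.posts = []      # finished (author, body) tuples
        self._field = None   # which field text currently belongs to
        self._author = ""
        self._body = []

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls == "post":
            # A new post begins: store the previous one, if any.
            self._flush()
        elif cls in ("author", "body"):
            self._field = cls

    def handle_endtag(self, tag):
        # Any of these closing tags ends the current field.
        if tag in ("span", "p", "div"):
            self._field = None

    def handle_data(self, data):
        if self._field == "author":
            self._author += data.strip()
        elif self._field == "body":
            self._body.append(data.strip())

    def _flush(self):
        # Store the post collected so far and reset the buffers.
        if self._author or self._body:
            body = " ".join(part for part in self._body if part)
            self.posts.append((self._author, body))
        self._author, self._body = "", []

    def close(self):
        super().close()
        self._flush()  # the last post has no following post to trigger it


def scrape_posts(html):
    """Return a list of (author, body) tuples found in html."""
    parser = PostScraper()
    parser.feed(html)
    parser.close()
    return parser.posts
```

Calling scrape_posts(SAMPLE_PAGE) gives one (author, body) tuple per
post.  The sketch only copes with the unclosed <p> because it treats any
closing span, p, or div as the end of a field.  Messier pages need more
of that kind of special-casing, which is the work Beautiful Soup saves
you.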

Michael


> 
> And of course, GIYF ("Google Is Your Friend") http://www.google.com which
> leads me to:
> 
> http://sig.levillage.org/?p=588
> http://sig.levillage.org/2005/03/11/web-scraping-with-python-part-ii/
> http://wiki.tcl.tk/2915 (not focused on Python, but may still be useful).
> 
> 
> > Ultimately I would like someone to write a script for me.
> 
> Are you offering to hire a developer?
> 
> 
> -- 
> Steven.
> 




More information about the Python-list mailing list