[Tutor] An idea for a script

Kent Johnson kent37 at tds.net
Thu Oct 11 02:16:56 CEST 2007


Dick Moores wrote:
> How about a hint of how to get those ">jcooley<" things from the source? 
> (I'm able to have the script get the source, using urllib2.)
> 
> BTW I thought I wouldn't try to use BeautifulSoup right now, but take 
> the hard way.

Well, you could probably do it with regular expressions; the source is 
pretty regular. You could also use sgmllib directly. Dive Into Python has 
a chapter about this:
http://www.diveintopython.org/html_processing/index.html

or you could learn to use a handy tool and do it with BeautifulSoup...
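
To make the first two options a bit more concrete, here is a rough sketch in present-day Python 3. It assumes the names appear as the text of <a> links, which is only a guess since the real page source isn't shown in this thread. SAMPLE, names_by_regex, LinkTextParser and names_by_parser are made-up names for illustration. sgmllib no longer exists in Python 3, so the standard library's html.parser.HTMLParser stands in for it here.

```python
import re
from html.parser import HTMLParser

# Made-up sample markup; the real page's source is not shown in this thread.
SAMPLE = '''
<td><a href="/user/1">jcooley</a></td>
<td><a href="/user/2">dmoores</a></td>
'''


def names_by_regex(source):
    """Return every run of word characters sitting directly between '>' and '<'."""
    return re.findall(r'>(\w+)<', source)


class LinkTextParser(HTMLParser):
    """Collect the text found inside <a> tags (assumed to hold the names)."""

    def __init__(self):
        super().__init__()
        self.in_link = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == 'a':
            self.in_link = False

    def handle_data(self, data):
        # Keep only non-blank text seen while inside a link.
        text = data.strip()
        if self.in_link and text:
            self.names.append(text)


def names_by_parser(source):
    """Feed the source through LinkTextParser and return the collected names."""
    parser = LinkTextParser()
    parser.feed(source)
    parser.close()
    return parser.names


if __name__ == '__main__':
    print(names_by_regex(SAMPLE))
    print(names_by_parser(SAMPLE))
```

The regex version picks up any single word between '>' and '<', so on a real page you would likely need to tighten the pattern (for example by including part of the surrounding tag). The parser version is longer but only looks at link text.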

Kent

