[Tutor] htmllib vs re question

Kent Johnson kent37 at tds.net
Fri Mar 10 04:25:47 CET 2006


->Terry<- wrote:
> I want to parse some text from an HTML file that contains
> blocks of pre-formatted text. All I'm after is what's between
> the <pre> and </pre> tags.
> 
> The HTML file size varies, but I don't expect the size to exceed
> 150-200k. Speed is not a big concern.
> 
> What is the Pythonic way and why?
> 
> Any recommendations or comments?

Try Beautiful Soup
http://www.crummy.com/software/BeautifulSoup/
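
For comparison, here is a minimal sketch of pulling out the text between
<pre> and </pre> with a real HTML parser instead of a regular expression.
It uses the standard library's html.parser module rather than Beautiful
Soup, so it runs without installing anything. The names PreExtractor,
extract_pre and sample are made up for this illustration. A parser copes
with things that tend to trip up a hand-written regex, such as attributes
on the tag, upper-case tag names and character entities.

```python
from html.parser import HTMLParser


class PreExtractor(HTMLParser):
    """Collect the text found inside each <pre>...</pre> block."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []    # finished <pre> blocks, in document order
        self._depth = 0     # greater than 0 while inside a <pre> element
        self._chunks = []   # text pieces of the block currently being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <PRE> is matched as well.
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.blocks.append("".join(self._chunks))
                self._chunks = []

    def handle_data(self, data):
        # Keep text only while inside a <pre> block.
        if self._depth:
            self._chunks.append(data)


def extract_pre(html):
    """Return a list with the text of every <pre> block in html."""
    parser = PreExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Hypothetical input, just to show the call.
sample = (
    "<html><body><p>intro</p>"
    "<pre>line 1\nline 2</pre>"
    "<p>x</p><pre>a &lt; b</pre>"
    "</body></html>"
)
print(extract_pre(sample))
```

For a real file, read its contents into a string and pass that to
extract_pre(). At 150-200k the whole file fits in memory comfortably.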

Kent


