Parsing HTML

Richie Hindle richie at entrian.com
Thu Sep 23 14:28:43 CEST 2004


[Richie]
> BeautifulSoup is perfect for this job:

Um, BeautifulSoup may be perfect, but my script isn't.  It fails with the
Swedish page because it doesn't cope with "<b></b>" appearing in the HTML.
And I don't know whether you'd consider it correct to extract only the bold
text from the entries that have bold text.  But it gives you a place to start.
8-)
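
For what it's worth, here is a minimal sketch of one way to cope with the
empty "<b></b>" case. It is not the script from my earlier post, and it uses
the standard library's HTMLParser rather than BeautifulSoup so that it runs
without anything extra installed. The names TextCollector and entry_text are
made up for illustration, and it assumes you already have each entry as its
own HTML fragment. The policy it picks (bold text where there is any,
otherwise the entry's whole text) is only one possible answer to the
"is that correct?" question above.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect all text in a fragment, and separately the text inside <b>."""

    def __init__(self):
        super().__init__()
        self.bold_depth = 0    # > 0 while inside one or more <b> tags
        self.bold_parts = []   # text seen inside <b>...</b>
        self.all_parts = []    # every piece of text seen

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.bold_depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.bold_depth > 0:
            self.bold_depth -= 1

    def handle_data(self, data):
        self.all_parts.append(data)
        if self.bold_depth > 0:
            self.bold_parts.append(data)


def entry_text(fragment):
    """Return the bold text of one entry, or all of its text if the entry
    has no bold text (which includes the empty "<b></b>" case)."""
    parser = TextCollector()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    # Join the pieces and collapse runs of whitespace.
    bold = " ".join("".join(parser.bold_parts).split())
    if bold:
        return bold
    return " ".join("".join(parser.all_parts).split())
```

An entry whose only bold element is empty falls through to the second branch,
so entry_text("<b></b>plain entry") gives back the plain text instead of
failing or returning nothing.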

-- 
Richie Hindle
richie at entrian.com



