HTML Parsing and Indexing

Stefan Behnel stefan.behnel-n05pAM at
Tue Nov 14 08:50:57 CET 2006

mailtogops at wrote:
>     I am involved in a project that collects news published on
> selected, known web sites in formats such as HTML, RSS, etc., shortlists
> the items, and creates bookmarks to the news content on our website (we
> will use Django for web development). The project is currently under
> heavy development.
> I need help with an HTML parser.

lxml includes an HTML parser which can parse straight from URLs.
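
As a rough sketch of what that can look like: etree.parse() accepts a file name, a URL, or a file-like object, and passing an etree.HTMLParser() makes it tolerant of broken real-world HTML. The helper name extract_links and the sample markup below are made up for illustration. The example parses from an in-memory buffer so it runs without network access. Passing a URL string instead should work for plain HTTP on libxml2 builds that include network support, but as far as I know not for HTTPS, in which case you would fetch the page yourself and hand the bytes to the parser.

```python
from io import BytesIO

from lxml import etree

# Hypothetical sample page; note the unclosed <p> tags, which the
# HTML parser recovers from.
SAMPLE_HTML = b"""<html><body>
<h1>News</h1>
<p><a href="/story/1">First story</a>
<p><a href="/story/2">Second story</a>
</body></html>"""


def extract_links(source):
    """Return (link text, href) pairs for every <a href=...> in the page.

    `source` may be a file name, a URL, or a file-like object,
    i.e. anything etree.parse() accepts.
    """
    parser = etree.HTMLParser()
    tree = etree.parse(source, parser)
    return [
        ("".join(a.itertext()).strip(), a.get("href"))
        for a in tree.iter("a")
        if a.get("href")
    ]


links = extract_links(BytesIO(SAMPLE_HTML))
for text, href in links:
    print(text, href)
```

From there, the (text, href) pairs could be stored as bookmarks. RSS feeds are XML, so the same etree.parse() call without the HTML parser should cover those.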


More information about the Python-list mailing list