"Soup Strainer" for ElementSoup?

Stefan Behnel stefan_ml at behnel.de
Thu Mar 27 13:23:25 EDT 2008


erikcw wrote:
> I'm parsing real-world HTML with BeautifulSoup and XML with
> cElementTree.
> 
> I'm guessing that the only benefit to using ElementSoup is that I'll
> have one less API to keep track of, right?

If your "real-world" HTML is still somewhat close to HTML, lxml.html might be
an option. It combines the ElementTree API with a parser that tolerates
not-quite-valid HTML, plus some helpful HTML handling tools.

http://codespeak.net/lxml
http://codespeak.net/lxml/lxmlhtml.html
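
A rough sketch of what that looks like in practice, assuming lxml is
installed. The markup string is a made-up sample with unclosed tags, not
anything from your data:

```python
from lxml import html

# Hypothetical sloppy markup: unclosed <p> and <b>, no <html>/<body> wrapper.
broken = "<div><p>First<p>Second <b>bold <a href='/x'>link</a></div>"

# The HTML parser recovers a tree instead of raising on the unclosed tags.
root = html.fromstring(broken)

# From here on it is the ElementTree API you already know from cElementTree.
links = [a.get("href") for a in root.iter("a")]

# text_content() is one of the HTML-specific helpers on lxml.html elements.
text = root.text_content()
```

So the calls you already use on cElementTree elements (iter, find, get)
should carry over to the parsed HTML tree.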

You can also use it with the BeautifulSoup parser if you really need to.

http://codespeak.net/lxml/elementsoup.html
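
Again only a sketch, assuming both lxml and BeautifulSoup are installed.
The input string is a made-up example:

```python
# lxml.html.soupparser hands the parsing to BeautifulSoup and
# converts the result into an lxml tree.
from lxml.html import soupparser

# Hypothetical tag soup with nothing closed at all.
tag_soup = "<p>unclosed <b>tags <a href='/y'>here"

root = soupparser.fromstring(tag_soup)

# Same ElementTree-style access as with any other lxml tree.
hrefs = [a.get("href") for a in root.iter("a")]
```

That way BeautifulSoup only does the parsing, and the rest of your code
keeps talking to a single API.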

Stefan


