PyParsing module or HTMLParser

Paul McGuire ptmcg at austin.rr.com
Tue Mar 29 15:06:16 EST 2005


La -

In general, I have shied away from doing general-purpose HTML parsing
with pyparsing.  It's a crowded field, and it's likely that there are
better candidates out there for your problem.  I've heard good things
about BeautifulSoup, but I've also heard from at least one person that
they prefer pyparsing to BeautifulSoup.

I personally have had good luck with *simple* HTML scraping with
pyparsing, such as extracting data from tables.  It just depends on how
variable your source text is.  Tables within tables may be a bit
challenging, but we'll never know unless you provide more to go on.  If
you post a URL or some sample HTML, I could give you a more definitive
answer (possibly even a working code sample, you never know).
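
To give a rough idea of what *simple* table scraping can look like, here
is a minimal sketch. The sample HTML and the name extract_cells are made
up for illustration, and it assumes a pyparsing version that provides
makeHTMLTags and SkipTo. It only handles flat tables; a table nested
inside a cell would cut the outer cell short at the first closing tag.

```python
from pyparsing import makeHTMLTags, SkipTo

# Hypothetical sample input, not taken from any real page.
SAMPLE_HTML = """
<table>
  <tr><td>Alice</td><td>30</td></tr>
  <tr><td class="name">Bob</td><td>25</td></tr>
</table>
"""

# makeHTMLTags builds expressions for the opening and closing tag;
# the opening-tag expression also tolerates attributes such as class="name".
td_start, td_end = makeHTMLTags("td")

# A cell is an opening tag, everything up to the closing tag, then the closing tag.
cell = td_start + SkipTo(td_end)("body") + td_end


def extract_cells(html):
    """Return the text of every <td> cell found in html, in document order."""
    # scanString yields (tokens, start, end) for each match anywhere in the text.
    return [tokens.body.strip() for tokens, _start, _end in cell.scanString(html)]


if __name__ == "__main__":
    for text in extract_cells(SAMPLE_HTML):
        print(text)
```

Because scanString searches the whole input for matches, the rest of the
page does not need a grammar at all, which is what keeps this workable
when the surrounding markup varies.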

-- Paul



