[2.5] Regex doesn't support MULTILINE?
Carsten Haese
carsten at uniqsys.com
Sat Jul 21 22:40:44 EDT 2007
On Sat, 2007-07-21 at 19:22 -0700, Paul Rubin wrote:
> Carsten Haese <carsten at uniqsys.com> writes:
> > Use an actual HTML parser such as BeautifulSoup
> > (http://www.crummy.com/software/BeautifulSoup/) and your life will be
> > much easier.
>
> BeautifulSoup is a lot simpler to use than RE's but a heck of a lot
> slower. I ended up having to use RE's last time I had to scrape a lot
> of pages.
True, but the OP said "extract information from a web page", not "from a
lot of pages." Until BeautifulSoup is actually too slow for that job,
going straight to RE is premature optimization.
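
To make "measure before optimizing" concrete, here is a rough sketch, not a benchmark anyone in this thread ran. It uses the standard library's html.parser as a stand-in for BeautifulSoup so that it runs without extra installs. The sample page, the class name and both helper functions are made up for illustration. The regex variant uses re.DOTALL, since that flag (not re.MULTILINE) is what lets "." match across line breaks.

```python
import re
import timeit
from html.parser import HTMLParser

# Hypothetical sample page; the title spans two lines on purpose.
PAGE = """<html><head><title>Example
page</title></head>
<body><a href="/one">One</a> <a href="/two">Two</a></body></html>"""


class LinkAndTitleParser(HTMLParser):
    """Collect the <title> text and every <a href=...> value."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Data may arrive in several chunks, so collect and join later.
        if self._in_title:
            self.title_parts.append(data)


def extract_with_parser(html):
    """Parser-based extraction (stdlib stand-in for BeautifulSoup)."""
    parser = LinkAndTitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.title_parts), parser.links


# re.DOTALL lets "." match newlines; re.MULTILINE only changes ^ and $.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL | re.IGNORECASE)
HREF_RE = re.compile(r'<a\s+[^>]*href="([^"]*)"', re.IGNORECASE)


def extract_with_regex(html):
    """Regex-based extraction; brittle on markup that deviates from PAGE."""
    match = TITLE_RE.search(html)
    title = match.group(1) if match else ""
    return title, HREF_RE.findall(html)


if __name__ == "__main__":
    # Time both approaches on the same input before deciding anything.
    for func in (extract_with_parser, extract_with_regex):
        seconds = timeit.timeit(lambda: func(PAGE), number=1000)
        print(func.__name__, seconds)
```

If the parser version turns out fast enough for one page, there is nothing to optimize. The regex version only earns its keep once the timing on real input says so.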
--
Carsten Haese
http://informixdb.sourceforge.net