How to define repeated string when using the re module?

Chris Rebert clp2 at rebertia.com
Tue Aug 2 13:22:28 EDT 2011


On Tue, Aug 2, 2011 at 9:20 AM, smith jack <thinke365 at gmail.com> wrote:
> For a single character this is easy: c{m,n} matches between m and n
> occurrences of c.
>
> How do I do the same for a whole string such as (.*?)</div>? I tried
> ((.*?)</div>){1,3}, but it does not seem to work. Is there a way to
> repeat a string with Python regexes?
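
For reference, a quantifier can follow a parenthesized group just as it
follows a single character. What tends to surprise people is that a repeated
group keeps only its last repetition, and that "." skips newlines unless
re.DOTALL is set. A minimal sketch, assuming Python 3 and a made-up sample
string (the names text, pattern, m and pieces are mine, not from the
question):

```python
import re

# Hypothetical sample input, invented for illustration.
text = "<div>a</div><div>b</div><div>c</div>"

# {1,3} applies to the whole parenthesized group.
# re.DOTALL lets "." match newlines too, in case the markup spans lines.
pattern = re.compile(r"((.*?)</div>){1,3}", re.DOTALL)
m = pattern.match(text)

print(m.group(0))  # the whole match: all three repetitions
print(m.group(1))  # only the LAST repetition of the group

# To collect every repetition, match the unrepeated piece with findall.
pieces = re.findall(r"(.*?)</div>", text, re.DOTALL)
print(pieces)
```

So the pattern does match; the earlier repetitions are simply not kept in
the group.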

Don't parse HTML using regexes; use an HTML parser!
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

Here's a survey of Python HTML parsing libraries:
http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/
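
As a rough illustration of the parser approach, here is a minimal sketch
using the standard library's html.parser module from Python 3. That choice is
my assumption for the sake of a self-contained example, not a recommendation
taken from the survey above. The class DivTextCollector, the helper
div_texts and the sample markup are all hypothetical names invented here.

```python
from html.parser import HTMLParser


class DivTextCollector(HTMLParser):
    """Collect the text found inside each top-level <div> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <div> elements are currently open
        self.current = []   # text fragments of the div being read
        self.chunks = []    # finished text, one entry per top-level div

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self.depth == 0:
                self.current = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.chunks.append("".join(self.current))

    def handle_data(self, data):
        # Keep text only while we are inside at least one <div>.
        if self.depth > 0:
            self.current.append(data)


def div_texts(html):
    parser = DivTextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Made-up sample markup for illustration.
sample = "<div>first</div><p>skip</p><div>second <b>bold</b></div>"
print(div_texts(sample))  # ['first', 'second bold']
```

Counting how many divs were found is then just len(div_texts(sample)), with
no repetition quantifier needed.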

Cheers,
Chris
--
http://rebertia.com


