How to define repeated string when using the re module?
Chris Rebert
clp2 at rebertia.com
Tue Aug 2 13:22:28 EDT 2011
On Tue, Aug 2, 2011 at 9:20 AM, smith jack <thinke365 at gmail.com> wrote:
> For a single character this is easy: c{m,n} matches c repeated
> between m and n times.
>
> If I want to repeat (.*?)</div> instead, how should I do it?
> ((.*?)</div>){1,3} does not seem to work. Is there a way to repeat a
> whole string pattern with a Python regex?
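On the literal question: a quantifier can follow a group just as it can follow a single character. Below is a minimal sketch, assuming Python 3 and a made-up sample string (the `html` value and variable names are illustrative, not from your code). One possible reason the original pattern "seems not work" is that a repeated capturing group keeps only its last repetition, so the inner group does not return every match.

```python
import re

# Hypothetical sample input, for illustration only.
html = "<div>a</div><div>b</div><div>c</div><div>d</div>"

# (?:...) groups without capturing; {1,3} then repeats the whole group.
# re.DOTALL lets "." also match newlines inside a block.
pattern = re.compile(r"(?:.*?</div>){1,3}", re.DOTALL)
match = pattern.match(html)
print(match.group(0))

# With capturing groups, only the last repetition is kept in the groups.
captured = re.match(r"((.*?)</div>){1,3}", html, re.DOTALL)
print(captured.group(2))
```

The whole match covers up to three repetitions, while `captured.group(2)` holds only the text from the final one.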
Don't parse HTML using regexes; use an HTML parser!
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
Here's a survey of Python HTML parsing libraries:
http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/
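As a rough illustration of the parser approach, here is a sketch using the standard library's `html.parser` (Python 3 is assumed; in Python 2 the module is named `HTMLParser`). The class name `DivTextCollector` and the sample markup are hypothetical, and the libraries in the survey above may well be more convenient for real pages.

```python
from html.parser import HTMLParser


class DivTextCollector(HTMLParser):
    """Collect the text found inside each top-level <div>."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # current <div> nesting level
        self.divs = []   # text of each top-level <div>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self.depth == 0:
                self.divs.append("")  # start a new top-level block
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while inside some <div>.
        if self.depth > 0:
            self.divs[-1] += data


collector = DivTextCollector()
collector.feed("<div>a</div><p>skip</p><div>b <b>bold</b></div>")
collector.close()

# "Between 1 and 3 occurrences" becomes an ordinary slice.
first_three = collector.divs[:3]
print(first_three)
```

Limiting the number of blocks is then plain list handling rather than something the pattern has to express.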
Cheers,
Chris
--
http://rebertia.com