Regular expression help
bokr at oz.net
Thu Jul 17 17:57:22 CEST 2003
On Thu, 17 Jul 2003 04:27:23 GMT, David Lees <abcdebl2nonspammy at verizon.net> wrote:
>I forget how to find multiple instances of stuff between tags using
>regular expressions. Specifically I want to find all the text between a
>series of begin/end pairs in a multiline file.
> >>> p = 'begin(.*)end'
> >>> m = re.search(p,s,re.DOTALL)
>and got everything between the first begin and last end. I guess
>because of a greedy match. What I want to do is a list where each
>element is the text between another begin/end pair.
You were close. To make the match non-greedy, add a question mark after the greedy quantifier:
>>> import re
>>> s = """
... begin first end
... begin
... second
... end
... begin problem begin nested end end
... begin last end
... """
>>> p = 'begin(.*?)end'
>>> rx = re.compile(p, re.DOTALL)
>>> rx.findall(s)
[' first ', '\nsecond\n', ' problem begin nested ', ' last ']
Notice what happened with the nested begin-ends. If you have nesting, you
will need more than a simple regex approach.
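
For the nested case, one possible approach (a rough sketch, not the only way to do it) is to use the regex only to find the begin/end tokens and keep a depth counter yourself. The function name outer_blocks and the word-boundary pattern are my own assumptions for illustration; the \b anchors are added so that words that merely contain "end" are not treated as delimiters, which differs slightly from the plain pattern above.

```python
import re

# Assumption: 'begin' and 'end' appear as whole words when used as delimiters.
TOKEN = re.compile(r'\b(?:begin|end)\b')

def outer_blocks(text):
    """Return the text inside each outermost begin/end pair."""
    blocks = []
    depth = 0      # current nesting level
    start = None   # index just past the outermost 'begin'
    for m in TOKEN.finditer(text):
        if m.group() == 'begin':
            if depth == 0:
                start = m.end()
            depth += 1
        elif depth > 0:
            depth -= 1
            if depth == 0:
                blocks.append(text[start:m.start()])
        # an 'end' with no open 'begin' is silently ignored
    return blocks

s = """
begin first end
begin
second
end
begin problem begin nested end end
begin last end
"""
print(outer_blocks(s))
```

With the sample string this keeps the nested pair together, so the third element comes out as ' problem begin nested end ' rather than being cut off at the first end. An unclosed begin at the end of the input is simply dropped in this sketch; whether that is the right behavior depends on your data.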