Need help in Python regular expression

John S jstrickler at gmail.com
Fri Jun 12 00:47:51 EDT 2009


On Jun 11, 10:30 pm, meryl <silverburgh.me... at gmail.com> wrote:
> Hi,
>
> I have this regular expression
> blockRE = re.compile(".*RenderBlock {\w+}")
>
> it works if my source is "RenderBlock {CENTER}".
>
> But I want it to work with
> 1. RenderTable {TABLE}
>
> So i change the regexp to re.compile(".*Render[Block|Table] {\w+}"),
> but that breaks everything
>
> 2. RenderBlock (CENTER)
>
> So I change the regexp to re.compile(".*RenderBlock {|\(\w+}|\)"),
> that also breaks everything
>
> Can you please tell me how to change my reg exp so that I can support
> all 3 cases:
> RenderTable {TABLE}
> RenderBlock (CENTER)
> RenderBlock {CENTER}
>
> Thank you.

Short answer:

import re

r = re.compile(r"Render(?:Block|Table)\s+[({](?:TABLE|CENTER)[})]")

s = """
    blah blah blah
    blah blah blah RenderBlock {CENTER} blah blah RenderBlock {CENTER}
    blah blah blah RenderTable {TABLE} blah blah RenderBlock (CENTER)
    blah blah blah
"""

print(r.findall(s))



output:
['RenderBlock {CENTER}', 'RenderBlock {CENTER}', 'RenderTable {TABLE}', 'RenderBlock (CENTER)']
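
The original pattern used \w+ between the brackets, so a more general
variant may be closer to what you want. This is only a sketch: it assumes
any run of word characters can appear inside the brackets, and it pairs
each opening bracket with its own closing bracket so that a mismatch such
as "{BAD)" is rejected. The names "general" and "sample" are made up for
the illustration.

```python
import re

# Assumption: any word may appear inside the brackets, not just TABLE or CENTER.
# Each alternative carries its own opening and closing bracket, so "{...}" and
# "(...)" both match but a mixed pair like "{...)" does not.
general = re.compile(r"Render(?:Block|Table)\s+(?:\{\w+\}|\(\w+\))")

sample = (
    "x RenderBlock {CENTER} y RenderTable {TABLE} "
    "z RenderBlock (CENTER) RenderBlock {BAD)"
)

matches = general.findall(sample)
print(matches)
```

The character-class version, [({]...[})], is shorter but would also accept
mismatched brackets; whether that matters depends on your input.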



Note that [] encloses a set of single characters, not strings; [foo|bar]
matches any one of the characters 'f', 'o', '|', 'b', 'a', or 'r', not
"foo" or "bar". This is why Render[Block|Table] broke your pattern.
Use (foo|bar) to match "foo" or "bar"; (?:xxx) matches xxx without
creating a capturing group (i.e., without capturing text for a
backreference).
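
A small sketch of the difference between a character class and a group,
and of how capturing changes what findall() returns. The variable names
here are invented for the illustration.

```python
import re

# Inside [], every character stands for itself, including '|'.
# This matches exactly ONE character from the set {f, o, |, b, a, r}.
char_class = re.compile(r"[foo|bar]")

# A non-capturing group with alternation matches one whole string or the other.
group = re.compile(r"(?:foo|bar)")

text = "foo|bar"
class_hits = char_class.findall(text)  # one hit per character
group_hits = group.findall(text)       # one hit per word

# With a capturing group, findall() returns only the captured part.
# With a non-capturing group, it returns the whole match.
capturing = re.compile(r"Render(Block|Table)")
non_capturing = re.compile(r"Render(?:Block|Table)")

sample = "RenderBlock RenderTable"
captured = capturing.findall(sample)
whole = non_capturing.findall(sample)

print(class_hits)
print(group_hits)
print(captured)
print(whole)
```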

HTH

-- John Strickler


