Need help in Python regular expression

Rhodri James rhodri at wildebst.demon.co.uk
Fri Jun 12 14:20:16 EDT 2009


On Fri, 12 Jun 2009 06:20:24 +0100, meryl <silverburgh.meryl at gmail.com>  
wrote:
> On Jun 11, 9:41 pm, "Mark Tolonen" <metolone+gm... at gmail.com> wrote:
>> "meryl" <silverburgh.me... at gmail.com> wrote in message
>>
>> > Hi,
>>
>> > I have this regular expression
>> > blockRE = re.compile(".*RenderBlock {\w+}")
>>
>> > it works if my source is "RenderBlock {CENTER}".

[snip]

>> -----------------------code----------------------
>> import re
>> pat = re.compile(r'Render(?:Block|Table) (?:\(\w+\)|{\w+})')
>>
>> testdata = '''\
>> RenderTable {TABLE}
>> RenderBlock (CENTER)
>> RenderBlock {CENTER}
>> RenderTable {TABLE)      #shouldn't match
>> '''
>>
>> print pat.findall(testdata)
>> ---------------------------------------------------
>>
>> Result:
>>
>> ['RenderTable {TABLE}', 'RenderBlock (CENTER)', 'RenderBlock {CENTER}']
>>
>> -Mark
>
> Thanks for both of your help. How can i modify the RegExp so that
> both
> RenderTable {TABLE}
> and
> RenderTable {TABLE} [text with a-zA-Z=SPACE0-9]
> will match
>
> I try adding ".*" at the end , but it ends up just matching the second
> one.

Curious, it should work (and match rather more than you want, but
that's another matter).  Try adding this instead:

'(?: \[[a-zA-Z= 0-9]*\])?'

Personally I'd replace all those spaces with \s* or \s+, but I'm
paranoid when it comes to whitespace.
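
For illustration, here is a minimal sketch of the optional suffix bolted
onto Mark's pattern from earlier in the thread.  It assumes Python 3
syntax; the bracketed sample text (border=1 width=100) and the name
pat_ws are invented for the example, not taken from the real data.

```python
import re

# Mark's pattern, followed by the optional " [text]" group.
# The trailing ? makes the whole bracketed part optional.
pat = re.compile(
    r'Render(?:Block|Table) (?:\(\w+\)|{\w+})'
    r'(?: \[[a-zA-Z= 0-9]*\])?'
)

# Whitespace-tolerant variant: literal spaces replaced with \s+ (or \s
# inside the character class).  Note that \s also matches newlines.
pat_ws = re.compile(
    r'Render(?:Block|Table)\s+(?:\(\w+\)|{\w+})'
    r'(?:\s+\[[a-zA-Z=\s0-9]*\])?'
)

testdata = '''\
RenderTable {TABLE}
RenderTable {TABLE} [border=1 width=100]
RenderBlock (CENTER)
RenderTable {TABLE)
'''

# Only non-capturing groups are used, so findall returns whole matches.
matches = pat.findall(testdata)
print(matches)

# A tab before the bracket defeats the literal-space pattern's suffix
# but not the \s+ variant.
tabbed = 'RenderTable {TABLE}\t[a=1]'
ws_match = pat_ws.match(tabbed).group(0)
print(repr(ws_match))
```

The first three lines of testdata should match in full, and the
mismatched {TABLE) line should be skipped.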

-- 
Rhodri James *-* Wildebeest Herder to the Masses


