Insanity

Sun Jan 23 06:52:39 EST 2005

Fredrik Lundh wrote:

> Tim Daneliuk wrote:
> 
> 
>>Given an arbitrary string, I want to find each individual instance of
>>text in the form:  "[PROMPT:optional text]"
>>
>>I tried this:
>>
>>    y=re.compile(r'\[PROMPT:.*\]')
>>
>>Which works fine when the text is exactly "[PROMPT:whatever]"
> 
> 
> didn't you leave something out here?  "compile" only compiles that pattern;
> it doesn't match it against your string...

Sorry - I thought this was obvious - I was more interested in the conceptual
part of the construction of the re itself.
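
For completeness, here is a minimal sketch of the step that was left out:
applying the compiled pattern to a string. The variable names and the
"embedded" sample string are assumptions made for illustration; only the
pattern and "[PROMPT:whatever]" come from the original question.

```python
import re

# Pattern from the original question; note that '.*' is greedy.
y = re.compile(r'\[PROMPT:.*\]')

exact = "[PROMPT:whatever]"            # string from the question
embedded = "something [PROMPT:foo]"    # hypothetical sample with leading text

# match() only succeeds if the pattern matches starting at position 0.
exact_match = y.match(exact)
embedded_match = y.match(embedded)

# search() scans forward and finds the pattern anywhere in the string.
embedded_search = y.search(embedded)

print(exact_match.span())      # (0, 17)
print(embedded_match)          # None
print(embedded_search.span())  # (10, 22)
```

This would explain why the pattern appears to work only when the text is
exactly the prompt: match() is anchored at the start of the string, while
search() is not.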

> 
>>but does not match on:
>>
>>   "something [PROMPT:foo] something [PROMPT:bar] something ..."
>>
>>The overall goal is to identify the beginning and end of each [PROMPT...]
>>string in the line.
> 
> 
> if the pattern can occur anywhere in the string, you need to use "search",
> not "match".  if you want multiple matches, you can use "findall" or, better
> in this case, "finditer":
> 
> import re
> 
> s = "something [PROMPT:foo] something [PROMPT:bar] something"
> 
> for m in re.finditer(r'\[PROMPT:[^]]*\]', s):
>     print(m.span(0))
> 
> prints
> 
>     (10, 22)
>     (33, 45)
> 
> which looks reasonably correct.
> 
> (note the "[^x]*x" form, which is an efficient way to spell "non-greedy match"
> for cases like this)
> 

Thanks - very helpful.  One followup - your re works as advertised.  But
if I use r'\[PROMPT:[^]].*\]' it seems not to.  With '.*' instead of just '*',
it matches the entire string ... which seems counterintuitive to me.
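
A small sketch that reproduces the difference, run against the sample string
quoted above. The dictionary keys and variable names are just labels chosen
for illustration, and the '.*?' variant is added only for comparison. The
likely explanation is that in '[^]].*' the class '[^]]' consumes a single
non-']' character, after which the greedy '.*' runs to the end of the string
and backtracks only as far as the last ']'.

```python
import re

s = "something [PROMPT:foo] something [PROMPT:bar] something"

patterns = {
    "greedy":        r'\[PROMPT:.*\]',      # original attempt
    "negated class": r'\[PROMPT:[^]]*\]',   # suggested form
    "followup":      r'\[PROMPT:[^]].*\]',  # one non-']' char, then greedy '.*'
    "non-greedy":    r'\[PROMPT:.*?\]',     # explicit non-greedy quantifier
}

# Collect the (start, end) span of every match for each pattern.
spans = {
    name: [m.span() for m in re.finditer(pattern, s)]
    for name, pattern in patterns.items()
}

for name, found in spans.items():
    print(name, found)

# greedy        [(10, 45)]
# negated class [(10, 22), (33, 45)]
# followup      [(10, 45)]
# non-greedy    [(10, 22), (33, 45)]
```

In the "followup" and "greedy" cases the single match runs from the first
'[' through the last ']', which is why it looks like the whole line matched.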

Thanks,

-- 
----------------------------------------------------------------------------
Tim Daneliuk     tundra at tundraware.com
PGP Key:         http://www.tundraware.com/PGP/