[Tutor] non-greedy matching

Willi Richert w.richert at gmx.net
Fri Jan 30 13:56:45 CET 2009


Hi,

You make a quantifier non-greedy by appending "?" to it:

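import re
text = "axa axa"

# Greedy: ".*" takes as much as it can, so the match runs to the last "a".
greedy_x, greedy_y = re.match("a.*a", text).span()
print(text[greedy_x:greedy_y])

# Non-greedy: ".*?" takes as little as it can, so the match ends at the next "a".
non_greedy_x, non_greedy_y = re.match("a.*?a", text).span()
print(text[non_greedy_x:non_greedy_y])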

This prints:
axa axa
axa
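
The same idea applied to the "(@@ ... @@)" delimiters from the original question might look like the sketch below. The sample text is made up for illustration, and I am assuming the parentheses are literal characters, as Albert did in his reply.

```python
import re

# Hypothetical sample text; the "(@@ ... @@)" delimiters follow spir's pattern.
text = "(@@ first @ item @@) and (@@ second @@)"

# Greedy: ".*" runs to the end of the line, then backs up to the LAST "@@)".
greedy = re.findall(r"\(@@.*@@\)", text)

# Non-greedy: ".*?" grows one character at a time and stops at the FIRST "@@)".
non_greedy = re.findall(r"\(@@.*?@@\)", text)

print(greedy)      # one long match spanning both blocks
print(non_greedy)  # two separate matches, one per block
```

Note that the single "@" inside the first block does not end the match, because the pattern only stops once the full "@@)" sequence follows.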

Regards,
wr

On Friday, 30 January 2009 13:44:42 A.T.Hofkamp wrote:
> spir wrote:
> > Hello,
> >
> > imagine you need to match such a pattern:
> >
> > pat : (@@ [charset]* @@) | [charset]*
> > ... where [charset] has to include '@'
> >
> > My questions are:
> > * Is there any other way than using a non-greedy form of [charset]* ?
>
> Something like this?
> (in pseudo-RE syntax)
>
> "(@@" ( [^@]* "@" [^@] | [^@]* "@" "@"+ [^@)] )* [^@]* "@" "@"+ ")"
>
> To understand it, break the above down into pieces:
>
> "(@@" ( A | B )* C
>
> with A: [^@]* "@" [^@]
>          # lots of chars, then a @, and a non-@
>
>       B: [^@]* "@" "@"+ [^@)]
>          # lots of chars, then at least two @, and a non-closing bracket
>          # (the non-@ at the end is for forcing all @ to be matched in "@"+)
>
>       C: [^@]* "@" "@"+ ")"
>          # lots of chars, then at least two @, and finally a closing bracket
>
> > * How, actually, is non-greedy character string matching performed?
>
> That's what I'd like to know too.
> (and while we are at it, what about the \b )?
>
>
> Sincerely,
> Albert
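
Albert's pseudo-RE above can be written as a real Python pattern. The following is only a sketch of one possible translation, using re.VERBOSE so the A, B and C pieces stay visible; the names `albert`, `lazy` and `sample` and the sample string are my own, not from the thread. It compares the result with the non-greedy form on a single input, which is a spot check and not a proof that the two are equivalent.

```python
import re

# Translation of the pseudo-RE:  "(@@" ( A | B )* C
albert = re.compile(r"""
    \(@@                    # literal opening delimiter
    (?: [^@]* @ [^@]        # A: a lone @ followed by a non-@
      | [^@]* @ @+ [^@)]    # B: two or more @, then a char that is not @ or )
    )*
    [^@]* @ @+ \)           # C: two or more @, then the closing bracket
""", re.VERBOSE)

# Non-greedy form of the same idea, for comparison.
lazy = re.compile(r"\(@@.*?@@\)")

# Hypothetical input: contains a lone "@", an "@)" that is not a closer,
# and two candidate closers "@@)".
sample = "(@@ a @ b @) c @@) tail @@)"

m = albert.search(sample)
n = lazy.search(sample)
print(m.group())
print(n.group())
```

On this input both patterns stop at the first "@@)". The explicit version does so without any non-greedy quantifier, because neither A nor B is able to consume a run of "@" that is directly followed by ")".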

