[Tutor] Question regular expressions - the non-greedy pattern
Walter Prins
wprins at gmail.com
Tue Jan 22 01:39:58 CET 2013
Hi Marcin,
On 21 January 2013 23:11, Marcin Mleczko <Marcin.Mleczko at onet.eu> wrote:
> first thank you very much for the quick reply.
>
No problem...
> The functions used here, i.e. re.match(), are taken directly from the
> example in the mentioned HowTo. I'd rather use re.findall() but I
> think the general interpretation of the given regexp should be nearly
> the same in both functions.
>
... except that the results are fundamentally different, because the two
functions have different goals. re.match() only tries to match the regex
starting at the first character of the string. (There is no conceptual
"walking forward" of the starting point; the engine only advances through
the string while it is matching the regex.) re.findall(), like re.search(),
finds the first possible match, conceptually walking the starting point
forward only as far as necessary to find one.
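A quick sketch of that difference, using your string and a literal pattern
('<head>' is just my choice for illustration, it is not from the HowTo):

```python
import re

s = '<<html><head><title>Title</title>'

# re.match() is anchored at position 0, and s starts with '<<html>',
# so a pattern for '<head>' cannot match there.
m = re.match('<head>', s)

# re.search() moves the starting point forward until a match is possible.
f = re.search('<head>', s)

# re.findall() scans the same way and collects every match it finds.
all_found = re.findall('<head>', s)

print(m)                     # None
print(f.group(), f.start())  # <head> 7
print(all_found)             # ['<head>']
```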
> So I'd like to neglect the choice of a particular function for a
> moment and concentrate on the pure theory.
> What I got so far:
> in theory from s = '<<html><head><title>Title</title>'
> '<.*?>' would match '<html>' '<head>' '<title>' '</title>'
> to achieve this the engine should:
> 1. walk forward along the text until it finds <
> 2. walk forward from that point until it finds >
>
Here, conceptually, the regex engine's work for your original regex is
complete and it returns a match.
> 3. walk backward from that point (the one of >) until it finds <
>
No. No further walking backward when you've already matched the regex.
> 4. return the string between < from 3. and > from 2. as this gives the
> least possible string between < and >
>
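"Non-greedy" doesn't imply conceptually moving the starting point backwards
after you've already found a match. It only means that, from the starting
point the engine has already settled on, the quantifier consumes as few
characters as possible going forward.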
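To see this with your own string, here is a small sketch. The third pattern,
'<[^<>]*>', is my own suggestion for getting the "least possible string"
behaviour you describe; it is not something from the HowTo:

```python
import re

s = '<<html><head><title>Title</title>'

# Non-greedy: the match starts at the first '<' and stops at the first '>'
# after it. The starting point is never moved backwards or re-chosen.
lazy_all = re.findall('<.*?>', s)

# Greedy, for comparison: same starting point, but '.*' runs as far as it can.
greedy_first = re.match('<.*>', s).group()

# Assumed alternative: forbid '<' and '>' inside the tag, so a match can
# only begin at the '<' nearest to the closing '>'.
strict_all = re.findall('<[^<>]*>', s)

print(lazy_all)      # ['<<html>', '<head>', '<title>', '</title>']
print(greedy_first)  # <<html><head><title>Title</title>
print(strict_all)    # ['<html>', '<head>', '<title>', '</title>']
```

Note that the first non-greedy result is '<<html>', not '<html>': the engine
started at the very first '<' and had no reason to give that start up.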
> Did I get this right so far? Is this (=least possible string between <
> and >), what non-greedy really translates to?
>
No, as explained above.
Walter