[ python-Bugs-1104608 ] Wrong expression with \w+?
SourceForge.net
noreply at sourceforge.net
Tue Jan 18 18:26:00 CET 2005
Bugs item #1104608, was opened at 2005-01-18 16:49
Message generated for change (Comment added) made by niemeyer
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1104608&group_id=5470
Category: Regular Expressions
Group: Python 2.4
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: rengel (engel_re)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Wrong expression with \w+?
Initial Comment:
import re
text = 'match the url www.junit.org with following regex'
regex = re.compile(r'(www\.\w+?\.\w+?)')
# Raw strings keep '\1' as a group reference rather than the character chr(1).
print(regex.sub(r'<span class="url">\1</span>', text))
# It produces
match the url <span class="url">www.junit.o</span>rg with following regex
# It should produce
match the url <span class="url">www.junit.org</span> with following regex
----------------------------------------------------------------------
>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-01-18 17:25
Message:
Logged In: YES
user_id=7887
There's nothing wrong with this result. You asked for a non-greedy match
(you've used '\w+?', not '\w+'), and SRE gave you the minimum possible
match.
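
To make the difference concrete, here is a minimal sketch comparing the non-greedy and greedy forms on the string from the report. The variable names (text, lazy, greedy) are illustrative only and do not come from the original submission.

```python
import re

text = 'match the url www.junit.org with following regex'

# Non-greedy: each \w+? stops as soon as the rest of the pattern can match.
# The first \w+? must still grow to 'junit' so that the literal '.' can match;
# the last \w+? has nothing after it, so a single character is enough.
lazy = re.search(r'www\.\w+?\.\w+?', text).group(0)

# Greedy: each \w+ consumes as many word characters as it can.
greedy = re.search(r'www\.\w+\.\w+', text).group(0)

print(lazy)    # www.junit.o
print(greedy)  # www.junit.org
```

The trailing '\w+?' is the one that cuts the match short: with nothing following it in the pattern, one character already satisfies it.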
----------------------------------------------------------------------
Comment By: Fredrik Lundh (effbot)
Date: 2005-01-18 17:23
Message:
Logged In: YES
user_id=38376
No, it shouldn't. "+?" means the shortest possible match
that's one character or more. If you want the longest
possible match, get rid of the "?".
(in this case, I'd use "(www[.\w]*)")
</F>
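
As a rough sketch of the suggestion above, the character-class pattern can be dropped into the substitution from the report. The names url_re, result and trailing are made up for this illustration, and raw strings are used so that '\1' is passed through as a group reference.

```python
import re

text = 'match the url www.junit.org with following regex'

# 'www' followed by any run of dots and word characters.
url_re = re.compile(r'(www[.\w]*)')

result = url_re.sub(r'<span class="url">\1</span>', text)
print(result)

# The class also accepts a sentence-ending period, so the match can overshoot.
trailing = url_re.search('see www.junit.org.').group(1)
print(trailing)  # www.junit.org.
```

This pattern is convenient for simple text, but because '.' is part of the character class it will include a period that directly follows the URL, so it may need tightening for prose input.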
----------------------------------------------------------------------