[ python-Bugs-1104608 ] Wrong expression with \w+?

SourceForge.net noreply at sourceforge.net
Tue Jan 18 18:26:00 CET 2005


Bugs item #1104608, was opened at 2005-01-18 16:49
Message generated for change (Comment added) made by niemeyer
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1104608&group_id=5470

Category: Regular Expressions
Group: Python 2.4
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: rengel (engel_re)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Wrong expression with \w+?

Initial Comment:
import re
str = 'match the url www.junit.org with following regex'
regex = re.compile(r'(www\.\w+?\.\w+?)')
# raw strings keep \. \w and \1 intact instead of being read as string escapes
print regex.sub(r'<span class="url">\1</span>', str)
# It produces
match the url <span class="url">www.junit.o</span>rg with following regex
# It should produce
match the url <span class="url">www.junit.org</span> with following regex



----------------------------------------------------------------------

>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-01-18 17:25

Message:

There's nothing wrong with this result. You asked for a non-greedy match 
(you've used '\w+?', not '\w+'), and SRE gave you the minimum possible 
match. 
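
A minimal sketch of the difference, reusing the pattern and input string from the report. The variable names (text, lazy, greedy) are illustrative and not part of the original. The first '\w+?' still has to grow to "junit" because the pattern requires a literal "." right after it. Nothing follows the final '\w+?', so it can stop after a single character.

```python
import re

text = 'match the url www.junit.org with following regex'

# Non-greedy, as in the report: the trailing \w+? stops after one character.
lazy = re.compile(r'(www\.\w+?\.\w+?)')
# Greedy variant: the trailing \w+ takes as many word characters as it can.
greedy = re.compile(r'(www\.\w+\.\w+)')

lazy_match = lazy.search(text).group(1)
greedy_match = greedy.search(text).group(1)

print(lazy_match)    # www.junit.o
print(greedy_match)  # www.junit.org
```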

----------------------------------------------------------------------

Comment By: Fredrik Lundh (effbot)
Date: 2005-01-18 17:23

Message:

No, it shouldn't.  "+?" means the shortest possible match 
that's one character or more.  If you want the longest 
possible match, get rid of the "?".

(in this case, I'd use "(www[.\w]*)")

</F>
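
A short sketch of the suggested "(www[.\w]*)" pattern applied to the string from the report. The names text, url and result are assumptions made for illustration. Raw strings are used so that the "\1" backreference reaches the regex engine unchanged.

```python
import re

text = 'match the url www.junit.org with following regex'

# Greedy character class: consume any run of dots and word characters after "www".
url = re.compile(r'(www[.\w]*)')
result = url.sub(r'<span class="url">\1</span>', text)

print(result)
# match the url <span class="url">www.junit.org</span> with following regex
```

Because the character class accepts "." anywhere, this pattern would also swallow a full stop that directly follows the URL at the end of a sentence, so it is a convenience for input like the above rather than a general URL matcher.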

----------------------------------------------------------------------
