[ python-Bugs-1212411 ] Incorrect result for regular expression - "|(hello)|(world)"

SourceForge.net noreply at sourceforge.net
Thu Jun 2 00:59:37 CEST 2005


Bugs item #1212411, was opened at 2005-06-01 05:13
Message generated for change (Comment added) made by karamana
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1212411&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Vijay Kumar (karamana)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Incorrect result for regular expression - "|(hello)|(world)"

Initial Comment:
The regular expression "|hello|world" incorrectly gives a 
match, owing to the leading '|'.  Below is a sample 
program which highlights this.  The correct behavior 
is to return None.

If the leading '|' is removed then the result is correct.

-----
import re
m = re.search("|hello|world","This is a simple sentence")
print m

m2 = re.search("hello|world","This is a simple sentence")
print m2

---- output ---
<_sre.SRE_Match object at 0x00B71F70>
None
----------
The first one is incorrect.  It should have returned None.
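
To see what the first search actually matched, the match object can be inspected. The following is a minimal sketch, assuming Python 3 syntax (the report itself uses Python 2.4 print statements); it only uses the standard `re` module and the same sample sentence.

```python
import re

text = "This is a simple sentence"

# The leading '|' introduces an empty alternative, which can match
# the empty string at the very first position that is tried.
m = re.search("|hello|world", text)
print(m is not None)     # True: a match object is returned
print(m.span())          # (0, 0): zero-length match at the start
print(repr(m.group()))   # '': the matched text is empty

# Without the leading '|' neither alternative occurs in the text.
m2 = re.search("hello|world", text)
print(m2)                # None
```

The match object is truthy, but its span shows that no characters of the sentence were consumed.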


----------------------------------------------------------------------

>Comment By: Vijay Kumar (karamana)
Date: 2005-06-01 22:59

Message:
Logged In: YES 
user_id=404715

I think what you are saying is correct in a formal 
sense, but it makes sense to distinguish between a useful 
match and an empty match.  Maybe there could be an 
additional method isEmptyMatch() on the match object which 
could be used to detect this.

Also, this one does not work; it gives a compile error:
m = re.search("[]","This is a simple sentence")
print m

whereas this one returns None:
m = re.search("[|]","This is a simple sentence")
print m

So the empty match is not consistent :)  (don't know if I 
should wink )
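
The distinction asked for above can be made with the existing match API, and the two bracket patterns behave differently for reasons unrelated to empty matches. The sketch below assumes Python 3; `is_empty_match` is a hypothetical helper named after the suggestion in this comment, not a method that exists in the `re` module.

```python
import re

def is_empty_match(m):
    """Hypothetical helper: True if m is a match of zero length."""
    return m is not None and m.start() == m.end()

text = "This is a simple sentence"

empty = re.search("|hello|world", text)
print(is_empty_match(empty))   # True

# "[]" fails to compile: a ']' placed directly after '[' is taken
# as a literal member of the set, so the set is never closed.
try:
    re.compile("[]")
    compiled = True
except re.error:
    compiled = False
print(compiled)                # False

# "[|]" is a character set containing a literal '|', not an
# alternation, so it only matches where the text contains '|'.
pipe = re.search("[|]", text)
print(pipe)                              # None
print(re.search("[|]", "a|b").group())   # |
```

Under this reading, "[|]" returning None is a normal failed search for the character '|' rather than an inconsistent empty match.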

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2005-06-01 21:39

Message:
Logged In: YES 
user_id=80475

The current behavior best matches my expectations.
One other data point: AWK handles it the same way.

Recommend closing this as Invalid.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2005-06-01 05:19

Message:
Logged In: YES 
user_id=31435

I expect you'll find that, e.g., Perl does the same thing:  
a "missing" alternative is treated as an empty string, and an 
empty string always matches.  What basis do you have for 
claiming it should not match (beyond just repeating that it 
should not <wink>)?
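
The rule described here, that a missing alternative is an empty string and an empty string always matches, can be checked directly. This is a small sketch assuming Python 3 and only the standard `re` module; the sample strings are illustrative and not taken from the report.

```python
import re

# An empty pattern matches at position 0 of any string.
print(re.search("", "abc").span())        # (0, 0)

# Alternatives are tried left to right at each position, so an
# empty first alternative succeeds before 'hello' is even tried.
first = re.search("|hello|world", "hello world")
print(repr(first.group()))                # ''

# With the empty alternative last, 'hello' is tried first and wins.
last = re.search("hello|world|", "hello world")
print(repr(last.group()))                 # 'hello'

# The empty alternative still succeeds at position 0 when the other
# alternatives fail there, even though 'hello' occurs later on.
later = re.search("hello|world|", "say hello")
print(later.span())                       # (0, 0)
```

The position of the empty alternative only changes which branch wins at a given starting point; any pattern containing one can never fail to match.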

----------------------------------------------------------------------


