[New-bugs-announce] [issue24636] re.search not respecting anchor markers in or-ed construction

Almer Tigelaar report at bugs.python.org
Wed Jul 15 12:13:24 CEST 2015


New submission from Almer Tigelaar:

According to the documentation, ^ should restrict the matching of re.search to the beginning of the string, as described here: https://docs.python.org/3.4/library/re.html#search-vs-match

However, this doesn't always seem to work as the following example shows:

re.search("^([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\\.[0-9]+)|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9])|([0-9]{4}-[01][0-9])|([0-9]{4})$", "2015-AE-02T10:16:08.450904")

I expected no match here, since the or-ed patterns sit between the anchors ^ and $ and the input contains "AE". Yet re.search returns a match spanning positions 22 to 26, produced by the last pattern in the or-ed sequence: ([0-9]{4})
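
A reduced reproduction may make the reported result easier to inspect. The sketch below keeps only two of the or-ed patterns (the shortened pattern and the variable names are illustrative, not part of the original report). It relies on the fact that | has the lowest precedence in a regular expression, so ^ attaches only to the first alternative and $ only to the last one:

```python
import re

text = "2015-AE-02T10:16:08.450904"

# Reduced form of the reported pattern (illustrative): "^" belongs only to
# the first alternative, "$" only to the last one.
reduced = re.compile(r"^([0-9]{4}-[01][0-9])|([0-9]{4})$")

m = reduced.search(text)
span = m.span()        # where the match was found
matched = m.group(0)   # the text that matched
print(span, matched)
```

Here the match comes from the second alternative, which is free to start anywhere in the string as long as it ends at the end of the string.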

This can be worked around by explicitly including the anchor markers in the last pattern as follows:

re.search("^([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\\.[0-9]+)|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9])|([0-9]{4}-[01][0-9])|(^[0-9]{4}$)$", "2015-AE-02T10:16:08.450904")

Note that the last pattern now explicitly includes the anchors, (^[0-9]{4}$), which I would expect to be redundant given the anchors that already exist at the beginning and end of the entire regular expression!

This workaround correctly produces no match (which is the behaviour I expected from the first pattern).
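
Another possible approach, offered as a sketch rather than as the tracker's resolution, is to wrap the whole alternation in a non-capturing group, so that ^ and $ apply to every or-ed pattern at once. The helper names (date, alternatives, anchored) and the second and third test strings are assumptions made for illustration:

```python
import re

# Shared date prefix used by most alternatives (illustrative helper).
date = r"[0-9]{4}-[01][0-9]-[0-3][0-9]"

# The seven or-ed patterns from the report, longest first.
alternatives = [
    date + r"T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\.[0-9]+",
    date + r"T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]",
    date + r"T[0-2][0-9]:[0-5][0-9]",
    date + r"T[0-2][0-9]",
    date,
    r"[0-9]{4}-[01][0-9]",
    r"[0-9]{4}",
]

# "^(?:(a)|(b)|...)$": the non-capturing group makes both anchors apply
# to the alternation as a whole; each alternative keeps its own group.
anchored = re.compile(
    "^(?:" + "|".join("(" + a + ")" for a in alternatives) + ")$"
)

no_match = anchored.search("2015-AE-02T10:16:08.450904")
full = anchored.search("2015-07-15T10:16:08.450904")
year_only = anchored.search("2015")
print(no_match, full.group(1), year_only.group(7))
```

With this grouping, the input containing "AE" yields no match, while a well-formed timestamp and a bare year still match. Calling re.fullmatch (available since Python 3.4) on the unanchored alternation should behave similarly.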

----------
components: Regular Expressions
messages: 246756
nosy: Almer Tigelaar, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.search not respecting anchor markers in or-ed construction
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24636>
_______________________________________

