Bug? re.finditer fails to terminate with empty match
The iterator returned by re.finditer appears not to terminate if the final match is empty; instead it keeps returning the final (empty) match.

Is this a bug in _sre? If so, I'll be happy to file it, though fixing it is a bit beyond my _sre experience level at this point.

The solution would appear to be either to add a check for a duplicate match in iterator.next(), or to increment the position by one after returning an empty match (which should be OK, because if a non-empty match started at that location, we would have returned it instead of the empty match).

Code to illustrate the failure:

    from re import finditer

    last = None
    for m in finditer( ".*", "asdf" ):
        # the same span twice in a row means the iterator is stuck
        if last == m.span():
            print "duplicate match:", last
            break
        print m.group(), m.span()
        last = m.span()

---
asdf (0, 4)
 (4, 4)
duplicate match: (4, 4)
---

findall works:

    import re
    print re.findall( ".*", "asdf" )

['asdf', '']

The workaround is to explicitly check for a duplicate span, as I did above, or to check for a duplicate end(), which also avoids the final empty match.

kb
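The duplicate-span workaround could be wrapped in a small generator so callers do not have to repeat the check. This is only a sketch: the name finditer_no_repeat is made up for illustration and is not part of the re module, and it assumes the stuck iterator always repeats exactly the same span, as in the output above.

```python
import re

def finditer_no_repeat(pattern, text):
    """Yield matches from re.finditer, stopping if a span repeats.

    Hypothetical helper, not part of re: it assumes that a stuck
    iterator returns the same span twice in a row.
    """
    last_span = None
    for m in re.finditer(pattern, text):
        if m.span() == last_span:
            # Same span as the previous match: treat the iterator as stuck.
            break
        last_span = m.span()
        yield m

if __name__ == "__main__":
    for m in finditer_no_repeat(".*", "asdf"):
        print(repr(m.group()), m.span())
```

On an interpreter where finditer already terminates, the check never fires and the wrapper passes every match through unchanged.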
participants (1)
-
Kevin J. Butler