[docs] [issue24243] behavior for finding an empty string is inconsistent with documentation

Serhiy Storchaka report at bugs.python.org
Wed May 20 16:03:18 CEST 2015


Serhiy Storchaka added the comment:

Changing behavior of such base methods is dangerous and can break existing code. Even if the behavior is wrong, some code can depends on it. We should be very careful with this.

There are several simple invariants for these methods.

s1.index(s2, start, end) (for non-negative indices) returns minimal index i such that start <= i and i + len(s2) <= end and s1[i: i + len(s2)] = s2. Or raise an exception if such index doesn't exist. find() returns -1 instead of an exception. Therefore "".find("", 1, 0) should be -1 and it is. All right.

The only bug is in inconsistency in startswith/endswith between str and bytes (unicode and str in Python 2). The worse, the behavior of str in Python 2 and Python 3 is different.

s1.startswith(s2, start, end) (for non-negative indices and non-tuple s2) is equivalent to start + len(s2) <= end and s2[start: start + len(s2)] == s2. Or to s1.find(s2, start, end) == start. Therefore "".startswith("", 1, 0) should be False, and it is for bytes and str in Python 2. The behavior of "".startswith("", 1, 0) in Python 3 and u"".startswith(u"", 1, 0) in Python 3 is wrong.

The question is can we fix this behavior in default branch (I think rather yes) and should we fix it in maintained releases (doubt)?

----------
components: +Interpreter Core, Unicode
nosy: +benjamin.peterson, ezio.melotti, haypo, lemburg, pitrou, serhiy.storchaka
versions: +Python 2.7, Python 3.4, Python 3.5

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24243>
_______________________________________


More information about the docs mailing list