[docs] [issue24243] behavior for finding an empty string is inconsistent with documentation

Vedran Čačić report at bugs.python.org
Mon Sep 18 16:50:03 EDT 2017

Vedran Čačić added the comment:

Raymond, with respect, I think you're either wrong here, or misleading with a purpose.

There is a big difference between any(()) returning False, all(()) returning True, '' in '' returning True, math.factorial(0) returning 1, and set() <= set() returning True, on one hand, and ''.rindex('') returning 5, on the other. You might argue the latter is convenient (though I haven't seen any argument in favor of that), but it's simply not in the same category as other phenomena you're equating it with.
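To make the two groups concrete, here is a minimal sketch of what CPython reports for each. The five-character string 'hello' and the variable names are my own illustrative choices, not taken from the issue; str.rindex with an empty needle returns the length of the string, which is where a value like 5 comes from.

```python
import math

# First group: each result is forced by a mathematical definition.
first_group = {
    "any(())": any(()),
    "all(())": all(()),
    "'' in ''": '' in '',
    "math.factorial(0)": math.factorial(0),
    "set() <= set()": set() <= set(),
}

# Second group: an index is reported for the empty needle.
s = 'hello'                      # illustrative string, assumed for this sketch
last_empty_index = s.rindex('')  # equals len(s)

for expression, value in first_group.items():
    print(expression, '->', value)
print("'hello'.rindex('') ->", last_empty_index)
```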

Those in the first group are mathematical definitions. They cannot be different without breaking various mathematical properties. 0! is 1 in the same way, and for the same reason, that 3! is 6. If 0! were 5, then 3! would _have_ to be 30. Both the specification (as the order of the symmetric group), and various algorithms for calculating factorial, simply give 1 when given 0 as an input. You can't really get 5 unless you explicitly treat 0 as a special case.
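A small sketch of that argument, assuming the usual "multiply 1..n together" algorithm. The `base` parameter is hypothetical; it exists only to show what happens if the empty product is given some other value.

```python
def factorial(n, base=1):
    """Multiply the integers 1..n together, starting from `base`.

    `base` is the value of the empty product. It is a hypothetical knob
    added for this illustration; the ordinary factorial uses 1.
    """
    result = base
    for k in range(1, n + 1):
        result *= k
    return result


# No special case for 0: the loop body simply never runs.
ordinary = [factorial(n) for n in range(4)]          # 0!, 1!, 2!, 3!

# Declaring "0! is 5" drags every other value along with it.
skewed = [factorial(n, base=5) for n in range(4)]
```

With `base=1` the list ends in 6; with `base=5` it ends in 30, because the recurrence n! = n * (n-1)! ties every value to the one chosen for 0.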

In the second group, there is an answer that might be convenient, and it probably has some obvious algorithm that produces it (I haven't read the code, but I doubt that someone treated '' as a special case there), but it doesn't fit the specification (maximal index doesn't exist), and you can easily write another algorithm that doesn't obviously treat '' as a special case, gives the same answers for all the other "sensible" cases, and gives something completely different for this case. The proof is that in many very similar cases, those different algorithms _have_ (inadvertently) been written.
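As a sketch of what I mean, here are two searches that I made up for illustration (neither is CPython's implementation, which I have not read). Neither mentions '' anywhere, they agree on every non-empty needle, and they disagree on the empty one.

```python
def rindex_by_start(s, sub):
    """Try every start offset where len(sub) characters could fit, last first."""
    for i in range(len(s) - len(sub), -1, -1):
        if s[i:i + len(sub)] == sub:
            return i
    raise ValueError('substring not found')


def rindex_by_char(s, sub):
    """Try every offset that holds an actual character of s, last first."""
    for i in range(len(s) - 1, -1, -1):
        if s.startswith(sub, i):
            return i
    raise ValueError('substring not found')


s = 'hello'  # illustrative string
needles = ['h', 'l', 'lo', 'ell', 'hello', 'o']

# Same answers for all the "sensible" cases ...
agree = all(rindex_by_start(s, n) == rindex_by_char(s, n) for n in needles)

# ... and different answers for the empty needle.
empty_by_start = rindex_by_start(s, '')
empty_by_char = rindex_by_char(s, '')
```

The first variant lands on len(s), the second on len(s) - 1, simply because one loops over offsets and the other over characters.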

One question is _whether_ '' is in some other string. Quite another is _where_ it is. The first one is (greatly simplified, since it doesn't require contiguity) all(c in other_string for c in ''), and that's obviously True for the same reason that set() <= any_set is True. But all(1/0 for c in '') is _also_ True, which shows that it really doesn't matter _what_ we test, as long as we test it on an empty collection. It shouldn't give us (by design) _any_ information that we can extract about "that particular occurrence of ''" because there is in fact _no_ particular occurrence to talk about.
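A quick sketch of that vacuous truth. The name other_string and the one-character string 'x' are placeholders I picked for the illustration.

```python
other_string = 'hello'  # placeholder; any string gives the same result

# Membership test over the characters of '': nothing to check, so True.
membership = all(c in other_string for c in '')

# The tested expression is never evaluated, so even 1/0 is harmless.
nonsense = all(1 / 0 for c in '')

# With a single character to iterate over, the expression does run.
try:
    all(1 / 0 for c in 'x')
    division_attempted = False
except ZeroDivisionError:
    division_attempted = True
```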

In the face of ambiguity, refuse the temptation to guess. Yes, I see the convenience of not raising ValueErrors for various string operations so some algorithms can say "ok, give me any value in case of '', I don't really care what it is", but we already do raise it for some operations, e.g. when splitting on an empty separator. We should do the same here. Either change the specification, or if the specification tells you to calculate something that doesn't exist, raise an exception.
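The inconsistency can be seen side by side in a short sketch. The helper `outcome` and the string 'hello' are hypothetical names of mine, used only to capture either the return value or the exception.

```python
def outcome(operation):
    """Run a zero-argument callable; return its result, or the ValueError it raised."""
    try:
        return operation()
    except ValueError as exc:
        return exc


s = 'hello'  # illustrative string

# Splitting on an empty separator refuses to guess ...
split_outcome = outcome(lambda: s.split(''))

# ... while searching for an empty substring returns an index.
rindex_outcome = outcome(lambda: s.rindex(''))

print(repr(split_outcome))
print(repr(rindex_outcome))
```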

nosy: +veky

Python tracker <report at bugs.python.org>
