
On 10/4/2011 10:21 PM, Guido van Rossum wrote:
> But with str.find, 0 is a legitimate result, so if we were to return None there'd be *two* outcomes mapping to false: no match, or a match at the start of the string, which is no good.
People would have to test that the result 'is None' or 'is not None'. That is no worse than testing '== -1' or '>= 0'. I claim it is better, because continuing to compute with None as if it were a number will more likely raise an error quickly. Doing the same with a '-1', which looks like a legitimate string position (the last) but does not really mean that, might never raise an error and instead lead to erroneous output. (I said 'more likely' because None is valid in slicings, same as -1.)
Example: define char_before(s,c) as returning the character before the first occurrence of c in s. Ignoring the s.startswith(c) case:
>>> s='abcde'
>>> s[s.find('e')-1]
'd'  # Great, it works
>>> s[s.find('f')-1]
'd'  # Whoops, not so great. s[None] fails, as it should.
You usually try to avoid such easy bug bait. I cannot think of any other built-in function that returns such a valid but invalid result.
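To make the contrast concrete, here is a minimal sketch. Note that find_or_none is a hypothetical helper I made up for illustration, not anything in str; it stands in for a find that returns None on a miss.

```python
def find_or_none(s, sub):
    """Hypothetical variant of str.find: None instead of -1 on a miss."""
    i = s.find(sub)
    return None if i == -1 else i

def char_before_find(s, c):
    # On a miss, find returns -1, so the index becomes -2 and
    # silently selects the second-to-last character.
    return s[s.find(c) - 1]

def char_before_none(s, c):
    # On a miss, None - 1 raises TypeError before any indexing happens.
    return s[find_or_none(s, c) - 1]

s = 'abcde'
print(char_before_find(s, 'e'))   # 'd', correct
print(char_before_find(s, 'f'))   # 'd' again, silently wrong
print(char_before_none(s, 'e'))   # 'd', correct
```

Calling char_before_none(s, 'f') raises TypeError instead of returning a plausible-looking character.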
> Hence the -1: the intent was that people should write "if s.find(x)>= 0" -- but clearly that didn't work out either, it's too easy to forget the ">= 0" part.
As easy as, or easier than, forgetting '== None'.
> We also have str.index which raises an exception, but people dislike writing try/except blocks.
Given that try/except blocks are routinely used for flow control in Python, and that some experts even advocate using them over if/else (leap first), I am tempted to ask why such people are using Python. I am curious, though, why this exception is more objectionable than all the others -- and why you apparently give such objections for this function more weight than for others.
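For comparison, a sketch of the same char_before written with str.index and try/except. Returning None on a miss, and for a match at position 0, is my own choice for the illustration; the example above simply ignored the startswith case.

```python
def char_before(s, c):
    """Return the character before the first occurrence of c in s, else None."""
    try:
        i = s.index(c)          # raises ValueError if c is not in s
    except ValueError:
        return None             # no match: nothing to return
    # A match at position 0 has no preceding character (assumed behavior).
    return s[i - 1] if i > 0 else None

print(char_before('abcde', 'e'))   # 'd'
print(char_before('abcde', 'f'))   # None
```

The miss can never be mistaken for a position, because it never reaches the indexing expression.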
One could justify out-of-range IndexError on the basis that an in-range indexing could return any object, including None, so that the signal *must* not be a normal return (even of an exception object).
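A small illustration of that point, using an arbitrary list of my own: None can be a perfectly good stored value, so it cannot double as the out-of-range signal.

```python
items = [0, None, 'x']

in_range = items[1]            # None here is real data, not a "missing" marker

try:
    items[3]                   # one past the end
    out_of_range_raised = False
except IndexError:
    out_of_range_raised = True

print(in_range, out_of_range_raised)   # None True
```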
However, Python comes with numerous, probably 100s of functions with restricted output ranges that raise exceptions (TypeError, ValueError, AttributeError, etc) instead of returning, for instance, None.
For example, consider int('a'): why not None instead of ValueError? One reason is that s[:int('a')] would then return s instead of raising an error.
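A sketch of what that would look like. int_or_none is a hypothetical stand-in for an int() that returns None instead of raising; it is not a real built-in.

```python
def int_or_none(text):
    """Hypothetical int() that returns None instead of raising ValueError."""
    try:
        return int(text)
    except ValueError:
        return None

s = 'abcde'
print(s[:int_or_none('2')])   # 'ab', the intended use
print(s[:int_or_none('a')])   # 'abcde', because s[:None] means "no upper bound"
```

With the real int, the second slice raises ValueError at the conversion, before any slicing takes place.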
I strongly suspect that if we did not have str.find now, we would not add it, and certainly not in its current form.
> I'm out of ideas here. But of all these, str.find is probably still the worst -- I've flagged bugs caused by it too many times to count.
So let's deprecate it for eventual removal, maybe in Py4.