[Python-ideas] Deprecate str.find

Steven D'Aprano steve at pearwood.info
Sun Jul 17 12:56:15 CEST 2011


Masklinn wrote:
> On 2011-07-17, at 10:09 , Raymond Hettinger wrote:
>> On Jul 17, 2011, at 12:15 AM, Nick Coghlan wrote:
>>> Indeed, the problem as I see it is that our general idiom for
>>> functions and methods that raise 'Not Found' exceptions is to accept
>>> an optional parameter that specifies a value to return in the Not
>>> Found case.
>> There's a difference between methods that return looked-up values
>> (where a default might make sense) versus a method that returns
>> an index (where it usually makes no sense at all).
> Why not? An index is a looked-up value here (it's just a reverse lookup)
> and .find is returning a (non-configurable) default value is it not?


No.

In context, Raymond is not talking about values as arbitrary objects. He 
is talking specifically about values of a collection. E.g. given:

mylist = [23, 42, 100]

the values Raymond is talking about are 23, 42 and 100, *not* 0, 1, 2 
(the indexes of the list) or 3 (the length of the list) or 165 (the sum 
of the list) or any other arbitrary value.



>> * Take a look at what other languages do.  Practically every general
>> purpose language has an API for doing substring searches.  Since
>> we're not blazing new territory here, there needs to be a good precedent
>> for this change (no shooting from the hip when the problem has already
>> been well solved many times over).
> Other languages do everything and their reverse. You have languages
> returning −1 or 0 (the latter for 1-indexed languages), languages
> returning differently-typed sentinels, etc…
> 
> SML even returns the length of the string.
> 
> See a listing at http://en.wikipedia.org/wiki/Comparison_of_programming_languages_(string_functions)#Find

I can't really see that they do "everything and their reverse". There 
are two basic strategies: return an out-of-bound value, and raise an 
exception, both of which Python already does.
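
As a minimal illustration of those two strategies as they exist in Python today (the sample string is arbitrary): str.find returns an out-of-bound value, while str.index raises an exception.

```python
text = "hello world"

# Strategy 1: return an out-of-bound value when the substring is absent.
found = text.find("world")
missing = text.find("xyz")      # -1 is the "not found" value

# Strategy 2: raise an exception when the substring is absent.
try:
    text.index("xyz")
    raised = False
except ValueError:
    raised = True
```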

Out-of-bound values are usually one smaller than the lowest valid index 
(0 or -1) or one higher than the highest valid index (length of the 
string, or some value greater than the length of the string). A couple of 
languages return False, which is inappropriate for Python on account of 
False equaling 0. Some return a dedicated "Not Found" special value, but 
Python doesn't go in for a proliferation of special constants. A couple 
of languages, including Ruby, return the equivalent of Python's None.
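
A small sketch of why False would be a poor "not found" value in Python. The helper find_or_false is hypothetical, written only for this illustration; it is not an existing or proposed API.

```python
def find_or_false(haystack, needle):
    """Hypothetical variant of str.find that returns False when not found."""
    pos = haystack.find(needle)
    return False if pos == -1 else pos

at_start = find_or_false("spam", "sp")   # substring found at index 0
absent = find_or_false("spam", "xyz")    # substring not found

# False == 0 in Python, so an equality or truthiness test cannot tell
# "found at index 0" apart from "not found"; only an identity check can.
ambiguous = (at_start == absent)
```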

Notably missing is anything like the ability for the caller to specify 
what index to return if the sub-string is missing.

Ask yourself, can you imagine needing mydict.get(key, 1) or 
mydict.get(key, set())? I expect that you can easily think of reasons 
why this would be useful. The usefulness of being able to set the return 
value of failed lookups on mappings like dicts is obvious. I wish lists also had a 
similar get method, and I bet that everybody reading this, even if they 
disagree that it should be built-in, can see the value of it as a 
utility function.
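
One possible shape for such a utility, as a sketch only. The name list_get and its signature are assumptions made for illustration, modelled on dict.get; nothing like it exists as a list method.

```python
def list_get(seq, index, default=None):
    """Return seq[index], or default if the index is out of range."""
    try:
        return seq[index]
    except IndexError:
        return default

mylist = [23, 42, 100]
in_range = list_get(mylist, 1)            # a value from the list
out_of_range = list_get(mylist, 10, 0)    # falls back to the default

# The dict equivalent already exists: the default stands in for a
# missing *value*, which is why an arbitrary default makes sense.
counts = {"a": 3}
count_b = counts.get("b", 1)
```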

But can you think of a realistic scenario where you might want to call 
mystring.find(substr, missing=1)? Why would you want "substring not 
present" and "substring found at index 1" to both return the same thing?

How about mystring.find(substr, missing=set())?

If you can't imagine a realistic scenario where you would want such a 
feature, then you probably don't need this proposed feature.
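
To make the ambiguity concrete, here is a sketch of what such a parameter might look like. find_with_default and its missing argument are hypothetical, written as a plain function because str.find accepts no such keyword.

```python
def find_with_default(string, substr, missing=-1):
    """Hypothetical str.find with a caller-chosen 'not found' value."""
    pos = string.find(substr)
    return missing if pos == -1 else pos

present = find_with_default("abc", "b", missing=1)   # "b" really is at index 1
absent = find_with_default("abc", "z", missing=1)    # "z" is not there at all

# Both calls return 1, so the caller can no longer tell the cases apart.
same_result = (present == absent)

# A non-integer default is allowed too, but the result is then useless
# as an index.
odd = find_with_default("abc", "z", missing=set())
```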



> The most common behavior on the page does seem to be returning a
> numerical sentinel. On the other hand, I'm not sure how many of these
> languages return a sentinel value which is also a valid index.

The first table on the page says that four languages accept negative 
indexes: Python, Ruby, Perl and Lua. Perl and Python return -1 on not 
found; Ruby and Lua return nil.
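
In Python the consequence is easy to demonstrate: because -1 is itself a valid index, an unchecked result from str.find can be used to index the string without any error. The sample string below is arbitrary.

```python
text = "spam"

pos = text.find("x")    # "x" is absent, so find returns -1
char = text[pos]        # no exception: -1 indexes the last character

# The check that callers have to remember to write:
result = text[pos] if pos != -1 else None
```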



-- 
Steven


