[Python-ideas] Deprecate str.find
Steven D'Aprano
steve at pearwood.info
Sun Jul 17 12:56:15 CEST 2011
Masklinn wrote:
> On 2011-07-17, at 10:09 , Raymond Hettinger wrote:
>> On Jul 17, 2011, at 12:15 AM, Nick Coghlan wrote:
>>> Indeed, the problem as I see it is that our general idiom for
>>> functions and methods that raise 'Not Found' exceptions is to accept
>>> an optional parameter that specifies a value to return in the Not
>>> Found case.
>> There's a difference between methods that return looked-up values
>> (where a default might make sense) versus a method that returns
>> an index (where it usually makes no sense at all).
> Why not? An index is a looked-up value here (it's just a reverse lookup)
> and .find is returning a (non-configurable) default value is it not?
No.
In context, Raymond is not talking about values as arbitrary objects. He
is talking specifically about values of a collection. E.g. given:
    mylist = [23, 42, 100]
the values Raymond is talking about are 23, 42 and 100, *not* 0, 1, 2
(the indexes of the list) or 3 (the length of the list) or 165 (the sum
of the list) or any other arbitrary value.
>> * Take a look at what other languages do. Practically every general
>> purpose language has an API for doing substring searches. Since
>> we're not blazing new territory here, there needs to be a good precedent
>> for this change (no shooting from the hip when the problem has already
>> been well solved many times over).
> Other languages do everything and their reverse. You have languages
> returning −1 or 0 (the latter for 1-indexed languages), languages
> returning differently-typed sentinels, etc…
>
> SML even returns the length of the string.
>
> See a listing at http://en.wikipedia.org/wiki/Comparison_of_programming_languages_(string_functions)#Find
I can't really see that they do "everything and their reverse". There
are two basic strategies: return an out-of-bound value, and raise an
exception, both of which Python already does.
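For reference, a minimal sketch of those two strategies as they exist in Python today: str.find returns the out-of-bound value -1, while str.index raises ValueError. The sample string and variable names are only for illustration.

```python
text = "hello world"

# Strategy 1: out-of-bound sentinel. str.find returns -1 when the
# substring is absent.
found_at = text.find("world")
missing_at = text.find("xyz")

# Strategy 2: exception. str.index raises ValueError when the
# substring is absent.
try:
    text.index("xyz")
    raised = False
except ValueError:
    raised = True

print(found_at, missing_at, raised)
```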
Out-of-bound values are usually one smaller than the lowest valid index
(0 or -1) or one higher than the highest valid index (length of the
string, or one greater than the length of the string). A couple of
languages return False, which is inappropriate for Python on account of
False equaling 0. Some return a dedicated "Not Found" special value, but
Python doesn't go in for a proliferation of special constants. A couple
of languages, including Ruby, return the equivalent of Python's None.
Notably missing is anything like the ability for the caller to specify
what index to return if the sub-string is missing.
Ask yourself, can you imagine needing mydict.get(key, 1) or
mydict.get(key, set())? I expect that you can easily think of reasons
why this would be useful. The usefulness of being able to set the return
value of failed lookups on collections like dicts is obvious. I wish lists also had a
similar get method, and I bet that everybody reading this, even if they
disagree that it should be built-in, can see the value of it as a
utility function.
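A rough sketch of what such a utility might look like, next to dict.get for comparison. The name list_get and its signature are assumptions made for illustration; nothing like it exists as a list method.

```python
def list_get(seq, index, default=None):
    """Return seq[index], or default if index is out of range.

    Hypothetical helper, not part of the standard library.
    """
    try:
        return seq[index]
    except IndexError:
        return default


counts = {"a": 3}
mylist = [23, 42, 100]

# dict.get: the default is a value of the same kind as the stored values.
from_dict = counts.get("b", 1)

# The list analogue: again the default stands in for a stored value.
present = list_get(mylist, 1)
absent = list_get(mylist, 10, 0)

print(from_dict, present, absent)
```

In both cases the default stands in for a looked-up value, which is the situation Raymond was describing.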
But can you think of a realistic scenario where you might want to call
mystring.find(substr, missing=1)? Why would you want "substring not
present" and "substring found at index 1" to both return the same thing?
How about mystring.find(substr, missing=set())?
If you can't imagine a realistic scenario where you would want such a
feature, then you probably don't need this proposed feature.
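To make the ambiguity concrete, here is a sketch of the proposed behaviour written as a wrapper function. The function find_with_missing is hypothetical; str.find accepts no such argument.

```python
def find_with_missing(string, substr, missing=-1):
    """Like str.find, but return `missing` when substr is absent.

    Hypothetical wrapper to illustrate the proposal only.
    """
    index = string.find(substr)
    return missing if index == -1 else index


# "b" really is at index 1.
found = find_with_missing("abc", "b", missing=1)

# "z" is absent, but the caller asked for 1 in that case.
not_found = find_with_missing("abc", "z", missing=1)

# The two results cannot be told apart.
print(found, not_found, found == not_found)
```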
> The most common behavior on the page does seem to be returning a
> numerical sentinel. On the other hand, I'm not sure how many of these
> languages return a sentinel value which is also a valid index.
The first table on the page says that four languages accept negative
indexes: Python, Ruby, Perl and Lua. Perl and Python return -1 on not
found; Ruby and Lua return nil.
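A short sketch of why that matters in Python specifically: because -1 is itself a valid index, a failed find that is used without checking does not raise, it silently picks up the last character. The sample string is only illustrative.

```python
text = "spam"

# "x" is not in the string, so find returns the sentinel -1.
pos = text.find("x")

# -1 is also a valid index, so these succeed instead of failing.
char = text[pos]      # last character
prefix = text[:pos]   # everything except the last character

print(pos, repr(char), repr(prefix))
```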
--
Steven