[Python-ideas] Deprecate str.find

Mike Graham mikegraham at gmail.com
Sat Jul 16 05:43:27 CEST 2011


On Fri, Jul 15, 2011 at 6:50 PM, Cameron Simpson <cs at zip.com.au> wrote:
> On 15Jul2011 12:12, Mike Graham <mikegraham at gmail.com> wrote:
> | It isn't necessarily an error if the substring is not in the string
> | (though it sometimes is), but it is an exceptional case.
>
> No it isn't, IMO. It's simply the _other_ case.

There are many cases that mean exactly the same thing (0 means the
substring was found starting at 0, 1 means the substring was found
starting at 1, 2 means the substring was found starting at 2,...), so
the other case (substring not contained) we can label special or
exceptional. We can recognize such cases by their requiring separate
logic.
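A minimal sketch of what I mean by "separate logic", with made-up
sample strings: every non-negative result can be handled by the same
code, while -1 needs a branch of its own.

```python
s = "hello world"

# Every non-negative result means the same kind of thing:
# "found, starting at this position".
found_positions = []
for sub in ("hello", "lo", "world"):
    i = s.find(sub)
    assert s[i:i + len(sub)] == sub  # the same logic works for all of them
    found_positions.append(i)

# The remaining result, -1, means something different ("not found")
# and cannot be fed through the slice logic above.
missing = s.find("xyz")
```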

> | Python uses
> | exceptions pretty liberally most places -- it isn't necessarily an
> | error if an iterator is exhausted or if float("4.2 bad user input") is
> | called or if BdbQuit was raised. In these cases, an exception can be
> | perfectly expected to indicate that what happened is different from
> | the information used in a return value.
>
> In all the cases you cite the exception indicates failure of the
> operation: .next() has nothing to "next" to, float is being handed
> garbage etc.
>
> str.find does not have a failure mode, it has string found and string
> not found.

I think you are way off here. "Error" is in the eye of the beholder. I
could say that an iterator has two cases: one where it gives me a
value, and the other, that it's exhausted. Nothing exceptional there.
(Indeed, StopIteration was deliberately named without "Error" precisely
because exhaustion isn't regarded as an error.)

Similarly, when I pass user input to float, there are just two normal
cases, no failure mode: user entered a number, user didn't enter a
number.

The distinction between an error and another type of special case is
subtle at best.
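Both cases can be sketched the same way. The helper names below
(first_or_default, parse_float) are hypothetical, made up only for
illustration; the point is that the "other case" arrives as an
exception whether or not we choose to call it an error.

```python
def first_or_default(iterable, default=None):
    """Return the first item, or `default` if the iterator is exhausted."""
    it = iter(iterable)
    try:
        return next(it)
    except StopIteration:
        # Exhaustion: the "other case", not a failure.
        return default


def parse_float(text, default=None):
    """Return float(text), or `default` if text is not a number."""
    try:
        return float(text)
    except ValueError:
        # The user didn't enter a number: again just the "other case".
        return default
```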

> | Making a Python user write a try/except block when she wants to handle
> | both the cases "substring is in s" and "substring isn't in s" seems
> | perfectly fine to me and, really, preferable to the if statement
> | required to handle these two cases.
>
> You don't find try/except wordy and opaque? I find "if" more idiomatic most of the time.

Not really. If we don't like exception handling, we're using the wrong language.

> Not to mention vague: it can often be quite hard to be sure the raised
> exception came from just the operation you imagine it came from. With
> str.find there's little scope for vagueness I agree (unless you aren't
> really using a str, but a duck-type). But plenty of:
>
>  try:
>    x = foofunc(y)
>  except IndexError as e:
>    ...
>
> is subject to uncaught IndexError arbitrarily deep in foofunc's call stack.

This is both the strength and the flaw of exceptions, period. It is a
much broader question than the one we are facing here. If you dislike
exceptions in general, or Python's use of relatively few exception
types for many occasions, I really don't think we can start the
discussion at the level of str.find.
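For what it's worth, one common way to narrow (not eliminate) the
problem is to keep the try body down to the single call we expect to
fail and move everything else into an else clause. A rough sketch,
where lookup and describe are hypothetical stand-ins for foofunc and
its caller:

```python
def lookup(items, position):
    # Hypothetical stand-in for the one operation we expect might fail.
    return items[position]


def describe(items, position):
    try:
        # Only this call is guarded by the except clause.
        value = lookup(items, position)
    except IndexError:
        return "missing"
    else:
        # Runs only when no exception was raised; an IndexError raised
        # here would propagate instead of being caught by mistake.
        return "found %r" % (value,)
```

An IndexError raised deeper inside lookup itself would still be caught,
so this only limits how much code the except clause covers.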

If I did manage to let an IndexError propagate out of my
SomeDuckType.index method when it shouldn't carry the meaning I
ascribe to it, then that is a bug in my implementation of
SomeDuckType. The bug would be very unfortunate, because when users
use my code correctly (by catching the IndexError) they will
completely squash the offending exception and the source of the bug
will be unclear. Unfortunately, str.find is highly prone to equally
hard-to-trace bugs, as I've discussed, since -1 is a valid index for
the string.
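A small sketch of the kind of bug I have in mind. The function names
and filenames are made up for illustration; the first function
deliberately forgets the -1 check.

```python
def extension_with_find(filename):
    # Buggy on purpose: no check for the "not found" result.
    i = filename.find(".")
    # If "." is missing, i is -1, so i + 1 is 0 and the slice silently
    # returns the whole filename instead of failing.
    return filename[i + 1:]


def extension_with_index(filename):
    # str.index raises ValueError when "." is missing, so the forgotten
    # special case surfaces immediately at the point of the mistake.
    i = filename.index(".")
    return filename[i + 1:]
```

With "README" as input, the find version returns "README" as though it
were an extension, while the index version raises ValueError.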

> | The base two cases really are about the same:
> [... try ... except ...]
> | vs.
> |
> | i = s.find(sub)
> | if i == -1:
> |     do_something()
> |
> | But what if I forgot to handle the special case? [...]
> | In this second case, I get the value of -1. Later I can use it as an
> | index, use it in a slice, or perform arithmetic on it. This can
> | introduce seemingly-unrelated values later on, making this especially
> | hard to track down.
>
> I agree it may be a pity that str.find doesn't return None on string not
> found, which would generally raise an exception on an attempt to use it
> as a number.
>
> Cheers,
> --
> Cameron Simpson

Mike