[Python-Dev] Remove str.find in 3.0?

Guido van Rossum gvanrossum at gmail.com
Sun Aug 28 00:54:41 CEST 2005


On 8/27/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> With the existence of literally thousands of uses of .find and .rfind in
> the wild, any removal consideration should be weighed heavily - which
> honestly doesn't seem to be the case here with the ~15 minute reply time
> yesterday (just my observation and opinion).  If you had been ruminating
> over this previously, great, but that did not seem clear to me in your
> original reply to Terry Reedy.

I hadn't been ruminating about deleting it previously, but I was well
aware of the likelihood of writing buggy tests for find()'s return
value. I believe that str.find() is not just something that can be
used to write buggy code, but something that *causes* bugs over and
over again. (However, see below.)
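
As an illustration of the kind of buggy test meant here, a minimal sketch
follows. The function names are made up for this example and do not come
from the thread. The point is that find() returns 0 (falsy) for a match at
the start of the string and -1 (truthy) when there is no match, so using
the result directly as a condition gets both cases wrong.

```python
def contains_buggy(s, sub):
    # Hypothetical helper showing the common mistake: the raw find()
    # result is used as a truth value.
    if s.find(sub):
        return True
    return False


def contains_fixed(s, sub):
    # find() signals "not found" with -1, so compare against it explicitly.
    return s.find(sub) != -1


print(contains_buggy("spam", "sp"))   # match at index 0 is reported as absent
print(contains_buggy("spam", "x"))    # -1 is truthy, so a miss is reported as present
print(contains_fixed("spam", "sp"))
print(contains_fixed("spam", "x"))
```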

The argument that there are thousands of usages in the wild doesn't
carry much weight when we're talking about Python 3.0.

There are at least a similar number of modules that expect
dict.keys(), zip() and range() to return lists, or that depend on the
distinction between Unicode strings and 8-bit strings, or on bare
except:, or on any other feature that is slated for deletion in Python
3.0 for which the replacement requires careful rethinking of the code
rather than a mechanical translation.

The *premise* of Python 3.0 is that it drops backwards compatibility
in order to make the language better in the long term. Surely you
believe that the majority of all Python programs have yet to be
written?

The only argument in this thread in favor of find() that made sense to
me was Tim Peters' observation that the requirement to use a
try/except clause leads to another kind of sloppy code. It's hard to
judge which is worse -- the buggy find() calls or the buggy/cumbersome
try/except code.
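
One possible reading of the "sloppy try/except" concern, sketched under
assumptions: the functions and the "host:port" input format below are
invented for illustration and are not taken from the thread. When the try
block wraps more than the index() call, the except clause can silently
swallow an unrelated ValueError.

```python
def port_sloppy(line):
    # Hypothetical example: the try block covers too much, so a malformed
    # number is treated the same as a missing separator.
    try:
        pos = line.index(":")
        return int(line[pos + 1:])
    except ValueError:
        return None


def port_careful(line):
    # Only the index() call is guarded; a bad number still raises.
    try:
        pos = line.index(":")
    except ValueError:
        return None
    return int(line[pos + 1:])


print(port_sloppy("host:80"))
print(port_sloppy("host:abc"))   # the bad port is hidden behind None
print(port_careful("host"))
```

The careful version is correct but takes more lines than the find()
equivalent, which is the cumbersome side of the trade-off.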

Note that all code (unless it needs to be backwards compatible to
Python 2.2 and before) which is using find() to merely detect whether
a given substring is present should be using 's1 in s2' instead.
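
A minimal sketch of that replacement, with made-up variable names:

```python
s1 = "needle"
s2 = "haystack with a needle in it"

# Older style: compare the find() result against -1.
present_via_find = s2.find(s1) != -1

# Preferred style: the substring form of the 'in' operator.
present_via_in = s1 in s2

print(present_via_find, present_via_in)
```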

Another observation: despite the derogatory remarks about regular
expressions, they have one thing going for them: they provide a higher
level of abstraction for string parsing, which this is all about.
(They are higher level in that you don't have to be counting
characters, which is about the lowest-level activity in programming --
only counting bytes is lower!)

Maybe if we had a *good* way of specifying string parsing we wouldn't
be needing to call find() or index() so much at all! (A good example
is the code that Raymond lifted from ConfigParser: a semicolon
preceded by whitespace starts a comment, other semicolons don't.
Surely there ought to be a better way to write that.)
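
For comparison, here is how the semicolon rule described above might look
with a regular expression. This is a sketch only: it is not the
ConfigParser code, and the function name and sample strings are
assumptions made for illustration.

```python
import re

# A semicolon preceded by whitespace starts a comment; other semicolons don't.
_INLINE_COMMENT = re.compile(r"\s;")


def strip_inline_comment(value):
    # Hypothetical helper: drop everything from the first
    # whitespace-plus-semicolon onward, leaving other semicolons alone.
    match = _INLINE_COMMENT.search(value)
    if match is None:
        return value
    return value[:match.start()].rstrip()


print(strip_inline_comment("a;b ; trailing comment"))
print(strip_inline_comment("a;b"))
```

No character counting appears in the caller: the pattern states the rule
and the match object carries the position.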

All in all, I'm still happy to see find() go in Python 3.0, but I'm
leaving the door ajar: if you read this post carefully, you'll know
what arguments can be used to persuade me.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

