[Python-Dev] Remove str.find in 3.0?

Sat Aug 27 06:48:27 CEST 2005

"Terry Reedy" <tjreedy at udel.edu> wrote:
> 
> "Josiah Carlson" <jcarlson at uci.edu> wrote in message 
> news:20050826134317.7DFD.JCARLSON at uci.edu...
> >
> > "Terry Reedy" <tjreedy at udel.edu> wrote:
> >>
> >> Can str.find be listed in PEP 3000 (under builtins) for removal?
> 
> Guido has already approved,

I noticed, but he approved before anyone could say anything.  I
understand it is a dictatorship, but he seems to take advisment and
reverse (or not) his decisions on occasion based on additional
information. Whether this will lead to such, I don't know.

> but I will try to explain my reasoning a bit 
> better for you.  There are basically two ways for a system, such as a 
> Python function, to indicate 'I cannot give a normal response."  One (1a) 
> is to give an inband signal that is like a normal response except that it 
> is not (str.find returing -1).  A variation (1b) is to give an inband 
> response that is more obviously not a real response (many None returns). 
> The other (2) is to not respond (never return normally) but to give an 
> out-of-band signal of some sort (str.index raising ValueError).
> 
> Python as distributed usually chooses 1b or 2.  I believe str.find and 
> .rfind are unique in the choice of 1a.  I am pretty sure that the choice 
> of -1 as error return, instead of, for instance, None, goes back the the 
> need in static languages such as C to return something of the declared 
> return type.  But Python is not C, etcetera.  I believe that this pair is 
> also unique in having exact counterparts of type 2.  (But maybe I forgot 
> something.)

Taking a look at the commits that Guido did way back in 1993, he doesn't
mention why he added .find, only that he did.  Maybe it was another of
the 'functional language additions' that he now regrets, I don't know.

> >> Would anyone really object?
> 
> > I would object to the removal of str.find().
> 
> So, I wonder, what is your favored alternative?
> 
> A. Status quo: ignore the opportunity to streamline the language.

str.find is not a language construct.  It is a method on a built-in type
that many people use.  This is my vote.

> B. Change the return type of .find to None.

Again, this would break potentially thousands of lines of user code that
is in the wild.  Are we talking about changes for 2.5 here, or 3.0?

> C. Remove .(r)index instead.

see below *

> D. Add more redundancy for those who do not like exceptions.

In 99% of the cases, such implementations would be minimal.  While I
understand that "There should be one-- and preferably only one --obvious
way to do it.", please see below *.

> > Further, forcing users to use try/except when they are looking for the
> > offset of a substring seems at least a little strange (if not a lot
> > braindead, no offense to those who prefer their code to spew exceptions
> > at every turn).
> 
> So are you advocating D above or claiming that substring indexing is 
> uniquely deserving of having two versions?  If the latter, why so special? 
> If we only has str.index, would you actually suggest adding this particular 
> duplication?

Apparently everyone has forgotten the dozens of threads on similar
topics over the years.  I'll attempt to summarize.

Adding functionality that isn't used is harmful, but not nearly as
harmful as removing functionality that people use.

If you take just two seconds and do a search on '.find(' vs '.index(' in
the standard library, you will notice that '.find(' is used more often
than '.index(' regardless of type (I don't have the time this evening to
pick out which ones are string only, but I doubt the standard library
uses mmap.find, DocTestFinder.find, or gettext.find).  This example
seems to show that people find str.find to be more intuitive and/or
useful than str.index, even though you spent two large paragraphs
explaining that Python 'doesn't do it that way very often so it isn't
Pythonic'. Apparently the majority of people who have been working on
the standard library for the last decade disagree.

> > Considering the apparent dislike/hatred for str.find.
> 
> I don't hate str.find.  I simply (a) recognize that a function designed for 
> static typing constraints is out of place in Python, which does not have 
> those constraints and (b) believe that there is no reason other than 
> history for the duplication and (c) believe that dropping .find is 
> definitely better than dropping .index and changing .find.

* I don't see why it is necessary to drop or change either one.  We've
got list() and [] for construcing a list.  Heck, we've even got
list(iterable) and [i for i in iterable] for making a list copy of any
arbitrary iterable.  This goes against TSBOOWTDI, so why don't we toss
list comprehensions now that we have list(generator expression)?  Or did
I miss something and this was already going to happen?

> > Would you further request that .rfind be removed from strings?
> 
> Of course.  Thanks for reminding me.

No problem, but again, do a search in the standard library...  I found 4
examples if str.rindex, but over 40 of str.rfind.  Koders.com offers
1153 and 100 for .rfind and .rindex respectively (probably not all
string methods, but I'm too lazy to check every one). A common factor of
over 10. If koders.com had a decent way to search for the full name of a
method call, we could do a find vs. index as well, though I expect we
would see closer to 4:3 with .find winning (those are the approximate
numbers I get when checking the standard library).

The reason I'm making a stink is because you are proposing (and Guido
has agreed) to get rid of methods V,W which are used more often than
methods X,Y in order to 'streamline the language' for 3.0.  The removal
of two methods and their implementations will not go terribly far
towards streamlining the language, especially when all four methods
(find, rfind, index, rindex) call the same C function to do the actual
search.

 - Josiah