[Python-3000] Dropping find/rfind?

Wed Aug 23 17:52:35 CEST 2006

On Wed, 23 Aug 2006 08:18:03 -0700
"Guido van Rossum" <guido at python.org> wrote:

> That's too narrow a view on the language. Surely the built-in types
> (especially those with direct compiler support, like literal
> notations) are part of the language. The people who complain most
> frequently about Python getting too big aren't language designers,
> they are users (e.g. scientists) and to them it doesn't matter what
> technically is or isn't in the language -- it's the complete set of
> tools they have to deal with. That doesn't include all of the standard
> library, but it surely includes the built-in types and their behavior!
> Otherwise the int/long and str/unicode unifications wouldn't be
> language changes either...

Oleg has a point though.  Speaking generally, the perception of
"bigness" comes down to how much you can -- and /have/ to -- keep in
your head at one time while programming or reading code.  Python's
traditionally made excellent choices here.  The language is small
enough to keep in your head but the library is huge.  I don't know
about anybody else, but my aging brain can't keep much of the library
in its RAM so I'm highly dependent on help() and the library reference
manual to find things when I need them.

But I almost never have to look up a particular language feature, and
this was one of the primary reasons I switched from Perl to Python over
a decade ago.  To me, Python's growth over the last few releases is felt
more deeply in language features than in library improvements.
Features like list comprehensions, generators and generator
expressions, and decorators have all become ingrained, and while they
originally felt "big", they are now common tools I reach for and
intuitively understand.  Some of the 2.5 features such as 'with',
relative imports, and conditional expressions haven't reached that
level of comfort and make Python feel "big" to me again.

There are some counterexamples: built-in sets, while turning a library
feature into a built-in type, make Python feel a bit smaller because
sets are such a natural concept and code using them looks cleaner.  For
Python 3000, integrating ints and longs will definitely do this, as
will (I suspect) making all strings unicode with a (probably rarely
used) byte type.

So the question is where string methods like index and find fall.  To
me, they don't feel like language features. Built-in types fall
somewhere in between language features and library.  Their /presence/
is a language feature but what you can do with them seems more
library-ish to me.  For me, the reason is that I can easily keep in my
head that I have strings to represent text, ints, longs, etc. to
represent numbers, sets, dicts, lists, and tuples to represent
collections, etc.  But I may not remember exactly how to use str.find()
or dict.setdefault() because I use them more rarely (which doesn't
mean they're unimportant!). I know they're there and I vaguely remember
how to use them, so when I need them, it's off to the library reference
or help() for a quick refresher.

This suggests to me that a guiding principle ought to be reducing
language features without losing important functionality, just as the
int/long, str/unicode, all-newstyle classes work is doing.  Here you're
trying to polish the conceptual edges off the language, compound-W'ing
the language warts, and generally streamlining the language so it can
more easily fit in your head.  Where it comes to the library, I think we
ought to concentrate on reducing duplication.  TOOWTDI.  Get rid of the
User* modules.  If I need to do web-stuff, do I need urllib, urllib2,
urlparse, or what? etc.

As for the built-in types, let's reduce duplication here too, so if
there's a better way of e.g. doing what find, rfind, index, and rindex
do, then let's remove them and encourage the other uses.
dict.has_key() is a perfect example here.  'in' replaces many
of the use cases for str.find and friends, but not all.  Maybe
str.partition completes the picture, though I don't have enough
experience with it to know.
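
To make the comparison concrete, here is a minimal sketch of the
replacements mentioned above, written against the string and dict
behavior as it exists in current Python.  The sample data and variable
names are made up for illustration; this is not a claim about how any
particular code base should be converted.

```python
# Hypothetical sample data, chosen only for illustration.
text = "key=value"
settings = {"debug": True}

# Membership: 'in' covers the "is it there at all?" use of find(),
# and likewise replaces dict.has_key().
found_with_find = text.find("=") != -1
found_with_in = "=" in text
has_debug = "debug" in settings

# Splitting around a separator with find() and slicing...
pos = text.find("=")
if pos != -1:
    key_a, value_a = text[:pos], text[pos + 1:]
else:
    key_a, value_a = text, ""

# ...versus str.partition(), which always returns a 3-tuple
# (head, separator, tail) and leaves separator and tail empty
# when the separator is missing.
key_b, sep, value_b = text.partition("=")
missing = "novalue".partition("=")
```

What 'in' and partition() do not give you is the offset itself, which
is the remaining use case for find() and index().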

Anyway, enough blathering.  Those are my thoughts.  For this
specific case, maybe we really don't need any of ?find() and ?index(),
but if the choice comes down to one or the other, I still find catching
the exception less convenient than checking a return value.
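
As a rough illustration of that last point, the two styles look like
this side by side.  The helper names locate_with_find and
locate_with_index are hypothetical, invented only for this sketch.

```python
def locate_with_find(haystack, needle):
    """Return the offset of needle, or None, by checking find()'s result."""
    pos = haystack.find(needle)
    if pos == -1:  # find() signals "not found" with -1
        return None
    return pos


def locate_with_index(haystack, needle):
    """Return the offset of needle, or None, by catching index()'s error."""
    try:
        return haystack.index(needle)
    except ValueError:  # index() signals "not found" by raising
        return None


offset_find = locate_with_find("python-3000", "3000")
offset_index = locate_with_index("python-3000", "3000")
absent_find = locate_with_find("python-3000", "perl")
absent_index = locate_with_index("python-3000", "perl")
```

The flip side of the return-value style is that a forgotten -1 check
silently yields a valid (negative) index, whereas a forgotten
try/except fails loudly.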

-Barry