feature request: a better str.endswith
vze4rx4y at verizon.net
Sun Jul 20 01:23:25 CEST 2003
> > >>I often feel the need to extend the string method ".endswith" to tuple
> > >>arguments, in such a way to automatically check for multiple endings.
> > >>For instance, here is a typical use case:
> > >>
> > >>if filename.endswith(('.jpg','.jpeg','.gif','.png')):
> > >> print "This is a valid image file"
> > > extensions = ('.jpg', '.jpeg', '.gif', '.png')
> > > if filter(filename.endswith, extensions):
> > >     print "This is a valid image file"
> > >
> > > Jp
> > Using filter, Michele's original statement becomes:
> > if filter(filename.endswith, ('.jpg','.jpeg','.gif','.png')):
> >     print "This is a valid image file"
> > IMHO this is simple enough to not require a change to the
> > .endswith method...
> I hadn't thought of "filter". It is true that it works, but is it really
> readable? I had to think to understand what it is doing.
> My (implicit) rationale for
> filename.endswith(('.jpg','.jpeg','.gif','.png'))
> was that it works exactly like "isinstance", so it is quite
> obvious what it is doing. I am asking just for a convenience,
> which already has a precedent in the language and respects
> the Principle of Least Surprise.
I prefer that this feature not be added. Convenience functions
like this one rarely pay for themselves because:
-- The use case is not that common (after all, endswith() isn't even
used that often).
-- It complicates the heck out of the C code
-- Checking for optional arguments results in a slight slowdown
for the normal case.
-- It is easy to implement a readable version in only two or three
lines of pure Python.
-- It is harder to read because it requires background knowledge
of how endswith() handles a tuple (quick: does it take any
iterable or just a tuple; how about a subclass of tuple; is it
like min() and max() in that *args works just as well as an
argument tuple; which Python version implemented it; etc.).
-- It is a pain to keep the language consistent. Change endswith()
and you should change startswith(). Change the string object and
you should also change the unicode object and UserString and
perhaps mmap. Update the docs for each and add test cases for
each (including weird cases with zero-length tuples and such).
-- The use case above encroaches on scanning patterns that are
already efficiently implemented by the re module.
-- Worst of all, it increases the sum total of the Python language to be
learned without providing much in return.
-- In general, the language can be kept more compact, efficient, and
maintainable by not trying to vectorize everything (the recent addition
of __builtin__.sum() is a rare exception that is worth it). It is
better to use a general purpose vectorizing function (like map, filter,
or reduce). This particular case is best implemented in terms of the
some() predicate documented in the examples for the new itertools module
(though any() might have been a better name for it):
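A minimal sketch of that approach, written for present-day Python 3 rather than the 2.3 of this thread. The body of some() here is an assumption (a plain loop, not necessarily the exact code of the itertools recipe), and IMAGE_EXTENSIONS and is_image are illustrative names:

```python
def some(pred, seq):
    """Return True if pred(x) is true for at least one x in seq."""
    for item in seq:
        if pred(item):
            return True  # stop at the first hit
    return False


# Illustrative names, not taken from the thread.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.gif', '.png')


def is_image(filename):
    """Check a filename against each known image extension in turn."""
    return some(filename.endswith, IMAGE_EXTENSIONS)


if is_image('holiday.png'):
    print("This is a valid image file")
```

The bound method filename.endswith serves as the predicate, so no change to endswith() itself is needed.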
The implementation of some() is better than the filter version because
it provides an "early-out" upon the first successful hit.
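To make the early-out concrete, here is a small sketch, assuming Python 3 (where filter and map are lazy, so list() is used to mimic the eager filter of this thread's era). It uses the built-in any() that later Python versions provide in place of some(); the counting() wrapper is a hypothetical helper that only records how often the predicate is called:

```python
def counting(pred):
    """Wrap pred and record every argument it is called with (illustrative helper)."""
    calls = []

    def wrapper(item):
        calls.append(item)
        return pred(item)

    return wrapper, calls


extensions = ('.jpg', '.jpeg', '.gif', '.png')
filename = 'holiday.jpg'

# filter version: list() forces every extension to be tested.
pred, filter_calls = counting(filename.endswith)
filter_hit = bool(list(filter(pred, extensions)))

# any() version: map() is lazy in Python 3, so testing stops at the first hit.
pred, any_calls = counting(filename.endswith)
any_hit = any(map(pred, extensions))

print(filter_hit, len(filter_calls))  # True 4
print(any_hit, len(any_calls))        # True 1
```

With only four extensions the saving is negligible; the difference matters more when the candidate list is long or the predicate is expensive.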