On 24 Mar 2020, at 2:42, Steven D'Aprano wrote:

On Sun, Mar 22, 2020 at 10:25:28PM -0000, Dennis Sweeney wrote:

Changes:
- More complete Python implementation to match what the type checking in the C implementation would be
- Clarified that returning ``self`` is an optimization
- Added links to past discussions on Python-Ideas and Python-Dev
- Specified ability to accept a tuple of strings

I am concerned about that tuple of strings feature.
[...]
Aside from those questions about the reference implementation, I am
concerned about the feature itself. No other string method that returns
a modified copy of the string takes a tuple of alternatives.

* startswith and endswith do take a tuple of (pre/suff)ixes, but they
don't return a modified copy; they just return a True or False flag;

* replace does return a modified copy, and only takes a single
substring at a time;

* find/index/partition/split etc don't accept multiple substrings
to search for.

That makes startswith/endswith the unusual ones, and we should be
conservative before emulating them.

Actually I would like for other string methods to gain the ability to search for/chop off multiple substrings too.

A find() that supports multiple search strings (and returns the leftmost position where a search string can be found) is a great help in implementing some kind of tokenizer:

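def tokenize(source, delimiter):
    # Hypothetical: assumes str.find() accepts a tuple of substrings and
    # returns the leftmost position at which any of them occurs.
    lastpos = 0
    while True:
        pos = source.find(delimiter, lastpos)
        if pos == -1:
            # No delimiter left: emit the remaining text (if any) and stop.
            token = source[lastpos:].strip()
            if token:
                yield token
            break
        else:
            # Emit the text before the delimiter, then the delimiter itself.
            token = source[lastpos:pos].strip()
            if token:
                yield token
            # Only correct for single-character delimiters.
            yield source[pos]
        lastpos = pos + 1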

print(list(tokenize(" [ 1, 2, 3] ", ("[", ",", "]"))))

This would output ['[', '1', ',', '2', ',', '3', ']'] if str.find() supported multiple substrings.

Of course, to be really usable, find() would have to return which substring was found, which would make the API more complicated (and somewhat incompatible with the existing find()).
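
To illustrate what such an API might look like, here is a minimal sketch in pure Python. The helper name find_any() and its (position, substring) return value are assumptions made up for this example, not an existing or proposed str method. It assumes that all search strings are non-empty, and when two of them match at the same position it picks the one listed first.

```python
def find_any(source, substrings, start=0):
    """Return (position, substring) for the leftmost match, or (-1, None).

    Hypothetical helper: emulates a find() that accepts several substrings.
    """
    best_pos, best_sub = -1, None
    for sub in substrings:
        pos = source.find(sub, start)
        # Keep the leftmost hit; on a tie the earlier entry in the tuple wins.
        if pos != -1 and (best_pos == -1 or pos < best_pos):
            best_pos, best_sub = pos, sub
    return best_pos, best_sub


def tokenize(source, delimiters):
    """Yield stripped tokens and the delimiters between them."""
    lastpos = 0
    while True:
        pos, delimiter = find_any(source, delimiters, lastpos)
        end = len(source) if pos == -1 else pos
        token = source[lastpos:end].strip()
        if token:
            yield token
        if pos == -1:
            break
        yield delimiter
        # Skip the whole delimiter, so multi-character delimiters work too.
        lastpos = pos + len(delimiter)


print(list(tokenize(" [ 1, 2, 3] ", ("[", ",", "]"))))
# ['[', '1', ',', '2', ',', '3', ']']
print(list(tokenize("a := b", (":=", "="))))
# ['a', ':=', 'b']
```

Because the helper reports which substring matched, the tokenizer no longer has to assume single-character delimiters.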

But for cutprefix() (or whatever it's going to be called), I'm +1 on supporting multiple prefixes. For ambiguous cases, IMHO the most straightforward option would be to chop off the first prefix found.
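
A rough sketch of that rule, assuming "first prefix found" means the first entry in the tuple that matches (one possible reading, not the reference implementation). The function below is a standalone stand-in for the proposed method:

```python
def cutprefix(s, prefixes):
    """Remove the first matching prefix from s; return s unchanged if none match.

    Illustrative stand-in for the proposed str method, not its actual code.
    """
    if isinstance(prefixes, str):
        prefixes = (prefixes,)
    for prefix in prefixes:
        if s.startswith(prefix):
            return s[len(prefix):]
    return s


print(cutprefix("foobar", ("foo", "foobar")))   # bar
print(cutprefix("foobar", ("foobar", "foo")))   # prints an empty line
print(cutprefix("foobar", ("baz", "qux")))      # foobar
```

With this rule the result depends on the order of the tuple, so a caller who wants the longest prefix removed has to list the prefixes longest first.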

[...]

Servus,
Walter