[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

24 Mar 2020

      On 24 Mar 2020, at 2:42, Steven D'Aprano wrote:
...
On Sun, Mar 22, 2020 at 10:25:28PM -0000, Dennis Sweeney wrote:
...
Changes:
    - More complete Python implementation to match what the type
      checking in the C implementation would be
    - Clarified that returning ``self`` is an optimization
    - Added links to past discussions on Python-Ideas and Python-Dev
    - Specified ability to accept a tuple of strings
I am concerned about that tuple of strings feature.
[...]
Aside from those questions about the reference implementation, I am
concerned about the feature itself. No other string method that returns
a modified copy of the string takes a tuple of alternatives.
* startswith and endswith do take a tuple of (pre/suff)ixes, but they
  don't return a modified copy; they just return a True or False flag;
* replace does return a modified copy, and only takes a single
  substring at a time;
* find/index/partition/split etc don't accept multiple substrings
  to search for.
That makes startswith/endswith the unusual ones, and we should be
conservative before emulating them.
Actually I would like for other string methods to gain the ability to 
search for/chop off multiple substrings too.

A `find()` that supports multiple search strings (and returns the 
leftmost position where a search string can be found) is a great help in 
implementing some kind of tokenizer:

```python
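def tokenize(source, delimiter):
    # Hypothetical: assumes str.find() accepts a tuple of substrings and
    # returns the leftmost position where any of them occurs.
    lastpos = 0
    while True:
        pos = source.find(delimiter, lastpos)
        if pos == -1:
            # No delimiter left: emit the remaining text, if any.
            token = source[lastpos:].strip()
            if token:
                yield token
            break
        else:
            # Emit the text before the delimiter, then the delimiter itself.
            token = source[lastpos:pos].strip()
            if token:
                yield token
            yield source[pos]
        # Assumes single-character delimiters.
        lastpos = pos + 1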

print(list(tokenize(" [ 1, 2, 3] ", ("[", ",", "]"))))
```

This would output `['[', '1', ',', '2', ',', '3', ']']` if `str.find()` 
supported multiple substrings.

Of course to be really usable `find()` would have to return **which** 
substring was found, which would make the API more complicated (and 
somewhat incompatible with the existing `find()`).
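
As a rough sketch of what such an API could look like, here is a helper that can be written today on top of the existing single-substring `find()`. The name `find_any` and its `(position, substring)` return value are my own assumptions for illustration, not anything proposed in the PEP. When two substrings match at the same position, the one listed first wins.

```python
def find_any(source, substrings, start=0):
    """Hypothetical helper: return (position, substring) for the leftmost
    match of any substring at or after start, or (-1, None) if none match."""
    best_pos, best_sub = -1, None
    for sub in substrings:
        if not sub:
            # Skip empty strings; they would match everywhere.
            continue
        pos = source.find(sub, start)
        # Strict comparison: on a tie, the earlier entry in the tuple wins.
        if pos != -1 and (best_pos == -1 or pos < best_pos):
            best_pos, best_sub = pos, sub
    return best_pos, best_sub


def tokenize(source, delimiters):
    lastpos = 0
    while True:
        pos, found = find_any(source, delimiters, lastpos)
        if pos == -1:
            # No delimiter left: emit the remaining text, if any.
            token = source[lastpos:].strip()
            if token:
                yield token
            break
        token = source[lastpos:pos].strip()
        if token:
            yield token
        yield found
        # Knowing which delimiter matched lets us skip its full length.
        lastpos = pos + len(found)


print(list(tokenize(" [ 1, 2, 3] ", ("[", ",", "]"))))
```

Because the helper reports which substring was found, the tokenizer is no longer limited to single-character delimiters.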

But for `cutprefix()` (or whatever it's going to be called), I'm +1 on 
supporting multiple prefixes. For ambiguous cases, IMHO the most 
straightforward option would be to chop off the first prefix found.
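
A minimal sketch of that rule, written as a plain function rather than a string method. The function name and the exact handling of the tuple are assumptions for illustration only; "first prefix found" is read here as the first entry in the tuple that matches.

```python
def cutprefix(s, prefixes):
    """Hypothetical sketch: remove the first matching prefix from s.

    prefixes may be a single string or a tuple of strings.
    """
    if isinstance(prefixes, str):
        prefixes = (prefixes,)
    for prefix in prefixes:
        if s.startswith(prefix):
            # The first match in tuple order wins, even if a later
            # entry would match a longer prefix.
            return s[len(prefix):]
    # No prefix matched: return the string unchanged.
    return s


print(cutprefix("foobar", ("foo", "foobar")))  # bar
print(cutprefix("foobar", ("foobar", "foo")))  # (empty string)
```

With this rule the caller controls the outcome of ambiguous cases through the order of the tuple.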
...
[...]
Servus,
    Walter

Walter Dörwald