On Sun, Mar 22, 2020 at 10:25:28PM -0000, Dennis Sweeney wrote:
> Changes:
>     - More complete Python implementation to match what the type
>       checking in the C implementation would be
>     - Clarified that returning ``self`` is an optimization
>     - Added links to past discussions on Python-Ideas and Python-Dev
>     - Specified ability to accept a tuple of strings

I am concerned about that tuple of strings feature.

First, an implementation question: you do this when the prefix is a
tuple:

    if isinstance(prefix, tuple):
        for option in tuple(prefix):
            if not isinstance(option, str):
                raise TypeError()
            option_str = str(option)

which looks like two unnecessary copies:

1. Having confirmed that `prefix` is a tuple, you call tuple() to
   make a copy of it in order to iterate over it. Why?

2. Having confirmed that option is a string, you call str() on
   it to (potentially) make a copy. Why?

Aside from those questions about the reference implementation, I am
concerned about the feature itself. No other string method that returns
a modified copy of the string takes a tuple of alternatives.

* startswith and endswith do take a tuple of (pre/suff)ixes, but they
  don't return a modified copy; they just return a True or False flag;

* replace does return a modified copy, and only takes a single
  substring at a time;

* find/index/partition/split etc don't accept multiple substrings
  to search for.

That makes startswith/endswith the unusual ones, and we should be
conservative before emulating them.

The difficulty here is that the notion of "cut one of these prefixes" is
ambiguous if two or more of the prefixes match. It doesn't matter for
startswith:

    "extraordinary".startswith(('ex', 'extra'))

since it is True whether you match left-to-right, shortest-to-largest,
or even in random order. But for cutprefix, which prefix should be
deleted?

Of course we can make a ruling by fiat, right now, and declare that it
will cut the first matching prefix reading left to right, whether that's
what users expect or not. That seems reasonable when your prefixes are
hard-coded in the source, as above.

But what happens here?

    prefixes = get_prefixes('user.config')
    result = mystring.cutprefix(prefixes)

Whatever decision we make -- delete the shortest match, longest match,
first match, last match -- we're going to surprise and annoy the people
who expected one of the other behaviours.

This is why replace() still only takes a single substring to match and
this isn't supported:

    "extraordinary".replace(('ex', 'extra'), '')

We ought to get some real-life exposure to the simple case first, before
adding support for multiple prefixes/suffixes.

--
Steven
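
For illustration only, the ambiguity described above can be made concrete with a minimal sketch. The helper `cut_prefix` and its `policy` argument are hypothetical names invented for this example; they are not the proposed method or its reference implementation. The sketch assumes four candidate rules (first, last, shortest, longest match) and shows that they disagree on the same input.

```python
def cut_prefix(s, prefixes, policy="first"):
    """Hypothetical helper: remove one matching prefix from s.

    `policy` selects which match to cut when several prefixes match.
    This is an illustration, not the proposed str method.
    """
    # Collect every candidate that actually matches, in the given order.
    matches = [p for p in prefixes if s.startswith(p)]
    if not matches:
        return s
    if policy == "first":
        chosen = matches[0]
    elif policy == "last":
        chosen = matches[-1]
    elif policy == "shortest":
        chosen = min(matches, key=len)
    elif policy == "longest":
        chosen = max(matches, key=len)
    else:
        raise ValueError(f"unknown policy: {policy!r}")
    return s[len(chosen):]


word = "extraordinary"

# startswith gives the same answer regardless of the order of the tuple.
same_answer = word.startswith(("ex", "extra")) == word.startswith(("extra", "ex"))

# Cutting does not: the result depends on both the rule and the order.
first_a = cut_prefix(word, ("ex", "extra"), "first")    # cuts 'ex'
first_b = cut_prefix(word, ("extra", "ex"), "first")    # cuts 'extra'
longest = cut_prefix(word, ("ex", "extra"), "longest")  # cuts 'extra'
```

With hard-coded prefixes the author can pick the order deliberately; when the tuple comes from elsewhere, as in the `get_prefixes('user.config')` case, the caller may not know which of these outcomes to expect.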