On 3/21/20 12:51 PM, Rob Cliffe via Python-Dev wrote:
On 21/03/2020 16:15, Eric V. Smith wrote:
On 3/21/2020 11:20 AM, Ned Batchelder wrote:
On 3/20/20 9:34 PM, Cameron Simpson wrote:
On 20Mar2020 13:57, Eric Fahlgren <ericfahlgren@gmail.com> wrote:
On Fri, Mar 20, 2020 at 11:56 AM Dennis Sweeney <sweeney.dennis650@gmail.com> wrote:
If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then ``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has been removed. If ``s`` does not have ``pre`` as a prefix, an unchanged copy of ``s`` is returned. In summary, ``s.cutprefix(pre)`` is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.
The second sentence above unambiguously states that cutprefix returns 'an unchanged *copy*', but the example contradicts that and shows that 'self' may be returned and not a copy. I think it should be reworded to explicitly allow the optimization of returning self.
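To make the mismatch concrete, here is a minimal sketch of the PEP's "roughly equivalent" expression written as a plain function. The stand-alone function form is an assumption for illustration only; the PEP proposes a method. Note that the ``else`` branch hands back ``s`` itself, not a copy.

```python
def cutprefix(s, pre):
    # Illustrative stand-in for the proposed method, copied from the
    # PEP's rough equivalent. The else branch returns the very same
    # object that was passed in.
    return s[len(pre):] if s.startswith(pre) else s


s = "foobar"
print(cutprefix(s, "foo"))       # prefix present: a new, shorter string
print(cutprefix(s, "bar") is s)  # prefix absent: identity of result vs. input
```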
My versions of these (plain old functions) return self if unchanged, and are explicitly documented as doing so.
This has the concrete advantage that one can test for nonremoval of the suffix with "is", which is very fast, instead of "==", which may not be.
So one writes (assuming methods):
    prefix = cutsuffix(s, 'abc')
    if prefix is s:
        ... no change
    else:
        ... definitely changed, s != prefix also
I am explicitly in favour of returning self if unchanged.
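A runnable sketch of that pattern, assuming a hypothetical plain function ``cutsuffix`` that is documented to return its argument unchanged when nothing is removed (the names and sample strings are made up for illustration):

```python
def cutsuffix(s, suffix):
    # Hypothetical helper: returns s itself when there is nothing to
    # remove. The "suffix and" guard avoids s[:-0], which would be "".
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


s = "report.txt"
stem = cutsuffix(s, ".txt")
if stem is s:
    changed = False   # same object back: no removal took place
else:
    changed = True    # a different object: the suffix was removed

print(stem, changed)
```

The identity test is only meaningful because this particular helper promises to return ``s`` itself; without that documented promise the check would rest on an implementation detail.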
Why be so prescriptive? The semantics of these functions should be about what the resulting string contains. Leave it to implementors to decide when it is OK to return self or not.
The only reason I can think of is to enable the test above: did a suffix/prefix removal take place? That seems like a useful thing. I think if we don't specify the behavior one way or the other, people are going to rely on CPython's behavior here, consciously or not.
Is there some python implementation that would have a problem with the "is" test, if we were being this prescriptive? Honest question.
Of course this would open the question of what to do if the suffix is the empty string. But since "'foo'.startswith('')" is True, maybe we'd have to return a copy in that case. It would be odd to have "s.startswith('')" be True, but "s.cutprefix('') is s" also be True. Or, since there's already talk in the PEP about what happens if the prefix/suffix is the empty string, if we adopt the "is" behavior we could add more details there. Like "if the result is the same object as self, it means either the prefix is the empty string, or self didn't start with the prefix".
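The empty-prefix corner case can be sketched as follows. ``cutprefix`` here is again a stand-in function built from the PEP's rough equivalent, and ``prefix_was_removed`` is a hypothetical helper, not something the PEP proposes. The value of the result is well defined; whether it is the same object is left to whatever the slice ``s[0:]`` happens to do on a given implementation.

```python
def cutprefix(s, pre):
    # Stand-in for the proposed method (PEP's rough equivalent).
    return s[len(pre):] if s.startswith(pre) else s


def prefix_was_removed(s, pre):
    # Hypothetical helper: answers "did anything get cut?" without
    # looking at object identity at all.
    return bool(pre) and s.startswith(pre)


s = "foo"
result = cutprefix(s, "")     # takes the s[0:] branch, since s.startswith("")

same_value = result == s      # always holds: nothing was cut
same_object = result is s     # depends on how the implementation slices

print(s.startswith(""), same_value, same_object)
print(prefix_was_removed(s, ""), prefix_was_removed("foobar", "foo"))
```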
Eric
*If* no python implementation would have a problem with the "is" test (and from a position of total ignorance I would guess that this is the case :-)), then it would be a useful feature and it is easier to define it now than try to force conformance later. I have no problem with 's.startswith("") == True and s.cutprefix("") is s'. YMMV.
Why take on that "*If*" conditional? We're constantly telling people not to compare strings with "is". So why define how "is" will behave in this PEP? It's the implementation's decision whether to return a new immutable object with the same value, or the same object. As Steven points out elsewhere in this thread, the behavior of Python's builtins differs, across methods and versions, in this regard. I certainly didn't know that, and it was probably news to you as well. So why do we need to nail it down for suffixes and prefixes? There will be no conformance to force later, because if the value doesn't change, then it doesn't matter whether it's a new string or the same string. --Ned.
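For anyone who wants to see what their own interpreter does, a small probe along these lines prints whether a few no-op string operations hand back the same object. The choice of operations is arbitrary, and the identity results are deliberately not listed here, since they may vary by implementation and version; only the equality check at the end is guaranteed.

```python
s = "hello world"

# Each of these leaves the value unchanged for this particular s.
no_ops = {
    "s.strip()": s.strip(),
    "s.replace('x', 'y')": s.replace("x", "y"),
    "s[:]": s[:],
    "str(s)": str(s),
}

for expr, value in no_ops.items():
    # Identity is an implementation detail; equality is not.
    print(f"{expr:22} is s -> {value is s}")

all_equal = all(value == s for value in no_ops.values())
print("all equal to s:", all_equal)
```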