In that case, the PEP should advise using .startswith() or .endswith() explicitly if the caller needs to know whether the string is going to be modified. Example:
    modified = False
    # O(n) complexity where n=len("prefix: ")
    if line.startswith("prefix: "):
        line = line.cutprefix("prefix: ")
        modified = True
It should be more efficient than:
    old_line = line
    line = line.cutprefix("prefix: ")
    modified = (line != old_line)  # O(n) complexity where n=len(line)
since the checked prefix is usually way shorter than the whole string.
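As a rough sketch of the first pattern, the check and the cut can be wrapped in one helper. The name `cut_prefix_checked` is hypothetical and used only for illustration, and plain slicing stands in for the proposed `cutprefix()` method, which was not yet part of `str` at the time of this thread. The guard against an empty prefix is an assumption about what "modified" should mean.

```python
from typing import Tuple


def cut_prefix_checked(line: str, pre: str) -> Tuple[str, bool]:
    """Return (new_line, modified).

    Hypothetical helper: the startswith() check costs O(len(pre)),
    not O(len(line)).
    """
    # An empty prefix always "matches" but removes nothing, so it is
    # reported as unmodified here (an assumption of this sketch).
    if pre and line.startswith(pre):
        return line[len(pre):], True
    return line, False


line, modified = cut_prefix_checked("prefix: value", "prefix: ")
print(line, modified)  # value True
```

The caller gets the "was it modified?" answer without comparing the whole string afterwards.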
On Sat, 21 Mar 2020 at 17:45, Eric V. Smith firstname.lastname@example.org wrote:
On 3/21/2020 12:39 PM, Victor Stinner wrote:
Well, if CPython is modified to implement tagged pointers and supports storing short strings (a few latin1 characters) as a pointer, it may become harder to keep the same behavior for "x is y" where x and y are strings.
Good point. And I guess it's still a problem for interned strings, since even a copy could be the same object:
    s = 'for'
    s[:] is 'for'
So I now agree with Ned, we shouldn't be prescriptive here, and we should explicitly say in the PEP that there's no way to tell if the strip/cut/whatever took place, other than comparing via equality, not identity.
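A minimal sketch of that conclusion follows. The helper name `was_cut` is hypothetical. The identity result is deliberately left without an expected value, since it depends on the implementation.

```python
def was_cut(original: str, result: str) -> bool:
    """Hypothetical helper: detect removal by equality, never by identity."""
    return result != original


s = "for"
copy = s[:]  # a full slice, conceptually "a copy"

print(copy == s)  # True on any implementation
print(copy is s)  # implementation detail; do not rely on the answer

print(was_cut("prefix: x", "x"))  # True
print(was_cut("x", "x"))          # False
```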
On Sat, 21 Mar 2020 at 17:23, Eric V. Smith email@example.com wrote:
On 3/21/2020 11:20 AM, Ned Batchelder wrote:
On 3/20/20 9:34 PM, Cameron Simpson wrote:
On 20Mar2020 13:57, Eric Fahlgren firstname.lastname@example.org wrote:
On Fri, Mar 20, 2020 at 11:56 AM Dennis Sweeney email@example.com wrote:
> If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then
> ``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has
> been removed. If ``s`` does not have ``pre`` as a prefix, an
> unchanged copy of ``s`` is returned. In summary, ``s.cutprefix(pre)``
> is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.

The second sentence above unambiguously states that cutprefix returns 'an unchanged *copy*', but the example contradicts that and shows that 'self' may be returned and not a copy. I think it should be reworded to explicitly allow the optimization of returning self.
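For reference, the "roughly equivalent" expression from the quoted draft can be written as a plain function. This is only a sketch of the quoted wording, not the eventual implementation. `str` and `bytes` are used here as example types.

```python
def cutprefix(s, pre):
    """Sketch of the draft's 'roughly equivalent' expression."""
    # In the no-match branch this returns s itself, not a copy,
    # which is the contradiction pointed out above.
    return s[len(pre):] if s.startswith(pre) else s


print(cutprefix("HelloWorld", "Hello"))    # World
print(cutprefix("HelloWorld", "World"))    # HelloWorld
print(cutprefix(b"HelloWorld", b"Hello"))  # b'World'
```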
My versions of these (plain old functions) return self if unchanged, and are explicitly documented as doing so.
This has the concrete advantage that one can test for nonremoval of the suffix with "is", which is very fast, instead of == which may not be.
So one writes (assuming methods):
    prefix = cutsuffix(s, 'abc')
    if prefix is s:
        ...  # no change
    else:
        ...  # definitely changed, s != prefix also
I am explicitly in favour of returning self if unchanged.
Why be so prescriptive? The semantics of these functions should be about what the resulting string contains. Leave it to implementors to decide when it is OK to return self or not.
The only reason I can think of is to enable the test above: did a suffix/prefix removal take place? That seems like a useful thing. I think if we don't specify the behavior one way or the other, people are going to rely on CPython's behavior here, consciously or not.
Is there some python implementation that would have a problem with the "is" test, if we were being this prescriptive? Honest question.
Of course this would open the question of what to do if the suffix is the empty string. But since "'foo'.startswith('')" is True, maybe we'd have to return a copy in that case. It would be odd to have "s.startswith('')" be True, but "s.cutprefix('') is s" also be True. Or, since there's already talk in the PEP about what happens if the prefix/suffix is the empty string, if we adopt the "is" behavior we could add more details there. Like "if the result is the same object as self, it means either the suffix is the empty string, or self didn't start with the suffix".
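The empty-prefix corner case can be illustrated as below. This sketch uses the "roughly equivalent" expression from the PEP draft. The length comparison is not something proposed in this thread. It is shown only as one identity-free way to ask whether anything was actually removed, and the helper name `removed_something` is hypothetical.

```python
def removed_something(original: str, result: str) -> bool:
    """Hypothetical check: a successful non-empty cut always shortens the string."""
    return len(result) != len(original)


s = "foo"
pre = ""

print(s.startswith(pre))  # True: every string starts with ''

# The draft's 'roughly equivalent' expression, applied to an empty prefix.
result = s[len(pre):] if s.startswith(pre) else s

print(result == s)                   # True
print(removed_something(s, result))  # False: the prefix matched, yet nothing was removed
```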
Python-Dev mailing list -- firstname.lastname@example.org
To unsubscribe send an email to email@example.com
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://firstname.lastname@example.org/message/HYSZSIAZ...
Code of Conduct: http://python.org/psf/codeofconduct/