
I've said a few times that I think it would be good if the behavior were defined /in terms of __getitem__/'s behavior. If the rough behavior is this:

    def removeprefix(self, prefix):
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self[:]

Then you can shift all the guarantees about whether the return type is str and whether it might return `self` when the prefix is missing onto the implementation of __getitem__.

For CPython's implementation of str, `self[:]` returns `self`, so it's clearly true that __getitem__ is allowed to return `self` in some situations. Subclasses that do not override __getitem__ will return instances of the str base class, and subclasses that /do/ override __getitem__ can choose what they want to do. So someone could make their subclass do this:

    class MyStr(str):
        def __getitem__(self, key):
            # A full slice such as self[:] has start, stop and step all None.
            if isinstance(key, slice) and key.start is key.stop is key.step is None:
                return self
            return type(self)(super().__getitem__(key))

They would then get "removeprefix" and "removesuffix" for free, with the desired semantics and optimizations.

If we go with this approach (which again I think is much friendlier to subclassers), that obviates the problem of whether `self[:]` is a good summary of something that can return `self`: since "does the same thing as self[:]" /is/ the behavior it's trying to describe, there's no ambiguity.

Best,
Paul

On 3/25/20 1:36 PM, Dennis Sweeney wrote:
I'm removing the tuple feature from this PEP. So now, if I understand correctly, I don't think there's disagreement about behavior, just about how that behavior should be summarized in Python code.
Ethan Furman wrote:
> > It appears that in CPython, self[:] is self is true for base str objects, so I think return self[:] is consistent with (1) the premise that returning self is an implementation detail that is neither mandated nor forbidden, and (2) the premise that the methods should return base str objects even when called on str subclasses.
> The Python interpreter in my head sees self[:] and returns a copy. A note that says a str is returned would be more useful than trying to exactly mirror internal details in the Python "roughly equivalent" code.

I think I'm still in the camp that ``return self[:]`` more precisely prescribes the desired behavior. It would feel strange to me to write ``return self`` and then say "but you don't actually have to return self, and in fact you shouldn't when working with subclasses". To me, it feels like
    return (the original object unchanged, or a copy of the object, depending on implementation details, but always make a copy when working with subclasses)
is well-summarized by
    return self[:]
especially if followed by the text
Note that ``self[:]`` might not actually make a copy -- if the affix is empty or not found, and if ``type(self) is str``, then these methods may, but are not required to, make the optimization of returning ``self``. However, when called on instances of subclasses of ``str``, these methods should return base ``str`` objects, not ``self``.
...which is a necessary explanation regardless. Granted, ``return self[:]`` isn't perfect if ``__getitem__`` is overridden, but at the cost of three characters, the Python code gains accuracy about both the optional nature of returning ``self`` in all cases and the impossibility (assuming no dunders are overridden) of returning ``self`` for subclasses. It also dissuades readers from relying on the behavior of returning ``self``, which we're specifying is an implementation detail.
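For what it's worth, here is a minimal sketch of how the slicing-based summary plays out for the three cases discussed in this thread. The free function ``removeprefix`` and the classes ``PlainSub`` and ``KeepType`` are hypothetical names made up for illustration, not part of the proposal. Whether ``s[:]`` returns the very same object for a base ``str`` is a CPython implementation detail, so the sketch deliberately does not check that.

```python
def removeprefix(s, prefix):
    """Rough equivalent written as a free function (illustrative only)."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]


class PlainSub(str):
    """Hypothetical subclass that does not override __getitem__."""


class KeepType(str):
    """Hypothetical subclass whose slices preserve the subclass type."""

    def __getitem__(self, key):
        # A full slice (s[:]) has start, stop and step all None.
        if isinstance(key, slice) and key.start is key.stop is key.step is None:
            return self
        return type(self)(super().__getitem__(key))


base = "spam_eggs"
plain = PlainSub("spam_eggs")
keep = KeepType("spam_eggs")

results = {
    # Base str: the prefix is stripped as expected.
    "base_hit": removeprefix(base, "spam_"),
    # Subclass without __getitem__: slicing yields a base str object.
    "plain_miss_type": type(removeprefix(plain, "ham_")),
    # Subclass with __getitem__: slicing decides the result type ...
    "keep_hit_type": type(removeprefix(keep, "spam_")),
    # ... and may hand back the original object when nothing is removed.
    "keep_miss_is_self": removeprefix(keep, "ham_") is keep,
}
```

The last two entries show the caveat above: once ``__getitem__`` is overridden, ``self[:]`` no longer promises a base ``str``, and the result follows whatever the subclass chose.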
Is that text explanation satisfactory?
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4E77QD52...
Code of Conduct: http://python.org/psf/codeofconduct/