I've said a few times that I think it would be good if the
behavior were defined in terms of __getitem__'s behavior.
If the rough behavior is this:

    def removeprefix(self, prefix):
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self[:]

Then you can shift all the guarantees about whether the subtype is
str and whether it might return `self` when the prefix is missing
onto the implementation of __getitem__.
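
To make the delegation concrete, here is a minimal sketch of both methods written as free functions. The module-level names `removeprefix` and `removesuffix` are only for illustration, and the empty-suffix guard is my own assumption about how the suffix case would need to be written, since `s[:-0]` is the empty string:

```python
def removeprefix(s, prefix):
    # Every return value comes from slicing, so s.__getitem__ decides
    # both the result type and whether `s` itself may be returned.
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]


def removesuffix(s, suffix):
    # Assumed guard: without it, an empty suffix would give s[:-0],
    # which is s[:0], i.e. the empty string.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s[:]


text = "foobar"
print(removeprefix(text, "foo"))
print(removesuffix(text, "bar"))

# Whether the "nothing to remove" case returns `text` itself is
# whatever text[:] does on this implementation.
print(removeprefix(text, "xyz") is text, text[:] is text)
```

Nothing in these functions mentions `str` by name; all of the type and identity guarantees live in `__getitem__`.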
For CPython's implementation of str, `self[:]` returns `self`, so
it's clearly true that __getitem__ is allowed to return `self` in
some situations. Subclasses that do not override __getitem__ will
get base str objects back from slicing, and subclasses that do
override __getitem__ can choose what they want to do. So someone
could make their subclass do this:

    class MyStr(str):
        def __getitem__(self, key):
            # A full slice (self[:]) has start, stop and step all None.
            if (isinstance(key, slice)
                    and key.start is key.stop is key.step is None):
                return self
            return type(self)(super().__getitem__(key))

They would then get "removeprefix" and "removesuffix" for free,
with the desired semantics and optimizations.
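
As a rough check of that claim, the sketch below pairs a free-function stand-in for the proposed method with two subclasses. `removeprefix` (as a function) and `PlainStr` are hypothetical names used only here. The full-slice test compares `key.step`, since slice objects expose `start`, `stop` and `step`:

```python
def removeprefix(s, prefix):
    # Hypothetical free-function stand-in for the proposed method.
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]


class PlainStr(str):
    """Hypothetical subclass that inherits str.__getitem__ unchanged."""


class MyStr(str):
    def __getitem__(self, key):
        # Full slice: hand back the same object, as an optimization.
        if (isinstance(key, slice)
                and key.start is key.stop is key.step is None):
            return self
        # Anything else: slice as str, then re-wrap in the subclass.
        return type(self)(super().__getitem__(key))


plain = PlainStr("foobar")
mine = MyStr("foobar")

# Inherited __getitem__: results fall back to base str.
assert type(removeprefix(plain, "foo")) is str

# Overridden __getitem__: the subclass type is preserved ...
assert type(removeprefix(mine, "foo")) is MyStr
assert removeprefix(mine, "foo") == "bar"

# ... and a missing prefix returns the very same object.
assert removeprefix(mine, "xyz") is mine
```

The subclass author opts into each behavior in one place, and the affix methods never need to know about it.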
If we go with this approach (which again I think is much
friendlier to subclassers), that obviates the problem of whether
`self[:]` is a good summary of something that can return `self`:
since "does the same thing as self[:]" is the behavior
it's trying to describe, there's no ambiguity.
Best,
Paul
I'm removing the tuple feature from this PEP. So now, if I understand
correctly, I don't think there's disagreement about behavior, just
about how that behavior should be summarized in Python code.

Ethan Furman wrote:
> > It appears that in CPython, ``self[:] is self`` is true for base
> > str objects, so I think ``return self[:]`` is consistent with
> > (1) the premise that returning self is an implementation detail
> > that is neither mandated nor forbidden, and (2) the premise that
> > the methods should return base str objects even when called on
> > str subclasses.
>
> The Python interpreter in my head sees ``self[:]`` and returns a
> copy. A note that says a str is returned would be more useful than
> trying to exactly mirror internal details in the Python "roughly
> equivalent" code.

I think I'm still in the camp that ``return self[:]`` more precisely
prescribes the desired behavior. It would feel strange to me to write
``return self`` and then say "but you don't actually have to return
self, and in fact you shouldn't when working with subclasses". To me,
it feels like

    return (the original object unchanged, or a copy of the object,
            depending on implementation details, but always make a
            copy when working with subclasses)

is well-summarized by

    return self[:]

especially if followed by the text

    Note that ``self[:]`` might not actually make a copy -- if the
    affix is empty or not found, and if ``type(self) is str``, then
    these methods may, but are not required to, make the optimization
    of returning ``self``. However, when called on instances of
    subclasses of ``str``, these methods should return base ``str``
    objects, not ``self``.

...which is a necessary explanation regardless. Granted,
``return self[:]`` isn't perfect if ``__getitem__`` is overridden, but
at the cost of three characters, the Python gains accuracy over both
the optional nature of returning ``self`` in all cases and the
impossibility (assuming no dunders are overridden) of returning self
for subclasses.
It also dissuades readers from relying on the behavior of returning self, which we're specifying is an implementation detail.

Is that text explanation satisfactory?
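
For anyone who wants to check the two premises directly, here is a small sketch using plain slicing. `Sub` and `full_slice_is_same_object` are hypothetical names, and the first printed value reflects an implementation detail (observed on CPython), not a language guarantee:

```python
class Sub(str):
    """Hypothetical str subclass that does not override __getitem__."""


def full_slice_is_same_object(obj):
    # True when obj[:] hands back obj itself rather than a copy.
    return obj[:] is obj


base = "spam"
sub = Sub("spam")

# Premise (1): for exact str, returning the same object is allowed
# but not required, so this is printed rather than asserted.
print("str[:] is the same object:", full_slice_is_same_object(base))

# Premise (2): slicing a subclass instance yields a base str, so the
# result can never be `sub` itself.
copy = sub[:]
assert type(copy) is str
assert copy == sub
assert not full_slice_is_same_object(sub)
```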