On Fri, Mar 29, 2019 at 04:05:55PM -0700, Christopher Barker wrote:
> This proposal would provide a minor gain for an even more minor disruption.
I don't think that is correct. I think you are underestimating the gain and exaggerating the disruption :-)

Cutting a prefix or suffix from a string is a common task, and there is no obvious "battery" in the std lib available for it. And there is a long history of people mistaking strip() and friends for that battery. The problem is that it seems to work:

    py> "something.zip".rstrip(".zip")
    'something'

until it doesn't:

    py> "something.jpg".rstrip(".jpg")
    'somethin'

(rstrip() treats its argument as a set of characters to remove, not as a suffix, so the trailing "g" of "something" goes too.)

It is *very common* for people to trip over this and think they have found a bug:

    https://duckduckgo.com/?q=python+bug+in+strip

I would guesstimate that for every person who thinks they have found a bug, there are probably a hundred who trip over this and then realise their error without ever going public.

I believe this is a real pain point for people doing string processing. I know it has bitten me once or twice.

The correct solution is a verbose statement:

    if string.startswith("spam"):
        string = string[len("spam"):]

which repeats itself (*two* references to the prefix being removed, *three* references to the string being cut). The expression form is no better:

    process(a, b, string[len("spam"):] if string.startswith("spam") else string, c)

and heaven help you if you need to cut from both ends. To make that practical, you really need a helper function.

Now that's fine as far as it goes, but why do we make people re-invent the wheel over and over again? A pair of "cut" methods (cut prefix, cut suffix) fills a real need, and will avoid a lot of mistaken bug reports/questions.

As for the disruption, I don't see that this will cause *any* disruption at all, beyond bike-shedding the method names and doing an initial implementation. It is a completely backwards compatible change. Since we can't monkey-patch builtins, this isn't going to break anyone's use of str. Any subclasses of str which define the same methods will still work.
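To make the helper function mentioned above concrete, here is a minimal sketch of the kind of thing people end up writing by hand. The names cut_prefix and cut_suffix are hypothetical, chosen only for illustration; they are not in the std lib and are not a proposed spelling for the methods.

```python
def cut_prefix(s, prefix):
    """Return s without the leading prefix, or s unchanged if it is absent."""
    if prefix and s.startswith(prefix):
        return s[len(prefix):]
    return s


def cut_suffix(s, suffix):
    """Return s without the trailing suffix, or s unchanged if it is absent."""
    # Guard against an empty suffix: s[:-0] would be the empty string.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


# The helper removes exactly the suffix; rstrip() removes a set of characters.
print(cut_suffix("something.jpg", ".jpg"))   # something
print("something.jpg".rstrip(".jpg"))        # somethin

# Cutting from both ends is just a matter of nesting the two calls.
print(cut_prefix(cut_suffix("spam-eggs-spam", "-spam"), "spam-"))  # eggs
```

Note the empty-suffix guard: it is exactly the sort of detail that is easy to get wrong when everyone re-invents this for themselves.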
I've sometimes said in the past that any change will break *someone's* code, and so we should be risk-averse. I still stand by that, but we shouldn't be *so risk-averse* that we're paralysed. Breaking users' code is a cost, but there is also the uncounted opportunity cost of *not* adding this useful battery.

If we don't add these new methods, how many hundreds of users over the next decade will we condemn to repeating the same old misuse of strip() that we have seen so often in the past? How much developer time will be wasted writing, and then closing, bug reports like this?

    https://bugs.python.org/issue5318

Inaction has costs too.

I can only think of one scenario where this change might break someone's code:

- we decide on method names (let's say) lcut and rcut;

- somebody else already has a class with lcut and rcut;

- which does something completely different;

- and they use hasattr() to decide whether to call those methods, rather than isinstance:

      if hasattr(myobj, 'lcut'):
          print(myobj.lcut(1, 2, 3, 4))
      else:
          ...  # do something else

- and they sometimes pass strings into this code.

In 3.7 and older, ordinary strings will take the second path. If we add these methods, they will take the first path.

But the chances of this actually being more than a trivially small problem for anyone in real life are so small that I don't know why I even raise it. This isn't a minor disruption. It's a small possibility of a minor disruption to a tiny set of users who can fix the breakage easily.

The functionality is clear, meets a real need, is backwards compatible, and has no significant downsides. The only hard part is bikeshedding names for the methods:

    lcut rcut
    cutprefix cutsuffix
    ltrim rtrim
    prestrip poststrip
    etc.

Am I wrong about any of these statements?


-- 
Steven