
I'm new to Python and this is my first suggestion, so please bear with me: I believe there is a simple but useful string operation missing from Python: subtraction. This is best described by way of example:
If b and c were strings, then:

    a = b - c

would be equivalent to:

    if b.find(c) < 0:
        a = b
    else:
        a = b[:b.find(c)] + b[b.find(c)+len(c):]

The operation would remove from the minuend the first occurrence (searching from left to right) of the subtrahend. In the case of no match, the minuend would be returned unmodified. To those unfamiliar with string subtraction, it might seem non-intuitive, but it's a useful programming construct. Many things can be done with it and it's a good way to keep code simple. I think it would be preferable to the current interpreter response:
As the interpreter currently checks for this attempted operation, it seems it would be straightforward to add the code needed to do something useful with it. I don't think there would be backward compatibility issues, as this would be a new feature in place of a fatal error.
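For readers who want to try the proposed semantics, here is a minimal sketch. The class name SubStr is hypothetical and exists only for illustration; nothing like it is part of Python. It relies on str.replace with a count of 1, which should give the same result as the find-based version above. The last lines show what the interpreter does today for plain strings.

```python
class SubStr(str):
    """Hypothetical str subclass used only to illustrate the proposal."""

    def __sub__(self, other):
        # Remove the first (leftmost) occurrence of `other`, if any.
        # If `other` does not occur, the string is returned unchanged.
        return SubStr(self.replace(other, '', 1))


b = SubStr("spam and eggs and spam")
print(repr(b - "spam"))   # ' and eggs and spam'
print(repr(b - "ham"))    # 'spam and eggs and spam'

# Plain str objects do not support subtraction today.
try:
    "abc" - "b"
except TypeError as exc:
    print(exc)  # unsupported operand type(s) for -: 'str' and 'str'
```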

People already do this with the s1.replace(s2, "") idiom. I'm not sure what the added value is. Your equivalent implementation looks pretty strange and complex: how is it different from str.replace with the empty string as second argument? cheers lvh

On Thu, 14 Oct 2010 20:58:52 +0000 Dave Jakeman <davejakeman@hotmail.com> wrote:
The existing construct:

    a = b.replace(c, '', 1)

The problem isn't that it's non-intuitive (there's only one intuitive interface, and it's got nothing to do with computers), it's that there are a wealth of "intuitive" meanings. A case can be made that it should mean the same as any of these:

    a = b.replace(c, '')
    a = b.replace(c, ' ', 1)
    a = b.replace(c, ' ')

For that matter, it might also mean the same thing as any of these:

    a = re.sub(r'\s*%s\s*' % c, '', b, 1)
    a = re.sub(r'\s*%s\s*' % c, '', b)
    a = re.sub(r'\s*%s\s*' % c, ' ', b, 1)
    a = re.sub(r'\s*%s\s*' % c, ' ', b)
    a = re.sub(r'%s\s*' % c, '', b, 1)
    a = re.sub(r'%s\s*' % c, '', b)
    a = re.sub(r'%s\s*' % c, ' ', b, 1)
    a = re.sub(r'%s\s*' % c, ' ', b)
    a = re.sub(r'\s*%s' % c, '', b, 1)
    a = re.sub(r'\s*%s' % c, '', b)
    a = re.sub(r'\s*%s' % c, ' ', b, 1)
    a = re.sub(r'\s*%s' % c, ' ', b)

Unless you can make a clear case as to why exactly one of those cases is different enough from the others to warrant a syntax all its own, it's probably best to be explicit about the desired behavior.

<mike
--
Mike Meyer <mwm@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
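To make the ambiguity concrete, here is a small sketch that applies a few of the candidate meanings to one sample string. The sample values are made up for illustration, and the use of re.escape is an assumption added here so that c is matched literally; the expressions above interpolate c directly.

```python
import re

b = "one two three two"   # sample input, chosen for illustration
c = "two"
pat = re.escape(c)        # assumption: match c literally, not as a regex

first_only = b.replace(c, '', 1)
all_matches = b.replace(c, '')
collapse_ws = re.sub(r'\s*%s\s*' % pat, ' ', b, count=1)
eat_trailing_ws = re.sub(r'%s\s*' % pat, '', b)

print(repr(first_only))       # 'one  three two'
print(repr(all_matches))      # 'one  three '
print(repr(collapse_ws))      # 'one three two'
print(repr(eat_trailing_ws))  # 'one three '
```

Four plausible readings give four different results, which is the argument for spelling out the intended behavior.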

On Thu, 14 Oct 2010 23:49:25 +0200 Masklinn <masklinn@masklinn.net> wrote:
Well, if you use the standard left-to-right ordering, that equality doesn't hold for the proposed meaning for string subtraction:

    ("xyzzy and " + "xyzzy") - "xyzzy" = " and xyzzy" != "xyzzy and "

It won't hold for any of the definitions I proposed either - not if a contains a copy of b. Come to think of it, it doesn't hold for the computer representation of numbers, either.

<mike
--
Mike Meyer <mwm@mired.org>
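The counterexample can be checked with a short sketch. The helper name subtract is hypothetical and stands in for the proposed operator. The floating-point lines are one possible reading of the remark about computer numbers, not necessarily the case the author had in mind.

```python
def subtract(b, c):
    """Hypothetical stand-in for b - c: drop the leftmost occurrence of c."""
    return b.replace(c, '', 1)


a = "xyzzy and "
b = "xyzzy"
result = subtract(a + b, b)
print(repr(result))   # ' and xyzzy'
print(result == a)    # False

# A similar failure with floats: the small term is lost to rounding.
x, y = 0.1, 1e16
print((x + y) - y == x)   # False
```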

Here's a useful function along these lines, which ideally would be string.remove():

    def remove(s, sub, maxremove=None, sep=None):
        """Removes instances of sub from the string.

        Args:
          s: The string to be modified.
          sub: The substring to be removed.
          maxremove: If specified, the maximum number of instances to be
              removed (starting from the left). If omitted, removes all
              instances.
          sep: Optionally, the separators to be removed. If the separator
              appears on both sides of a removed substring, one of the
              separators is removed.
        """
        processed = ''
        remaining = s
        while maxremove is None or maxremove > 0:
            parts = remaining.split(sub, 1)
            if len(parts) == 1:
                return processed + remaining
            processed += parts[0]
            remaining = parts[1]
            if sep and processed.endswith(sep) and remaining.startswith(sep):
                remaining = remaining[len(sep):]
            if maxremove is not None:
                maxremove -= 1
        return processed + remaining

--- Bruce
Latest blog post: http://www.vroospeak.com/2010/10/today-we-are-all-chileans.html
Learn how hackers think: http://j.mp/gruyere-security

On Thu, Oct 14, 2010 at 3:16 PM, spir <denis.spir@gmail.com> wrote:

On Thu, Oct 14, 2010 at 7:45 PM, Bruce Leban <bruce@leapyear.org> wrote:
Could be written as

    def remove(string, sub, max_remove=-1, sep=None):
        if sep:
            sub = sub + sep
        return string.replace(sub, '', max_remove)

    t = 'test,blah,blah,blah,this'
    print(remove(t, 'blah'))
    print(remove(t, 'blah', 2))
    print(remove(t, 'blah', sep=','))
    print(remove(t, 'blah', 2, ','))
    print(remove('foo(1)blah(2)blah(3)bar', 'blah', 1))

Dj Gilcrease

Your code operates differently for "test blah,this". My code produces "test ,this" while yours produces "test this". Eliding multiple separators is perhaps more useful when sep=' ' but I used commas because they're easier to see. An alternative design removes one separator either before or after a removed string (but not both). That would work better for an example like this:
    remove('The Illuminati fnord are everywhere fnord.', 'fnord', sep=' ')
    'The Illuminati are everywhere.'
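A rough sketch of that alternative design follows, under some assumptions: the function name remove_one_sep is made up, and the choice to prefer the separator after the removed string over the one before it is arbitrary, since the description above does not say which side wins.

```python
def remove_one_sep(s, sub, sep=None, maxremove=None):
    """Hypothetical sketch: remove sub plus at most one adjacent separator."""
    pieces = []
    remaining = s
    removed = 0
    while maxremove is None or removed < maxremove:
        head, found, tail = remaining.partition(sub)
        if not found:
            break
        if sep:
            # Assumption: drop the separator after the match if present,
            # otherwise the one before it, but never both.
            if tail.startswith(sep):
                tail = tail[len(sep):]
            elif head.endswith(sep):
                head = head[:-len(sep)]
        pieces.append(head)
        remaining = tail
        removed += 1
    return ''.join(pieces) + remaining


text = 'The Illuminati fnord are everywhere fnord.'
print(remove_one_sep(text, 'fnord', sep=' '))
# The Illuminati are everywhere.
```

Like the versions above, this only looks at the text immediately around each match, so back-to-back occurrences of sub may not behave as expected.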
Neither version of this may have sufficient utility to be added to the standard library.

--- Bruce
http://www.vroospeak.com
http://j.mp/gruyere-security

On Thu, Oct 14, 2010 at 8:22 PM, Dj Gilcrease <digitalxero@gmail.com> wrote:

participants (8)
- Bruce Leban
- Dave Jakeman
- Dj Gilcrease
- Dougal Matthews
- Laurens Van Houtven
- Masklinn
- Mike Meyer
- spir