[Python-ideas] string.swap()

spir denis.spir at free.fr
Mon Apr 6 12:46:29 CEST 2009


Hello,

Some time ago on this list was mentionned swapping sub-strings inside a string. A short while later I had a use case case for that feature; this led me to explore this topic in a general case defined as:

* Sub-strings to be swapped must not be single characters.
* The string can hold any char or sequence of chars: there is then no safe choice usable as temporary place holder. So that the common scheme s1->temp / s2->s1 / temp->s2 using string.replace() cannot be used.
* But the issue of overlapping substrings is not taken into account.

I asked for algorithms on the tutor mailing list (see below). In addition to the 'naive', but tricky, step-by-step process, there was:
~ A proposal with regex, 'cheating' with the sub() method (I had some similar method, but Kent's is much smarter than mine).
~ A very clever proposal that splits the string into a list and joins back.

My conclusions about this feature are as follows:
-0- It seems rarely needed.
-1- It is of general use, as opposed to domain-specific.
-2- It is much less trivial/obvious than it seems.
-3- It maps conceptually to a single, whole, operation.

What do you think? Is it worth having it Python?


Below exemples of implementation. Note that behind the scene the regex method must perform something analog to the the naive one.
Timing produced the following results:
~ For any reason, the regex version's time is very unstable on smaller strings / number of replacements / number of trials (up to ~10000). Time varies from 60% to 120% of the naive version time.
~ On bigger trials (~ 100000) regex and naive versions are about as (in)efficient.
~ The list version is about 5 times faster in all cases.

def swapNaive(text, s1, s2):
    new_text = ""
    l1, l2 = len(s1), len(s2)
    while (text.count(s1) > 0) or (text.count(s2) > 0):
        i1, i2 = text.find(s1), text.find(s2)
        if i1>=0 and (i2<0 or i1 < i2):
            new_text += text[:i1] + s2
            text = text[i1+l1:]
        else:
            new_text += text[:i2] + s1
            text = text[i2+l2:]
    new_text += text
    return new_text

### proposed by Albert T. Hofkamp
### on the tutor mailing list
def swapList(text, s1, s2):
    pieces = text.split(s1)
    pieces = [p.replace(s2, s1) for p in pieces]
    return s2.join(pieces)

### proposed by Kent Johnson
### on the tutor mailing list
import re
def swapRegex(text, s1, s2):
    def replace(m):
        return s2 if m.group()==s1 else s1
    matcher = re.compile('%s|%s' % (re.escape(s1), re.escape(s2)))
    return matcher.sub(replace, text)

Denis
------
la vita e estrany



More information about the Python-ideas mailing list