Multiple string replacement...

Fredrik Lundh effbot at telia.com
Fri Sep 22 11:19:06 EDT 2000


Simon wrote:
> This all works OK, except...
> 
> I set up a ROT13 replacement set, which looks like:
> [['A', 'N'], ['B', 'O'], ['C', 'P'], ['D', 'Q'], ['E', 'R'], ['F', 'S'],
> ['G', 'T'], ['H', 'U'], ['I', 'V'], ['J', 'W'], ['K', 'X'], ['L', 'Y'],
> ['M', 'Z'], ['N', 'A'], ... and so on. The problem here is that after 'A'
> has been replaced by 'N', it gets replaced back again.
> 
> Anyone got a cunning plan?
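
for reference, the problem can be reproduced with a few lines. This is only a sketch: the helper name naive_replace and the sample text are made up for illustration, and it assumes the pairs are applied one after another with plain string replacement (written for Python 3):

```python
import string

# ROT13 pairs for the upper-case letters, in the same shape as the list above
upper = string.ascii_uppercase
pairs = [[c, upper[(i + 13) % 26]] for i, c in enumerate(upper)]

def naive_replace(text, pairs):
    # hypothetical helper: each pass works on the output of the previous
    # one, so the ['N', 'A'] pair undoes what ['A', 'N'] did earlier
    for old, new in pairs:
        text = text.replace(old, new)
    return text

result = naive_replace("ANNA", pairs)
print(result)  # AAAA, where ROT13 should have given NAAN
```

every letter ends up in the first half of the alphabet, because the second half of the list undoes the first half.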

here's one (reusable) way to do it:

import re, string

class MultiReplace:
    def __init__(self, repl_dict):
        # "compile" replacement dictionary

        # assume char to char mapping
        charmap = map(chr, range(256))
        for k, v in repl_dict.items():
            if len(k) != 1 or len(v) != 1:
                self.charmap = None
                break
            charmap[ord(k)] = v
        else:
            self.charmap = string.join(charmap, "")
            return

        # string to string mapping; use a regular expression
        keys = repl_dict.keys()
        keys.sort() # lexical order
        keys.reverse() # try longer keys before their own prefixes
        pattern = string.join(map(re.escape, keys), "|")
        self.pattern = re.compile(pattern)
        self.dict = repl_dict

    def replace(self, str):
        # apply replacement dictionary to string
        if self.charmap:
            return string.translate(str, self.charmap)
        def repl(match, get=self.dict.get):
            item = match.group(0)
            return get(item, item)
        return self.pattern.sub(repl, str)

>>> r = MultiReplace({"spam": "eggs", "eggs": "spam"})
>>> print r.replace("spam&eggs")
eggs&spam

>>> r = MultiReplace({"a": "b", "b": "a"})
>>> print r.replace("keaba")
kebab
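
the class above targets the Python of its day (string module functions, list-returning map() and keys(), the print statement). For Python 3, here is a rough sketch of the same idea, not the original code: str.maketrans and str.translate stand in for the 256-entry character map, and the regular expression keys are tried longest first, which is an assumption about the matching behaviour you want.

```python
import re
import string

class MultiReplace:
    """Python 3 sketch: apply all replacements in a single pass."""

    def __init__(self, repl_dict):
        self.table = None
        if all(len(k) == 1 and len(v) == 1 for k, v in repl_dict.items()):
            # char to char mapping: build a translation table
            self.table = str.maketrans(repl_dict)
            return
        # string to string mapping; use a regular expression.
        # Longest keys first (an assumption), so that a key is never
        # shadowed by one of its own prefixes.
        keys = sorted(repl_dict, key=len, reverse=True)
        self.pattern = re.compile("|".join(map(re.escape, keys)))
        self.dict = repl_dict

    def replace(self, text):
        # apply replacement dictionary to string
        if self.table is not None:
            return text.translate(self.table)
        # every match is one of the keys, so a direct lookup is enough
        return self.pattern.sub(lambda m: self.dict[m.group(0)], text)

# the ROT13 case from the question, upper-case letters only
upper = string.ascii_uppercase
rot13 = MultiReplace({c: upper[(i + 13) % 26] for i, c in enumerate(upper)})
print(rot13.replace("HELLO"))  # URYYB

swap = MultiReplace({"spam": "eggs", "eggs": "spam"})
print(swap.replace("spam&eggs"))  # eggs&spam
```

in both branches the input is scanned once and replaced text is never looked at again, which is why 'A' -> 'N' is not turned back into 'A'.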

</F>

<!-- (the eff-bot guide to) the standard python library:
http://www.pythonware.com/people/fredrik/librarybook.htm
-->



