Optimization help needed: Search and Replace using dictionary of parameters

maxm maxm at mxm.dk
Tue Jan 1 06:34:16 EST 2002


"Pekka Niiranen" <krissepu at vip.fi> wrote in message
news:3C309E64.6A7E37DD at vip.fi...
> How can I do this most efficiently:
>
>     I have filenames and parameters in a sparse matrix that is a
> dictionary:
>

I don't remember who posted this snippet on c.l.py a long time ago, but it
works like a charm, so I haven't bothered changing it.

Actually it ought to be a standard method on the string object, or at least
available somewhere in the standard distribution, as it gets asked about
regularly on the group.

But here it is again. Using this it should be easy to just iterate over the
files you want changed and then use the class on each one.
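
A rough sketch of that loop, written for a current Python 3 rather than the
Python of this post. The dictionary layout {filename: {old_text: new_text}} is
only an assumption about the structure described in the question, and the names
rewrite_files, params_by_file and make_replacer are hypothetical:

```python
from pathlib import Path


def rewrite_files(params_by_file, make_replacer):
    """Rewrite each file in place using its own replacement dictionary.

    params_by_file: assumed layout {filename: {old_text: new_text}}.
    make_replacer:  callable that takes such an inner dict and returns an
                    object with a replace(text) method.
    Returns the list of filenames whose contents actually changed.
    """
    changed = []
    for filename, params in params_by_file.items():
        if not params:
            continue  # sparse entry: nothing to replace in this file
        path = Path(filename)
        old_text = path.read_text()
        new_text = make_replacer(params).replace(old_text)
        if new_text != old_text:
            path.write_text(new_text)  # only touch files that changed
            changed.append(filename)
    return changed
```

Any class with the same constructor and replace() method as MultiReplace can be
passed as make_replacer, so one replacer is built per file and each file is read
and written at most once.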

regards Max M

###################################################

import re, string

class MultiReplace:

    def __init__(self, repl_dict):
        # "compile" replacement dictionary

        # assume char to char mapping
        charmap = map(chr, range(256))
        for k, v in repl_dict.items():
            if len(k) != 1 or len(v) != 1:
                self.charmap = None
                break
            charmap[ord(k)] = v
        else:
            self.charmap = string.join(charmap, "")
            return

        # string to string mapping; use a regular expression
        keys = repl_dict.keys()
        keys.sort() # lexical order
        keys.reverse() # longer keys first, so "ab" is tried before "a"
        pattern = string.join(map(re.escape, keys), "|")
        self.pattern = re.compile(pattern)
        self.dict = repl_dict


    def replace(self, str):
        # apply replacement dictionary to string
        if self.charmap:
            return string.translate(str, self.charmap)
        def repl(match, get=self.dict.get):
            item = match.group(0)
            return get(item, item)
        return self.pattern.sub(repl, str)


if __name__ == '__main__':

    r = MultiReplace({"spam": "eggs", "eggs": "spam"})
    print r.replace("spam&eggs")
    ## eggs&spam

    r = MultiReplace({"a": "b", "b": "a"})
    print r.replace("keaba")
    ## kebab
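
###################################################

The code above is Python 2 (print statement, string.join, list-returning map).
For a current interpreter, a minimal Python 3 sketch of the same idea might look
like this. It is an adaptation, not the snippet as originally posted: it uses
str.maketrans for the char-to-char case and sorts keys by length for the regular
expression case.

```python
import re


class MultiReplace:
    """Apply many replacements to a string in a single pass."""

    def __init__(self, repl_dict):
        self.table = None
        if all(len(k) == 1 and len(v) == 1 for k, v in repl_dict.items()):
            # Pure char-to-char mapping: a translation table is enough.
            self.table = str.maketrans(repl_dict)
            return
        # String-to-string mapping: build one alternation pattern.
        # Longest keys first, so a key is tried before any of its prefixes.
        keys = sorted(repl_dict, key=len, reverse=True)
        self.pattern = re.compile("|".join(map(re.escape, keys)))
        self.dict = dict(repl_dict)

    def replace(self, text):
        if self.table is not None:
            return text.translate(self.table)
        # Every match is one of the keys, so a direct lookup is safe.
        return self.pattern.sub(lambda m: self.dict[m.group(0)], text)


if __name__ == "__main__":
    r = MultiReplace({"spam": "eggs", "eggs": "spam"})
    print(r.replace("spam&eggs"))  # eggs&spam

    r = MultiReplace({"a": "b", "b": "a"})
    print(r.replace("keaba"))  # kebab
```

Because each position in the input is matched once and replaced text is never
rescanned, swapping two strings works, which chained str.replace calls cannot do.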





