String multi-replace
Benjamin Kaplan
benjamin.kaplan at case.edu
Wed Nov 17 23:30:12 EST 2010
On Wed, Nov 17, 2010 at 11:21 PM, Sorin Schwimmer <sxn02 at yahoo.com> wrote:
> Hi All,
>
> I have to eliminate diacritics in a fairly large file.
>
> Inspired by http://code.activestate.com/recipes/81330/, I came up with the following code:
>
> #! /usr/bin/env python
>
> import re
>
> nodia={chr(196)+chr(130):'A', # mamaliga
> chr(195)+chr(130):'A', # A^
> chr(195)+chr(142):'I', # I^
> chr(195)+chr(150):'O', # OE
> chr(195)+chr(156):'U', # UE
> chr(195)+chr(139):'A', # AE
> chr(197)+chr(158):'S',
> chr(197)+chr(162):'T',
> chr(196)+chr(131):'a', # mamaliga
> chr(195)+chr(162):'a', # a^
> chr(195)+chr(174):'i', # i^
> chr(195)+chr(182):'o', # oe
> chr(195)+chr(188):'u', # ue
> chr(195)+chr(164):'a', # ae
> chr(197)+chr(159):'s',
> chr(197)+chr(163):'t'
> }
> name="R\xc3\xa2\xc5\x9fca"
>
> regex = re.compile("(%s)" % "|".join(map(re.escape, nodia.keys())))
> print regex.sub(lambda mo: dict[mo.string[mo.start():mo.end()]], name)
>
> But it won't work; I end up with:
>
> Traceback (most recent call last):
> File "multirep.py", line 25, in <module>
> print regex.sub(lambda mo: dict[mo.string[mo.start():mo.end()]], name)
> File "multirep.py", line 25, in <lambda>
> print regex.sub(lambda mo: dict[mo.string[mo.start():mo.end()]], name)
> TypeError: 'type' object is not subscriptable
>
> What am I doing wrong?
>
> Thanks for your advice,
> SxN
>
dict is the built-in type, not your dictionary, so subscripting it
raises that TypeError. Your dictionary is called nodia. I'm guessing
that's what you meant to use in the lambda.
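
For reference, here is a minimal sketch of the corrected substitution. It is not a drop-in replacement for the script above: it assumes Python 3 and works on bytes (the original is Python 2, where str is already a byte string), it carries only a subset of the table, and the helper name strip_diacritics is made up for illustration. It also uses mo.group(0), which returns the same text as mo.string[mo.start():mo.end()].

```python
import re

# UTF-8 byte sequences mapped to ASCII replacements.
# Only a subset of the original table, written as byte literals.
nodia = {
    b"\xc4\x83": b"a",  # a with breve
    b"\xc3\xa2": b"a",  # a with circumflex
    b"\xc3\xae": b"i",  # i with circumflex
    b"\xc5\x9f": b"s",  # s with cedilla
    b"\xc5\xa3": b"t",  # t with cedilla
}

# One alternation that matches any key of the table.
regex = re.compile(b"(" + b"|".join(map(re.escape, nodia)) + b")")


def strip_diacritics(data):
    """Replace every byte sequence found in nodia (hypothetical helper)."""
    # The lookup goes through nodia, not the built-in dict type.
    return regex.sub(lambda mo: nodia[mo.group(0)], data)


name = b"R\xc3\xa2\xc5\x9fca"
print(strip_diacritics(name).decode("ascii"))
```

With the sample name from the question this should print Rasca. Input that contains no key from the table passes through unchanged.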