python tr equivalent (non-ascii)
fredrik at pythonware.com
Wed Aug 13 10:33:13 CEST 2008
> I was wondering how I ought to be handling character range
> translations in python.
> What I want to do is translate fullwidth numbers and roman alphabet
> characters into their halfwidth ascii equivalents.
> In perl I can do this pretty easily with tr:
> and I think the string.translate method is what I need to use to
> achieve the equivalent in python. Unfortunately the maketrans method
> doesn't seem to accept character ranges and I'm also having trouble
> with its interpretation of length. What I came up with was to first
> fudge the ranges:
> my_test_string = u"ＡＢＣＤＥＦＧ"
> f_range = "".join([unichr(x) for x in
> t_range = "".join([unichr(x) for x in
> then use these as input to maketrans:
> my_trans_string =
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 0-93: ordinal not in range(128)
maketrans only works for byte strings.
as for translate itself, it has different signatures for byte strings
and unicode strings; in the former case, it takes a lookup table
represented as a 256-byte string (e.g. created by maketrans), in the
latter case, it takes a dictionary mapping from ordinals to ordinals or
unicode strings. something like
    # fullwidth forms U+FF01..U+FF5E map to ASCII U+0021..U+007E
    lut = dict((0xff00 + ch, 0x0020 + ch) for ch in range(0x01, 0x5f))
    new_string = old_string.translate(lut)
could work (untested).
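for reference, here is a minimal self-contained sketch of the same dictionary-based approach. it assumes Python 3, where every str is a unicode string and str.translate takes a dict keyed by ordinal. the names make_fullwidth_table and to_halfwidth and the constants are made up for illustration, and the extra entry for the ideographic space (U+3000) is an optional addition that is not part of the original question.

```python
# Sketch for Python 3: str is always Unicode, and str.translate()
# accepts a dict mapping code points (ordinals) to code points.

# The fullwidth ASCII variants occupy U+FF01..U+FF5E and sit at a
# fixed offset of 0xFEE0 above their ASCII counterparts U+0021..U+007E.
FULLWIDTH_START = 0xFF01
FULLWIDTH_END = 0xFF5E
OFFSET = 0xFEE0


def make_fullwidth_table():
    """Build a translate() table mapping fullwidth forms to ASCII."""
    table = {cp: cp - OFFSET
             for cp in range(FULLWIDTH_START, FULLWIDTH_END + 1)}
    # Optional extra (assumption): the ideographic space U+3000 lives
    # outside the fullwidth block, so it is mapped by hand.
    table[0x3000] = 0x0020
    return table


def to_halfwidth(text):
    """Return text with fullwidth ASCII variants replaced by ASCII."""
    return text.translate(make_fullwidth_table())


print(to_halfwidth("ＡＢＣＤＥＦＧ"))  # ABCDEFG
print(to_halfwidth("１２３ａｂｃ"))    # 123abc
```

characters that have no entry in the table, such as plain ASCII or halfwidth katakana, pass through unchanged, which is why the range is limited to U+FF01..U+FF5E rather than the whole block.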