Replace accented chars with unaccented ones
jcarlson at nospam.uci.edu
Thu Mar 18 04:12:57 CET 2004
> Josiah Carlson <jcarlson at nospam.uci.edu> wrote in message news:<c37ugc$llq$1 at news.service.uci.edu>...
>>> r += xlate[ord(i)]
>>> r += i
>>Perhaps I'm going to have to create a signature and drop information
>>about this in every post to c.l.py, but repeated string additions are
>>slow as hell for any reasonably long string. It is much
>>faster to place characters into a list and ''.join() them.
> True. Is this better?
> ... body of latin1_to_ascii() ...
>     r = []
>     for i in unicrap:
>         if ord(i) in xlate:
>             r.append(xlate[ord(i)])
>         elif ord(i) >= 0x80:
>             pass
>         else:
>             r.append(i)
>     return ''.join(r)
''.join([xlate.get(ord(i), i) for i in unicrap
         if ord(i) in xlate or ord(i) < 0x80])
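For anyone who wants to run the one-liner on its own, here is a minimal sketch. The four-entry xlate table is a hypothetical, heavily trimmed stand-in for the real one (the full table is not quoted in this thread), and the escapes are there only to keep the source pure ASCII.

```python
# Hypothetical stand-in for the recipe's xlate table: maps a Latin-1
# code point to its ASCII replacement. The real table is much larger.
xlate = {
    0xc9: 'E',   # E with acute accent
    0xe9: 'e',   # e with acute accent
    0xf1: 'n',   # n with tilde
    0xdf: 'ss',  # German sharp s
}

def latin1_to_ascii(unicrap):
    """Translate mapped characters, keep plain ASCII, drop everything else."""
    return ''.join([xlate.get(ord(i), i) for i in unicrap
                    if ord(i) in xlate or ord(i) < 0x80])

print(latin1_to_ascii('caf\xe9 ma\xf1ana'))
```

The filter clause does the dropping: a character at or above 0x80 with no entry in the table never reaches xlate.get(), so the default argument only ever passes plain ASCII through unchanged.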
In general, r.append() is faster than string addition, but it is still
significantly slower than a list comprehension.
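If you would rather measure than take my word for it, a rough timeit sketch along these lines compares the three approaches. The two-entry table, the sample text, and the function names are all made up for illustration, and the numbers will vary with interpreter version and input size, so treat it as a way to check the claim on your own machine rather than as a benchmark.

```python
import timeit

xlate = {0xe9: 'e', 0xf1: 'n'}            # hypothetical two-entry table
text = 'caf\xe9 ma\xf1ana \xa9 ' * 1000   # made-up sample input

def with_concat(s):
    # Repeated string addition, as in the original recipe.
    r = ''
    for i in s:
        if ord(i) in xlate:
            r += xlate[ord(i)]
        elif ord(i) < 0x80:
            r += i
    return r

def with_append(s):
    # Collect pieces in a list, join once at the end.
    r = []
    for i in s:
        if ord(i) in xlate:
            r.append(xlate[ord(i)])
        elif ord(i) < 0x80:
            r.append(i)
    return ''.join(r)

def with_listcomp(s):
    # Same logic as a single list comprehension.
    return ''.join([xlate.get(ord(i), i) for i in s
                    if ord(i) in xlate or ord(i) < 0x80])

# All three must agree before the timings mean anything.
assert with_concat(text) == with_append(text) == with_listcomp(text)

for f in (with_concat, with_append, with_listcomp):
    print(f.__name__, timeit.timeit(lambda: f(text), number=20))
```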