Replace accented chars with unaccented ones

Josiah Carlson jcarlson at
Thu Mar 18 04:12:57 CET 2004

Noah wrote:

> Josiah Carlson <jcarlson at> wrote in message news:<c37ugc$llq$1 at>...
>>>            r += xlate[ord(i)]
>>>            r += i
>>Perhaps I'm going to have to create a signature and drop information 
>>about this in every post to, but repeated string additions are 
>>slow as hell for any reasonably long string.  It is much 
>>faster to place characters into a list and ''.join() them.
> True. Is this better?
>     ... body of latin1_to_ascii() ...
>     r = []
>     for i in unicrap:
>         if xlate.has_key(ord(i)):
>             r.append (xlate[ord(i)])
>         elif ord(i) >= 0x80:
>             pass
>         else:
>             r.append (i)
>     return ''.join(r)

I'd use:
''.join([xlate.get(ord(i), i) for i in unicrap
         if ord(i) in xlate or ord(i) < 0x80])
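
For anyone who wants to run the expression above as is, here is a minimal sketch in Python 3 syntax. The three-entry xlate table is made up purely for illustration (the real table from earlier in the thread is not reproduced here); the function name latin1_to_ascii comes from the quoted post.

```python
# Hypothetical, deliberately tiny translation table: code point -> ASCII text.
# The real table from the thread is larger and is not reproduced here.
xlate = {
    0xE9: "e",   # e with acute accent
    0xFC: "u",   # u with diaeresis
    0xDF: "ss",  # sharp s
}

def latin1_to_ascii(unicrap):
    """Replace mapped characters, keep plain ASCII, drop everything else."""
    return "".join([xlate.get(ord(i), i) for i in unicrap
                    if ord(i) in xlate or ord(i) < 0x80])

# The n-with-tilde (0xF1) is not in the sample table, so it is dropped.
print(latin1_to_ascii("caf\xe9 \xfcber stra\xdfe \xf1"))
```

The filter clause drops any non-ASCII character that has no entry in the table, which matches the "pass" branch in the quoted loop.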

In general, calling r.append() in a loop is faster than repeated string 
addition, but still significantly slower than a list comprehension.
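
How big the gap is depends on the interpreter version and the input, so it is worth measuring rather than taking on faith. Below is a rough timing sketch using the standard timeit module, in Python 3 syntax. The function names with_append and with_listcomp, the sample table, and the sample string are all assumptions for illustration; has_key() from the quoted code is written as "in" because has_key() no longer exists in Python 3.

```python
import timeit

# Hypothetical sample table, not the one from the thread.
xlate = {0xE9: "e", 0xFC: "u", 0xDF: "ss"}

def with_append(text):
    """Loop-and-append version, following the quoted code."""
    r = []
    for i in text:
        if ord(i) in xlate:
            r.append(xlate[ord(i)])
        elif ord(i) >= 0x80:
            pass  # unmapped non-ASCII character: drop it
        else:
            r.append(i)
    return "".join(r)

def with_listcomp(text):
    """List-comprehension version."""
    return "".join([xlate.get(ord(i), i) for i in text
                    if ord(i) in xlate or ord(i) < 0x80])

sample = "caf\xe9 \xfcber stra\xdfe \xf1 " * 1000

# Both versions should produce the same string before we compare speed.
assert with_append(sample) == with_listcomp(sample)

for func in (with_append, with_listcomp):
    seconds = timeit.timeit(lambda: func(sample), number=20)
    print(func.__name__, round(seconds, 4))
```

No timings are given here on purpose; the numbers vary from machine to machine.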

  - Josiah
