Replace accented chars with unaccented ones
Josiah Carlson
jcarlson at nospam.uci.edu
Wed Mar 17 22:12:57 EST 2004
Noah wrote:
> Josiah Carlson <jcarlson at nospam.uci.edu> wrote in message news:<c37ugc$llq$1 at news.service.uci.edu>...
>
>>> r += xlate[ord(i)]
>>> r += i
>>
>>Perhaps I'm going to have to create a signature and drop information
>>about this in every post to c.l.py, but repeated string additions are
>>slow as hell for any reasonably large lengthed string. It is much
>>faster to place characters into a list and ''.join() them.
>
>
> True. Is this better?
>
> ... body of latin1_to_ascii() ...
>     r = []
>     for i in unicrap:
>         if xlate.has_key(ord(i)):
>             r.append(xlate[ord(i)])
>         elif ord(i) >= 0x80:
>             pass
>         else:
>             r.append(i)
>     return ''.join(r)
I'd use:

    ''.join([xlate.get(ord(i), i) for i in unicrap
             if ord(i) in xlate or ord(i) < 0x80])

In general, calling r.append() in a loop is faster than repeated string
addition, but it is still noticeably slower than a list comprehension.
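
For anyone who wants to try this, here is a minimal, self-contained sketch of the one-liner above next to the append() loop, written for current Python 3. The xlate table is a tiny hypothetical stand-in for the full mapping from earlier in the thread, and the name latin1_to_ascii_append is made up for the comparison. Timings will vary by machine and interpreter version, so treat the printed numbers as a rough check rather than a benchmark.

```python
import timeit

# Hypothetical, abbreviated translation table: code point -> ASCII replacement.
# The real table discussed in the thread is much larger.
xlate = {
    0xC0: 'A', 0xC9: 'E', 0xE9: 'e', 0xF1: 'n', 0xFC: 'u', 0xDF: 'ss',
}

def latin1_to_ascii(unicrap):
    """Replace mapped characters, keep plain ASCII, drop everything else."""
    return ''.join([xlate.get(ord(i), i) for i in unicrap
                    if ord(i) in xlate or ord(i) < 0x80])

def latin1_to_ascii_append(unicrap):
    """Same behaviour with an explicit loop and r.append(), for comparison."""
    r = []
    for i in unicrap:
        if ord(i) in xlate:
            r.append(xlate[ord(i)])
        elif ord(i) < 0x80:
            r.append(i)
        # unmapped characters >= 0x80 are silently dropped
    return ''.join(r)

if __name__ == '__main__':
    sample = 'caf\xe9 ma\xf1ana \xa9 stra\xdfe ' * 1000
    # Both versions should agree before we bother timing them.
    assert latin1_to_ascii(sample) == latin1_to_ascii_append(sample)
    for fn in (latin1_to_ascii, latin1_to_ascii_append):
        print(fn.__name__, timeit.timeit(lambda: fn(sample), number=20))
```

Note that characters at or above 0x80 with no entry in xlate are dropped by both versions, which matches the behaviour of the quoted loop.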
- Josiah