Delete all not allowed characters..
bozonm at vscht.cz
Thu Oct 25 23:23:37 CEST 2007
>> the list comprehension does not allow "else", but it can be used in a
>> similar form:
(I was wrong, as Tim Chase has shown.)
>> s2 = ""
>> for ch in s1:
>>     s2 += ch if ch in allowed else " "
>> (maybe this could be written more nicely)
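For the record, a minimal sketch of the corrected form: the conditional expression is allowed in the expression part of a list comprehension; only the trailing filter "if" takes no "else". The sample values of s1 and allowed are made up for illustration, and conditional expressions need Python 2.5 or later.

```python
# Sample data, assumed for illustration only.
allowed = set("abcdef")
s1 = "a bad cafe, 123"

# "x if cond else y" is an ordinary expression, so it can be used
# as the element expression of a list comprehension.
s2 = "".join([ch if ch in allowed else " " for ch in s1])
print(repr(s2))
```

The length of the string is preserved, because every character that is not allowed is replaced by exactly one space.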
> Repeatedly adding strings together in this way is about the most
> inefficient, slow way of building up a long string. (Although I'm sure
> somebody can come up with a worse way if they try hard enough.)
> Even though recent versions of CPython have a local optimization that
> improves the performance hit of string concatenation somewhat, it is
> better to use ''.join() rather than add many strings together:
String appending is not tragically slower.
For strings tens of MB long, the difference
in speed I see is a few tens of percent,
so it is not several times slower.
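A rough way to check this on your own machine is sketched below. The function names and the sample input are made up for illustration, and the numbers will depend on the interpreter version and platform, so treat any result as indicative only.

```python
import timeit

# Sample data, assumed for illustration only.
allowed = set("abcdef")
s1 = "abcxyz " * 20000


def by_concat(s):
    """Build the result by repeated string concatenation."""
    out = ""
    for ch in s:
        out += ch if ch in allowed else " "
    return out


def by_join(s):
    """Build the result with a single ''.join() over a list."""
    return "".join([ch if ch in allowed else " " for ch in s])


# Both variants must agree before the timings mean anything.
assert by_concat(s1) == by_join(s1)

for fn in (by_concat, by_join):
    best = min(timeit.repeat(lambda: fn(s1), number=5, repeat=3))
    print(fn.__name__, best)
```

Taking the minimum of several repeats reduces the noise from other processes running at the same time.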
> s2 = []
> for ch in s1:
>     s2.append(ch if (ch in allowed) else " ")
> s2 = ''.join(s2)
> Although even that doesn't come close to the efficiency and speed of
> string.translate() and string.maketrans(). Try to find a way to use them.
> Here is one way, for ASCII characters.
> import string
> allowed = "abcdef"
> all = string.maketrans('', '')
> not_allowed = ''.join(c for c in all if c not in allowed)
> table = string.maketrans(not_allowed, ' '*len(not_allowed))
> new_string = string.translate(old_string, table)
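The quoted code relies on the Python 2 string module. As a sketch only, and assuming Python 3 with the input held as bytes, the same table-based idea could look like this (the names all_bytes and old_string and the sample data are mine, not from the quoted post):

```python
allowed = b"abcdef"

# All 256 possible byte values.
all_bytes = bytes(range(256))

# Every byte value that is not in the allowed set.
not_allowed = bytes(b for b in all_bytes if b not in allowed)

# Map each disallowed byte to a space; allowed bytes are left unchanged.
table = bytes.maketrans(not_allowed, b" " * len(not_allowed))

old_string = b"a bad cafe, 123"   # sample input, assumed for illustration
new_string = old_string.translate(table)
print(new_string)
```

The table is built once and can be reused for any number of inputs, which is where most of the speed advantage comes from.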
Nice, I did not know that string translation exists. However,
Abandoned has defined the allowed characters, so building
a translation table for the disallowed characters,
which would cover nearly the complete Unicode character table,
would be inefficient.
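One possible way around this, sketched here under the assumption of Python 3 (where str.translate() accepts any mapping indexed by code point), is a table that stores only the allowed characters and answers every other lookup with a space. The names AllowedTable and keep_allowed are hypothetical, chosen just for this example.

```python
class AllowedTable(dict):
    """Translation table holding only the allowed characters.

    Allowed code points map to themselves; any other code point is
    handled by __missing__ and mapped to the replacement string.
    """

    def __init__(self, allowed, replacement=" "):
        super().__init__((ord(c), c) for c in allowed)
        self.replacement = replacement

    def __missing__(self, codepoint):
        # Called by dict lookup for code points that are not stored.
        return self.replacement


def keep_allowed(text, allowed):
    """Replace every character of text that is not in allowed by a space."""
    return text.translate(AllowedTable(allowed))


print(repr(keep_allowed("héllo wörld", "helo wrd")))
```

The size of the table depends only on the number of allowed characters, not on the size of the Unicode character set.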