[Tutor] clean text
A.T.Hofkamp
a.t.hofkamp at tue.nl
Tue May 19 13:28:09 CEST 2009
spir wrote:
> def _cleanRepr(text):
>     ''' text with control chars replaced by repr() equivalent '''
>     chars = []
>     for char in text:
>         n = ord(char)
>         if (n < 32) or (n > 126 and n < 160):
>             char = repr(char)[1:-1]
>         chars.append(char)
>     return ''.join(chars)
>
> But what else can I do?
You seem to break the string down into single characters, replace a few of
them, and then build the whole string back up.
Maybe you can instead copy larger chunks of text that do not need modification
in one go, i.e. something like
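def _cleanRepr(text):
    chars = []
    start = 0   # start of the current run of unmodified text
    for idx, char in enumerate(text):
        n = ord(char)
        if n < 32 or 126 < n < 160:
            chars.append(text[start:idx])    # copy the unmodified chunk in one go
            chars.append(repr(char)[1:-1])   # escaped form of the control char
            start = idx + 1
    chars.append(text[start:])   # remainder after the last control char
    return ''.join(chars)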
An alternative to the above is to keep track of the first occurrence of each
of the chars you want to split on (after some 'start' position), and compute
the next point to break the string as the min of all those positions, instead
of slowly 'walking' to it by testing each character separately.
That would reduce the number of iterations you do in the loop, at the cost of
maintaining a large number of positions of the next breaking point.
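A rough sketch of that alternative, in case it helps. It assumes Python 3 (so
chr() and str.find() work on text strings), and the names clean_repr_find,
specials and next_pos are made up for illustration. It has not been timed
against the loop version, so treat it as an idea to try rather than a known
improvement.

```python
def clean_repr_find(text):
    """Replace control chars by their repr() escape, jumping between them with find()."""
    # Assumption: same set of chars as in the original test,
    # i.e. code points below 32 and from 127 through 159.
    specials = [chr(n) for n in list(range(32)) + list(range(127, 160))]

    # Next known position of each special char; chars that never occur are left out.
    next_pos = {}
    for ch in specials:
        pos = text.find(ch)
        if pos != -1:
            next_pos[ch] = pos

    chunks = []
    start = 0
    while next_pos:
        # The next break point is the smallest of all tracked positions.
        ch = min(next_pos, key=next_pos.get)
        idx = next_pos[ch]
        chunks.append(text[start:idx])    # unmodified chunk before the break
        chunks.append(repr(ch)[1:-1])     # escaped form of the control char
        start = idx + 1
        # Only the char just handled needs a new search; the others lie further on.
        pos = text.find(ch, start)
        if pos == -1:
            del next_pos[ch]
        else:
            next_pos[ch] = pos

    chunks.append(text[start:])           # remainder after the last break
    return ''.join(chunks)
```

The loop now runs once per control char found in the text instead of once per
character, but each pass takes a min over up to 65 tracked positions, which is
the bookkeeping cost mentioned above.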
Albert