[Tutor] finding special character string
Kent Johnson
kent37 at tds.net
Tue Jun 3 11:47:23 CEST 2008
On Tue, Jun 3, 2008 at 5:13 AM, Dinesh B Vadhia
<dineshbvadhia at hotmail.com> wrote:
> Yes, I'm happy because I found a non-regex way to solve the problem (see
> below).
> How did I solve it? I found a list of all the special words, created a set
> of special words and then checked if each word in the text belonged to the
> set of special words. If we assume that the list of special words doesn't
> exist then the problem is interesting in itself to solve.
Even with the list of special words I would still use a regex and
process the whole file at once. If the list is in the variable
'specials' and the file data in 'data', then build and apply a regex
like this:
import re
specialsRe = re.compile('|'.join(r'\.%s\.' % re.escape(s) for s in specials))
data = specialsRe.sub('', data)
The regex escapes any regex metacharacters in each word, wraps the word
in a literal "." on each side, and joins the alternatives with "|".
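Here is a small self-contained sketch of that approach. The contents of
'specials' and 'data' are made up for illustration and are not from the
original problem; only the way the pattern is built follows the
description above.

```python
import re

# Hypothetical sample inputs, for illustration only.
specials = ['ab', 'c+d', 'x-y']
data = 'one .ab. two .c+d. three .x-y. four .zz. five'

# Escape each word, wrap it in literal dots, and join with "|".
pattern = '|'.join(r'\.%s\.' % re.escape(s) for s in specials)
specialsRe = re.compile(pattern)

# Pattern.sub(repl, string): the replacement comes first, then the text.
cleaned = specialsRe.sub('', data)
print(cleaned)
```

Note that '.zz.' is left alone because 'zz' is not in the list, and the
"+" in 'c+d' is matched literally thanks to re.escape().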
But without the list of specials it is still easy with a regex. This
works on your explanatory text:
data = re.sub(r'\.[a-zA-Z-]{2,}\.', '', data)
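To show what that pattern does and does not match, here is a quick check
on a made-up string (the sample text is an assumption, not your data). It
removes runs of two or more letters or hyphens enclosed in dots, and
leaves anything shorter or containing other characters.

```python
import re

# Hypothetical sample text, for illustration only.
data = 'alpha .foo-bar. beta .a. gamma .x1. delta .OK. end'

# Two or more letters/hyphens between literal dots are removed.
result = re.sub(r'\.[a-zA-Z-]{2,}\.', '', data)
print(result)
```

Here '.foo-bar.' and '.OK.' are removed, while '.a.' (too short) and
'.x1.' (contains a digit) are kept.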
Kent