help make it faster please
Lonnie Princehouse
finite.automaton at gmail.com
Thu Nov 10 14:36:58 EST 2005
The word_finder regular expression defines what will be considered a
word.
"[a-z0-9_]" means "match a single character from the set {a through z,
0 through 9, underscore}".
The + means "match as many as you can, minimum of one", so the whole
pattern matches a run of one or more of those characters.
To match @ as well, add it to the set of characters to match:
word_finder = re.compile('[a-z0-9_@]+', re.I)
The re.I flag makes the expression case insensitive.
See the documentation for re for more information.
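To see the difference the extra character makes, here is a small sketch
that runs both patterns over the same string. The sample text and the
name word_finder_at are made up for illustration; only word_finder and
the two patterns come from this thread.

```python
import re

# Original pattern: letters, digits and underscore only.
word_finder = re.compile('[a-z0-9_]+', re.I)

# Extended pattern: @ is now part of the character set.
# (word_finder_at is a hypothetical name, used here only for comparison.)
word_finder_at = re.compile('[a-z0-9_@]+', re.I)

sample = "Mail Bob@example.com or call_me2"  # made-up test string

without_at = word_finder.findall(sample)
with_at = word_finder_at.findall(sample)

print(without_at)  # ['Mail', 'Bob', 'example', 'com', 'or', 'call_me2']
print(with_at)     # ['Mail', 'Bob@example', 'com', 'or', 'call_me2']
```

Note that "." is still not in the set, so an address is still split at
the dot. Add it to the set too if you want whole addresses kept together.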
Also, it looks like I forgot to lowercase matched words. The line
word = match.group(0)
should read:
word = match.group(0).lower()
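For context, this is roughly how the corrected line would sit inside a
counting loop. The surrounding code is not quoted in this message, so
count_words below is a hypothetical stand-in and not the original
function; only word_finder and the match.group(0).lower() line are from
this thread.

```python
import re

word_finder = re.compile('[a-z0-9_@]+', re.I)

def count_words(text):
    """Hypothetical helper: tally words without regard to case."""
    counts = {}
    for match in word_finder.finditer(text):
        # The corrected line: normalise case before counting, so that
        # "The", "the" and "THE" all land in the same bucket.
        word = match.group(0).lower()
        counts[word] = counts.get(word, 0) + 1
    return counts

result = count_words("The the THE cat")
print(result)  # {'the': 3, 'cat': 1}
```

Without the .lower() call, re.I would still match all three spellings,
but each would be counted as a separate word.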