need help of RE

Steven Bethard steven.bethard at gmail.com
Sun May 29 11:17:32 EDT 2005


John Machin wrote:
>  >>> import re
>  >>> text = "(word1 & (Word2|woRd3))".lower()
> # you seem to want downshifting ...
>  >>> re.split(r"\W+", text)
> ['', 'word1', 'word2', 'word3', '']
>  >>>
> 
> Hmmm ... near, but not exactly what you want. We need to throw away 
> those empty strings, which will appear if you have non-word characters 
> at the ends of your text.

You can also avoid the empty strings at both ends by using re.findall 
with \w+ (match the words) instead of re.split with \W+ (match the gaps):

py> import re
py> text = "(word1 & (Word2|woRd3))".lower()
py> re.findall(r"\w+", text)
['word1', 'word2', 'word3']
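
For comparison, here is a small sketch that puts the two approaches side 
by side as a script. The variable names (split_words, filtered_words, 
found_words) are just mine for illustration; filtering the re.split 
result is one way to drop the empty strings, not necessarily what John 
had in mind.

```python
import re

text = "(word1 & (Word2|woRd3))".lower()

# Splitting on runs of non-word characters leaves an empty string at each
# end, because the text starts and ends with non-word characters.
split_words = re.split(r"\W+", text)

# Dropping the empty strings afterwards recovers just the words.
filtered_words = [w for w in split_words if w]

# Matching runs of word characters directly never produces empty strings.
found_words = re.findall(r"\w+", text)

print(split_words)     # ['', 'word1', 'word2', 'word3', '']
print(filtered_words)  # ['word1', 'word2', 'word3']
print(found_words)     # ['word1', 'word2', 'word3']
```

Both routes give the same list here; findall just gets there in one step.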

STeVe


