Regular Expression for words (with umlauts, without numbers)

Jens Lechtenboerger lechten at helios.uni-muenster.de
Fri May 13 12:01:30 EDT 2011


Dear experts,

I'm looking for a regular expression that matches natural-language
words containing umlauts but no digits.  While \w with re.U does
match letters with umlauts, it also matches digits (and the
underscore), which I do not want.
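
To illustrate the problem, here is a minimal sketch.  The sample
string is made up purely for this example, and the code assumes
Python 3, where re.U is already the default for str patterns:

```python
import re

# Hypothetical sample text, chosen only to show the unwanted matches.
text = "Äpfel über 42 Bäume x2"

# \w with re.U accepts letters (including umlauts), but also digits and "_".
tokens = re.findall(r"\w+", text, re.U)
print(tokens)  # ['Äpfel', 'über', '42', 'Bäume', 'x2']
```

The tokens "42" and "x2" are the ones I would like to exclude.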

Is there a better way than an exhaustive enumeration such as
[-a-zàáâãäåæ...]?

I assume there is a better way, since \w evidently already knows
which characters are alphabetic...
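
One candidate I can imagine, offered only as a sketch and not as a
verified solution, is to subtract digits and the underscore from \w
with a negated character class.  The names WORD and sample are
made up for this example, and I am assuming Python 3 here:

```python
import re

# [^\W\d_] reads: "not a non-word character, not a digit, not an
# underscore", i.e. whatever \w matches minus \d and "_".
WORD = re.compile(r"[^\W\d_]+", re.U)

# Hypothetical sample text, chosen only for illustration.
sample = "Äpfel über 42 Bäume x2 foo_bar"

words = WORD.findall(sample)
print(words)  # ['Äpfel', 'über', 'Bäume', 'x', 'foo', 'bar']
```

Note that this splits "x2" and "foo_bar" instead of rejecting them,
and it does not allow the hyphen from my enumeration above; that
would need something like (?:[^\W\d_]|-)+.  I also suspect it could
still accept characters that Unicode treats as numeric but that \d
does not cover, so I would not rely on it without further testing.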

Thanks in advance
Jens


