Extracting "true" words
candide at free.invalid
Sat Apr 2 15:18:44 CEST 2011
On 02/04/2011 01:10, Chris Rebert wrote:
> "Word" presumably/intuitively; hence the non-standard "[:word:]"
> POSIX-like character class alias for \w in some environments.
> Are you intentionally excluding CJK ideographs (as not "letters"/alphabetic)?
Yes, CJK ideographs don't belong to the locale I'm working with ;)
> And what of hyphenated terms (e.g. "re-lock")?
I'm interested only in ASCII letters and ASCII letters with diacritics.
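
For what it's worth, here is a minimal sketch of that criterion in Python 3. It assumes that "ASCII letter with diacritics" means a character whose canonical decomposition (NFD) is an ASCII letter followed only by combining marks. The names is_ascii_based_letter and extract_words are made up for illustration, not taken from any library.

```python
import string
import unicodedata
from itertools import groupby


def is_ascii_based_letter(ch):
    """True if ch is an ASCII letter, optionally carrying diacritics.

    Assumption: a character qualifies when its NFD decomposition is one
    ASCII letter followed only by combining marks.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]
    return (base in string.ascii_letters
            and all(unicodedata.combining(m) for m in marks))


def extract_words(text):
    """Return the maximal runs of qualifying characters found in text."""
    # Compose first, so that "e" + COMBINING ACUTE ACCENT is seen as one "é".
    text = unicodedata.normalize("NFC", text)
    return ["".join(run)
            for is_word, run in groupby(text, key=is_ascii_based_letter)
            if is_word]


print(extract_words("Ça coûte 3 euros à 北京, re-lock"))
```

With this definition, digits, the underscore, CJK ideographs and the hyphen all act as separators, so "re-lock" comes out as two words. Ligatures such as "œ" and "æ" have no canonical decomposition and are therefore rejected; they would have to be allowed explicitly if needed.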
Thanks for your response.