unicode categories -- regex
koara at atlas.cz
Sat Sep 22 17:57:30 CEST 2007
Hello all -- my question concerns the special metacharacters in the re
module. The re module documentation says that '\w' can match any
alphanumeric Unicode character (when the re.UNICODE flag is set). However,
there is no information on constructing patterns for other Unicode
categories, such as purely alphabetic characters or punctuation.
I found that this category information actually IS available in Python
-- in the standard module unicodedata. For example,
unicodedata.category(u'.') returns 'Po', meaning 'Punctuation, other'.
So how do I include this information in a regular expression search? Any
ideas? I'm talking about Python 2.5 here.
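
One possible approach, sketched here rather than offered as an established
recipe: since re has no escape for Unicode categories, build the character
class by hand from unicodedata.category(). The helper name category_class
and its max_codepoint parameter are hypothetical, and the scan is assumed
to stop at the Basic Multilingual Plane. The sketch uses Python 3 syntax;
on Python 2.5 one would presumably use unichr() and u'' literals instead.

```python
import re
import unicodedata


def category_class(prefix, max_codepoint=0xFFFF):
    """Return a regex character class for code points whose Unicode
    general category starts with `prefix` (e.g. "P", "L", "Lu").

    Hypothetical helper; only scans up to max_codepoint (assumed BMP).
    """
    ranges = []  # [first, last] runs of consecutive matching code points
    for cp in range(max_codepoint + 1):
        if unicodedata.category(chr(cp)).startswith(prefix):
            if ranges and ranges[-1][1] == cp - 1:
                ranges[-1][1] = cp       # extend the current run
            else:
                ranges.append([cp, cp])  # start a new run
    parts = []
    for first, last in ranges:
        if first == last:
            parts.append(re.escape(chr(first)))
        else:
            parts.append(re.escape(chr(first)) + "-" + re.escape(chr(last)))
    return "[" + "".join(parts) + "]"


# Any punctuation category (Po, Ps, Pe, Pd, ...) and any letter category.
punctuation = re.compile(category_class("P"))
letters = re.compile(category_class("L") + "+")

print(punctuation.findall("Hello, world! ¿Qué?"))
print(letters.findall("abc123 déf"))
```

Collapsing consecutive code points into ranges keeps the generated pattern
short; the class only needs to be built once and can then be reused in
larger patterns.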