regexps with unicode-aware character classes?
Fredrik Lundh
fredrik at pythonware.com
Tue Aug 30 10:33:07 EDT 2005
Stefan Rank wrote:
> I know that there is a re.U switch that makes \w match all unicode word
> characters, but there are no subclasses of that ([[:upper:]] or preferably \u).
Unicode character classes are not supported by the current RE engine.
it's usually possible to work around this by matching all word characters ("\w") in
Unicode mode ("(?u)"), and postprocessing the result to get rid of invalid matches.
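
a minimal sketch of that workaround, written for Python 3 (where str patterns are already Unicode-aware, so "(?u)" is redundant but harmless). the helper name uppercase_words and the choice of the Unicode general category "Lu" as the meaning of "upper" are assumptions made for illustration, not part of the re module:

```python
import re
import unicodedata

# (?u) is the inline spelling of re.UNICODE; \w+ then matches runs of
# Unicode word characters (letters, digits, underscore).
WORD_RE = re.compile(r"(?u)\w+")


def uppercase_words(text):
    # hypothetical helper: emulate an "uppercase letters only" class
    # step 1: over-match with \w+ in Unicode mode
    candidates = WORD_RE.findall(text)
    # step 2: postprocess, keeping only matches whose characters are all
    # in general category "Lu" (uppercase letter); digits, underscores,
    # and lowercase letters make a match invalid
    return [
        word
        for word in candidates
        if all(unicodedata.category(ch) == "Lu" for ch in word)
    ]
```

calling uppercase_words on a string returns only the all-uppercase words and discards everything else that \w+ picked up. note that this filters whole matches; if you need uppercase runs inside mixed-case words, you would have to split the candidates yourself.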
</F>