Unicode regular expressions -- buggy?
Fredrik Lundh
fredrik at pythonware.com
Thu Aug 11 04:08:24 EDT 2005
Christopher Subich wrote:
> I don't think the python regular expression module correctly handles
> combining marks; it gives inconsistent results between equivalent forms
> of some regular expressions:
> Is this a limitation-by-design, or a bug?
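limitation by design. the re module compares code points one at a time
and does not treat canonically equivalent sequences as equal. if you want
correct results, make sure to use early normalization everywhere, on both
the patterns and the text they are matched against.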
cf. http://www.w3.org/TR/charmod-norm/
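
a minimal sketch of what early normalization could look like, assuming
Python 3 and the standard unicodedata module. the nfc() helper, the sample
strings and the pattern are made up for illustration and are not from the
original report:

```python
import re
import unicodedata


def nfc(text):
    """Return text in Normalization Form C (hypothetical helper name)."""
    return unicodedata.normalize("NFC", text)


# Two canonically equivalent spellings of the same word.
composed = "caf\u00e9"      # 'e with acute' as one precomposed code point
decomposed = "cafe\u0301"   # plain 'e' followed by a combining acute accent

# '.' matches exactly one code point, so the two forms behave differently.
pattern = re.compile(nfc(r"caf.$"))

# Without normalization the results disagree.
raw_composed = bool(pattern.match(composed))
raw_decomposed = bool(pattern.match(decomposed))

# Normalizing the input first makes both forms behave the same way.
norm_composed = bool(pattern.match(nfc(composed)))
norm_decomposed = bool(pattern.match(nfc(decomposed)))

print(raw_composed, raw_decomposed)    # results differ between the two forms
print(norm_composed, norm_decomposed)  # results agree after normalization
```

normalizing once at the boundary (when text is read or received) tends to
be simpler than normalizing at every call site. NFC is used here only as
an example; what matters is that patterns and text share the same form.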
</F>