[Python-Dev] Python and the Unicode Character Database

Vlastimil Brom vlastimil.brom at gmail.com
Tue Dec 7 14:02:47 CET 2010

2010/12/7 Alexander Belopolsky <alexander.belopolsky at gmail.com>:
> On Sat, Dec 4, 2010 at 5:58 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> I actually wonder if Python's re module can claim to provide even
>>> Basic Unicode Support.
>> Do you really wonder? Most definitely it does not.
> Were you more optimistic four years ago?
> http://bugs.python.org/issue1528154#msg54864
> I was hoping that regex syntax would be useful in
> explaining/documenting Python text processing routines (including
> string to number conversions) without a heavy dose of Unicode
> terminology.

The new regex version
supports many more features, including Unicode properties and the
POSIX classes mentioned, but definitely not all of the
requirements of that rather "generous" list.
It seems there are some omissions elsewhere too, e.g. in Perl.
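
As a rough illustration of what "Unicode properties" means here: a
property class such as \p{L} selects characters by their general
category in the Unicode Character Database. The sketch below only
approximates that with the standard unicodedata module on Python 3;
find_by_category is a hypothetical helper name (not part of re or of
the new regex module), and the sample string is made up.

```python
import re
import unicodedata


def find_by_category(text, prefix):
    """Return the characters of text whose Unicode general category
    starts with prefix (e.g. "L" for letters, "Nd" for decimal digits).

    Hypothetical helper for illustration only; it scans character by
    character and is not a regex feature.
    """
    return [ch for ch in text if unicodedata.category(ch).startswith(prefix)]


# Made-up sample: "Löwis", an Arabic-Indic digit three, "x", superscript two.
sample = "L\u00f6wis \u0663 x\u00b2"

letters = find_by_category(sample, "L")    # roughly what \p{L} would select
digits = find_by_category(sample, "Nd")    # roughly what \p{Nd} would select

# For str patterns on Python 3, the stdlib re module's \d matches
# Unicode decimal digits, so it agrees with the "Nd" scan here.
re_digits = re.findall(r"\d", sample)
```

The character-by-character scan only stands in for a property class;
it cannot be combined with other pattern syntax the way a real
\p{...} escape can.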

Do you know of any re engine fully complying with TR18, even at the
first level, "Basic Unicode Support"?

