2010/12/7 Alexander Belopolsky firstname.lastname@example.org:
On Sat, Dec 4, 2010 at 5:58 PM, "Martin v. Löwis" email@example.com wrote:
I actually wonder if Python's re module can claim to provide even Basic Unicode Support.
Do you really wonder? Most definitely it does not.
Were you more optimistic four years ago?
I was hoping that regex syntax would be useful in explaining/documenting Python text processing routines (including string to number conversions) without a heavy dose of Unicode terminology.
The new regex version (http://bugs.python.org/issue2636) supports many more features, including Unicode properties and the POSIX classes mentioned, but definitely not all of the requirements of that rather "generous" list: http://www.unicode.org/reports/tr18/ It seems that in Perl, for example, there are some omissions too: http://perldoc.perl.org/perlunicode.html#Unicode-Regular-Expression-Support-...
Do you know of any re engine that fully complies with TR18, even at the first level, "Basic Unicode Support"?
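
To make the gap concrete, here is a small sketch of what the standard re module does and does not do on a current Python 3. The helper names (supports_property_escapes, is_decimal_run) are mine, purely for illustration, and the behaviour shown is what I would expect from recent CPython releases rather than anything guaranteed by the documentation of older versions.

```python
import re

def supports_property_escapes(module=re):
    """Return True if the module compiles a \\p{...} property escape."""
    try:
        module.compile(r"\p{Lu}")
    except module.error:
        # The stdlib re module rejects \p as a bad escape.
        return False
    return True

# Arabic-Indic digits one, two, three (Unicode category Nd).
ARABIC_INDIC_123 = "\u0661\u0662\u0663"

def is_decimal_run(text):
    """Return True if text consists only of Unicode decimal digits."""
    # For str patterns, \d matches any character in category Nd,
    # not just ASCII 0-9.
    return re.fullmatch(r"\d+", text) is not None

if __name__ == "__main__":
    print(supports_property_escapes())
    print(is_decimal_run(ARABIC_INDIC_123), int(ARABIC_INDIC_123))
```

So \d is Unicode-aware and lines up with what int() accepts, but general property escapes, which TR18 asks for at level 1, are not available in re. Passing the third-party regex module as the module argument should, I believe, give a different answer for the first check.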