Re: [Python-Dev] Unicode 5.1.0

Aug. 25, 2008


      Guido van Rossum wrote:
...
2008/8/25 M.-A. Lemburg <mal@egenix.com>:
...
I would really like to see more Unicode support in Python, e.g.
for collation, compression, indexing based on graphemes and
code points, better support for special casing situations (to
cover e.g. the dotted vs. non-dotted i in the Turkish scripts),
etc.
There are also a few changes that we'd need to incorporate into
the UTF codecs, e.g. warn about more ill-formed byte sequences.
Would Google be willing to contribute such support or part
of it?
That depends purely on how much need Google itself has for these 
features. I'll ask around, but for now I wouldn't bet on anything beyond 
the three points I raised at the start of this thread:
1. Upgrade the unicodedata module to the Unicode 5.1.0 standard
2. Extend the unicodedata module with some additional properties
3. Add support for Unicode properties to the regex syntax, including
Boolean combinations
I think an "Improve Unicode Support" PEP would be a good idea to collect
(and get approval or not for) various ideas from various people, even if
Google only implements part of the PEP.
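
To make the three points above a little more concrete, here is a minimal
sketch of what property-based matching could look like using only the
existing unicodedata module. The helper names (has_category, find_runs,
is_non_upper_letter) are made up for illustration and are not a proposed
API; the standard re module has no property syntax, so the matching is
emulated by hand, one character at a time.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this interpreter.
print(unicodedata.unidata_version)


def has_category(ch, prefix):
    """True if the general category of ch starts with prefix (e.g. 'L', 'Nd')."""
    return unicodedata.category(ch).startswith(prefix)


def find_runs(text, prefix):
    """Hypothetical helper: yield maximal runs of characters in a category."""
    run = []
    for ch in text:
        if has_category(ch, prefix):
            run.append(ch)
        elif run:
            yield "".join(run)
            run = []
    if run:
        yield "".join(run)


def is_non_upper_letter(ch):
    """Boolean combination of properties: a letter (L*) that is not Lu."""
    return has_category(ch, "L") and not has_category(ch, "Lu")


print(list(find_runs("abc 123 déf", "L")))
print([ch for ch in "aB1" if is_non_upper_letter(ch)])
```

A regex-level syntax for properties would replace the explicit loop with a
pattern, and the Boolean combination with something like set operations on
character classes; the exact spelling is the kind of thing a PEP would have
to settle.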

Terry Reedy