[Python-Dev] Unicode 5.1.0

Terry Reedy tjreedy at udel.edu
Mon Aug 25 19:13:33 CEST 2008



Guido van Rossum wrote:
> 2008/8/25 M.-A. Lemburg <mal at egenix.com>:
>  > I would really like to see more Unicode support in Python, e.g.
>  > for collation, compression, indexing based on graphemes and
>  > code points, better support for special casing situations (to
>  > cover e.g. the dotted vs. non-dotted i in the Turkish scripts),
>  > etc.
>  >
>  > There are also a few changes that we'd need to incorporate into
>  > the UTF codecs, e.g. warn about more ill-formed byte sequences.
>  >
>  > Would Google be willing to contribute such support or part
>  > of it?
> 
> That depends purely on how much need Google itself has for these 
> features. I'll ask around, but for now I wouldn't bet on anything beyond 
> the three points I raised at the start of this thread:
> 
> 1. Upgrade the unicodedata module to the Unicode 5.1.0 standard
> 2. Extend the unicodedata module with some additional properties
> 3. Add support for Unicode properties to the regex syntax, including
> Boolean combinations

I think an "Improve Unicode Support" PEP would be a good way to collect 
various ideas from various people (and get them approved or rejected), 
even if Google only implements part of the PEP.
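
To make points 2 and 3 a bit more concrete, here is a rough sketch of 
what property-based selection with Boolean combinations could look like 
using only what the unicodedata module already exposes. It is written 
for current Python 3, the helper names (describe, is_letter, is_upper, 
select) are hypothetical, and it is not a proposal for the actual regex 
syntax:

```python
import unicodedata

def describe(ch):
    """Return a few of the properties unicodedata exposes for one character."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "combining": unicodedata.combining(ch),
    }

def is_letter(ch):
    # General category starting with "L" (Lu, Ll, Lt, Lm, Lo).
    return unicodedata.category(ch).startswith("L")

def is_upper(ch):
    # General category "Lu" (uppercase letter).
    return unicodedata.category(ch) == "Lu"

def select(text, predicate):
    """Keep only the characters for which predicate(ch) is true."""
    return "".join(ch for ch in text if predicate(ch))

# Which version of the Unicode database this interpreter was built with.
print(unicodedata.unidata_version)

# Boolean combination of properties: "letter AND NOT uppercase".
print(select("Ab1 \u00c7d", lambda ch: is_letter(ch) and not is_upper(ch)))

# The Turkish capital dotted I that the special-casing discussion refers to.
print(describe("\u0130"))
```

A regex-level version of point 3 would express the same predicate 
inside a pattern instead of a Python lambda; the filtering above only 
shows which data such a feature would need from the module.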


