[Python-Dev] Unicode 5.1.0
Terry Reedy
tjreedy at udel.edu
Mon Aug 25 19:13:33 CEST 2008
Guido van Rossum wrote:
> 2008/8/25 M.-A. Lemburg <mal at egenix.com>:
> > I would really like to see more Unicode support in Python, e.g.
> > for collation, compression, indexing based on graphemes and
> > code points, better support for special casing situations (to
> > cover e.g. the dotted vs. non-dotted i in the Turkish scripts),
> > etc.
> >
> > There are also a few changes that we'd need to incorporate into
> > the UTF codecs, e.g. warn about more ill-formed byte sequences.
> >
> > Would Google be willing to contribute such support or part
> > of it?
>
> That depends purely on how much need Google itself has for these
> features. I'll ask around, but for now I wouldn't bet on anything beyond
> the three points I raised at the start of this thread:
>
> 1. Upgrade the unicodedata module to the Unicode 5.1.0 standard
> 2. Extend the unicodedata module with some additional properties
> 3. Add support for Unicode properties to the regex syntax, including
> Boolean combinations
I think an "Improve Unicode Support" PEP would be a good way to collect
ideas from various people and get each one approved or rejected, even if
Google only implements part of the PEP.
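
For anyone who wants to see concretely what the unicodedata module covers
today, here is a minimal sketch. It assumes a current Python 3 interpreter
(not the 2.x line this thread was written against), and describe() is a
hypothetical helper made up for this illustration, not part of any
proposal. It prints the bundled Unicode database version and a few
properties for the dotted and dotless i characters mentioned above.

```python
import unicodedata


def describe(ch):
    """Collect a few per-character facts (hypothetical helper, illustration only)."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        # str.upper()/str.lower() apply the default, locale-independent mappings.
        "upper": ch.upper(),
        "lower": ch.lower(),
    }


# Version of the Unicode Character Database bundled with this interpreter.
print("UCD version:", unicodedata.unidata_version)

# ASCII i/I plus U+0131 (dotless small i) and U+0130 (capital I with dot above).
for ch in ("i", "I", "\u0131", "\u0130"):
    print(ascii(ch), describe(ch))
```

With the default mappings both "i" and the dotless U+0131 uppercase to
plain "I", so the Turkish distinction is lost. That is the kind of
special-casing gap a PEP could list explicitly.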