[Python-Dev] Unicode 5.1.0

Guido van Rossum guido at python.org
Mon Aug 25 18:04:20 CEST 2008


2008/8/25 M.-A. Lemburg <mal at egenix.com>:
> I would really like to see more Unicode support in Python, e.g.
> for collation, compression, indexing based on graphemes and
> code points, better support for special casing situations (to
> cover e.g. the dotted vs. non-dotted i in the Turkish scripts),
> etc.
>
> There are also a few changes that we'd need to incorporate into
> the UTF codecs, e.g. warn about more ill-formed byte sequences.
>
> Would Google be willing to contribute such support or part
> of it ?

That depends purely on how much need Google itself has for these features.
I'll ask around, but for now I wouldn't bet on anything beyond the three
points I raised at the start of this thread:

1. Upgrade the unicodedata module to the Unicode 5.1.0 standard
2. Extend the unicodedata module with some additional properties
3. Add support for Unicode properties to the regex syntax, including
Boolean combinations
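
For context on point 3, here is a rough sketch of how a Boolean
combination of properties can be emulated today using only the general
category that unicodedata already exposes. The helper names (describe,
filter_by_category) are made up for illustration and are not part of
any proposal; the regex syntax itself is not shown because it does not
exist yet.

```python
import unicodedata

def describe(ch):
    """Collect a few properties that unicodedata already provides."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "bidirectional": unicodedata.bidirectional(ch),
        "combining": unicodedata.combining(ch),
    }

def filter_by_category(text, include, exclude=()):
    """Keep characters whose general category starts with one of the
    `include` prefixes AND with none of the `exclude` prefixes.

    Hypothetical helper: a hand-rolled stand-in for a Boolean
    combination such as "any letter that is not uppercase".
    """
    include = tuple(include)
    exclude = tuple(exclude)
    result = []
    for ch in text:
        cat = unicodedata.category(ch)
        if cat.startswith(include) and not (exclude and cat.startswith(exclude)):
            result.append(ch)
    return result

if __name__ == "__main__":
    # Version of the Unicode database this interpreter was built with.
    print(unicodedata.unidata_version)
    print(describe("A"))
    # Letters (L*) that are not uppercase (Lu).
    print(filter_by_category("aB1 \u00e9", include=("L",), exclude=("Lu",)))
```

Doing this per character in Python is slow and only covers the general
category, which is part of why the properties would be more useful
inside the regex syntax itself.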

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)