2008/8/25 M.-A. Lemburg <<a href="mailto:mal@egenix.com">mal@egenix.com</a>>:<br>> I would really like to see more Unicode support in Python, e.g.<br>> for collation, compression, indexing based on graphemes and<br>
> code points, better support for special casing situations (to<br>> cover e.g. the dotted vs. non-dotted i in the Turkish scripts),<br>> etc.<br>><br>> There are also a few changes that we'd need to incorporate into<br>
> the UTF codecs, e.g. warn about more ill-formed byte sequences.<br>><br>> Would Google be willing to contribute such support or part<br>> of it ?<br><br>That depends purely on how much need Google itself has for these features. I'll ask around, but for now I wouldn't bet on anything beyond the three points I raised at the start of this thread:<br>
<br>1. Upgrade the unicodedata module to the Unicode 5.1.0 standard<br>2. Extend the unicodedata module with some additional properties<br>3. Add support for Unicode properties to the regex syntax, including<br> Boolean combinations<br>
<br>-- <br>--Guido van Rossum (home page: <a href="http://www.python.org/~guido/">http://www.python.org/~guido/</a>)<br><br>