2008/8/25 M.-A. Lemburg &lt;<a href="mailto:mal@egenix.com">mal@egenix.com</a>&gt;:<br>&gt; I would really like to see more Unicode support in Python, e.g.<br>&gt; for collation, compression, indexing based on graphemes and<br>

&gt; code points, better support for special casing situations (to<br>&gt; cover e.g. the dotted vs. non-dotted i in the Turkish scripts),<br>&gt; etc.<br>&gt;<br>&gt; There are also a few changes that we&#39;d need to incorporate into<br>

&gt; the UTF codecs, e.g. warn about more ill-formed byte sequences.<br>&gt;<br>&gt; Would Google be willing to contribute such support or part<br>&gt; of it ?<br><br>That depends purely on how much need Google itself has for these features. I&#39;ll ask around, but for now I wouldn&#39;t bet on anything beyond the three points I raised at the start of this thread:<br>

<br>1. Upgrade the unicodedata module to the Unicode 5.1.0 standard<br>2. Extend the unicodedata module with some additional properties<br>3. Add support for Unicode properties to the regex syntax, including<br>   Boolean combinations<br>

<br>-- <br>--Guido van Rossum (home page: <a href="http://www.python.org/~guido/">http://www.python.org/~guido/</a>)<br><br>