[I18n-sig] International Components for Unicode

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Sat, 23 Jun 2001 09:47:34 +0200


> Is this of any value to us?
> 
> http://oss.software.ibm.com/icu/index.html

I'm not sure. It always seemed to me that ICU is an all-or-nothing
solution. I.e. if you want to access its functionality, you have to
use their Unicode type, their locale objects, their message catalogs
and so on.

Python 2.1 already offers quite a lot of this functionality; merging
that with ICU would be a real challenge. You'd probably need to offer
a choice: either ICU locales or C locales; either ICU message catalogs
or gettext. For the Unicode types, you'd have to copy strings back
and forth between ICU Unicode objects and Python Unicode objects.
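
To make the copying cost concrete, here is a rough sketch of such a
round trip. It assumes the foreign side wants a buffer of native-endian
16-bit code units, which is how ICU lays out its UChar strings. It is
not ICU wrapper code: the helper names to_code_units and
from_code_units are made up for illustration, and it is written for a
current Python rather than 2.1.

```python
import array
import sys

# Assumption: the foreign library stores text as native-endian 16-bit
# code units (UTF-16), and array type "H" is 2 bytes wide here.
_CODEC = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"


def to_code_units(text):
    """Copy a Python string into a new array of 16-bit code units."""
    units = array.array("H")
    units.frombytes(text.encode(_CODEC))
    return units


def from_code_units(units):
    """Copy 16-bit code units back into a new Python string."""
    return units.tobytes().decode(_CODEC)


# The last character lies outside the BMP, so it needs a surrogate pair.
original = u"Gr\u00fc\u00dfe \U0001d11e"
units = to_code_units(original)
restored = from_code_units(units)
```

Every call that crosses the boundary pays for one such copy in each
direction, since neither side can simply borrow the other's buffer.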

Also, offering these services to Python users is challenging. It can't
really become a standard library: The ICU distribution is 6.5MB of C++
source code, so I doubt it would ever be included in core
Python. Somebody could volunteer and offer wrapper code, and put that
on SF. To use that API, an application author would need to get ICU,
and the wrapper (preferably in versions that match). Later, all users
of the application also need to install ICU, and the wrapper. These
days, Linux distributions offer precompiled ICU installations, but
that might add to the problems rather than reducing them: The wrapper
will need to deal with multiple ICU versions.

Finally, ICU solves none of the most urgent Python-and-I18N problems:
None of the standard libraries will become more Unicode-aware than
they are now; it still is not possible to use non-ASCII text in source
code in a convenient way; printing Unicode strings to sys.stdout will
continue to produce exceptions.
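
For anyone who has not run into the sys.stdout problem, the following
sketch reproduces it without touching the real stdout. It assumes an
output stream that only accepts ASCII and simulates it with an
in-memory byte stream; write_text is a hypothetical helper, not part of
any library. The second call shows one possible workaround, namely
choosing an error handler explicitly.

```python
import codecs
import io


def write_text(stream, text, encoding="ascii", errors="strict"):
    """Encode text and write it to a byte stream (hypothetical helper)."""
    writer = codecs.getwriter(encoding)(stream, errors)
    writer.write(text)


greeting = u"Gr\u00fc\u00dfe"

# Strict encoding: the non-ASCII characters cannot be encoded, so the
# write raises and nothing reaches the stream.
strict_out = io.BytesIO()
try:
    write_text(strict_out, greeting)
    strict_failed = False
except UnicodeError:
    strict_failed = True

# Lenient encoding: unencodable characters are replaced, so the write
# succeeds but loses information.
lenient_out = io.BytesIO()
write_text(lenient_out, greeting, errors="replace")
```

Neither outcome is convenient, and nothing in ICU changes which of the
two an unmodified print gives you.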

So my guess is that nothing will happen with ICU integration, and that
the question will come up every few months.

Regards,
Martin