[Python-Dev] Python and the Unicode Character Database

Antoine Pitrou solipsis at pitrou.net
Tue Nov 30 19:29:52 CET 2010


> Sure, if we code it in Python, supporting it will be much easier:
> 
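> import re
> import unicodedata
> 
> def normalize_digits(s):
>     # Map each distinct Unicode digit found in s to its ASCII form.
>     digits = {m.group(1) for m in re.finditer(r'(\d)', s)}
>     trtab = {ord(d): str(unicodedata.digit(d)) for d in digits}
>     return s.translate(trtab)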
> 
> >>> normalize_digits('١٢٣٤.٥٦')
> '1234.56'
> 
> I am not sure this belongs in the locale module, however.  It seems to
> me that something like 'unicodealgo', for Unicode algorithms, would be
> more appropriate.

It could simply be in unicodedata if you split the implementation into a
core C part and some Python bits.
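
As a rough sketch of what one of those Python bits might look like: the
per-character lookup stays in the existing C function
unicodedata.decimal(), and only the string handling is written in
Python.  The name normalize_digits is borrowed from the quoted message
and is hypothetical; it is not an existing unicodedata API.

```python
import unicodedata


def normalize_digits(s):
    """Return s with every Unicode decimal digit replaced by its ASCII digit.

    Hypothetical helper: the lookup is delegated to unicodedata.decimal(),
    which is implemented in C; everything else is plain Python.
    """
    out = []
    for ch in s:
        # decimal() returns the supplied default for characters that have
        # no decimal value, so those characters pass through unchanged.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return ''.join(out)


if __name__ == '__main__':
    # ARABIC-INDIC digits one to six, written with escapes for clarity.
    print(normalize_digits('\u0661\u0662\u0663\u0664.\u0665\u0666'))
```

Like the \d-based version quoted above, this only touches characters
that have a decimal value, so something like SUPERSCRIPT TWO is left
alone.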

Regards

Antoine.



