[Python-Dev] Python and the Unicode Character Database

Alexander Belopolsky alexander.belopolsky at gmail.com
Tue Nov 30 19:21:30 CET 2010


On Tue, Nov 30, 2010 at 12:40 PM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
..
>> If you think non-ASCII digits are not difficult to support, please
>> contribute to the following tracker issues:
>>
>
> Would moving this functionality to the locale module make the issues any
> easier to fix?
>

Sure, if we code it in Python, supporting it will be much easier:

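import re
import unicodedata

def normalize_digits(s):
    # Map every Unicode decimal digit found in s to its ASCII equivalent.
    digits = {m.group(1) for m in re.finditer(r'(\d)', s)}
    trtab = {ord(d): str(unicodedata.digit(d)) for d in digits}
    return s.translate(trtab)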

>>> normalize_digits('١٢٣٤.٥٦')
'1234.56'
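
The same normalization can also be sketched without the regex, by asking
unicodedata for each character's decimal value directly. This is only an
illustration: the name normalize_decimal_digits is hypothetical and does
not come from this thread, and it relies on unicodedata.decimal() returning
the supplied default for characters that are not decimal digits.

```python
import unicodedata

def normalize_decimal_digits(s):
    # Hypothetical helper, not part of the original message.
    out = []
    for ch in s:
        # decimal() returns the default (None) for non-decimal characters.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return ''.join(out)

print(normalize_decimal_digits('١٢٣٤.٥٦'))
```

Characters without a decimal value, such as the '.' separator, pass
through unchanged.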

I am not sure this belongs in the locale module, however.  It seems to
me that something like 'unicodealgo', for Unicode algorithms, would be
more appropriate.

