[Python-ideas] [Python-Dev] Unicode minus sign in numeric conversions

David Mertz mertz at gnosis.cx
Sun Jun 9 01:21:22 CEST 2013


On Jun 8, 2013, at 3:52 PM, Guido van Rossum wrote:
> Apologies, Python 3 does actually have limited support for the other
> Unicode digits (actually only the ones marked "Decimal" IIUC). I'd
> totally forgotten about that (since I still live primarily in an ASCII
> world :-). E.g.

This is cool, and I hadn't known about it.  I had just written a toy implementation of my own _float() to show a possible behavior.  Then looking at Guido's post, I find that:

>>> import unicodedata
>>> x = (
...   unicodedata.lookup('ARABIC-INDIC DIGIT ONE')+
...   unicodedata.lookup('ARABIC-INDIC DIGIT TWO')+
...   unicodedata.lookup('ARABIC-INDIC DIGIT THREE')+
...   "."+
...   unicodedata.lookup('ARABIC-INDIC DIGIT FOUR')+
...   unicodedata.lookup('ARABIC-INDIC DIGIT FIVE'))
>>> x
'١٢٣.٤٥'
>>> float(x)
123.45

... my idea was to add an optional named argument like 'lang="Arabic"', but really it isn't needed since the digits MEAN the same thing in various scripts.  However, this DOES seem like arguably strange behavior:

>>> x = ('123.'+
...   unicodedata.lookup('ARABIC-INDIC DIGIT FOUR')+
...   unicodedata.lookup('ARABIC-INDIC DIGIT FIVE'))
>>> x
'123.٤٥'
>>> float(x)
123.45

Not wrong, but possibly surprising.
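
If the mixed-script case ever did need to be rejected, a check along these lines might do it. This is only a sketch: the names digit_scripts() and strict_float() are made up for illustration and are not anything in the stdlib, and it assumes each script's decimal digits occupy ten consecutive code points, so that subtracting a digit's value from its code point identifies that script's zero.

```python
import unicodedata

def digit_scripts(text):
    """Return the code points of the 'zero' digit for each digit script in text."""
    zeros = set()
    for ch in text:
        # unicodedata.decimal() returns the default for non-decimal characters.
        value = unicodedata.decimal(ch, None)
        if value is not None:
            # Assumption: a script's digits 0-9 are contiguous, so this
            # lands on the code point of that script's zero.
            zeros.add(ord(ch) - value)
    return zeros

def strict_float(text):
    """Hypothetical float() wrapper that refuses mixed digit scripts."""
    if len(digit_scripts(text)) > 1:
        raise ValueError("mixed digit scripts in %r" % (text,))
    return float(text)
```

With this, strict_float() still accepts the all-Arabic-Indic string and plain '123.45', but raises ValueError for the '123.' plus Arabic-Indic digits case above.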

--
If I seem shortsighted to you, it is only because I have stood on the backs of midgets.


