Re: [Python-Dev] Python and the Unicode Character Database

Nov. 28, 2010


2010/11/28 M.-A. Lemburg <mal@egenix.com>:
...
"Martin v. Löwis" wrote:
...
...
...
...
...
>>> float('١٢٣٤.٥٦')
1234.56
I think it's a bug that this works. The definition of the float builtin says
Convert a string or a number to floating point. If the argument is a
string, it must contain a possibly signed decimal or floating point
number, possibly embedded in whitespace. The argument may also be
'[+|-]nan' or '[+|-]inf'.
Now, one may wonder what precisely a "possibly signed floating point
number" is, but most likely, this refers to
floatnumber   ::=  pointfloat | exponentfloat
pointfloat    ::=  [intpart] fraction | intpart "."
exponentfloat ::=  (intpart | pointfloat) exponent
intpart       ::=  digit+
fraction      ::=  "." digit+
exponent      ::=  ("e" | "E") ["+" | "-"] digit+
digit         ::=  "0"..."9"
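
To make the mismatch concrete, here is a minimal sketch that transcribes the grammar above into a regular expression and compares it with what float() accepts. The helper name matches_grammar is hypothetical, and the behaviour of float() shown is that of current CPython 3; the sign and surrounding whitespace mentioned in the docstring are deliberately left out.

```python
import re

# Transcription of the quoted grammar; "digit" is restricted to ASCII 0-9.
DIGIT = r"[0-9]"
INTPART = rf"{DIGIT}+"
FRACTION = rf"\.{DIGIT}+"
POINTFLOAT = rf"(?:(?:{INTPART})?{FRACTION}|{INTPART}\.)"
EXPONENT = rf"[eE][+-]?{DIGIT}+"
EXPONENTFLOAT = rf"(?:{POINTFLOAT}|{INTPART}){EXPONENT}"
FLOATNUMBER = re.compile(rf"(?:{EXPONENTFLOAT}|{POINTFLOAT})")


def matches_grammar(text):
    """Hypothetical helper: True if text is a floatnumber per the grammar."""
    return FLOATNUMBER.fullmatch(text) is not None


# ARABIC-INDIC DIGITS ONE..FOUR, an ASCII full stop, then FIVE and SIX.
arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"

print(matches_grammar("1234.56"))     # ASCII digits fit the grammar
print(matches_grammar(arabic_indic))  # non-ASCII digits do not
print(float(arabic_indic))            # yet float() converts them anyway
```

The string that the grammar rejects is still converted by float(), which is the discrepancy under discussion.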
I don't see why the language spec should limit the wealth of number
formats supported by float().
It is not uncommon for Asians and other non-Latin script users to
use their own native script symbols for numbers. Just because these
digits may look strange to someone doesn't mean that they are
meaningless or should be discarded.
That's different. Python doesn't assign any semantic meaning to the
characters in identifiers. The non-Latin support for numerals, though,
could change the meaning of a program dramatically and needs to be
well-specified. Whether int() should do this is debatable. I, for one,
think this kind of support belongs in the locale module.
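
As a rough illustration of what explicit, opt-in handling could look like (this is not an existing locale API, and not something proposed in this thread), the digit mapping can be spelled out with unicodedata before calling float(). The helper name to_ascii_digits is hypothetical.

```python
import unicodedata


def to_ascii_digits(text):
    """Hypothetical helper: map each Unicode decimal digit to its ASCII form."""
    out = []
    for ch in text:
        # unicodedata.decimal returns the default (None here) for characters
        # that have no decimal digit value in the Unicode Character Database.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


# ARABIC-INDIC DIGITS ONE..FOUR, an ASCII full stop, then FIVE and SIX.
native = "\u0661\u0662\u0663\u0664.\u0665\u0666"
print(to_ascii_digits(native))
print(float(to_ascii_digits(native)))
```

With this approach the caller decides where native digits are accepted, and the conversion itself only ever sees ASCII digits.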


-- 
Regards,
Benjamin

Benjamin Peterson