[Python-Dev] Python and the Unicode Character Database

"Martin v. Löwis" martin at v.loewis.de
Mon Nov 29 00:03:45 CET 2010


>> Now, one may wonder what precisely a "possibly signed floating point
>> number" is, but most likely, this refers to
>>
>> floatnumber   ::=  pointfloat | exponentfloat
>> pointfloat    ::=  [intpart] fraction | intpart "."
>> exponentfloat ::=  (intpart | pointfloat) exponent
>> intpart       ::=  digit+
>> fraction      ::=  "." digit+
>> exponent      ::=  ("e" | "E") ["+" | "-"] digit+
>> digit         ::=  "0"..."9"
> 
> I don't see why the language spec should limit the wealth of number
> formats supported by float().

If it doesn't, there should be some other specification of what
is correct and what is not. It must not be unspecified.
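
To make the question concrete, here is a minimal sketch that transcribes
the quoted grammar into a regular expression, so that a string can be
checked against it. The optional leading sign is my reading of "possibly
signed", and the names FLOATNUMBER and matches_grammar are made up for
illustration; none of this is taken from the documentation or from
float()'s implementation.

```python
import re

# Each piece mirrors one production of the quoted grammar.
# [0-9] is spelled out on purpose: in a str pattern, \d would also
# match non-ASCII decimal digits.
INTPART = r"[0-9]+"
FRACTION = r"\.[0-9]+"
EXPONENT = r"[eE][+-]?[0-9]+"
POINTFLOAT = rf"(?:(?:{INTPART})?{FRACTION}|{INTPART}\.)"
EXPONENTFLOAT = rf"(?:(?:{POINTFLOAT}|{INTPART}){EXPONENT})"

# Assumption: "possibly signed" means one optional leading "+" or "-".
FLOATNUMBER = re.compile(rf"[+-]?(?:{EXPONENTFLOAT}|{POINTFLOAT})")


def matches_grammar(text):
    """Return True if the whole string is a floatnumber per the grammar."""
    return FLOATNUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["1234.56", "1234.56e4", "-.5", "1.", "123", "١٢٣٤.٥٦"]:
        print(repr(sample), matches_grammar(sample))
```

Under this reading, a plain "123" and any string using non-ASCII digits
fall outside the grammar, which is exactly where a written specification
would have to say whether float() accepting them is intended.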

> It is not uncommon for Asians and other non-Latin script users to
> use their own native script symbols for numbers. Just because these
> digits may look strange to someone doesn't mean that they are
> meaningless or should be discarded.

Then these users should speak up and indicate their need, or somebody
should speak up and confirm that there are users who actually want
'١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
system in which '١٢٣٤.٥٦e4' means 12345600.0.
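
For reference, the behaviour under discussion can be observed with a
short sketch like the following. It assumes a CPython 3 interpreter; the
helper name parse_examples is hypothetical and exists only to keep the
example self-contained.

```python
def parse_examples():
    """Parse strings written with ARABIC-INDIC digits and an ASCII '.'."""
    plain = float("١٢٣٤.٥٦")         # same digits as '1234.56'
    with_exponent = float("١٢٣٤.٥٦e4")  # non-ASCII digits plus ASCII 'e4'
    integer = int("١٢٣")              # int() accepts such digits as well
    return plain, with_exponent, integer


if __name__ == "__main__":
    print(parse_examples())
```

Note that the decimal point and the exponent marker in these strings are
still ASCII characters; only the digits come from another script.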

> Please also remember that Python3 now allows Unicode names for
> identifiers for much the same reasons.

No no no. Addition of Unicode identifiers has a well-designed,
deliberate specification, with a PEP and all. The support for
non-ASCII digits in float appears to be ad-hoc, and not founded
on actual needs of actual users.

> Note that the support in float() (and the other numeric constructors)
> to work with Unicode code points was explicitly added when Unicode
> support was added to Python and has been available since Python 1.6.

That doesn't necessarily make it useful. Alexander's complaint is that
it makes Python's behaviour unstable (i.e. it changes whenever the UCD
changes).
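
The dependency on the UCD can be illustrated with a small sketch. It
assumes that the characters float() treats as digits are those in
general category Nd, which is my understanding of the implementation
rather than anything the documentation states; decimal_digit_chars is a
hypothetical helper.

```python
import sys
import unicodedata


def decimal_digit_chars():
    """List every character whose general category is Nd (decimal digit).

    The result depends on the UCD version this interpreter was built with.
    """
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Nd"
    ]


if __name__ == "__main__":
    digits = decimal_digit_chars()
    # Both numbers can differ from one Python release to the next.
    print("UCD version:", unicodedata.unidata_version)
    print("Nd characters:", len(digits))
```

Running this on interpreters built against different UCD versions would
be expected to print different counts, so the set of strings these
constructors accept is not fixed by Python itself.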

> It is not a bug by any definition of "bug"

Most certainly it is: the documentation is either underspecified,
or deviates from the implementation (when taking the most plausible
interpretation). This is the very definition of "bug".

Regards,
Martin

