[Python-Dev] Python and the Unicode Character Database

"Martin v. Löwis" martin at v.loewis.de
Fri Dec 3 00:27:16 CET 2010

> The point is that we support all of Unicode in Python, not just a fragment,
> and therefore the numeric constructors support all of Unicode.

That conclusion is as false today as it was in Python 1.6, but only now
are people starting to care about it.

a) we don't support all of Unicode in numeric constructors. There are
   lots of things that you can write down that readers would recognize
   as a real/rational/integral number that float() won't parse.
b) if float() restricted itself to the scientific notation of
   real numbers (as it should), Python could still claim to support
   all of Unicode.
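
To make point (a) concrete, here is a minimal sketch, assuming a current
CPython 3 interpreter. The helper name try_float is made up for this
illustration; only float() itself is a real API.

```python
def try_float(text):
    """Return float(text), or None if float() rejects the string.

    Hypothetical helper, used only for this illustration.
    """
    try:
        return float(text)
    except ValueError:
        return None


# ARABIC-INDIC DIGIT ONE, TWO, THREE: decimal digits, accepted by float().
arabic_indic = try_float("\u0661\u0662\u0663")

# VULGAR FRACTION ONE HALF: a reader sees 0.5, float() rejects it.
one_half = try_float("\u00bd")

# ROMAN NUMERAL TWELVE: a reader sees 12, float() rejects it.
roman_twelve = try_float("\u216b")

print(arabic_indic, one_half, roman_twelve)
```

So float() accepts one slice of what Unicode can express (decimal digits
in scientific notation) and nothing beyond it.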

> Adding more locale aware numeric parsers and formatters to the
> locale module, based on these APIs is certainly a good idea,
> but orthogonal to the ongoing discussion, IMO.

Not at all. The concept of "Unicode numbers" is flawed: Unicode does
*not* prescribe any specific way to denote numbers. Unicode is about
characters, and Python supports the Unicode characters for digits as
well as it supports all the other Unicode characters.
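
As a rough illustration that the character database describes individual
characters rather than a notation for numbers, the sketch below queries
per-character properties through the standard unicodedata module. The
describe function is a made-up name for this example.

```python
import unicodedata


def describe(ch):
    """Return (name, category, decimal, digit, numeric) for one character.

    Hypothetical helper; properties a character lacks come back as None.
    """
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


# ASCII seven, Arabic-Indic seven, superscript two, vulgar fraction one half.
for ch in ("7", "\u0667", "\u00b2", "\u00bd"):
    print(describe(ch))
```

Each property belongs to a single character. Nothing in the database says
how a sequence of such characters should be read as a number.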

Instead, support for non-scientific notation of real numbers should
be based on user needs, which probably can be approximated by looking
at actual scripts. This, in turn, is inherently locale-dependent.
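
A small sketch of why this is locale-dependent: the same string denotes
different numbers under different conventions. parse_localized and its
separator parameters are assumptions made up for this example, not an
existing API. The standard library's locale.atof() plays a similar role
using the process's current LC_NUMERIC setting.

```python
def parse_localized(text, decimal_sep, group_sep):
    """Parse a number written with the given separators.

    Hypothetical helper: strips the grouping separator, maps the decimal
    separator to ".", then defers to float().
    """
    plain = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(plain)


# English-style conventions: "." is the decimal point, "," groups digits.
english = parse_localized("1.234", decimal_sep=".", group_sep=",")

# German-style conventions: "," is the decimal point, "." groups digits.
german = parse_localized("1.234", decimal_sep=",", group_sep=".")

print(english, german)
```

The input is identical in both calls; only the assumed convention
differs, and the results differ by a factor of a thousand.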

