[Python-ideas] Format mini-language for lakh and crore

Stephan Houben stephanh42 at gmail.com
Sun Jan 28 04:30:35 EST 2018


Hi David,

Perhaps the "n" locale-dependent number formatting specifier
should accept a "," to request locale-appropriate grouping of
digits with the thousands separator?

f"{x:,n}"

would Do The Right Thing(TM) depending on the locale.
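
To make "locale-appropriate" concrete, here is a rough sketch of the
grouping rule as I understand it from the locale.localeconv() docs:
group sizes are listed from the right, a trailing 0 repeats the last
size, and CHAR_MAX stops grouping. The names group_digits and
format_int are made up for illustration, and the (3, 2, 0) default is
only my assumption of what an Indian locale would report on a given
platform.

```python
import locale


def group_digits(digits, grouping=(3, 2, 0), sep=","):
    """Insert sep into a digit string using localeconv()-style grouping.

    Hypothetical helper: the default (3, 2, 0) is assumed to match what
    an Indian locale reports; (3, 3, 0) gives the Western convention.
    """
    groups = []
    rest = digits
    last = None
    for size in grouping:
        if size == locale.CHAR_MAX:
            break  # no further grouping
        if size == 0:
            # Repeat the previous group size until the digits run out.
            while last and len(rest) > last:
                groups.append(rest[-last:])
                rest = rest[:-last]
            break
        if len(rest) <= size:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
        last = size
    groups.append(rest)
    return sep.join(reversed(groups))


def format_int(x, grouping=(3, 2, 0), sep=","):
    """Format an int with the given grouping, keeping the sign in front."""
    sign = "-" if x < 0 else ""
    return sign + group_digits(str(abs(x)), grouping, sep)


x = 12_34_56_78_00_000
print(format_int(x))                       # 12,34,56,78,00,000
print(format_int(x, grouping=(3, 3, 0)))   # 1,234,567,800,000
```

With something like this under the hood, the same format spec could
serve both conventions, and only the locale's grouping data would
differ.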

Today it is an error.
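
For reference, a small check of the current behaviour as far as I can
tell: combining "," with "n" raises ValueError, while "n" alone and
locale.format_string(..., grouping=True) already consult the current
locale. What they print depends on the locale in effect; in the
default "C" locale I would expect no separators at all, so take the
output as platform-dependent.

```python
import locale

x = 12_34_56_78_00_000

# Combining "," with "n" is rejected by the format mini-language.
try:
    format(x, ",n")
except ValueError as exc:
    error = exc
else:
    error = None
print("',n' raised:", error)

# "n" on its own uses the separator and grouping of the current locale
# (set via locale.setlocale); the result is therefore platform-dependent.
plain = format(x, "n")
print(plain)

# The locale module offers the same locale-driven grouping explicitly.
via_locale = locale.format_string("%d", x, grouping=True)
print(via_locale)
```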

Stephan

2018-01-28 7:25 GMT+01:00 David Mertz <mertz at gnosis.cx>:

> In South Asia, a different style of digit delimiters for large numbers is
> used than in Europe, North America, Australia, etc.  With some minor
> spelling differences, the term lakh is used for a hundred-thousand, and it
> is generally written as '1,00,000'.
>
> In turn, a crore is 100 lakh, and is written as '1,00,00,000'.  Extending
> this pattern, larger numbers continue to use two digits in groups (other
> than the smallest grouping of three digits).  So, e.g., 1e12 is written
> as 10,00,00,00,00,000.
>
> It's nice that we now have the optional underscore in numeric literals.
> So we could write a number as either `12_34_56_78_00_000` or
> `1_234_567_800_000` depending on what region of the world and which
> convention was more familiar.
>
> However, in *formatting* those numbers, the format mini-language only
> allows the European convention.  So e.g.
>
> In [1]: x = 12_34_56_78_00_000
> In [2]: "{:,d}".format(x)
> Out[2]: '1,234,567,800,000'
> In [3]: f"{x:,d}"
> Out[3]: '1,234,567,800,000'
>
>
> In order to get Indian number delimiters, you'd have to write a custom
> formatting function, notwithstanding that something like 1.5 billion people
> use the three-then-two delimiting convention.
>
> I propose that Python should have an additional grouping option, or some
> other way to specify this grouping convention.  Oddly, the '_' grouping
> symbol is available, even though no one actually uses that grouper outside
> of programming languages like Python, e.g.:
>
> In [4]: f"{x:_d}"
> Out[4]: '1_234_567_800_000'
>
>
> I guess this is nice for something like round-tripping numbers used in
> code, but it's not a symbol anyone uses "natively" (I understand why comma
> or period cannot be used in numeric literals since they mean something else
> in Python already).
>
> I'm not sure what symbol or combination I would recommend, but finding
> something suitable shouldn't be so hard.  Perhaps now that backtick no
> longer has any other meaning in Python, it could be used since it looks
> similar to a comma.  E.g. in Python 3.8 we might have:
>
> >>> f"{x:`d}"
> '12,34,56,78,00,000'
>
> (actually, this probably isn't any parser issue even in Python 2 since
> it's already inside quotes; but the issue is moot).
>
> Or maybe a two character version like:
>
> >>> f"{x:2,d}"
> '12,34,56,78,00,000'
>
>
> Or:
>
> >>> f"{x:,,d}"
> '12,34,56,78,00,000'
>
>
> Even if `2,` were used, that wouldn't preclude giving an additional precision
> descriptor after it.  Today we can write:
>
> >>> f"{x:,.2f}"
> '1,234,567,800,000.00'
>
> Perhaps in the future this would work:
>
> >>> f"{x:2,.2f}"
> '12,34,56,78,00,000.00'
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.