[Python-ideas] Format mini-language for lakh and crore

Alex Walters tritium-list at sdamon.com
Sun Jan 28 05:57:07 EST 2018

It’s my opinion that instead of adding syntax, we should instead encourage using number formatting library functions.  


* You can replace the function or have the function dispatch differently depending on locale

* It means that syntax doesn’t need to be extended for every use case – its easier to replace a function than change syntax.




From: Python-ideas [mailto:python-ideas-bounces+tritium-list=sdamon.com at python.org] On Behalf Of David Mertz
Sent: Sunday, January 28, 2018 1:25 AM
To: python-ideas <python-ideas at python.org>
Subject: [Python-ideas] Format mini-language for lakh and crore


In South Asia, a different style of digit delimiters for large numbers is used than in Europe, North America, Australia, etc.  With some minor spelling differences, the term lakh is used for a hundred-thousand, and it is generally written as '1,00,000'.


In turn, a crore is 100 lakh, and is written as '1,00,00,000'.  Extending this pattern, larger numbers continue to use two digits in groups (other than the smallest grouping of three digits.  So, e.g. 1e12 is written as 10,00,00,00,00,000.


It's nice that we now have the optional underscore in numeric literals.  So we could write a number as either `12_34_56_78_00_000` or `1_234_567_800_000` depending on what region of the world and which convention was more familiar.


However, in *formatting* those numbers, the format mini-language only allows the European convention.  So e.g.


In [1]: x = 12_34_56_78_00_000

In [2]: "{:,d}".format(x)

Out[2]: '1,234,567,800,000'

In [3]: f"{x:,d}"

Out[3]: '1,234,567,800,000'


In order to get Indian number delimiters, you'd have to write a custom formatting function, notwithstanding that something like 1.5 billion people use the three-then-two delimiting convention.


I propose that Python should have an additional grouping option, or some other way to specify this grouping convention.  Oddly, the '_' grouping symbol is available, even though no one actually uses that grouper outside of programming languages like Python, e.g.:


In [4]: f"{x:_d}"

Out[4]: '1_234_567_800_000'


I guess this is nice for something like round-tripping numbers used in code, but it's not a symbol anyone uses "natively" (I understand why comma or period cannot be used in numeric literals since they mean something else in Python already).


I'm not sure what symbol or combination I would recommend, but finding something suitable shouldn't be so hard.  Perhaps now that backtick no longer has any other meaning in Python, it could be used since it looks similar to a comma.  E.g. in Python 3.8 we might have:


>>> f"{x:`d}"

(actually, this probably isn't any parser issue even in Python 2 since it's already inside quotes; but the issue is moot).


Or maybe a two character version like:


>>> f"{x:2,d}"





>>> f"{x:,,d}"



Even if `2,` was used, that wouldn't preclude giving an additional length descriptor after it.  Now we can have:


>>> f"{x:,.2f}"


Perhaps in the future this would work:


>>> f"{x:2,.2f}"




Keeping medicines from the bloodstreams of the sick; food 
from the bellies of the hungry; books from the hands of the 
uneducated; technology from the underdeveloped; and putting 
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180128/3f4f6003/attachment.html>

More information about the Python-ideas mailing list