Re: [Python-ideas] Fixed point format for numbers with locale based separators

Jan. 7, 2019

On 6 January 2019 at 01:48 "Eric V. Smith" <eric@trueblade.com> wrote:
...
On 1/5/2019 3:03 PM, Łukasz Stelmach wrote:
...
Barry Scott <barry@barrys-emacs.org> writes:
...
On Friday, 4 January 2019 14:57:53 GMT Łukasz Stelmach wrote:
...
I would like to present two pull requests[1][2] implementing fixed
point presentation of numbers and ask for comments. The first is
mine. I learnt about the second after publishing mine.
The only format using the decimal separator from locale data for
float/complex/decimal numbers at the moment is "n", which behaves
like "g". The drawback of these formats, which I would like to overcome,
is the inability to print numbers ranging over more than one order of
magnitude with the same number of decimal digits without "manually"
(with some additional custom code) adjusting the precision. The other
option is to "manually" replace "." as printed by "f" with a local
decimal separator. Neither of these options is appealing to me.
Formatting 1.23456789 * n (LC_ALL=pl_PL.UTF-8)
     | n |    ".2f" |    ".3n" |
     |---+----------+----------|
     | 1 |     1.23 |     1,23 |
     | 2 |    12.35 |     12,3 |
     | 3 |   123.46 |      123 |
     | 4 |  1234.57 | 1,23e+03 |
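To make the comparison above concrete, here is a minimal sketch of the "manual" workaround described in the message: format with "f", then swap "." for the locale's decimal separator. The helper name fixed_with_separator is hypothetical, and the separator is passed in explicitly so the example does not depend on pl_PL.UTF-8 being installed.

```python
import locale


def fixed_with_separator(value, digits, decimal_point=None):
    """Format value with "f", then replace "." with a decimal separator.

    Hypothetical helper for illustration; not part of either pull request.
    """
    if decimal_point is None:
        # Fall back to the separator of the currently active LC_NUMERIC locale.
        decimal_point = locale.localeconv()["decimal_point"]
    return format(value, f".{digits}f").replace(".", decimal_point)


# The four values from the table, formatted both ways.
for value in (1.23456789, 12.3456789, 123.456789, 1234.56789):
    # "n" keeps 3 significant digits, so the number of decimals varies;
    # the workaround keeps 2 decimals at every magnitude.
    print(format(value, ".3n"), fixed_with_separator(value, 2, decimal_point=","))
```

This is exactly the extra custom code the proposal is meant to make unnecessary.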
Can you use locale.format_string() to solve this?
I am afraid I can't. I am using a library called pint[1] in my
project. It allows me to choose how its objects are formatted but it
uses format() internally. It adds some custom extensions to format
strings which, as far as I can tell, makes it hard if not impossible
to patch it to use locale.format_string(). But this is rather an excuse.
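For reference, the locale.format_string() route suggested above could look roughly like the sketch below. The helper name format_fixed_locale is hypothetical, and the code assumes pl_PL.UTF-8 may not be installed, in which case it simply keeps whatever locale is already active.

```python
import locale


def format_fixed_locale(value, digits):
    """Fixed-point formatting using the active locale's decimal separator."""
    # locale.format_string() applies printf-style formatting and then
    # substitutes the decimal point taken from the current locale settings.
    return locale.format_string(f"%.{digits}f", value)


# Assumption: the Polish locale from the table may be missing on this system.
try:
    locale.setlocale(locale.LC_NUMERIC, "pl_PL.UTF-8")
except locale.Error:
    pass  # keep the current locale; the separator will then stay as it was

print(format_fixed_locale(1234.56789, 2))
```

This works for code that controls its own formatting calls, but it does not help when a library calls format() internally, which is the situation described above.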
I do think that this is a compelling use case for "f" style
locale-aware formatting. I support adding it in some format or another
(pun intended).
My only concern is how to paint the bike shed. Should we just use
another format spec "type" character instead of "f", as the two linked
issues propose? Or maybe use an additional "alternate form" style
character, so that we could use different locale options, either now
or in the future? https://bugs.python.org/issue33731 is similar to
https://bugs.python.org/issue34311 but proposes using LC_MONETARY
instead of LC_NUMERIC.
I'm not suggesting we solve every possible problem here, but we at
least shouldn't paint ourselves into a corner and instead allow a
future where we could expand things, if needed, and without using up
tons of format spec "type" characters for every permutation of "type"
plus LC_MONETARY or LC_NUMERIC.
Here's a straw man:
The current specification for the format spec is:
[[fill]align][sign][#][0][width][grouping_option][.precision][type]
Let's say we change it to:
[[fill]align][sign][#][*|$][0][width][grouping_option][.precision][type]
(I think that's unambiguous, but I'd have to think it through some more)
Let's call the new [*|$] character the "locale character".
[...]
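To check the claim that the extended grammar is unambiguous, one can sketch a parser for the straw-man spec as a regular expression. This is only an illustration: the pattern and the name parse_spec are hypothetical, nothing like it exists in Python, and the sketch deliberately assigns no meaning to "*" or "$" because the text quoted here does not define it.

```python
import re

# Straw-man grammar from the message:
# [[fill]align][sign][#][*|$][0][width][grouping_option][.precision][type]
PROPOSED_SPEC = re.compile(
    r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?   # fill only counts if an align follows
    (?P<sign>[-+ ])?
    (?P<alt>\#)?
    (?P<locale>[*$])?                    # the proposed "locale character"
    (?P<zero>0)?
    (?P<width>\d+)?
    (?P<grouping>[_,])?
    (?:\.(?P<precision>\d+))?
    (?P<type>[A-Za-z%])?
    """,
    re.VERBOSE,
)


def parse_spec(spec):
    """Split a format spec into its fields; raise ValueError if it does not fit."""
    match = PROPOSED_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid format spec: {spec!r}")
    return match.groupdict()


print(parse_spec("*10.2f"))  # "*" is read as the locale character
print(parse_spec("*<10"))    # "*" is read as the fill character
```

The two calls show the one case that needs care: a leading "*" or "$" is a fill character only when an alignment character follows it, which is the same rule the existing grammar already uses for fill.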

OK, it doesn't sound bad at all and I wonder if there is *any* other
situation that may allow/require choosing between different categories of
locale data to format the same value. If so (I need to read some more
about locale data), I think your idea can be extended even
further. Let's use 'Lx' as even more general 'locale control sequence'
where 'x' is a locale category in general (LC_CTYPE, LC_). Should we
support only POSIX categories[1] or extensions like LC_PAPER in glibc or
other OS/library too?

BTW, is there any scanf() equivalent in Python that uses the same
syntax as format()? It might benefit from such control sequences
even more.
...
Again, this is just a straw man proposal that would require fleshing 
out. I think it might also require a PEP, but it would be as simple as 
PEP 378 for adding comma grouping formatting. Somewhere to memorialize 
the decision and how we got there, including rejected alternate 
proposals, would be a good thing.
Challenge accepted (-; Where do I start?

[1] https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
-- 
Kind regards,
Łukasz Stelmach