Re: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev)
Joining the discussion over here to add a couple of points that I haven't seen in Raymond's PEP updates on the checkin list:

1. The Single Unix Specification apparently uses an apostrophe as a flag in printf() %-formatting to request inclusion of a thousands separator in a locale aware way [1]. Since the apostrophe is much harder to mistake for a period than a comma is, I would modify my "just a flag" suggestion to use an apostrophe as the flag instead of a comma:

    [[fill]align][sign][#][0][width]['][.precision][type]

The output would still use commas though:

    format(1234, "8.1f")   -->  '  1234.0'
    format(1234, "8'.1f")  -->  ' 1,234.0'
    format(1234, "8d")     -->  '    1234'
    format(1234, "8'd")    -->  '   1,234'

2. PEP 3101 *already included* a way to modify the handling of format strings in a consistent way: use a custom string.Formatter subclass instead of relying on the basic str.format method. When the mini language parser is exposed (which I consider a separate issue from this PEP), a locale aware custom formatter is going to find an "include digit separators" flag far more useful than the overly explicit "use this thousands separator and this decimal separator".

Cheers,
Nick.

[1] http://linux.die.net/man/3/printf

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
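To make point 2 concrete, here is a minimal sketch of a string.Formatter subclass that accepts the apostrophe flag from the spec above and emits commas. It is not part of the PEP: the class name ApostropheFormatter and the regular expression are hypothetical, and the sketch assumes a Python version whose format() accepts "," as a thousands-separator option.

```python
import re
import string

# Hypothetical pattern for the proposed spec:
# [[fill]align][sign][#][0][width]['][.precision][type]
# It only matches when the apostrophe sits between width and precision.
_APOSTROPHE_SPEC = re.compile(
    r"(?P<head>(?:.?[<>=^])?[-+ ]?#?0?\d*)'(?P<tail>(?:\.\d+)?[a-zA-Z%]?)"
)


class ApostropheFormatter(string.Formatter):
    """Sketch: treat an apostrophe flag as 'insert digit separators'."""

    def format_field(self, value, format_spec):
        match = _APOSTROPHE_SPEC.fullmatch(format_spec)
        if match:
            # Assumption: the built-in "," option does the actual grouping.
            format_spec = match.group("head") + "," + match.group("tail")
        return super().format_field(value, format_spec)


fmt = ApostropheFormatter()
print(repr(fmt.format("{0:8'.1f}", 1234)))
print(repr(fmt.format("{0:8'd}", 1234)))
print(repr(fmt.format("{0:8d}", 1234)))
```

Specs without the flag pass through unchanged, so the subclass can stand in for str.format wherever the flag is wanted. A locale aware variant could substitute a different separator in format_field instead of the fixed comma.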
[Nick Coghlan]
1. The Single Unix Specification apparently uses an apostrophe as a flag in printf() %-formatting to request inclusion of a thousands separator in a locale aware way [1].
We already use C-sharp's "n" flag for a locale aware thousands separator.
Since the apostrophe is much harder to mistake for a period than a comma is, I would modify my "just a flag" suggestion to use an apostrophe as the flag instead of a comma: . . . The output would still use commas though:
That doesn't make sense for two reasons:
1. Why mark a non-locale aware form with a flag that indicates locale awareness in another language?
2. It seems to be basic bad design to require an apostrophe to emit commas.

FWIW, the comma-only version of the proposal is probably going to die anyway. The more flexible alternative evolved to something simple and direct. Also, the newsgroup discussion makes it abundantly clear that half the world will rebel if commas are the only supported option.
2. PEP 3101 *already included* a way to modify the handling of format strings in a consistent way: use a custom string.Formatter subclass instead of relying on the basic str.format method.
When the mini language parser is exposed (which I consider a separate issue from this PEP), a locale aware custom formatter is going to find a "include digit separators" flag far more useful than the overly explicit "use this thousands separator and this decimal separator".
Thanks. Will note that in the PEP when I get a chance. Raymond
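For reference, a small sketch of the existing locale aware "n" presentation type mentioned above. The locale name "en_US.UTF-8" is an assumption and may not be installed on a given system, so the sketch falls back to None in that case.

```python
import locale

# Pin the "C" locale first so the baseline result is predictable.
locale.setlocale(locale.LC_ALL, "C")
plain = format(1234567, "n")  # the "C" locale defines no digit grouping

# Assumption: "en_US.UTF-8" exists on this machine; skip it if it does not.
try:
    locale.setlocale(locale.LC_ALL, "en_US.UTF-8")
    grouped = format(1234567, "n")
except locale.Error:
    grouped = None
finally:
    # Restore the baseline so later formatting is unaffected.
    locale.setlocale(locale.LC_ALL, "C")

print(plain, grouped)
```

The same format spec gives different text depending on the active locale, which is the behaviour the non-locale aware proposal is meant to avoid.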
On Fri, 13 Mar 2009 23:08:18 -0700, "Raymond Hettinger" <python@rcn.com> wrote:
Since the apostrophe is much harder to mistake for a period than a comma is, I would modify my "just a flag" suggestion to use an apostrophe as the flag instead of a comma: . . . The output would still use commas though:
That doesn't make sense for two reasons:
1. Why mark a non-locale aware form with a flag that indicates locale awareness in another language?
2. It seems to be basic bad design to require an apostrophe to emit commas.
If I properly understand the PEP (by the way, congratulations for the reformulation -- the motivation section esp. is clearer and more motivat-ing) there are 2 differences between the proposals:
* choose char for thousand-sep
* choose decimal sep
FWIW, the comma-only version of the proposal is probably going to die anyway. The more flexible alternative evolved to something simple and direct. Also, the newsgroup discussion makes it abundantly clear that half the world will rebel if commas are the only supported option.
If the first proposal let the user choose the thousand-sep char it would be more appealing, indeed. As is, it has no chance. Anyway, the second proposal is now rather clear and simple. In my mind, both separators work together even when there is no possible conflict between the actual chars.

+1 for version #2 (more or less as is now)

I would just add: The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable space).

Denis

PS - OT: As the width param is the width of the whole number, how to cope with decimal point alignment, meaning that there should be integral part width/padding instead?

       123.45
         1.2
    123456.789

Maybe this need is mainly in the financial field, so that this will be implicitly addressed because of the 2-digit rounding?

       123.45
         1.20
    123456.79

------
la vita e estrany
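The alignment question in the PS can be sketched outside the format mini-language. The helper name align_on_point is hypothetical, and the sketch assumes str() renders each value in plain decimal notation (no exponent).

```python
def align_on_point(numbers):
    """Right-justify the integral parts so the decimal points line up."""
    # partition() yields (integral, ".", fraction); the last two are empty
    # strings when the text contains no decimal point.
    parts = [str(n).partition(".") for n in numbers]
    int_width = max((len(whole) for whole, _, _ in parts), default=0)
    return [whole.rjust(int_width) + point + frac for whole, point, frac in parts]


values = [123.45, 1.2, 123456.789]
for line in align_on_point(values):
    print(line)

# With a fixed precision, the ordinary width parameter already aligns
# the points, because every fractional part has the same length.
fixed = [format(v, "10.2f") for v in values]
for line in fixed:
    print(line)
```

The second loop shows why the financial case works without extra support: once the precision is fixed, padding the whole number and padding the integral part come to the same thing.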
Hello,

spir <denis.spir@...> writes:
I would just add: The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable space).

Then the proposal should allow for any kind of space characters (that is, any character for which isspace() is True). There are several non-breaking space characters in the unicode character set, with varying character widths, which is important for typography rules. See http://en.wikipedia.org/wiki/Non-breaking_space for some examples.

Regards

Antoine (playing devil's advocate a bit - but only a bit).
spir <denis.spir@...> writes:
I would just add: The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable space).
Then the proposal should allow for any kind of space characters (that is, any character for which isspace() is True). There are several non-breaking space characters in the unicode character set, with varying character widths, which is important for typography rules. See http://en.wikipedia.org/wiki/Non-breaking_space for some examples.
Regards
Antoine (playing devil's advocate a bit - but only a bit).
Keeping in mind the needs of people writing parsers, I don't think it's a good idea to expand this set. Already, we're not supporting all possible separators whether they be spaces or not. Given just U+0020 and U+00A0, a person can easily do a str.replace() to get to anything else. Raymond
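A minimal sketch of the str.replace() approach described above. It starts from comma-grouped output, assuming a Python version where format() accepts the "," option. The variable names and the choice of U+202F (narrow no-break space) are illustrative only.

```python
NBSP = "\u00a0"         # no-break space
NARROW_NBSP = "\u202f"  # narrow no-break space, an illustrative target

base = format(1234567.891, ",.2f")

# One replace() is enough when only the thousands separator changes.
with_nbsp = base.replace(",", NBSP)
with_narrow = with_nbsp.replace(NBSP, NARROW_NBSP)

# Swapping both separators needs a single-pass translation; two chained
# replace() calls would turn every separator into the same character.
european = base.translate(str.maketrans(",.", ".,"))

print(base)
print(repr(with_nbsp))
print(repr(with_narrow))
print(european)
```

Post-processing like this keeps the set of separators the formatter has to know about small, while still reaching any other separator from the output.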
Raymond Hettinger wrote:
1. Why mark a non-locale aware form with a flag that indicates locale awareness in another language. 2. It seems to be basic bad design to require an apostrophe to emit commas.
Okay, so how about:

  comma - always use a comma
  apostrophe - use the locale

And for the decimal point:

  dot - always use a dot
  semicolon - use the locale

--
Greg
participants (5)
- Antoine Pitrou
- Greg Ewing
- Nick Coghlan
- Raymond Hettinger
- spir