[Python-checkins] r70423 - peps/trunk/pep-0378.txt

raymond.hettinger python-checkins at python.org
Mon Mar 16 23:16:12 CET 2009


Author: raymond.hettinger
Date: Mon Mar 16 23:16:11 2009
New Revision: 70423

Log:
Rearranged and updated to reflect Guido's comments.

Modified:
   peps/trunk/pep-0378.txt

Modified: peps/trunk/pep-0378.txt
==============================================================================
--- peps/trunk/pep-0378.txt	(original)
+++ peps/trunk/pep-0378.txt	Mon Mar 16 23:16:11 2009
@@ -33,7 +33,7 @@
 for the locale module describe these and `many other challenges`_
 in detail.
 
-.. _`many other challenges`:  http://docs.python.org/library/locale.html#background-details-hints-tips-and-caveats
+.. _`many other challenges`:  http://www.python.org/doc/2.6.1/library/locale.html#background-details-hints-tips-and-caveats
 
 It is not the goal to replace the locale module, to perform
 internationalization tasks, or accommodate every possible
@@ -44,6 +44,47 @@
 .. _`Babel`: http://babel.edgewall.org/
 
 
+Main Proposal (from Eric Smith)
+===============================
+
+Make both the thousands separator and decimal separator user
+specifiable but not locale aware.  For simplicity, limit the
+choices to a COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE.
+The SPACE can be either U+0020 or U+00A0.
+
+Whenever a separator is followed by a precision, it is a
+decimal separator and an optional separator preceding it is a
+thousands separator.  When the precision is absent, a lone
+specifier means a thousands separator::
+
+[[fill]align][sign][#][0][width][tsep][dsep precision][type]
+
+Examples::
+
+  format(1234, "8.1f")     -->    '  1234.0'
+  format(1234, "8,1f")     -->    '  1234,0'
+  format(1234, "8.,1f")    -->    ' 1.234,0'
+  format(1234, "8 ,1f")    -->    ' 1 234,0'
+  format(1234, "8d")       -->    '    1234'
+  format(1234, "8,d")      -->    '   1,234'
+  format(1234, "8_d")      -->    '   1_234'
+
+This proposal meets most needs, but it comes at the expense
+of taking a bit more effort to parse.  Not every possible
+convention is covered, but at least one of the options (spaces
+or underscores) should be readable, understandable, and useful
+to folks from many diverse backgrounds.
+
+As shown in the examples, the *width* argument means the total
+length including the thousands separators and decimal separators.
+
+No change is proposed for the locale module.
+
+The thousands separator is defined as shown above for types
+'d', 'e', 'f', 'g', 'E', 'G' and 'F'. To allow future extensions, it is
+undefined for other types: binary, octal, hex, character, etc.
+
+
 Current Version of the Mini-Language
 ====================================
 
@@ -54,25 +95,20 @@
 * PEP 3101 Advanced String Formatting
 
 
-Research so far
-===============
+Research into what Other Languages Do
+=====================================
 
 Scanning the web, I've found that thousands separators are
 usually one of COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE.  
 
-Visual Basic and its brethren (like `MS Excel`_) use a completely
-different style and have ultra-flexible custom format
-specifiers like::
-
-    "_($* #,##0_)".
-
-.. _`MS Excel`: http://www.brainbell.com/tutorials/ms-office/excel/Create_Custom_Number_Formats.htm
-
-`COBOL`_ uses picture clauses like::
+`C-Sharp`_ provides both styles (picture formatting and type specifiers).
+The type specifier approach is locale aware.  The picture formatting only
+offers a COMMA as a thousands separator::
 
-    PICTURE $***,**9.99CR
+    String.Format("{0:n}", 12400)     ==>    "12,400"
+    String.Format("{0:0,0}", 12400)   ==>    "12,400"
 
-.. _`COBOL`: http://en.wikipedia.org/wiki/Cobol#Syntactic_features
+.. _`C-Sharp`: http://blog.stevex.net/index.php/string-formatting-in-csharp/
 
 `Common Lisp`_ uses a COLON before the ``~D`` decimal type specifier to
 emit a COMMA as a thousands separator.  The  general form of ``~D`` is
@@ -86,18 +122,28 @@
 
 .. _`Common Lisp`: http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node200.html
 
-`C-Sharp`_ provides both styles (picture formatting and type specifiers).
-The type specifier approach is locale aware.  The picture formatting only
-offers a COMMA as a thousands separator::
 
-    String.Format("{0:n}", 12400)     ==>    "12,400"
-    String.Format("{0:0,0}", 12400)   ==>    "12,400"
+* The `ADA language`_ allows UNDERSCORES in its numeric literals.
 
-.. _`C-Sharp`: http://blog.stevex.net/index.php/string-formatting-in-csharp/
+.. _`ADA language`: http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html
+
+Visual Basic and its brethren (like `MS Excel`_) use a completely
+different style and have ultra-flexible custom format
+specifiers like::
+
+    "_($* #,##0_)".
+
+.. _`MS Excel`: http://www.brainbell.com/tutorials/ms-office/excel/Create_Custom_Number_Formats.htm
+
+`COBOL`_ uses picture clauses like::
+
+    PICTURE $***,**9.99CR
+
+.. _`COBOL`: http://en.wikipedia.org/wiki/Cobol#Syntactic_features
 
 
-Proposal I (from Nick Coghlan)
-==============================
+Alternative Proposal (from Nick Coghlan)
+========================================
 
 A comma will be added to the format() specifier mini-language::
 
@@ -126,86 +172,14 @@
   format(1234, "08,d")     -->    '0001,234'
   format(1234.5, "08,.1f") -->    '01,234.5'
 
-The ',' option is defined as shown above for types 'd', 'f',
-and 'F'. It is undefined for other types (binary, octal, hex,
-character, exponential, general, percentage, etc.)
-                                
-
-Proposal II (from Eric Smith)
-=============================
-
-Make both the thousands separator and decimal separator user
-specifiable but not locale aware.  For simplicity, limit the
-choices to a COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE.
-The SPACE can be either U+0020 or U+00A0.
-
-Whenever a separator is followed by a precision, it is a
-decimal separator and an optional separator preceding it is a
-thousands separator.  When the precision is absent, a lone
-specifier means a thousands separator::
-
-[[fill]align][sign][#][0][width][tsep][dsep precision]][type]
-
-Examples::
-
-  format(1234, "8.1f")     -->    '  1234.0'
-  format(1234, "8,1f")     -->    '  1234,0'
-  format(1234, "8.,1f")    -->    ' 1.234,0'
-  format(1234, "8 ,f")     -->    ' 1 234,0'
-  format(1234, "8d")       -->    '    1234'
-  format(1234, "8,d")      -->    '   1,234'
-  format(1234, "8_d")      -->    '   1_234'
-
-This proposal meets mosts needs, but it comes at the expense
-of taking a bit more effort to parse.  Not every possible
-convention is covered, but at least one of the options (spaces
-or underscores) should be readable, understandable, and useful
-to folks from many diverse backgrounds.
-
-As shown in the examples, the *width* argument means the total
-length including the thousands separators and decimal separators.
-
-No change is proposed for the locale module.
-
-The thousands separator is defined as shown above for types
-'d', 'f', and 'F'. It is undefined for other types (binary,
-octal, hex, character, exponential, general, percentage, etc.)
-
-
-Comparison
-==========
-
-The difference between the two proposals is that the first is hard-wired
-to a COMMA for a thousands separator and a DOT as a decimal separator.
-The second allows either separator to be one of several possibilities.
-
-The PEP author recommends Proposal II.
-
-
-Other Ideas
-===========
-
-* Lie Ryan suggested a convenience function of the form::
-
-    create_format(self, type='i', base=16, seppos=4, sep=':',
-                  charset='0123456789abcdef', maxwidth=32,
-                  minwidth=32, pad='0')
-
-* Eric Smith would like the C version of the mini-language
-  parser to be exposed with hooks that would make it easier
-  to write custom *__format__* methods.  That way, methods like
-  *Decimal.__format__* would not have to be written from scratch.
-
-* Antoine Pitrou noted that the provision for a SPACE separator
-  should also allow a non-breaking space (U+00A0).
-
-* A poster on the newgroup, Wolfgang Rohdewald, noted that a
-  convention in Switzerland is to use an APOSTROPHE as a
-  thousands separator, ``12`000.99``.
-
-* The `ADA language`_ allows UNDERSCORES in its numeric literals.
-
-.. _`ADA language`: http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html
+The ',' option is defined as shown above for types 'd', 'e',
+'f', 'g', 'E', 'G' and 'F'. To allow future extensions, it is
+undefined for other types: binary, octal, hex, character,
+etc.
+
+This alternative proposal has the virtue of being simpler
+than the main proposal but is much less flexible and meets
+the needs of fewer users right out of the box.
 
 
 Commentary
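
Note for readers of this checkin: the "[tsep][dsep precision]" syntax in
the Main Proposal can be approximated in current Python 3, whose format()
mini-language accepts "," as a thousands separator. The sketch below is not
part of the PEP or of this revision. The helper name format_sep and its
parameters are hypothetical, and it only emulates the proposed behaviour by
formatting with a COMMA and a DOT and then swapping in the requested
characters.

```python
def format_sep(value, width=0, tsep="", dsep=".", precision=None):
    """Hypothetical helper emulating the proposed tsep/dsep options.

    With precision=None the value is formatted as an integer ('d'),
    otherwise as fixed point ('f') with that many digits.
    """
    # Ask the built-in mini-language for "," grouping only when a
    # thousands separator was requested.
    group = "," if tsep else ""
    if precision is None:
        body = format(value, f"{group}d")
    else:
        body = format(value, f"{group}.{precision}f")
    # Swap both separators in a single pass so that exchanging "," and "."
    # cannot clobber one another.
    body = body.translate(str.maketrans({",": tsep, ".": dsep}))
    # As in the proposal, width is the total length, separators included.
    return body.rjust(width)


print(repr(format_sep(1234, 8, tsep=".", dsep=",", precision=1)))
print(repr(format_sep(1234, 8, tsep="_")))
```

Doing the substitution after formatting keeps the sketch short, but it is
only an illustration: a real implementation would handle the separators
inside the format-spec parser rather than by post-processing the string.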

