[Python-checkins] r45928 - peps/trunk/pep-3101.txt
talin
python-checkins at python.org
Sun May 7 03:49:44 CEST 2006
Author: talin
Date: Sun May 7 03:49:43 2006
New Revision: 45928
Modified:
peps/trunk/pep-3101.txt
Log:
Updated based on collected feedback.
Modified: peps/trunk/pep-3101.txt
==============================================================================
--- peps/trunk/pep-3101.txt (original)
+++ peps/trunk/pep-3101.txt Sun May 7 03:49:43 2006
@@ -8,7 +8,7 @@
Content-Type: text/plain
Created: 16-Apr-2006
Python-Version: 3.0
-Post-History:
+Post-History: 28-Apr-2006
Abstract
@@ -50,8 +50,8 @@
The specification will consist of 4 parts:
- - Specification of a set of methods to be added to the built-in
- string class.
+ - Specification of a new formatting method to be added to the
+ built-in string class.
- Specification of a new syntax for format strings.
@@ -99,13 +99,13 @@
"My name is Fred :-{}"
The element within the braces is called a 'field'. Fields consist
- of a name, which can either be simple or compound, and an optional
- 'conversion specifier'.
+ of a 'field name', which can either be simple or compound, and an
+ optional 'conversion specifier'.
- Simple names are either names or numbers. If numbers, they must
- be valid decimal numbers; if names, they must be valid Python
- identifiers. A number is used to identify a positional argument,
- while a name is used to identify a keyword argument.
+ Simple field names are either names or numbers. If numbers, they
+ must be valid base-10 integers; if names, they must be valid
+ Python identifiers. A number is used to identify a positional
+ argument, while a name is used to identify a keyword argument.
Compound names are a sequence of simple names separated by
periods:
@@ -118,8 +118,9 @@
positional argument 0.
Each field can also specify an optional set of 'conversion
- specifiers'. Conversion specifiers follow the field name, with a
- colon (':') character separating the two:
+ specifiers' which can be used to adjust the format of that field.
+ Conversion specifiers follow the field name, with a colon (':')
+ character separating the two:
"My name is {0:8}".format('Fred')
@@ -130,11 +131,15 @@
The conversion specifier consists of a sequence of zero or more
characters, each of which can consist of any printable character
- except for a non-escaped '}'. The format() method does not
- attempt to intepret the conversion specifiers in any way; it
- merely passes all of the characters between the first colon ':'
- and the matching right brace ('}') to the various underlying
- formatters (described later.)
+ except for a non-escaped '}'.
+
+ Conversion specifiers can themselves contain replacement fields;
+ this will be described in a later section. Except for this
+ replacement, the format() method does not attempt to interpret the
+ conversion specifiers in any way; it merely passes all of the
+ characters between the first colon ':' and the matching right
+ brace ('}') to the various underlying formatters (described
+ later.)
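
The field-name and conversion-specifier rules in the hunks above can be tried out with the str.format() method that later shipped in Python. This is only a sketch: the shipped method may differ in detail from this draft of the PEP, and the Person type and its fields are hypothetical names used for illustration.

```python
from collections import namedtuple

# Hypothetical record type, used only to demonstrate a compound field name.
Person = namedtuple("Person", ["name", "age"])
fred = Person(name="Fred", age=42)

# A number selects a positional argument.
positional = "My name is {0}".format("Fred")

# An identifier selects a keyword argument.
keyword = "My name is {name}".format(name="Fred")

# A compound name looks up attributes on positional argument 0.
compound = "{0.name} is {0.age}".format(fred)

# Everything after the colon is the conversion specifier; here a width of 8.
# Strings are left-aligned by default, so the padding lands on the right.
padded = "My name is {0:8}|".format("Fred")

print(positional)
print(keyword)
print(compound)
print(padded)
```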
Standard Conversion Specifiers
@@ -142,16 +147,19 @@
For most built-in types, the conversion specifiers will be the
same or similar to the existing conversion specifiers used with
the '%' operator. Thus, instead of '%02.2x', you will say
- '{0:2.2x}'.
+ '{0:02.2x}'.
There are a few differences however:
- The trailing letter is optional - you don't need to say '2.2d',
- you can instead just say '2.2'. If the letter is omitted, the
- value will be converted into its 'natural' form (that is, the
- form that it take if str() or unicode() were called on it)
- subject to the field length and precision specifiers (if
- supplied).
+ you can instead just say '2.2'. If the letter is omitted, a
+ default will be assumed based on the type of the argument.
+ The defaults will be as follows:
+
+ string or unicode object: 's'
+ integer: 'd'
+ floating-point number: 'f'
+ all other types: 's'
- Variable field width specifiers use a nested version of the {}
syntax, allowing the width specifier to be either a positional
@@ -159,10 +167,6 @@
"{0:{1}.{2}d}".format(a, b, c)
- (Note: It might be easier to parse if these used a different
- type of delimiter, such as parens - avoiding the need to create
- a regex that handles the recursive case.)
-
- The support for length modifiers (which are ignored by Python
anyway) is dropped.
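
The nested-brace form of variable width and precision can be sketched with the str.format() method that later shipped in Python. The example uses the 'f' type rather than the 'd' shown in the draft, because the shipped implementation rejects a precision on integer conversions; the variable names are illustrative only.

```python
value = 3.14159
width = 10
precision = 3

# The inner fields {1} and {2} are substituted first, producing the
# specifier "10.3f", which is then applied to positional argument 0.
nested = "{0:{1}.{2}f}".format(value, width, precision)

# The same result with the width and precision written out literally.
fixed = "{0:10.3f}".format(value)

print(repr(nested))
```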
@@ -171,7 +175,7 @@
conversion specifiers are identical to the arguments to the
strftime() function:
- "Today is: {0:%x}".format(datetime.now())
+ "Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())
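
As a quick check of the claim that these specifiers match strftime(), the following sketch compares the two spellings using the datetime support in shipped versions of Python. The fixed timestamp is an arbitrary value chosen for reproducibility.

```python
from datetime import datetime

# Arbitrary fixed timestamp so the output does not depend on the clock.
stamp = datetime(2006, 5, 7, 3, 49, 43)
pattern = "%a %b %d %H:%M:%S %Y"

# The conversion specifier after the colon is handed to the datetime
# object, which interprets it as a strftime() pattern.
via_format = "Today is: {0:%a %b %d %H:%M:%S %Y}".format(stamp)

# The equivalent explicit call.
via_strftime = "Today is: " + stamp.strftime(pattern)

print(via_format)
```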
Controlling Formatting
@@ -203,19 +207,37 @@
User-Defined Formatting Classes
- The code that interprets format strings can be called explicitly
- from user code. This allows the creation of custom formatter
- classes that can override the normal formatting rules.
-
- The string and unicode classes will have a class method called
- 'cformat' that does all the actual work of formatting; The
- format() method is just a wrapper that calls cformat.
+ There will be times when customizing the formatting of fields
+ on a per-type basis is not enough. An example might be an
+ accounting application, which displays negative numbers in
+ parentheses rather than using a negative sign.
+
+ The string formatting system facilitates this kind of application-
+ specific formatting by allowing user code to directly invoke
+ the code that interprets format strings and fields. User-written
+ code can intercept the normal formatting operations on a per-field
+ basis, substituting their own formatting methods.
+
+ For example, in the aforementioned accounting application, there
+ could be an application-specific number formatter, which reuses
+ the string.format templating code to do most of the work. The
+ API for such an application-specific formatter is up to the
+ application; here are several possible examples:
+
+ cell_format( "The total is: {0}", total )
+
+ TemplateString( "The total is: {0}" ).format( total )
+
+ Creating an application-specific formatter is relatively straight-
+ forward. The string and unicode classes will have a class method
+ called 'cformat' that does all the actual work of formatting; the
+ built-in format() method is just a wrapper that calls cformat.
The parameters to the cformat function are:
-- The format string (or unicode; the same function handles
both.)
- -- A field format hook (see below)
+ -- A callable 'format hook', which is called once per field
-- A tuple containing the positional arguments
-- A dict containing the keyword arguments
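
The accounting use case can be approximated in shipped versions of Python, which do not provide a 'cformat' class method but do offer the string.Formatter class with an overridable per-field hook. The sketch below is one possible shape for such a formatter, not the API this draft proposes; AccountingFormatter is a hypothetical name, and cell_format is borrowed from the example above.

```python
import string


class AccountingFormatter(string.Formatter):
    """Hypothetical formatter that shows negative numbers in parentheses."""

    def format_field(self, value, format_spec):
        # Intercept only negative numbers; bool is excluded because it is
        # a subclass of int.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value < 0:
            # Format the magnitude normally, then wrap it in parentheses.
            return "(" + super().format_field(-value, format_spec) + ")"
        # Everything else falls through to the normal formatting rules.
        return super().format_field(value, format_spec)


def cell_format(template, *args, **kwargs):
    # Application-level wrapper, mirroring the cell_format() call above.
    return AccountingFormatter().format(template, *args, **kwargs)


loss = cell_format("The total is: {0:.2f}", -1234.5)
gain = cell_format("The total is: {0:.2f}", 1234.5)
print(loss)
print(gain)
```

Here the hook sees the value and the conversion specifier for each field, which corresponds roughly to the first two arguments of the format hook described in this draft.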
@@ -223,15 +245,16 @@
string, and return a new string (or unicode) with all of the
fields replaced with their formatted values.
- For each field, the cformat function will attempt to call the
- field format hook with the following arguments:
+ The format hook is a callable object supplied by the user, which
+ is invoked once per field, and which can override the normal
+ formatting for that field. For each field, the cformat function
+ will attempt to call the field format hook with the following
+ arguments:
- field_hook(value, conversion, buffer)
+ format_hook(value, conversion, buffer)
The 'value' field corresponds to the value being formatted, which
- was retrieved from the arguments using the field name. (The
- field_hook has no control over the selection of values, only
- how they are formatted.)
+ was retrieved from the arguments using the field name.
The 'conversion' argument is the conversion spec part of the
field, which will be either a string or unicode object, depending
@@ -299,11 +322,34 @@
- Other variations include Ruby's #{}, PHP's {$name}, and so
on.
-
+
+ Some specific aspects of the syntax warrant additional comments:
+
+ 1) The use of the backslash character for escapes. A few people
+ suggested doubling the brace characters to indicate a literal
+ brace rather than using backslash as an escape character. This is
+ also the convention used in the .Net libraries. Here's how the
+ previously-given example would look with this convention:
+
+ "My name is {0} :-{{}}".format('Fred')
+
+ One problem with this syntax is that it conflicts with the use of
+ nested braces to allow parameterization of the conversion
+ specifiers:
+
+ "{0:{1}.{2}}".format(a, b, c)
+
+ (There are alternative solutions, but they are too long to go
+ into here.)
+
+ 2) The use of the colon character (':') as a separator for
+ conversion specifiers. This was chosen simply because that's
+ what .Net uses.
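
For comparison, the str.format() method that eventually shipped in Python accepts both the doubled-brace escape and nested braces inside a conversion specifier. The sketch below shows the two side by side; the argument values are illustrative only.

```python
# Doubled braces produce literal brace characters in the output.
doubled = "My name is {0} :-{{}}".format("Fred")

# Nested braces inside the specifier are still treated as fields:
# {1} and {2} expand to give the specifier "6.4", which truncates the
# string to 4 characters and pads it to a width of 6.
nested = "{0:{1}.{2}}".format("Frederick", 6, 4)

print(doubled)
print(repr(nested))
```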
+
Sample Implementation
- A rought prototype of the underlying 'cformat' function has been
+ A rough prototype of the underlying 'cformat' function has been
coded in Python, however it needs much refinement before being
submitted.