[Python-3000] String formatting: Conversion specifiers
Talin
talin at acm.org
Wed Jun 7 07:36:49 CEST 2006
Nick Coghlan wrote:
> Talin wrote:
>
>> So I decided to sit down and rethink the whole conversion specifier
>> system. I looked at the docs for the '%' operator, and some other
>> languages, and here is what I came up with (this is an excerpt from
>> the revised PEP.)
>
>
> Generally nice, but I'd format the writeup a bit differently (see below)
> and reorder the elements so that an arbitrary character can be supplied
> as the fill character and the old ' ' sign flag behaviour remains
> available.
>
> I'd also design it so that the standard conversion specifiers are
> available 'for free' (i.e., they work for any class, unless the class
> author deliberately replaces them with something else).
>
> Cheers,
> Nick.
>
I've taken your proposal as a base, and made some additional changes to
it. In addition, I've gone ahead and implemented a prototype of the
built-in formatter based on the revised text.
Note that I decided not to have a different specifier syntax for each
data type. The reason is that I have a single parser that parses the
conversion specifier, and it always parses precision, sign, etc., even
if they are not used by that particular format type. So instead, it is
simply the case that some specifier options aren't used for some format
types.
Here is the new text for the section:
Standard Conversion Specifiers
If an object does not define its own conversion specifiers, a
standard set of conversion specifiers is used. These are similar
in concept to the conversion specifiers used by the existing '%'
operator; however, there are also a number of significant
differences. The standard conversion specifiers fall into three
major categories: string conversions, integer conversions and
floating point conversions.
The general form of a standard conversion specifier is:
[[fill]align][sign][width][.precision][type]
The brackets ([]) indicate an optional field.
The optional align flag can be one of the following:
'<' - Forces the field to be left-aligned within the available
space (This is the default.)
'>' - Forces the field to be right-aligned within the
available space.
'=' - Forces the padding to be placed immediately after the
sign, if any. This is used for printing fields
in the form '+000000120'.
Note that unless a minimum field width is defined, the field
width will always be the same size as the data to fill it, so
that the alignment option has no meaning in this case.
The optional 'fill' character defines the character to be used to
pad the field to the minimum width. The alignment flag must be
supplied if the character is a digit other than 0 (otherwise the
character would be interpreted as part of the field width
specifier). A '0' fill character without an alignment flag
implies an alignment type of '='.
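As a rough illustration of the alignment and fill rules, the sketch below uses the format() built-in from later Python releases, which accepts a very similar notation. That built-in is an assumption here, not the prototype described in this post, and its defaults may differ from this draft, so each example spells out the alignment explicitly.

```python
# Alignment and fill, illustrated with the format() built-in of later
# Python releases (an assumption; not the prototype from this post).
left = format("abc", "<8")       # left-align, pad on the right with spaces
right = format("abc", "*>8")     # right-align, pad on the left with '*'
signed = format(120, "0=+10d")   # zeros go between the sign and the digits

print(repr(left), repr(right), repr(signed))
```

The last call reproduces the '+000000120' form mentioned in the description of the '=' flag.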
The 'sign' field can be one of the following:
'+' - indicates that a sign should be used for both
positive as well as negative numbers
'-' - indicates that a sign should be used only for negative
numbers (this is the default behaviour)
' ' - indicates that a leading space should be used on
positive numbers
'()' - indicates that negative numbers should be surrounded
by parentheses
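The four sign options can be sketched with a small helper. The name apply_sign is hypothetical, and the handling of positive numbers under '()' (left unadorned) is an assumption, since the text above only says what happens to negative numbers.

```python
def apply_sign(number, sign="-"):
    """Render an integer using one of the draft's sign options.

    Hypothetical helper for illustration only.
    """
    if sign not in ("+", "-", " ", "()"):
        raise ValueError(f"unknown sign option: {sign!r}")
    digits = str(abs(number))
    if number < 0:
        # '()' wraps negatives in parentheses; every other option uses '-'.
        return f"({digits})" if sign == "()" else f"-{digits}"
    if sign == "+":
        return "+" + digits
    if sign == " ":
        return " " + digits
    # '-' (the default) and, by assumption, '()' leave positives unadorned.
    return digits


print(apply_sign(120, "+"), apply_sign(-120, "()"), repr(apply_sign(120, " ")))
```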
'width' is a decimal integer defining the minimum field width. If
not specified, then the field width will be determined by the
content.
The 'precision' field is a decimal integer indicating how many
digits should be displayed after the decimal point.
Finally, the 'type' determines how the data should be presented.
If the type field is absent, an appropriate type will be assigned
based on the value to be formatted ('d' for integers and longs,
'g' for floats, and 's' for everything else.)
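The general form above lends itself to a single parser that always extracts every field, whether or not the format type uses it. The following is a minimal sketch of that idea using a regular expression. It is not the prototype mentioned in this post; the names SPEC_RE and parse_spec are hypothetical, and accepting any single letter or '%' as the type is an assumption.

```python
import re

# [[fill]align][sign][width][.precision][type]
SPEC_RE = re.compile(
    r"""
    (?:(?P<fill>.)?(?P<align>[<>=]))?   # optional fill, only valid with align
    (?P<sign>\(\)|[-+ ])?               # '+', '-', ' ' or '()'
    (?P<width>\d+)?                     # minimum field width
    (?:\.(?P<precision>\d+))?           # digits after the decimal point
    (?P<type>[A-Za-z%])?                # conversion type
    """,
    re.VERBOSE,
)


def parse_spec(spec):
    """Split a conversion specifier into its fields (hypothetical helper)."""
    match = SPEC_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid conversion specifier: {spec!r}")
    fields = match.groupdict()
    width = fields["width"]
    # A leading '0' with no alignment flag implies fill '0' and align '='.
    if fields["align"] is None and width is not None and width.startswith("0"):
        fields["fill"], fields["align"] = "0", "="
    fields["width"] = int(width) if width is not None else None
    if fields["precision"] is not None:
        fields["precision"] = int(fields["precision"])
    return fields


print(parse_spec("0=+10d"))
print(parse_spec("()8.2f"))
```

Fields that are absent come back as None, so a formatter for a given type can simply ignore the options it does not use.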
The available string conversion types are:
's' - String format. Invokes str() on the object.
This is the default conversion specifier type.
'r' - Repr format. Invokes repr() on the object.
There are several integer conversion types. All invoke int() on
the object before attempting to format it.
The available integer conversion types are:
'b' - Binary. Outputs the number in base 2.
'c' - Character. Converts the integer to the corresponding
unicode character before printing.
'd' - Decimal Integer. Outputs the number in base 10.
'o' - Octal format. Outputs the number in base 8.
'x' - Hex format. Outputs the number in base 16, using lower-
case letters for the digits above 9.
'X' - Hex format. Outputs the number in base 16, using upper-
case letters for the digits above 9.
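A minimal sketch of the integer conversions, assuming the format() built-in of later Python releases for the per-type output. The helper name format_integer is hypothetical; it calls int() on the object first, as the text above describes.

```python
def format_integer(obj, conv="d"):
    """Apply one of the integer conversion types (hypothetical helper)."""
    if conv not in ("b", "c", "d", "o", "x", "X"):
        raise ValueError(f"not an integer conversion type: {conv!r}")
    # The draft says every integer conversion invokes int() on the object.
    return format(int(obj), conv)


for conv in "bdoxX":
    print(conv, format_integer(255, conv))
print("c", format_integer(65, "c"))
```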
There are several floating point conversion types. All invoke
float() on the object before attempting to format it.
The available floating point conversion types are:
'e' - Exponent notation. Prints the number in scientific
notation using the letter 'e' to indicate the exponent.
'E' - Exponent notation. Same as 'e' except it uses an upper
case 'E' as the separator character.
'f' - Fixed point. Displays the number as a fixed-point
number.
'F' - Fixed point. Same as 'f'.
'g' - General format. This prints the number as a fixed-point
number, unless the number is too large, in which case
it switches to 'e' exponent notation.
'G' - General format. Same as 'g' except switches to 'E'
if the number gets too large.
'n' - Number. This is the same as 'g', except that it uses the
current locale setting to insert the appropriate
number separator characters.
'%' - Percentage. Multiplies the number by 100 and displays
in fixed ('f') format, followed by a percent sign.
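The floating point conversions can be tried out in the same way. This sketch again assumes the format() built-in of later Python releases, and format_float is a hypothetical helper that applies float() first, as described above. The 'n' type is left out because its output depends on the current locale.

```python
def format_float(obj, spec):
    """Apply a floating point conversion after calling float() (hypothetical)."""
    return format(float(obj), spec)


print(format_float(1234.5678, ".2f"))   # fixed point
print(format_float(1234.5678, ".3e"))   # exponent notation, lower-case 'e'
print(format_float(1234.5678, ".3E"))   # exponent notation, upper-case 'E'
print(format_float(1e20, "g"))          # large value switches to exponent form
print(format_float(0.25, ".1%"))        # multiplied by 100, then '%' appended
```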
Objects are able to define their own conversion specifiers to
replace the standard ones. An example is the 'datetime' class,
whose conversion specifiers might look something like the
arguments to the strftime() function:
"Today is: {0:a b d H:M:S Y}".format(datetime.now())
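To make the idea of object-defined specifiers concrete, here is a rough sketch of a wrapper class that interprets the strftime-like letters from the example above. It relies on the __format__ hook found in later Python releases, which is an assumption; this excerpt does not say how an object would declare its specifiers. The class name Stamp and its set of recognised letters are hypothetical.

```python
from datetime import datetime


class Stamp:
    """Hypothetical wrapper whose specifier is a string of strftime letters."""

    _CODES = set("aAbBdHMSYmy")

    def __init__(self, when):
        self.when = when

    def __format__(self, spec):
        if not spec:
            return str(self.when)
        # Turn each recognised letter into a strftime directive; escape any
        # literal '%' and pass every other character through unchanged.
        parts = []
        for ch in spec:
            if ch in self._CODES:
                parts.append("%" + ch)
            elif ch == "%":
                parts.append("%%")
            else:
                parts.append(ch)
        return self.when.strftime("".join(parts))


posted = Stamp(datetime(2006, 6, 7, 7, 36, 49))
print("Posted: {0:Y-m-d H:M:S}".format(posted))
```

Because the object receives the whole specifier as a string, it is free to ignore the standard fill, align, and width fields entirely.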
Finally, I have two questions:
1) Where would be a good place to stick the rough prototype? I don't
want to post it here, as it's rather long.
2) I'd like to know if anyone out there wants to take over the task of
implementing PEP 3102 so that I can focus my attention on PEP 3101. I
have fairly limited bandwidth at the moment, and PEP 3101 is by far the
more complex proposal.
-- Talin