[Python-3000] Format specifier proposal

Talin talin at acm.org
Tue Aug 14 05:13:35 CEST 2007


Ron Adam wrote:
> 
>>     :f<+015.5 # Floating point, left aligned, always show sign,
>>               # leading zeros, field width 15 (min), 5 decimal places.
> 
> Which has precedence... left alignment or zero padding?
> 
> Or should this be an error?

The answer is: Just ignore that proposal entirely :)

------

So I sat down with Guido and as I expected he has simplified my thoughts 
greatly. Based on the conversation we had, I think we both agree on what 
should be done:


1) There will be a new built-in function "format" that formats a single 
field. This function takes two arguments, a value to format, and a 
format specifier string.

The "format" function does exactly the following:

    def format(value, spec):
       return value.__format__(spec)

(I believe this even works if value is 'None'.)

In other words, any type conversion or fallbacks must be done by 
__format__; any interpretation or parsing of the format specifier is 
also done by __format__.

"format" does not, however, handle the "!r" specifier. That is done by 
the caller of this function (usually the Formatter class.)
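
To make the division of labor concrete, here is a minimal sketch. The 
'Celsius' class and its 'F' type letter are made up for illustration, and 
the function is named 'format_single' (also made up) so it doesn't shadow 
the built-in:

```python
class Celsius:
    """Hypothetical type, used only for illustration."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # All parsing of the specifier happens here, not in the caller.
        if spec.endswith("F"):
            # Hypothetical type letter 'F': convert to Fahrenheit first.
            fahrenheit = self.degrees * 9 / 5 + 32
            return format(fahrenheit, spec[:-1] + "f") + " F"
        # Otherwise fall back to formatting the underlying number.
        return format(self.degrees, spec) + " C"


def format_single(value, spec):
    # Same shape as the proposed built-in: pure delegation.
    return value.__format__(spec)
```

Something like format_single(Celsius(100), ".1F") never looks at the 
specifier itself; it only hands it on to Celsius.__format__.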


2) The various type-specific __format__ methods are allowed to know 
about other types - so 'int' knows about 'float' and so on.

Note that other than the special case of int <--> float, this knowledge 
is one way only, meaning that the dependency graph is acyclic.

For most types, if they see a type letter that they don't recognize, 
they should coerce to their nearest built-in type (int, float, etc.) and 
re-invoke __format__.
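
A rough sketch of that fallback, assuming a made-up 'Percent' type whose 
nearest built-in type is float, and a made-up 'pct' specifier that it 
handles itself:

```python
class Percent:
    """Hypothetical wrapper around a ratio, for illustration only."""

    def __init__(self, ratio):
        self.ratio = ratio

    def __float__(self):
        return float(self.ratio)

    def __format__(self, spec):
        if spec == "pct":
            # A specifier this class recognizes itself (hypothetical).
            return format(self.ratio * 100, ".0f") + "%"
        # Unrecognized specifier: coerce to the nearest built-in type
        # and let that type's __format__ deal with it.
        return float(self).__format__(spec)
```

So format(Percent(0.25), "pct") is handled by Percent, while 
format(Percent(0.25), ".3f") ends up in float.__format__.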


3) In addition to int.__format__, float.__format__, and str.__format__, 
there will also be object.__format__, which simply coerces the object to 
a string, and calls __format__ on the result.

   class object:
      def __format__(self, spec):
         return str(self).__format__(spec)

So in other words, all objects are formattable if they can be converted 
to a string.
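
Here is a small sketch of what that buys you. It uses a made-up base class 
('StrFormattable') instead of patching 'object', since released Python 
versions treat non-empty specifiers on plain objects differently from this 
draft; 'Point' is also just an example:

```python
class StrFormattable:
    """Hypothetical base class mimicking the proposed object.__format__."""

    def __format__(self, spec):
        # Coerce to a string, then let str.__format__ interpret the spec.
        return str(self).__format__(spec)


class Point(StrFormattable):
    """Example type that defines only __str__, not __format__."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        return "({}, {})".format(self.x, self.y)
```

Point never mentions formatting, yet format(Point(1, 2), ">10") pads and 
aligns it like any other string.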


4) Explicit type coercion is a separate field from the format spec:

     {name[:format_spec][!coercion]}

Where 'coercion' can be 'r' (to convert to repr()) or 's' (to convert to 
string). Other letters may be added later based on need.

The coercion field causes the formatter class to attempt to coerce the 
value to the specified type before calling format(value, format_spec).
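
As a sketch of the per-field step inside the formatter (the helper name 
'format_field' and its signature are assumptions, not part of the proposal):

```python
def format_field(value, format_spec="", coercion=None):
    """Hypothetical helper: what the formatter does for one field."""
    if coercion == "r":
        value = repr(value)
    elif coercion == "s":
        value = str(value)
    elif coercion is not None:
        raise ValueError("unknown coercion: {!r}".format(coercion))
    # Coercion is already applied; format() only ever sees the spec.
    return format(value, format_spec)
```

With this, format_field("hi", ">6", "r") formats the repr of the string, 
quotes included, while the format spec itself stays free of coercion letters.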


5) Mini-language for format specifiers:

So I do like your (Ron's) latest proposal, and I am thinking about it 
quite a bit.

Guido suggested (and I am favorable to the idea) that we simply keep the 
2.5 format syntax, or the slightly more advanced variation that's in the 
PEP now.

This has a couple of advantages:

-- It means that Python programmers won't have to learn a new syntax.
-- It makes the 2to3 conversion of format strings trivial. (Although 
there are some other difficulties with automatic conversion of '%', but 
they are unrelated to format specifiers.)
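
To illustrate the second point for simple numeric cases, a converter could 
be little more than dropping the '%'. The helper name is made up, and this 
is only a sketch: it assumes the new syntax accepts the old flags unchanged, 
which will not hold for every '%' specifier.

```python
def percent_to_spec(percent_spec):
    """Hypothetical helper: turn '%+015.5f' into '+015.5f'."""
    if not percent_spec.startswith("%"):
        raise ValueError("expected a %-style specifier")
    # Assumption: flags, width, precision and type letter carry over as-is.
    return percent_spec[1:]


def convert_and_format(value, percent_spec):
    # Format 'value' the new way, using a converted 2.5-style specifier.
    return format(value, percent_to_spec(percent_spec))
```

For example, convert_and_format(3.14159, "%+015.5f") gives the same string 
as "%+015.5f" % 3.14159.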

Originally I liked the idea of putting the type letter at the front, 
instead of at the back like it is in 2.5. However, when you think about 
it, it actually makes sense to have it at the back. Because the type 
letter is now optional, it won't need to be there most of the time. The 
type letter is really just an optional modifier flag, not a "type" at all.

Two features of your proposal that aren't supported in the old syntax are:

   -- Arbitrary fill characters, as opposed to just '0' and ' '.
   -- Taking the string value from the left or right.

I'm not sure how much we need the first. The second sounds kind of 
useful though.

I'm thinking that we might be able to take your ideas and simply extend 
the old 2.5 syntax, so that it would be backwards compatible. On the 
other hand, it seems to me that once we have a *real* implementation 
(which we will soon), it will be relatively easy for people to 
experiment with new features and syntactical innovations.


6) Finally, Guido stressed that he wants to make sure that the 
implementation supports fields within fields, such as:

    {0:{1}.{2}}

Fortunately, the 'format' function doesn't have to handle this (it only 
formats a single value.) This would be done by the higher-level code.
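
A minimal sketch of how the higher-level code might do that, assuming inner 
fields are plain positional indexes like {1}. The function name and the 
regex are illustrative only:

```python
import re

# Matches simple positional inner fields such as {1} or {2}.
_INNER_FIELD = re.compile(r"\{(\d+)\}")


def format_nested(index, spec_template, args):
    """Hypothetical higher-level step: expand inner fields, then format."""
    # Replace each {n} inside the specifier with str(args[n]).
    spec = _INNER_FIELD.sub(
        lambda match: str(args[int(match.group(1))]), spec_template
    )
    # Only now is format() called, with a flat specifier and one value.
    return format(args[index], spec)
```

For the example above, format_nested(0, "{1}.{2}", (3.14159, 10, 3)) expands 
the specifier to "10.3" before format() ever sees it.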

-- Talin

