[Python-3000] More PEP 3101 changes incoming
Talin
talin at acm.org
Thu Aug 2 04:01:02 CEST 2007
I had a long discussion with Guido today, where he pointed out numerous
flaws and inconsistencies in my PEP that I had overlooked. I won't go
into all of the details of what he found, but I'd like to focus on what
came out of the discussion. I'm going to be updating the PEP to
incorporate the latest thinking, but I thought I would post it on Py3K
first to see what people think.
The first change is to divide the conversion specifiers into two parts,
which we will call "alignment specifiers" and "format specifiers". So
the new syntax for a format field will be:
    valueSpec [,alignmentSpec] [:formatSpec]
In other words, alignmentSpec is introduced by a comma, and formatSpec
is introduced by a colon. This use of comma and colon is taken
directly from .Net, although our alignment and format specifiers
themselves look nothing like the ones in .Net.
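To make the proposed field layout concrete, here is a minimal sketch of how a field body could be split into its three parts. The function name parse_field is hypothetical, and the sketch assumes that neither valueSpec nor alignmentSpec contains a colon. The comma form is the proposal from this post, so it is parsed by hand here.

```python
def parse_field(field):
    """Split 'valueSpec[,alignmentSpec][:formatSpec]' into three strings.

    Hypothetical helper: assumes no ':' appears before the format spec.
    """
    # Everything after the first colon is the format spec.
    head, _, format_spec = field.partition(":")
    # Within the head, everything after the first comma is the alignment spec.
    value_spec, _, alignment_spec = head.partition(",")
    return value_spec, alignment_spec, format_spec


plain = parse_field("0")
aligned = parse_field("0,8")
formatted = parse_field("0:+d")
both = parse_field("0,8:+d")
```

Missing parts come back as empty strings, so a caller can treat "no alignment" and "no format" uniformly.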
Alignment specifiers now include the former 'fill', 'align' and 'width'
properties. So for example, to indicate a field width of 8:
    "Property count {0,8}".format(propertyCount)
The 'formatSpec' now includes the former 'sign' and 'type' parameters:
    "Number of errors: {0:+d}".format(errCount)
This example indicates an integer field preceded by a sign for both
positive and negative numbers.
There are still some things to be worked out. For example, there are
currently 3 different meanings of 'width': Minimum width, maximum width,
and number of digits of decimal precision. The previous version of the
PEP followed the 2.x convention, which was 'n.n' - 'min.prec' for
floats, and 'min.max' for everything else. However, that seems confusing.
(I'm actually still working out the details - and in fact a little bit
of a bikeshed discussion would be welcome at this point, as I could use
some help ironing out these kinds of little inconsistencies.)
In general, you can think of the difference between format specifier and
alignment specifier as:
Format Specifier: Controls how the value is converted to a string.
Alignment Specifier: Controls how the string is placed on the line.
Another change in the behavior is that the __format__ special method can
only be used to override the format specifier - it can't be used to
override the alignment specifier. The reason is simple: __format__ is
used to control how your object is string-ified. It shouldn't get
involved in things like left/right alignment or field width, which are
really properties of the field, not the object being printed.
The __format__ special method can basically completely change how the
format specifier is interpreted. So for example for Date objects you can
have a format specifier that looks like the input to strftime().
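As an illustration of the strftime() idea, here is a minimal sketch of a class whose __format__ hook hands the format specifier straight to strftime(). The class name Appointment and its 'when' attribute are made up for the example; the sketch uses the colon form of the field syntax.

```python
import datetime


class Appointment:
    """Hypothetical wrapper whose format specifier is a strftime() pattern."""

    def __init__(self, when):
        self.when = when

    def __format__(self, format_spec):
        # An empty specifier falls back to the plain string form.
        if not format_spec:
            return str(self.when)
        # Otherwise the object decides what the specifier means.
        return self.when.strftime(format_spec)


appt = Appointment(datetime.date(2007, 8, 2))
iso_text = format(appt, "%Y-%m-%d")
field_text = "Due {0:%d/%m/%Y}".format(appt)
```

Note that the hook only ever sees the format specifier; padding and alignment are left to whoever lays out the field.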
However, there are times when you want to override the __format__ hook.
The primary use case is the 'r' conversion specifier, which is used to
get the repr() of an object.
At the moment I'm leaning towards using the exclamation mark ('!') to
indicate this, in a way that's analogous to the CSS "!important" flag -
it basically means "No, I really mean it!" Two possible syntax
alternatives are:
    "The repr is {0!r}".format(obj)
    "The repr is {0:r!}".format(obj)
In the first option, we use '!' in place of the colon. In the second
case, we use '!' as a suffix.
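Here is a small sketch of the use case, written with the first spelling ('!' in place of the colon), which is the form that str.format in released Python 3 accepts. The Temperature class is invented for the example.

```python
class Temperature:
    """Hypothetical class with both a __format__ hook and a repr()."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        return "Temperature(%r)" % self.degrees

    def __format__(self, format_spec):
        # The hook formats the number and appends a unit.
        return format(self.degrees, format_spec) + " C"


t = Temperature(21.5)
hooked = "{0:.1f}".format(t)    # goes through __format__
bypassed = "{0!r}".format(t)    # skips __format__ and uses repr()
```

The second field never reaches the hook, which is the "No, I really mean it!" behaviour described above.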
Another change suggested by Guido is explicit support for the Decimal
type. Under the current proposal, a format specifier of 'f' will cause
the Decimal object to be coerced to float before printing. That's not
what we want, because it will cause a loss of precision. Instead, the
rule should be that Decimal can use all of the same formatting types as
float, but it won't try to convert the Decimal to float as an
intermediate step.
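The precision loss is easy to see with the decimal module as it exists today, where Decimal formats itself for the 'f' type. This sketch compares that with an explicit coercion to float; the variable names are only for illustration.

```python
from decimal import Decimal

d = Decimal("0.1")

# Formatting the Decimal directly keeps the exact decimal value.
as_decimal = format(d, ".20f")

# Coercing to float first exposes the binary approximation of 0.1.
as_float = format(float(d), ".20f")

same = as_decimal == as_float
```

With twenty digits requested, the Decimal result is all zeros after the leading 1, while the float result picks up stray digits at the end.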
Here's some pseudo code outlining how the new formatting algorithm for
fields will work:
    def format_field(value, alignmentSpec, formatSpec):
        if value has a __format__ attribute, and no '!' flag:
            # The object's own hook interprets the format specifier
            s = value.__format__(formatSpec)
        else:
            if the formatSpec is 'r':
                s = repr(value)
            elif the formatSpec is 'd' or one of the integer types:
                # Coerce to int
                s = formatInteger(int(value), formatSpec)
            elif the formatSpec is 'f' or one of the float types:
                if value is a Decimal:
                    # No intermediate conversion to float
                    s = formatDecimal(value, formatSpec)
                else:
                    # Coerce to float
                    s = formatFloat(float(value), formatSpec)
            else:
                s = str(value)
        # Now that we have 's', apply the alignment options
        return applyAlignment(s, alignmentSpec)
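For anyone who wants to experiment, here is a rough runnable approximation of the pseudo code. Several things in it are assumptions and not part of the proposal: the bypass_hook parameter stands in for the '!' flag, the alignment spec is taken to be an optional '<' or '>' followed by a minimum width (left-aligned by default), the type code is taken to be the last character of the format spec, and the built-in format() stands in for formatInteger, formatFloat and formatDecimal. Since every object in current Python inherits a __format__ method, the sketch only treats a hook as present when the type overrides object.__format__.

```python
from decimal import Decimal

INTEGER_TYPES = set("bcdoxX")
FLOAT_TYPES = set("eEfFgG%")


def apply_alignment(s, alignment_spec):
    # Assumed mini-syntax: optional '<' or '>' followed by a minimum width.
    if not alignment_spec:
        return s
    align = "<"
    if alignment_spec[0] in "<>":
        align, alignment_spec = alignment_spec[0], alignment_spec[1:]
    width = int(alignment_spec)
    return s.rjust(width) if align == ">" else s.ljust(width)


def format_field(value, alignment_spec="", format_spec="", bypass_hook=False):
    # A hook counts as present only if the type overrides object.__format__.
    has_hook = type(value).__format__ is not object.__format__
    type_code = format_spec[-1:]
    if has_hook and not bypass_hook:
        s = value.__format__(format_spec)
    elif type_code == "r":
        s = repr(value)
    elif type_code in INTEGER_TYPES:
        # Coerce to int
        s = format(int(value), format_spec)
    elif type_code in FLOAT_TYPES:
        if isinstance(value, Decimal):
            # No intermediate conversion to float
            s = format(value, format_spec)
        else:
            # Coerce to float
            s = format(float(value), format_spec)
    else:
        s = str(value)
    # Now that we have 's', apply the alignment options
    return apply_alignment(s, alignment_spec)


signed = format_field(3, ">8", "+d")
padded = format_field("abc", "5")
forced_repr = format_field("x", "", "r", bypass_hook=True)
```

The point of the structure is that apply_alignment() runs last and unconditionally, so no hook can interfere with field width or justification.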
My goal is to get a C implementation of just this function working some
time in the next several weeks. Most of the complexity of the PEP
implementation is right here IMHO.
Before I edit the PEP I'm going to let this marinate for a week and see
what the discussion brings up.
-- Talin