[Python-Dev] format, int, and IntEnum

Nick Coghlan ncoghlan at gmail.com
Thu Aug 15 17:06:41 CEST 2013


On 15 August 2013 05:03, Eric V. Smith <eric at trueblade.com> wrote:
> On 8/15/2013 12:27 AM, Nick Coghlan wrote:
>> I think Eric is overinterpreting the spec, there. While that particular
>> sentence requires that the empty format string will be equivalent to a
>> plain str() operation for builtin types, it is only a recommendation for
>> other types. For enums, I believe they should be formatted like their
>> base types (so !s and !r will show the enum name, anything without
>> coercion will show the value) .
>
> I don't think I'm over-interpreting the spec (but of course I'd say
> that!). The spec is very precise on the meaning of "format specifier":
> it means the entire string (the second argument to __format__). I'll
> grant that in the sentence in question it uses "format specification",
> not "format specifier", though.

By overinterpreting, I meant interpreting that part to mean literally
calling str() when the specifier was empty, but not calling it
otherwise. That's not what it means: it means that "str(x)" and
"format(x)" will typically produce the *same result*, not that format
will actually call str. If a subclass overrides __str__ but not
__format__ (or vice-versa), then they can and *should* diverge.

So, ideally, we would be more consistent, and either *always* call
str() for subclasses when no type specifier is given, or *never* call
it and always use the value.

In this case, for integers, a missing type specifier for numeric types
is defined as meaning "d", so calling str() on subclasses is wrong -
it should be using the value, not the string representation. So "no
type specifier" + "int subclass" *should* have meant calling int(self)
prior to formatting rather than str(self), and subtypes look bool
would have needed to override both __str__ and __format__.

The problem now is that changing this behaviour in the base class
would break subclasses like bool that expect format(x) to produce the
same result as str(x) even though they have overridden __str__ but not
__format__.

So, I think the path to consistency here needs to be that, if a
builtin subclass overrides __str__ without overriding __format__, then
the formatting result will *always* be "format(str(x), fmtspec)",
regardless of whether fmtspec is empty or not. This will make bool
formatting work as expected, even when using things like the field
width specifiers.

There's a slight backwards compatibility risk even with that more
conservative change, though - attempting to use numeric formatting
codes with affected subtypes will now throw an exception, and for
codes common to numbers and strings the output will change :(

>>> format(True, "10")
'         1'
>>> format(str(True), "10")
'True      '
>>> format(True, "10,")
'         1'
>>> format(str(True), "10,")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Cannot specify ',' with 's'.

So we may just have to live with the wart and tell people to always
override both to ensure consistent behaviour :(

> I still think the best thing to do is implement __format__ for IntEnum,
> and there implement whatever behavior is decided. I don't think changing
> the meaning of existing objects (specifically int here) is a good course
> of action.

I don't think changing the processing of int is proposed - just the
handling of subclasses. However, since it seems to me that "make bool
formatting work consistently" is a more conservative change (if we
change the base class behaviour at all), then Enum subclasses will
need __format__ defined regardless.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Python-Dev mailing list