[Python-Dev] format, int, and IntEnum

Eric V. Smith eric at trueblade.com
Thu Aug 15 19:37:17 CEST 2013


On 08/15/2013 11:06 AM, Nick Coghlan wrote:
> On 15 August 2013 05:03, Eric V. Smith <eric at trueblade.com> wrote:
>> On 8/15/2013 12:27 AM, Nick Coghlan wrote:
>>> I think Eric is overinterpreting the spec, there. While that particular
>>> sentence requires that the empty format string will be equivalent to a
>>> plain str() operation for builtin types, it is only a recommendation for
>>> other types. For enums, I believe they should be formatted like their
>>> base types (so !s and !r will show the enum name, anything without
>>> coercion will show the value) .
>>
>> I don't think I'm over-interpreting the spec (but of course I'd say
>> that!). The spec is very precise on the meaning of "format specifier":
>> it means the entire string (the second argument to __format__). I'll
>> grant that in the sentence in question it uses "format specification",
>> not "format specifier", though.
> 
> By overinterpreting, I meant interpreting that part to mean literally
> calling str() when the specifier was empty, but not calling it
> otherwise. That's not what it means: it means that "str(x)" and
> "format(x)" will typically produce the *same result*, not that format
> will actually call str. If a subclass overrides __str__ but not
> __format__ (or vice-versa), then they can and *should* diverge.
> 
> So, ideally, we would be more consistent, and either *always* call
> str() for subclasses when no type specifier is given, or *never* call
> it and always use the value.
> 
> In this case, for integers, a missing type specifier for numeric types
> is defined as meaning "d", so calling str() on subclasses is wrong -
> it should be using the value, not the string representation. So "no
> type specifier" + "int subclass" *should* have meant calling int(self)
> prior to formatting rather than str(self), and subtypes look bool
> would have needed to override both __str__ and __format__.

I'm pretty sure that when I implemented this we went over this point in
extreme detail, and that we specifically wanted it to apply to an empty
"format specifier", not an empty "type specifier" (which the PEP and the
documentation call a "presentation type"). Your interpretation might
make more sense, but it's not what we discussed at the time (to my
recollection), and I think it's too late to change now, unfortunately.

> The problem now is that changing this behaviour in the base class
> would break subclasses like bool that expect format(x) to produce the
> same result as str(x) even though they have overridden __str__ but not
> __format__.
> 
> So, I think the path to consistency here needs to be that, if a
> builtin subclass overrides __str__ without overriding __format__, then
> the formatting result will *always* be "format(str(x), fmtspec)",
> regardless of whether fmtspec is empty or not. This will make bool
> formatting work as expected, even when using things like the field
> width specifiers.
> 
> There's a slight backwards compatibility risk even with that more
> conservative change, though - attempting to use numeric formatting
> codes with affected subtypes will now throw an exception, and for
> codes common to numbers and strings the output will change :(
> 
>>>> format(True, "10")
> '         1'
>>>> format(str(True), "10")
> 'True      '
>>>> format(True, "10,")
> '         1'
>>>> format(str(True), "10,")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: Cannot specify ',' with 's'.
> 
> So we may just have to live with the wart and tell people to always
> override both to ensure consistent behaviour :(

I think that's true.

>> I still think the best thing to do is implement __format__ for IntEnum,
>> and there implement whatever behavior is decided. I don't think changing
>> the meaning of existing objects (specifically int here) is a good course
>> of action.
> 
> I don't think changing the processing of int is proposed - just the
> handling of subclasses. However, since it seems to me that "make bool
> formatting work consistently" is a more conservative change (if we
> change the base class behaviour at all), then Enum subclasses will
> need __format__ defined regardless.

Agreed. And I think the real discussion here needs to be: what should
__format__ for IntEnum (or maybe Enum) do? Is not being consistent with
what int.__format__ accepts okay? (For example, "+" or "10,".) Or does
being a "drop-in replacement for int" mean that any format string for
int must also work for IntEnum? As I've said, I think breaking a few
format strings is okay, but of course it's debatable.

Eric.



More information about the Python-Dev mailing list