str() vs format(): trivia question

urllib.parse.urlencode currently uses `str()` on its non-bytes objects before encoding the result. This causes a compatibility break when integer module constants are converted to IntEnum, as `str(IntEnum.MEMBER)` no longer returns the integer representation; however, `format()` does still return the integer representation. The fix is to add a separate branch to check if the argument is an Enum, and use the value if so -- but it got me wondering: in general, are there differences between calling str() vs calling format() on Python objects? -- ~Ethan~
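A minimal sketch of the kind of branch described above. The helper name `to_query_text` and the `Signal` enum are hypothetical, made up for illustration; this is not the actual urllib code, only the idea of unwrapping an Enum member to its value before calling `str()`.

```python
from enum import Enum, IntEnum


class Signal(IntEnum):
    # Hypothetical stand-in for an integer module constant converted to IntEnum.
    SIGINT = 2


def to_query_text(value):
    """Hypothetical helper: text form of a non-bytes query value."""
    if isinstance(value, Enum):
        # Use the underlying value so the result does not depend on Enum.__str__.
        value = value.value
    return str(value)


print(to_query_text(Signal.SIGINT))  # 2
print(to_query_text(42))             # 42
print(to_query_text("spam"))         # spam
```

With a branch like this, the output stays the same whether or not the enum's `__str__` returns the symbolic name.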

On 4/20/2021 10:56 AM, Ethan Furman wrote:
If there's no format string, then object___format___impl() just calls PyObject_Str(). So unless the object overrides __format__, they're the same. int does override __format__, in _PyLong_FormatAdvancedWriter(). It has a comment that says: /* check for the special case of zero length format spec, make it equivalent to str(obj) */ So even in the case of int, str(obj) should be the same as format(obj). I thought PEP 3101 specified this behavior, but I don't see it there. It's hopelessly out of date, anyway. The docs for format() say these weasel words: "The default format_spec is an empty string which usually gives the same effect as calling str(value) <https://docs.python.org/3/library/stdtypes.html#str>." They're vague because a user-defined type could do anything in __format__, including ignore this advice. But I think all of the built-in types conform to it. Eric
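A quick way to see both halves of this, as a rough sketch: a handful of built-in values for which `format(obj)` and `str(obj)` agree, followed by a deliberately contrary user-defined class. The `Contrary` class is hypothetical and exists only to show that nothing forces `__format__` to follow `__str__`.

```python
# A few built-in values; format() with no spec should match str() for each.
samples = [0, -17, 3.5, 1 + 2j, True, None, "text", (1, 2), [1, 2], {"k": 1}]

for obj in samples:
    assert format(obj) == str(obj), obj


class Contrary:
    """Hypothetical type that ignores the 'empty spec means str()' advice."""

    def __str__(self):
        return "from __str__"

    def __format__(self, format_spec):
        return "from __format__"


c = Contrary()
print(str(c))     # from __str__
print(format(c))  # from __format__
print(f"{c}")     # from __format__
```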

On 4/20/2021 11:13 AM, Eric V. Smith wrote:
I should mention that if you're going to implement __format__ and you don't care about the format specifier, then I'd do what object.__format__ does and raise an error for any non-empty format specifier. That way you can add a format specifier in the future and not worry that people are relying on passing in arbitrary format specs, which is a backward compatibility problem. That's why the error case was added to object.__format__: see https://bugs.python.org/issue7994 . Eric
-- Eric V. Smith
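A small sketch of that advice, assuming a hypothetical `Temperature` class that has no format specifiers of its own yet. It accepts only the empty spec and rejects everything else with a TypeError, which is also what `object.__format__` does.

```python
class Temperature:
    """Hypothetical type that does not define any format specifiers yet."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return f"{self.degrees} degrees"

    def __format__(self, format_spec):
        if format_spec:
            # Reject anything non-empty so a real spec can be added later
            # without breaking callers who passed arbitrary strings.
            raise TypeError(
                f"unsupported format string passed to {type(self).__name__}.__format__"
            )
        return str(self)


t = Temperature(21)
print(format(t))  # 21 degrees

try:
    format(t, ">10")
except TypeError as exc:
    print("rejected:", exc)
```

Because every non-empty spec fails today, giving a spec a meaning later cannot silently change output that someone already depends on.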

I'd guess it is totally up to the object, since str() calls `__str__` and format() calls `__format__`. Of course this now begs the question whether those enums should perhaps change their `__format__` to match their `__str__`...? But that would not suit your purpose. Then again, how would one get the pretty IntEnum-specific representation in a format- or f-string? I guess f"{flag!s}" would work. On Tue, Apr 20, 2021 at 7:59 AM Ethan Furman <ethan@stoneleaf.us> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 4/20/21 8:46 AM, Guido van Rossum wrote:
I'd guess it is totally up to the object, since str() calls `__str__` and format() calls `__format__`. Of course this now begs the question whether those enums should perhaps change their `__format__` to match their `__str__`...?
When Enum was designed we made sure and captured `__repr__` and `__str__`, but not `__format__`. So at this point, `__format__` is the mixed-in data type's -- so `int.__format__` in the case of IntEnum. However, if a user updates the `__str__` of their Enum, then that will be used in the format:

```python
from enum import IntEnum

class Color(IntEnum):
    RED = 1

format(Color.RED)  # '1'

class Color(IntEnum):
    RED = 1
    def __str__(self):
        return 'one'

format(Color.RED)  # 'one'
```
Yup, that does work. There is at least one user who is depending on `format()` using `int.__format__` because they filed a bug report when I broke it. Moving forward, I'm not sure having format() and str() ever be different is a good idea, especially since users who need, for example, Color.RED to be '1' can simply add a `__str__ = int.__str__` to their own custom base IntEnum class and be good to go. If we deprecate the current behavior now we could change it in 3.12. Thoughts? -- ~Ethan~
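A sketch of the "custom base IntEnum class" idea, with one assumption worth flagging: as far as I can tell, in CPython `int` does not define its own `__str__` (it inherits `object.__str__`, which falls back to `repr()`), so this sketch assigns `int.__repr__` rather than `int.__str__`. The class name `ValueStrIntEnum` is hypothetical.

```python
from enum import IntEnum


class ValueStrIntEnum(IntEnum):
    """Hypothetical base class whose members print as their integer value."""

    # Assumption: int.__repr__ is used because int inherits __str__ from object.
    __str__ = int.__repr__


class Color(ValueStrIntEnum):
    RED = 1


print(str(Color.RED))  # 1
print(repr(Color.RED))
```

Only `__str__` is changed here; `repr()` still goes through the Enum machinery.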

It has always bugged me that for Enums mixed in with int or str (a common pattern in my code), `f"{MyEnum.X}"` is not the same as `str(MyEnum.X)`. I'd be happy to see it changed!

On Tue, Apr 20, 2021 at 11:12 AM Ethan Furman <ethan@stoneleaf.us> wrote:
So to be clear, that one user wants f"{Color.RED}" to return "1" and not " Color.RED" (or something like that). And you want f"{Color.RED}" and str(Color.RED) to return the same value. Then together that means that str(Color.RED) must also return "1". Did I get that right? And are you happy with that outcome?
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 4/20/21 12:01 PM, Guido van Rossum wrote:
On Tue, Apr 20, 2021 at 11:12 AM Ethan Furman wrote:
Almost right. They should both return `Color.RED`. Any users who want something different will need to do some work on their end:

    class MyIntEnum(IntEnum):
        __format__ = int.__format__

    class Color(MyIntEnum):
        RED = 1

    format(Color.RED)  # '1'

The deprecation period will give that user, and others like them, time to add their own Enum base classes with the `__format__` method they desire. -- ~Ethan~

20.04.21 22:01, Guido van Rossum wrote:
So to be clear, that one user wants f"{Color.RED}" to return "1" and not " Color.RED" (or something like that).
The user should write f"{int(Color.RED)}" or f"{Color.RED.value}". I also have an idea to support additional conversion characters, so the user could write f"{Color.RED!i}". Opened a discussion for this on Python-ideas. https://mail.python.org/archives/list/python-ideas@python.org/thread/3AALXB6...
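For reference, a short sketch of the spellings that already work today (the proposed `!i` conversion is not shown, since it is only an idea at this point). The `Color` enum is the same toy example used elsewhere in this thread.

```python
from enum import IntEnum


class Color(IntEnum):
    RED = 1


# Explicitly ask for the integer, independent of __str__ / __format__.
print(f"{int(Color.RED)}")   # 1
print(f"{Color.RED.value}")  # 1

# The existing !s and !r conversions apply str() and repr() before formatting.
print(f"{Color.RED!s}" == str(Color.RED))   # True
print(f"{Color.RED!r}" == repr(Color.RED))  # True
```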

20.04.21 17:56, Ethan Furman wrote:
format() without a format specifier and str() should return the same value in general, otherwise it will confuse users. But str() for an enum should in general return a symbolic name, not the attached value, which is an implementation detail. This is the purpose of enums. I think that __format__ should return the same as __str__ by default. This can break some user code, but a custom __str__ can break it as well. Some breakage is inevitable when we convert some constants in the stdlib to enums. If users want to get an integer representation of an enum member, they should use IntEnum.MEMBER.value or int(IntEnum.MEMBER).

participants (6): Brandt Bucher, Eric V. Smith, Ethan Furman, Guido van Rossum, MRAB, Serhiy Storchaka