Enum and the Standard Library (and __str__ and __repr__)
TL;DR Changes may be coming to Enum str() and repr() -- your (informed) opinion requested. :-) Python-Dev thread [0], summary below:
As you may have noticed, Enums are starting to pop up all over the stdlib [1].
To facilitate transforming existing module constants to IntEnums there is `IntEnum._convert_`. In Issue36548 [2] Serhiy modified the __repr__ of RegexFlag:
    >>> import re
    >>> re.I
    re.IGNORECASE
I think that looks nice for converted constants. For anyone who wants the actual value, it is of course available as the `.value` attribute:
    >>> re.I.value
    2
I'm looking for arguments relating to:
- should _convert_ make the default __repr__ be module_name.member_name?
- should _convert_ make the default __str__ be the same, or be the numeric value?
After discussions with Guido I made a (largely done) PR [3] which:

for stdlib global constants (such as RE)
- repr() -> uses `module.member_name`
- str() -> uses `member_name`

for stdlib non-global constants, and enums in general
- repr() -> uses `class.member_name`
- str() -> uses `member_name`

The questions I would most appreciate an answer to at this point:

- do you think the change has merit?
- why /shouldn't/ we make the change?

As a reminder, the underlying issue is trying to keep at least the stdlib Enum representations the same for those that are replacing preexisting constants.

--
~Ethan~

[0] https://mail.python.org/archives/list/python-dev@python.org/message/CHQW6THT...
[1] I'm working on making their creation faster. If anyone wanted to convert EnumMeta to C I would be grateful.
[2] https://bugs.python.org/issue36548
[3] https://github.com/python/cpython/pull/22392
On Wed, Nov 4, 2020 at 6:11 AM Ethan Furman <ethan@stoneleaf.us> wrote:
TL;DR Changes may be coming to Enum str() and repr() -- your (informed) opinion requested. :-)
After discussions with Guido I made a (largely done) PR [3] which:
for stdlib global constants (such as RE)
- repr() -> uses `module.member_name`
- str() -> uses `member_name`

for stdlib non-global constants, and enums in general
- repr() -> uses `class.member_name`
- str() -> uses `member_name`
The questions I would most appreciate an answer to at this point:
- do you think the change has merit?
- why /shouldn't/ we make the change?
Does this affect my own enums too, or just stdlib ones? I'm not entirely sure on that point. Specifically, will code like this be affected, and if so, what is the correct way to be compatible with multiple versions?

    from enum import IntFlag, auto

    class UF(IntFlag):
        SALLY = auto()
        PHASEPING = auto()
        ...

    for flag in UF:
        print("#define %s %d" % (str(flag), int(flag)), file=f)

Currently, str(UF.SALLY) is "UF.SALLY", but this would change. I'm guessing the recommendation is "don't do that then"? (For instance, using flag.name to get "SALLY", and type(flag).__name__ to get "UF", since these flags won't only come from a single class.)

I'm fine with the change, but I tend to use the stdlib enums as magic tokens rather than numbers (I don't care one iota that re.I is 2), so any clean repr will work just as well for me. +0.9.

ChrisA
On 11/3/20 11:26 AM, Chris Angelico wrote:
On Wed, Nov 4, 2020 at 6:11 AM Ethan Furman wrote:
TL;DR Changes may be coming to Enum str() and repr() -- your (informed) opinion requested. :-)
Does this affect my own enums too, or just stdlib ones? I'm not entirely sure on that point.
That is the primary question under discussion. Unless somebody has a compelling reason not to change the stdlib enums, that is going to happen.
Specifically, will code like this be affected, and if so, what is the correct way to be compatible with multiple versions?
    from enum import IntFlag, auto

    class UF(IntFlag):
        SALLY = auto()
        PHASEPING = auto()
        ...

    for flag in UF:
        print("#define %s %d" % (str(flag), int(flag)), file=f)
Currently, str(UF.SALLY) is "UF.SALLY", but this would change. I'm guessing the recommendation is "don't do that then"? (For instance, using flag.name to get "SALLY", and type(flag).__name__ to get "UF", since these flags won't only come from a single class.)
Assuming the change is made for all Enum, `str(UF.SALLY)` would produce `SALLY`. If that is a common pattern for you then you could make your own base class and inherit from that:

    class C_Enum(Enum):
        def __str__(self):
            return f"{self.__class__.__name__}.{self._name_}"

--
~Ethan~
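The other route Chris mentions, avoiding str() and building the text from `flag.name` and `type(flag).__name__`, gives the same output whichever way the str/repr question is settled. The following is a minimal sketch under that assumption: `define_lines` and `qualified_name` are hypothetical helper names, and the two-member `UF` class only mirrors the example from the question.

```python
from enum import IntFlag, auto


class UF(IntFlag):
    SALLY = auto()
    PHASEPING = auto()


def define_lines(flag_class):
    """Build '#define NAME VALUE' lines without calling str() on a member."""
    # .name and int() do not depend on how __str__ or __repr__ are defined,
    # so the result is unaffected if the Enum string formats change.
    return [f"#define {flag.name} {int(flag)}" for flag in flag_class]


def qualified_name(flag):
    """Spell out 'Class.MEMBER' explicitly instead of relying on str()."""
    return f"{type(flag).__name__}.{flag.name}"


if __name__ == "__main__":
    print("\n".join(define_lines(UF)))
    print(qualified_name(UF.SALLY))
```

Because nothing here goes through str() or repr(), the same code should behave identically before and after the proposed change.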
On Tue, Nov 03, 2020 at 11:10:37AM -0800, Ethan Furman wrote:
After discussions with Guido I made a (largely done) PR [3] which:
for stdlib global constants (such as RE)
- repr() -> uses `module.member_name`
- str() -> uses `member_name`
What's `RE`? Do you mean the enums in the re module, `re.I` etc?
for stdlib non-global constants, and enums in general
- repr() -> uses `class.member_name`
- str() -> uses `member_name`
How does that work? If I do this:

    class MyEnum(Enum):
        RED = 'red'

    RED = MyEnum.RED

is that a global constant or a non-global constant? How do I hook into whatever magic is used to decide one way or the other?
The questions I would most appreciate an answer to at this point:
- do you think the change has merit?
- why /shouldn't/ we make the change?
It's not clear to me what the change actually is. You explained what the behaviour is being changed to but not what it is being changed from. My quick tests are inconclusive:

re.I has repr and str both return 're.IGNORECASE'. Will that change?

MyEnum.RED above has repr "<MyEnum.RED: 'red'>" and str 'MyEnum.RED'. Presumably that will change to 'MyEnum.RED' and 'RED' respectively.

--
Steve
On 11/3/20 3:58 PM, Steven D'Aprano wrote:
On Tue, Nov 03, 2020 at 11:10:37AM -0800, Ethan Furman wrote:
After discussions with Guido I made a (largely done) PR [3] which:
for stdlib global constants (such as RE)
- repr() -> uses `module.member_name`
- str() -> uses `member_name`
What's `RE`? Do you mean the enums in the re module, `re.I` etc?
Yes, re.RegexFlag.
for stdlib non-global constants, and enums in general
- repr() -> uses `class.member_name`
- str() -> uses `member_name`
How does that work? If I do this:
    class MyEnum(Enum):
        RED = 'red'

    RED = MyEnum.RED
is that a global constant or a non-global constant?
Up to you to decide. ;-)

RegexFlag is a bit of a special case, so we'll look at socket instead. Before Enum, socket had global constants such as

    AF_INET
    AF_UNIX
    SOCK_STREAM
    SOCK_DGRAM
    SOCK_RAW

then Enum came along and:

    IntEnum._convert_(
            'AddressFamily',
            __name__,
            lambda C: C.isupper() and C.startswith('AF_'))

    IntEnum._convert_(
            'SocketKind',
            __name__,
            lambda C: C.isupper() and C.startswith('SOCK_'))

which replaces the existing AF_* and SOCK_* constants with enums instead.
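The idea behind that conversion can be illustrated without touching the private `_convert_` method. What follows is only a sketch of the concept, not the stdlib implementation: `convert_constants` is a hypothetical helper, and `fake_module` is a plain dict standing in for a module's globals, with illustrative values.

```python
from enum import IntEnum


def convert_constants(enum_name, namespace, predicate):
    """Replace matching integer constants in `namespace` with IntEnum members."""
    # Collect the matching constants first, so the namespace is not
    # mutated while it is being iterated.
    selected = {
        name: value
        for name, value in namespace.items()
        if isinstance(value, int) and predicate(name)
    }
    enum_class = IntEnum(enum_name, selected)
    # Rebind each constant name to the corresponding enum member.
    namespace.update(enum_class.__members__)
    namespace[enum_name] = enum_class
    return enum_class


# Stand-in for a module's globals; names and values are illustrative only.
fake_module = {"AF_UNIX": 1, "AF_INET": 2, "SOCK_STREAM": 1, "timeout": 30}

AddressFamily = convert_constants(
    "AddressFamily",
    fake_module,
    lambda name: name.isupper() and name.startswith("AF_"),
)

if __name__ == "__main__":
    # The old name now refers to an enum member that still compares equal to 2.
    print(fake_module["AF_INET"] is AddressFamily.AF_INET)
    print(fake_module["AF_INET"] == 2)
```

Since the members are IntEnum instances, existing code that compares the constants against plain integers keeps working; only their str() and repr() differ, which is the subject of this thread.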
How do I hook into whatever magic is used to decide one way or the other?
The only magic involved is choosing an appropriate `__str__` and `__repr__`. In 3.10, Enum._convert_ will choose a "global" `__str__` and `__repr__`.
The questions I would most appreciate an answer to at this point:
- do you think the change has merit?
- why /shouldn't/ we make the change?
It's not clear to me what the change actually is. You explained what the behaviour is being changed to but not what it is being changed from.
My apologies.

For `socket` (where Enum is replacing preexisting global constants):

    str(socket.AF_INET)    # from "AddressFamily.AF_INET" to "AF_INET"
    repr(socket.AF_INET)   # from "<AddressFamily.AF_INET: 2>" to "socket.AF_INET"

For `uuid`, which has a new Enum, SafeUUID (no preexisting global constants):

    str(SafeUUID.safe)     # from "SafeUUID.safe" to "safe"
    repr(SafeUUID.unsafe)  # from "<SafeUUID.unsafe: -1>" to "SafeUUID.unsafe"
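To try out the proposed "enums in general" formats on any current Python, one option is a small base class that overrides both methods. This is a sketch only: `ProposedStyleEnum` and `Safety` are hypothetical names, with `Safety` loosely modelled on the SafeUUID example above, and nothing here reflects how the PR itself implements the change.

```python
from enum import Enum


class ProposedStyleEnum(Enum):
    """Hypothetical base class imitating the proposed formats."""

    def __repr__(self):
        # Proposed repr for enums in general: class.member_name
        return f"{type(self).__name__}.{self.name}"

    def __str__(self):
        # Proposed str: member_name only
        return self.name


class Safety(ProposedStyleEnum):
    safe = 0
    unsafe = -1


if __name__ == "__main__":
    print(repr(Safety.unsafe))
    print(str(Safety.safe))
    # The underlying value stays available through .value
    print(Safety.unsafe.value)
```

Defining the methods on a member-less base class keeps the choice of format in one place, which is the same mechanism as the C_Enum suggestion earlier in the thread, applied in the opposite direction.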
re.I has repr and str both return 're.IGNORECASE'. Will that change?
Yes -- the str would change to 'IGNORECASE'.
MyEnum.RED above has repr "<MyEnum.RED: 'red'>" and str 'MyEnum.RED'. Presumably that will change to 'MyEnum.RED' and 'RED' respectively.
Assuming that regular Enum is changed (not just stdlib enums), then yes.

--
~Ethan~