As you may have noticed, Enums are starting to pop up all over the stdlib [1]. To facilitate transforming existing module constants to IntEnums there is `IntEnum._convert_`. In Issue36548 [2] Serhiy modified the __repr__ of RegexFlag:
    >>> import re
    >>> re.I
    re.IGNORECASE
I think that looks nice for converted constants. For anyone who wants the actual value, it is of course available as the `.value` attribute:
    >>> re.I.value
    2
I'm looking for arguments relating to:

- should _convert_ make the default __repr__ be module_name.member_name?
- should _convert_ make the default __str__ be the same, or be the numeric value?

Thank you for your time!

--
~Ethan~

[1] I'm working on making their creation faster. If anyone wanted to convert EnumMeta to C I would be grateful.
[2] https://bugs.python.org/issue36548
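For readers who want to check the behaviour on their own interpreter, here is a minimal sketch using only the standard `re` module. The `str()` result is printed rather than asserted, since it has differed between Python releases; the `repr()` shown assumes Python 3.8 or later.

```python
import re

flag = re.I  # re.I is an alias of re.IGNORECASE

# repr() of this converted constant reads like the module-level name.
print(repr(flag))

# str() has changed across releases, so it is only displayed here.
print(str(flag))

# The underlying integer stays available either way.
print(flag.value)
print(int(flag))
```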
Okay, let me take a shot at this.

I actually like the status quo for regular enums, when repr() shows the type, name and value, and str() shows "classname.flagname", so I'd stick to that for converted flags. Even though this violates the rule of thumb that repr() should look like a valid expression -- perhaps a stronger rule of thumb is that repr() should show more than str().

Showing just (the str of) the value seems unkind, since e.g. showing '4' makes me think it's just an int. (Then again I can see that for *converted* flags that's not unreasonable.)

But yeah, backwards compatibility. However, I don't think we got any complaints about the `re` flags, did we?

On Fri, Sep 18, 2020 at 2:53 PM Ethan Furman <ethan@stoneleaf.us> wrote:
[...]
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CHQW6THT...
Code of Conduct: http://python.org/psf/codeofconduct/
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 9/18/20 3:12 PM, Guido van Rossum wrote:
On 9/18/20 2:44 PM, Ethan Furman wrote:
I'm looking for arguments relating to:
- should _convert_ make the default __repr__ be module_name.member_name?
I actually like the status quo for regular enums, when repr() shows the type, name and value, and str() shows "classname.flagname", so I'd stick to that for converted flags. Even though this violates the rule of thumb that repr() should look like a valid expression -- perhaps a stronger rule of thumb is that repr() should show more than str().
Well, if the repr is re.IGNORECASE and the str is 2, then we've met that bar. ;-)
- should _convert_ make the default __str__ be the same, or be the numeric value?
Showing just (the str of) the value seems unkind, since e.g. showing '4' makes me think it's just an int. (Then again I can see that for *converted* flags that's not unreasonable.)
But yeah, backwards compatibility. However, I don't think we got any complaints about the `re` flags, did we?
The only complaints I'm aware of were before the re constants became an Enum, but my social media activity consists almost entirely of Stackoverflow.

So at this point, I think the choices are:

Standard Enum

    __repr__                     __str__
    <RegexFlag.IGNORECASE: 2>    RegexFlag.IGNORECASE

and

Modified Converted Constant

    __repr__                     __str__
    re.IGNORECASE                2

I will admit I fancy the MCC variant more, but we should make a choice and then be consistent.

--
~Ethan~
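The two choices can be compared side by side with a small sketch. It uses plain `Enum` classes rather than the real `RegexFlag` so that the output does not depend on the Python version; `StandardFlag`, `ConvertedFlag` and the module name `mymod` are made up for illustration and are not part of the stdlib.

```python
from enum import Enum


class StandardFlag(Enum):
    """Default Enum behaviour: repr shows type, name and value."""
    IGNORECASE = 2


class ConvertedFlag(Enum):
    """Hypothetical 'Modified Converted Constant' behaviour."""
    IGNORECASE = 2

    def __repr__(self):
        # "mymod" is a made-up module name standing in for e.g. "re".
        return f"mymod.{self.name}"

    def __str__(self):
        # The MCC variant prints only the underlying value.
        return str(self.value)


for member in (StandardFlag.IGNORECASE, ConvertedFlag.IGNORECASE):
    print(repr(member), "|", str(member))
```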
On Fri, Sep 18, 2020 at 6:19 PM Ethan Furman <ethan@stoneleaf.us> wrote:
The only complaints I'm aware of were before the re constants became an Enum, but my social media activity consists almost entirely of Stackoverflow.
:-) So at this point, I think the choices are:
Standard Enum

    __repr__                     __str__
    <RegexFlag.IGNORECASE: 2>    RegexFlag.IGNORECASE

and

Modified Converted Constant

    __repr__                     __str__
    re.IGNORECASE                2
I will admit I fancy the MCC variant more, but we should make a choice and then be consistent.
Hm, there's also what re actually does (tried in 3.8, 3.9 and 3.10):

    >>> import re
    >>> print(str(re.I))
    RegexFlag.IGNORECASE
    >>> print(repr(re.I))
    re.IGNORECASE
I honestly think we've already lost consistency.
Possibly regular enums (Enum, IntEnum, IntFlag) should just all return "class.name", e.g. 'Color.RED', for both str() and repr(), and "converted" enums should return "module.name", e.g. 're.IGNORE' for both? It restores the rule of thumb, and it's not unusual. Maybe it's a new trend -- PEP 585's list[int] returns "list[int]" for both str() and repr(). :-)

At the same time it's as old as Python -- for most builtins other than strings, repr() and str() are the same, and modeled after repr(). Historically, I only introduced the difference between str() and repr() because of strings -- I wanted the REPL to clearly show the difference between the number 42 and the string '42', but at the same time I wanted both to print as just '42'. Of course numpy took a different fork in that road...
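The distinction between the number 42 and the string '42' described above is easy to see with builtins alone; this short sketch shows that both print identically while only repr() tells them apart.

```python
number = 42
text = "42"

# print() uses str(), so both look the same to an end user.
print(number, text)

# repr() keeps the quotes, so the REPL can tell them apart.
print(repr(number), repr(text))
```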
Another brainstorm (or brainfart): maybe repr() should show the module/class and the name, and str() should only show the name. We'd then get

    # Mock-up!
    >>> print(str(re.i))
    IGNORE
    >>> print(repr(re.i))
    re.IGNORE

and similar for Color.RED:

    # Another mock-up!
    >>> print(str(Color.RED))
    RED
    >>> print(repr(Color.RED))
    Color.RED
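The mock-up for Color.RED can be approximated today with a small base class. This is only a sketch of the idea, not how the enum module behaves by default; `ShortStrEnum` is a hypothetical name introduced here.

```python
from enum import Enum


class ShortStrEnum(Enum):
    """Hypothetical base: str() is the bare name, repr() is Class.NAME."""

    def __repr__(self):
        return f"{type(self).__name__}.{self.name}"

    def __str__(self):
        return self.name


class Color(ShortStrEnum):
    RED = 1
    GREEN = 2


print(str(Color.RED))   # bare member name
print(repr(Color.RED))  # class-qualified name
```

Because the two methods live on a member-less base class, any enum that inherits from it picks up the same behaviour.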
On 9/18/20 6:41 PM, Guido van Rossum wrote:
Hm, there's also what re actually does (tried in 3.8, 3.9 and 3.10):

    >>> import re
    >>> print(str(re.I))
    RegexFlag.IGNORECASE
    >>> print(repr(re.I))
    re.IGNORECASE
Well, the `str(re.I) == "RegexFlag.IGNORECASE"` is due to a bug I just fixed a couple days ago. The intent, according to a comment in the issue, was that str() and repr() would return the same thing, "re.IGNORECASE".
I honestly think we've already lost consistency.
I counted roughly 25 Enums in the stdlib at this point, and only two of them have modified reprs or strs; and one of those is an internal class. It's worth noting that two others are buggy -- one is being fancy with values and didn't get the custom __new__ correct, the other left in some commas when converting from its original data type. Most of those 25 were created via [Enum|IntEnum]._convert_ .
Possibly regular enums (Enum, IntEnum, IntFlag) should just all return "class.name", e.g. 'Color.RED', for both str() and repr(), and "converted" enums should return "module.name", e.g. 're.IGNORE' for both? It restores the rule of thumb, and it's not unusual. Maybe it's a new trend -- PEP 585's list[int] returns "list[int]" for both str() and repr(). :-)
We could certainly go that route. The value is readily available, even if not printed by default.
At the same time it's as old as Python -- for most builtins other than strings, repr() and str() are the same, and modeled after repr(). Historically, I only introduced the difference between str() and repr() because of strings -- I wanted the REPL to clearly show the difference between the number 42 and the string '42', but at the same time I wanted both to print as just '42'.
Thank you for that -- it's an invaluable debugging tool which I have been grateful for many times.
Another brainstorm (or brainfart): maybe repr() should show the module/class and the name, and str() should only show the name. We'd then get

    # Mock-up!
    >>> print(str(re.i))
    IGNORE
    >>> print(repr(re.i))
    re.IGNORE

and similar for Color.RED:

    # Another mock-up!
    >>> print(str(Color.RED))
    RED
    >>> print(repr(Color.RED))
    Color.RED
I think that's too terse -- the first bit, whether class or module, repr or str, is very important -- especially if you have several enums using some of the same names for their members. -- ~Ethan~
Ethan Furman writes:
I counted roughly 25 Enums in the stdlib at this point, and only two of them have modified reprs or strs; and one of those is an internal class. It's worth noting that two others are buggy -- one is being fancy with values and didn't get the custom __new__ correct, the other left in some commas when converting from its original data type.
+1 for consistency and documentation of guidelines. I have no opinion on what, if anything, to change to get consistency in the stdlib implementations.
On Sat, Sep 19, 2020 at 1:29 AM Ethan Furman <ethan@stoneleaf.us> wrote:
I think that's too terse -- the first bit, whether class or module, repr or str, is very important -- especially if you have several enums using some of the same names for their members.
Hm, the more I think about it the more I like this proposal. :-)

When exploring in the REPL, my proposal *would* show the class name (but not the module name -- one can cause obfuscation that way too, but it would become unwieldy, and the custom seems to be to stop short of that).

But when printing casually, wouldn't it be nice if we could cause end-user-friendly output to be produced by default? End users probably don't care about the class name, but they would care about the color name. E.g.

    from enum import Enum

    class Color(Enum):
        red = 0
        green = 1
        blue = 2

    class Flowers(Enum):
        roses = 0
        violets = 1

    def send_bouquet(flowers, color):
        print("I'm sending you a bouquet of", color, flowers)

Looking over the stdlib enums, these usually already have their "kind" encoded in the name, e.g. AF_INET, SOCK_STREAM, SIGINT. And the re flags are pretty unique as well (IGNORE, MULTILINE, DOTALL).
On 9/19/20 11:32 AM, Guido van Rossum wrote:
But when printing casually, wouldn't it be nice if we could cause end-user-friendly output to be produced by default? End users probably don't care about the class name, but they would care about the color name. E.g.
    from enum import Enum

    class Color(Enum):
        red = 0
        green = 1
        blue = 2

    class Flowers(Enum):
        roses = 0
        violets = 1

    def send_bouquet(flowers, color):
        print("I'm sending you a bouquet of", color, flowers)
That's a fun example -- but member names are supposed to be UPPER_CASE, so your string would be, for example:

    I'm sending you a bouquet of RED VIOLETS

Of course, if we went with the idea of __str__ just returning the value, then:

    from enum import Enum, auto

    class Color(Enum):
        RED = 'red'
        GREEN = 'green'
        BLUE = 'blue'

    class Flowers(Enum):
        ROSES = 'roses'
        VIOLETS = 'violets'

which would indeed give us:

    I'm sending you a bouquet of red violets

and the above enums could be even simpler if using enum.auto() and a custom NamedEnum:

    class NamedEnum(Enum):
        """
        member values are the lower-cased member names
        """
        def _generate_next_value_(name, start, count, last_values):
            return name.lower()

    class Flowers(NamedEnum):
        ROSES = auto()
        VIOLETS = auto()

--
~Ethan~
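Putting the pieces of this message together, the following is a runnable sketch of the NamedEnum idea combined with a __str__ that returns the value. The __str__ override and the `bouquet_message` helper are assumptions added for illustration; the default Enum __str__ does not behave this way.

```python
from enum import Enum, auto


class NamedEnum(Enum):
    """Member values are the lower-cased member names."""

    def _generate_next_value_(name, start, count, last_values):
        # Called by auto(); must be defined before any members.
        return name.lower()

    def __str__(self):
        # Assumption for this sketch: str() shows the value only.
        return self.value


class Color(NamedEnum):
    RED = auto()
    GREEN = auto()
    BLUE = auto()


class Flowers(NamedEnum):
    ROSES = auto()
    VIOLETS = auto()


def bouquet_message(flowers, color):
    # str() is called explicitly so the custom __str__ is what gets used.
    return "I'm sending you a bouquet of " + str(color) + " " + str(flowers)


print(bouquet_message(Flowers.VIOLETS, Color.RED))
```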
That would be more palatable if it wasn't so common to use manually assigned numerical values (as most of the examples in the enum module docs do) or the default auto().

I'm just trying to present an argument that if the str() of an enum was its name and the repr() was its "full name" (at least including the class) that would be pretty sweet.

On Sat, Sep 19, 2020 at 1:02 PM Ethan Furman <ethan@stoneleaf.us> wrote:
[...]
On 9/19/20 2:08 PM, Guido van Rossum wrote:
I'm just trying to present an argument that if the str() of an enum was its name and the repr() was its "full name" (at least including the class) that would be pretty sweet.
Well, we're still early enough in the 3.10 cycle we can make that change and see if anybody hollers. ;-) -- ~Ethan~
On Sat, Sep 19, 2020 at 11:44 AM Guido van Rossum <guido@python.org> wrote:
Another brainstorm (or brainfart): maybe repr() should show the module/class and the name, and str() should only show the name. We'd then get

    # Mock-up!
    >>> print(str(re.i))
    IGNORE
    >>> print(repr(re.i))
    re.IGNORE

and similar for Color.RED:

    # Another mock-up!
    >>> print(str(Color.RED))
    RED
    >>> print(repr(Color.RED))
    Color.RED
+1. There's actually a bit of a weird edge case with IntFlag at the moment, and this would bypass that, at least for the str().
    >>> from enum import IntFlag, auto
    >>> class UF(IntFlag):
    ...     CT_LOW_GRAVITY = auto()
    ...     FLYING = auto()
    ...
    >>> UF.CT_LOW_GRAVITY | UF.FLYING
    <UF.FLYING|CT_LOW_GRAVITY: 3>
    >>> str(UF.CT_LOW_GRAVITY | UF.FLYING)
    'UF.FLYING|CT_LOW_GRAVITY'
The "UF." prefix is put on the start of the combined group, which means it's not actually evallable (for what it's worth), and it's that bit inconsistent. I'm absolutely fine with removing the classname altogether, so this would show as "FLYING|CT_LOW_GRAVITY". ChrisA
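A class-name-free str() for combined flags can be sketched with a custom __str__. This is an illustration of the idea only, not the enum module's implementation; note that this version lists names in definition order, so it prints "CT_LOW_GRAVITY|FLYING" rather than the reversed order shown in the session above.

```python
from enum import IntFlag, auto


class UF(IntFlag):
    CT_LOW_GRAVITY = auto()
    FLYING = auto()

    def __str__(self):
        # Collect the names of every defined flag that is set in this value.
        names = [m.name for m in type(self) if m.value & self.value]
        # Fall back to the integer when no named flag is set (e.g. UF(0)).
        return "|".join(names) or str(self.value)


combined = UF.CT_LOW_GRAVITY | UF.FLYING
print(str(combined))
print(str(UF.FLYING))
```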
On Fri, Sep 18, 2020 at 10:42 PM Guido van Rossum <guido@python.org> wrote:
At the same time it's as old as Python -- for most builtins other than strings, repr() and str() are the same, and modeled after repr(). Historically, I only introduced the difference between str() and repr() because of strings -- I wanted the REPL to clearly show the difference between the number 42 and the string '42', but at the same time I wanted both to print as just '42'. Of course numpy took a different fork in that road...
That's an interesting history tidbit, thanks for sharing, Guido. Like Ethan, I also find that distinction invaluable!

While researching the first edition of Fluent Python, I found the 1996 paper "How to Display an Object as a String: printString and displayString" [1]. In it, the author Bobby Woolf explains that VisualWorks Smalltalk's `Object` class provides two methods:

"""
• printString—Displays the object the way the developer wants to see it.
• displayString—Displays the object the way the user wants to see it.
"""

I love these simple definitions, and they are followed by most Python classes that have distinct __repr__ and __str__.

[1] http://esug.org/data/HistoricalDocuments/TheSmalltalkReport/ST07/04wo.pdf

Developers or users rarely care about the numeric value of an Enum, so I am for __repr__ providing a "more qualified name" and __str__ providing a "less qualified name" (but still qualified with at least one dot in it, e.g. Color.RED and not RED). In the rare cases where someone cares about the underlying integer, let them get the value.

Cheers,

Luciano

--
Luciano Ramalho
| Author of Fluent Python (O'Reilly, 2015)
| http://shop.oreilly.com/product/0636920032519.do
| Technical Principal at ThoughtWorks
| Twitter: @ramalhoorg
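As one stdlib illustration of the developer-facing versus user-facing split, `fractions.Fraction` has distinct __repr__ and __str__ methods. The choice of `Fraction` is just an example picked here, not one mentioned in the thread.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Developer-facing: looks like the expression that rebuilds the object.
print(repr(half))

# User-facing: the compact form a reader would expect.
print(str(half))
```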
On Fri, 18 Sep 2020 18:14:35 -0700 Ethan Furman <ethan@stoneleaf.us> wrote:
So at this point, I think the choices are:
Standard Enum

    __repr__                     __str__
    <RegexFlag.IGNORECASE: 2>    RegexFlag.IGNORECASE

and

Modified Converted Constant

    __repr__                     __str__
    re.IGNORECASE                2
My vote goes towards the "Standard Enum" choice. "2" in particular is entirely uninformative in this case, since all the code I've ever seen and written uses the symbolic constant, not the numeric value (which as far as I'm concerned is an implementation detail). Regards Antoine.
19.09.20 00:44, Ethan Furman wrote:
I'm looking for arguments relating to:
- should _convert_ make the default __repr__ be module_name.member_name?
In most cases enums with _convert_ are used to replace old module globals. They are accessible as module_name.member_name and always used as module_name.member_name in user code. Also module_name.member_name is usually shorter than module_name.class_name.member_name or <class_name.member_name: value>.

And the main advantage to me is using repr in compound objects: "foo.Command(action=foo.READ, kind=foo.FILE)" can be copied directly from the debug output to the test code, in contrast to "foo.Command(action=<Action.READ: 128>, kind=<ObjectKind.FILE: <object object at 0x7fcedc383f10>>)" which needs a lot of editing (and I often need to copy a list or a dict of such objects). I always override the default __repr__ in production code.
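The compound-object argument can be tried out with a dataclass whose fields are enum members. Everything in this sketch is hypothetical: the module name `foo`, the base class `_FooConstant`, and the value "file" (the message above used an `object()` instance as the value).

```python
from dataclasses import dataclass
from enum import Enum


class _FooConstant(Enum):
    """Hypothetical base giving members a module-style repr."""

    def __repr__(self):
        # "foo" stands in for the name of the module exporting the constants.
        return f"foo.{self.name}"


class Action(_FooConstant):
    READ = 128


class ObjectKind(_FooConstant):
    FILE = "file"


@dataclass
class Command:
    action: Action
    kind: ObjectKind


cmd = Command(action=Action.READ, kind=ObjectKind.FILE)

# The generated dataclass repr embeds the repr of each field.
print(repr(cmd))
```

With the default Enum repr, the same print would embed `<Action.READ: 128>` style text, which cannot be pasted back into code without editing.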
- should _convert_ make the default __str__ be the same, or be the numeric value?
I do not think that exposing the numeric value in __str__ would be useful. Numeric values are often arbitrary, which is why we use names in the first place.

The only exception is StrEnum -- overriding __str__ of a str subclass may not be safe. Some code will call str() implicitly, other code will read the string content of the object directly, and they will be different.

I would consider returning just the member name from __str__. It has its pros and cons, so in the face of ambiguity it is better to restore the default implementation: __str__ = object.__str__.
On 9/22/20 12:11 AM, Serhiy Storchaka wrote:
The only exception is StrEnum -- overriding __str__ of a str subclass may not be safe. Some code will call str() implicitly, other code will read the string content of the object directly, and they will be different.
Following up on that:

    >>> import enum
    >>>
    >>> class TestStr(enum.StrEnum):
    ...     One = '1'
    ...     Two = '2'
    ...     Three = '3'
    ...
    >>> isinstance(TestStr.One, str)
    True
    >>> str(TestStr.One)
    'TestStr.One'
    >>> TestStr.One == '1'
    True

I agree, str.__str__ needs to be used in this case.

Thanks, Serhiy!

--
~Ethan~
22.09.20 16:57, Ethan Furman wrote:
I agree, str.__str__ needs to be used in this case.
It is more interesting to compare '%s' % (TestStr.One,) and '{}'.format(TestStr.One). Also str.upper(TestStr.One) and int(TestStr.One) ignore __str__.
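The comparison suggested above can be run with a str-mixin enum. This sketch uses `class TestStr(str, Enum)` with an explicit __str__ instead of `enum.StrEnum`, so that the mismatch being discussed is reproduced regardless of Python version; the '{}'.format() result is printed but not asserted, since formatting rules for mixed-in enums have changed between releases.

```python
from enum import Enum


class TestStr(str, Enum):
    One = '1'
    Two = '2'

    def __str__(self):
        # Explicit override reproducing the "ClassName.member" form.
        return f"{type(self).__name__}.{self.name}"


member = TestStr.One

# These go through __str__ / __format__.
print('%s' % (member,))
print('{}'.format(member))

# These read the underlying string content directly and ignore __str__.
print(str.upper(member))
print(int(member))
print(member == '1')
```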
participants (7)

- Antoine Pitrou
- Chris Angelico
- Ethan Furman
- Guido van Rossum
- Luciano Ramalho
- Serhiy Storchaka
- Stephen J. Turnbull