Re: [Python-ideas] PEP for enum library type?

Hi all,

Sorry to jump into this discussion so late, but I just finished reading through this thread and had a few thoughts.

(Sorry this message is a bit long. TL;DR: Please check the list of required/desired/undesired properties at the end and let me know if I've gotten anything seriously wrong with my interpretation of the discussion thus far)

It seems to me that this particular thread started out as a call to step away from existing implementations and take a look at this issue from the direction of "what do we want/need" instead, but then it quickly got sidetracked back into discussing all the details of various existing/proposed implementations. I'd like to try to take a step back (again) for a minute and raise the question: What do we actually want to get out of this whole endeavor?

First of all, as I see it, there are two main (and fairly distinct) use cases for enums in Python:

1. Predefined "unique values" for passing around within Python code
2. API-defined constants for interacting with non-Python libraries, etc (i.e. C defines/enums that need to be represented in Python, or database field values, etc)

In non-Python code, typically enums have always been represented under the covers as ints, and therefore must be passed back and forth as numbers. The fact that they have an integer value, however, is purely an implementation artifact. It comes from the fact that C and some other languages don't have a rich enough type system to properly make enums their own distinct types, but Python does not have this limitation, and I think we should be careful not to constrain the way we do things within Python just because of the limitations of other languages. Where possible I believe we should conceptually be thinking of enums not as "sequences of ints" but more as "collections of singletons". That is, they are simply objects, with a defined name and type, which compare equal to themselves but not to others, and are generally related to others by some sort of grouping mechanism (and the same name always maps to the same object). In this context, the idea of assigning a "value" to an enum makes little sense and is arguably completely unnecessary. (and if we eliminate this aspect, it mitigates many of the issues that have been brought up about evaluation order and accidental duplication, in addition to potentially making the base design a lot simpler)

Obviously, the second use case does require an association between enums and (typically int) values, but that could be viewed as simply a special case of the larger definition of "enums", rather than the other way around. I do think one thing worth noting, however, is that (at least in my experience) the cases which require associating names with values pretty much also always require that every name has a specific value, so the value for each and every enum within the group should probably be defined explicitly anyway (I have encountered very few cases where it's actually useful to mix-and-match "I care about this value but I don't care about those ones"). It doesn't seem unreasonable, therefore, to define two different categories of enums: one that has no concept of "value" (for pure-Python), and one which does have associated values but all values have to be specified explicitly (for the "mapping constants" case).

On a related note, to be honest, I'm not really sure I can think of any realistic use cases for "string enums" (or really anything other than ints in general). Does anybody have an example of where this would actually be useful as opposed to just using "pure" (valueless) enums (which would already have string names)?
Anyway, in the interest of trying to get the discussion back onto more theoretical ground, I also wanted to try to summarize the more general thoughts/impressions I've gleaned from the discussions up to this point. From what I can tell, these are some of the properties that there seems to be some general consensus enums probably should or shouldn't have:

Required properties (without these, any implementation is not generally useful, or is at least something different than an "enum"):

1. Enums must be groupable in some way (i.e. "Colors", or "Error values")
2. Enums within the same group must not compare equal to each other (unless two names are intentionally defined to map to the same enum (i.e. "aliases"))
3. (Within reason and the limitations of Python) Once defined, an enum's properties (i.e. its name, identity, group membership, relationships to other objects, etc) must be treated as immutable (i.e. not change out from under the programmer unexpectedly). Conceptually they should be considered to be "constants".

Desirable properties (which ones are more or less desirable will vary for different people, but from what I've heard I think everybody sorta agrees that all of these could be good things as long as they don't cause other problems):

1. Enums should represent themselves (str/repr) by symbolic names, not as ints, etc.
2. Enums from different groups should preferably not compare equal to each other (even if they have the same associated int value).
3. It should be possible to determine what group an enum belongs to.
4. Enums/groups should be definable inline using a fairly simple Python syntax.
5. It should also be relatively easy to define enums/groups programmatically.
6. By default, enums should be referenceable as relatively simple identifiers (i.e. no need for quoting, function-calls, etc, just variables/attributes/etc)
7. If the programmer doesn't care about the value of an enum, she shouldn't have to explicitly state a meaningless value.
8. (If an enum does have an associated value) it should be easy to compare with and/or convert back and forth between enums and values (so that they can be used with existing APIs).
9. It would be nice to be able to associate docstrings, and possibly other metadata with enums.

Undesirable properties:

1. Enum syntax should not be "too magic". (In particular, it's pretty clear at this point that creating new enums as a side-effect of name lookups (even as convenient as it does make the syntax) is ultimately not gonna fly)
2. The syntax for defining enums should not be so onerous or verbose that it's significantly harder to use than the existing idioms people are already using.
3. The syntax for defining enums should not be so alien that it will completely baffle programmers who are already used to typical Python constructs.
4. It shouldn't be necessary to quote enum names when defining them (since they won't be quoted when they're used)

I want to check: Is this a valid summary of things? Anything I missed, or do people have substantial objections to any of the required/desirable/undesirable points I mentioned?

Obviously, it may not be possible to achieve all of the desirable properties at the same time, but I think it's useful to start with an idea of what we'd ideally like, and then we can sit down and see how close we can actually get to it.

(Actually, on pondering these properties, I've started to put together a slightly different enum implementation which I think has some potential (it's somewhat a cross between Barry's and Tim's with a couple of ideas of my own). I think I'll flesh it out a little more and then put it up for comment as a separate thread, if people don't mind.)

--Alex
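To make the "collections of singletons" idea concrete, here is a minimal sketch of valueless enums in plain Python. It is only an illustration under stated assumptions: the names EnumValue and make_group are hypothetical and do not come from any implementation discussed in the thread, and the helper takes quoted names, so it does not satisfy the "no quoting" wish from the undesirable-properties list.

```python
class EnumValue:
    """One named singleton belonging to a group (hypothetical helper)."""

    __slots__ = ("group", "name")

    def __init__(self, group, name):
        # Bypass our own __setattr__, which blocks later mutation.
        object.__setattr__(self, "group", group)
        object.__setattr__(self, "name", name)

    def __setattr__(self, key, value):
        raise AttributeError("enum values are immutable")

    def __repr__(self):
        # Represent the value by its symbolic name, not by a number.
        return f"{self.group}.{self.name}"


def make_group(group_name, names):
    """Build a namespace class whose attributes are EnumValue singletons."""
    members = {name: EnumValue(group_name, name) for name in names}
    return type(group_name, (), members)


Colors = make_group("Colors", ["RED", "GREEN", "BLUE"])
Fruits = make_group("Fruits", ["APPLE", "RED"])

# Equality falls back to identity, so values only equal themselves.
same = Colors.RED == Colors.RED            # True
other_member = Colors.RED == Colors.GREEN  # False
other_group = Colors.RED == Fruits.RED     # False
```

No value is ever assigned here, so there is nothing to order, duplicate by accident, or convert; those concerns only arise for the second category (enums mapped to explicit values).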

Anyway, in the interest of trying to get the discussion back onto more theoretical ground, I also wanted to try to summarize the more general thoughts/impressions I've gleaned from the discussions up to this point. From what I can tell, these are some of the properties that there seems to be some general consensus enums probably should or shouldn't have:
Required properties (without these, any implementation is not generally useful, or is at least something different than an "enum"):
1. Enums must be groupable in some way (i.e. "Colors", or "Error values")
2. Enums within the same group must not compare equal to each other (unless two names are intentionally defined to map to the same enum (i.e. "aliases"))
3. (Within reason and the limitations of Python) Once defined, an enum's properties (i.e. its name, identity, group membership, relationships to other objects, etc) must be treated as immutable (i.e. not change out from under the programmer unexpectedly). Conceptually they should be considered to be "constants".
On 21 February 2013 16:18, Alex Stewart <foogod@gmail.com> wrote:
...
On a related note, to be honest, I'm not really sure I can think of any realistic use cases for "string enums" (or really anything other than ints in general). Does anybody have an example of where this would actually be useful as opposed to just using "pure" (valueless) enums (which would already have string names)?

Backwards compatibility. Of course, although you have restricted your wording to "enums", we need constants. And it would be nice to replace magic string constants in the Python stdlib itself with real, "unquoted" constants; if the new constants are equivalent to the strings existing in current code, nothing breaks. Think of """ my_text.encode("utf-8", errors=str.IGNORE) """ instead of """ my_text.encode("utf-8", errors="ignore") """, for example.
...
4. They should subclass int in a way that makes it possible to interact with low-level libraries already using C enums (only applicable to those that need an int value) - their use should be as transparent as True and False are today.
Desirable properties (which ones are more or less desirable will vary for different people, but from what I've heard I think everybody sorta agrees that all of these could be good things as long as they don't cause other problems):
1. Enums should represent themselves (str/repr) by symbolic names, not as ints, etc.
I would say that, while not "required", as low level languages lack this, the whole idea is simply not worth implementing without this. This is the real motivation - and NB. also for "Constants", not only "Enums".
2. Enums from different groups should preferably not compare equal to each other (even if they have the same associated int value).
I am not sure about this requirement. I agree about "valueless" Enums, but as far as they have values associated, they should behave just like their values. I did not hear of anyone injured by "True == 1" in Python.
3. It should be possible to determine what group an enum belongs to.
4. Enums/groups should be definable inline using a fairly simple Python syntax.
5. It should also be relatively easy to define enums/groups programmatically.
6. By default, enums should be referenceable as relatively simple identifiers (i.e. no need for quoting, function-calls, etc, just variables/attributes/etc)
7. If the programmer doesn't care about the value of an enum, she shouldn't have to explicitly state a meaningless value.
8. (If an enum does have an associated value) it should be easy to compare with and/or convert back and forth between enums and values (so that they can be used with existing APIs).
9. It would be nice to be able to associate docstrings, and possibly other metadata with enums.
Undesirable properties:
1. Enum syntax should not be "too magic". (In particular, it's pretty clear at this point that creating new enums as a side-effect of name lookups (even as convenient as it does make the syntax) is ultimately not gonna fly)
2. The syntax for defining enums should not be so onerous or verbose that it's significantly harder to use than the existing idioms people are already using.
3. The syntax for defining enums should not be so alien that it will completely baffle programmers who are already used to typical Python constructs.
4. It shouldn't be necessary to quote enum names when defining them (since they won't be quoted when they're used)
I want to check: Is this a valid summary of things? Anything I missed, or do people have substantial objections to any of the required/desirable/undesirable points I mentioned?
Obviously, it may not be possible to achieve all of the desirable properties at the same time, but I think it's useful to start with an idea of what we'd ideally like, and then we can sit down and see how close we can actually get to it..
(Actually, on pondering these properties, I've started to put together a slightly different enum implementation which I think has some potential (it's somewhat a cross between Barry's and Tim's with a couple of ideas of my own). I think I'll flesh it out a little more and then put it up for comment as a separate thread, if people don't mind..)
Ok - I added there what I think are the most important. What led me to resurrect this subject was a period when I was dealing with a couple of image-related packages - PIL, GIMP-Python and Pygame - all of which make intense use of constants and enums... all of which print as "int"s. All three have the common trait that they can't rely on other 3rd party Python packages besides the stdlib, since there is a relatively large codebase depending only on them and Python.
js -><-
--Alex
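The backwards-compatibility argument for string-valued constants can be sketched as follows. This assumes the enum module that was later added to the standard library (Python 3.4+), which did not exist when this thread was written. ErrorPolicy is a hypothetical name used only for illustration; the str.IGNORE spelling from the message above is not a real attribute.

```python
from enum import Enum


class ErrorPolicy(str, Enum):
    """Hypothetical constants mirroring the strings str.encode() accepts."""

    STRICT = "strict"
    IGNORE = "ignore"
    REPLACE = "replace"


# Existing code that compares against the magic string keeps working,
# because each member is a real str equal to its value.
still_equal = ErrorPolicy.IGNORE == "ignore"

# APIs that expect the plain string accept the constant unchanged.
# "caf\u00e9" ends in a non-ASCII character, which "ignore" drops.
encoded = "caf\u00e9".encode("ascii", errors=ErrorPolicy.IGNORE)
```

Because the members subclass str, callers can pass either the constant or the old literal, which is the "no breaking" property asked for above.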

On Thursday, February 21, 2013 11:54:49 AM UTC-8, Joao S. O. Bueno wrote:
4. They should subclass int in a way that makes it possible to interact with low-level libraries already using C enums (only applicable to those that need an int value) - their use should be as transparent as True and False are today.
Ok, I can see that would be a desirable trait (and personally I'm with you on that one), though I'm not sure it really qualifies as "required" (there are potential uses for int-enums which would still be useful/valid even if it couldn't do that automatically), but I'll definitely add that to my "desirable" list..
1. Enums should represent themselves (str/repr) by symbolic names, not as ints, etc. I would say that, while not "required", as low level languages lack this, the whole idea is simply not worth implementing without this. This is the real motivation - and NB. also for "Constants", not only "Enums".
Frankly, I think this one is so easy it's really not worth fussing about that much. It's pretty much a given that any solution we come up with will have this functionality anyway..
2. Enums from different groups should preferably not compare equal to each other (even if they have the same associated int value).
I am not sure about this requirement. I agree about "valueless" Enums, but as far as they have values associated, they should behave just like their values. I did not hear of anyone injured by "True == 1" in Python.
Well, it's explicitly not a requirement (that's why it's under "desirable" instead of "required"). Here's the question, though: Do you believe if a solution was proposed which had this property it would be a bad thing?

Note that this isn't really exactly the same as the "True == 1" case. In my opinion the argument in favor of this is along the lines of:

    class FooOptions(Enum):
        YES = 1
        MAYBE = 2
        NO = 3

    class BarOptions(Enum):
        NO = 1
        MAYBE = 2
        YES = 3

    def foofunc(choice):
        if choice == FooOptions.YES:
            do_something()

Now what if somebody calls foofunc(BarOptions.NO)?

Admittedly, this is a somewhat exaggerated case, but I still think it would be better if in these sorts of cases FooOptions.YES != BarOptions.NO (if one explicitly did want to test if the values were the same (regardless of enum type) they could do something like "if int(choice) == FooOptions.YES" instead)

--Alex
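The scenario above can be tried out with the enum module that later landed in the standard library (Python 3.4+). That module postdates this thread, so this is only a sketch of the two behaviours under discussion, not any of the implementations being proposed here. The FooInt and BarInt classes are hypothetical additions for contrast, and foofunc returns a bool instead of calling do_something().

```python
from enum import Enum, IntEnum


class FooOptions(Enum):
    YES = 1
    MAYBE = 2
    NO = 3


class BarOptions(Enum):
    NO = 1
    MAYBE = 2
    YES = 3


def foofunc(choice):
    # Returns True only when the caller passed FooOptions.YES itself.
    return choice == FooOptions.YES


# Plain Enum members never equal members of another enum,
# even though both wrap the int 1.
wrong_enum_matches = foofunc(BarOptions.NO)

# Comparing the underlying values has to be requested explicitly.
same_value = BarOptions.NO.value == FooOptions.YES.value


class FooInt(IntEnum):
    YES = 1


class BarInt(IntEnum):
    NO = 1


# Int-based members behave like their values, so YES equals NO here.
int_enums_match = FooInt.YES == BarInt.NO
```

With plain Enum the mistaken call is simply not a match; with the int-based variant the mistake is silently accepted as YES.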

On 22/02/13 08:18, Alex Stewart wrote:
In non-Python code, typically enums have always been represented under the covers as ints, and therefore must be passed back and forth as numbers. The fact that they have an integer value, however, is purely an implementation artifact. It comes from the fact that C and some other languages don't have a rich enough type system to properly make enums their own distinct type
I don't think that's true for all languages. For example, enums in Pascal are definitely distinct types from integers, yet the language explicitly assigns them ordinal values and defines an ordering for them. Wirth must have thought there was a benefit in doing that.
It doesn't seem unreasonable, therefore, to define two different categories of enums: one that has no concept of "value" (for pure-Python), and one which does have associated values but all values have to be specified explicitly
That sounds reasonable. However, I'm wondering if there isn't a third case: where you don't care about the values, but you do want them to have a defined ordering?
1. Enum syntax should not be "too magic". (In particular, it's pretty clear at this point that creating new enums as a side-effect of name lookups (even as convenient as it does make the syntax) is ultimately not gonna fly)
4. It shouldn't be necessary to quote enum names when defining them (since they won't be quoted when they're used)
Satisfying both of these constraints without new syntax seems to be pretty much impossible. That seems to be the main sticking point in these discussions.
I want to check: Is this a valid summary of things?
It looks pretty comprehensive to me! -- Greg

On 2013-02-21 23:24, Greg Ewing wrote:
On 22/02/13 08:18, Alex Stewart wrote:
In non-Python code, typically enums have always been represented under the covers as ints, and therefore must be passed back and forth as numbers. The fact that they have an integer value, however, is purely an implementation artifact. It comes from the fact that C and some other languages don't have a rich enough type system to properly make enums their own distinct type
I don't think that's true for all languages. For example, enums in Pascal are definitely distinct types from integers, yet the language explicitly assigns them ordinal values and defines an ordering for them. Wirth must have thought there was a benefit in doing that.
[snip] It means that sets of enums can be implemented using bitsets.

MRAB wrote:
On 2013-02-21 23:24, Greg Ewing wrote:
For example, enums in Pascal are definitely distinct types from integers, yet the language explicitly assigns them ordinal values and defines an ordering for them. Wirth must have thought there was a benefit in doing that.
It means that sets of enums can be implemented using bitsets.
That could still have been done without exposing the ordinal values. Yet, Pascal provides an ord() function. I suppose one viewpoint might be that as long as there is a defined ordering, there's always going to be at least an implied mapping to natural numbers, so there's not much point in trying to hide it. -- Greg
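The bitset point can be illustrated with a short sketch: once each member has an ordinal, a set of members fits in a single integer with one bit per member. This uses the later standard-library enum module (Python 3.4+). The Color class and the to_bitset/from_bitset helpers are hypothetical, and the ordinals are written out explicitly starting from 0, in the way Pascal's ord() numbers enumeration values.

```python
from enum import Enum


class Color(Enum):
    # Explicit ordinals, starting at 0.
    RED = 0
    GREEN = 1
    BLUE = 2


def to_bitset(members):
    """Pack an iterable of members into one int, one bit per ordinal."""
    mask = 0
    for member in members:
        mask |= 1 << member.value
    return mask


def from_bitset(mask, enum_cls):
    """Recover the set of members whose bit is set in mask."""
    return {member for member in enum_cls if mask & (1 << member.value)}


warm = to_bitset([Color.RED, Color.GREEN])  # bits 0 and 1 set
recovered = from_bitset(warm, Color)
```

The packing only needs some fixed numbering of the members; whether that numbering is exposed to the programmer is exactly the design question raised above.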

On 21/02/2013 23:24, Greg Ewing wrote:
On 22/02/13 08:18, Alex Stewart wrote:
In non-Python code, typically enums have always been represented under the covers as ints, and therefore must be passed back and forth as numbers. The fact that they have an integer value, however, is purely an implementation artifact. It comes from the fact that C and some other languages don't have a rich enough type system to properly make enums their own distinct type
I don't think that's true for all languages. For example, enums in Pascal are definitely distinct types from integers, yet the language explicitly assigns them ordinal values and defines an ordering for them. Wirth must have thought there was a benefit in doing that.
Not necessarily. With respect to the great man, he may not have given the matter as much thought, or considered (or been interested in) as many use cases as the contributors to this thread have.
It doesn't seem unreasonable, therefore, to define two different categories of enums: one that has no concept of "value" (for pure-Python), and one which does have associated values but all values have to be specified explicitly
That sounds reasonable. However, I'm wondering if there isn't a third case: where you don't care about the values, but you do want them to have a defined ordering?
The same thought occurred to me, although I could not think of a use case on the spot. But I think that there surely are such.
1. Enum syntax should not be "too magic". (In particular, it's pretty clear at this point that creating new enums as a side-effect of name lookups (even as convenient as it does make the syntax) is ultimately not gonna fly)
4. It shouldn't be necessary to quote enum names when defining them (since they won't be quoted when they're used)
Satisfying both of these constraints without new syntax seems to be pretty much impossible. That seems to be the main sticking point in these discussions.
I want to check: Is this a valid summary of things?
It looks pretty comprehensive to me!
Apart from the point about maybe wanting to order enums, I agree. I think Alex is right when he says you want to create singleton enums that don't equate to anything else. We don't think of RED, GREEN, BLUE as being identical with the ints 1,2,3, and so don't want our implementation of them to compare equal to those ints.
Rob Cliffe

On 02/21/2013 05:58 PM, Rob Cliffe wrote:
On 21/02/2013 23:24, Greg Ewing wrote:
On 22/02/13 08:18, Alex Stewart wrote:
It doesn't seem unreasonable, therefore, to define two different categories of enums: one that has no concept of "value" (for pure-Python), and one which does have associated values but all values have to be specified explicitly
That sounds reasonable. However, I'm wondering if there isn't a third case: where you don't care about the values, but you do want them to have a defined ordering?
The same thought occurred to me, although I could not think of a use case on the spot. But I think that there surely are such.
If you want order, but don't care about values, let them be ints and assign names in the order you want.
1. Enum syntax should not be "too magic". (In particular, it's pretty clear at this point that creating new enums as a side-effect of name lookups (even as convenient as it does make the syntax) is ultimately not gonna fly)
4. It shouldn't be necessary to quote enum names when defining them (since they won't be quoted when they're used)
Satisfying both of these constraints without new syntax seems to be pretty much impossible. That seems to be the main sticking point in these discussions.
I want to check: Is this a valid summary of things?
It looks pretty comprehensive to me!
Apart from the point about maybe wanting to order enums, I agree. I think Alex is right when he says you want to create singleton enums that don't equate to anything else. We don't think of RED, GREEN, BLUE as being identical with the ints 1,2,3, and so don't want our implementation of them to compare equal to those ints.
That depends entirely on what you are interfacing with: if it's a graphical library then you surely will want to compare equal with an int; if something else where the int value is meaningless, base the enum on a string. ;)
-- ~Ethan~
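The "order without caring about values" suggestion can be sketched like this, assuming the later standard-library enum module (IntEnum from Python 3.4, auto() from Python 3.6). The Severity class is a hypothetical example, not something proposed in the thread.

```python
from enum import IntEnum, auto


class Severity(IntEnum):
    # auto() assigns increasing ints in definition order, so the
    # programmer never has to choose the numbers.
    DEBUG = auto()
    INFO = auto()
    WARNING = auto()
    ERROR = auto()


is_lower = Severity.DEBUG < Severity.ERROR
ranked = sorted([Severity.ERROR, Severity.DEBUG, Severity.WARNING])
```

The ordering comes purely from the order of the names in the class body; the concrete ints are an implementation detail the caller can ignore.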

On 21 February 2013 22:58, Rob Cliffe <rob.cliffe@btinternet.com> wrote:
Apart from the point about maybe wanting to order enums, I agree. I think Alex is right when he says you want to create singleton enums that don't equate to anything else. We don't think of RED, GREEN, BLUE as being identical with the ints 1,2,3, and so don't want our implementation of them to compare equal to those ints.
Again, I think that my 2002 code that expected a (1) working when it got a (True) instead was a great, great thing. And I think that lots of places where magical strings are used today ("str.decode" errors argument, "compile" mode argument) should be lifted up to expect constants and still keep working if one passes them the strings "replace" or "eval".

Requiring that things compare differently, IMHO, is too much statically-typed-minded - it is not the way of Python. As for Alex's surreal example of someone defining "NO" and "YES" with different values in different Enums and passing the wrong one to a given call - this person will have it coming. Think of this case instead:

    disk_io.IDDLE == "iddle"
    True
    network.IDDLE == "iddle"
    True
    disk_io.IDDLE == network.IDDLE
    False

So, if the compared enums do have an explicit value assigned to them (and I agree there are some cases where the value is not needed), they should compare just as that value - anything else is closer to C++ than Python. And no one is preventing people from using decorators or other mechanisms for certain function calls to accept only values of a given Enum, just as no one was ever prohibited from statically verifying any other values in a call.
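The three comparisons in the example above (equal to the raw string, unequal across groups) can be reproduced with a small custom base class. This is a sketch on top of the later standard-library enum module (Python 3.4+). ValueEnum, DiskIO and Network are hypothetical names, and the IDDLE/"iddle" spelling is kept from the message so the lines match. Note that the stock str/int mix-in enums in that module do compare equal across classes, which is why __eq__ is overridden here.

```python
from enum import Enum


class ValueEnum(Enum):
    """Hypothetical base: equal to its raw value, but to no other enum."""

    def __eq__(self, other):
        if isinstance(other, Enum):
            # Two enum members match only if they are the same object.
            return self is other
        # Anything else is compared against the underlying value.
        return self.value == other

    def __hash__(self):
        # Keep hashing consistent with equality to the raw value.
        return hash(self.value)


class DiskIO(ValueEnum):
    IDDLE = "iddle"


class Network(ValueEnum):
    IDDLE = "iddle"


disk_vs_str = DiskIO.IDDLE == "iddle"
net_vs_str = Network.IDDLE == "iddle"
disk_vs_net = DiskIO.IDDLE == Network.IDDLE
```

One consequence worth noticing: equality is no longer transitive (both members equal "iddle" but not each other), which is the same kind of oddity as NaN comparisons.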

On Friday, February 22, 2013 4:53:27 AM UTC-8, Joao S. O. Bueno wrote:
Again, I think that my 2002 code that expected a (1) working when it got a (True) instead was a great, great thing. And I think that lots of places where magical strings are used today ("str.decode" errors argument, "compile" mode argument) should be lifted up to expect constants and still keep working if one passes them the strings "replace" or "eval".
Ok, first of all, nobody's suggesting that wouldn't work. I'm not talking about comparing enums with base types (which I have no problem working like this), I'm talking specifically about comparing enums directly with other enums from a different category.
Requiring that things compare differently, IMHO, is too much statically-typed-minded - it is not the way of Python. As for Alex's surreal example of someone defining "NO" and "YES" with different values in different Enums and passing the wrong one to a given call - this person will have it coming.
I think this is unnecessarily dismissive. Obviously, my example was simplified, but what if these two enums were created by two completely different people in different modules that just happen to be used in the same program? Does the hapless programmer "have it coming" then?

You say that my suggestion is more like C++ than Python, but I counter that your suggestion seems to me a lot more like Perl than Python. It smacks of one of the things I dislike most about Perl, which also happens to be, in my opinion, one of the language features most responsible for (often subtle) bugs in Perl code even by experienced programmers: The contents of variables can have very different semantic meanings depending solely on the context in which they're evaluated (scalar vs array, etc). In this case what you're saying is that just because entirely unrelated programmers happened to use the same int value underneath, that Align.TOP and Errno.EPERM and Choice.OK should all be considered to mean exactly the same thing everywhere (even if none of the programmers actually intended or foresaw that possibility), but what that meaning is actually depends entirely on what context they happen to be evaluated in.

Another way of looking at this is that what you're proposing is effectively implicit type-casting between enums of different types. Python does support implicit casting in a very few well-defined cases, but not most of the time, and only when doing so does not discard important semantic information (i.e. "meaning"). For ints, floats, etc, they really only have one piece of semantic information ("a number value"). Enums, however, have more than that. They have (potentially) a number value, but they also have names, and they (preferably) have information about "what type of thing am I". Automatically throwing out that extra information on a comparison where both objects have the extra info and it's not the same on both objects is, in my opinion, just wrong. If there were no extra semantic meaning to enums, then this wouldn't be an issue, but the symbolic information they carry is clearly important, because it's the whole reason we're having this discussion in the first place instead of just using ints everywhere. (Indeed, it could easily be argued that the entire point to things like int-enums is that the programmer isn't supposed to have to care what the underlying value is, but should be able to think of things entirely symbolically.)

In pretty much all the cases I can imagine (and that have been suggested thus far), it seems to me that passing a different class of enum than the other side is testing against is almost always going to be an accident, or at the very least sloppy code (if there are cases where it is intentional, it's unlikely to be common, and the programmer should be able to explicitly cast between them somehow). Given this, I think that the Principle of Least Surprise enters into things here as well. If somebody makes a coding mistake and uses the wrong type of enum somewhere, which is likely to be the least surprising outcome?:

1. If you use an enum of the wrong type, it won't be considered equivalent to any of the expected options.
2. If you use an enum of the wrong type, it might not be considered to be equivalent to any of the expected options, or it might match one of them and do something either the same or completely different than you expect, but you have no way to tell what will happen without looking at an underlying constant which you're not supposed to have to care about.

--Alex

On 22 February 2013 15:21, Alex Stewart <foogod@gmail.com> wrote:
On Friday, February 22, 2013 4:53:27 AM UTC-8, Joao S. O. Bueno wrote:
Again, I think that my 2002 code that expected a (1) working when it got a (True) instead was a great, great thing. And I think that lots of places where magical strings are used today ("str.decode" errors argument, "compile" mode argument) should be lifted up to expect constants and still keep working if one passes them the strings "replace" or "eval".
Ok, first of all, nobody's suggesting that wouldn't work. I'm not talking about comparing enums with base types (which I have no problem working like this), I'm talking specifically about comparing enums directly with other enums from a different category. (big snip)
Ok. I do agree with your argumentation. And in fact, my previous example of

    disk_io.IDDLE == "iddle"
    True
    network.IDDLE == "iddle"
    True
    disk_io.IDDLE == network.IDDLE
    False

If the final behavior in some cases is this, no one will get harmed. We already have NaN == NaN -> False and the language does not explode. :-)

So, indeed - values in distinct enums should compare differently, even if they compare equal to their underlying value.

(sorry for making you write the extensive e-mail)
js -><-

(sorry for making you write the extensive e-mail)
No worries.. Having to write it all out actually helped me clarify my own thoughts and make sure that my intuitive position did actually make logical sense to me, so it was probably a good thing to do anyway :) --Alex

On 02/22/2013 10:21 AM, Alex Stewart wrote:
On Friday, February 22, 2013 4:53:27 AM UTC-8, Joao S. O. Bueno wrote:
Requiring that things compare differently, IMHO, is too much statically-typed-minded - it is not the way of Python. As for Alex's surreal example of someone defining "NO" and "YES" with different values in different Enums and passing the wrong one to a given call - this person will have it coming.
I think this is unnecessarily dismissive. Obviously, my example was simplified, but what if these two enums were created by two completely different people in different modules that just happen to be used in the same program? Does the hapless programmer "have it coming" then?
+1
You say that my suggestion is more like C++ than Python, but I counter that your suggestion seems to me a lot more like Perl than Python. It smacks of one of the things I dislike most about Perl, which also happens to be, in my opinion, one of the language features most responsible for (often subtle) bugs in Perl code even by experienced programmers: The contents of variables can have very different semantic meanings depending solely on the context in which they're evaluated (scalar vs array, etc). In this case what you're saying is that just because entirely unrelated programmers happened to use the same int value underneath, that Align.TOP and Errno.EPERM and Choice.OK should all be considered to mean exactly the same thing everywhere (even if none of the programmers actually intended or foresaw that possibility), but what that meaning is actually depends entirely on what context they happen to be evaluated in.
+1
Another way of looking at this is that what you're proposing is effectively implicit type-casting between enums of different types. Python does support implicit casting in a very few well-defined cases, but not most of the time, and only when doing so does not discard important semantic information (i.e. "meaning"). For ints, floats, etc, they really only have one piece of semantic information ("a number value"). Enums, however, have more than that. They have (potentially) a number value, but they also have names, and they (preferably) have information about "what type of thing am I". Automatically throwing out that extra information on a comparison where both objects have the extra info and it's not the same on both objects is, in my opinion, just wrong. If there were no extra semantic meaning to enums, then this wouldn't be an issue, but the symbolic information they carry is clearly important, because it's the whole reason we're having this discussion in the first place instead of just using ints everywhere.
+1
In pretty much all the cases I can imagine (and that have been suggested thus far), it seems to me that passing a different class of enum than the other side is testing against is almost always going to be an accident, or at the very least sloppy code (if there are cases where it is intentional, it's unlikely to be common, and the programmer should be able to explicitly cast between them somehow). Given this, I think that the Principle of Least Surprise enters into things here as well. If somebody makes a coding mistake and uses the wrong type of enum somewhere, which is likely to be the least surprising outcome?:
1. If you use an enum of the wrong type, it won't be considered equivalent to any of the expected options.
2. If you use an enum of the wrong type, it might not be considered to be equivalent to any of the expected options, or it might match one of them and do something either the same or completely different than you expect, but you have no way to tell what will happen without looking at an underlying constant which you're not supposed to have to care about.
+1000 Just to be clear, option 1 is what should happen. -- ~Ethan~
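To make the two options concrete, here is a minimal sketch using the enum module that was later added to the standard library (Python 3.4), which did not exist when this thread was written. The class names Align, Errno, IntAlign and IntErrno and the describe() helper are made up for illustration. Plain Enum members behave like option 1; IntEnum members behave like option 2.

```python
from enum import Enum, IntEnum


# Option 1: plain Enum members never compare equal across types,
# even when the underlying values happen to match.
class Align(Enum):
    TOP = 1
    BOTTOM = 2


class Errno(Enum):
    EPERM = 1


# Option 2: IntEnum members are ints, so unrelated types "match"
# whenever their underlying constants collide.
class IntAlign(IntEnum):
    TOP = 1


class IntErrno(IntEnum):
    EPERM = 1


def describe(align):
    """Hypothetical consumer that expects an Align member."""
    if align == Align.TOP:
        return "top"
    if align == Align.BOTTOM:
        return "bottom"
    return "unrecognized"


print(describe(Align.TOP))              # top
print(describe(Errno.EPERM))            # unrecognized (wrong type never matches)
print(IntAlign.TOP == IntErrno.EPERM)   # True (accidental match on the int value)
```

With option 1 the mistake shows up as "no match" every time; with option 2 the outcome depends on constants the caller is not supposed to care about.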

On Feb 21, 2013, at 15:24, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 22/02/13 08:18, Alex Stewart wrote:
It doesn't seem unreasonable, therefore, to define two different categories of enums: one that has no concept of "value" (for pure-Python), and one which does have associated values but all values have to be specified explicitly
That sounds reasonable. However, I'm wondering if there isn't a third case: where you don't care about the values, but you do want them to have a defined ordering?
And a fourth case: You don't care about the values, but you want to be able to | them together into a set.
For example, consider file open flags, mmap access flags, chmod flags, etc. You need some way to pass READ | TEXT, WRITE | SHARED, etc. occasionally, but most of the time you just pass READ or WRITE. It wouldn't be too onerous to require an explicit {READ, SHARED} set for the uncommon case, but set(READ) for the common case would be horrible. And you don't want every function that takes enums to typeswitch so it can handle a single value as if it were a set. The obvious answer to this is the one every other language uses: take READ | SHARED.
This doesn't have to mean flag enums have integral values. However, the only alternative I can think of is that enums have set syntax, so READ | SHARED returns something like READ.__class__({READ, SHARED}), which of course also means that set has to automatically be a valid value for the type. Which is pretty magical and complex.
Of course the answer could be not to allow that. Call mmap(file, access=WRITE, contention=SHARED), not mmap(WRITE | SHARED). But if we're talking about changing the stdlib and hopefully most popular third party libs, that would be a pretty drastic change.
Finally, the answer could be: if you want that, you have to give them int values, and accept ints instead of enums in your APIs, at which point I think the benefit of valueless enums has gone way down, because many places you'd think you want them (including all over the stdlib), you don't.

On 02/22/2013 09:07 AM, Andrew Barnert wrote:
On Feb 21, 2013, at 15:24, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 22/02/13 08:18, Alex Stewart wrote:
It doesn't seem unreasonable, therefore, to define two different categories of enums: one that has no concept of "value" (for pure-Python), and one which does have associated values but all values have to be specified explicitly
That sounds reasonable. However, I'm wondering if there isn't a third case: where you don't care about the values, but you do want them to have a defined ordering?
And a fourth case: You don't care about the values, but you want to be able to | them together into a set.
For example, consider file open flags, mmap access flags, chmod flags, etc. You need some way to pass READ | TEXT, WRITE | SHARED, etc. occasionally, but most of the time you just pass READ or WRITE. It wouldn't be too onerous to require an explicit {READ, SHARED} set for the uncommon case, but set(READ) for the common case would be horrible. And you don't want every function that takes enums to typeswitch so it can handle a single value as if it were a set. The obvious answer to this is the one every other language uses: take READ | SHARED.
This doesn't have to mean flag enums have integral values. However, the only alternative I can think of is that enums have set syntax, so READ | SHARED returns something like READ.__class__({READ, SHARED}), which of course also means that set has to automatically be a valid value for the type. Which is pretty magical and complex.
Of course the answer could be not to allow that. Call mmap(file, access=WRITE, contention=SHARED), not mmap(WRITE | SHARED). But if we're talking about changing the stdlib and hopefully most popular third party libs, that would be a pretty drastic change.
Finally, the answer could be: if you want that, you have to give them int values, and accept ints instead of enums in your APIs, at which point I think the benefit of valueless enums has gone way down, because many places you'd think you want them (including all over the stdlib), you don't.
We shouldn't get hung up on "don't care about the values" == "no value assigned". I personally don't see any value in a valueless enum; I do see value in three other types: sequence (based on int), bitmask (based on int), and unique (or string -- based on str). It seems to me that we're throwing around "valueless" because we don't want some enums to support math operations -- so use the string version for that type.
As for your example above:
Python 3.2.3 (default, Oct 19 2012, 19:53:16)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
--> from yaenum import BitMaskEnum, enum
--> BitMaskEnum.create('MMap', 'READ WRITE TEXT SHARED', export=globals())
<class 'yaenum.MMap'>
--> READ, WRITE, TEXT, SHARED
(MMap('READ', value=1), MMap('WRITE', value=2), MMap('TEXT', value=4), MMap('SHARED', value=8))
--> READ | TEXT
MMap('READ|TEXT', value=5)
--> WRITE | SHARED
MMap('WRITE|SHARED', value=10)
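For comparison, the same int-backed bitmask idea can be sketched with enum.Flag from the standard library (added in Python 3.6, well after this thread). This is not the yaenum API shown above; the MMap class here is just an illustration that reuses the same member names and values.

```python
from enum import Flag, auto


class MMap(Flag):
    # auto() assigns successive powers of two for Flag: 1, 2, 4, 8
    READ = auto()
    WRITE = auto()
    TEXT = auto()
    SHARED = auto()


read_text = MMap.READ | MMap.TEXT
write_shared = MMap.WRITE | MMap.SHARED

# The combination is still an MMap, not a bare int.
print(isinstance(read_text, MMap))   # True
print(read_text.value)               # 5
print(write_shared.value)            # 10

# Membership tests work on the combined value.
print(MMap.READ in read_text)        # True
print(MMap.WRITE in read_text)       # False
```

The combined value keeps its type, so a function can accept either a single member or an or'd combination without a typeswitch.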

On Friday, February 22, 2013 9:07:32 AM UTC-8, Andrew Barnert wrote:
On Feb 21, 2013, at 15:24, Greg Ewing <greg....@canterbury.ac.nz> wrote:
That sounds reasonable. However, I'm wondering if there isn't a third case: where you don't care about the values, but you do want them to have a defined ordering?
This is a valid point. I think there are a lot of cases where enums are useful without any ordering, but there are arguably some cases where order would be useful (the immediate example that comes to mind is "logging levels", where DEBUG, INFO, and ERROR don't necessarily need to have any numerical meaning, but it would be nice to be able to say that DEBUG < INFO < ERROR).. I think it should be possible to build this into an existing enum implementation which doesn't inherently rely on underlying numbers, though.. I'll look into it..
And a fourth case: You don't care about the values, but you want to be able to | them together into a set.
Heh.. I was kinda hoping nobody was going to bring this up yet :) Good point, though, and FWIW, I had considered this too but didn't bring it up initially because I was afraid it would muddy the discussion too much too early. I do however agree that this is a use case that will ultimately be important to support, and it's been in the back of my mind.
This actually applies both to the "valueless" case and the "valued" case, but in somewhat different ways. As you mentioned, in the "valueless" case, the most obvious way to deal with this is with sets. I think we could probably ameliorate the single-value case a bit by just having enum objects also behave like single-item sets if used with set operations (which I don't think is too magic), and then we can define the '|' operator to just create (or add to) an enum-set..
The valued case is actually more complicated, because ideally if READ = enum(1) and SHARED = enum(4), then saying "READ | SHARED" should produce something that has an int() value of the or'd values (5), but it would also be nice if it still represented itself symbolically (as "READ | SHARED", for example, instead of "5"), and though not strictly required, it would probably also be nice if they could be inspected using the same set-type operations as valueless enums/sets ("if READ in openmode", for example). This would probably require some sort of new "enum-set" class/type which supported amalgamated valued enums, but I think it would still be doable without too much magic (I hope).
Then of course there are extra issues like: if we also support string-enums, what does 'or' mean for string constants? (etc, etc..)
But in summary, I think the valueless case is actually pretty easy to implement, but doing it well with valued-enums is much more work, which once again reinforces my opinion that valueless enums are useful to have, and preferable to use when lowlevel-type-compatibility is not explicitly required by some API.. --Alex
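A rough sketch of the valueless ideas above (ordering by definition order, members acting as single-item sets, and '|' building an enum-set) might look like the following. Everything here is hypothetical: Member, MemberSet and make_group are illustrative names, not part of any proposed or existing library, and the definition-order index is used only for comparisons, never exposed as a value.

```python
import functools


@functools.total_ordering
class Member:
    """A valueless enum member: identified by its group and name only."""

    def __init__(self, group, name, position):
        self.group = group
        self.name = name
        self._position = position  # definition order, used only for ordering

    def __repr__(self):
        return f"{self.group}.{self.name}"

    def __lt__(self, other):
        # Only members of the same group are ordered relative to each other.
        if not isinstance(other, Member) or other.group != self.group:
            return NotImplemented
        return self._position < other._position

    def __or__(self, other):
        # READ | SHARED builds an enum-set.
        return MemberSet([self]) | other

    def __contains__(self, item):
        # A single member behaves like a one-item set for "in" tests.
        return item is self


class MemberSet(frozenset):
    """An immutable set of members that stays symbolic when printed."""

    def __or__(self, other):
        if isinstance(other, Member):
            other = (other,)
        return MemberSet(frozenset.union(self, other))

    def __repr__(self):
        return " | ".join(repr(m) for m in sorted(self))


def make_group(group, names):
    """Create one Member per whitespace-separated name, in order."""
    return [Member(group, n, i) for i, n in enumerate(names.split())]


DEBUG, INFO, ERROR = make_group("Level", "DEBUG INFO ERROR")
READ, WRITE, SHARED = make_group("Mode", "READ WRITE SHARED")

print(DEBUG < INFO < ERROR)   # True
openmode = READ | SHARED
print(openmode)               # Mode.READ | Mode.SHARED
print(READ in openmode)       # True
print(WRITE in openmode)      # False
print(READ in READ)           # True
```

Comparing members of different groups (DEBUG < READ) raises TypeError in this sketch, in keeping with the "option 1" position that unrelated enums should not silently interoperate. The valued case would need more machinery on top of this.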

On 02/21/2013 11:18 AM, Alex Stewart wrote:
On a related note, to be honest, I'm not really sure I can think of any realistic use cases for "string enums" (or really anything other than ints in general). Does anybody have an example of where this would actually be useful as opposed to just using "pure" (valueless) enums (which would already have string names)?
The tk library is a good example of where string enums would be useful; they also provide an easy "valueless" entity (at least as far as numbers go).
I want to check: Is this a valid summary of things? Anything I missed, or do people have substantial objections to any of the required/desirable/undesirable points I mentioned?
Looks good. -- ~Ethan~
participants (7)
- Alex Stewart
- Andrew Barnert
- Ethan Furman
- Greg Ewing
- Joao S. O. Bueno
- MRAB
- Rob Cliffe