[Python-ideas] PEP for enum library type?

Alex Stewart foogod at gmail.com
Thu Feb 21 20:18:55 CET 2013


Hi all,

Sorry to jump into this discussion so late, but I just finished reading
through this thread and had a few thoughts.

(Sorry this message is a bit long.  TL;DR: Please check the list of
required/desired/undesired properties at the end and let me know if I've
gotten anything seriously wrong with my interpretation of the discussion
thus far)

It seems to me that this particular thread started out as a call to step
away from existing implementations and take a look at this issue from the
direction of "what do we want/need" instead, but then it quickly got
sidetracked back into discussing all the details of various
existing/proposed implementations.  I'd like to try to take a step back
(again) for a minute and raise the question:  What do we actually want to
get out of this whole endeavor?

First of all, as I see it, there are two main (and fairly distinct) use
cases for enums in Python:

   1. Predefined "unique values" for passing around within Python code
   2. API-defined constants for interacting with non-Python libraries, etc.
   (e.g. C defines/enums that need to be represented in Python, or database
   field values)

In non-Python code, enums have typically been represented under the covers
as ints, and therefore must be passed back and forth as numbers.  The fact
that they have an integer value, however, is purely an implementation
artifact.  It comes from the fact that C and some other languages don't
have a rich enough type system to properly make enums their own distinct
types.  Python does not have this limitation, and I think we should be
careful not to constrain the way we do things within Python just because of
the limitations of other languages.

Where possible I believe we should conceptually be thinking of enums not as
"sequences of ints" but more as "collections of singletons".  That is, they
are simply objects, with a defined name and type, which compare equal to
themselves but not to others, and are generally related to others by some
sort of grouping mechanism (and the same name always maps to the same
object).  In this context, the idea of assigning a "value" to an enum makes
little sense and is arguably completely unnecessary.  (And if we eliminate
this aspect, it mitigates many of the issues that have been brought up
about evaluation order and accidental duplication, in addition to
potentially making the base design a lot simpler.)
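
As a rough illustration of the "collection of singletons" idea, a minimal
sketch might look like the following.  Item and Group are hypothetical
names, not taken from any of the proposed implementations, and the member
names are passed as quoted strings only to keep the sketch short.

```python
class Item(object):
    """A valueless enum member: it has a name and a group, nothing else."""

    def __init__(self, group, name):
        self.group = group
        self.name = name

    def __repr__(self):
        return "<%s.%s>" % (self.group.name, self.name)


class Group(object):
    """A named collection of singleton Items."""

    def __init__(self, name, member_names):
        self.name = name
        self._members = {}
        for member_name in member_names:
            item = Item(self, member_name)
            self._members[member_name] = item
            setattr(self, member_name, item)

    def __getitem__(self, member_name):
        # The same name always maps to the same object.
        return self._members[member_name]


Colors = Group("Colors", ["red", "green", "blue"])

print(Colors.red)                    # <Colors.red>
print(Colors.red is Colors["red"])   # True
print(Colors.red == Colors.green)    # False
```

Since Item does not define __eq__, equality falls back to identity, which
is exactly the "equal to themselves but not to others" behaviour.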

Obviously, the second use case does require an association between enums
and (typically int) values, but that could be viewed as simply a special
case of the larger definition of "enums", rather than the other way around.
One thing worth noting, however, is that (at least in my experience) the
cases which require associating names with values pretty much always also
require that every name has a specific value, so the value for each and
every enum within the group should probably be defined explicitly anyway
(I have encountered very few cases where it's actually useful to
mix-and-match "I care about this value but I don't care about those
ones").  It doesn't seem unreasonable, therefore, to define two
different categories of enums: one that has no concept of "value" (for
pure-Python), and one which does have associated values but all values have
to be specified explicitly (for the "mapping constants" case).
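
The split into two categories could be sketched roughly as below.  All of
the class names here (Member, ValuedMember, PureEnum, ValuedEnum) and the
from_value helper are assumptions made up for this illustration; the
Whence constants mirror the usual C values for SEEK_SET/SEEK_CUR/SEEK_END.

```python
class Member(object):
    """A pure member: a name within a group, with no concept of a value."""

    def __init__(self, group_name, name):
        self.group_name = group_name
        self.name = name

    def __repr__(self):
        return "<%s.%s>" % (self.group_name, self.name)


class ValuedMember(Member):
    """A member that stands for an externally defined constant."""

    def __init__(self, group_name, name, value):
        Member.__init__(self, group_name, name)
        self.value = value


class PureEnum(object):
    """Category 1: members have names only (for pure-Python use)."""

    def __init__(self, group_name, names):
        for name in names:
            setattr(self, name, Member(group_name, name))


class ValuedEnum(object):
    """Category 2: every member is given an explicit value."""

    def __init__(self, group_name, **values):
        self._by_value = {}
        for name, value in values.items():
            member = ValuedMember(group_name, name, value)
            setattr(self, name, member)
            self._by_value[value] = member

    def from_value(self, value):
        # Convert a raw value (e.g. one handed back by a C API) to a member.
        return self._by_value[value]


Colors = PureEnum("Colors", ["red", "green", "blue"])
Whence = ValuedEnum("Whence", SEEK_SET=0, SEEK_CUR=1, SEEK_END=2)

print(Colors.red)                    # <Colors.red>
print(hasattr(Colors.red, "value"))  # False
print(Whence.SEEK_CUR.value)         # 1
print(Whence.from_value(2))          # <Whence.SEEK_END>
```

Because the valued form takes name=value pairs, there is simply no way to
leave a value out, so the "some values matter, some don't" mix never
arises.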

On a related note, to be honest, I'm not really sure I can think of any
realistic use cases for "string enums" (or really anything other than ints
in general).  Does anybody have an example of where this would actually be
useful as opposed to just using "pure" (valueless) enums (which would
already have string names)?

Anyway, in the interest of trying to get the discussion back onto more
theoretical ground, I also wanted to try to summarize the more general
thoughts/impressions I've gleaned from the discussions up to this point.
From what I can tell, these are the properties that there seems to be
general consensus enums probably should or shouldn't have:

Required properties (without these, any implementation is not generally
useful, or is at least something different than an "enum"):

   1. Enums must be groupable in some way (e.g. "Colors", or "Error values")
   2. Enums within the same group must not compare equal to each other
   (unless two names are intentionally defined to map to the same enum, i.e.
   "aliases")
   3. (Within reason and the limitations of Python) Once defined, an enum's
   properties (e.g. its name, identity, group membership, relationships to
   other objects, etc.) must be treated as immutable (i.e. not change out from
   under the programmer unexpectedly).  Conceptually they should be considered
   to be "constants".
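
The three required properties can be sketched together as follows.  This
is only one possible shape, under the assumption that aliases are declared
explicitly as (alias, target) pairs; Member, Group and the aliases
parameter are hypothetical names invented for the example.

```python
class Member(object):
    """An enum member whose attributes cannot be rebound after creation."""

    __slots__ = ("group", "name")

    def __init__(self, group, name):
        # Bypass our own __setattr__ while the member is being built.
        object.__setattr__(self, "group", group)
        object.__setattr__(self, "name", name)

    def __setattr__(self, attr, value):
        raise AttributeError("enum members are read-only")

    def __repr__(self):
        return "<%s.%s>" % (self.group, self.name)


class Group(object):
    """Groups members; aliases are extra names for an existing member."""

    def __init__(self, group_name, names, aliases=()):
        members = {}
        for name in names:
            members[name] = Member(group_name, name)
        for alias, target in aliases:
            members[alias] = members[target]
        object.__setattr__(self, "_members", members)

    def __getattr__(self, name):
        try:
            return self._members[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, attr, value):
        raise AttributeError("enum groups are read-only")


Colors = Group("Colors", ["red", "green", "grey"], aliases=[("gray", "grey")])

print(Colors.red == Colors.green)   # False
print(Colors.gray is Colors.grey)   # True

try:
    Colors.red.name = "blue"
except AttributeError as exc:
    print("refused:", exc)          # refused: enum members are read-only
```

This only guards against accidental rebinding; a determined programmer can
still get around it with object.__setattr__, which is the "within reason"
part.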

Desirable properties (which ones are more or less desirable will vary for
different people, but from what I've heard I think everybody sorta agrees
that all of these could be good things as long as they don't cause other
problems):

   1. Enums should represent themselves (str/repr) by symbolic names, not
   as ints, etc.
   2. Enums from different groups should preferably not compare equal to
   each other (even if they have the same associated int value).
   3. It should be possible to determine what group an enum belongs to.
   4. Enums/groups should be definable inline using a fairly simple Python
   syntax.
   5. It should also be relatively easy to define enums/groups
   programmatically.
   6. By default, enums should be referenceable as relatively simple
   identifiers (i.e. no need for quoting, function-calls, etc, just
   variables/attributes/etc)
   7. If the programmer doesn't care about the value of an enum, she
   shouldn't have to explicitly state a meaningless value.
   8. (If an enum does have an associated value) it should be easy to
   compare enums with values and/or convert back and forth between them (so
   that they can be used with existing APIs).
   9. It would be nice to be able to associate docstrings, and possibly
   other metadata with enums.
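
A few of these (2, 3, 5 and 9 in particular) are easy to show in a small
sketch.  The Member and Group classes and the (name, value, docstring)
tuple layout are assumptions made for illustration only; the numeric
values are the usual POSIX ones.

```python
class Member(object):
    """An enum member carrying a value and an optional docstring."""

    def __init__(self, group, name, value, doc=None):
        self.group = group
        self.name = name
        self.value = value
        self.__doc__ = doc

    def __repr__(self):
        return "<%s.%s>" % (self.group.name, self.name)


class Group(object):
    """Built programmatically from (name, value, docstring) tuples."""

    def __init__(self, name, definitions):
        self.name = name
        for member_name, value, doc in definitions:
            setattr(self, member_name, Member(self, member_name, value, doc))


errno_table = [
    ("EPERM", 1, "Operation not permitted"),
    ("ENOENT", 2, "No such file or directory"),
]
Errno = Group("Errno", errno_table)
Signal = Group("Signal", [("SIGHUP", 1, "Hangup"), ("SIGINT", 2, "Interrupt")])

# Same associated int, different groups: not equal.
print(Errno.EPERM.value == Signal.SIGHUP.value)   # True
print(Errno.EPERM == Signal.SIGHUP)               # False

# The group and the docstring travel with the member.
print(Errno.EPERM.group is Errno)                 # True
print(Errno.ENOENT.__doc__)                       # No such file or directory
```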

Undesirable properties:

   1. Enum syntax should not be "too magic".  (In particular, it's pretty
   clear at this point that creating new enums as a side-effect of name
   lookups (however convenient it makes the syntax) is ultimately not
   gonna fly.)
   2. The syntax for defining enums should not be so onerous or verbose
   that it's significantly harder to use than the existing idioms people are
   already using.
   3. The syntax for defining enums should not be so alien that it will
   completely baffle programmers who are already used to typical Python
   constructs.
   4. It shouldn't be necessary to quote enum names when defining them
   (since they won't be quoted when they're used)

I want to check: Is this a valid summary of things?  Anything I missed, or
do people have substantial objections to any of the
required/desirable/undesirable points I mentioned?

Obviously, it may not be possible to achieve all of the desirable
properties at the same time, but I think it's useful to start with an idea
of what we'd ideally like, and then we can sit down and see how close we
can actually get to it.

(Actually, on pondering these properties, I've started to put together a
slightly different enum implementation which I think has some potential
(it's somewhat a cross between Barry's and Tim's with a couple of ideas of
my own).  I think I'll flesh it out a little more and then put it up for
comment as a separate thread, if people don't mind.)

--Alex