[Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

Eli Bendersky eliben at gmail.com
Fri Apr 12 14:55:00 CEST 2013


Hello python-dev,

We're happy to present the revised PEP 435, collecting valuable feedback
from python-ideas discussions as well as in-person discussions and
decisions made during the latest PyCon language summit. We believe the
proposal is now better than the original one, providing both a wider set of
features and more convenient ways to use those features.

Link to the PEP: http://www.python.org/dev/peps/pep-0435/ [it's also pasted
fully below for convenience].

A reference implementation is available as the recently released flufl.enum
version 4.0 - you can get it either from PyPI or
https://launchpad.net/flufl.enum. flufl.enum 4.0 was developed in parallel
with revising PEP 435.

Comments welcome,

Barry and Eli

----------------------------------

PEP: 435
Title: Adding an Enum type to the Python standard library
Version: $Revision$
Last-Modified: $Date$
Author: Barry Warsaw <barry at python.org>,
        Eli Bendersky <eliben at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2013-02-23
Python-Version: 3.4
Post-History: 2013-02-23


Abstract
========

This PEP proposes adding an enumeration type to the Python standard library.
Specifically, it proposes moving the existing ``flufl.enum`` package by Barry
Warsaw into the standard library.  Much of this PEP is based on the "using"
[1]_ document from the documentation of ``flufl.enum``.

An enumeration is a set of symbolic names bound to unique, constant values.
Within an enumeration, the values can be compared by identity, and the
enumeration itself can be iterated over.


Decision
========

TODO: update decision here once pronouncement is made.


Status of discussions
=====================

The idea of adding an enum type to Python is not new - PEP 354 [2]_ is a
previous attempt that was rejected in 2005.  Recently a new set of discussions
was initiated [3]_ on the ``python-ideas`` mailing list.  Many new ideas were
proposed in several threads; after a lengthy discussion Guido proposed adding
``flufl.enum`` to the standard library [4]_.  During the PyCon 2013 language
summit the issue was discussed further.  It became clear that many developers
want to see an enum that subclasses ``int``, which can allow us to replace
many integer constants in the standard library by enums with friendly string
representations, without ceding backwards compatibility.  An additional
discussion among several interested core developers led to the proposal of
having ``IntEnum`` as a special case of ``Enum``.

The key dividing issue between ``Enum`` and ``IntEnum`` is whether comparing
to integers is semantically meaningful.  For most uses of enumerations, it's
a **feature** to reject comparison to integers; enums that compare to integers
lead, through transitivity, to comparisons between enums of unrelated types,
which isn't desirable in most cases.  For some uses, however, greater
interoperability with integers is desired. For instance, this is the case for
replacing existing standard library constants (such as ``socket.AF_INET``)
with enumerations.

This PEP is an attempt to formalize this decision as well as to present a
number of variations that were proposed and can be considered for inclusion.


Motivation
==========

*[Based partly on the Motivation stated in PEP 354]*

The properties of an enumeration are useful for defining an immutable, related
set of constant values that have a defined sequence but no inherent semantic
meaning.  Classic examples are days of the week (Sunday through Saturday) and
school assessment grades ('A' through 'D', and 'F').  Other examples include
error status values and states within a defined process.

It is possible to simply define a sequence of values of some other basic type,
such as ``int`` or ``str``, to represent discrete arbitrary values.  However,
an enumeration ensures that such values are distinct from any others including,
importantly, values within other enumerations, and that operations without
meaning ("Wednesday times two") are not defined for these values.  It also
provides a convenient printable representation of enum values without requiring
tedious repetition while defining them (i.e. no ``GREEN = 'green'``).


Module and type name
====================

We propose to add a module named ``enum`` to the standard library.  The main
type exposed by this module is ``Enum``.  Hence, to import the ``Enum`` type
user code will run::

    >>> from enum import Enum


Proposed semantics for the new enumeration type
===============================================

Creating an Enum
----------------

Enumerations are created using the class syntax, which makes them easy to read
and write.  An alternative creation method is described in `Convenience API`_.
To define an enumeration, derive from the ``Enum`` class and add attributes
with assignment to their integer values::

    >>> from enum import Enum
    >>> class Colors(Enum):
    ...     red = 1
    ...     green = 2
    ...     blue = 3

Enumeration values have nice, human readable string representations::

    >>> print(Colors.red)
    Colors.red

...while their repr has more information::

    >>> print(repr(Colors.red))
    <EnumValue: Colors.red [value=1]>

The enumeration value names are available through the class members::

    >>> for member in Colors.__members__:
    ...     print(member)
    red
    green
    blue

Let's say you wanted to encode an enumeration value in a database.  You might
want to get the enumeration class object from an enumeration value::

    >>> cls = Colors.red.enum
    >>> print(cls.__name__)
    Colors

Enums also have a property that contains just their item name::

    >>> print(Colors.red.name)
    red
    >>> print(Colors.green.name)
    green
    >>> print(Colors.blue.name)
    blue

The str and repr of the enumeration class also provide useful information::

    >>> print(Colors)
    <Colors {red: 1, green: 2, blue: 3}>
    >>> print(repr(Colors))
    <Colors {red: 1, green: 2, blue: 3}>

The ``Enum`` class supports iteration.  Iteration is defined as the
sorted order of the item values::

    >>> class FiveColors(Enum):
    ...     pink = 4
    ...     cyan = 5
    ...     green = 2
    ...     blue = 3
    ...     red = 1
    >>> [v.name for v in FiveColors]
    ['red', 'green', 'blue', 'pink', 'cyan']
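
The ordering rule above can be sketched as follows.  This is only an
illustration and rests on an assumption: it runs against the ``enum`` module
found in current Python releases, not against the ``flufl.enum`` reference
implementation, and that module iterates in definition order.  The value
ordering described here is therefore made explicit with ``sorted``, and
``by_value`` is a hypothetical helper name:

```python
from enum import Enum


class FiveColors(Enum):
    pink = 4
    cyan = 5
    green = 2
    blue = 3
    red = 1


def by_value(enum_cls):
    """Return the values of enum_cls sorted by their underlying value."""
    return sorted(enum_cls, key=lambda member: member.value)


names = [member.name for member in by_value(FiveColors)]
print(names)
```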

Enumeration values are hashable, so they can be used in dictionaries and
sets::

    >>> apples = {}
    >>> apples[Colors.red] = 'red delicious'
    >>> apples[Colors.green] = 'granny smith'
    >>> apples
    {<EnumValue: Colors.green [value=2]>: 'granny smith', <EnumValue: Colors.red [value=1]>: 'red delicious'}

To programmatically access enumeration values, use ``getattr``::

    >>> getattr(Colors, 'red')
    <EnumValue: Colors.red [value=1]>
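
A possible use of this is turning a name read from a configuration file or
user input back into an enumeration value.  The sketch below assumes the
``enum`` module of current Python releases; ``color_from_name`` is a
hypothetical helper, not part of the proposal:

```python
from enum import Enum


class Colors(Enum):
    red = 1
    green = 2
    blue = 3


def color_from_name(name, default=None):
    """Look up a Colors value by its item name, falling back to default."""
    value = getattr(Colors, name, default)
    # getattr can also find ordinary class attributes such as '__name__',
    # so only accept results that really are Colors values.
    return value if isinstance(value, Colors) else default


print(color_from_name('red') is Colors.red)
print(color_from_name('magenta'))
```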

Comparisons
-----------

Enumeration values are compared by identity::

    >>> Colors.red is Colors.red
    True
    >>> Colors.blue is Colors.blue
    True
    >>> Colors.red is not Colors.blue
    True
    >>> Colors.blue is Colors.red
    False

Ordered comparisons between enumeration values are *not* supported.  Enums are
not integers (but see `IntEnum`_ below)::

    >>> Colors.red < Colors.blue
    Traceback (most recent call last):
    ...
    NotImplementedError
    >>> Colors.red <= Colors.blue
    Traceback (most recent call last):
    ...
    NotImplementedError
    >>> Colors.blue > Colors.green
    Traceback (most recent call last):
    ...
    NotImplementedError
    >>> Colors.blue >= Colors.green
    Traceback (most recent call last):
    ...
    NotImplementedError

Equality comparisons are defined though::

    >>> Colors.blue == Colors.blue
    True
    >>> Colors.green != Colors.blue
    True

Comparisons against non-enumeration values will always compare not equal::

    >>> Colors.green == 2
    False
    >>> Colors.blue == 3
    False
    >>> Colors.green != 3
    True
    >>> Colors.green == 'green'
    False
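
In practice these rules suggest testing enumeration values by identity and
going through ``.value`` whenever a plain integer has to be compared.  A
small sketch, assuming the ``enum`` module of current Python releases;
``is_red`` and ``matches_stored`` are hypothetical helpers:

```python
from enum import Enum


class Colors(Enum):
    red = 1
    green = 2
    blue = 3


def is_red(color):
    """Test an enumeration value by identity."""
    return color is Colors.red


def matches_stored(color, stored_int):
    """Compare against a plain integer explicitly, through .value."""
    return color.value == stored_int


print(is_red(Colors.red))
print(Colors.green == 2)                 # no implicit integer comparison
print(matches_stored(Colors.green, 2))
```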


Extending enumerations by subclassing
-------------------------------------

You can extend previously defined Enums by subclassing::

    >>> class MoreColors(Colors):
    ...     pink = 4
    ...     cyan = 5

When extended in this way, the base enumeration's values are identical to the
same named values in the derived class::

    >>> Colors.red is MoreColors.red
    True
    >>> Colors.blue is MoreColors.blue
    True

However, these identity checks do not compare the underlying integer
values: if you define a separate enumeration with similar item names and
integer values, its values will not be identical to those of the first::

    >>> class OtherColors(Enum):
    ...     red = 1
    ...     blue = 2
    ...     yellow = 3
    >>> Colors.red is OtherColors.red
    False
    >>> Colors.blue is not OtherColors.blue
    True

These enumeration values are not equal, and hence may exist in the same
set, or as distinct keys in the same dictionary::

    >>> Colors.red == OtherColors.red
    False
    >>> len(set((Colors.red, OtherColors.red)))
    2

You may not define two enumeration values with the same integer value::

    >>> class Bad(Enum):
    ...     cartman = 1
    ...     stan = 2
    ...     kyle = 3
    ...     kenny = 3 # Oops!
    ...     butters = 4
    Traceback (most recent call last):
    ...
    ValueError: Conflicting enums with value '3': 'kenny' and 'kyle'

You also may not duplicate values in derived enumerations::

    >>> class BadColors(Colors):
    ...     yellow = 4
    ...     chartreuse = 2 # Oops!
    Traceback (most recent call last):
    ...
    ValueError: Conflicting enums with value '2': 'green' and 'chartreuse'


Enumeration values
------------------

The examples above use integers for enumeration values.  Using integers is
short and handy (and provided by default by the `Convenience API`_), but not
strictly enforced.  In the vast majority of use-cases, one doesn't care what
the actual value of an enumeration is.  But if the value *is* important,
enumerations can have arbitrary values.  The following example uses
strings::

    >>> class SpecialId(Enum):
    ...   selector = '$IM($N)'
    ...   adaptor = '~$IM'
    ...
    >>> SpecialId.selector
    <EnumValue: SpecialId.selector [value=$IM($N)]>
    >>> SpecialId.selector.value
    '$IM($N)'
    >>> a = SpecialId.adaptor
    >>> a == '~$IM'
    False
    >>> a == SpecialId.adaptor
    True
    >>> print(a)
    SpecialId.adaptor
    >>> print(a.value)
    ~$IM

Here ``Enum`` is used to provide readable (and syntactically valid!) names for
some special values, as well as group them together.

While ``Enum`` supports this flexibility, one should only use it in
very special cases.  Code will be most readable when actual values of
enumerations aren't important and enumerations are just used for their
naming and comparison properties.


IntEnum
-------

A variation of ``Enum`` is proposed where the enumeration values also
subclass ``int`` - ``IntEnum``.  These values can be compared to
integers; by extension, enumerations of different types can also be
compared to each other::

    >>> from enum import IntEnum
    >>> class Shape(IntEnum):
    ...   circle = 1
    ...   square = 2
    ...
    >>> class Request(IntEnum):
    ...   post = 1
    ...   get = 2
    ...
    >>> Shape == 1
    False
    >>> Shape.circle == 1
    True
    >>> Shape.circle == Request.post
    True

However, they still can't be compared to ``Enum`` values::

    >>> class Shape(IntEnum):
    ...   circle = 1
    ...   square = 2
    ...
    >>> class Colors(Enum):
    ...   red = 1
    ...   green = 2
    ...
    >>> Shape.circle == Colors.red
    False

``IntEnum`` values behave like integers in other ways you'd expect::

    >>> int(Shape.circle)
    1
    >>> ['a', 'b', 'c'][Shape.circle]
    'b'
    >>> [i for i in range(Shape.square)]
    [0, 1]

For the vast majority of code, ``Enum`` is strongly recommended.
Since ``IntEnum`` breaks some semantic promises of an enumeration (by
being comparable to integers, and thus by transitivity to other
unrelated enumerations), it should be used only in special cases where
there's no other choice; for example, when integer constants are
replaced with enumerations and backwards compatibility is required
with code that still expects integers.


Pickling
--------

Enumerations created with the class syntax can also be pickled and
unpickled::

    >>> from enum.tests.fruit import Fruit
    >>> from pickle import dumps, loads
    >>> Fruit.tomato is loads(dumps(Fruit.tomato))
    True


Convenience API
---------------

The ``Enum`` class is callable, providing the following convenience API::

    >>> Animals = Enum('Animals', 'ant bee cat dog')
    >>> Animals
    <Animals {ant: 1, bee: 2, cat: 3, dog: 4}>
    >>> Animals.ant
    <EnumValue: Animals.ant [value=1]>
    >>> Animals.ant.value
    1

The semantics of this API resemble ``namedtuple``. The first argument of
the call to ``Enum`` is the name of the enumeration.  The second argument is
a source of enumeration value names.  It can be a whitespace-separated string
of names, a sequence of names or a sequence of 2-tuples with key/value pairs.
The last option enables assigning arbitrary values to enumerations; the others
auto-assign increasing integers starting with 1.  A new class derived from
``Enum`` is returned.  In other words, the above assignment to ``Animals`` is
equivalent to::

    >>> class Animals(Enum):
    ...   ant = 1
    ...   bee = 2
    ...   cat = 3
    ...   dog = 4

Examples of alternative name/value specifications::

    >>> Enum('Animals', ['ant', 'bee', 'cat', 'dog'])
    <Animals {ant: 1, bee: 2, cat: 3, dog: 4}>
    >>> Enum('Animals', (('ant', 'one'), ('bee', 'two'),
    ...                  ('cat', 'three'), ('dog', 'four')))
    <Animals {dog: four, ant: one, cat: three, bee: two}>

The second argument can also be a dictionary mapping names to values::

    >>> levels = dict(debug=10, info=20, warning=30, severe=40)
    >>> Enum('Levels', levels)
    <Levels {debug: 10, info: 20, warning: 30, severe: 40}>
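
The calling conventions described in this section can be tried out with the
sketch below.  It assumes the ``enum`` module of current Python releases,
whose class repr differs from the one shown above, so the example inspects
names and values directly instead of printing the classes:

```python
from enum import Enum

# Whitespace-separated names: values are auto-assigned starting at 1.
Animals = Enum('Animals', 'ant bee cat dog')

# A sequence of (name, value) pairs assigns arbitrary values.
Levels = Enum('Levels', [('debug', 10), ('info', 20), ('warning', 30)])

animal_values = {member.name: member.value for member in Animals}
level_values = {member.name: member.value for member in Levels}
print(animal_values)
print(level_values)
```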


Proposed variations
===================

Some variations were proposed during the discussions on the mailing list.
Here are some of the more popular ones.


Not having to specify values for enums
--------------------------------------

Michael Foord proposed (and Tim Delaney provided a proof-of-concept
implementation) to use metaclass magic that makes this possible::

    class Color(Enum):
        red, green, blue

The values are actually assigned only when first looked up.

Pros: cleaner syntax that requires less typing for a very common task (just
listing enumeration names without caring about the values).

Cons: involves much magic in the implementation, which makes even the
definition of such enums baffling when first seen.  Besides, explicit is
better than implicit.
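
To make the "magic" concrete, here is a minimal sketch of one way such a
metaclass could work.  It is not the proof-of-concept mentioned above:
``_AutoNamespace`` and ``AutoMeta`` are hypothetical names, and the
attributes end up as plain integers rather than enumeration values.  The
trick is a class namespace whose ``__missing__`` hook assigns a value the
first time an unknown name is looked up:

```python
class _AutoNamespace(dict):
    """Class namespace that assigns 1, 2, 3, ... to unknown names on lookup."""

    def __init__(self):
        super().__init__()
        self.names = []

    def __missing__(self, key):
        # Let dunder lookups (such as __name__) fall through to globals.
        if key.startswith('__') and key.endswith('__'):
            raise KeyError(key)
        # Any other unknown name is treated as a new enumeration item; this
        # would also swallow typos and builtins, which is part of the "magic".
        self.names.append(key)
        self[key] = len(self.names)
        return self[key]


class AutoMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases):
        # The class body is executed with this mapping as its namespace.
        return _AutoNamespace()

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        cls._names = tuple(namespace.names)
        return cls


class Color(metaclass=AutoMeta):
    red, green, blue


print(Color._names, Color.red, Color.green, Color.blue)
```

Even in this reduced form, the class body only works because of a lookup
hook that is invisible at the point of definition, which is the readability
concern raised under "Cons".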


Using special names or forms to auto-assign enum values
-------------------------------------------------------

A different approach to avoid specifying enum values is to use a special name
or form to auto-assign them.  For example::

    class Color(Enum):
        red = None          # auto-assigned to 0
        green = None        # auto-assigned to 1
        blue = None         # auto-assigned to 2

More flexibly::

    class Color(Enum):
        red = 7
        green = None        # auto-assigned to 8
        blue = 19
        purple = None       # auto-assigned to 20

Some variations on this theme:

#. A special name ``auto`` imported from the enum package.
#. Georg Brandl proposed ellipsis (``...``) instead of ``None`` to achieve
   the same effect.

Pros: no need to manually enter values. Makes it easier to change the enum and
extend it, especially for large enumerations.

Cons: actually longer to type in many simple cases.  The argument of explicit
vs. implicit applies here as well.
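
The numbering rule used in the ``None`` examples above (continue from the
previous value, or from 0 at the start) can be sketched without any changes
to the class syntax.  ``fill_values`` is a hypothetical helper, and the
result is passed to the functional API of the ``enum`` module in current
Python releases, which is an assumption of this example:

```python
from enum import Enum


def fill_values(pairs, start=0):
    """Replace None values with the previous value plus one.

    pairs is a sequence of (name, value_or_None) tuples.  A None with no
    preceding explicit value becomes start.
    """
    filled = []
    next_value = start
    for name, value in pairs:
        if value is None:
            value = next_value
        filled.append((name, value))
        next_value = value + 1
    return filled


Color = Enum('Color', fill_values([
    ('red', 7),
    ('green', None),     # becomes 8
    ('blue', 19),
    ('purple', None),    # becomes 20
]))
print([(c.name, c.value) for c in Color])
```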


Use-cases in the standard library
=================================

The Python standard library has many places where the usage of enums would be
beneficial to replace other idioms currently used to represent them.  Such
usages can be divided into two categories: user-code facing constants, and
internal constants.

User-code facing constants like ``os.SEEK_*``, ``socket`` module constants,
decimal rounding modes and HTTP error codes could require backwards
compatibility since user code may expect integers.  ``IntEnum`` as described
above provides the required semantics; being a subclass of ``int``, it does not
affect user code that expects integers, while on the other hand allowing
printable representations for enumeration values::

    >>> import socket
    >>> family = socket.AF_INET
    >>> family == 2
    True
    >>> print(family)
    SocketFamily.AF_INET

Internal constants are not seen by user code but are employed internally by
stdlib modules.  These can be implemented with ``Enum``.  Some examples
uncovered by a very partial skim through the stdlib: ``binhex``, ``imaplib``,
``http/client``, ``urllib/robotparser``, ``idlelib``, ``concurrent.futures``,
``turtledemo``.

In addition, looking at the code of the Twisted library, there are many use
cases for replacing internal state constants with enums.  The same can be said
about a lot of networking code (especially implementation of protocols) and
can be seen in test protocols written with the Tulip library as well.


Differences from PEP 354
========================

Unlike PEP 354, enumeration values are not defined as a sequence of strings,
but as attributes of a class.  This design was chosen because it was felt that
class syntax is more readable.

Unlike PEP 354, enumeration values require an explicit integer value.  This
difference recognizes that enumerations often represent real-world values, or
must interoperate with external real-world systems.  For example, to store an
enumeration in a database, it is better to convert it to an integer on the way
in and back to an enumeration on the way out.  Providing an integer value also
provides an explicit ordering.  However, there is no automatic conversion to
and from the integer values, because explicit is better than implicit.
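
The database round trip mentioned above might look like the following
sketch.  The table layout and the ``to_db`` and ``from_db`` helpers are
hypothetical, and the code assumes the ``enum`` module of current Python
releases together with an in-memory SQLite database:

```python
import sqlite3
from enum import Enum


class Colors(Enum):
    red = 1
    green = 2
    blue = 3


def to_db(color):
    """Explicit conversion on the way in: store the integer value."""
    return color.value


def from_db(stored):
    """Explicit conversion on the way out: find the value for this integer."""
    for color in Colors:
        if color.value == stored:
            return color
    raise ValueError('no Colors value for %r' % (stored,))


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE paint (color INTEGER)')
conn.execute('INSERT INTO paint VALUES (?)', (to_db(Colors.green),))
(stored,) = conn.execute('SELECT color FROM paint').fetchone()
restored = from_db(stored)
conn.close()
print(stored, restored is Colors.green)
```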

Unlike PEP 354, this implementation does use a metaclass to define the
enumeration's syntax, and allows for extended base-enumerations so that the
common values in derived classes are identical (a singleton model).  While PEP
354 dismisses this approach for its complexity, in practice any perceived
complexity, though minimal, is hidden from users of the enumeration.

Unlike PEP 354, enumeration values should only be tested by identity
comparison.  This is to emphasize the fact that enumeration values are
singletons, much like ``None``.


Acknowledgments
===============

This PEP describes the ``flufl.enum`` package by Barry Warsaw.  ``flufl.enum``
is based on an example by Jeremy Hylton.  It has been modified and extended
by Barry Warsaw for use in the GNU Mailman [5]_ project.  Ben Finney is the
author of the earlier enumeration PEP 354.


References
==========

.. [1] http://pythonhosted.org/flufl.enum/docs/using.html
.. [2] http://www.python.org/dev/peps/pep-0354/
.. [3] http://mail.python.org/pipermail/python-ideas/2013-January/019003.html
.. [4] http://mail.python.org/pipermail/python-ideas/2013-February/019373.html
.. [5] http://www.list.org


Copyright
=========

This document has been placed in the public domain.


Todo
====

 * Mark PEP 354 "superseded by" this one, if accepted

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: