PEP 435: pickling enums created with the functional API

One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
Color = Enum('Color', 'red blue green')
The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern.
At this point, the pickle docs say that module-top-level classes can be pickled. This obviously works for the normal Enum classes, but is a problem with the functional API because the class is created dynamically and has no __module__.
To solve this, the reference implementation uses the same approach as namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real code has some safeguards):
module_name = sys._getframe(1).f_globals['__name__']
enum_class.__module__ = module_name
According to an earlier discussion, this works on CPython, PyPy and Jython, but not on IronPython. The alternative that works everywhere is to define the Enum like this:
Color = Enum('the_module.Color', 'red blue green')
The reference implementation supports this as well.
Some points for discussion:
1) We can say that using the functional API when pickling can happen is not recommended, but maybe a better way would be to just explain the way things are and let users decide?
2) namedtuple should also support the fully qualified name syntax. If this is agreed upon, I can create an issue.
3) Antoine mentioned that work is being done in 3.4 to enable pickling of nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets implemented, I don't see a reason why Enum and namedtuple can't be adjusted to find the __qualname__ of the class they're internal to. Am I missing something?
4) Using _getframe(N) here seems like an overkill to me. What we really need is just the module in which the current execution currently is (i.e. the metaclass's __new__ in our case). Would it make sense to add a new function somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides the current module name? It seems that all Pythons should be able to easily provide it; it's certainly a very small subset of the functionality provided by walking the callframe stack. This function can then be used to build fully qualified names for pickling of Enum and namedtuple. Moreover, it can be useful even more widely - dynamic class building is quite common in Python code, and as Nick mentioned somewhere earlier, the extra power of metaclasses in the recent 3.x's will probably make it even more common.
Eli
(*) namedtuple uses an explicit function to build the resulting class, not a metaclass (there's no class syntax for namedtuple).
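The frame-inspection approach described in this message can be sketched roughly as follows. This is only an illustration, not the reference implementation: make_class is a hypothetical factory built on plain type(), and the fallbacks stand in for the safeguards that the excerpt leaves out.

```python
import sys


def make_class(name):
    """Hypothetical factory: build a class and record the caller's module."""
    cls = type(name, (), {})
    try:
        # Frame 1 is whoever called make_class(); its globals hold __name__.
        frame = sys._getframe(1)
        cls.__module__ = frame.f_globals.get('__name__', '__main__')
    except (AttributeError, ValueError):
        # No sys._getframe on this implementation, or no caller frame:
        # keep whatever __module__ type() assigned.
        pass
    return cls


Color = make_class('Color')
print(Color.__module__)
```

The class is only picklable if the recorded module really has an attribute with that name, which is why the module-level assignment matters.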

2013/5/7 Eli Bendersky <eliben@gmail.com>:
4) Using _getframe(N) here seems like an overkill to me. What we really need is just the module in which the current execution currently is (i.e. the metaclass's __new__ in our case). Would it make sense to add a new function somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides the current module name? It seems that all Pythons should be able to easily provide it; it's certainly a very small subset of the functionality provided by walking the callframe stack. This function can then be used to build fully qualified names for pickling of Enum and namedtuple. Moreover, it can be useful even more widely - dynamic class building is quite common in Python code, and as Nick mentioned somewhere earlier, the extra power of metaclasses in the recent 3.x's will probably make it even more common.
What about adding simple syntax (I proposed this earlier, but no one commented) that takes care of assigning name and module, something like:
def name = expression
which would be a rough equivalent of:
name = expression
name.__name__ = 'name'
name.__module__ = __name__
-- 闇に隠れた黒い力 弱い心を操る
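Since "def name = expression" is not valid Python, the expansion it stands for has to be written by hand today. A minimal sketch, assuming a hypothetical build() that returns some dynamically created class; the __qualname__ line is an addition not in the proposal, included because pickle in current Python 3 looks classes up by __qualname__:

```python
def build():
    """Hypothetical stand-in for any expression that returns a class."""
    return type('anonymous', (), {})


# What "def Color = build()" would roughly expand to under the proposal:
Color = build()
Color.__name__ = 'Color'
Color.__qualname__ = 'Color'  # addition: keeps pickle's lookup name in sync
Color.__module__ = __name__

print(Color.__name__)
```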

On 05/07/2013 07:48 AM, Piotr Duda wrote:
What about adding simple syntax (I proposed this earlier, but no one commented) that takes care of assigning name and module, something like:
def name = expression
which would be a rough equivalent of:
name = expression
name.__name__ = 'name'
name.__module__ = __name__
How is that different from
--> name = Enum('module.name', ... )
?
-- ~Ethan~

2013/5/7 Ethan Furman <ethan@stoneleaf.us>:
On 05/07/2013 07:48 AM, Piotr Duda wrote:
What about adding simple syntax (I proposed this earlier, but no one commented) that takes care of assigning name and module, something like:
def name = expression
which would be a rough equivalent of:
name = expression
name.__name__ = 'name'
name.__module__ = __name__
How is that different from
--> name = Enum('module.name', ... )
?
It's DRY.
-- 闇に隠れた黒い力 弱い心を操る

On 05/07/2013 08:01 AM, Piotr Duda wrote:
2013/5/7 Ethan Furman <ethan@stoneleaf.us>:
On 05/07/2013 07:48 AM, Piotr Duda wrote:
What about adding simple syntax (I proposed this earlier, but no one commented) that takes care of assigning name and module, something like:
def name = expression
which would be a rough equivalent of:
name = expression
name.__name__ = 'name'
name.__module__ = __name__
How is that different from
--> name = Enum('module.name', ... )
?
It's DRY.
How? You need to provide a complete example:
Do you mean something like:
--> def mymodule.Color('red green blue')
?
-- ~Ethan~

2013/5/7 Ethan Furman <ethan@stoneleaf.us>:
On 05/07/2013 08:01 AM, Piotr Duda wrote:
2013/5/7 Ethan Furman <ethan@stoneleaf.us>:
On 05/07/2013 07:48 AM, Piotr Duda wrote:
What about adding simple syntax (I proposed this earlier, but no one commented) that takes care of assigning name and module, something like:
def name = expression
which would be a rough equivalent of:
name = expression
name.__name__ = 'name'
name.__module__ = __name__
How is that different from
--> name = Enum('module.name', ... )
?
It's DRY.
How? You need to provide a complete example:
Do you mean something like:
--> def mymodule.Color('red green blue')
def Color = Enum('red green blue')
-- 闇に隠れた黒い力 弱い心を操る

On Tue, May 7, 2013 at 8:35 AM, Piotr Duda <duda.piotr@gmail.com> wrote:
2013/5/7 Ethan Furman <ethan@stoneleaf.us>:
On 05/07/2013 08:01 AM, Piotr Duda wrote:
2013/5/7 Ethan Furman <ethan@stoneleaf.us>:
On 05/07/2013 07:48 AM, Piotr Duda wrote:
What about adding simple syntax (I proposed this earlier, but no one commented) that takes care of assigning name and module, something like:
def name = expression
which would be a rough equivalent of:
name = expression
name.__name__ = 'name'
name.__module__ = __name__
How is that different from
--> name = Enum('module.name', ... )
?
It's DRY.
How? You need to provide a complete example:
Do you mean something like:
--> def mymodule.Color('red green blue')
def Color = Enum('red green blue')
It's an interesting idea, but as Nick suggested we should probably discuss it on the python-ideas list. It occurred to me while thinking about the duplication in "Color = Enum('Color', '...')" that if "Enum" had some magical way to know the name of the variable it's assigned to, the duplication would not be needed. But then, it obviously is fragile because what's this: somedict[key] = Enum('Color', ...).
A special syntax raises more questions though, because it has to be defined very precisely. Feel free to come up with a complete proposal to python-ideas, defining the interesting semantics.
Eli

On 05/07/2013 08:47 AM, Eli Bendersky wrote:
def Color = Enum('red green blue')
It's an interesting idea, but as Nick suggested we should probably discuss it on the python-ideas list. [...]
A special syntax raises more questions though, because it has to be defined very precisely. Feel free to come up with a complete proposal to python-ideas, defining the interesting semantics.
We don't need a special syntax, we can already do this:
@Enum('red green blue')
def Color(): pass
Here, Enum would take the one argument, and return a function working as a function decorator. That decorator would ignore the body of the function and return the Enum. It's awful, but then so is the idea of creating special syntax just for the functional form of Enum--if we're willing to go down that road, let's just add new syntax for enums and be done with it.
As for the non-pickleability of enums created with the functional interface, why can't it use the same mechanism (whatever it is) as the three-argument form of type? Types created that way are dynamic, yet have a __module__ and are pickleable.
//arry/
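The decorator trick can be tried out with today's enum module. The stdlib Enum does not itself act as a decorator factory, so the sketch below uses a hypothetical wrapper named enum_from; it relies on the keyword-only module argument that the functional API accepts.

```python
from enum import Enum


def enum_from(names):
    """Hypothetical decorator factory: turn a dummy function into an Enum."""
    def decorator(func):
        # The function body is ignored; only its name and module are used.
        return Enum(func.__name__, names, module=func.__module__)
    return decorator


@enum_from('red green blue')
def Color():
    pass


print(list(Color))
```

The function statement supplies both the name and the defining module, so neither has to be repeated as a string.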

On Wed, May 8, 2013 at 12:53 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 05/07/2013 07:48 AM, Piotr Duda wrote:
What about adding simple syntax (I proposed this earlier, but no one commented) that takes care of assigning name and module, something like:
def name = expression
which would be a rough equivalent of:
name = expression
name.__name__ = 'name'
name.__module__ = __name__
How is that different from
--> name = Enum('module.name', ... )
With the repetition, you're setting yourself up for bugs in future maintenance when either the module name or the assigned name changes.
I like Piotr's suggestion of simply assigning to __name__ and __module__ after the fact, though - much simpler than my naming context idea.
Cheers, Nick.

On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky <eliben@gmail.com> wrote:
One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
Color = Enum('Color', 'red blue green')
The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern.
At this point, the pickle docs say that module-top-level classes can be pickled. This obviously works for the normal Enum classes, but is a problem with the functional API because the class is created dynamically and has no __module__.
To solve this, the reference implementation uses the same approach as namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real code has some safeguards):
module_name = sys._getframe(1).f_globals['__name__']
enum_class.__module__ = module_name
According to an earlier discussion, this works on CPython, PyPy and Jython, but not on IronPython. The alternative that works everywhere is to define the Enum like this:
Color = Enum('the_module.Color', 'red blue green')
The reference implementation supports this as well.
Some points for discussion:
1) We can say that using the functional API when pickling can happen is not recommended, but maybe a better way would be to just explain the way things are and let users decide?
It's probably worth creating a section in the pickle docs and explaining the vagaries of naming things and the dependency on knowing the module name. The issue comes up with defining classes in __main__ and when implementing pseudo-modules as well (see PEP 395).
2) namedtuple should also support the fully qualified name syntax. If this is agreed upon, I can create an issue.
Yes, I think that part should be done.
3) Antoine mentioned that work is being done in 3.4 to enable pickling of nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets implemented, I don't see a reason why Enum and namedtuple can't be adjusted to find the __qualname__ of the class they're internal to. Am I missing something?
The class based form should still work (assuming only classes are involved), the stack inspection will likely fail.
4) Using _getframe(N) here seems like an overkill to me.
It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name.
What we really need is just the module in which the current execution currently is (i.e. the metaclass's __new__ in our case). Would it make sense to add a new function somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides the current module name? It seems that all Pythons should be able to easily provide it; it's certainly a very small subset of the functionality provided by walking the callframe stack. This function can then be used to build fully qualified names for pickling of Enum and namedtuple. Moreover, it can be useful even more widely - dynamic class building is quite common in Python code, and as Nick mentioned somewhere earlier, the extra power of metaclasses in the recent 3.x's will probably make it even more common.
Yes, I've been thinking along these lines myself, although in a slightly more expanded form that also touches on the issues that stalled PEP 406 (the import engine API that tries to better encapsulate the import state). It may also potentially address some issues with initialisation of C extensions (I don't remember the exact details off the top of my head, but there's some info we want to get from the import machinery to modules initialised from Cython, but the loader API and the C module initialisation API both get in the way).
Specifically, what I'm talking about is some kind of implicit context similar to the approach the decimal module uses to control operations on Decimal instances. In this case, what we're trying to track is the "active module", either __main__ (if the code has been triggered directly through an operation in that module), or else the module currently being imported (if the import machinery has been invoked).
The bare minimum would just be to store the __name__ (using sys.modules to get access to the full module if needed) in a way that adequately handles nested, circular and threaded imports, but there may be a case for tracking a richer ModuleContext object instead.
However, there's also a separate question of whether implicitly tracking the active module is really what we want. Do we want that, or is what we actually want the ability to define an arbitrary "naming context" in order to use functional APIs to construct classes without losing the pickle integration of class statements?
What if there was a variant of the class statement that bound the result of a function call rather than using the normal syntax:
class Animal from enum.Enum(members="dog cat bear")
And it was only class statements in that form which manipulated the naming context? (you could also use the def keyword rather than class)
Either form would essentially be an ordinary assignment statement, *except* that they would manipulate the naming context to record the name being bound *and* relevant details of the active module.
Regardless, I think the question is not really well enough defined to be a topic for python-dev, even though it came up in a python-dev discussion - it's more python-ideas territory.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
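The fragility of _getframe(1) when a convenience function sits between the user and the constructor can be shown in a few lines. This is a sketch under stated assumptions: make_class is a hypothetical factory, and the utility module app.utils is simulated by exec'ing a helper into a separate globals dict rather than importing a real module.

```python
import sys


def make_class(name):
    """Hypothetical factory that records the immediate caller's module."""
    cls = type(name, (), {})
    caller_globals = sys._getframe(1).f_globals
    cls.__module__ = caller_globals.get('__name__', '__main__')
    return cls


# Simulate a utility module called 'app.utils' that wraps the factory.
utils_globals = {'__name__': 'app.utils', 'make_class': make_class}
exec("def helper(name):\n    return make_class(name)\n", utils_globals)
helper = utils_globals['helper']

Direct = make_class('Direct')   # records the module of this code
Wrapped = helper('Wrapped')     # records 'app.utils', not this module

print(Direct.__module__, Wrapped.__module__)
```

Pickle would later look for Wrapped in app.utils, where no such name is bound.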

On Wed, 8 May 2013 01:03:38 +1000, Nick Coghlan <ncoghlan@gmail.com> wrote:
What if there was a variant of the class statement that bound the result of a function call rather than using the normal syntax:
class Animal from enum.Enum(members="dog cat bear")
Apparently you're trying hard to invent syntaxes just to avoid subclassing. Regards Antoine.

On 8 May 2013 01:26, "Antoine Pitrou" <solipsis@pitrou.net> wrote:
On Wed, 8 May 2013 01:03:38 +1000, Nick Coghlan <ncoghlan@gmail.com> wrote:
What if there was a variant of the class statement that bound the result of a function call rather than using the normal syntax:
class Animal from enum.Enum(members="dog cat bear")
Apparently you're trying hard to invent syntaxes just to avoid subclassing.
Yeah, just accepting an auto-numbered "members" arg still seems cleaner to me. If we decouple autonumbering from using the functional API, then the rules for pickle support are simple:
* use the class syntax; or
* pass a fully qualified name.
The fragile getframe hack should not be propagated beyond namedtuple.
Cheers, Nick.

On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky <eliben@gmail.com> wrote:
One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
Color = Enum('Color', 'red blue green')
The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern.
At this point, the pickle docs say that module-top-level classes can be pickled. This obviously works for the normal Enum classes, but is a problem with the functional API because the class is created dynamically and has no __module__.
To solve this, the reference implementation uses the same approach as namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real code has some safeguards):
module_name = sys._getframe(1).f_globals['__name__']
enum_class.__module__ = module_name
According to an earlier discussion, this works on CPython, PyPy and Jython, but not on IronPython. The alternative that works everywhere is to define the Enum like this:
Color = Enum('the_module.Color', 'red blue green')
The reference implementation supports this as well.
Some points for discussion:
1) We can say that using the functional API when pickling can happen is not recommended, but maybe a better way would be to just explain the way things are and let users decide?
It's probably worth creating a section in the pickle docs and explaining the vagaries of naming things and the dependency on knowing the module name. The issue comes up with defining classes in __main__ and when implementing pseudo-modules as well (see PEP 395).
Any pickle-expert volunteers to do this? I guess we can start by creating a documentation issue.
2) namedtuple should also support the fully qualified name syntax. If this is agreed upon, I can create an issue.
Yes, I think that part should be done.
OK, I'll create an issue.
3) Antoine mentioned that work is being done in 3.4 to enable pickling of nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets implemented, I don't see a reason why Enum and namedtuple can't be adjusted to find the __qualname__ of the class they're internal to. Am I missing something?
The class based form should still work (assuming only classes are involved), the stack inspection will likely fail.
It can probably be made to work with a bit more effort than the current "hack", but I don't see why it wouldn't be doable.
4) Using _getframe(N) here seems like an overkill to me.
It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name.
In theory you can climb the frame stack until the desired place, but this is specifically what my proposal of adding a function tries to avoid.
What we really need is just the module in which the current execution currently is (i.e. the metaclass's __new__ in our case). Would it make sense to add a new function somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides the current module name? It seems that all Pythons should be able to easily provide it; it's certainly a very small subset of the functionality provided by walking the callframe stack. This function can then be used to build fully qualified names for pickling of Enum and namedtuple. Moreover, it can be useful even more widely - dynamic class building is quite common in Python code, and as Nick mentioned somewhere earlier, the extra power of metaclasses in the recent 3.x's will probably make it even more common.
Yes, I've been thinking along these lines myself, although in a slightly more expanded form that also touches on the issues that stalled PEP 406 (the import engine API that tries to better encapsulate the import state). It may also potentially address some issues with initialisation of C extensions (I don't remember the exact details off the top of my head, but there's some info we want to get from the import machinery to modules initialised from Cython, but the loader API and the C module initialisation API both get in the way).
Specifically, what I'm talking about is some kind of implicit context similar to the approach the decimal module uses to control operations on Decimal instances. In this case, what we're trying to track is the "active module", either __main__ (if the code has been triggered directly through an operation in that module), or else the module currently being imported (if the import machinery has been invoked).
The bare minimum would just be to store the __name__ (using sys.modules to get access to the full module if needed) in a way that adequately handles nested, circular and threaded imports, but there may be a case for tracking a richer ModuleContext object instead.
However, there's also a separate question of whether implicitly tracking the active module is really what we want. Do we want that, or is what we actually want the ability to define an arbitrary "naming context" in order to use functional APIs to construct classes without losing the pickle integration of class statements?
What if there was a variant of the class statement that bound the result of a function call rather than using the normal syntax:
class Animal from enum.Enum(members="dog cat bear")
And it was only class statements in that form which manipulated the naming context? (you could also use the def keyword rather than class)
Either form would essentially be an ordinary assignment statement, *except* that they would manipulate the naming context to record the name being bound *and* relevant details of the active module.
Regardless, I think the question is not really well enough defined to be a topic for python-dev, even though it came up in a python-dev discussion - it's more python-ideas territory.
Wait... I agree that having a special syntax for this is a novel idea that's not well defined and can be discussed on python-ideas. But the utility function I was mentioning is a pretty simple idea, and it's well defined. It can be very useful in contexts where code is created dynamically, by reducing the number of explicit frame-walking hacks.
Eli

On Tue, 7 May 2013 08:44:46 -0700, Eli Bendersky <eliben@gmail.com> wrote:
4) Using _getframe(N) here seems like an overkill to me.
It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name.
In theory you can climb the frame stack until the desired place, but this is specifically what my proposal of adding a function tries to avoid.
I don't know how you could do it without walking the frame stack. Granted, you don't need all the information that the stack holds (you don't need to know about line numbers, instruction numbers and local variables, for instance :-)), but you still have to walk *some* kind of dynamically-created stack. This isn't something that is solvable statically (as opposed to e.g. a class's __qualname__, which is computed at compile-time). Regards Antoine.

On Tue, May 7, 2013 at 9:14 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 7 May 2013 08:44:46 -0700, Eli Bendersky <eliben@gmail.com> wrote:
4) Using _getframe(N) here seems like an overkill to me.
It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name.
In theory you can climb the frame stack until the desired place, but this is specifically what my proposal of adding a function tries to avoid.
I don't know how you could do it without walking the frame stack. Granted, you don't need all the information that the stack holds (you don't need to know about line numbers, instruction numbers and local variables, for instance :-)), but you still have to walk *some* kind of dynamically-created stack. This isn't something that is solvable statically (as opposed to e.g. a class's __qualname__, which is computed at compile-time).
Yes, I fully realize that. I guess I should have phrased my reply differently - this is what the proposal helps *user code to avoid*. For CPython and PyPy and Jython it will be perfectly reasonable to actually climb the frame stack inside that function. For IronPython, another solution may be required if no such frame stack exists. However, even in IronPython there must be a way to get to the module name? In other words, the goal is to hide an ugly piece of exposed implementation detail behind a library call. The library call can be implemented by each platform according to its own internals, but the user won't care. Eli
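The idea of hiding the frame walk behind a single library call might look something like this. It is a sketch only: calling_module_name is a hypothetical name, not an existing stdlib function, and the fallback branch marks where an implementation without frame support would plug in its own mechanism.

```python
import sys


def calling_module_name(depth=1, default='__main__'):
    """Hypothetical helper: module name `depth` frames above our caller.

    With depth=1 this is the module of whoever called the function
    that called this helper.
    """
    getframe = getattr(sys, '_getframe', None)
    if getframe is None:
        # An implementation without frames would need another mechanism.
        return default
    try:
        frame = getframe(depth + 1)  # +1 skips this helper's own frame
    except ValueError:
        return default  # the stack is not that deep
    return frame.f_globals.get('__name__', default)


def make_class(name):
    """User code no longer touches frames directly."""
    cls = type(name, (), {})
    cls.__module__ = calling_module_name()
    return cls
```

User-level factories such as make_class then stay portable, and only the helper needs a per-implementation body.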

On 2013-05-07, at 17:03 , Nick Coghlan wrote:
Specifically, what I'm talking about is some kind of implicit context similar to the approach the decimal module uses to control operations on Decimal instances.
Wouldn't it be a good occasion to add actual, full-fledged and correctly implemented (and working) dynamically scoped variables? Or extending exceptions to signals (in the Smalltalk/Lisp sense) providing the same feature?

On 05/07/2013 08:03 AM, Nick Coghlan wrote:
On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky wrote:
4) Using _getframe(N) here seems like an overkill to me.
It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name.
What we really need is just the module in which the current execution currently is (i.e. the metaclass's __new__ in our case). Would it make sense to add a new function somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides the current module name? It seems that all Pythons should be able to easily provide it; it's certainly a very small subset of the functionality provided by walking the callframe stack. This function can then be used to build fully qualified names for pickling of Enum and namedtuple. Moreover, it can be useful even more widely - dynamic class building is quite common in Python code, and as Nick mentioned somewhere earlier, the extra power of metaclasses in the recent 3.x's will probably make it even more common.
Perhaps I am being too pedantic, or maybe I'm not thinking in low enough detail, but it seems to me that the module in which execution currently is, is the module in which the currently running code was defined. What we need is a way to get where the currently running code was called from.
And to support those dreaded utility functions, a way to pass along where you were called from so the utility function can lie and say, "Hey, you! Yeah, you Enum! You were called from app.main, not app.utils.misc!"
-- ~Ethan~
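Passing the caller's location through a utility function can be sketched with the keyword-only module argument that the functional API in today's enum module accepts. The utility make_status_enum and the member names are hypothetical; app.main is the illustrative module name from the message above.

```python
from enum import Enum


def make_status_enum(module):
    """Hypothetical utility: the caller says where the enum will live."""
    # Without module=..., frame inspection would record this utility's
    # own module rather than the module that binds the name.
    return Enum('Status', 'ok failed', module=module)


# In real code the caller would normally pass module=__name__.
Status = make_status_enum(module='app.main')
print(Status.__module__)
```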

On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky <eliben@gmail.com> wrote:
One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
Color = Enum('Color', 'red blue green')
The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern.
At this point, the pickle docs say that module-top-level classes can be pickled. This obviously works for the normal Enum classes, but is a problem with the functional API because the class is created dynamically and has no __module__.
To solve this, the reference implementation uses the same approach as namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real code has some safeguards):
module_name = sys._getframe(1).f_globals['__name__']
enum_class.__module__ = module_name
According to an earlier discussion, this works on CPython, PyPy and Jython, but not on IronPython. The alternative that works everywhere is to define the Enum like this:
Color = Enum('the_module.Color', 'red blue green')
The reference implementation supports this as well.
Some points for discussion:
1) We can say that using the functional API when pickling can happen is not recommended, but maybe a better way would be to just explain the way things are and let users decide?
It's probably worth creating a section in the pickle docs and explaining the vagaries of naming things and the dependency on knowing the module name. The issue comes up with defining classes in __main__ and when implementing pseudo-modules as well (see PEP 395).
2) namedtuple should also support the fully qualified name syntax. If this is agreed upon, I can create an issue.
Yes, I think that part should be done.

On 9 May 2013 13:48, "Eli Bendersky" <eliben@gmail.com> wrote:
On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky <eliben@gmail.com> wrote:
One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
Color = Enum('Color', 'red blue green')
The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern.
At this point, the pickle docs say that module-top-level classes can be pickled. This obviously works for the normal Enum classes, but is a problem with the functional API because the class is created dynamically and has no __module__.
To solve this, the reference implementation uses the same approach as namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real code has some safeguards):
module_name = sys._getframe(1).f_globals['__name__']
enum_class.__module__ = module_name
According to an earlier discussion, this works on CPython, PyPy and Jython, but not on IronPython. The alternative that works everywhere is to define the Enum like this:
Color = Enum('the_module.Color', 'red blue green')
The reference implementation supports this as well.
Some points for discussion:
1) We can say that using the functional API when pickling can happen is not recommended, but maybe a better way would be to just explain the way things are and let users decide?
It's probably worth creating a section in the pickle docs and explaining the vagaries of naming things and the dependency on knowing the module name. The issue comes up with defining classes in __main__ and when implementing pseudo-modules as well (see PEP 395).
2) namedtuple should also support the fully qualified name syntax. If this is agreed upon, I can create an issue.
Yes, I think that part should be done.
As Eric noted on the tracker issue, a keyword only "module" argument may be a better choice for both than allowing dotted names. A separate parameter is easier to use with __name__ to avoid hardcoding the module name. At the very least, the PEP should provide a rationale for the current choice. Cheers, Nick.

On Thu, May 9, 2013 at 7:17 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
As Eric noted on the tracker issue, a keyword only "module" argument may be a better choice for both than allowing dotted names. A separate parameter is easier to use with __name__ to avoid hardcoding the module name.
+1. This is a good one. While adding module=__name__ is actually more typing than passing __name__ + '.Color' as the class name, the current proposal (parsing for dots) makes it very attractive to do the wrong thing and hardcode the module name. Then typing the module incorrectly is very easy, and the mistake is easily overlooked because it won't be noticed until you actually try to pickle a member.
At the very least, the PEP should provide a rationale for the current choice.
Cheers, Nick.
Eli
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
-- --Guido van Rossum (python.org/~guido)
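To illustrate the failure mode described above, here is a sketch using the `module` keyword that the standard library's `enum` accepts (the dotted-name form under discussion cannot be run against the stdlib). `the_modul` is a deliberately misspelled, hypothetical module name, and `try_pickle` is a helper invented for this example:

```python
import pickle
from enum import Enum

# 'the_modul' is a typo for some intended module; nothing checks it here.
Color = Enum('Color', 'red blue green', module='the_modul')

# Creating the class and using its members works without complaint.
first = Color.red.value

def try_pickle(obj):
    """Return None on success, or the PicklingError raised by pickle.dumps."""
    try:
        pickle.dumps(obj)
    except pickle.PicklingError as exc:
        return exc
    return None

# Only at this point does the misspelled module name surface.
error = try_pickle(Color.red)
```

The same delayed failure would apply to a hardcoded dotted name; passing `module=__name__` avoids having to spell the module at all.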

On May 09, 2013, at 09:24 AM, Guido van Rossum wrote:
+1. This is a good one. While adding module=__name__ is actually more typing than passing __name__ + '.Color' as the class name, the current proposal (parsing for dots) makes it very attractive to do the wrong thing and hardcode the module name. Then typing the module incorrectly is very easy, and the mistake is easily overlooked because it won't be noticed until you actually try to pickle a member.
Seems reasonable. The `module` argument should be keyword-only, and obviously namedtuple should support the same API. -Barry
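A short sketch of the keyword-only spelling discussed above, written against the `module` parameter that the standard library's `Enum` and (since Python 3.6) `namedtuple` accept. `roundtrip` is a helper name made up for this example:

```python
import pickle
from collections import namedtuple
from enum import Enum

# module=__name__ records the defining module without hardcoding its name.
Color = Enum('Color', 'red blue green', module=__name__)
Point = namedtuple('Point', 'x y', module=__name__)

def roundtrip(obj):
    """Pickle and unpickle obj, returning the restored object."""
    return pickle.loads(pickle.dumps(obj))
```

When this runs at the top level of a script or importable module, `roundtrip(Color.red)` should return the very same member, because pickle can find `Color` by name in the recorded module.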

On Thu, May 9, 2013 at 9:31 AM, Barry Warsaw <barry@python.org> wrote:
On May 09, 2013, at 09:24 AM, Guido van Rossum wrote:
+1. This is a good one. While adding module=__name__ is actually more typing than passing __name__ + '.Color' as the class name, the current proposal (parsing for dots) makes it very attractive to do the wrong thing and hardcode the module name. Then typing the module incorrectly is very easy, and the mistake is easily overlooked because it won't be noticed until you actually try to pickle a member.
Seems reasonable. The `module` argument should be keyword-only, and obviously namedtuple should support the same API.
Yes, this was already pointed out by Eric in http://bugs.python.org/issue17941 which tracks this feature for namedtuple. Eli

On Tue, May 7, 2013 at 8:34 AM, Eli Bendersky <eliben@gmail.com> wrote:
According to an earlier discussion, this works on CPython, PyPy and Jython, but not on IronPython. The alternative that works everywhere is to define the Enum like this:
Color = Enum('the_module.Color', 'red blue green')
The reference implementation supports this as well.
As an alternate bikeshed color, why not pass the receiving module to the class factory when pickle support is desirable? That should be less brittle than its name. The class based syntax can still be recommended to libraries that won't know ahead of time if their values need to be pickled.
Color = Enum('Color', 'red blue green', module=__main__)
Functions that wrap class factories could similarly accept and pass a module along. The fundamental problem is that the class factory cannot know what the intended destination module is without either syntax that provides this ('class' today, proposed 'def' or 'class from' in the thread) or the caller passing additional information around (module name, or module instance). Syntax changes are clearly beyond the scope of PEP 435, otherwise a true enum syntax might have been born. So that leaves us with requiring the caller to provide it.
Michael
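A rough sketch of a wrapper that accepts the receiving module and passes it along. `make_enum` is hypothetical, and because the stdlib `Enum` takes a module name rather than a module object, the wrapper converts one to the other. A throwaway module object stands in for the real destination so the example is self-contained:

```python
import pickle
import sys
import types
from enum import Enum

def make_enum(name, names, *, module):
    """Hypothetical wrapper: accept a module object or a module name."""
    # Module objects carry __name__; plain strings do not.
    module_name = getattr(module, '__name__', module)
    return Enum(name, names, module=module_name)

# Throwaway module standing in for the real destination module.
the_module = types.ModuleType('the_module')
sys.modules['the_module'] = the_module

# The caller names the receiving module and binds the result there.
the_module.Color = make_enum('Color', 'red blue green', module=the_module)

restored = pickle.loads(pickle.dumps(the_module.Color.red))
```

Without the explicit argument, frame inspection inside `Enum` would see the wrapper's module rather than the module where `Color` actually ends up.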

On 07/05/13 23:34, Eli Bendersky wrote:
One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
Color = Enum('Color', 'red blue green')
The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern.
Does this issue really need to be solved before 435 is accepted? As the Zen says:
Now is better than never. Although never is often better than *right* now.
Solving the pickle issue is a hard problem, but not a critical issue. namedtuple has had the same issue since its inception, only worse because there is no class syntax for namedtuple. This has not been a barrier to the success of namedtuple.
Or rather, the issue is not with Enum, or namedtuple, but pickle. Any dynamically-created type will have this issue:
>>> import pickle
>>> def example(name):
...     return type(name, (object,), {})
...
>>> instance = example("Foo")()
>>> pickle.dumps(instance)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class '__main__.Foo'>: attribute lookup __main__.Foo failed
I don't think it is unreasonable to chalk it up to a limitation of pickle, and say that unless you can meet certain conditions, you won't be able to pickle your instance. Either way, approval of PEP 435 should not be dependent on fixing the pickle issue. -- Steven
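The "certain conditions" can be stated fairly precisely: pickle stores a class by reference, so it has to be able to import `cls.__module__` and find the same object under `cls.__qualname__`. A rough sketch of that check follows; `can_pickle_by_reference` is a name invented here and only approximates what pickle does internally:

```python
import importlib
from collections import OrderedDict

def can_pickle_by_reference(cls):
    """Return True if cls can be found again via its module and qualified name."""
    try:
        obj = importlib.import_module(cls.__module__)
        # Walk dotted qualified names such as 'Outer.Inner'.
        for part in cls.__qualname__.split('.'):
            obj = getattr(obj, part)
    except (ImportError, AttributeError):
        return False
    return obj is cls

def example(name):
    return type(name, (object,), {})

# The class is called 'Foo' but is only reachable under another name.
Hidden = example('Foo')
```

Here `can_pickle_by_reference(OrderedDict)` is true while `can_pickle_by_reference(Hidden)` is false, which is the same lookup failure shown in the traceback above.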

On Tue, May 7, 2013 at 6:00 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On 07/05/13 23:34, Eli Bendersky wrote:
One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
Color = Enum('Color', 'red blue green')
The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern.
Does this issue really need to be solved before 435 is accepted? As the Zen says:
Now is better than never. Although never is often better than *right* now.
Solving the pickle issue is a hard problem, but not a critical issue. namedtuple has had the same issue since its inception, only worse because there is no class syntax for namedtuple. This has not been a barrier to the success of namedtuple.
Agreed
Or rather, the issue is not with Enum, or namedtuple, but pickle. Any dynamically-created type will have this issue:
>>> import pickle
>>> def example(name):
...     return type(name, (object,), {})
...
>>> instance = example("Foo")()
>>> pickle.dumps(instance)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class '__main__.Foo'>: attribute lookup __main__.Foo failed
I don't think it is unreasonable to chalk it up to a limitation of pickle, and say that unless you can meet certain conditions, you won't be able to pickle your instance.
Either way, approval of PEP 435 should not be dependent on fixing the pickle issue.
Just to be clear: it was not my intention to delay PEP 435 because of this issue. I don't see it as a blocker to pronouncement and, from private correspondence with Guido, he doesn't either. I merely wanted to start a separate thread because I didn't want this discussion to overwhelm the pronouncement thread.
Eli
participants (11)
- Antoine Pitrou
- Barry Warsaw
- Eli Bendersky
- Ethan Furman
- Guido van Rossum
- Larry Hastings
- Michael Urman
- Nick Coghlan
- Piotr Duda
- Steven D'Aprano
- Xavier Morel