Pickling of Enums
How should Enum items be pickled, by value or by name? I think that Enum will be used to collect system-dependent constants, so the value of AddressFamily.AF_UNIX can be 1 on one platform and 2 on another. If enums are pickled by value, then AddressFamily.AF_INET pickled on one platform can be unpickled as AddressFamily.AF_UNIX on another platform. This looks weird and contrary to the nature of enums.
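The concern can be simulated inside a single process. The sketch below is only an illustration: FamilyOnA and FamilyOnB are hypothetical stand-ins for the same enum as it might be defined on two platforms with different numbering, and no pickle call is involved. It only contrasts transporting the value with transporting the name.

```python
from enum import IntEnum


# Hypothetical stand-ins for one enum as defined on two platforms.
class FamilyOnA(IntEnum):
    AF_UNIX = 1
    AF_INET = 2


class FamilyOnB(IntEnum):
    AF_INET = 1
    AF_UNIX = 2


sent = FamilyOnA.AF_INET

# By value: only the integer travels, and platform B interprets it
# with its own numbering.
by_value = FamilyOnB(sent.value)

# By name: the member name travels and is looked up on platform B.
by_name = FamilyOnB[sent.name]

print(by_value.name)  # AF_UNIX
print(by_name.name)   # AF_INET
```

Transport by value silently changes the meaning, while transport by name keeps it, provided the name exists on the receiving side.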
On Sat, 15 Feb 2014 21:01:36 +0200 Serhiy Storchaka <storchaka@gmail.com> wrote:
How should Enum items be pickled, by value or by name?
I think that Enum will be used to collect system-dependent constants, so the value of AddressFamily.AF_UNIX can be 1 on one platform and 2 on another. If enums are pickled by value, then AddressFamily.AF_INET pickled on one platform can be unpickled as AddressFamily.AF_UNIX on another platform. This looks weird and contrary to the nature of enums.
I agree with you, they should be pickled by name. An enum is a kind of global in this regard. (But of course, before AF_UNIX was an enum, it was pickled by value.)
Regards
Antoine.
On 02/15/2014 11:01 AM, Serhiy Storchaka wrote:
How should Enum items be pickled, by value or by name?
I think that Enum will be used to collect system-dependent constants, so the value of AddressFamily.AF_UNIX can be 1 on one platform and 2 on another. If enums are pickled by value, then AddressFamily.AF_INET pickled on one platform can be unpickled as AddressFamily.AF_UNIX on another platform. This looks weird and contrary to the nature of enums.
There is one more wrinkle to pickling by name (it's actually still there in pickle by value, just more obvious in pickle by name) -- aliases. It seems to me the most common scenario for having a name represent different values on different systems is when on system A they are different, but on system B they are the same:
System A:
    class SystemEnum(Enum):
        value1 = 1
        value2 = 2
System B:
    class SystemEnum(Enum):
        value1 = 1
        value2 = 1
If you're on system B there is no way to pickle (by name or value) value2 such that we get value2 back on system A. The only way I know of to make that work would be to dispense with identity comparison, use the normal == comparison, and have aliases actually be separate objects (we could still use singletons, but it would be one per name instead of the current one per value, and it would also be an implementation detail).
Thoughts?
-- ~Ethan~
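A small simulation of the two systems may make the wrinkle concrete. SystemA and SystemB are hypothetical class names standing in for SystemEnum as defined on each system; the transport is simulated with plain name and value lookups rather than an actual pickle.

```python
from enum import Enum


# Hypothetical stand-ins for SystemEnum as defined on systems A and B.
class SystemA(Enum):
    value1 = 1
    value2 = 2


class SystemB(Enum):
    value1 = 1
    value2 = 1  # alias: resolves to the value1 member


member = SystemB.value2

# On system B the member already reports the canonical name.
print(member.name)  # value1

# Whichever attribute is transported, system A reconstructs value1.
print(SystemA[member.name].name)   # value1
print(SystemA(member.value).name)  # value1
```

The information that the code asked for value2 is lost before any serialization happens, which is why neither strategy can recover it.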
I'm confused. Hasn't this all been decided by the PEP long ago?
On Tue, Feb 18, 2014 at 9:11 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 02/15/2014 11:01 AM, Serhiy Storchaka wrote:
How should Enum items be pickled, by value or by name?
I think that Enum will be used to collect system-dependent constants, so the value of AddressFamily.AF_UNIX can be 1 on one platform and 2 on another. If enums are pickled by value, then AddressFamily.AF_INET pickled on one platform can be unpickled as AddressFamily.AF_UNIX on another platform. This looks weird and contrary to the nature of enums.
There is one more wrinkle to pickling by name (it's actually still there in pickle by value, just more obvious in pickle by name) -- aliases. It seems to me the most common scenario for having a name represent different values on different systems is when on system A they are different, but on system B they are the same:
System A:
    class SystemEnum(Enum):
        value1 = 1
        value2 = 2
System B:
    class SystemEnum(Enum):
        value1 = 1
        value2 = 1
If you're on system B there is no way to pickle (by name or value) value2 such that we get value2 back on system A. The only way I know of to make that work would be to dispense with identity comparison, use the normal == comparison, and have aliases actually be separate objects (we could still use singletons, but it would be one per name instead of the current one per value, and it would also be an implementation detail).
Thoughts?
-- ~Ethan~
_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ guido%40python.org
-- --Guido van Rossum (python.org/~guido)
On 02/18/2014 09:47 AM, Guido van Rossum wrote:
I'm confused. Hasn't this all been decided by the PEP long ago?
The PEP only mentions pickling briefly, as in "the normal rules apply". How pickling occurs is an implementation detail, and it turns out that pickling by name is more robust.
Serhiy, as part of his argument for using the _name_ instead of the _value_ for pickling, brought up the point that different systems could have different values for the same name. If true in practice (and I believe it is) this raises the issue of aliases, which currently *cannot* be pickled by name because there is no distinct object for the alias. If you ask for Color['alias_for_red'] you'll get Color.red instead.
Using identity comparison was part of the PEP.
I guess the question is which is more important? Identity comparison or this (probably) rare use-case? If we stick with identity I'm not aware of any work-around for pickling enum members that are aliases on one system, but distinct on another.
I've been talking about pickling specifically, but this applies to any serialization method.
-- ~Ethan~
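The alias behaviour described here can be checked directly. The Color class below is a hypothetical example built around the alias_for_red name used in the message; it shows that the alias name survives only as a key in the class mapping, not as an object of its own.

```python
from enum import Enum


class Color(Enum):
    red = 1
    green = 2
    alias_for_red = 1  # same value as red, so it becomes an alias


# Looking up the alias returns the canonical member.
print(Color['alias_for_red'] is Color.red)  # True
print(Color.alias_for_red.name)             # red

# Iteration skips aliases; __members__ keeps every name.
print([m.name for m in Color])   # ['red', 'green']
print(list(Color.__members__))   # ['red', 'green', 'alias_for_red']
```

Because the member only knows its canonical name, any by-name serialization of Color.alias_for_red would write "red".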
Hm. But there's an implementation that has made it unscathed through several betas and an RC. AFAICT that beta pickles enums by value. And I happen to think that that is the better choice (but I don't have time to explain this gut feeling until after 3.4 has been released).
On Tue, Feb 18, 2014 at 10:01 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 02/18/2014 09:47 AM, Guido van Rossum wrote:
I'm confused. Hasn't this all been decided by the PEP long ago?
The PEP only mentions pickling briefly, as in "the normal rules apply". How pickling occurs is an implementation detail, and it turns out that pickling by name is more robust.
Serhiy, as part of his argument for using the _name_ instead of the _value_ for pickling, brought up the point that different systems could have different values for the same name. If true in practice (and I believe it is) this raises the issue of aliases, which currently *cannot* be pickled by name because there is no distinct object for the alias. If you ask for Color['alias_for_red'] you'll get Color.red instead.
Using identity comparison was part of the PEP.
I guess the question is which is more important? Identity comparison or this (probably) rare use-case? If we stick with identity I'm not aware of any work-around for pickling enum members that are aliases on one system, but distinct on another.
I've been talking about pickling specifically, but this applies to any serialization method.
-- ~Ethan~
-- --Guido van Rossum (python.org/~guido)
On 02/18/2014 10:05 AM, Guido van Rossum wrote:
Hm. But there's an implementation that has made it unscathed through several betas and an RC. AFAICT that beta pickles enums by value. And I happen to think that that is the better choice (but I don't have time to explain this gut feeling until after 3.4 has been released).
This conversation wasn't in the PEP, but as I recall we decided to go with value instead of name for json because the receiving end may not be running Python.
Is having json do it one way and pickle another a problem?
-- ~Ethan~
I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)
$ python3
Python 3.4.0rc1+ (default:2ba583191550, Feb 11 2014, 16:05:24)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket, pickle, json, pickletools
>>> socket.AF_INET
<AddressFamily.AF_INET: 2>
>>> pickle.dumps(socket.AF_INET)
b'\x80\x03csocket\nAddressFamily\nq\x00K\x02\x85q\x01Rq\x02.'
>>> json.dumps(socket.AF_INET)
'2'
>>> pickletools.dis(pickle.dumps(socket.AF_INET))
    0: \x80 PROTO      3
    2: c    GLOBAL     'socket AddressFamily'
   24: q    BINPUT     0
   26: K    BININT1    2
   28: \x85 TUPLE1
   29: q    BINPUT     1
   31: R    REDUCE
   32: q    BINPUT     2
   34: .    STOP
highest protocol among opcodes = 2
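The same check can be scripted instead of read off a disassembly. This is a minimal sketch that only looks for the member name in the pickled bytes; what the first line prints depends on the Python version in use, so no expected output is given for it.

```python
import pickle
import socket

payload = pickle.dumps(socket.AF_INET)

# If the member were pickled by name, the name would appear in the payload.
print(b"AF_INET" in payload)

# Within one interpreter the round trip restores the same member either way.
print(pickle.loads(payload) is socket.AF_INET)  # True
```

A round trip inside a single interpreter cannot distinguish the two strategies; the difference only shows up when the payload is loaded where the numbering differs.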
On Tue, Feb 18, 2014 at 10:16 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 02/18/2014 10:05 AM, Guido van Rossum wrote:
Hm. But there's an implementation that has made it unscathed through several betas and an RC. AFAICT that beta pickles enums by value. And I happen to think that that is the better choice (but I don't have time to explain this gut feeling until after 3.4 has been released).
This conversation wasn't in the PEP, but as I recall we decided to go with value instead of name for json because the receiving end may not be running Python.
Is having json do it one way and pickle another a problem?
-- ~Ethan~
-- --Guido van Rossum (python.org/~guido)
On 02/18/2014 11:20 AM, Guido van Rossum wrote:
I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)
There's an open issue [1] to switch to pickling by name.
-- ~Ethan~
[1] http://bugs.python.org/issue20653
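For readers wondering what pickling by name could look like, one possible shape is a __reduce_ex__ hook that rebuilds the member with getattr on the class. This is an assumption for illustration only, not the patch attached to the issue; the Color class is hypothetical, and the reduce result is applied by hand instead of going through pickle.

```python
from enum import Enum


class Color(Enum):
    red = 1
    green = 2

    def __reduce_ex__(self, protocol):
        # Rebuild as getattr(Color, 'green') instead of Color(2).
        return getattr, (type(self), self.name)


# pickle stores the returned callable and arguments; applying them by hand
# shows what unpickling would do with them.
func, args = Color.green.__reduce_ex__(3)

print(args[1])                     # green
print(func(*args) is Color.green)  # True
```

With such a hook the payload carries the string "green" rather than the number 2, so the receiving side resolves the member through its own definition of the class.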
Well, I'm against that.
On Tue, Feb 18, 2014 at 11:26 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 02/18/2014 11:20 AM, Guido van Rossum wrote:
I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)
There's an open issue [1] to switch to pickling by name.
-- ~Ethan~
-- --Guido van Rossum (python.org/~guido)
18.02.14 21:20, Guido van Rossum wrote:
I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)
Pickling was not even working two weeks ago. [1]
[1] http://bugs.python.org/issue20534
Well, I still think it should be done by value.
On Tue, Feb 18, 2014 at 11:53 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
18.02.14 21:20, Guido van Rossum wrote:
I'm confused. AFAICT enums are pickled by value too. What am I missing?
Are we confused about terminology or about behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)
Pickling was not even working two weeks ago. [1]
[1] http://bugs.python.org/issue20534
-- --Guido van Rossum (python.org/~guido)
On 02/18/2014 11:53 AM, Serhiy Storchaka wrote:
18.02.14 21:20, Guido van Rossum wrote:
I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)
Pickling was not even working two weeks ago. [1]
For the record, pickling worked just fine for protocols 2 and 3, and 4 didn't exist at the time.
-- ~Ethan~
18.02.14 20:16, Ethan Furman wrote:
This conversation wasn't in the PEP, but as I recall we decided to go with value instead of name for json because the receiving end may not be running Python.
Is having json do it one way and pickle another a problem?
We decided to go with value instead of name for JSON because JSON doesn't support enums, but supports integers and strings, and because enums are comparable with their values, but not with their names.
>>> json.loads(json.dumps(socket.AF_INET)) == socket.AF_INET
True
We simply had no other choice.
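The reasoning can be reproduced with a self-contained enum. Family below is a hypothetical stand-in for socket.AddressFamily with fixed numbers, so the result does not depend on the platform.

```python
import json
from enum import IntEnum


class Family(IntEnum):  # hypothetical stand-in for socket.AddressFamily
    AF_UNIX = 1
    AF_INET = 2


encoded = json.dumps(Family.AF_INET)
decoded = json.loads(encoded)

print(encoded)                            # 2
print(decoded == Family.AF_INET)          # True: compares equal to its value
print("AF_INET" == Family.AF_INET)        # False: does not compare to its name
print(Family(decoded) is Family.AF_INET)  # True: lookup by value restores it
```

A plain integer coming back from JSON still compares equal to the member, which would not hold for a string carrying the name.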
On Tue, 18 Feb 2014 10:01:42 -0800 Ethan Furman <ethan@stoneleaf.us> wrote:
I guess the question is which is more important? Identity comparison or this (probably) rare use-case? If we stick with identity I'm not aware of any work-around for pickling enum members that are aliases on one system, but distinct on another.
I don't think identity comparison is important. Enum values are supposed to act like values, not full-blown objects. OTOH, the "pickled aliases may end up different on other systems" issue is sufficiently fringy that we may simply paper over it.
Regards
Antoine.
18.02.14 19:11, Ethan Furman wrote:
There is one more wrinkle to pickling by name (it's actually still there in pickle by value, just more obvious in pickle by name) -- aliases. It seems to me the most common scenario for having a name represent different values on different systems is when on system A they are different, but on system B they are the same:
System A:
    class SystemEnum(Enum):
        value1 = 1
        value2 = 2
System B:
    class SystemEnum(Enum):
        value1 = 1
        value2 = 1
If you're on system B there is no way to pickle (by name or value) value2 such that we get value2 back on system A. The only way I know of to make that work would be to dispense with identity comparison, use the normal == comparison, and have aliases actually be separate objects (we could still use singletons, but it would be one per name instead of the current one per value, and it would also be an implementation detail).
Thoughts?
There are aliases and aliases. If there is a modern name and a deprecated name, then they should be one object referred to by different names on all systems. If there are different entities with accidentally equal values, then they should be different objects.
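The two kinds of alias can be told apart with the existing Enum machinery if the value carries more than the bare number. The sketch below is one hypothetical layout, not something proposed in the thread: each value is a (label, number) tuple, and the member names and the number property are invented for illustration.

```python
from enum import Enum


class SystemEnum(Enum):
    # A modern name and its deprecated spelling: equal values, so the
    # second name is a true alias of the first.
    modern = ("modern", 1)
    deprecated = ("modern", 1)

    # A different entity whose number happens to coincide on this system;
    # the differing label keeps it a separate member.
    other = ("other", 1)

    @property
    def number(self):
        # The system-specific constant carried by the member.
        return self.value[1]


print(SystemEnum.deprecated is SystemEnum.modern)          # True
print(SystemEnum.other is SystemEnum.modern)               # False
print(SystemEnum.other.number == SystemEnum.modern.number)  # True
print([m.name for m in SystemEnum])  # ['modern', 'other']
```

The cost of this layout is that lookup by number, such as SystemEnum(1), no longer works, so it trades one convenience for distinct member objects.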
participants (4)
- Antoine Pitrou
- Ethan Furman
- Guido van Rossum
- Serhiy Storchaka