[Python-Dev] Unpickling py2 str as py3 bytes (and vice versa) - implementation (issue #6784)
Guido van Rossum
guido at python.org
Tue Mar 13 22:13:31 CET 2012
On Tue, Mar 13, 2012 at 12:42 PM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
> On 13 Mar 2012, at 04:44, Merlijn van Deen wrote:
>> http://bugs.python.org/issue6784 ("byte/unicode pickle
>> incompatibilities between python2 and python3")
>> Hello all,
>> Currently, pickle unpickles python2 'str' objects as python3 'str'
>> objects, where the encoding to use is passed to the Unpickler.
>> However, there are cases where it makes more sense to unpickle a
>> python2 'str' as python3 'bytes' - for instance when it is actually
>> binary data, and not text.
>> Currently, the mapping is as follows, when reading a pickle:
>> python2 'str' -> python3 'str' (using an encoding supplied to Unpickler)
>> python2 'unicode' -> python3 'str'
>> or, when creating a pickle using protocol <= 2:
>> python3 'str' -> python2 'unicode'
>> python3 'bytes' -> python2 '__builtins__.bytes object'
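[Editorial illustration, not part of the original message.] The reading-side mapping above can be checked from Python 3 with a hand-written pickle. This is only a sketch: the byte string below is assumed to be equivalent to what Python 2 emits for the 8-bit str '\xff\x00' at protocol 2, and load_py2_str is a hypothetical helper name. It relies on the encoding argument of pickle.loads as found in current Python 3 releases, where the special value "bytes" leaves a python2 'str' undecoded.

```python
import pickle

# Hand-written protocol 2 pickle, assumed equivalent to Python 2's output
# for the 8-bit str '\xff\x00':
#   \x80\x02  PROTO 2
#   U\x02     SHORT_BINSTRING, length 2, followed by the two raw bytes
#   q\x00     BINPUT 0 (memo slot)
#   .         STOP
PY2_STR_PICKLE = b"\x80\x02U\x02\xff\x00q\x00."


def load_py2_str(data, encoding):
    """Hypothetical helper: unpickle with an explicit encoding for py2 'str'."""
    return pickle.loads(data, encoding=encoding)


# Decoded as text: python2 'str' -> python3 'str'.
as_text = load_py2_str(PY2_STR_PICKLE, "latin-1")

# Left as binary data: python2 'str' -> python3 'bytes'.
as_bytes = load_py2_str(PY2_STR_PICKLE, "bytes")

# The default encoding is ASCII, so non-ASCII binary data fails to load.
try:
    pickle.loads(PY2_STR_PICKLE)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

print(type(as_text).__name__, type(as_bytes).__name__, default_failed)
```

The choice of encoding applies to every 8-bit string in the pickle at once, so a pickle that mixes text and binary data still needs per-object handling after loading.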
> It does seem unfortunate that by default it is impossible for a
> developer to "do the right thing" as regards pickling / unpickling
> here. Binary data on Python 2 being unpickled as Unicode on Python 3
> is presumably for the convenience of developers doing the *wrong
> thing* (and only works for ASCII anyway).
Well, since trying to migrate data between versions using pickle is
the "wrong" thing anyway, I think the status quo is just fine.
Developers doing the "right" thing don't use pickle for this purpose.
--Guido van Rossum (python.org/~guido)