[Python-Dev] Unpickling py2 str as py3 bytes (and vice versa) - implementation (issue #6784)

Guido van Rossum guido at python.org
Tue Mar 13 22:13:31 CET 2012


On Tue, Mar 13, 2012 at 12:42 PM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
>
> On 13 Mar 2012, at 04:44, Merlijn van Deen wrote:
>
>> http://bugs.python.org/issue6784 ("byte/unicode pickle
>> incompatibilities between python2 and python3")
>>
>> Hello all,
>>
>> Currently, pickle unpickles python2 'str' objects as python3 'str'
>> objects, where the encoding to use is passed to the Unpickler.
>> However, there are cases where it makes more sense to unpickle a
>> python2 'str' as python3 'bytes' - for instance when it is actually
>> binary data, and not text.
>>
>> Currently, the mapping is as follows, when reading a pickle:
>> python2 'str' -> python3 'str' (using an encoding supplied to Unpickler)
>> python2 'unicode' -> python3 'str'
>>
>> or, when creating a pickle using protocol <= 2:
>> python3 'str' -> python2 'unicode'
>> python3 'bytes' -> python2 '__builtins__.bytes object'
>>
>
>
> It does seem unfortunate that by default it is impossible for a
> developer to "do the right thing" as regards pickling / unpickling
> here. Binary data on Python 2 being unpickled as Unicode on Python 3
> is presumably for the convenience of developers doing the *wrong
> thing* (and only works for ASCII anyway).

Well, since trying to migrate data between versions using pickle is
the "wrong" thing anyway, I think the status quo is just fine.
Developers doing the "right" thing don't use pickle for this purpose.

-- 
--Guido van Rossum (python.org/~guido)
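
The unpickling behaviour discussed in this thread can be sketched in current Python 3 releases as below. This is an illustration only, not code from the thread: the byte string PY2_STR_PICKLE is a hand-written protocol 2 pickle of the kind Python 2 emits for an 8-bit 'str', and the variable names are made up for the example. It relies on the `encoding` argument of `pickle.loads`, which in current releases also accepts the special value "bytes".

```python
import pickle

# Hand-written protocol 2 pickle of a Python 2 'str' holding binary data:
# PROTO 2, SHORT_BINSTRING (length 2, payload ff 00), BINPUT 0, STOP.
# (Assumed to be representative of Python 2 output; not taken from the thread.)
PY2_STR_PICKLE = b"\x80\x02U\x02\xff\x00q\x00."

# encoding="bytes" keeps the 8-bit string as a Python 3 'bytes' object.
as_bytes = pickle.loads(PY2_STR_PICKLE, encoding="bytes")

# encoding="latin1" decodes every byte to the code point of the same value,
# so the result is a Python 3 'str' and no byte is rejected.
as_text = pickle.loads(PY2_STR_PICKLE, encoding="latin1")

# The default (encoding="ASCII", errors="strict") cannot decode byte 0xff.
try:
    pickle.loads(PY2_STR_PICKLE)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

print(type(as_bytes).__name__, type(as_text).__name__, default_failed)
```

With the default encoding only pure-ASCII Python 2 strings load cleanly, which matches the "only works for ascii" remark quoted above.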