[Python-Dev] Unpickling py2 str as py3 bytes (and vice versa) - implementation (issue #6784)

Merlijn van Deen valhallasw at arctus.nl
Fri Mar 16 22:19:10 CET 2012


Hi Guido,

Let me start by thanking you for your long reply. It has clarified
some points for me, but I am still not certain about some others. I
hope the following explains why I'm confused about this issue.

First of all, let me clarify that I wrote my original mail not as 'the
guy who wants to serialize stuff' but as 'the guy who wonders what the
best way to implement it in Python is'. Of course, 'not at all' is a
reasonable answer to that question.

On 13 March 2012 23:08, Guido van Rossum <guido at python.org> wrote:
> That was probably written before Python 3. Python 3 also dropped the
> long-term backwards compatibilities for the language and stdlib. I am
> certainly fine with adding a warning to the docs that this guarantee
> does not apply to the Python 2/3 boundary. But I don't think we should
> map 8-bit str instances from Python 2 to bytes in Python 3.

Yes, backwards compatibility was dropped, but the current pickle
module tries to work around this by using a module mapping [1] and
aids in loading 8-bit str instances by asking for an encoding [2].
Finally, we can /write/ old-version pickles, for which the same
module mapping is used, but in reverse. As such, the module suggests
in many ways that it should be possible to interchange pickles
between Python 2 and Python 3.
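
To make the module mapping concrete, here is a minimal sketch of both
directions. It assumes CPython 3 with the default fix_imports
behaviour; the variable names and the hand-written Python 2 style
pickle are only for illustration.

```python
import pickle

# Writing: with protocol 2 and fix_imports=True, Python 3 names are
# written under their Python 2 names (builtins.range -> __builtin__.xrange).
py2_style = pickle.dumps(range, protocol=2, fix_imports=True)
py3_style = pickle.dumps(range, protocol=2, fix_imports=False)
print(b"__builtin__" in py2_style)
print(b"builtins" in py3_style)

# Reading: a hand-written protocol 0 pickle that refers to the
# Python 2 global __builtin__.xrange (GLOBAL opcode, then STOP).
py2_pickle = b"c__builtin__\nxrange\n."
loaded = pickle.loads(py2_pickle, fix_imports=True)
print(loaded is range)
```

The mapping only renames modules and globals; it does nothing for the
contents of 8-bit str objects, which is where the encoding argument
comes in.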

> My snipe was mostly in reference to the many other things that can go
> wrong with pickled data as your environment evolves (...)

I understand your point. However, my interpretation of this issue
has always been 'if you only pickle built-in types, you'll be fine' -
which is apparently wrong.


Essentially - my point is this: considering the pickle module is
already using several compatibility tricks and considering I am not
the only one who would like to read binary data from a pickle in
Python 3 - even though it might not be the 'right' way to do it - what
is there /against/ adding the possibility?

Finally, this is what people are now doing instead [3]:
    s = pickle.load(f, encoding='latin1')
    b = s.encode('latin1')
    print(zlib.decompress(b))

Which hurts my eyes.
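
For completeness, a self-contained sketch of that round trip. Since
Python 3 cannot produce a Python 2 str pickle itself, the pickle below
is built by hand from the protocol 2 opcodes (PROTO, SHORT_BINSTRING,
STOP); the payload and the variable names are made up for the example.

```python
import pickle
import zlib

original = b"binary payload from Python 2"
compressed = zlib.compress(original)
assert len(compressed) < 256  # SHORT_BINSTRING stores the length in one byte

# What Python 2 would write for an 8-bit str holding `compressed`:
# PROTO 2, SHORT_BINSTRING, length byte, raw data, STOP.
py2_pickle = b"\x80\x02U" + bytes([len(compressed)]) + compressed + b"."

# latin1 maps every byte 0-255 to the code point with the same value,
# so decoding and re-encoding gives back the original bytes.
s = pickle.loads(py2_pickle, encoding="latin1")
b = s.encode("latin1")
print(zlib.decompress(b) == original)
```

It works only because latin1 is a lossless byte-to-code-point mapping;
with the default encoding the load would most likely fail on the
compressed data.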

In any case - again, thanks for taking the time to respond. I hope I
have clarified why I was (and still am) somewhat confused about the
issue, and why I think that it is still a good idea ;-)

Best,
Merlijn

[1] http://hg.python.org/cpython/file/8b2668e60aef/Lib/_compat_pickle.py
[2] http://docs.python.org/dev/library/pickle.html#module-interface
[3] http://stackoverflow.com/questions/4281619/unpicking-data-pickled-in-python-2-5-in-python-3-1-then-uncompressing-with-zlib

