How to spell PyInstance_NewRaw in py3k?
I am reposting the same question because it seems to have gone unnoticed. Antoine Pitrou and I had a brief discussion on the tracker, but did not reach an agreement on whether more elaborate code is needed to replace PyInstance_NewRaw than a simple type->tp_alloc() call.
I have reviewed the patch again and I am convinced that this issue comes into play only when 3.x loads 2.x pickles that contain instances of classic classes. (Specifically, this code is used to process INST and OBJ pickle opcodes that are not produced by 3.x.) This means that Antoine's concern that "tomorrow [object_new()] may entail additional operations" is not valid - there is no tomorrow for 2.x. :-)
This also means that the type cannot inherit from anything other than object and thus cannot have funny tp_flags or tp_alloc that does not create a usable object.
I would like to commit the patch as presented. If a corner case is discovered later where type->tp_alloc() is not sufficient, we can deal with it then.
On Mon, Jun 28, 2010 at 3:59 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
Issue #5180 [1] presented an interesting challenge: how to unpickle instances of old-style classes when a pickle created with 2.x is loaded in Python 3.x? The problem is that the pickle protocol requires that unpickled instances be created without calling the __init__ method. This is necessary because a pickle file may not contain information about how the __init__ method should be invoked. Instead, implementations are required to bypass __init__ and populate the instance's __dict__ directly using data found in the pickle.
The pure Python implementation uses the following trick, which happens to work in 3.x:

    class Empty:
        pass

    pickled = Empty()
    pickled.__class__ = Pickled

This, of course, creates a new-style class in 3.x, but if the 3.x version of Pickled behaves similarly to its 2.x predecessor, it should work.
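As a runnable illustration of the trick quoted above, here is a minimal sketch for Python 3. The `Pickled` class body, the `new_raw` helper, and the sample state dictionary are all hypothetical; only the `Empty` / `__class__` reassignment comes from the message.

```python
class Pickled:
    """Hypothetical stand-in for a class whose instances were pickled in 2.x."""

    init_calls = 0  # counts how often __init__ actually runs

    def __init__(self, x):
        Pickled.init_calls += 1
        self.x = x

    def double(self):
        return 2 * self.x


class Empty:
    pass


def new_raw(cls, state):
    """Hypothetical helper: build an instance of cls without calling cls.__init__."""
    obj = Empty()               # Empty has no __init__ of its own
    obj.__class__ = cls         # re-brand the bare instance as cls
    obj.__dict__.update(state)  # populate attributes straight from the saved state
    return obj


restored = new_raw(Pickled, {"x": 21})
print(type(restored).__name__, restored.double(), Pickled.init_calls)
```

The reassignment works here because both classes are plain Python classes with the same instance layout; a class with `__slots__` or a C-implemented base would not accept it.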
The cPickle implementation, on the other hand, uses a 2.x C API function that is not available in 3.x, namely PyInstance_NewRaw. In order to fix the bug described in issue #5180, I had to emulate PyInstance_NewRaw using type->tp_alloc. I considered and rejected the idea of using tp_new instead. [2]
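The difference between going through tp_new and allocating directly has a rough Python-level analogue, sketched below. This is not the C patch: Python code cannot call tp_alloc itself, so `object.__new__(cls)` is used here as the closest available stand-in, and the `Strict` class and both helper functions are hypothetical.

```python
class Strict:
    """Hypothetical class whose __new__ insists on an argument."""

    def __new__(cls, value):
        return super().__new__(cls)

    def __init__(self, value):
        self.value = value


def via_new(cls):
    # Analogue of calling the type's own tp_new with no arguments.
    return cls.__new__(cls)


def via_raw_alloc(cls):
    # Analogue of a bare allocation: skips any __new__ override on cls.
    return object.__new__(cls)


try:
    via_new(Strict)
    new_failed = False
except TypeError:
    # Strict.__new__ requires 'value', which a pickle of this kind does not supply.
    new_failed = True

raw = via_raw_alloc(Strict)
raw.__dict__.update({"value": 3})  # state is restored afterwards, as unpickling does
print(new_failed, raw.value)
```

The sketch only shows why a class-specific constructor can be an obstacle when no constructor arguments are available; whether the bare allocation is always safe is exactly the question debated in the rest of the thread.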
Is this the right way to proceed? The patch is attached to the issue. [3]
[1] http://bugs.python.org/issue5180
[2] http://bugs.python.org/issue5180#msg108846
[3] http://bugs.python.org/file17792/issue5180.diff
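To make the INST opcode mentioned in this thread concrete, the sketch below loads a hand-written protocol 0 pickle in Python 3. The byte string is an assumption modelled on what 2.x emits for a classic instance, with the memo opcodes left out; the module name `legacy_mod` and the class `Point` are hypothetical and are registered in `sys.modules` only so that the unpickler can find them.

```python
import pickle
import pickletools
import sys
import types


class Point:
    """Hypothetical 3.x port of a 2.x classic class."""

    init_calls = 0

    def __init__(self, x):
        Point.init_calls += 1
        self.x = x


# Make the class importable under the module name stored in the pickle.
legacy = types.ModuleType("legacy_mod")
Point.__module__ = "legacy_mod"
legacy.Point = Point
sys.modules["legacy_mod"] = legacy

# MARK, INST legacy_mod.Point, MARK, DICT, STRING 'x', INT 1, SETITEM, BUILD, STOP
data = b"(ilegacy_mod\nPoint\n(dS'x'\nI1\nsb."

opnames = [op.name for op, arg, pos in pickletools.genops(data)]
obj = pickle.loads(data)

print(opnames)
print(type(obj).__name__, obj.x, Point.init_calls)
```

The instance comes back with its `__dict__` filled in by the BUILD opcode, and `__init__` is never invoked, which is the behaviour the pickle protocol requires.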
On Wed, 14 Jul 2010 19:24:28 -0400 Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I am reposting the same question because it seems to have gone unnoticed. Antoine Pitrou and I had a brief discussion on the tracker, but did not reach an agreement on whether more elaborate code is needed to replace PyInstance_NewRaw than a simple type->tp_alloc() call.
I have reviewed the patch again and I am convinced that this issue comes into play only when 3.x loads 2.x pickles that contain instances of classic classes. (Specifically, this code is used to process INST and OBJ pickle opcodes that are not produced by 3.x.) This means that Antoine's concern that "tomorrow [object_new()] may entail additional operations" is not valid - there is no tomorrow for 2.x. :-)
But there *is* a tomorrow in 3.x and that's what we are talking about. Your code is meant to emulate object_new() in 3.x.
This also means that the type cannot inherit from anything other than object and thus cannot have funny tp_flags or tp_alloc that does not create a usable object.
Why can't it inherit from something other than object?
On Thu, Jul 15, 2010 at 6:29 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Wed, 14 Jul 2010 19:24:28 -0400 Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote: ..
This means that Antoine's concern that "tomorrow [object_new()] may entail additional operations" is not valid - there is no tomorrow for 2.x. :-)
But there *is* a tomorrow in 3.x and that's what we are talking about. Your code is meant to emulate object_new() in 3.x.
Yes, I realized that after I hit the send button. However, it is 2.x classic instances that are being unpickled, and it would not be reasonable for the 3.x objects that are expected to emulate them to do anything non-trivial. The need to faithfully emulate classic instances unpickled from 2.x may be a valid constraint on the future evolution of object_new().
This also means that the type cannot inherit from anything other than object and thus cannot have funny tp_flags or tp_alloc that does not create a usable object.
Why can't it inherit from something other than object?
Because this would not be a reasonable way to forward port 2.x classic classes and expect them to interoperate with 2.x pickles. There are many ways to break unpickling of old pickles by modifying the class in the new version of the code.
The serious question for me is whether a valid tp_alloc can create a partially initialized object that will crash the interpreter when one of its methods is called. I don't think this is the case, because you certainly need to be able to delete the object if tp_new fails, and that can call arbitrary code.
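For a plain Python class, the effect of a "partially initialized" instance can be observed directly. The sketch below is a Python-level illustration only, with a hypothetical `Account` class; it says nothing about C-implemented types, which is where the concern raised above would actually apply.

```python
class Account:
    """Hypothetical class whose methods assume __init__ has run."""

    def __init__(self, balance):
        self.balance = balance

    def describe(self):
        return "balance=%d" % self.balance


# Allocate without running __init__, as unpickling does before restoring state.
raw = object.__new__(Account)

try:
    raw.describe()
    outcome = "ok"
except AttributeError:
    # The missing attribute surfaces as an ordinary exception, not a crash.
    outcome = "AttributeError"

print(outcome)
```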
participants (2)
- Alexander Belopolsky
- Antoine Pitrou