in a pickle
duncan smith
duncan at invalid.invalid
Wed Mar 6 20:18:59 EST 2019
On 06/03/2019 20:24, Peter Otten wrote:
> duncan smith wrote:
>
>> On 06/03/2019 16:14, duncan smith wrote:
>>> Hello,
>>> I've been trying to figure out why one of my classes can be
>>> pickled but not unpickled. (I realise the problem is probably with the
>>> pickling, but I get the error when I attempt to unpickle.)
>>>
>>> A relatively minimal example is pasted below.
>>>
>>>
>>> >>> import pickle
>>> >>> class test(dict):
>>>     def __init__(self, keys, shape=None):
>>>         self.shape = shape
>>>         for key in keys:
>>>             self[key] = None
>>>
>>>     def __setitem__(self, key, val):
>>>         print (self.shape)
>>>         dict.__setitem__(self, key, val)
>>>
>>>
>>> >>> x = test([1,2,3])
>>> None
>>> None
>>> None
>>> >>> s = pickle.dumps(x)
>>> >>> y = pickle.loads(s)
>>> Traceback (most recent call last):
>>>   File "<pyshell#114>", line 1, in <module>
>>>     y = pickle.loads(s)
>>>   File "<pyshell#111>", line 8, in __setitem__
>>>     print (self.shape)
>>> AttributeError: 'test' object has no attribute 'shape'
>>>
>>>
>>> I have DuckDuckGo'ed the issue and have tinkered with __getnewargs__ and
>>> __getnewargs_ex__ without being able to figure out exactly what's going
>>> on. If I comment out the print call, then it seems to be fine. I'd
>>> appreciate any pointers to the underlying problem. I have one or two
>>> other things I can do to try to isolate the issue further, but I think
>>> the example is perhaps small enough that someone in the know could spot
>>> the problem at a glance. Cheers.
>>>
>>> Duncan
>>>
>>
>> OK, this seems to be a "won't fix" bug dating back to 2003
>> (https://bugs.python.org/issue826897). The workaround,
>>
>>
>> class DictPlus(dict):
>>     def __init__(self, *args, **kwargs):
>>         self.extra_thing = ExtraThingClass()
>>         dict.__init__(self, *args, **kwargs)
>>     def __setitem__(self, k, v):
>>         try:
>>             do_something_with(self.extra_thing, k, v)
>>         except AttributeError:
>>             self.extra_thing = ExtraThingClass()
>>             do_something_with(self.extra_thing, k, v)
>>         dict.__setitem__(self, k, v)
>>     def __setstate__(self, adict):
>>         pass
>>
>>
>> doesn't work around the problem for me because I need the actual value
>> of self.shape from the original instance. But I only need it for sanity
>> checking, and under the assumption that the original instance was valid,
>> I don't need to do this when unpickling. I haven't managed to find a
>> workaround that exploits that (yet?). Cheers.
>
> I've been playing around with __getnewargs__(), and it looks like you can
> get it to work with a custom __new__(). Just set the shape attribute there
> rather than in __init__():
>
> $ cat pickle_dict_subclass.py
> import pickle
>
>
> class A(dict):
>     def __new__(cls, keys=(), shape=None):
>         obj = dict.__new__(cls)
>         obj.shape = shape
>         return obj
>
>     def __init__(self, keys=(), shape=None):
>         print("INIT")
>         for key in keys:
>             self[key] = None
>         print("EXIT")
>
>     def __setitem__(self, key, val):
>         print(self.shape, ": ", key, " <-- ", val, sep="")
>         super().__setitem__(key, val)
>
>     def __getnewargs__(self):
>         print("GETNEWARGS")
>         return ("xyz", self.shape)
>
> x = A([1, 2, 3], shape="SHAPE")
> x["foo"] = "bar"
> print("pickling:")
> s = pickle.dumps(x)
> print("unpickling:")
> y = pickle.loads(s)
> print(y)
> $ python3 pickle_dict_subclass.py
> INIT
> SHAPE: 1 <-- None
> SHAPE: 2 <-- None
> SHAPE: 3 <-- None
> EXIT
> SHAPE: foo <-- bar
> pickling:
> GETNEWARGS
> unpickling:
> SHAPE: 1 <-- None
> SHAPE: 2 <-- None
> SHAPE: 3 <-- None
> SHAPE: foo <-- bar
> {1: None, 2: None, 3: None, 'foo': 'bar'}
>
> It's not clear to me how the dict items survive when they are not included
> in the __getnewargs__() result, but apparently they do.
>
>
>
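On how the items survive: as far as I can tell (this is my reading of the
pickle docs, not something I have traced through the pickle source), the
default reduction for a dict subclass carries the items separately from both
the __getnewargs__() result and the instance attributes, and the unpickler
puts them back with obj[key] = value before it restores the attributes. A
small sketch to check that; the class Probe and its seen log are made up
for the illustration.

```python
import pickle


class Probe(dict):
    """Hypothetical dict subclass, for illustration only."""

    seen = []  # class-level log of (key, was 'shape' already set?)

    def __init__(self, keys=(), shape=None):
        self.shape = shape
        for key in keys:
            self[key] = None

    def __setitem__(self, key, val):
        # Record whether the instance already has its 'shape' attribute.
        Probe.seen.append((key, "shape" in self.__dict__))
        dict.__setitem__(self, key, val)


x = Probe([1, 2, 3], shape="SHAPE")

# Default reduction for protocol 2 and up (CPython):
# (callable, args, state, list items, dict items)
func, args, state, listitems, dictitems = x.__reduce_ex__(2)
items = list(dictitems)
print(state)  # the instance attributes
print(items)  # the key/value pairs, kept apart from the attributes

# Round trip, logging only the __setitem__() calls made by the unpickler.
Probe.seen.clear()
y = pickle.loads(pickle.dumps(x))
print(Probe.seen)  # each entry records whether 'shape' existed yet
print(y.shape)
```

If that reading is right, it would also explain the AttributeError in my
original example: __setitem__() runs before shape has been restored.
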
Thanks Peter. The docs for pickle say "When a class instance is
unpickled, its __init__() method is usually not invoked. The default
behaviour first creates an uninitialized instance and then restores the
saved attributes." Your outputs above seem to confirm that it isn't
calling my __init__(), but it *is* calling my __setitem__() when it is
restoring the dict items. I actually have about 50-60 lines of code in
my real __setitem__(), but I probably don't need it if I'm unpickling.
My __setitem__() actually updates a secondary data structure which was
why I thought I needed to do more than just restore the items, but if
that's going to be restored in the same way I just need the dict items
to be restored. The following seems to achieve that.

    def __setitem__(self, key, val):
        try:
            self.shape
        except AttributeError:
            dict.__setitem__(self, key, val)
        else:
            # do other stuff

A bit close to the original workaround that I didn't think would work
(although that used __setstate__() for some reason). I'm not sure I
prefer it to defining __new__() though. Ideally I'd have something like,

    def __setitem__(self, key, val):
        if unpickling:
            dict.__setitem__(self, key, val)
        else:
            # do other stuff

rather than potentially masking a bug. Anyway, thanks for your solution
and the outputs that pointed me to another possible solution. Cheers.
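
For what it's worth, one way to get close to the "if unpickling" version
might be to take over the reduction so that the unpickler never goes through
__setitem__() at all. A minimal sketch, assuming the attributes (including
any secondary structure) can simply be copied across as they are; the names
Checked, _rebuild and calls are invented for the example and are not from my
real code.

```python
import pickle


def _rebuild(cls, items, attrs):
    """Hypothetical helper: recreate an instance without calling
    cls.__init__() or cls.__setitem__()."""
    obj = dict.__new__(cls)
    obj.__dict__.update(attrs)  # restore shape etc. first
    dict.update(obj, items)     # plain dict update, bypasses __setitem__
    return obj


class Checked(dict):
    def __init__(self, keys=(), shape=None):
        self.shape = shape
        self.calls = 0
        for key in keys:
            self[key] = None

    def __setitem__(self, key, val):
        # Stand-in for the real sanity checks and bookkeeping.
        self.calls += 1
        dict.__setitem__(self, key, val)

    def __reduce__(self):
        # Hand pickle a plain dict of the items plus a copy of the
        # attributes; _rebuild() puts both back on the other side.
        return (_rebuild, (type(self), dict(self), self.__dict__.copy()))


x = Checked([1, 2, 3], shape=(3,))
y = pickle.loads(pickle.dumps(x))
print(type(y).__name__, y, y.shape, y.calls)
```

Here calls comes back with the pickled value rather than being bumped again,
which suggests __setitem__() was not run during unpickling. The cost is that
_rebuild() has to stay importable under the same name for old pickles to
load.
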
Duncan
More information about the Python-list
mailing list