[New-bugs-announce] [issue37852] Pickling doesn't work for name-mangled private methods

Josh Rosenberg report at bugs.python.org
Wed Aug 14 12:58:36 EDT 2019


New submission from Josh Rosenberg <shadowranger+python at gmail.com>:

Inspired by this Stack Overflow question, where it prevented using multiprocessing.Pool.map with a private method: https://stackoverflow.com/q/57497370/364696

The __name__ of a private method remains the unmangled form, even though only the mangled form exists in the class dictionary for lookup. The __reduce__ for bound methods doesn't handle private names specially, so it serializes the method such that the other end performs getattr(method.__self__, method.__func__.__name__). On deserializing, pickle tries to perform that lookup, but only the mangled name exists, so it dies with an AttributeError.
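To make the mismatch concrete, here is a small sketch that inspects the pieces described above. The Spam class mirrors the repro in this report; what __reduce__ returns for the attribute name is printed rather than assumed, since it may differ between interpreter versions.

```python
class Spam:
    def __eggs(self):
        return "eggs"


spam = Spam()
method = spam._Spam__eggs  # explicit mangled access from outside the class

# The class dictionary only holds the mangled name...
print('_Spam__eggs' in vars(Spam))   # True
print('__eggs' in vars(Spam))        # False

# ...but the function still reports the name as written in the source.
print(method.__func__.__name__)      # __eggs

# What the bound method's __reduce__ asks the unpickler to call:
# a callable plus its arguments (the instance and an attribute name).
func, args = method.__reduce__()
print(func, args[1])
```

If the printed attribute name is the unmangled __eggs, the getattr call performed at load time cannot succeed.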

Minimal repro:

import pickle

class Spam:
    def __eggs(self):
        pass
    def eggs(self):
        return pickle.dumps(self.__eggs)

spam = Spam()
pkl = spam.eggs()                       # Succeeds via implicit mangling (but pickles unmangled name)
pickle.loads(pkl)                       # Fails (tries to load __eggs, but only _Spam__eggs exists)

Explicitly mangling via pickle.dumps(spam._Spam__eggs) fails too, and in the same way.

A similar problem occurs (on the serializing end) when you do:

pkl = pickle.dumps(Spam._Spam__eggs)    # Pickling function in Spam class, not bound method of Spam instance

though that failure occurs at serialization time, because pickle itself tries to look up <module>.Spam.__eggs (which doesn't exist), instead of <module>.Spam._Spam__eggs (which does).

That failure is at least less harmful than the bound method case, because:

1. It fails at serialization time (so it doesn't silently produce pickles that can never be unpickled)
2. It's an explicit PicklingError, with a message that explains what it tried to do, and why it failed ("Can't pickle <function Spam.__eggs at 0xdeadbeef>: attribute lookup Spam.__eggs on __main__ failed")
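A hedged sketch of the function case, written so it runs to completion either way. try_pickle is a hypothetical helper introduced only for this illustration; on interpreters affected by this issue it should report an error at dumps time, while an interpreter that handles mangled names may report success instead.

```python
import pickle


class Spam:
    def __eggs(self):
        pass


def try_pickle(obj):
    """Return ('ok', payload) or ('error', exception) instead of raising."""
    try:
        return 'ok', pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as exc:
        return 'error', exc


# Plain function stored on the class, not a bound method of an instance.
status, detail = try_pickle(Spam._Spam__eggs)
print(status, detail)
```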

The use case on Stack Overflow was the implicit case: a public method of a class created a multiprocessing.Pool and tried to call Pool.map with a private method of the same class as the mapper function. Pickling methods normally seems odd, but for multiprocessing it's pretty standard.

I think the correct fix here is to make method_reduce in classobject.c (the __reduce__ implementation for bound methods) perform the mangling itself. meth_reduce in methodobject.c has the same bug, but it's less critical: only private methods of built-in/extension types would be affected, and most of the time such private methods aren't exposed to Python at all; they're just static functions for direct calling in C.
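As a rough pure-Python illustration of that idea (not the C change itself), the sketch below installs a mangling-aware reducer on a single Pickler through its dispatch_table. mangle, reduce_bound_method and dumps_with_mangling are hypothetical names made up for this example, and walking the MRO to find the defining class is my assumption about how the right class name could be chosen.

```python
import copyreg
import io
import pickle
import types


def mangle(class_name, name):
    """Apply Python's private-name mangling rule to `name`."""
    if not name.startswith("__") or name.endswith("__"):
        return name
    stripped = class_name.lstrip("_")
    # A class name made only of underscores triggers no mangling.
    return "_" + stripped + name if stripped else name


def reduce_bound_method(method):
    """Hypothetical reducer: getattr(self, name), with `name` mangled."""
    func, owner = method.__func__, method.__self__
    name = func.__name__
    mro = owner.__mro__ if isinstance(owner, type) else type(owner).__mro__
    for klass in mro:
        candidate = mangle(klass.__name__, name)
        attr = vars(klass).get(candidate)
        # Unwrap classmethod objects before comparing.
        if getattr(attr, "__func__", attr) is func:
            name = candidate
            break
    return getattr, (owner, name)


def dumps_with_mangling(obj):
    """Pickle `obj` with the reducer installed on this pickler only."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf)
    pickler.dispatch_table = copyreg.dispatch_table.copy()
    pickler.dispatch_table[types.MethodType] = reduce_bound_method
    pickler.dump(obj)
    return buf.getvalue()


class Spam:
    def __eggs(self):
        return "eggs"

    def private_method(self):
        return self.__eggs  # compiled as self._Spam__eggs


class SubSpam(Spam):
    pass


restored = pickle.loads(dumps_with_mangling(Spam().private_method()))
print(restored())
```

Mangling against the defining class rather than type(self) matters for subclasses: for a SubSpam instance the attribute is still _Spam__eggs, not _SubSpam__eggs.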

This would handle all bound methods. For "unbound methods" (read: functions defined in a class), it might also be good to update save_global/get_deep_attribute in _pickle.c to recognize the case where a component of a dotted name begins with two underscores (and doesn't end with them) and the prior component is a class. Pickling a private unbound method (i.e. a plain function that happens to be defined on a class) would then also work, instead of dying with a lookup error.
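The dotted-name rule described here can be sketched in Python as follows. resolve_dotted and mangle are hypothetical helpers for illustration only, and a SimpleNamespace stands in for the module that pickle would normally import; this is not the actual _pickle.c logic.

```python
import types


def mangle(class_name, name):
    """Apply Python's private-name mangling rule to `name`."""
    if not name.startswith("__") or name.endswith("__"):
        return name
    stripped = class_name.lstrip("_")
    # A class name made only of underscores triggers no mangling.
    return "_" + stripped + name if stripped else name


def resolve_dotted(root, dotted_name):
    """Walk `dotted_name` from `root`, mangling private parts under a class."""
    obj = root
    for part in dotted_name.split("."):
        if isinstance(obj, type):
            # The prior component is a class, so a private component
            # must be looked up under its mangled name.
            part = mangle(obj.__name__, part)
        obj = getattr(obj, part)
    return obj


class Spam:
    def __eggs(self):
        pass


# Stand-in for the module object that holds Spam.
fake_module = types.SimpleNamespace(Spam=Spam)

found = resolve_dotted(fake_module, "Spam.__eggs")
print(found is Spam._Spam__eggs)  # True
```

Non-private components pass through mangle unchanged, so ordinary dotted names resolve exactly as a plain getattr chain would.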

The fix is most important, and least costly, for bound methods, but I think doing it for plain functions is still worthwhile, since I could easily see Pool.map operations using an @staticmethod utility function defined privately in the class for encapsulation purposes, and it seems silly to force them to make it more public and/or remove it from the class.

----------
components: Interpreter Core, Library (Lib)
messages: 349716
nosy: josh.r
priority: normal
severity: normal
status: open
title: Pickling doesn't work for name-mangled private methods
versions: Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37852>
_______________________________________

