Complete recursive pickle dump
Currently, pickle can only save some very simple data types into bytes. And the object itself contains reference to some builtin data or variable, only reference will be saved. I am wondering if it is possible recursively persist all the data into pickle representation. Even if some data might be builtin function of variable, it will represent it in very raw (memory level). If it contains function it will persist all its attribute, and code object. Method-wrapper and all its attribute will be persist entirely. If it contains two attribute with same memory location `id(x)` is same, then they can be persist as one, and reference each other. For example, if user pickle recursively dump a numpy ndarray into a pickle file, when user pickle load this file from a system which doesn't install numpy, its method should all works properly like transpose, reshape ... When applying recursively dump, the output pickle file or bytes might be very large. All related memory data will be dumped. As the example of numpy ndarray, when doing recursively dump, large part of numpy library's content will be persist except those really lose or none coupled part. This is actually a cool feature, because pickle is to keep python object, completeness is a great advantage of pickle than json and other data persist library. Sent with ProtonMail Secure Email.
On 27/08/21 11:27 pm, Evan Greenup via Python-ideas wrote:
If it contains function it will persist all its attribute, and code object.
For example, if user pickle recursively dump a numpy ndarray into a pickle file, when user pickle load this file from a system which doesn't install numpy, its method should all works properly like transpose, reshape ...
By design, pickle doesn't save code objects, but even if it did, most of numpy is not written in Python, so it wouldn't help in that case. You're effectively asking for pickling of binary machine code, which would be very difficult to do. Even if you were successful, the resulting pickle would be very closely tied to both a platform and a particular version of Python and all its libraries.
As the example of numpy ndarray, when doing recursively dump, large part of numpy library's content will be persist except those really lose or none coupled part.
Selecting individual machine code routines out of numpy and saving and restoring them individually is probably not feasible, so you would have to just dump the whole lot as one binary blob. At that point it would be a lot easier and more reliable just to tell the user to install numpy. -- Greg
participants (2)
-
Evan Greenup
-
Greg Ewing