Cross-platform pickling of Path objects
Hi all,

I opened http://bugs.python.org/issue27175 a month and a half ago but didn't get any feedback there, so I'll copy-paste it here for comments:

Currently, pickling Path objects leads to issues when working across platforms: Paths (and, thus, objects that contain Paths) created on a POSIX platform (PosixPaths) cannot be unpickled on Windows and vice versa. There are a few possibilities around this issue.

- Don't do anything about it; you should use PurePaths if you care about cross-platform compatibility. This would be pretty awkward, as any call to the Path API would require converting the PurePath back to a Path first.
- Silently convert Paths to PurePaths during pickling (a solution that seems to have been adopted by http://docs.menpo.org/en/stable/api/menpo/io/export_pickle.html for example). It would be better if Paths at least round-tripped correctly within a single platform.
- Convert Paths to PurePaths at unpickling time, only if the platform is different (and possibly with a warning). This is the least bad solution I have come up with so far. Note that calls to the Path API on a "converted" PurePath object would be invalid anyway, as the PurePath (being of a different platform) cannot represent a valid path.

Thoughts?

Antony
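The third option can be approximated today in user code. The following is only a rough sketch, not what pathlib or the tracker issue implements: `ForeignPathStandIn` and `PortablePathUnpickler` are hypothetical names, and the stand-in merely imitates a pickle written on the other platform so that the example runs on a single machine.

```python
import io
import os
import pathlib
import pickle
import warnings

# The concrete class that cannot be instantiated on this platform, and the
# pure class that would replace it.
if os.name == "nt":
    FOREIGN_NAME, PURE_CLS = "PosixPath", pathlib.PurePosixPath
else:
    FOREIGN_NAME, PURE_CLS = "WindowsPath", pathlib.PureWindowsPath


class ForeignPathStandIn:
    """Hypothetical helper: pickles as a call to the foreign concrete class,
    imitating a pickle that was written on the other platform."""

    def __init__(self, *parts):
        self.parts = parts

    def __reduce__(self):
        return (getattr(pathlib, FOREIGN_NAME), self.parts)


class PortablePathUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that downgrades foreign Paths to PurePaths."""

    def find_class(self, module, name):
        if module.split(".")[0] == "pathlib" and name == FOREIGN_NAME:
            warnings.warn(
                f"unpickling {name} as {PURE_CLS.__name__}", RuntimeWarning
            )
            return PURE_CLS
        return super().find_class(module, name)


data = pickle.dumps(ForeignPathStandIn("data", "raw.bin"))

# The stock unpickler tries to instantiate the foreign class and fails.
try:
    pickle.loads(data)
    stock_error = None
except NotImplementedError as exc:
    stock_error = exc

# The custom unpickler returns a PurePath and emits a RuntimeWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    path = PortablePathUnpickler(io.BytesIO(data)).load()

print(type(path).__name__, [w.category.__name__ for w in caught])
```

Overriding `find_class` keeps the conversion local to one unpickler, so pickles that contain only native Paths still round-trip to real Path objects.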
On 07/15/2016 05:51 PM, Antony Lee wrote:
> [...]
Any use-case examples? That would help in deciding on the best path forward.

-- ~Ethan~
Sure. I use pickles to store "processed" (scientific) datasets; they
typically keep the path to a "raw" dataset as an object attribute (of a
Path class, either PosixPath or WindowsPath depending on the "OS"
(POSIX/Windows) of the computer where the processing is originally done),
as it is sometimes useful to go back and check the raw datasets. However,
in general the processed datasets are enough by themselves, so it makes
sense to transfer them (independently of the raw datasets) to another
computer before further processing.
Obviously, once the processed dataset has been transferred to another
computer, trying to access the raw dataset will fail with an OSError (e.g.,
FileNotFoundError); I have code in place to handle that. However, if the
file is transferred from a POSIX "OS" to a Windows "OS" or vice-versa, it
becomes impossible to even unpickle the file.
Converting Path objects to PurePaths upon unpickling on a different "OS",
either silently or possibly with a warning, would allow me to just catch
both OSErrors and AttributeErrors when trying to access the raw dataset
from the path attribute of the processed one.
Antony
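As a rough illustration of the use case described above (a sketch only; `read_raw_header` is a hypothetical helper, not code from the thread): a PurePath has no `open()` method, so accessing the raw dataset through a converted path raises AttributeError, while a native Path pointing at a missing file raises an OSError. Both can be handled in one place.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath


def read_raw_header(raw_path, nbytes=16):
    """Return the first bytes of the raw dataset, or None if unavailable.

    raw_path may be a real Path (native pickle) or a PurePath (a path that
    was converted because the pickle came from another platform).
    """
    try:
        with raw_path.open("rb") as f:
            return f.read(nbytes)
    except (OSError, AttributeError):
        # OSError: the raw file is not present on this machine.
        # AttributeError: PurePath objects have no filesystem methods.
        return None


# A converted path from the other platform: no open() method.
print(read_raw_header(PureWindowsPath(r"C:\data\raw.bin")))  # None
# A native path whose file does not exist here.
print(read_raw_header(Path("no_such_directory") / "raw.bin"))  # None
```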
2016-07-15 17:57 GMT-07:00 Ethan Furman:
> [...]
On 16 July 2016 at 15:51, Antony Lee wrote:
> [...]
The approach of converting to a PurePath with a RuntimeWarning when unpickling seems reasonable to me - if a particular project wants to silence that warning, they can use the warning machinery to suppress it when loading the pickled data.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
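For reference, suppressing such a warning with the standard `warnings` module might look like the sketch below. The function `load_dataset` is a hypothetical stand-in for an unpickling call that emits the proposed RuntimeWarning; no such warning exists in pathlib at the time of this thread.

```python
import warnings


def load_dataset():
    """Hypothetical stand-in for an unpickling call that converts a Path
    from another platform and warns about it."""
    warnings.warn(
        "Path from another platform unpickled as PurePath", RuntimeWarning
    )
    return "dataset"


# The filter change is scoped to the with-block, so RuntimeWarnings raised
# elsewhere in the program are still reported as usual.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    dataset = load_dataset()
```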
participants (3)
- Antony Lee
- Ethan Furman
- Nick Coghlan