Cross-platform pickling of Path objects

Hi all,

I opened http://bugs.python.org/issue27175 a month and a half ago but didn't get any feedback there, so I'll copy-paste it here for comments:

Currently, pickling Path objects leads to issues when working across platforms: Paths (and, thus, objects that contain Paths) created on a POSIX platform (PosixPaths) cannot be unpickled on Windows, and vice versa. There are a few possible ways around this issue.

- Don't do anything about it; you should use PurePaths if you care about cross-platform compatibility. This would be pretty awkward, as any call to the Path API would require converting the PurePath back to a Path first.
- Silently convert Paths to PurePaths during pickling (a solution that seems to have been adopted by http://docs.menpo.org/en/stable/api/menpo/io/export_pickle.html for example). It would be better if Paths at least round-tripped correctly within a single platform.
- Convert Paths to PurePaths at unpickling time, only if the platform is different (and possibly with a warning). This is the least bad solution I have come up with so far. Note that calls to the Path API on a "converted" PurePath object would be invalid anyway, as the PurePath (being of a different platform) cannot represent a valid path.

Thoughts?

Antony
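For illustration only, the third option can be approximated today in user code with a custom unpickler. This is a rough sketch, not what pathlib or pickle do themselves: TolerantUnpickler, _make_path and _ForeignPathStub are hypothetical names, and the stub merely imitates a pickle written on the other platform so that the example runs anywhere. It assumes that instantiating a foreign concrete Path raises NotImplementedError (or a subclass of it).

```python
import functools
import io
import os
import pathlib
import pickle
import warnings

# Map each concrete class name to its pure counterpart.
_PURE = {"PosixPath": pathlib.PurePosixPath, "WindowsPath": pathlib.PureWindowsPath}


def _make_path(name, *args):
    """Build the concrete Path if this platform supports it, else a PurePath."""
    try:
        return getattr(pathlib, name)(*args)
    except NotImplementedError:  # foreign-platform Path cannot be instantiated
        pure = _PURE[name]
        warnings.warn(f"unpickled {name} as {pure.__name__}", RuntimeWarning)
        return pure(*args)


class TolerantUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that downgrades foreign Paths to PurePaths."""

    def find_class(self, module, name):
        # The classes may be recorded under "pathlib" or a pathlib submodule.
        if module.split(".")[0] == "pathlib" and name in _PURE:
            return functools.partial(_make_path, name)
        return super().find_class(module, name)


class _ForeignPathStub:
    """Hypothetical helper: pickles like a Path from the *other* platform."""

    def __reduce__(self):
        cls = pathlib.PosixPath if os.name == "nt" else pathlib.WindowsPath
        return cls, ("data", "raw.bin")


payload = pickle.dumps(_ForeignPathStub())

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    restored = TolerantUnpickler(io.BytesIO(payload)).load()

print(type(restored).__name__, restored.name)
```

Paths pickled on the same platform still come back as concrete Paths with this sketch; only foreign ones are downgraded, and each downgrade emits a RuntimeWarning.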

Sure. I use pickles to store "processed" (scientific) datasets; they typically keep the path to a "raw" dataset as an object attribute (of a Path class, either PosixPath or WindowsPath depending on the "OS" (POSIX/Windows) of the computer where the processing was originally done), as it is sometimes useful to go back and check the raw datasets. However, in general the processed datasets are enough by themselves, so it makes sense to transfer them (independently of the raw datasets) to another computer before further processing.

Obviously, once the processed dataset has been transferred to another computer, trying to access the raw dataset will fail with an OSError (e.g., FileNotFoundError); I have code in place to handle that. However, if the file is transferred from a POSIX "OS" to a Windows "OS" or vice versa, it becomes impossible to even unpickle the file.

Converting Path objects to PurePaths upon unpickling on a different "OS", either silently or possibly with a warning, would allow me to just catch both OSErrors and AttributeErrors when trying to access the raw dataset from the path attribute of the processed one.

Antony

2016-07-15 17:57 GMT-07:00 Ethan Furman <ethan@stoneleaf.us>:
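As a small sketch of the consumer side described in this message (load_raw is a hypothetical helper, not code from the thread), catching both exception types covers a raw file that is missing on this machine as well as a path that came back as a PurePath, which has no I/O methods such as read_bytes:

```python
import pathlib


def load_raw(raw_path):
    """Return the raw bytes, or None if the raw dataset is unreachable."""
    try:
        return raw_path.read_bytes()
    except (OSError, AttributeError):
        # OSError: the file does not exist (or is unreadable) on this machine.
        # AttributeError: raw_path is a PurePath, which cannot touch the disk.
        return None
```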

On 16 July 2016 at 15:51, Antony Lee <antony.lee@berkeley.edu> wrote:
The approach of converting to a PurePath with a RuntimeWarning when unpickling seems reasonable to me - if a particular project wants to silence that warning, they can use the warning machinery to suppress it when loading the pickled data.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
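A minimal sketch of the suppression mentioned here, assuming the loader emits a RuntimeWarning as proposed. load_with_fallback is a hypothetical stand-in for such a loader (today's pickle.loads emits no such warning), and load_quietly shows how a project could silence it only around the load:

```python
import pickle
import warnings


def load_with_fallback(data):
    """Hypothetical stand-in for a loader that warns about converted Paths."""
    warnings.warn("converted a foreign Path to a PurePath", RuntimeWarning)
    return pickle.loads(data)


def load_quietly(data):
    """Load pickled data while ignoring RuntimeWarning for this call only."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)
        return load_with_fallback(data)
```

Because catch_warnings restores the previous filters on exit, the warning remains visible everywhere else in the program.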

participants (3)
- Antony Lee
- Ethan Furman
- Nick Coghlan