Thanks Brett for your hard work. My comments below: On 11.05.2016 18:43, Brett Cannon wrote:
Rationale =========
Historically in Python, file system paths have been represented as strings or bytes. This choice of representation has stemmed from C's own decision to represent file system paths as ``const char *`` [#libc-open]_. While that is a totally serviceable format to use for file system paths, it's not necessarily optimal. At issue is the fact that while all file system paths can be represented as strings or bytes, not all strings or bytes represent a file system path.
I can remember this argument being made during the discussion. I am not sure if that 100% correct as soon as we talk about PurePaths.
This can lead to issues where any e.g. string duck-types to a file system path whether it actually represents a path or not.
To help elevate the representation of file system paths from their representation as strings and bytes to a more appropriate object representation, the pathlib module [#pathlib]_ was provisionally introduced in Python 3.4 through PEP 428. While considered by some as an improvement over strings and bytes for file system paths, it has suffered from a lack of adoption. Typically the key issue listed for the low adoption rate has been the lack of support in the standard library. This lack of support required users of pathlib to manually convert path objects to strings by calling ``str(path)`` which many found error-prone.
One issue in converting path objects to strings comes from the fact that only generic way to get a string representation of the path was to pass the object to ``str()``. This can pose a problem when done blindly as nearly all Python objects have some string representation whether they are a path or not, e.g. ``str(None)`` will give a result that ``builtins.open()`` [#builtins-open]_ will happily use to create a new file.
Exacerbating this whole situation is the ``DirEntry`` object [#os-direntry]_. While path objects have a representation that can be extracted using ``str()``, ``DirEntry`` objects expose a ``path`` attribute instead. Having no common interface between path objects, ``DirEntry``, and any other third-party path library had become an issue. A solution that allowed any path-representing object to declare that is was a path and a way to extract a low-level representation that all path objects could support was desired.
I think the "Rationale" section ignores the fact the Path also supports the .path attribute now. Which indeed defines a common interface between path objects.
[...]
Proposal ========
This proposal is split into two parts. One part is the proposal of a protocol for objects to declare and provide support for exposing a file system path representation.
https://docs.python.org/3/whatsnew/changelog.html says: "Add ‘path’ attribute to pathlib.Path objects, returning the same as str(), to make it more similar to DirEntry. Library code can now write getattr(p, ‘path’, p) to get the path as a string from a Path, a DirEntry, or a plain string. This is essentially a small one-off protocol." So, in order to promote the "small one-off protocol" to a more broader protocol, this PEP proposes a simple rename of .path to .__fspath__, is that correct? The only issue I see with it is that it requires another function (os.fspath) to extract the "low-level representation". .path seems far easier to me.
The other part is changes to Python's standard library to support the new protocol.
I think this could be another PEP unrelated to the first part.
These changes will also have the pathlib module drop its provisional status.
Not sure if that should be part of the PEP, maybe yes.
[...]
The remainder of the PEP unfolds as a flawless implication of the rationale and the proposed idea. Unfortunately, I don't have anything to contribute to the open issues. All solutions have their pros and cons and everything that could be said has been said. I think you need to decide. Sven