[Python-Dev] file system path protocol PEP

Sven R. Kunze srkunze at mail.de
Thu May 12 04:31:18 EDT 2016


Thanks Brett for your hard work. My comments below:

On 11.05.2016 18:43, Brett Cannon wrote:
> Rationale
> =========
>
> Historically in Python, file system paths have been represented as
> strings or bytes. This choice of representation has stemmed from C's
> own decision to represent file system paths as
> ``const char *`` [#libc-open]_. While that is a totally serviceable
> format to use for file system paths, it's not necessarily optimal. At
> issue is the fact that while all file system paths can be represented
> as strings or bytes, not all strings or bytes represent a file system
> path.

I can remember this argument being made during the discussion. I am not 
sure that it is 100% correct once we talk about PurePaths.

> This can lead to issues where any e.g. string duck-types to a
> file system path whether it actually represents a path or not.
>
> To help elevate the representation of file system paths from their
> representation as strings and bytes to a more appropriate object
> representation, the pathlib module [#pathlib]_ was provisionally
> introduced in Python 3.4 through PEP 428. While considered by some as
> an improvement over strings and bytes for file system paths, it has
> suffered from a lack of adoption. Typically the key issue listed
> for the low adoption rate has been the lack of support in the standard
> library. This lack of support required users of pathlib to manually
> convert path objects to strings by calling ``str(path)`` which many
> found error-prone.
>
> One issue in converting path objects to strings comes from
> the fact that the only generic way to get a string representation of the
> path was to pass the object to ``str()``. This can pose a
> problem when done blindly as nearly all Python objects have some
> string representation whether they are a path or not, e.g.
> ``str(None)`` will give a result that
> ``builtins.open()`` [#builtins-open]_ will happily use to create a new
> file.
>
> Exacerbating this whole situation is the
> ``DirEntry`` object [#os-direntry]_. While path objects have a
> representation that can be extracted using ``str()``, ``DirEntry``
> objects expose a ``path`` attribute instead. Having no common
> interface between path objects, ``DirEntry``, and any other
> third-party path library had become an issue. A solution that allowed
> any path-representing object to declare that it was a path and a way
> to extract a low-level representation that all path objects could
> support was desired.
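
The str() pitfall described in the quoted rationale can be illustrated 
with a minimal sketch. The helper name naive_path_string is hypothetical 
and only stands in for code that blindly coerces its argument with str(); 
a temporary directory is used so nothing is written elsewhere.

```python
import os
import tempfile


def naive_path_string(obj):
    # Hypothetical helper: blindly coerce anything to a "path" string.
    return str(obj)


with tempfile.TemporaryDirectory() as tmp:
    # None is clearly not a path, but str(None) is the string "None" ...
    target = os.path.join(tmp, naive_path_string(None))
    # ... and open() happily creates a file with that name.
    with open(target, "w") as f:
        f.write("oops")
    created = os.listdir(tmp)

print(created)
```

No error is raised at any point, which is the duck-typing problem the 
rationale refers to.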

I think the "Rationale" section ignores the fact that Path now also 
supports the .path attribute, which indeed defines a common interface 
between path objects.

>
> [...]
>
> Proposal
> ========
>
> This proposal is split into two parts. One part is the proposal of a
> protocol for objects to declare and provide support for exposing a
> file system path representation.

https://docs.python.org/3/whatsnew/changelog.html says:

"Add 'path' attribute to pathlib.Path objects, returning the same as 
str(), to make it more similar to DirEntry. Library code can now write 
getattr(p, 'path', p) to get the path as a string from a Path, a 
DirEntry, or a plain string. This is essentially a small one-off protocol."
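
As a rough sketch of the getattr idiom from that changelog entry: the 
class FakeEntry and the function as_path_string below are hypothetical 
stand-ins, used so the example does not depend on whether a given Python 
version actually ships a .path attribute on pathlib.Path.

```python
class FakeEntry:
    """Hypothetical stand-in for an object exposing a ``path`` attribute."""

    def __init__(self, path):
        self.path = path


def as_path_string(p):
    # The one-off protocol: use p.path if it exists, otherwise p itself.
    return getattr(p, "path", p)


print(as_path_string(FakeEntry("spam/eggs.txt")))  # attribute is used
print(as_path_string("spam/eggs.txt"))             # plain string passes through
print(as_path_string(42))                          # non-paths pass through too
```

Note that the fallback returns any object lacking .path unchanged, so the 
idiom by itself does not reject things that are not paths.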

So, in order to promote the "small one-off protocol" to a broader 
protocol, this PEP proposes a simple rename of .path to .__fspath__, is 
that correct?


The only issue I see with it is that it requires another function 
(os.fspath) to extract the "low-level representation". .path seems far 
easier to me.
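
For comparison, here is a minimal sketch of the function-based approach. 
It assumes Python 3.6 or later, where os.fspath is available; the class 
MyPath is a hypothetical third-party path object, not something from the 
PEP.

```python
import os


class MyPath:
    """Hypothetical third-party path object implementing __fspath__."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        # Return the low-level (str) representation of the path.
        return self._raw


print(os.fspath(MyPath("spam/eggs.txt")))  # protocol method is called
print(os.fspath("spam/eggs.txt"))          # str is returned unchanged

try:
    os.fspath(None)  # not a str, bytes, or __fspath__ provider
    rejected = False
except TypeError:
    rejected = True

print(rejected)
```

The extra function call is the cost; what it buys is that objects which 
do not declare themselves as paths are refused rather than silently 
passed through.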

> The other part is changes to Python's
> standard library to support the new protocol.

I think this could be another PEP unrelated to the first part.

> These changes will also have the pathlib module drop its provisional 
> status.

Not sure if that should be part of the PEP, maybe yes.

> [...]

The remainder of the PEP unfolds as a flawless implication of the 
rationale and the proposed idea.

Unfortunately, I don't have anything to contribute to the open issues. 
All solutions have their pros and cons and everything that could be said 
has been said. I think you need to decide.

Sven


