[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Eric Snow ericsnowcurrently at gmail.com
Tue Apr 19 18:22:46 EDT 2016


On Tue, Apr 19, 2016 at 10:50 AM, Brett Cannon <brett at python.org> wrote:
> Ah, but you see that doesn't make porting easy. If I have a bunch of
> path-manipulating code using os.path already and I want to add support for
> pathlib I can either (a) rewrite all of that path-manipulating code to work
> using pathlib, or (b) simply call `path = os.fspath(path)` and be done with
> it. Basically if you have written any code that uses os.path then you will
> have to care about (a) or (b) as a way to add support for pathlib short of
> the `str(path)` hack we're all working to get away from. And if people truly
> liked option (a) then this conversation wouldn't be such a big deal as we
> would have seen more people using pathlib already (yes, the provisional tag
> may have scared some off, but my guess is it's more from not wanting to
> rewrite os.path-using code).
>
> Now if you can convince me that the use of bytes paths is very minimal and
> thus people doing path manipulations with them will be a very small minority
> then I'm happy to try and use this to keep pushing people towards avoiding
> bytes for file paths. But over the years people such as yourself, Stephen,
> have convinced me that people do some really crazy stuff with their file
> systems and that it isn't isolated to just one or two people. And so it
> becomes this situation where we need to ask ourselves if we are going to
> tell them to just deal with it or help them transition.
>
> The other way to convince me is that people needing to support older
> versions of Python will use `path = path.__fspath__() if hasattr(path,
> '__fspath__') else path` and that allowing bytes with that idiom is going to
> cost them dearly. My current assumption is that it won't because people
> using that idiom are using os.path and those functions will complain when
> mixing str and bytes together, but I'm open to being convinced otherwise.
>
> I guess what I'm trying to get at is that I understand the desire to get
> people to drop the bytes path habit, but to me the best way will be to get
> people quickly and easily transitioned over to pathlib as a carrot rather
> than using the lack of bytes path support in this transition as a stick.

Perhaps I missed previous discussion on the point, but why not support
both __fspath__() -> str and __fssyspath__() -> bytes?  Returning
NotImplemented would indicate "try the other one".  For example,
DirEntry.__fspath__() would return NotImplemented when the underlying
value is bytes and vice-versa.
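
As a rough sketch of what that could look like for a DirEntry-like object (the class name FakeDirEntry and its constructor are made up for illustration, and __fssyspath__ is only the name proposed here; this is not how os.DirEntry is actually implemented):

```python
class FakeDirEntry:
    """Hypothetical stand-in for os.DirEntry; not the real implementation."""

    def __init__(self, path):
        # path is either str or bytes, mirroring what scandir() was handed
        self._path = path

    def __fspath__(self):
        # str protocol: decline when the underlying value is bytes
        if isinstance(self._path, str):
            return self._path
        return NotImplemented

    def __fssyspath__(self):
        # bytes protocol (proposed name): decline when the underlying value is str
        if isinstance(self._path, bytes):
            return self._path
        return NotImplemented
```

Exactly one of the two methods gives a real answer for any given entry, so a caller can tell from the dunder that responded whether it is holding a str or a bytes path.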

A str-specific os.fspath would look something like this:

    def fspath(path):
        try:
            fspath = type(path).__fspath__
        except AttributeError:
            pass
        else:
            rendered = fspath(path)
            if rendered is not NotImplemented:
                return rendered
        raise TypeError

...and a more lenient, polymorphic version (for use by os.path.*,
etc.) would look like this:

    def _fspath(path):
        try:
            fspath = type(path).__fspath__
        except AttributeError:
            pass
        else:
            rendered = fspath(path)
            if rendered is not NotImplemented:
                return rendered

        # otherwise fall back to the bytes-returning protocol
        try:
            fspath = type(path).__fssyspath__
        except AttributeError:
            pass
        else:
            rendered = fspath(path)
            if rendered is not NotImplemented:
                return rendered

        # nothing to do
        return path

The hard distinction between the two dunder methods preserves the
conceptual str/bytes division we're aiming for.  It will be much
easier to identify which path implementations are dealing with (or
supporting) bytes paths.  Likewise with the two helpers and their
usage.

-eric

