[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Paul Moore p.f.moore at gmail.com
Thu Apr 14 06:07:49 EDT 2016


On 14 April 2016 at 08:02, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> So let me propose what I think is the elephant in the room.  If you're
> going to have a polymorphic __fspath__, then pathlib is *the* example
> of a module that *desperately* needs to be polymorphic.  Consider:
>
>     A non-text Application has some bytes and passes them to
>         pathlib.Path as <type A>
>     manipulates them and passes the result to
>         os.scandir as <type B>
>     expecting a return of
>         DirEntries of <type C>
>
> <type A> == <type C> == bytes, and <type B> == Path is TOOWTDI, no?

I'm not sure I follow this logic at all. But from my reading your
argument contradicts your conclusion, so maybe I'm misunderstanding.

To me, the "obvious" conclusion is that pathlib is not appropriate in
non-text applications, because <type A> *cannot* be bytes (the
constructor rejects bytes). I see no reason to change that - non-text
applications are inherently low level, and shouldn't expect to use
high-level abstractions like pathlib.
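As a quick illustration of the point that the constructor rejects bytes, here is a minimal sketch. The helper name constructor_accepts is made up for this example; PurePosixPath is used only so the check behaves the same on every platform.

```python
import pathlib


def constructor_accepts(raw):
    """Return True if the pathlib constructor accepts `raw`, False on TypeError.

    Hypothetical helper, for illustration only.
    """
    try:
        pathlib.PurePosixPath(raw)
    except TypeError:
        return False
    return True


str_ok = constructor_accepts("spam/eggs")     # str input is accepted
bytes_ok = constructor_accepts(b"spam/eggs")  # bytes input is rejected
```

So a bytes-shoveling application has to decode before it can get a Path at all, which is the sense in which <type A> cannot be bytes.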

> But under the current proposal which doesn't touch the internal
> mechanisms of pathlib and allows, but has no way to request, bytes
> returns, <type A> == str, <type B> == Path, and <type C> == str,
> requiring two explicit conversions that bytes-shoveling developers
> will tell you should be unnecessary.  QED, pathlib should be
> polymorphic as a central part of this proposal.

Nope, QED pathlib is not a low level abstraction.

So to me your argument doesn't help much, because it's a given that
pathlib is str-only. The debate is about how things like scandir
(specifically DirEntry objects) and Ethan's pathlib replacement, which
*do* allow bytes in and out, should participate in the new protocol,
when they are bytes (they obviously should work just like pathlib when
they are strings).

In my opinion, they *shouldn't*: the new protocol should be string-only
(at least initially).

If I understand correctly (from a couple of brief mentions), Ethan has a
string-like path object and a bytes-like path object, so he could
support fspath on the string-like one but not the bytes-like one. He
may not like having slightly different APIs for the two types, I don't
know, but it's possible. But DirEntry is polymorphic, so it *will*
have a __fspath__ method, and needs to know what to do when it's
bytes-like (I guess with a bit of getattr hacking DirEntry *could*
expose a __fspath__ method only if it's string-like, but that seems
like a pretty gross hack).
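To make "DirEntry is polymorphic" concrete, the following sketch scans the same temporary directory once with a str path and once with a bytes path, and records the type of each entry's path attribute. The helper name entry_types and the file name example.txt are invented for the example, and the with-statement form of os.scandir assumes Python 3.6 or later.

```python
import os
import tempfile


def entry_types(directory):
    """Return the set of types seen in DirEntry.path for `directory`.

    Hypothetical helper, for illustration only.
    """
    with os.scandir(directory) as entries:
        return {type(entry.path) for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the scan has something to return.
    open(os.path.join(tmp, "example.txt"), "w").close()

    str_types = entry_types(tmp)                 # scanned with a str path
    bytes_types = entry_types(os.fsencode(tmp))  # scanned with a bytes path
```

The entry type follows the argument type, which is why a single __fspath__ method on DirEntry has to decide what to do in the bytes case.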

So:

1. pathlib remains string-like and is the canonical example of
__fspath__, returning strings only
2. DirEntry is the only other example of the protocol in the stdlib,
but is polymorphic
3. I'm not aware of any 3rd party library that has polymorphic classes
(Ethan can correct me if I'm wrong here)

So the only purpose I know of for discussing __fspath__ returning
bytes is for scandir, and hypothetical polymorphic 3rd party path
abstractions (and possibly Ethan's preference to have a common API for
his 2 classes).

I propose we should have a string-only __fspath__ protocol in 3.6.
Bytes-format DirEntry objects can raise an error in __fspath__. If it
becomes obvious with usage that we need bytes support in __fspath__ we
can add it (compatibly - string-only code wouldn't need to change) in
3.7. That seems far better to me than trying to design bytes support
without actual use cases.
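A rough sketch of what that proposal could look like follows. Everything in it is hypothetical: Entry stands in for a polymorphic DirEntry-like class, fspath_str_only stands in for a string-only lookup function, and TypeError is just one plausible choice for "raise an error". It illustrates the idea and does not describe any existing API.

```python
class Entry:
    """Hypothetical stand-in for a polymorphic, DirEntry-like object."""

    def __init__(self, path):
        self.path = path  # may be str or bytes

    def __fspath__(self):
        # Bytes-format entries opt out of the string-only protocol.
        if isinstance(self.path, bytes):
            raise TypeError("bytes-format entry does not support __fspath__")
        return self.path


def fspath_str_only(obj):
    """Hypothetical string-only lookup: return a str or raise TypeError."""
    if isinstance(obj, str):
        return obj
    try:
        method = type(obj).__fspath__
    except AttributeError:
        raise TypeError(
            "expected str or an object with __fspath__, got "
            + type(obj).__name__
        ) from None
    result = method(obj)
    if not isinstance(result, str):
        raise TypeError("__fspath__ must return str in this sketch")
    return result


text_path = fspath_str_only(Entry("spam/eggs"))  # the str case passes through
```

Code that only ever handles strings would be unaffected if bytes returns were permitted later, which is the compatibility argument made above.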

Paul

