[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Stephen J. Turnbull stephen at xemacs.org
Tue Apr 19 23:16:29 EDT 2016

Brett Cannon writes:

 > Now if you can convince me that the use of bytes paths is very
 > minimal

I doubt that I can do that, because all that Python 2 code is
effectively bytes.  To the extent that people are just passing it into
their bytes-domain code and it works for them, they probably "port" to
Python 3 by using bytes for paths.  I just don't think bytes usage per
se matters to the issue of polymorphism of __fspath__.

 > Ah, but you see that doesn't make porting easy. If I have a bunch
 > of path-manipulating code using os.path already and I want to add
 > support for pathlib I can either (a) rewrite all of that
 > path-manipulating code to work using pathlib, or (b) simply call
 > `path = os.fspath(path)` and be done with it.

OK, so what matters here is not "how many people are using bytes".
They can keep using os.path, which is what they probably have already
been using.  What we are worrying about is that

(1) some really attractive producer of pathlib.Paths will be
    published, and

(2) people will want to plug that producer into their bytes paths
    consumers using os.fspath(path) "and be done with it".

Excuse me, but that doesn't make sense as written.  Path.__fspath__
will return str, in any case.  So these developers have to consume
text to use pathlib, even merely as a consumer of Paths.  No need for
polymorphism here, simply because it won't be used in this instance.
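
To make that concrete, here is a minimal sketch, assuming os.fspath()
and Path.__fspath__ behave as proposed in this thread (a Path always
yields str).  The path "spam/eggs.txt" is made up for illustration.  A
bytes consumer gets text first and has to encode it explicitly:

```python
import os
import pathlib

# A pure path, so the example never touches the real filesystem.
p = pathlib.PurePosixPath("spam") / "eggs.txt"

# Path.__fspath__ returns str; there is no bytes variant to ask for.
text = os.fspath(p)
print(type(text).__name__, text)

# A bytes-domain consumer must encode explicitly after the fact.
raw = os.fsencode(text)
print(type(raw).__name__, raw)
```

The encode step is the text/bytes boundary that the developer has to
cross either way, with or without a polymorphic __fspath__.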

What's left is DirEntry (and perhaps other producers of byte-oriented
objects in os and os.path).  If they're currently using DirEntry,
they're currently accessing .path.  Surely bytes users can continue
doing that, even if we offer str users the advantage of new protocols?
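
For reference, a small sketch of what bytes users of os.scandir get
today: the type of DirEntry.path follows the type of the argument
passed to scandir.  The temporary directory and the file name "a.txt"
are just scaffolding for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so that scandir has something to report.
    open(os.path.join(tmp, "a.txt"), "w").close()

    # str in, str out.
    with os.scandir(tmp) as entries:
        str_paths = [entry.path for entry in entries]

    # bytes in, bytes out: .path already serves the bytes domain.
    with os.scandir(os.fsencode(tmp)) as entries:
        bytes_paths = [entry.path for entry in entries]

print(str_paths)
print(bytes_paths)
```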

I conclude that there is no real use in having a polymorphic
__fspath__ unless callers of os.fspath can communicate desired return
type to it, and it implicitly coerces to that type.  But then open and
friends *implicitly* consume __fspath__.  So there probably needs to
be a way to communicate the desired type to them in the case where
they receive an __fspath__-bearing object so they can tell os.fspath
what their callers want, no?
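
A rough sketch of what "communicating the desired return type" could
look like, to show the extra plumbing involved.  Note that fspath_as
and its "want" parameter are hypothetical names invented for this
example; nothing like them has been proposed for os.  It assumes an
os.fspath() that returns whatever str or bytes the object provides.

```python
import os
import pathlib


def fspath_as(path, want):
    """Hypothetical helper: return `path` coerced to `want` (str or bytes)."""
    raw = os.fspath(path)  # str or bytes; TypeError for anything else
    if want is str:
        return os.fsdecode(raw)  # bytes -> str using the filesystem encoding
    if want is bytes:
        return os.fsencode(raw)  # str -> bytes using the filesystem encoding
    raise TypeError("want must be str or bytes, not %r" % (want,))


p = pathlib.PurePosixPath("spam/eggs.txt")
as_bytes = fspath_as(p, bytes)
as_text = fspath_as(b"spam/eggs.txt", str)
print(as_bytes, as_text)
```

Every implicit consumer (open and friends) would need an equivalent of
the "want" argument threaded through it, which is the complication I
am pointing at.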

Supporting both "pipeline polymorphism" of this kind and implicit
conversion protocols at the same time is quite complicated, I think.

 > [Folks] have convinced me that people do some really crazy stuff
 > with their file systems and that it isn't isolated to just one or
 > two people.  And so it becomes this situation where we need to ask
 > ourselves if we are going to tell them to just deal with it or help
 > them transition.

People who have to deal with really crazy stuff in filesystems are
already manipulating paths as text.  We are not the ones who need help
with the transition that matters (bytes to text).  We can use os.path
or pathlib, but bytes just don't matter to us because we're not using
them in path manipulations.

It's people who live in monolingual mono-encoding environments who
will be using bytes successfully, and be resistant to costly changes
that don't make their lives better.  But the bytes vs. text cost is
inherent in using pathlib, so polymorphism doesn't help promote
pathlib.  It might help promote use of os.scandir in bytes-oriented
code, though I don't see that as a huge effect nor more than mildly
desirable.  Is it?

