On Fri, 13 May 2016 at 04:00 Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 12, 2016 at 08:53:12PM +0000, Brett Cannon wrote:

> Second draft that takes Guido's comments into consideration. The biggest
> change is os.fspath() now returns whatever path.__fspath__() returns
> instead of restricting it to only str.

Counter suggestion:

- __fspath__() method may return either bytes or str (no change
  from the PEP as it stands now);

- but os.fspath() will only return str;

- and os.fspathb() will only return bytes;

- there is no os function that returns "str or bytes, I don't
  care which". (If you really need that, call __fspath__ directly.)

Note that this differs from the already rejected suggestion that there
should be two dunder methods, __fspath__() and __fspathb__().
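
A rough sketch of how the two functions in this counter-suggestion might look. It assumes a result of the wrong type raises TypeError; the proposal does not say whether it should raise or convert (e.g. via os.fsdecode()/os.fsencode()), so that choice is a guess. The helper _raw_path and the demo classes TextPath and RawPath are hypothetical names used only for illustration.

```python
def _raw_path(path):
    """Return str/bytes unchanged, else whatever path.__fspath__() returns."""
    if isinstance(path, (str, bytes)):
        return path
    if not hasattr(path, "__fspath__"):
        raise TypeError(
            f"expected str, bytes or a path-like object, got {type(path).__name__}"
        )
    return path.__fspath__()


def fspath(path):
    """str-only variant. ASSUMPTION: a bytes result raises TypeError."""
    result = _raw_path(path)
    if not isinstance(result, str):
        raise TypeError(f"expected a str path, got {type(result).__name__}")
    return result


def fspathb(path):
    """bytes-only variant. ASSUMPTION: a str result raises TypeError."""
    result = _raw_path(path)
    if not isinstance(result, bytes):
        raise TypeError(f"expected a bytes path, got {type(result).__name__}")
    return result


class TextPath:
    """Hypothetical path-like object whose __fspath__() returns str."""

    def __init__(self, text):
        self._text = text

    def __fspath__(self):
        return self._text


class RawPath:
    """Hypothetical path-like object whose __fspath__() returns bytes."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


print(fspath(TextPath("spam.txt")))
print(fspathb(RawPath(b"spam.txt")))
```

Under this reading, fspath(RawPath(b"spam.txt")) fails loudly rather than handing bytes to a caller that expected str.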

Why?

(1) Normally, the caller knows whether they want str or bytes. (That's
been my experience, you may disagree.) If so, and they call os.fspath()
expecting a str, they won't be surprised by it returning bytes. And vice
versa for when you expect a bytes path.

(2) This behaviour will match that of os.{environ[b],getcwd[b],getenv[b]}.
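
For reference, the existing str/bytes pairs in os behave as shown below. The bytes environment functions exist only where os.supports_bytes_environ is true (typically not on Windows), so the sketch guards for that. The variable name FSPATH_DEMO is made up for the example.

```python
import os

# Each pair has one function that always returns str and one that always
# returns bytes; the caller picks the type by picking the function.
cwd_text = os.getcwd()
cwd_raw = os.getcwdb()
print(type(cwd_text).__name__, type(cwd_raw).__name__)

# FSPATH_DEMO is a hypothetical variable set only for this demonstration.
os.environ["FSPATH_DEMO"] = "value"
env_text = os.getenv("FSPATH_DEMO")
print(type(env_text).__name__)

# os.environb and os.getenvb() are defined only on platforms where the
# native environment is bytes.
if os.supports_bytes_environ:
    env_raw = os.getenvb(b"FSPATH_DEMO")
    print(type(env_raw).__name__)
```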

Cons:

(3) Polymorphic code that truly doesn't care whether it gets bytes or
str will have a slightly less convenient way of getting it, namely by
calling __fspath__() itself, instead of os.fspath().

I prefer what's in the PEP. I get where you're coming from, Steven, but I don't think it will be common enough to worry about. Think of os.fspath() like next(): a very minor convenience function that happens to special-case str and bytes.
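
To make the polymorphic case concrete, here is a small sketch using os.fspath() as it eventually shipped in Python 3.6, where it returns str or bytes depending on the input. The helper name extension is hypothetical.

```python
import os
import pathlib


def extension(path):
    """Return the file extension in the same type (str or bytes) as the path.

    os.fspath() normalises str, bytes and path-like objects; os.path.splitext()
    then works on either str or bytes without the caller caring which it is.
    """
    return os.path.splitext(os.fspath(path))[1]


print(extension("spam.txt"))
print(extension(b"spam.txt"))
print(extension(pathlib.PurePosixPath("spam.txt")))
```

Code like this never needs to choose between a str-only and a bytes-only function, which is the case the single polymorphic os.fspath() serves.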
 
A few other comments below:


> builtins
> ''''''''
>
> ``open()`` [#builtins-open]_ will be updated to accept path objects as
> well as continue to accept ``str`` and ``bytes``.

I think it is a bit confusing to refer to "path objects", as that seems
like you are referring only to pathlib.Path objects. It took me far too
long to realise that here you mean generic path-like objects that obey
the __fspath__ protocol rather than a specific concrete class.

Since the ABC is called "PathLike", I suggest we refer to "path-like
objects" rather than "path objects", both in the PEP and in the Python
docs for this protocol.

I went back and forth on this in my head while writing the PEP. If "path-like" means "objects implementing the PathLike ABC", the problem becomes how to refer to an argument of a function that accepts anything os.fspath() does (i.e. PathLike, str, and bytes).
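
The naming gap is easy to see with os.PathLike as it exists in Python 3.6 and later: str and bytes are accepted by os.fspath() but are not instances of the ABC. The alias name AnyPath and the class Custom are hypothetical, used only to illustrate the two categories.

```python
import os
import pathlib
from typing import Union

# Hypothetical alias for "anything os.fspath() accepts".
AnyPath = Union[os.PathLike, str, bytes]


class Custom:
    """Hypothetical class that implements the protocol without subclassing."""

    def __init__(self, location):
        self._location = location

    def __fspath__(self):
        return self._location


candidates = [
    "spam.txt",
    b"spam.txt",
    pathlib.PurePosixPath("spam.txt"),
    Custom("spam.txt"),
]

for candidate in candidates:
    # Every candidate is valid input to os.fspath(), but only the last two
    # are instances of the PathLike ABC.
    print(
        type(candidate).__name__,
        isinstance(candidate, os.PathLike),
        os.fspath(candidate),
    )
```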
 
>     def fspath(path: t.Union[PathLike, str, bytes]) -> t.Union[str, bytes]:
>         """Return the string representation of the path.
>
>         If str or bytes is passed in, it is returned unchanged.
>         """

I've already suggested a change to this, above, but independent of that,
a minor technical query:

>         try:
>             return path.__fspath__()

Would I be right in saying that in practice this will actually end up
being type(path).__fspath__() to match the behaviour of all(?) other
dunder methods?

I wasn't planning on it. For most magic methods, the method is looked up directly on the type because we're pulling it from some special struct field at the C level. Since we're not planning to have an equivalent struct field, I don't see any need to do the extra work of keeping the instance from participating in method lookup. Obviously, if people disagree for some reason then please let me know (maybe for performance, by avoiding the overhead of checking for the method on the instance?).
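
The difference between the two lookups only shows up when an instance shadows the method. A minimal sketch, with hypothetical names Location, fspath_via_instance and fspath_via_type, and len() included as an example of a builtin that already looks on the type:

```python
class Location:
    """Hypothetical path-like class used to compare the two lookups."""

    def __init__(self, text):
        self.text = text

    def __fspath__(self):
        return self.text

    def __len__(self):
        return len(self.text)


def fspath_via_instance(path):
    # Lookup as written in the draft: the instance takes part.
    return path.__fspath__()


def fspath_via_type(path):
    # Lookup on the type only, as other dunder methods are resolved.
    return type(path).__fspath__(path)


loc = Location("/tmp/spam")
# Shadow both methods with instance attributes.
loc.__fspath__ = lambda: "/instance/override"
loc.__len__ = lambda: 0

print(fspath_via_instance(loc))  # sees the instance attribute
print(fspath_via_type(loc))      # ignores the instance attribute
print(len(loc))                  # len() also ignores the instance attribute
```

With the instance-based lookup a per-object override changes the result; with the type-based lookup it is ignored, which is what len() already does for __len__.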