[Python-Dev] Defining a path protocol

Wed Apr 6 18:22:50 EDT 2016

On 6 April 2016 at 20:39, Brett Cannon <brett at python.org> wrote:
>> I'm a little confused by this. To support the older pathlib, they have
>> to do patharg = str(patharg), because *none* of the proposed
>> attributes (path or __path__) will exist.
>>
>> The getattr trick is needed to support the *new* pathlib, when you
>> need a real string. Currently you need a string if you call stdlib
>> functions or builtins. If we fix the stdlib/builtins, the need goes
>> away for those cases, but remains if you need to call libraries that
>> *don't* support pathlib (os.path will likely be one of those) or do
>> direct string manipulation.
>>
>> In practice, I see the getattr trick as an "easy fix" for libraries
>> that want to add support but in a minimally-intrusive way. On that
>> basis, making the trick easy to use is important, which argues for an
>> attribute.
>
> So then where's the confusion? :) You seem to get the points. I personally
> find `path.__path__() if hasattr(path, '__path__') else path` also readable
> (if obviously a bit longer).

The confusion is that you seem to be saying that people can use
getattr(path, '__path__', path) to support older versions of Python.
But the older versions are precisely the ones that don't have __path__
so you won't be supporting them.
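
A rough sketch of the problem, for illustration only. Both classes are hypothetical stand-ins, not real pathlib code: NewStylePath assumes the proposed protocol is exposed as a `__path__` attribute, while OldStylePath only supports str(), as pathlib objects in older versions do.

```python
class NewStylePath:
    """Hypothetical path object exposing the proposed __path__ attribute."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def __path__(self):
        return self._raw

    def __str__(self):
        return self._raw


class OldStylePath:
    """Hypothetical stand-in for an older path object: str() only."""

    def __init__(self, raw):
        self._raw = raw

    def __str__(self):
        return self._raw


def as_string(patharg):
    # The "getattr trick": fall back to the argument itself.
    return getattr(patharg, "__path__", patharg)


new_result = as_string(NewStylePath("/tmp/x"))  # a real string
str_result = as_string("/tmp/x")                # strings pass through
old = OldStylePath("/tmp/x")
old_result = as_string(old)                     # still the object, not a str
```

With the old-style object the fallback hands back the object unchanged, so the caller still needs str(patharg) to get a string.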

>> >> >  3. Built-in? (name is dependent on #1 if we add one)
>> >>
>> >> fspath() -- and it would be handy to have a function that returns either
>> >> the __fspath__ results, or the string (if it was one), or raises an
>> >> exception if neither of the above works out.
>>
>> fspath regardless of the name chosen in #1 - a new builtin called path
>> just has too much likelihood of clashing with user code.
>>
>> But I'm not sure we need a builtin. I'm not at all clear how
>> frequently we expect user code to need to use this protocol. Users
>> can't use the builtin if they want to be backward compatible, but code
>> that doesn't need backward compatibility can probably just work with
>> pathlib (and the stdlib support for it) directly. For display, the
>> implicit conversion to str is fine. For "get me a string representing
>> the path", is the "path" attribute being abandoned in favour of this
>> special method?
>
> Yes.

OK. So the idiom to get a string from a known Path object would be any of:

1. str(path)
2. fspath(path)
3. path.__path__()

(1) is safe if you know you have a Path object, but could incorrectly
convert non-Path objects. (2) is safe in all cases. (3) is ugly. Did I
miss any options?

So I think we need a builtin.
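
The difference between (1) and (2) can be sketched as follows. This is only an assumption about how such a builtin might behave, based on the description quoted above (return the protocol result, or the string itself, or raise). DemoPath is a hypothetical class, and `__path__` is treated as a method here, as in option (3).

```python
class DemoPath:
    """Hypothetical path object implementing the proposed protocol."""

    def __init__(self, raw):
        self._raw = raw

    def __path__(self):
        return self._raw

    def __str__(self):
        return self._raw


def fspath(p):
    """Sketch of the proposed builtin; not an existing function."""
    if isinstance(p, str):
        return p
    # Look the method up on the type, as is usual for special methods.
    method = getattr(type(p), "__path__", None)
    if method is None:
        raise TypeError(
            "expected str or an object with __path__, got "
            + type(p).__name__
        )
    return method(p)


print(fspath(DemoPath("a/b")))  # the protocol result
print(fspath("a/b"))            # the string itself
print(str(None))                # str() happily converts a non-path
```

Here str(None) silently produces the string "None", whereas fspath(None) raises TypeError, which is what makes (2) the safe option.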

Code that needs to be backward compatible will still have to use
str(path), because neither the builtin nor the __path__ protocol will
exist in older versions of Python. Maybe a compatibility library could
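add:

try:
    fspath
except NameError:
    # No builtin available, so define a fallback.
    try:
        import pathlib
    except ImportError:
        # No pathlib either, so paths can only be strings already.
        def fspath(p):
            return p
    else:
        def fspath(p):
            # PurePath is the base class, so this also covers Path.
            if isinstance(p, pathlib.PurePath):
                return str(p)
            return p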

It's messy, like all compatibility code, but it allows code to use
fspath(p) in older versions.

>> I'm inclined to think that if you are writing "pure
>> pathlib" code, pathobj.path looks more readable than fspath(pathobj) -
>> certainly no *less* readable.
>
> I don't know what you mean by "pure pathlib". You mean code that only works
> with pathlib objects? Or do you mean code that accepts pathlib objects but
> uses strings internally?

I mean code that knows it has a Path object to work with (and not a
string or anything else). But the point is moot if the path attribute
is going away.

That said, I do prefer the name "path"; I just don't think
it's a reasonable name for a builtin. Even if it's OK for user
variables to have the same name as builtins, IDEs tend to colour
builtins differently, which is distracting. (Temporary variables named
"file" or "dir" are the ones I hit frequently...)

If all we're debating is the name, though, I think we're pretty much there :-)

Paul