[Python-Dev] Defining a path protocol
Paul Moore
p.f.moore at gmail.com
Wed Apr 6 15:32:39 EDT 2016
On 6 April 2016 at 19:32, Brett Cannon <brett at python.org> wrote:
>> > Now we need clear details. :) Some open questions are:
>> >
>> > 1. Name: __path__, __fspath__, or something else?
>>
>> __fspath__
>
> +1 for __path__, +0 for __fspath__ (I don't know how widespread the notion
> that "fs" means "file system" is).
Agreed. But if we have a builtin, it should follow the name of the
special attribute/method. And I'm not that keen on having a builtin
with a generic name like 'path'.
>> > 2. Method or attribute? (changes what kind of one-liner you might use
>> > in libraries, but I think historically all protocols have been
>> > methods and the serialized string representation might be costly to
>> > build)
>>
>> I would prefer an attribute, but yeah I think dunders are typically
>> methods, and I don't see this being special enough to not follow that
>> trend.
>
> Depends on what we want to tell 3rd-party libraries to do to support pathlib
> if they are on 3.3 or if they are worried about people using Python 3.4.2 or
> 3.5.1. An attribute still works with `getattr(path, '__path__', path)`. But
> with a method you probably want either `path.__path__() if hasattr(path,
> '__path__') else path` or `getattr(path, '__path__', lambda: path)()`.
I'm a little confused by this. To support the older pathlib, they have
to do patharg = str(patharg), because *none* of the proposed
attributes (path or __path__) will exist.
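As a rough sketch of that older-pathlib fallback (the helper name legacy_path_str is made up for illustration, and PurePosixPath is used only so the result is the same on every platform):

```python
from pathlib import PurePosixPath


def legacy_path_str(patharg):
    """Hypothetical helper: coerce a str or an older pathlib object to str.

    Older pathlib objects have neither .path nor __path__, so str() is
    the only conversion available. Note that str() accepts any object
    at all, so this does no type checking.
    """
    return str(patharg)


# A pathlib object and a plain string both come out as the same string.
as_text = legacy_path_str(PurePosixPath('/tmp') / 'data.txt')
unchanged = legacy_path_str('/tmp/data.txt')
```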
The getattr trick is needed to support the *new* pathlib, when you
need a real string. Currently you need a string if you call stdlib
functions or builtins. If we fix the stdlib/builtins, the need goes
away for those cases, but remains if you need to call libraries that
*don't* support pathlib (os.path will likely be one of those) or do
direct string manipulation.
In practice, I see the getattr trick as an "easy fix" for libraries
that want to add support but in a minimally-intrusive way. On that
basis, making the trick easy to use is important, which argues for an
attribute.
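To make the comparison concrete, here is a minimal sketch of both forms of the trick. It assumes the protocol is spelled `__path__` as in the quoted text (the name is still undecided), and AttrPath and MethodPath are made-up classes used only for illustration:

```python
class AttrPath:
    """Hypothetical path object exposing the protocol as an attribute."""

    def __init__(self, raw):
        self.__path__ = raw


class MethodPath:
    """Hypothetical path object exposing the protocol as a method."""

    def __init__(self, raw):
        self._raw = raw

    def __path__(self):
        return self._raw


def to_str_attribute(patharg):
    # Attribute form: fall back to the argument itself if it has no __path__.
    return getattr(patharg, '__path__', patharg)


def to_str_method(patharg):
    # Method form: the fallback has to be a callable returning the argument.
    return getattr(patharg, '__path__', lambda: patharg)()
```

Both helpers pass a plain string through untouched; the attribute form is simply the shorter one-liner.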
>> > 3. Built-in? (name is dependent on #1 if we add one)
>>
>> fspath() -- and it would be handy to have a function that returns either
>> the __fspath__ results, or the string (if it was one), or raises an
>> exception if neither of the above work out.
fspath regardless of the name chosen in #1 - a new builtin called path
just has too much likelihood of clashing with user code.
But I'm not sure we need a builtin. I'm not at all clear how
frequently we expect user code to need to use this protocol. Users
can't use the builtin if they want to be backward compatible. But code
that doesn't need backward compatibility can probably just work with
pathlib (and the stdlib support for it) directly. For display, the
implicit conversion to str is fine. For "get me a string representing
the path", is the "path" attribute being abandoned in favour of this
special method? I'm inclined to think that if you are writing "pure
pathlib" code, pathobj.path looks more readable than fspath(pathobj) -
certainly no *less* readable.
But I'm not one of the people who disliked using .path, so I'm
probably not best placed to judge. It would be good if someone who
*does* feel strongly could explain why fspath(pathobj) is better than
pathobj.path.
> So:
>
> # Attribute
> def fspath(path):
>     if hasattr(path, '__path__'):
>         return path.__path__
>     if isinstance(path, str):
>         return path
>     raise NotImplementedError # Or TypeError?
>
> # Method
> def fspath(path):
>     try:
>         return path.__path__()
>     except AttributeError:
>         if isinstance(path, str):
>             return path
>     raise TypeError # Or NotImplementedError?
You could of course use try/except for the attribute case. Or hasattr
for the method case, where it would avoid masking AttributeError
exceptions raised within the dunder method call (a possibility if user
classes implement their own version of the protocol).
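A small sketch of the masking problem, again assuming the method is spelled `__path__`. BuggyPath is a made-up class whose method fails internally, and the two function names are only for illustration:

```python
def fspath_try(path):
    # try/except form: also catches AttributeError raised *inside* __path__().
    try:
        return path.__path__()
    except AttributeError:
        if isinstance(path, str):
            return path
    raise TypeError('expected str or an object with __path__')


def fspath_hasattr(path):
    # hasattr form: only the lookup is guarded, so errors from the call escape.
    if hasattr(path, '__path__'):
        return path.__path__()
    if isinstance(path, str):
        return path
    raise TypeError('expected str or an object with __path__')


class BuggyPath:
    """Hypothetical user class whose protocol method has a bug."""

    def __path__(self):
        return self.missing  # raises AttributeError from inside the method
```

With fspath_try, a BuggyPath instance produces a TypeError that hides the real bug; with fspath_hasattr, the original AttributeError propagates to the caller.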
Paul