[Python-Dev] Defining a path protocol
brett at python.org
Wed Apr 6 18:46:24 EDT 2016
On Wed, 6 Apr 2016 at 15:22 Paul Moore <p.f.moore at gmail.com> wrote:
> On 6 April 2016 at 20:39, Brett Cannon <brett at python.org> wrote:
> >> I'm a little confused by this. To support the older pathlib, they have
> >> to do patharg = str(patharg), because *none* of the proposed
> >> attributes (path or __path__) will exist.
> >> The getattr trick is needed to support the *new* pathlib, when you
> >> need a real string. Currently you need a string if you call stdlib
> >> functions or builtins. If we fix the stdlib/builtins, the need goes
> >> away for those cases, but remains if you need to call libraries that
> >> *don't* support pathlib (os.path will likely be one of those) or do
> >> direct string manipulation.
> >> In practice, I see the getattr trick as an "easy fix" for libraries
> >> that want to add support but in a minimally-intrusive way. On that
> >> basis, making the trick easy to use is important, which argues for an
> >> attribute.
> > So then where's the confusion? :) You seem to get the points. I
> > find `path.__path__() if hasattr(path, '__path__') else path` also
> > readable (if obviously a bit longer).
> The confusion is that you seem to be saying that people can use
> getattr(path, '__path__', path) to support older versions of Python.
> But the older versions are precisely the ones that don't have __path__
> so you won't be supporting them.
Because pathlib is provisional the change will go into the next releases of
Python 3.4, 3.5, and in 3.6 so new-old will have whatever we do. :) I think
the key point is that this sort of thing will occur before you have access
to some new built-in or something.
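
To make the point under discussion concrete, here is a minimal sketch of the getattr trick. `LegacyPath` and `ProtocolPath` are hypothetical stand-ins, not pathlib classes, and `__path__` is modelled as an attribute, which the thread had not yet settled.

```python
class LegacyPath:
    """Hypothetical path object that predates the protocol."""

    def __init__(self, raw):
        self._raw = raw

    def __str__(self):
        return self._raw


class ProtocolPath(LegacyPath):
    """Hypothetical path object that exposes __path__ as an attribute."""

    @property
    def __path__(self):
        return self._raw


def to_str_getattr(path):
    # The "getattr trick": fall back to the argument itself.
    return getattr(path, '__path__', path)


new = to_str_getattr(ProtocolPath('/tmp/spam'))   # a str
old = to_str_getattr(LegacyPath('/tmp/spam'))     # still a LegacyPath
plain = to_str_getattr('/tmp/spam')               # a str, unchanged
```

An object without the attribute comes back unconverted, so the trick only helps once the path type in use actually carries `__path__`.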
> >> >> > 3. Built-in? (name is dependent on #1 if we add one)
> >> >>
> >> >> fspath() -- and it would be handy to have a function that returns
> >> >> the __fspath__ results, or the string (if it was one), or raise an
> >> >> exception if neither of the above work out.
> >> fspath regardless of the name chosen in #1 - a new builtin called path
> >> just has too much likelihood of clashing with user code.
> >> But I'm not sure we need a builtin. I'm not at all clear how
> >> frequently we expect user code to need to use this protocol. Users
> >> can't use the builtin if they want to be backward compatible. But code
> >> that doesn't need backward compatibility can probably just work with
> >> pathlib (and the stdlib support for it) directly. For display, the
> >> implicit conversion to str is fine. For "get me a string representing
> >> the path", is the "path" attribute being abandoned in favour of this
> >> special method?
> > Yes.
> OK. So the idiom to get a string from a known Path object would be any of:
> 1. str(path)
> 2. fspath(path)
> 3. path.__path__()
> (1) is safe if you know you have a Path object, but could incorrectly
> convert non-Path objects. (2) is safe in all cases. (3) is ugly. Did I
> miss any options?
Other than path.__path__ being an attribute, nope.
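
A short illustration of why option (1) can "incorrectly convert non-Path objects": `str()` accepts anything, so a wrong argument type turns into a plausible-looking string rather than an error. The sample values are chosen only for illustration.

```python
import pathlib

p = pathlib.PurePosixPath('/tmp/spam')
as_text = str(p)                 # fine when we know we hold a path object

# str() never refuses its argument, so mistakes pass silently:
from_none = str(None)            # the text 'None', not an error
from_bytes = str(b'/tmp/spam')   # the repr of the bytes, quotes and all
```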
> So I think we need a builtin.
Well, the ugliness shouldn't survive forever if the community shifts over
to using pathlib while the built-in will. We also don't have a built-in for
__index__() so it depends on whether we expect this sort of thing to be the
purview of library authors or if normal people will be interacting with it
(it's probably both during the transition, but I don't know afterwards).
> Code that needs to be backward compatible will still have to use
> str(path), because neither the builtin nor the __path__ protocol will
> exist in older versions of Python.
str(path) will definitely work, path.__path__ will work if you're running
the next set of bugfix releases. fspath(path) will only work in Python 3.6.
> Maybe a compatibility library could add:
>
>     try:
>         fspath  # already available as a builtin?
>     except NameError:
>         try:
>             import pathlib
>             def fspath(p):
>                 if isinstance(p, pathlib.Path):
>                     return str(p)
>                 return p
>         except ImportError:
>             def fspath(p):
>                 return p
> It's messy, like all compatibility code, but it allows code to use
> fspath(p) in older versions.
I would tweak it to check for __fspath__ before it resorted to calling
str(), but yes, that could be something people use.
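
A sketch of that tweak, under two assumptions the thread had not settled: that the hook is named `__fspath__` and that it is a method. It also tests against `pathlib.PurePath` rather than `pathlib.Path`, a choice made here so that pure paths are covered as well.

```python
try:
    fspath  # use a built-in if the interpreter provides one
except NameError:
    try:
        import pathlib
        _path_types = (pathlib.PurePath,)
    except ImportError:
        _path_types = ()

    def fspath(p):
        # Prefer the protocol when the object's type offers it.
        hook = getattr(type(p), '__fspath__', None)
        if hook is not None:
            return hook(p)
        # Older pathlib without the protocol: fall back to str().
        if isinstance(p, _path_types):
            return str(p)
        # Anything else is passed through untouched.
        return p
```

The hook is looked up on the type, as the interpreter does for other special methods. Using `getattr(p, '__fspath__', None)` would be the choice if instance attributes should count too.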
> >> I'm inclined to think that if you are writing "pure
> >> pathlib" code, pathobj.path looks more readable than fspath(pathobj) -
> >> certainly no *less* readable.
> > I don't know what you mean by "pure pathlib". You mean code that only
> > works with pathlib objects? Or do you mean code that accepts pathlib
> > objects but uses strings internally?
> I mean code that knows it has a Path object to work with (and not a
> string or anything else). But the point is moot if the path attribute
> is going away.
> Other than to say that I do prefer the name "path", I just don't think
> it's a reasonable name for a builtin. Even if it's OK for user
> variables to have the same name as builtins, IDEs tend to colour
> builtins differently, which is distracting. (Temporary variables named
> "file" or "dir" are the ones I hit frequently...)
> If all we're debating is the name, though, I think we're pretty much there.
It seems like __fspath__ may be leading as a name, but not that many people
have spoken up. But that is not the only thing still up for debate. :)
We have not settled on whether a built-in is necessary. Maybe whatever
function we come up with should live in pathlib itself and not be a
built-in.
We have also not settled on whether __fspath__ should be a method or
attribute as that changes the boilerplate one-liner people may use if a
built-in isn't available. This is the first half of the protocol.
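
The two candidate one-liners can be compared side by side. `AttrStyle` and `MethodStyle` are hypothetical classes written only to show how each spelling of the protocol behaves.

```python
class AttrStyle:
    """Hypothetical object exposing __fspath__ as a plain attribute."""

    def __init__(self, raw):
        self.__fspath__ = raw


class MethodStyle:
    """Hypothetical object exposing __fspath__ as a method."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


def via_attribute(path):
    # Boilerplate if the protocol is an attribute.
    return getattr(path, '__fspath__', path)


def via_method(path):
    # Boilerplate if the protocol is a method.
    return path.__fspath__() if hasattr(path, '__fspath__') else path


# Mixing the two silently yields a bound method instead of a string.
mismatch = via_attribute(MethodStyle('/tmp/spam'))
```

The attribute form is shorter, but code written for one form misbehaves quietly against the other, so the choice has to be made once.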
What exactly should this helper function do? E.g. does it simply return its
argument if __fspath__ isn't defined, or does it check for __fspath__, then
if it's an instance of str, then TypeError? This is the second half of the
protocol and will end up defining what a "path-like object" represents.
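
The two behaviours described can be sketched as follows. The names `fspath_lenient` and `fspath_strict` are hypothetical, and `__fspath__` is assumed to be a method. Neither version is presented as the agreed design.

```python
def fspath_lenient(path):
    """Return path.__fspath__() if defined, else the argument unchanged."""
    hook = getattr(type(path), '__fspath__', None)
    return hook(path) if hook is not None else path


def fspath_strict(path):
    """Check __fspath__ first, then accept str, otherwise raise TypeError."""
    hook = getattr(type(path), '__fspath__', None)
    if hook is not None:
        return hook(path)
    if isinstance(path, str):
        return path
    raise TypeError(
        'expected str or an object with __fspath__, got %s'
        % type(path).__name__
    )
```

With the lenient form, `fspath_lenient(42)` hands back `42` and any failure surfaces later in whatever consumes the value. The strict form rejects it immediately, which is what would give "path-like object" a precise meaning.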