[Python-Dev] Defining a path protocol
Gregory P. Smith
greg at krypto.org
Wed Apr 6 18:54:42 EDT 2016
Note: While I do not object to the bike shed colors being proposed, if you
call the attribute .__path__ that is somewhat confusing when thinking about
the import system which declares that *"any module that contains a __path__
attribute is considered a package"*.
So would module.__path__ become a Path instance in a potential future
making module.__path__.__path__ meaningfully confusing? ;)
I'm not worried about people who shove pathlib.Path instances in as values
into sys.modules and expect anything but pain. :P
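
To make the naming clash concrete, here is a minimal sketch of the check the import system's wording implies. The helper name is_package is hypothetical; json and json.decoder are only convenient stdlib examples of a package and a plain module.

```python
import json          # a package: it has a __path__ attribute
import json.decoder  # a plain module inside that package: no __path__


def is_package(module):
    """Illustrative check only: a module with __path__ is treated as a package."""
    return hasattr(module, "__path__")


print(is_package(json))          # package
print(is_package(json.decoder))  # plain module
```

Any object that exposes __path__ for path-protocol purposes would pass this same hasattr test, which is the source of the confusion.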
On Wed, Apr 6, 2016 at 3:46 PM Brett Cannon <brett at python.org> wrote:
> On Wed, 6 Apr 2016 at 15:22 Paul Moore <p.f.moore at gmail.com> wrote:
>> On 6 April 2016 at 20:39, Brett Cannon <brett at python.org> wrote:
>> >> I'm a little confused by this. To support the older pathlib, they have
>> >> to do patharg = str(patharg), because *none* of the proposed
>> >> attributes (path or __path__) will exist.
>> >> The getattr trick is needed to support the *new* pathlib, when you
>> >> need a real string. Currently you need a string if you call stdlib
>> >> functions or builtins. If we fix the stdlib/builtins, the need goes
>> >> away for those cases, but remains if you need to call libraries that
>> >> *don't* support pathlib (os.path will likely be one of those) or do
>> >> direct string manipulation.
>> >> In practice, I see the getattr trick as an "easy fix" for libraries
>> >> that want to add support but in a minimally-intrusive way. On that
>> >> basis, making the trick easy to use is important, which argues for an
>> >> attribute.
>> > So then where's the confusion? :) You seem to get the points. I
>> > find `path.__path__() if hasattr(path, '__path__') else path` also
>> > readable (if obviously a bit longer).
>> The confusion is that you seem to be saying that people can use
>> getattr(path, '__path__', path) to support older versions of Python.
>> But the older versions are precisely the ones that don't have __path__
>> so you won't be supporting them.
> Because pathlib is provisional the change will go into the next releases
> of Python 3.4, 3.5, and in 3.6 so new-old will have whatever we do. :) I
> think the key point is that this sort of thing will occur before you have
> access to some new built-in or something.
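
As a rough illustration of the two one-liners discussed above, the sketch below compares the attribute flavor with the method flavor. AttrPath and MethodPath are hypothetical stand-ins, not pathlib classes, and whether __path__ ends up as an attribute or a method is still open at this point in the thread.

```python
class AttrPath:
    """Hypothetical path object exposing __path__ as a plain attribute."""

    def __init__(self, raw):
        self.__path__ = raw


class MethodPath:
    """Hypothetical path object exposing __path__ as a method."""

    def __init__(self, raw):
        self._raw = raw

    def __path__(self):
        return self._raw


def to_str_attr(path):
    # Attribute flavor: a single getattr with the object itself as fallback.
    return getattr(path, "__path__", path)


def to_str_method(path):
    # Method flavor: the longer conditional expression.
    return path.__path__() if hasattr(path, "__path__") else path


print(to_str_attr(AttrPath("/tmp/spam")))
print(to_str_method(MethodPath("/tmp/spam")))
print(to_str_attr("/tmp/spam"), to_str_method("/tmp/spam"))
```

Both helpers pass plain strings through untouched, which is what lets a library accept either kind of argument.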
>> >> >> > 3. Built-in? (name is dependent on #1 if we add one)
>> >> >>
>> >> >> fspath() -- and it would be handy to have a function that returns
>> >> >> the __fspath__ results, or the string (if it was one), or raise an
>> >> >> exception if neither of the above work out.
>> >> fspath regardless of the name chosen in #1 - a new builtin called path
>> >> just has too much likelihood of clashing with user code.
>> >> But I'm not sure we need a builtin. I'm not at all clear how
>> >> frequently we expect user code to need to use this protocol. Users
>> >> can't use the builtin if they want to be backward compatible. But code
>> >> that doesn't need backward compatibility can probably just work with
>> >> pathlib (and the stdlib support for it) directly. For display, the
>> >> implicit conversion to str is fine. For "get me a string representing
>> >> the path", is the "path" attribute being abandoned in favour of this
>> >> special method?
>> > Yes.
>> OK. So the idiom to get a string from a known Path object would be any of:
>> 1. str(path)
>> 2. fspath(path)
>> 3. path.__path__()
>> (1) is safe if you know you have a Path object, but could incorrectly
>> convert non-Path objects. (2) is safe in all cases. (3) is ugly. Did I
>> miss any options?
> Other than path.__path__ being an attribute, nope.
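
The risk with option (1) can be shown in a few lines. This is only a sketch: naive_path_str is a hypothetical name for "just call str()", and the arguments are arbitrary examples of things that are not paths.

```python
import pathlib


def naive_path_str(obj):
    """Hypothetical helper: blindly convert whatever was passed in."""
    return str(obj)


# A real path object converts as expected.
print(naive_path_str(pathlib.PurePosixPath("/tmp/spam")))

# Non-path objects also "succeed", producing strings that are not usable paths.
print(repr(naive_path_str(None)))
print(repr(naive_path_str(b"/tmp/spam")))
```

No exception is raised in the last two cases, so the mistake only surfaces later, when the bogus string reaches the filesystem.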
>> So I think we need a builtin.
> Well, the ugliness shouldn't survive forever if the community shifts over
> to using pathlib while the built-in will. We also don't have a built-in for
> __index__() so it depends on whether we expect this sort of thing to be the
> purview of library authors or if normal people will be interacting with it
> (it's probably both during the transition, but I don't know afterwards).
>> Code that needs to be backward compatible will still have to use
>> str(path), because neither the builtin nor the __path__ protocol will
>> exist in older versions of Python.
> str(path) will definitely work, path.__path__ will work if you're running
> the next set of bugfix releases. fspath(path) will only work in Python 3.6
> and newer.
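>> Maybe a compatibility library could add
>>
>>     try:
>>         fspath  # already available as a builtin?
>>     except NameError:
>>         try:
>>             import pathlib
>>             def fspath(p):
>>                 if isinstance(p, pathlib.Path):
>>                     return str(p)
>>                 return p
>>         except ImportError:
>>             # no pathlib at all, so only strings can show up
>>             def fspath(p):
>>                 return p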
>> It's messy, like all compatibility code, but it allows code to use
>> fspath(p) in older versions.
> I would tweak it to check for __fspath__ before it resorted to calling
> str(), but yes, that could be something people use.
>> >> I'm inclined to think that if you are writing "pure
>> >> pathlib" code, pathobj.path looks more readable than fspath(pathobj) -
>> >> certainly no *less* readable.
>> > I don't know what you mean by "pure pathlib". You mean code that only
>> > works with pathlib objects? Or do you mean code that accepts pathlib objects
>> > but uses strings internally?
>> I mean code that knows it has a Path object to work with (and not a
>> string or anything else). But the point is moot if the path attribute
>> is going away.
>> Other than to say that I do prefer the name "path", I just don't think
>> it's a reasonable name for a builtin. Even if it's OK for user
>> variables to have the same name as builtins, IDEs tend to colour
>> builtins differently, which is distracting. (Temporary variables named
>> "file" or "dir" are the ones I hit frequently...)
>> If all we're debating is the name, though, I think we're pretty much
>> there :-)
> It seems like __fspath__ may be leading as a name, but not that many
> people have spoken up. But that is not the only thing still up for debate.
> We have not settled on whether a built-in is necessary. Maybe whatever
> function we come up with should live in pathlib itself and not have it be a
> built-in?
> We have also not settled on whether __fspath__ should be a method or
> attribute as that changes the boilerplate one-liner people may use if a
> built-in isn't available. This is the first half of the protocol.
> What exactly should this helper function do? E.g. does it simply return
> its argument if __fspath__ isn't defined, or does it check for __fspath__,
> then if it's an instance of str, then TypeError? This is the second half of
> the protocol and will end up defining what a "path-like object" represents.
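
The two candidate behaviours for the helper can be sketched side by side. This assumes __fspath__ is a method, which the thread has not settled. The names fspath_lenient, fspath_strict and DemoPath are hypothetical and exist only for illustration.

```python
def fspath_lenient(path):
    """Return path.__fspath__() if defined, otherwise the argument unchanged."""
    if hasattr(path, "__fspath__"):
        return path.__fspath__()
    return path


def fspath_strict(path):
    """Use __fspath__ if defined, accept str as-is, otherwise raise TypeError."""
    if hasattr(path, "__fspath__"):
        return path.__fspath__()
    if isinstance(path, str):
        return path
    raise TypeError(
        "expected str or an object with __fspath__, got " + type(path).__name__
    )


class DemoPath:
    """Hypothetical path-like object implementing the method flavor."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


print(fspath_lenient(DemoPath("/tmp/spam")), fspath_strict(DemoPath("/tmp/spam")))
print(fspath_lenient(42))  # passed through untouched
try:
    fspath_strict(42)
except TypeError as exc:
    print("rejected:", exc)
```

The lenient version pushes any error to whatever consumes the result, while the strict version makes "path-like" mean exactly "str or defines __fspath__".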