[Python-Dev] Defining a path protocol
Brett Cannon
brett at python.org
Wed Apr 6 19:27:27 EDT 2016
On Wed, 6 Apr 2016 at 15:54 Gregory P. Smith <greg at krypto.org> wrote:
> Note: While I do not object to the bike shed colors being proposed, if you
> call the attribute .__path__ that is somewhat confusing when thinking about
> the import system which declares that *"any module that contains a
> __path__ attribute is considered a package"*.
>
> So would module.__path__ become a Path instance in a potential future
> making module.__path__.__path__ meaningfully confusing? ;)
>
> I'm not worried about people who shove pathlib.Path instances in as values
> into sys.modules and expect anything but pain. :P
>
Ah, good point. I think that kills __path__ then as an option.
-Brett
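
To make the clash concrete, here is a minimal sketch. It assumes a standard CPython install where json is a regular package and string is a single-file module; naive_path is a hypothetical helper written for illustration, not something proposed in this thread.

```python
import json    # a package in the standard library
import string  # a plain single-file module

def naive_path(obj):
    # Hypothetical helper: the getattr one-liner, with __path__ assumed
    # to be the protocol attribute.
    return getattr(obj, '__path__', obj)

# The import system already sets __path__ on packages (and only packages).
print(hasattr(json, '__path__'))
print(hasattr(string, '__path__'))

# A package would therefore be mistaken for a path-like object, and what
# comes back is the package's search locations rather than a string.
print(isinstance(naive_path(json), str))
```

Picking a name the import system does not already use, such as __fspath__, avoids this overlap.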
>
> __gps__
>
>
>
> On Wed, Apr 6, 2016 at 3:46 PM Brett Cannon <brett at python.org> wrote:
>
>> On Wed, 6 Apr 2016 at 15:22 Paul Moore <p.f.moore at gmail.com> wrote:
>>
>>> On 6 April 2016 at 20:39, Brett Cannon <brett at python.org> wrote:
>>> >> I'm a little confused by this. To support the older pathlib, they have
>>> >> to do patharg = str(patharg), because *none* of the proposed
>>> >> attributes (path or __path__) will exist.
>>> >>
>>> >> The getattr trick is needed to support the *new* pathlib, when you
>>> >> need a real string. Currently you need a string if you call stdlib
>>> >> functions or builtins. If we fix the stdlib/builtins, the need goes
>>> >> away for those cases, but remains if you need to call libraries that
>>> >> *don't* support pathlib (os.path will likely be one of those) or do
>>> >> direct string manipulation.
>>> >>
>>> >> In practice, I see the getattr trick as an "easy fix" for libraries
>>> >> that want to add support but in a minimally-intrusive way. On that
>>> >> basis, making the trick easy to use is important, which argues for an
>>> >> attribute.
>>> >
>>> > So then where's the confusion? :) You seem to get the points. I
>>> > personally find `path.__path__() if hasattr(path, '__path__') else path`
>>> > also readable (if obviously a bit longer).
>>>
>>> The confusion is that you seem to be saying that people can use
>>> getattr(path, '__path__', path) to support older versions of Python.
>>> But the older versions are precisely the ones that don't have __path__
>>> so you won't be supporting them.
>>>
>>
>> Because pathlib is provisional the change will go into the next releases
>> of Python 3.4, 3.5, and in 3.6 so new-old will have whatever we do. :) I
>> think the key point is that this sort of thing will occur before you have
>> access to some new built-in or something.
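
A rough sketch of the two one-liners being compared, one for each way the protocol could be spelled. AttrPath and MethodPath are hypothetical stand-ins, not real pathlib classes, and __path__ is used only because it is the name under discussion at this point in the thread.

```python
class AttrPath:
    """Hypothetical path object exposing the protocol as an attribute."""
    def __init__(self, raw):
        self.__path__ = raw

class MethodPath:
    """Hypothetical path object exposing the protocol as a method."""
    def __init__(self, raw):
        self._raw = raw

    def __path__(self):
        return self._raw

def from_attribute(path):
    # Attribute form: the getattr trick, falling back to the object itself.
    return getattr(path, '__path__', path)

def from_method(path):
    # Method form: slightly longer because the method has to be called.
    return path.__path__() if hasattr(path, '__path__') else path

print(from_attribute(AttrPath('/tmp/a')))
print(from_method(MethodPath('/tmp/b')))
print(from_attribute('/tmp/c'), from_method('/tmp/c'))
```

In both forms a plain string passes through untouched, which is what makes the trick usable in libraries that accept either kind of argument.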
>>
>>
>>>
>>> >> >> > 3. Built-in? (name is dependent on #1 if we add one)
>>> >> >>
>>> >> >> fspath() -- and it would be handy to have a function that returns
>>> >> >> either the __fspath__ results, or the string (if it was one), or
>>> >> >> raises an exception if neither of the above work out.
>>> >>
>>> >> fspath regardless of the name chosen in #1 - a new builtin called path
>>> >> just has too much likelihood of clashing with user code.
>>> >>
>>> >> But I'm not sure we need a builtin. I'm not at all clear how
>>> >> frequently we expect user code to need to use this protocol. Users
>>> >> can't use the builtin if they want to be backward compatible. But code
>>> >> that doesn't need backward compatibility can probably just work with
>>> >> pathlib (and the stdlib support for it) directly. For display, the
>>> >> implicit conversion to str is fine. For "get me a string representing
>>> >> the path", is the "path" attribute being abandoned in favour of this
>>> >> special method?
>>> >
>>> > Yes.
>>>
>>> OK. So the idiom to get a string from a known Path object would be any
>>> of:
>>>
>>> 1. str(path)
>>> 2. fspath(path)
>>> 3. path.__path__()
>>>
>>> (1) is safe if you know you have a Path object, but could incorrectly
>>> convert non-Path objects. (2) is safe in all cases. (3) is ugly. Did I
>>> miss any options?
>>>
>>
>> Other than path.__path__ being an attribute, nope.
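
A small illustration of why option (1) is only safe when the argument is known to be a Path. coerce_with_str is a hypothetical helper standing in for library code that calls str() on whatever it receives.

```python
import pathlib

def coerce_with_str(patharg):
    # Hypothetical helper: blindly converts its argument with str().
    return str(patharg)

# Works as intended for a real path object.
print(coerce_with_str(pathlib.PurePosixPath('/tmp/example')))

# Also "works" for objects that are not paths at all, silently producing
# strings that look like relative file names.
print(repr(coerce_with_str(None)))
print(repr(coerce_with_str(42)))
```

A dedicated protocol lets such mistakes surface as errors instead of as files named "None".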
>>
>>
>>>
>>> So I think we need a builtin.
>>>
>>
>> Well, the ugliness shouldn't survive forever if the community shifts over
>> to using pathlib, whereas the built-in will. We also don't have a built-in for
>> __index__() so it depends on whether we expect this sort of thing to be the
>> purview of library authors or if normal people will be interacting with it
>> (it's probably both during the transition, but I don't know afterwards).
>>
>>
>>>
>>> Code that needs to be backward compatible will still have to use
>>> str(path), because neither the builtin nor the __path__ protocol will
>>> exist in older versions of Python.
>>
>>
>> str(path) will definitely work, path.__path__ will work if you're running
>> the next set of bugfix releases. fspath(path) will only work in Python 3.6
>> and newer.
>>
>>
>>> Maybe a compatibility library could add
>>>
>>> try:
>>>     fspath
>>> except NameError:
>>>     try:
>>>         import pathlib
>>>         def fspath(p):
>>>             if isinstance(p, pathlib.Path):
>>>                 return str(p)
>>>             return p
>>>     except ImportError:
>>>         def fspath(p):
>>>             return p
>>>
>>> It's messy, like all compatibility code, but it allows code to use
>>> fspath(p) in older versions.
>>>
>>
>> I would tweak it to check for __fspath__ before it resorted to calling
>> str(), but yes, that could be something people use.
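
One way the tweak might look, as a sketch only. It assumes __fspath__ ends up being a method (still undecided in this thread) and that no built-in named fspath exists on the running interpreter; DemoPath is a hypothetical third-party path class.

```python
import pathlib

try:
    fspath  # use a built-in if this Python happens to provide one
except NameError:
    def fspath(p):
        # Prefer the protocol when the object supports it
        # (assumption: __fspath__ is a method, not an attribute).
        if hasattr(p, '__fspath__'):
            return p.__fspath__()
        # Older pathlib without the protocol: fall back to str().
        if isinstance(p, pathlib.PurePath):
            return str(p)
        # Anything else (typically a str) is returned unchanged.
        return p

class DemoPath:
    """Hypothetical non-pathlib object implementing the protocol."""
    def __fspath__(self):
        return '/srv/demo'

print(fspath(DemoPath()))
print(fspath(pathlib.PurePosixPath('/tmp/x')))
print(fspath('already/a/string'))
```

Checking the protocol first means path classes outside pathlib are handled without the shim needing to know about them.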
>>
>>
>>>
>>> >> I'm inclined to think that if you are writing "pure
>>> >> pathlib" code, pathobj.path looks more readable than fspath(pathobj) -
>>> >> certainly no *less* readable.
>>> >
>>> > I don't know what you mean by "pure pathlib". You mean code that only
>>> > works with pathlib objects? Or do you mean code that accepts pathlib
>>> > objects but uses strings internally?
>>>
>>> I mean code that knows it has a Path object to work with (and not a
>>> string or anything else). But the point is moot if the path attribute
>>> is going away.
>>>
>>> Other than to say that I do prefer the name "path", I just don't think
>>> it's a reasonable name for a builtin. Even if it's OK for user
>>> variables to have the same name as builtins, IDEs tend to colour
>>> builtins differently, which is distracting. (Temporary variables named
>>> "file" or "dir" are the ones I hit frequently...)
>>>
>>> If all we're debating is the name, though, I think we're pretty much
>>> there :-)
>>>
>>
>> It seems like __fspath__ may be leading as a name, but not that many
>> people have spoken up. But that is not the only thing still up for debate.
>> :)
>>
>> We have not settled on whether a built-in is necessary. Maybe whatever
>> function we come up with should live in pathlib itself and not be a
>> built-in?
>>
>> We have also not settled on whether __fspath__ should be a method or
>> attribute as that changes the boilerplate one-liner people may use if a
>> built-in isn't available. This is the first half of the protocol.
>>
>> What exactly should this helper function do? E.g. does it simply return
>> its argument if __fspath__ isn't defined, or does it check for __fspath__,
>> then if it's an instance of str, then TypeError? This is the second half of
>> the protocol and will end up defining what a "path-like object" represents.
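
The two candidate behaviours for the helper could be sketched roughly as follows. Both function names are hypothetical, __fspath__ is assumed to be a method, and neither version is presented as the decided design.

```python
def fspath_permissive(obj):
    # Variant 1: use __fspath__ if present, otherwise hand the
    # argument back unchanged, whatever it is.
    method = getattr(type(obj), '__fspath__', None)
    return method(obj) if method is not None else obj

def fspath_strict(obj):
    # Variant 2: use __fspath__ if present, accept str as-is,
    # and reject everything else with TypeError.
    method = getattr(type(obj), '__fspath__', None)
    if method is not None:
        return method(obj)
    if isinstance(obj, str):
        return obj
    raise TypeError('expected str or an object with __fspath__, got '
                    + type(obj).__name__)

class DemoPath:
    """Hypothetical object implementing the protocol."""
    def __fspath__(self):
        return '/srv/demo'

print(fspath_permissive(DemoPath()), fspath_strict(DemoPath()))
print(fspath_permissive(42))   # passes the int straight through
try:
    fspath_strict(42)
except TypeError as exc:
    print('rejected:', exc)
```

The strict variant is the one that actually defines what counts as a "path-like object"; the permissive one defers that decision to whatever consumes the result.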
>>