[Python-Dev] pathlib - current status of discussions

Koos Zevenhoven k7hoven at gmail.com
Thu Apr 14 15:17:21 EDT 2016


On Thu, Apr 14, 2016 at 9:35 PM, Random832 <random832 at fastmail.com> wrote:
> On Thu, Apr 14, 2016, at 13:56, Koos Zevenhoven wrote:
>> (1) Code that has access to pathname/filename data and has some level
>> of control over what data type comes in. This code may for instance
>> choose to deal with either bytes or str
>>
>> (2) Code that takes the path or file name that it happens to get and
>> does something with it. This type of code can be divided into
>> subgroups as follows:
>>
>>   (2a) Code that accepts only one type of paths (e.g. str, bytes or
>> pathlib) and fails if it gets something else.
>
> Ideally, these should go away.
>

I don't think so. (1) might even be the most common type of all code.
This is code that gets a path from user input, from a config file,
from a database etc. and then does things with it, typically including
passing it to type (2) code and potentially getting a path back from
there too.

>>   (2b) Code that wants to support different types of paths such as
>> str, bytes or pathlib objects. This includes os.path.*, os.scandir,
>> and various other standard library code. Presumably there is also
>> third-party code that does the same. These functions may want to
>> preserve the str-ness or bytes-ness of the paths in case they return
>> paths, as the stdlib now does. But new code may even want to return
>> pathlib objects when they get such objects as inputs.
>
> Hold on. None of the discussion I've seen has included any way to
> specify how to construct a new object representing a different path
> other than the ones passed in. Surely you're not suggesting type(a)(b).
>

That's right. This protocol is not solving the issue of returning
'rich' path objects. It's solving the issue of passing those objects
to lower-level functions or having them interact with other 'rich' path types.
What I meant by this is that there may be code that *does* want to do
type(a)(b), which is out of our control. Maybe I should not have
mentioned that.

> Also, how does DirEntry fit in with any of this?
>

os.scandir + DirEntry are among the many things in the stdlib that
give you pathnames of the same type as those that were put in.
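
To illustrate, here is a small sketch. The helper name entry_name_types
is made up for this example, and it assumes a Python version where
os.scandir can be used as a context manager:

```python
import os
import tempfile


def entry_name_types(directory):
    """Collect the types of DirEntry.name for every entry in directory."""
    with os.scandir(directory) as entries:
        return {type(entry.name) for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # str directory in, str names out.
    str_types = entry_name_types(tmp)
    # bytes directory in, bytes names out.
    bytes_types = entry_name_types(os.fsencode(tmp))

print(str_types, bytes_types)
```

The caller decides the type once, at the point where the path enters the
program, and scandir simply hands the same type back.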

>> This is the
>> duck-typing or polymorphic code we have been talking about. Code of
>> this type (2b) may want to avoid implicit conversions because it makes
>> the life of code of the other types more difficult.
>
> As long as the type it returns is still a path/bytes/str (and therefore
> can be accepted when the caller passes it somewhere else) what's the
> problem?

The problem is that not all paths are passed through the function that
does the implicit conversion. Then, when you for instance os.path.join
two paths of different types, it raises an error.
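
A minimal sketch of that failure mode, using str and bytes as the two
types (try_join is a hypothetical wrapper that only exists to capture the
exception for display):

```python
import os.path


def try_join(a, b):
    """Join two path components, returning the TypeError instead of raising."""
    try:
        return os.path.join(a, b)
    except TypeError as exc:
        return exc


# Same type in, same type out.
same_str = try_join("dir", "file.txt")
same_bytes = try_join(b"dir", b"file.txt")

# One component was converted somewhere along the way, the other was not.
mixed = try_join("dir", b"file.txt")

print(type(same_str).__name__, type(same_bytes).__name__, type(mixed).__name__)
```

So a conversion that happens in one place can surface as an error much
later, where the converted path meets one that was never converted.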

In other words: Most non-library code (even library code?) deals with
one specific type and does not want implicit conversions to other
types. Some code (2b) deals with several types and, at least in the
stdlib, such code returns paths of the same type as they are given,
which makes said "most non-library code" happy, because it does not
force the programmer to think about type conversions.

(Then there is also code that explicitly deals with type conversions,
such as os.fsencode and os.fsdecode.)
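
For completeness, a short sketch of such explicit conversion, assuming an
ASCII-only file name so that the result does not depend on the filesystem
encoding:

```python
import os

name = "report.txt"

# Explicit str -> bytes conversion using the filesystem encoding.
as_bytes = os.fsencode(name)

# Explicit bytes -> str conversion back again.
as_str = os.fsdecode(as_bytes)

# Values that already have the target type are returned unchanged.
still_bytes = os.fsencode(as_bytes)

print(as_bytes, as_str, still_bytes)
```

Here the programmer asks for the conversion by name, so there is nothing
implicit about it.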

-Koos

