On Tue, 29 Mar 2016 at 02:45 Michel Desmoulin <desmoulinmichel@gmail.com> wrote:

Le 29/03/2016 09:50, Sven R. Kunze a écrit :
> On 25.03.2016 22:56, Michel Desmoulin wrote:
>> Although path.py, which I have been using for years now (and still
>> prefer to pathlib) subclass str and never caused any problem whatsoever.
>> So really, we should pinpoint where it could be an issue and see if this
>> is a frequent problem, because right now, it seems more a decision based
>> on purity than practicality.
> I agree. The PEP says:
> """
> No confusion with builtins
> ----------------------------------
> In this proposal, the path classes do not derive from a builtin type.
> This contrasts with some other Path class proposals which were derived
> from str . They also do not pretend to implement the sequence protocol:
> if you want a path to act as a sequence, you have to lookup a dedicated
> attribute (the parts attribute).
> Not behaving like one of the basic builtin types also minimizes the
> potential for confusion if a path is combined by accident with genuine
> builtin types.
> """"
> I have to admit I cannot follow these statements but they should have
> appeared to be necessary back then. As experience shows the PyPI module
> fared far better here.
> I am great a fan of theory over practice as what has been proven in
> theory cannot be proven wrong in practice. However, this only holds if
> we talk about hard proof. For soft things like "human interaction",
> "mutual understanding" or "potential for confusion", only the practice
> of many many people can "prove" what's useful, what's "practical".

+1. Can somebody defend, with practical examples, the imperative to
refuse to inherit from str ? And weight it against the benefits of it ?

We have been doing it for years, and so far it's been really, really
nice. I can't recall problem I had because of it ever. I can recall
numerous editions of my code because I forgot str() on pathlib.Path.

 I may regret this, but I will give this a shot. :)

I wrote a blog post at http://www.snarky.ca/why-pathlib-path-doesn-t-inherit-from-str because my response was getting a bit long. I have pasted it below in CommonMark.


Over on [python-ideas](https://mail.python.org/mailman/listinfo/python-ideas) a discussion has broken out about somehow trying to make `p'/some/path/to/a/file` return an [instance of `pathlib.Path`](https://docs.python.org/3/library/pathlib.html#pathlib.Path). This led to a splinter discussion as to why `pathlib.Path` doesn't inherit from `str`? I figured instead of burying my response to this question in the thread I'd blog about it to try and explain one approach to API design.

I think the key question in all of this is whether paths are semantically equivalent to a sequence of characters? Obviously the answer is "no" since paths have structure, directly represent something like files and directories, etc. Now paths do have a serialized representation as strings -- at least most of the time, but I'm ignoring Linux and the crazy situation of binary paths -- which is why C APIs take in `char *` as the representation of paths (and because C tends to make one try to stick with the types that are part of the C standard). So not all strings represent paths, but all paths can be represented by strings. I think we can all agree with that.

OK, so if all paths can be represented as strings, why don't we just make `pathlib.Path` subclass `str`? Well, as I said earlier, not all strings are paths. You can't concatenate a string representing a path with some other random string and expect to get back a string that still represents a valid string (and I'm not talking about "valid" as in "the file doesn't exist", I'm talking about "syntactically not possible"). This is what [PEP 428](https://www.python.org/dev/peps/pep-0428/) is talking about when it says:

> Not behaving like one of the basic builtin types [list str] also minimizes the potential for confusion if a path is combined by accident with genuine builtin types [like str].

Now at this point someone someone will start saying, "but Brett, '[practicality beats purity](https://www.python.org/dev/peps/pep-0020/)' and all of those pre-existing, old OS APIs want a string as an argument!" And you're right, in terms of practicality it would be easier for `pathlib.Path` to inherit from `str` ... for the short/medium term. One thing people are forgetting is that "[explicit is better than implicit](https://www.python.org/dev/peps/pep-0020/)" as well and as with most design decisions there's not one clear answer. Implicitly treating a path as a string currently works, but that's just because we have inherited a suboptimal representation for paths from C. Now if you use an explicit representation of paths like `pathlib.Path` then you gain semantic separation and understanding of what you are working with. You also avoid any issues that come with implicit `str` compatibility as I pointed out earlier.

And before anyone says, "so what?", think about this: Python 2/3. While I'm sure some will say I'm overblowing the comparison, but Python 3 came about because the implicit compatibility of binary and textual data in Python 2 caused major headaches to the point that we made a backwards-incompatible change that has caused widespread ramifications (but which we seem to be coming out on the other side of). By not inheriting from `str`, `pathlib.Path` has avoided a similar potential issue from the get-go. Or another way of looking at it is to ask why doesn't `dict` inherit from `str` so that it's easier to make it easier to stick in the body of an HTTP response so it can be implicitly treated as JSON? If you going, "eww, no because they are different types" then you at least understand the argument I'm trying to make (if you're going "I like that idea", then I think [Perl](https://www.perl.org/) might be more to your liking than Python and that's fine since the two languages just have different designs).

Now back to that "practicality beats purity" side of this. Obviously there are lots of APIs out there that take a string as an argument for a file path. And I understand people don't like using `str(path)` or the upcoming/new [`getattr(path, 'path', path)` idiom](https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.path) to work around the limitations forced upon them by other APIs that only take a `str` because they occasionally forget it. For this point, I have two answers. One is still "explicit is better than implicit" and you should have tests to begin with to catch when you forget to convert your `pathlib.Path` objects to `str` as needed. I'm just one of those programmers who's willing to type a bit more for easy-to-read code that's less error-prone within reason (and I obviously view this as reasonable).

But more importantly, why don't you work with the code and/or projects that are forcing you to convert to `str` to start accepting `pathlib` objects as well for those same APIs? If projects start to work on making `pathlib` objects acceptable anywhere a path is accepted then once Python 3.4 is the oldest version of Python that is supported by the project then they can start considering dropping support for `str` as paths in new releases (obviously this also includes dropping support for Python 2 unless people use [pathlib2](https://pypi.python.org/pypi/pathlib2/)). And I fully admit the stdlib is not exempt from a place that needs updating, so for `importlib` I have [opened an issue](http://bugs.python.org/issue26667) to update it to show this isn't just talk on my end (unfortunately there's some unique bootstrapping problems to import where stuff like `sys.path` probably can't be updated since that has to exist before you can import `pathlib` and that sort of thing ripples out, but I'm hoping I can update at least some things and this is a very unique case of potentially needing to stick with `str`). Hopefully if people start asking for `pathlib` support from projects they will add it, eventually leading to an alleviation of the desire to have `pathlib.Path` inherit from `str`.