[Python-ideas] Working with Path objects: p-strings?

Paul Moore p.f.moore at gmail.com
Wed Mar 30 11:18:41 EDT 2016


On 30 March 2016 at 15:26, Chris Barker - NOAA Federal
<chris.barker at noaa.gov> wrote:
>>> They actually want the benefits of both, the pure datastructure with its convenience methods and the dirty str-like thing with its convenience methods.
>>
>> The desire is not "str vs. Path"; it's "Path + str". (at least from what I can tell)
>
> I don't think so (though I agree with the lazy part).

I'm not sure I follow your point here.

> People want paths to be strings so that they will work with all the code
> that already works with strings.

Correct. That's the prime motivation. But you then say:

> But whatever happened to duck typing? Paths don't need to BE strings.
> Rather, everything that needs a path needs to accept anything that
> acts like a path.

But "all the code that already works with strings" doesn't do that. If
we're allowed to change that code then a simple

    patharg = getattr(patharg, 'path', patharg)

is sufficient to work with path objects or strings.
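
To make that concrete, here is a minimal sketch of a string-based
function patched this way. The PathObject class and the
legacy_basename name are hypothetical, made up for illustration; the
only assumption is that a path object exposes its string form as a
'path' attribute.

```python
import os.path


class PathObject:
    """Hypothetical path-like object exposing its string form as .path."""

    def __init__(self, path):
        self.path = str(path)


def legacy_basename(patharg):
    # Accept either a plain string or any object with a 'path' attribute.
    patharg = getattr(patharg, 'path', patharg)
    # From here on, the existing string-based code runs unchanged.
    return os.path.basename(patharg)


from_string = legacy_basename("some/dir/file.txt")
from_object = legacy_basename(PathObject("some/dir/file.txt"))
```

Both calls go through the same string-based code path; only the first
line of the function had to change.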

Of course this doesn't address functions that *return* paths (as
strings). There the caller has to wrap the return value in Path(). Or
the function changes to return Path objects, which won't be backward
compatible (whether that matters depends on what the code is).

> I suppose a __path__ magic method would be the "proper" way to do
> this, but it's really not necessary, it can be handled by the libs
> themselves.

Well, the 'path' attribute is basically the same as a __path__ magic
method. (That may be what you mean by "can be handled by the libs" -
I'm not sure how you'd handle it *other* than in the libraries that
currently assume a string).

> I think there is a fine plan in place, but if more flexibility is
> required, something like numpy's asarray() would be handy -- it passes
> actual Numpy arrays through untouched, and makes various efforts to
> make an array out of anything else.
>
> It's remarkably easy and effective to write functions that take
> virtually anything remotely array like -- lists, tuples, array.arrays,
> custom objects, etc.

Again, this seems to be backwards - the problem isn't treating things
that aren't paths as paths (just do p = Path(p) and that's sorted -
paths are immutable so there's no need for the complexity that I
imagine asarray has to manage). The problem is existing code that
expects and/or returns strings. Until that is changed or replaced with
code that expects and returns Path objects, people will always need to
convert to and from strings.
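
As a rough sketch of what that conversion looks like in practice (the
with_backup_suffix function is a hypothetical example, not taken from
any library): the incoming side is one line, and the outgoing side is
an explicit str() wherever string-only code is called.

```python
import os.path
from pathlib import Path


def with_backup_suffix(p):
    # Normalise once at the boundary; this accepts a str or a Path,
    # and wrapping an existing Path in Path() is harmless.
    p = Path(p)
    return p.with_name(p.name + ".bak")


backup = with_backup_suffix("data/report.txt")

# Code that expects a string still needs an explicit conversion back.
legacy_result = os.path.basename(str(backup))
```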

And of course code that has to deal with *both* Path and string
objects (for compatibility reasons) can easily enough handle anything
it receives, but has a decision to make about what to return - if it
returns Path objects, it won't be backward compatible. But if it
returns string objects, we'll never get away from the need to convert
strings to paths in our code at some point.

As a simple case in point, what should the appdirs module
(https://pypi.python.org/pypi/appdirs) do? Continue to return strings,
and the user needs to wrap them in Path(), or switch to returning Path
objects and break backward compatibility? Or maintain a messy API that
caters (somehow) for both possibilities?
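
To illustrate the third option, here is one way a library could cater
for both. This is a sketch only: the data_dir function and its as_path
flag are hypothetical and are not the appdirs API, and the directory
layout is just a placeholder.

```python
import os.path
from pathlib import Path


def data_dir(appname, as_path=False):
    """Hypothetical library function; not the appdirs API."""
    # Build the result as a string, as the existing code would.
    result = os.path.join(os.path.expanduser("~"), ".local", "share", appname)
    # The opt-in flag keeps old callers working, at the cost of a
    # return type that depends on an argument.
    return Path(result) if as_path else result


as_string = data_dir("demo")                # what existing callers get
as_object = data_dir("demo", as_path=True)  # what new callers ask for
wrapped = Path(data_dir("demo"))            # what users do today
```

The flag preserves backward compatibility, but every function that
returns a path needs it, which is the messiness I mean.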

It may be that we're talking at cross purposes here. I'm thinking very
much of users working with library code from PyPI and things like
that. You may be looking at this from the perspective of a user who
controls the majority of the code they are working with. I'm not sure.

A transition like this is never simple, and libraries pretty much have
to wait for end user code to switch first. And making paths string
subclasses doesn't really count as switching; it simply makes it
easier to ignore the issue. Arguably, being able to pass a path object
to code that expects a string means that you don't have to change your
code twice, first to handle path objects because the library code
doesn't, then again to switch back when the library code is fixed. But
then there is zero motivation for the library code to change. And if
code *accepting* paths doesn't change, then code *producing* paths
won't either. And we're stuck where we are at the moment, with the
string representation as the lowest common denominator representation
of a path, and everyone converting back and forth if they want the
richer Path interface (or deciding it's not worth it, and sticking to
the old os.path routines...).

Paul

