[Python-ideas] PEP 428 - object-oriented filesystem paths

Mon Oct 8 19:59:58 CEST 2012

On Sat, Oct 6, 2012 at 9:44 PM, Calvin Spealman <ironfroggy at gmail.com> wrote:
> Responding late, but I didn't get a chance to get my very strong
> feelings on this proposal in yesterday.
>
> I do not like it. I'll give full disclosure and say that I think our
> earlier failure to include the path library in the stdlib has been a
> loss for Python and I'll always hope we can fix that one day. I still
> hold out hope.
>
> It feels like this proposal is "make it object oriented, because
> object oriented is good" without any actual justification or obvious
> problem this solves. The API looks clunky and redundant, and does not
> appear to actually improve anything over the facilities in the os.path
> module. This takes a lot of things we can already do with paths and
> files and remixes them into a not-so intuitive API for the sake of
> change, not for the sake of solving a real problem.

The PEP needs to better articulate the rationale, but the key points are:
- better abstraction and encapsulation of cross-platform logic so file
manipulation algorithms written on Windows are more likely to work
correctly on POSIX systems (and vice-versa)
- improved ability to manipulate paths with Windows semantics on a
POSIX system (and vice-versa)
- better support for creation of "mock" filesystem APIs
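
As a rough sketch of the second point, the example below uses the
pathlib module as it was later added to the standard library (Python
3.4+), which may differ from the API under discussion here. The paths
themselves are made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths do no I/O, so either flavour can be used on any OS.
win = PureWindowsPath("C:/Users/nick/notes.txt")  # illustrative path
posix = PurePosixPath("/home/nick/notes.txt")     # illustrative path

win_drive = win.drive   # drive handling follows Windows rules
win_parts = win.parts   # components split with Windows semantics

# Comparison follows each platform's rules: Windows paths compare
# case-insensitively, POSIX paths compare case-sensitively.
same_on_windows = PureWindowsPath("C:/Temp/FOO.txt") == PureWindowsPath("c:/temp/foo.txt")
same_on_posix = PurePosixPath("/tmp/FOO.txt") == PurePosixPath("/tmp/foo.txt")
```

Because nothing here touches the filesystem, the same code behaves
identically whether it runs on Windows or on a POSIX system.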

> As for specific problems I have with the proposal:
>
> Frankly, I think not keeping the / operator for joining is a huge
> mistake. This is the number one best feature of path and despite that
> many people don't like it, it makes sense. It makes our most common
> path operation read very close to the actual representation of
> what you're creating. This is great.

The / operator trades readability (and discoverability) for brevity.
Not good.

> Not inheriting from str means that we can't directly pass these path
> objects to existing code that just expects a string, so we have a
> really hard boundary around the edges of this new API. It does not
> lend itself well to incrementally transitioning to it from existing
> code.

It's the same design philosophy as was used in the creation of the
new ipaddress module: the objects in ipaddress must still be converted
to a string or integer before they can be passed to other operations
(such as the socket module APIs). Strings and integers remain the data
interchange formats here as well (although far more focused on strings
in the path case).
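
A minimal sketch of that boundary for ipaddress, using only the
stdlib ipaddress and socket modules. The address is from the
documentation range and is purely illustrative.

```python
import ipaddress
import socket

# A rich address object for manipulation and validation...
addr = ipaddress.ip_address("192.0.2.1")  # illustrative address

# ...converted explicitly at the edge of the API.
as_text = str(addr)  # string interchange format
as_int = int(addr)   # integer interchange format

# The socket module's inet_aton() takes the string form,
# not the address object itself.
packed = socket.inet_aton(as_text)
```

The path case would work the same way: manipulate the object, then
call str() on it when handing it to code that expects a plain string.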

>
> The stat operations and other file-facilities tacked on feel out of
> place, and limited. Why does it make sense to add these facilities to
> path and not other file operations? Why not give me a read method on
> paths? or maybe a copy? Putting lots of file facilities on a path
> object feels wrong because you can't extend it easily. This is one
> place that function(thing) works better than thing.function()

Indeed, I'm personally much happier with the "pure" path classes than
I am with the ones that can do filesystem manipulation. Having both
"p.open(mode)" and "open(str(p), mode)" seems strange. OTOH, I can see
the attraction in being able to better fake filesystem access through
the method API, so I'm willing to go along with it.
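
To make the two spellings concrete, here is a small sketch using
pathlib as it later shipped in the standard library (Python 3.4+),
which is an assumption relative to the draft being discussed. The
file name is hypothetical and lives in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # "example.txt" is a made-up name for illustration.
    p = Path(tmp).joinpath("example.txt")

    # Method spelling: the path object opens itself.
    with p.open("w") as f:
        f.write("hello")

    # Function spelling: convert to a string at the boundary.
    with open(str(p), "r") as f:
        via_builtin = f.read()

    # Both spellings reach the same file.
    with p.open("r") as f:
        via_method = f.read()
```

The method spelling is the one a fake filesystem could intercept by
substituting a different path class; the function spelling always goes
to the real open().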

> Overall, I'm completely -1 on the whole thing.

I find this very hard to square with your enthusiastic support for
path.py. Like ipaddr, which needed to clean up its semantic model
before it could be included in the standard library (as ipaddress), we
need a clean cross-platform semantic model for path objects before a
convenience API can be added for manipulating them.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia