[Python-Dev] PEP 355 status

glyph at divmod.com
Sat Sep 30 06:52:58 CEST 2006


On Fri, 29 Sep 2006 12:38:22 -0700, Guido van Rossum <guido at python.org> wrote:
>I would recommend not using it. IMO it's an amalgam of unrelated
>functionality (much like the Java equivalent BTW) and the existing os
>and os.path modules work just fine. Those who disagree with me haven't
>done a very good job of convincing me, so I expect this PEP to remain
>in limbo indefinitely, until it is eventually withdrawn or rejected.

Personally I don't like the path module in question either, and I think that PEP 355 presents an exceptionally weak case, but I do believe that there are several serious use-cases for "object oriented" filesystem access.  Twisted has a module for doing this:

    http://twistedmatrix.com/trac/browser/trunk/twisted/python/filepath.py

I hope to one day propose this module as a replacement, or update, for PEP 355, but I have neither the time nor the motivation to do it currently.  I wouldn't propose it now; it is, for example, mostly undocumented, missing some useful functionality, and has some weird warts (for example, the name of the path-as-string attribute is "path").

However, since it's come up I thought I'd share a few of the use-cases for the general feature, and the things that Twisted has done with it.

1: Testing.  If you want to provide filesystem stubs to test code which interacts with the filesystem, it is fragile and extremely complex to temporarily replace the 'os' module; you have to provide a replacement which knows about all the hairy string manipulations one can perform on paths, and you'll almost always forget some weird platform feature.  If you instead have an object with a narrow interface to duck-type (for example, a "walk" method which returns similar objects, or an "open" method which returns a file-like object), mocking the appropriate parts of it in a test is a lot easier.  The proposed PEP 355 module can be used for this, but its interface is pretty wide and implicit (and portions of it are platform-specific), and because it is also a string you may still have to deal with platform-specific features in tests (or even mixed os.path manipulations on the same object).

This is especially helpful when writing tests for error conditions that are difficult to reproduce on an actual filesystem, such as a network filesystem becoming unavailable.
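
To make the testing point concrete, here is a rough sketch of an in-memory stand-in with a narrow interface.  It is not Twisted's code: FakePath, UnavailablePath and count_lines are names made up for this illustration, and the only assumption is that the code under test restricts itself to isdir(), open() and walk().

```python
import io


class FakePath:
    """In-memory stand-in for a path object (hypothetical, for tests only)."""

    def __init__(self, name, content=None, children=()):
        self.name = name
        self._content = content          # None means "this is a directory"
        self._children = list(children)

    def isdir(self):
        return self._content is None

    def open(self):
        if self._content is None:
            raise IOError("is a directory: %r" % (self.name,))
        return io.StringIO(self._content)

    def walk(self):
        # Yield this node, then everything below it, depth first.
        yield self
        for child in self._children:
            yield from child.walk()


class UnavailablePath(FakePath):
    """Simulates a file on a network filesystem that has gone away."""

    def open(self):
        raise IOError("network filesystem unavailable")


def count_lines(root):
    """Code under test: count lines in every readable file below root."""
    total = 0
    skipped = []
    for path in root.walk():
        if path.isdir():
            continue
        try:
            with path.open() as f:
                total += sum(1 for _ in f)
        except IOError:
            skipped.append(path.name)
    return total, skipped


tree = FakePath("root", children=[
    FakePath("a.txt", "one\ntwo\n"),
    UnavailablePath("b.txt", "never read"),
    FakePath("sub", children=[FakePath("c.txt", "three\n")]),
])
result = count_lines(tree)
```

Nothing here touches the real filesystem or patches 'os', and the error case is just another object in the tree.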

2: Fast failure, or for lack of a better phrase, "type correctness".  PEP 355 gets close to this idea when it talks about datetimes and sockets not being strings.  In many cases, code that manipulates filesystems is passing around 'str' or 'unicode' objects, and may be accidentally passed the contents of a file rather than its name, leading to a bizarre failure further down the line.  FilePath fails immediately with an "unsupported operand types" TypeError in that case.  It also provides nice, immediate feedback at the prompt that the object you're dealing with is supposed to be a filesystem path, with no confusion as to whether it represents a relative or absolute path, or a path relative to a particular directory.  Again, the PEP 355 module's subclassing of strings creates problems, because you don't get an immediate and obvious exception if you try to interpolate it with a non-path-name string; it silently "succeeds".
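
A small sketch of the difference, using two toy classes rather than the real modules: StrPath stands in for any str-subclass path type and ObjPath for a path object that is not a string.  Both names are invented for this example.

```python
class StrPath(str):
    """Hypothetical path type that subclasses str."""


class ObjPath:
    """Hypothetical path type that wraps a string instead of being one."""

    def __init__(self, path):
        if not isinstance(path, str):
            raise TypeError("path must be a str, got %r" % (type(path),))
        self.path = path

    def __repr__(self):
        return "ObjPath(%r)" % (self.path,)


# Oops: the contents of a file, not its name.
file_contents = "line one\nline two\n"

# The str subclass happily concatenates and hands back a plain str.
mixed = StrPath("/tmp/report") + file_contents

# The wrapper defines no "+" at all, so the mistake surfaces right here.
try:
    ObjPath("/tmp/report") + file_contents
except TypeError as e:
    error = str(e)
```

In the first case the bogus value travels on until something tries to open it; in the second the traceback points at the line that made the mistake.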

3: Safety.  Almost every web server ever written (yes, including twisted.web) has been bitten by the "/../../../" bug at least once.  The default child(name) method of Twisted's file path class will only let you go "down" (to go "up" you have to call the parent() method), and will trap obscure platform features like the "NUL" and "CON" files on Windows so that you can't trick a program into manipulating something that isn't actually a file.  You can take strings you've read from an untrusted source and pass them to FilePath.child and get something relatively safe out.  PEP 355 doesn't mention this at all.
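
For illustration only, a much-simplified sketch of a "down only" child() follows.  SafePath, InsecurePathError and the short reserved-name list are assumptions made for this example; the checks in twisted.python.filepath are more thorough and are not reproduced here.

```python
import os


class InsecurePathError(ValueError):
    """Raised when a segment would escape or alias the parent directory."""


# Deliberately incomplete list of Windows device names, for illustration.
_RESERVED = {"CON", "NUL", "PRN", "AUX"}


class SafePath:
    def __init__(self, path):
        self.path = os.path.abspath(path)

    def child(self, name):
        # A child is exactly one path segment: no separators, no "..".
        if name in ("", ".", "..") or "/" in name or "\\" in name or "\0" in name:
            raise InsecurePathError(name)
        # "con", "CON.txt" and friends name devices on Windows, not files.
        if name.split(".")[0].upper() in _RESERVED:
            raise InsecurePathError(name)
        return SafePath(os.path.join(self.path, name))

    def parent(self):
        # Going "up" is a separate, explicit operation.
        return SafePath(os.path.dirname(self.path))


root = SafePath("/srv/www")
page = root.child("index.html")

rejected = []
for segment in ["..", "../etc/passwd", "a/b", "NUL", "con.txt", ""]:
    try:
        root.child(segment)
    except InsecurePathError:
        rejected.append(segment)
```

A request handler that maps each URL segment through child() can only ever end up at or below the directory it started from.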

4: Last, but certainly not least: filesystem polymorphism.  For an example of what I mean, take a look at this in-development module:

    http://twistedmatrix.com/trac/browser/trunk/twisted/python/zippath.py

It's currently far too informal, and incomplete, and there's no specified interface.  However, this module shows that by being objects and not module-methods, FilePath objects can also provide a sort of virtual filesystem for Python programs.  With FilePath plus ZipPath, you can write Python programs which can operate on a filesystem directory or a directory within a Zip archive, depending on what object they are passed.
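
As a sketch of what that polymorphism looks like from the caller's side, here are two toy classes that share child() and open().  DirPath, ZipMemberPath, read_greeting and demo are hypothetical names for this example, built on the standard zipfile module; they are not the FilePath or ZipPath implementations.

```python
import io
import os
import tempfile
import zipfile


class DirPath:
    """A path on the real filesystem."""

    def __init__(self, path):
        self.path = path

    def child(self, name):
        return DirPath(os.path.join(self.path, name))

    def open(self):
        return open(self.path, "rb")


class ZipMemberPath:
    """A path inside an already-open zipfile.ZipFile."""

    def __init__(self, archive, inner=""):
        self.archive = archive
        self.inner = inner

    def child(self, name):
        # Zip member names always use "/" as the separator.
        inner = name if not self.inner else self.inner + "/" + name
        return ZipMemberPath(self.archive, inner)

    def open(self):
        return self.archive.open(self.inner)


def read_greeting(root):
    """Works with either kind of root; it only uses child() and open()."""
    with root.child("data").child("greeting.txt").open() as f:
        return f.read().decode("utf-8")


def demo():
    # Same content, once in a temporary directory, once in an in-memory zip.
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "data"))
        with open(os.path.join(tmp, "data", "greeting.txt"), "wb") as f:
            f.write(b"hello")
        from_dir = read_greeting(DirPath(tmp))

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("data/greeting.txt", "hello")
    with zipfile.ZipFile(buf) as zf:
        from_zip = read_greeting(ZipMemberPath(zf))

    return from_dir, from_zip
```

read_greeting never learns which kind of object it was handed, which is the whole point.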

On a more subjective note, I've been gradually moving over personal utility scripts from os.path manipulations to twisted.python.filepath for years.  I can't say that this will be everyone's experience, but in the same way that Python scripts avoid the class of errors present in most shell scripts (quoting), t.p.f scripts avoid the class of errors present in most Python scripts (off-by-one errors when looking at separators or extensions).

I hope that eventually Python will include some form of OO filesystem access, but I am equally hopeful that the current PEP 355 path.py is not it.

