[Python-Dev] Path object design

Talin talin at acm.org
Wed Nov 1 04:20:50 CET 2006


I'm right in the middle of typing up a largish post to go on the 
Python-3000 mailing list about this issue. Maybe we should move it over 
there, since it's likely that any path reform will have to be targeted at 
Py3K...?

Mike Orr wrote:
> I just saw the Path object thread ("PEP 355 status", Sept-Oct), saying
> that the first object-oriented proposal was rejected.  I'm in favor of
> the "directory tuple" approach which wasn't mentioned in the thread.
> This was proposed by Noam Raphael several months ago: a Path object
> that's a sequence of components (a la os.path.split) rather than a
> string.  The beauty of this approach is that slicing and joining are
> expressed naturally using the [] and + operators, eliminating several
> methods.
> 
> Introduction:  http://wiki.python.org/moin/AlternativePathClass
> Feature discussion:  http://wiki.python.org/moin/AlternativePathDiscussion
> Reference implementation:  http://wiki.python.org/moin/AlternativePathModule
> 
> (There's a link to the introduction at the end of PEP 355.)  Right now
> I'm working on a test suite, then I want to add the features marked
> "Mike" in the discussion -- in a way that people can compare the
> feature alternatives in real code -- and write a PEP.  But it's a big
> job for one person, and there are unresolved issues on the discussion
> page, not to mention things brought up in the "PEP 355 status" thread.
> We had three people working on the discussion page but development
> seems to have ground to a halt.
> 
> One thing is sure -- we urgently need something better than os.path.
> It functions well but it leads to hard-to-read and unpythonic code.  For
> instance, I have an application that has to add its libraries to the
> Python path, relative to the executable's location.
> 
> /toplevel
>     app1/
>         bin/
>             main_program.py
>             utility1.py
>             init_app.py
>         lib/
>             app_module.py
>     shared/
>         lib/
>             shared_module.py
> 
> The solution I've found is an init_app module in every application
> that sets up the paths.  Conceptually it needs "../lib" and
> "../../shared/lib", but I want the absolute paths without hardcoding
> them, in a platform-neutral way.  With os.path, "../lib" is:
> 
>     os.path.join(os.path.dirname(os.path.dirname(__file__)), "lib")
> 
> YUK!  Compare to PEP 355:
> 
>     Path(__file__).parent.parent.join("lib")
> 
> Much easier to read and debug.  Under Noam's proposal it would be:
> 
>     Path(__file__)[:-2] + "lib"
> 
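
To make the slicing idea concrete, here is a minimal sketch of a path
stored as a tuple of components. It is not the wiki reference
implementation: the class name TuplePath, the "/"-only splitting, and
the modern Python 3 syntax are all assumptions made for illustration.

```python
class TuplePath(tuple):
    """Hypothetical path held as a tuple of components (illustration only)."""

    def __new__(cls, path=()):
        if isinstance(path, str):
            # Split on "/" only, so the sketch behaves the same everywhere.
            parts = [p for p in path.split("/") if p]
            if path.startswith("/"):
                parts.insert(0, "/")  # keep the root as its own component
            path = parts
        return super().__new__(cls, path)

    def __getitem__(self, index):
        result = super().__getitem__(index)
        # A slice is still a path; a single index is a plain string.
        return TuplePath(result) if isinstance(index, slice) else result

    def __add__(self, other):
        if isinstance(other, str):
            other = TuplePath(other)
        return TuplePath(tuple(self) + tuple(other))

    def __str__(self):
        if self and self[0] == "/":
            return "/" + "/".join(self[1:])
        return "/".join(self)


# The layout from the example above, with a literal path standing in
# for the module's own location.
here = TuplePath("/toplevel/app1/bin/init_app.py")
app_lib = here[:-2] + "lib"            # drop "bin/init_app.py", add "lib"
shared_lib = here[:-3] + "shared/lib"  # climb one level further
```

Here str(app_lib) gives "/toplevel/app1/lib" and str(shared_lib) gives
"/toplevel/shared/lib". A real implementation would also have to deal
with drive letters, platform separators, and normalizing "..".
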
> I'd also like to see the methods more intelligent: don't raise an
> error if an operation is already done (e.g., a directory exists or a
> file is already removed).  There's no reason to clutter one's code
> with extra if's when the methods can easily encapsulate this. This was
> considered too radical a departure from os.path by some, but I have
> in mind even more radical convenience methods which I'd put in a
> third-party subclass if they're not accepted into the standard
> library, the way 'datetime' has third-party subclasses.
> 
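
The "don't raise if it's already done" behaviour can be sketched with
plain functions. The names ensure_dir and ensure_removed are
hypothetical, not part of PEP 355 or any proposal in this thread, and
the sketch relies on os.makedirs(exist_ok=True) and FileNotFoundError
from current Python 3.

```python
import os
import tempfile


def ensure_dir(path):
    """Create the directory (and parents); do nothing if it already exists."""
    # Still raises FileExistsError if `path` exists but is not a directory.
    os.makedirs(path, exist_ok=True)


def ensure_removed(path):
    """Remove the file; do nothing if it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


with tempfile.TemporaryDirectory() as tmp:
    lib_dir = os.path.join(tmp, "lib")
    ensure_dir(lib_dir)
    ensure_dir(lib_dir)        # second call is a no-op, not an error
    dir_exists = os.path.isdir(lib_dir)

    stale = os.path.join(lib_dir, "stale.txt")
    open(stale, "w").close()
    ensure_removed(stale)
    ensure_removed(stale)      # already removed, still no error
    file_gone = not os.path.exists(stale)
```

The caller needs no "if exists" checks, and genuine failures (a
permissions problem, a file where a directory was expected) still
raise.
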
> In my application I started using Orendorff's path module, expecting
> the standard path object would be close to it.  When PEP 355 started
> getting more changes and the directory-based alternative took off, I
> took path.py out and rewrote my code for os.path until an alternative
> becomes more stable. Now it looks like it will be several months and
> possibly several third-party packages until one makes it into the
> standard library. This is unfortunate.  Not only does it mean ugly
> code in applications, but it means packages can't accept or return
> Path objects and expect them to be compatible with other packages.
> 
> The reasons PEP 355 was rejected also sound strange.  Nick Coghlan
> wrote (Oct 1):
> 
>> Things the PEP 355 path object lumps together:
>>   - string manipulation operations
>>   - abstract path manipulation operations (work for non-existent filesystems)
>>   - read-only traversal of a concrete filesystem (dir, stat, glob, etc)
>>   - addition & removal of files/directories/links within a concrete filesystem
> 
>> Dumping all of these into a single class is certainly practical from a utility
>> point of view, but it's about as far away from beautiful as you can get, which
>> creates problems from a learnability point of view, and from a
>> capability-based security point of view.
> 
> What about the convenience of the users and the beauty of users' code?
> That's what matters to me.  And I consider one class *easier* to
> learn.  I'm tired of memorizing that 'split' is in os.path while
> 'remove' and 'stat' are in os.  This seems arbitrary: you're statting
> a path, aren't you?  Also, if you have four classes (abstract path,
> file, directory, symlink), *each* of those will have 3+
> platform-specific versions.  Then if you want to make an enhancement
> subclass you'll have to make 12 of them, one for each of the 3*4
> combinations of superclasses.  Encapsulation can help with this, but
> it strays from the two-line convenience for the user:
> 
>     from path import Path
>     p = Path("ABC")      # Works the same for files/directories on any platform.
> 
> Nevertheless, I'm open to seeing a multi-class API, though hopefully
> less verbose than Talin's preliminary one (Oct 26).  Is it necessary
> to support path.parent(), pathobj.parent(), io.dir.listdir(), *and*
> io.dir.Directory()?  That's four different namespaces to memorize
> which function/method is where, and if a function/method belongs to
> multiple ones it'll be duplicated, and you'll have to remember that
> some methods are duplicated and others aren't...  Plus, constructors
> like io.dir.Directory() look too verbose.  io.Directory() might be
> acceptable, with the functions as class methods.
> 
> I agree that supporting non-filesystem directories (zip files,
> CVS/Subversion sandboxes, URLs) would be nice, but we already have a
> big enough project without that.  What constraints should a Path
> object keep in mind in order to be forward-compatible with this?
> 
> If anyone has design ideas/concerns about a new Path class(es), please
> post them.  If anyone would like to work on a directory-based
> spec/implementation, please email me.
> 

