[Python-3000] Mini Path object

Talin talin at acm.org
Wed Nov 8 11:21:29 CET 2006


Mike Orr wrote:
> Remember that 99% of Python programmers are concerned only with native
> paths.  I have never used a non-native path or multiple-platform paths
> in an application.  So we need to make the native case easy and clear.
>  For that reason I'd rather keep the non-native cases and conversion
> code in separate classes.

This is only true if one is speaking of fully-qualified paths.

Most of the time, however, we deal with relative paths. Relative paths 
are, for the most part, universal already - if they weren't, then it 
would be impossible to create a makefile that works on both Linux and 
Windows, because the embedded file paths would be incompatible. If 
relative paths couldn't be used in conjunction with different path 
types, then there would be no way for make, Apache, Scons, svn, and a 
whole bunch of other apps I can think of to work cross-platform.

What I want is a way to have data files that contain embedded paths 
encoded in a way that allows those data files to be transported across 
platforms. Generally those embedded paths will be relative to something, 
such as ${root}/something/or/other, rather than fully-qualified.

What I want to avoid is a situation where I have to edit my config file 
to switch all the path separators from '/' to '\' when I move my 
application from OS X to Win32.
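
As a rough sketch of what I mean, assume the config file always stores
relative paths with '/' and the program converts them on load. The
helper name to_native is hypothetical, not part of any proposal in this
thread. It uses the stdlib posixpath and ntpath modules directly so that
both platforms' results can be seen on one machine; real code would
normally just use os.path.

```python
import ntpath
import posixpath

def to_native(stored, root, flavour):
    """Join a '/'-separated relative path onto root using flavour's rules.

    flavour is a path module such as posixpath or ntpath.
    to_native is a hypothetical helper used only for illustration.
    """
    return flavour.join(root, *stored.split("/"))

stored = "something/or/other"   # as written in the config file

on_posix = to_native(stored, "/opt/app", posixpath)
on_win32 = to_native(stored, "C:\\app", ntpath)

print(on_posix)   # /opt/app/something/or/other
print(on_win32)   # C:\app\something\or\other
```

The config file itself never changes; only the root and the path
flavour differ between machines.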

As a Python programmer, my ideal is to be able to write as if I *didn't 
know* what the underlying platform was.

>> I don't think you need to follow too closely the syntax of os.path -
>> rather, we should concentrate on the semantics, and even more
>> importantly, the scope of the existing module. In other words don't try
>> to do too much more than os.path did.
> 
> I think a layered approach is important.  Use the code in the existing
> modules because it's well tested.  Add a native-path object on top of
> that.  Put non-native paths and conversions on the side.  Then put a
> filesystem-access class (or functions) on top of the native-path
> object.  Then high-level functions/methods on top of that.  Then when
> we lobby for stdlib inclusion it'll be "one level and everything below
> it".  People can see and test that level, and ignore the (possibly
> more controversial) levels above it.

BTW, I don't think that the "filesystem object" is going to fly in the 
way you describe it. Specifically, it sounds like yet another swiss army 
knife class.

I think that there's a reasonable chance of acceptance for an object 
that does filesystem-like operations that *doesn't overlap* with what 
the Path object does. But what you are proposing is a *superset* of what 
Path does (because you're saying it's a subclass), and I don't think that 
will go over well.

The basic rationale is simple: "Things" and "Names for things" are 
distinct from each other, and shouldn't be conflated into the same 
object. A path is a name for a thing, but it is not the same as the 
thing it is naming.

>> I'm in favor of the verb 'combine' being used to indicate both a joining
>> and a simplification of a path.
> 
> 
> I like the syntax of a join method.  With a multi-arg constructor it's
> not necessary though. PEP 355 hints at problems with the .joinpath()
> method though it didn't say what they were.   .combine() is an OK
> name, and there's a precedent in the datetime module.  But people are
> used to calling it "joining paths" so I'm not sure we should rename it
> so easily.  We can't call it .join due to string compatibility.
> .joinpath is ugly if we delete "path" from the other method names.
> .pjoin comes to mind but people might find it too cryptic and too
> similar to .join.  By just using the constructor we avoid the debate
> over the method name.

The reason I don't like the word "join" is that it implies some sort of 
concatenation, like an "append" operation - i.e. the combination of "a" 
and "b" joined together is "a/b". However, when we combine paths, we're 
really doing "path algebra", which is a more sophisticated mixing of 
paths. Thus "a" joined with "/b" is just "/b", not "a/b".
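
A minimal sketch of the difference, assuming 'combine' simply means
"join, then simplify". The function name combine is hypothetical here,
and building it from posixpath.join plus posixpath.normpath is only one
possible reading of the semantics, not a settled design.

```python
import posixpath

def combine(base, *others):
    """Hypothetical 'combine': join the parts, then simplify the result."""
    return posixpath.normpath(posixpath.join(base, *others))

print(combine("a", "b"))        # a/b   (looks like plain concatenation)
print(combine("a", "/b"))       # /b    (an absolute part replaces the base)
print(combine("a/b", "../c"))   # a/c   ('..' is resolved against the base)
```

Only the first case behaves like an append; the other two are why
"join" undersells the operation.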

> Again, 99.9% of Python users have a functioning cwd, and .abspath() is
> one of those things people expect in any programming language. Nobody
> is forcing you to call it on the Xbox.  However, I'm not completely
> opposed to dropping .abspath().

And I'm not completely opposed to keeping it either. I'm just a 
minimalist :) :)

> People need to add/delete/replace extensions, and they don't want to
> use character slicing / .rfind / .endswith / len() / + to do it.  They
> expect the library to at least handle extension splitting as well as
> os.path does.  Adding a few convenience methods would be unobtrusive
> and express what people really want to do:
> 
>     p2 = p.add_ext(".tar")
>     p2 = p.del_ext()
>     p2 = Path("foo.gzip").replace_ext(".bz2")
> 
> But what harm is there in making them scalable to multiple extensions?
> 
>     .add_exts(*exts)
>     .del_exts(N)
>     .replace_exts(N, *exts)

Someone in another message pointed out that paths, being based on 
strings, are immutable, so this whole handling of extensions will have 
to be done another way.
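
One possible shape for that, sketched under the assumption that
extension operations return a new value rather than modifying the
original. The function name replace_ext is hypothetical (borrowed from
the method name quoted above), and it is written against plain strings
with os.path.splitext rather than any actual Path class.

```python
import os.path

def replace_ext(path, new_ext):
    """Return a new path string with the last extension replaced.

    Hypothetical helper: the input string is never modified.
    """
    stem, _old_ext = os.path.splitext(path)
    return stem + new_ext

original = "foo.gzip"
renamed = replace_ext(original, ".bz2")

print(original)   # foo.gzip  (unchanged)
print(renamed)    # foo.bz2
```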

-- Talin

