[Python-3000] Mini Path object
Talin
talin at acm.org
Wed Nov 8 11:21:29 CET 2006
Mike Orr wrote:
> Remember that 99% of Python programmers are concerned only with native
> paths. I have never used a non-native path or multiple-platform paths
> in an application. So we need to make the native case easy and clear.
> For that reason I'd rather keep the non-native cases and conversion
> code in separate classes.
This is only true if one is speaking of fully-qualified paths.
Most of the time, however, we deal with relative paths. Relative paths
are, for the most part, universal already - if they weren't, then it
would be impossible to create a makefile that works on both Linux and
Windows, because the embedded file paths would be incompatible. If
relative paths couldn't be used in conjunction with different path
types, then there would be no way for make, Apache, Scons, svn, and a
whole bunch of other apps I can think of to work cross-platform.
What I want is a way to have data files that contain embedded paths
encoded in a way that allows those data files to be transported across
platforms. Generally those embedded paths will be relative to something,
rather than fully qualified, such as ${root}/something/or/other.
What I want to avoid is a situation where I have to edit my config file
to switch all the path separators from '/' to '\' when I move my
application from OS X to Win32.
As a Python programmer, my ideal is to be able to write as if I *didn't
know* what the underlying platform was.
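
To make the config-file case concrete, here is a rough sketch, not a
proposed API. It assumes the data file always stores paths with '/' and a
leading ${root} placeholder; the resolve() helper and the two root values
are made up for illustration. The stdlib posixpath and ntpath modules stand
in for "whatever the native platform happens to be":

```python
import ntpath
import posixpath

# Hypothetical value read from a config file: always '/'-separated,
# relative to a ${root} placeholder.
CONFIG_PATH = "${root}/something/or/other"


def resolve(portable, root, pathmod):
    """Rebuild a portable '${root}/...' path using pathmod's conventions.

    resolve() is an illustrative helper, not part of any library.
    """
    prefix = "${root}/"
    if not portable.startswith(prefix):
        raise ValueError("expected a path relative to ${root}")
    # Split on the portable separator, then let the platform module
    # reassemble the pieces with its own separator.
    parts = portable[len(prefix):].split("/")
    return pathmod.join(root, *parts)


# The same config value, resolved for two different platforms.
posix_result = resolve(CONFIG_PATH, "/opt/app", posixpath)
windows_result = resolve(CONFIG_PATH, r"C:\app", ntpath)

print(posix_result)
print(windows_result)
```

The config file itself never changes; only the root and the path module
differ between platforms.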
>> I don't think you need to follow too closely the syntax of os.path -
>> rather, we should concentrate on the semantics, and even more
>> importantly, the scope of the existing module. In other words don't try
>> to do too much more than os.path did.
>
> I think a layered approach is important. Use the code in the existing
> modules because it's well tested. Add a native-path object on top of
> that. Put non-native paths and conversions on the side. Then put a
> filesystem-access class (or functions) on top of the native-path
> object. Then high-level functions/methods on top of that. Then when
> we lobby for stdlib inclusion it'll be "one level and everything below
> it". People can see and test that level, and ignore the (possibly
> more controversial) levels above it.
BTW, I don't think that the "filesystem object" is going to fly in the
way you describe it. Specifically, it sounds like yet another swiss army
knife class.
I think that there's a reasonable chance of acceptance for an object
that does filesystem-like operations that *doesn't overlap* with what
the Path object does. But what you are proposing is a *superset* of what
Path does (because you're saying it's a subclass), and I don't think that
will go over well.
The basic rationale is simple: "Things" and "Names for things" are
distinct from each other, and shouldn't be conflated into the same
object. A path is a name for a thing, but it is not the same as the
thing it is naming.
>> I'm in favor of the verb 'combine' being used to indicate both a joining
>> and a simplification of a path.
>
>
> I like the syntax of a join method. With a multi-arg constructor it's
> not necessary though. PEP 355 hints at problems with the .joinpath()
> method though it didn't say what they were. .combine() is an OK
> name, and there's a precedent in the datetime module. But people are
> used to calling it "joining paths" so I'm not sure we should rename it
> so easily. We can't call it .join due to string compatibility.
> .joinpath is ugly if we delete "path" from the other method names.
> .pjoin comes to mind but people might find it too cryptic and too
> similar to .join. By just using the constructor we avoid the debate
> over the method name.
The reason I don't like the word "join" is that it implies some sort of
concatenation, like an "append" operation - i.e. the combination of "a"
and "b" joined together is "a/b". However, when we combine paths, we're
really doing "path algebra", which is a more sophisticated mixing of
paths. Thus "a" joined with "/b" is just "/b", not "a/b".
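
The difference can be seen with the existing posixpath functions. The
combine() helper below is only a sketch of the semantics being discussed
(join, then simplify); the name and the one-line implementation are
assumptions, not an agreed design:

```python
import posixpath


def combine(base, other):
    """Join two paths, then simplify the result.

    Illustrative only: this is one possible reading of 'combine'.
    """
    return posixpath.normpath(posixpath.join(base, other))


# Plain appending: a relative second path is added onto the first.
appended = combine("a", "b")

# An absolute second path discards the first entirely.
replaced = combine("a", "/b")

# '..' components are resolved during simplification.
simplified = combine("a/b", "../c")

print(appended, replaced, simplified)
```

Only the first case looks like concatenation; the other two are why
"join" undersells what the operation does.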
> Again, 99.9% of Python users have a functioning cwd, and .abspath() is
> one of those things people expect in any programming language. Nobody
> is forcing you to call it on the Xbox. However, I'm not completely
> opposed to dropping .abspath().
And I'm not completely opposed to keeping it either. I'm just a
minimalist :) :)
> People need to add/delete/replace extensions, and they don't want to
> use character slicing / .rfind / .endswith / len() / + to do it. They
> expect the library to at least handle extension splitting as well as
> os.path does. Adding a few convenience methods would be unobtrusive
> and express what people really want to do:
>
> p2 = p.add_ext(".tar")
> p2 = p.del_ext()
> p2 = Path("foo.gzip").replace_ext(".bz2")
>
> But what harm is there in making them scalable to multiple extensions?
>
> .add_exts(*exts)
> .del_exts(N)
> .replace_exts(N, *exts)
Someone in another message pointed out that paths, being based on
strings, are immutable, so this whole handling of extensions will have
to be done another way.
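
One way to read that constraint: every extension operation returns a new
value and leaves the original untouched. The sketch below uses plain
strings and posixpath.splitext; the function names are borrowed from the
quoted proposal, but the bodies are assumptions for illustration, not
anyone's actual implementation:

```python
import posixpath


def add_ext(path, ext):
    """Return a new path with ext appended; path itself is not modified."""
    return path + ext


def del_ext(path):
    """Return a new path with the last extension removed."""
    root, _ext = posixpath.splitext(path)
    return root


def replace_ext(path, ext):
    """Return a new path with the last extension swapped for ext."""
    return del_ext(path) + ext


original = "foo.gzip"
replaced = replace_ext(original, ".bz2")
stripped = del_ext("archive.tar.gz")
extended = add_ext("archive", ".tar")

# The original string is unchanged; each call produced a new one.
print(original, replaced, stripped, extended)
```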
-- Talin