[Python-Dev] PEP 355 (object-oriented paths)

Talin talin at acm.org
Thu Apr 20 10:31:30 CEST 2006


I didn't have a chance to comment earlier on the Path class PEP,
but I'm dealing with an analogous situation at work and I'd like to
weigh in on it.

The issue is - should paths be specialized objects or regular strings?

PEP 355 does an excellent job, I think, of presenting the case for paths
being objects. And I certainly think that it would be a good idea to
clean up some of the path-related APIs that are in the stdlib now.

However, all that being said, I'd like to make some counter-arguments.

There are a lot of programming languages out there today that have
custom path classes, while many others use strings. A particularly
useful comparison is Java vs. C#, two languages which have many
aspects in common, but take diametrically opposite approaches to
the handling of paths. Java uses a path class (java.io.File), while C#
uses strings as paths and supplies a set of function-based APIs for
manipulating them.

Having used both languages extensively, I think that I prefer strings.
One of the main reasons for this is that the majority of path
manipulations are single operations - such as, take a base path
and a relative path and combine them. Although you do occasionally
see cases where a complex series of operations is performed on a
path, such cases tend to be (a) rare, and (b) not handled well by
the standard API, in the sense that what is being done to the path
is not something that was anticipated by the person who wrote the
path API in the first place.

Given that the ultimate producers and consumers of paths (that is,
the filesystem APIs, the input fields of dialog boxes, the argv array)
know nothing about Path objects, the question is, is it worth converting
to a special object and back again just to do a simple concatenate?
I think that I would prefer to have a nice orthogonal set of path
manipulation functions.
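
To make the trade-off concrete, here is a minimal sketch in Python. The
SimplePath class is purely hypothetical (it is not the PEP 355 API) and
exists only to show the wrap-and-unwrap round trip; posixpath is used in
place of os.path so that the result is the same on every platform.

```python
import posixpath  # POSIX flavour of os.path, so results are platform-independent


class SimplePath:
    """Hypothetical, minimal path object; NOT the PEP 355 API."""

    def __init__(self, text):
        self._text = text

    def join(self, other):
        # Delegate to the string-based function and re-wrap the result.
        return SimplePath(posixpath.join(self._text, other))

    def __str__(self):
        return self._text


base = "/usr/local"       # e.g. taken from argv or a dialog box, so already a string
relative = "lib/python"

# Function-based approach: strings in, string out.
joined_with_function = posixpath.join(base, relative)

# Object-based approach: wrap, operate, then unwrap for the consumer.
joined_with_object = str(SimplePath(base).join(relative))
```

Both approaches yield the same string; the object-based one adds a
conversion on the way in and another on the way out.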

Another reason why I am a bit dubious about a class-based approach
is that it tends to take everything that is related to a filepath and
lump it into a single module.

I think that there are some fairly good reasons why different
path-related functions should be in different modules. For example,
one thing that irks me (and others) about Java's path class is
that it makes no distinction between methods that are merely textual
conversions, and methods which actually go out and touch the disk.
I would rather that functions that invoke filesystem activity be
partitioned away from functions that merely involve string
manipulation.

Creating a tempfile, for example, or determining whether a file is
writeable should not be in the same bucket as determining the
file extension, or whether a path is relative or absolute.
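
As a rough illustration of that partition, the sketch below groups a few
existing stdlib calls into the two buckets. The wrapper function names are
made up for this example; the underlying calls (posixpath.splitext,
posixpath.isabs, os.access, tempfile.mkstemp) are real.

```python
import os
import posixpath
import tempfile


# --- Pure string manipulation: never touches the disk. ---

def extension(path):
    # Works even for paths that do not exist.
    return posixpath.splitext(path)[1]


def is_absolute(path):
    return posixpath.isabs(path)


# --- Filesystem activity: consults or modifies the disk. ---

def is_writable(path):
    return os.access(path, os.W_OK)


def make_temp_file():
    # Creates a real file; the caller is responsible for removing it.
    fd, name = tempfile.mkstemp()
    os.close(fd)
    return name
```

The first group can be called on any string at all, while the second
group depends on the state of the machine it runs on.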

What I would like to see, instead, is for the various path-related
functions to be organized into a clear set of categories. For example,
if "os.path" is the module for pure operations on paths, without
reference to the filesystem, then the current path separator character
should be a member of that module, not the "os" module.
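
For reference, a small sketch of where the separator can be found. As far
as I know, current Python exposes it under both names, so the two
attribute lookups below should agree.

```python
import os
import os.path

# The separator as exposed by the "os" module...
separator_from_os = os.sep

# ...and the same value reached through "os.path".
separator_from_os_path = os.path.sep
```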

-- Talin


