[Python-Dev] Unipath package
Mike Orr
sluggoster at gmail.com
Sun Jan 28 11:53:35 CET 2007
I finally finished my path package (Unipath) and put it in the Cheeseshop.
http://sluggo.scrapping.cc/python/unipath/
There's a Path class for pathname calculations, and an FSPath subclass
for filesystem calls. I'm hoping Path -- or something resembling it
-- will find its way into os.path in Python 2.6 or 3.0. FSPath is
full of convenience methods so it may not be everybody's cup of tea,
but perhaps something similar can go into Python in the more distant
future.
Unipath is an early alpha release so the API may change as it gets
more real-world use. There's an extensive unittest suite, which
passes on Python 2.5 and 2.4.4 on Linux. Windows and Macintosh
testers are needed.
The following are highlights from the python-3000 discussion, along
with the places where Unipath deviates from it:
- Path subclasses unicode, or str if the platform can't handle
Unicode pathnames.
- FSPath subclasses Path. This allows you to do "from unipath import
FSPath as Path" and pretend there's only one class. I find this much
more convenient in applications, so you don't have to cast objects
back and forth between the classes. Also, it just became infeasible
not to let them inherit, because so many methods call other methods.
- Nevertheless, you can use Path alone and rest assured it will never
touch the filesystem. You can even use Path objects with os.*
functions if you are so heretically inclined. If Path is accepted
into the stdlib, FSPath will inherit it from there.
- I tried splitting FSPath into several mixins according to type of
operation, but the need for methods to call other methods in a
different category sabotaged that too. So FSPath proudly has about
fifty methods. (Path has 10 public methods and 4 properties.)
- The dirname property is called .parent. The basename property is
.name. The extension property is .ext. The name without extension is
.stem.
- .components() returns a list of directory components. The first
component is "/", a Windows drive root, a UNC share, or "" for a
relative path. .split_root() returns the root and the rest.
- PosixPath, NTPath, and MacPath are Path subclasses using a specific
path library. This allows you to express non-native paths and convert
paths. Passing a relative foreign path to a *Path constructor
converts it to the destination type. Passing an absolute foreign path
is an error, because there's no sane way to interpret "C:\\" on Posix
or "/" on Windows. I'm skeptical whether this non-native support is
really worth it, because .norm() and .norm_case() already convert
slashes to backslashes (which is what Talin really wanted to do --
have Posix paths in a config file and automatically convert them to
the native format). And if Python is going to drop Mac OS 9 soon then
MacPath is obsolete. So maybe this non-native path code will prove
less than useful and will be deleted. On the other hand, if someone
is burning to write a zippath or ftppath library, you can use it with
this.
- Setting Path.auto_norm to true will automatically normalize all
paths on construction. I'm not sure if this should be the default,
because normalizing may produce the wrong path (e.g., if it contains a
symlink). You can always pass norm=True or norm=False to the
constructor to enable/disable it.
- p.child("subdir", "grandkid") is Glyph's favorite "safe join" method
that prevents creating a path pointing above 'p'. You can use it as a
.joinpath if you don't mind this restriction. The constructor allows
multiple positional arguments, which are joined using os.path.join().
- Listing is p.listdir(pattern=None, filter=None, names_only=False).
This returns a non-recursive list of paths. If 'names_only' is true
it returns the same as os.listdir(). p.walk(pattern=None,
filter=None, top_down=True) is the recursive counterpart, yielding
paths. In both cases 'pattern' is a glob pattern, and 'filter' is one
of the constants (FILES, DIRS, LINKS, FILES_NO_LINKS, DIRS_NO_LINKS,
DEAD_LINKS) or a custom boolean function. I spent a lot of time going
back and forth between Orendorff's six listing methods and Raphael's
one, and finally decided on this, partly because I'm not satisfied
with how either Orendorff's class or Raphael's or os.walk() handle
symlinks -- sometimes you want to ignore the links and then iterate
them separately.
- .read_file(mode) and .write_file(content, mode) are a compromise
between Orendorff's seven methods and purists' desire for zero
methods.
- .mkdir(), .rmdir(), .rmtree(), .copy(), .copy_stat(), .copy_tree(),
and .move() are fancier than their stdlib/Orendorff counterparts.
They silently succeed if the operation is already done, and have
arguments to smartly create/delete intermediate directories, etc.
- Two extra functions are in 'unipath.tools'. 'dict2dir' creates a
directory hierarchy modeled after a dict. 'dump_path' displays an
ASCII tree of a directory hierarchy, with file sizes and symlink
targets showing.
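To make the 'dict2dir' idea above concrete, here is a minimal sketch
using only the standard library on a current Python 3. It is not
Unipath's code: the name dict2dir_sketch is hypothetical, and the
convention that a dict value means a subdirectory while a string value
means file content is an assumption, since the announcement does not
spell out the dict format.

```python
import os
import tempfile


def dict2dir_sketch(root, tree):
    """Create a directory hierarchy under `root` modeled after `tree`.

    Assumed convention (not taken from Unipath): a dict value becomes a
    subdirectory, and a str value becomes the content of a text file.
    """
    os.makedirs(root, exist_ok=True)
    for name, value in tree.items():
        target = os.path.join(root, name)
        if isinstance(value, dict):
            # Recurse so nested dicts become nested directories.
            dict2dir_sketch(target, value)
        else:
            with open(target, "w") as f:
                f.write(value)


# Build a small tree in a fresh temporary directory.
demo_root = tempfile.mkdtemp()
dict2dir_sketch(
    demo_root,
    {"README": "hello\n", "src": {"main.py": "print('hi')\n", "empty": {}}},
)
```

A helper like this is mostly useful in unit tests, where a known
directory layout has to exist before the listing or copying methods
can be exercised.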
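The "safe join" behavior of p.child() described above can be
approximated with plain os.path calls. The following is only a sketch
of the idea, not Unipath's implementation: child_sketch is a
hypothetical name, and the exact set of rejected components (empty
names, ".", "..", anything containing a separator or a drive prefix)
is an assumption about what "safe" should mean.

```python
import os.path


def child_sketch(base, *names):
    """Join `names` onto `base`, refusing components that could escape it."""
    for name in names:
        # Reject empty names and the special "." / ".." entries.
        if name in ("", os.curdir, os.pardir):
            raise ValueError("unsafe path component: %r" % (name,))
        # Reject embedded separators, which also rules out absolute paths.
        if os.sep in name or (os.altsep and os.altsep in name):
            raise ValueError("separator in path component: %r" % (name,))
        # Reject Windows drive prefixes such as "C:" (no-op on Posix).
        if os.path.splitdrive(name)[0]:
            raise ValueError("drive prefix in path component: %r" % (name,))
    return os.path.join(base, *names)


joined = child_sketch("uploads", "subdir", "grandkid")
```

Because every component is checked before joining, a name that comes
from user input cannot be used to climb out of the base directory,
which is the restriction that separates this from a plain join.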
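For readers who know os.path better than the new property names, the
sketch below shows rough standard-library equivalents of .parent,
.name, .ext, .stem and .components() for a Posix path. It uses
posixpath so the result is the same on every platform. Two things are
assumptions rather than facts from the announcement: that .ext keeps
its leading dot (as os.path.splitext does), and the helper
components_sketch, which is hypothetical and handles only Posix-style
roots, not drive letters or UNC shares.

```python
import posixpath

path = "/home/mike/archive.tar.gz"

parent = posixpath.dirname(path)        # counterpart of .parent
name = posixpath.basename(path)         # counterpart of .name
stem, ext = posixpath.splitext(name)    # counterparts of .stem and .ext


def components_sketch(p):
    """Split a Posix path into [root, part1, part2, ...].

    The first element is "/" for an absolute path and "" for a
    relative one, as described for .components() above.
    """
    root = "/" if p.startswith("/") else ""
    parts = [part for part in p.split("/") if part]
    return [root] + parts


absolute_parts = components_sketch(path)
relative_parts = components_sketch("docs/index.txt")
```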
Enjoy, and please provide feedback!
--Mike Orr <sluggoster at gmail.com>