[Python-ideas] PEP 428 - object-oriented filesystem paths

Antoine Pitrou solipsis at pitrou.net
Sun Oct 7 19:37:35 CEST 2012

On Sat, 6 Oct 2012 10:44:37 -0700
Guido van Rossum <guido at python.org> wrote:
> But rather than diving right into the syntax, I would like to focus on
> some use cases. (Some of this may already be in the PEP, my
> apologies.) Some things I care about (based on path manipulations I
> remember I've written at some point or another):
> - Distinguishing absolute paths from relative paths; this affects
> joining behavior as for os.path.join().

The proposed API behaves like os.path.join() in that respect: when an
absolute path is joined onto a relative path, the relative path is
simply discarded:

>>> p = PurePath('a')
>>> q = PurePath('/b')
>>> p[q]
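
For comparison, here is a minimal sketch of the same joining rule using
the pathlib module as it later shipped in the standard library. Two
things here are assumptions beyond this message: the shipped module
joins with the / operator rather than the p[q] indexing shown above,
and PurePosixPath is chosen so the result does not depend on the
platform.

```python
from pathlib import PurePosixPath

p = PurePosixPath('a')     # relative path
q = PurePosixPath('/b')    # absolute path

# Joining an absolute path onto a relative one discards the relative part,
# mirroring os.path.join('a', '/b').
joined = p / q
print(joined)
```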

> - Various normal forms that can be used for comparing paths for
> equality; there should be a pure normalization as well as an impure
> one (like os.path.realpath()).

Impure normalization is done with the resolve() method:

>>> os.chdir('/etc')
>>> Path('ssl/certs').resolve()

(/etc/ssl/certs being a symlink to /etc/pki/tls/certs on my system)
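
A self-contained way to try impure normalization without depending on
a particular /etc layout is sketched below. It uses the pathlib module
as later shipped in the standard library; the directory names "real"
and "link" are hypothetical, and the sketch assumes a platform where
the current user may create symlinks.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Resolve the temporary directory first, since it may itself sit
    # behind a symlink on some systems.
    base = Path(tmp).resolve()
    target = base / 'real'
    target.mkdir()

    link = base / 'link'
    link.symlink_to(target, target_is_directory=True)

    # resolve() touches the filesystem and follows the symlink.
    resolved = link.resolve()
    # os.path.realpath() is the os.path counterpart mentioned above.
    same_as_realpath = str(resolved) == os.path.realpath(link)
    print(resolved.name)
```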

Pure comparison already obeys case-sensitivity rules as well as the
different path separators:

>>> PureNTPath('a/b') == PureNTPath('A\\B')
>>> PurePosixPath('a/b') == PurePosixPath('a\\b')

Note the case information isn't lost either:

>>> str(PureNTPath('a/b'))
>>> str(PureNTPath('A/B'))
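
The same checks can be sketched against the pathlib module as later
shipped in the standard library. Note the assumption: the Windows
flavour is named PureWindowsPath there, not PureNTPath as in this
draft.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Windows flavour: case-insensitive, and / and \ are both separators.
same_on_windows = PureWindowsPath('a/b') == PureWindowsPath('A\\B')

# POSIX flavour: a backslash is an ordinary character in a file name.
same_on_posix = PurePosixPath('a/b') == PurePosixPath('a\\b')

# Comparison folds case, but the stored path keeps the original case.
lower = str(PureWindowsPath('a/b'))
upper = str(PureWindowsPath('A/B'))

print(same_on_windows, same_on_posix, lower, upper)
```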

> - An API that encourages Unix lovers to write code that is most likely
> also to make sense on Windows.
> - An API that encourages Windows lovers to write code that is most
> likely also to make sense on Unix.

I agree with these goals, which is why I'm trying to avoid
system-specific methods. For example, is_reserved() is also defined
under Unix; it just always returns False:

>>> PurePosixPath('CON').is_reserved()
>>> PureNTPath('CON').is_reserved()

> - Integration with fnmatch (pure) and glob (impure).

This is indeed provided, by the match() and glob() methods.
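
A minimal sketch of the pure / impure split, written against the
pathlib module as later shipped in the standard library; the file
names are hypothetical and a temporary directory stands in for a real
project tree.

```python
import tempfile
from pathlib import Path, PurePosixPath

# Pure: match() only inspects the path string, like fnmatch.
matches = PurePosixPath('src/module.py').match('*.py')

# Impure: glob() lists what actually exists on disk.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'a.py').write_text('')
    (root / 'b.txt').write_text('')
    found = sorted(path.name for path in root.glob('*.py'))

print(matches, found)
```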

> - In addition to stat(), some simple derived operations like
> getmtime(), getsize(), islink().

The PEP proposes properties mimicking the stat object attributes:

>>> p = Path('setup.py')
>>> p.st_size
>>> p.st_mtime

And methods to query the file type:

>>> p.is_symlink()
>>> p.is_file()

Perhaps the properties / methods mix isn't very consistent.
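
For reference, a sketch of the equivalent queries in the pathlib
module as later shipped in the standard library. This is an assumption
beyond the draft described here: in the shipped module the st_*
values are read from the object returned by stat() rather than from
properties on the path, while the file-type queries are methods. The
file is created in a temporary directory so the example is
self-contained.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / 'setup.py'
    p.write_bytes(b'print("hello")\n')   # 15 bytes

    info = p.stat()          # one system call, several attributes
    size = info.st_size
    mtime = info.st_mtime

    is_file = p.is_file()
    is_link = p.is_symlink()

print(size, is_file, is_link)
```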

> - Easy checks and manipulations (applying to the basename) like "ends
> with .pyc", "starts with foo", "ends with .tar.gz", "replace .pyc
> extension with .py", "remove trailing ~", "append .tmp", "remove
> leading @", and so on.

I'll try to reconcile this with Ben Finney's suffix / suffixes proposal.
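
As an illustration of the operations listed in the quoted text, here
is a sketch using the pathlib module as later shipped in the standard
library, which has suffix, suffixes, with_suffix() and with_name().
Whether these match the proposal mentioned here in every detail is an
assumption, and the file names are hypothetical.

```python
from pathlib import PurePosixPath

compiled = PurePosixPath('pkg/module.pyc')
archive = PurePosixPath('dist/project.tar.gz')
backup = PurePosixPath('notes.txt~')

ends_with_pyc = compiled.suffix == '.pyc'                 # "ends with .pyc"
ends_with_tar_gz = archive.suffixes == ['.tar', '.gz']    # "ends with .tar.gz"
starts_with_foo = PurePosixPath('foo_bar.txt').name.startswith('foo')

source = compiled.with_suffix('.py')                      # replace .pyc with .py
without_tilde = backup.with_name(backup.name.rstrip('~')) # remove trailing ~
with_tmp = archive.with_name(archive.name + '.tmp')       # append .tmp

print(source, without_tilde, with_tmp)
```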

> - Matching on patterns on directory names (e.g. "does not contain a
> segment named .hg").

Sequence-like access on the parts property provides this:

>>> p = PurePath('foo/.hg/hgrc')
>>> '.hg' in p.parts
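
The same membership test, sketched with the pathlib module as later
shipped in the standard library, where parts is a plain tuple of
path components (PurePosixPath is used so the result is the same on
any platform):

```python
from pathlib import PurePosixPath

p = PurePosixPath('foo/.hg/hgrc')

parts = p.parts               # each segment is one tuple element
has_hg = '.hg' in parts       # whole-segment match, not a substring test

# A substring of a segment does not count as a match.
has_partial = 'hg' in parts

print(parts, has_hg, has_partial)
```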



Software development and contracting: http://pro.pitrou.net
