[Python-ideas] PEP 428 - object-oriented filesystem paths
ncoghlan at gmail.com
Sat Oct 13 17:37:18 CEST 2012
On Sat, Oct 13, 2012 at 8:06 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> The question is: why do you want to do that?
> I know there are a limited bunch of special cases where Posix filesystem
> paths may be case-insensitive, but nobody really cares about them today,
> and I don't expect many people to bother tomorrow. Playing with
> individual parameters of path semantics sounds like a theoretical bother
> more than a practical one.
It's a useful trick for writing genuinely cross-platform code: when
I'm writing cross-platform code on *nix, I want my paths to behave
like posix paths in every respect *except* I want them to complain
somehow if any of my names only differ by case. I've been burnt in the
past by checking in conflicting names on a Linux machine and then
wondering why the Windows checkouts were broken. The only real way to
deal with that is to avoid relying on filesystem case sensitivity for
correct behaviour of your application, even when the underlying OS
*permits* case sensitivity.
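
A minimal sketch of the "complain if names differ only by case" check described above. The function name find_case_collisions and the sample file list are illustrative assumptions, and str.casefold() only approximates the case-folding rules a real Windows filesystem applies.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that differ as written but match after case folding."""
    groups = defaultdict(set)
    for name in names:
        # casefold() is an approximation of filesystem case-insensitivity.
        groups[name.casefold()].add(name)
    # Keep only the groups that contain more than one distinct spelling.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Hypothetical list of tracked files in a checkout.
tracked = ["README", "src/Utils.py", "src/utils.py", "docs/index.rst", "Readme"]
collisions = find_case_collisions(tracked)
```

Running a check like this before committing on Linux would flag the pairs that a case-insensitive checkout cannot hold side by side.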
This becomes even *more* important if NFS and CIFS filesystems are
being shared between *nix and Windows systems, but it applies any time
a file system may be shared (e.g. creating archive files, checking in
to a source control system, etc). I have the luxury right now of only
needing to care about Linux systems, but I've had to deal with the
mess in the past and "act case insensitive everywhere" is the only
sanity-preserving option. Python itself deals with this mostly via the
stylistic rule of "always use lowercase module and package names", but
it would be nice if a new path abstraction allowed the problem to be
On the Windows side, it would be nice to be able to request the use of
"/" as the directory separator when converting to a string. Using "\"
has the potential to cause interoperability problems (e.g. with
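
For reference, the pathlib module that later shipped in the standard library exposes this kind of conversion as the as_posix() method. The sketch below uses PureWindowsPath so that it behaves the same on any OS; the sample path is made up.

```python
from pathlib import PureWindowsPath

# A pure path does no filesystem access, so this works on Linux too.
p = PureWindowsPath(r"C:\projects\demo\setup.py")

native = str(p)          # uses the Windows separator
portable = p.as_posix()  # same path rendered with forward slashes
```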
If you don't like the implicit nature of contexts (a perfectly
reasonable complaint), then I suggest going for an explicit strategy
pattern with flavours rather than requiring classes.
With this approach, the flavour would be specified on a *per-instance*
basis (with the default behaviour being determined by the OS).
The main class hierarchy would just be PurePath <-- Path and there
would be a separate PathFlavor ABC with PosixFlavor and WindowsFlavor
subclasses (public Python stdlib APIs generally follow US spelling and
drop the 'u').
The main classes would then *delegate* the flavour-dependent
operations like parsing, conversion to a string and equality
comparisons to the flavour objects.
It's really the public use of the strategy pattern that prevents the
combinatorial explosion - you can just have a single OS-based default
(as is already the case with PurePath.__new__ and Path.__new__ playing
type selection games), rather than allowing the default to be
configured per thread. The decimal-style thread-based dynamic contexts
are more useful when you want to change the behaviour *without* either
copying or mutating objects, which I agree is overkill for path
Since pathlib already uses the Flavor objects as strategies
internally, it should just be a matter of switching from the use of
inheritance to specify the flavour to using a keyword-only argument in
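the constructor. The "case-insensitive posix path" example would then
look something like this (the helper name make_path is a placeholder,
since the original name did not survive in this copy):

    class PosixCaseInsensitiveFlavor(PosixFlavor):
        case_sensitive = False

    def make_path(*args):  # placeholder name
        return Path(*args, flavor=PosixCaseInsensitiveFlavor)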
You can add as many new flavours as you want, and it's only one class
per flavour rather than up to 3 (the flavour itself, the pure variant
and the concrete variant).
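
The delegation described above can be sketched as a toy model. This is not pathlib's actual implementation: the classes below are simplified stand-ins that reuse the names from this message, handle relative paths only (no roots or drives), and assume the flavour is passed as a class and instantiated by the path.

```python
import os
from abc import ABC, abstractmethod


class PathFlavor(ABC):
    """Strategy object holding the flavour-dependent behaviour."""
    sep = "/"
    case_sensitive = True

    @abstractmethod
    def split(self, text):
        """Parse a string into a list of path components."""

    def normcase(self, part):
        # Used for comparisons only; the original spelling is preserved.
        return part if self.case_sensitive else part.casefold()


class PosixFlavor(PathFlavor):
    def split(self, text):
        return [p for p in text.split("/") if p]


class WindowsFlavor(PathFlavor):
    sep = "\\"
    case_sensitive = False

    def split(self, text):
        # Accept both separators on input.
        return [p for p in text.replace("/", "\\").split("\\") if p]


class PosixCaseInsensitiveFlavor(PosixFlavor):
    # One extra class per flavour; no new path classes are needed.
    case_sensitive = False


class PurePath:
    def __init__(self, *segments, flavor=None):
        if flavor is None:
            # Single OS-based default, no per-thread context.
            flavor = WindowsFlavor if os.name == "nt" else PosixFlavor
        self._flavor = flavor()
        self._parts = tuple(
            part for seg in segments for part in self._flavor.split(seg)
        )

    def _key(self):
        return tuple(self._flavor.normcase(p) for p in self._parts)

    def __str__(self):
        # String conversion is delegated to the flavour's separator.
        return self._flavor.sep.join(self._parts)

    def __eq__(self, other):
        if not isinstance(other, PurePath):
            return NotImplemented
        if type(self._flavor) is not type(other._flavor):
            return False
        return self._key() == other._key()

    def __hash__(self):
        return hash((type(self._flavor), self._key()))


a = PurePath("Src/Utils.py", flavor=PosixCaseInsensitiveFlavor)
b = PurePath("src", "utils.py", flavor=PosixCaseInsensitiveFlavor)
```

Here a and b compare equal while str(a) keeps its original spelling, and the same two strings under plain PosixFlavor would compare unequal.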
This class hierarchy is also more amenable to the introduction of
MutablePath as a second subclass of PurePath - a path variant with
mutable properties still sounds potentially attractive to me (over a
wide variety of return-a-modified-copy methods for various cases).
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia