[Python-ideas] os.path.commonprefix: Yes that old chestnut.
Paul Moore
p.f.moore at gmail.com
Tue Mar 24 15:16:58 CET 2015
On 24 March 2015 at 14:07, Andrew Barnert <abarnert at yahoo.com> wrote:
>>> Needless to say, an itertools (or "sequencetools") function that you call on parts does nothing to either help or hinder this problem. But it does seem to lend itself better to approaches where parts holds some new FooPathComponent type, or maybe a str on POSIX but a new CaseInsensitiveStr on Windows.
>>
>> (p.__class__(pp) for pp in p.parts)
>
> Sure, but then your whole expression looks something like:
>
> p1.__class__(*more_itertools.common_prefix(
>     (p1.__class__(pp) for pp in p1.parts),
>     (p2.__class__(pp) for pp in p2.parts)))
>
> Which doesn't read quite as nicely as "just call an itertools function on the parts and construct a Path from them" sounds like it should.
Agreed, absolutely :-)
> Which implies that you'd probably want at least a recipe in the pathlib docs that referenced the recipe in the itertools docs or something.
>
> And that many people who aren't on Windows just wouldn't bother and would write something non-portable until they got a complaint from a Windows user and found it worth investigating...
I still think it's worth considering as a path object method, for
convenience. I just think the *semantics* should be that of the
equivalent list operation, because that's easily understandable. But
to an extent, that's the trap that os.path.commonprefix fell into
(although to a much worse degree...). Hence my concern that we find some
real use cases for the operation, so that we can ensure that the list
semantics match what people actually want the operation to do.
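
To make the contrast concrete, here is a minimal sketch comparing the character-wise behaviour of os.path.commonprefix with a component-wise comparison of path parts. The helper name common_parts is hypothetical, introduced only for illustration; it is not an itertools or pathlib function.

```python
import os.path
from pathlib import PurePosixPath

paths = ["/usr/lib", "/usr/local"]

# os.path.commonprefix compares character by character, so the result
# can stop in the middle of a path component.
char_prefix = os.path.commonprefix(paths)


def common_parts(a, b):
    """Longest common leading run of two sequences (hypothetical helper)."""
    out = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out


# List semantics: compare whole components taken from .parts instead.
parts_prefix = common_parts(
    PurePosixPath(paths[0]).parts,
    PurePosixPath(paths[1]).parts,
)

print(char_prefix)   # a string that is not a directory boundary
print(parts_prefix)  # a list of whole components
```

The character-wise result ends partway through "lib"/"local", whereas the component-wise result only ever contains whole path components.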
> (While we're at it: most POSIX OS's can handle both case-sensitive and case-insensitive filesystems, and at least some OS X functions take that into account, although that may not be true at the BSD level, only at the POSIX level. For that matter, doesn't the HFS+ filesystem also consider two paths equal if they have the same NFKD, even if they have different code points? But I guess if I'm remembering right, this would be no more or less broken than any other use of PosixPath on Mac, so it's not worth worrying about here, right?)
These (like the decision to treat a/b/ and a/b as the same) are
decisions that have already been made, for better or worse, by
pathlib, and I have no intention of getting sucked into them here.
IMO, a common_prefix method on pathlib objects should follow pathlib
semantics, and can easily do that by being built on top of existing
pathlib operations and uncontroversial list operations. That's the
only approach that I think makes sense - and the only remaining
question is whether it meets a real-world requirement, or whether it
will end up being an oddity that no-one ever uses because it doesn't
*quite* do what they want.
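
As a rough sketch of what "built on top of existing pathlib operations and uncontroversial list operations" might look like: the function below is a hypothetical free function named common_prefix, not an existing pathlib method. It assumes both arguments are of the same path class and compares components by wrapping each one in that class, so that pathlib's own equality rules (for example, case-insensitivity on Windows paths) decide what counts as a match.

```python
from pathlib import PurePosixPath, PureWindowsPath


def common_prefix(p1, p2):
    """Return the longest common leading path of p1 and p2.

    Hypothetical helper, not part of pathlib. Assumes p1 and p2 are
    instances of the same pure or concrete path class.
    """
    cls = type(p1)
    common = []
    for a, b in zip(p1.parts, p2.parts):
        # Compare components as path objects so that pathlib's own
        # comparison semantics apply, rather than plain str equality.
        if cls(a) != cls(b):
            break
        common.append(a)
    # With no shared components this yields the empty path, i.e. '.'.
    return cls(*common)


# Case differences are ignored for Windows paths, as pathlib already does.
win = common_prefix(PureWindowsPath("C:/Users/Foo"),
                    PureWindowsPath("c:/users/bar"))

# A trailing slash makes no difference, again following pathlib.
posix = common_prefix(PurePosixPath("/a/b/c"), PurePosixPath("/a/b/"))
```

Whether returning the empty path for unrelated inputs is what people actually want is exactly the kind of question the real-world use cases would need to answer.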
Paul