[Python-ideas] os.path.commonpath()

Ronald Oussoren ronaldoussoren at mac.com
Wed Nov 7 08:22:40 CET 2012


On 7 Nov, 2012, at 3:05, Bruce Leban <bruce at leapyear.org> wrote:

> It would be nice if, in conjunction with this, os.path.commonprefix were renamed to string.commonprefix, with os.path.commonprefix kept for backwards compatibility (and deprecated).
> 
> more inline
> 
> On Tue, Nov 6, 2012 at 7:49 AM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> 
> On 6 Nov, 2012, at 16:27, Serhiy Storchaka <storchaka at gmail.com> wrote:
> > What should be a common prefix of '/var/log/apache2' and '/var//log/mysql'?
> /var/log
> 
> > What should be a common prefix of '/usr' and '//usr'?
> /usr
> 
> > What should be a common prefix of '/usr/local/' and '/usr/local/'?
> /usr/local
> 
> It appears that you want the result to never include a trailing /. However, you've left out one key test case:
> 
> What is commonpath('/usr', '/var')?
> 
> It seems to me that the only reasonable value is '/'.

I agree

> 
> If you change the semantics so that it either (1) always includes a trailing / or (2) includes a trailing slash if the two paths have it in common, then you don't have the weirdness that in this case it returns a slash and in others it doesn't. I am slightly inclined to (1) at this point.

I'd prefer to only have a path separator at the end when it has semantic meaning. That would mean that only the root of a filesystem tree ("/" on Unix, but also "C:\" and "\\server\share\" on Windows) has a separator at the end.

> 
> It would also be a bit surprising that there are cases where commonpath(a,a) != a.

That's already true: commonpath('/usr//bin', '/usr//bin') would be '/usr/bin' and not '/usr//bin'.

> 
>  
> > What should be a common prefix of '/usr/local/' and '/usr/local/bin'?
> /usr/local
> 
> > What should be a common prefix of '/usr/bin/..' and '/usr/bin'?
> /usr/bin
> 
> seems better than the alternative of interpreting the '..'.

That was the hard choice in the list. My reason for picking this result is that interpreting '..' can change the meaning of a path when dealing with symbolic links, and therefore would make the function less useful (and you can always call os.path.normpath when you do want to interpret '..').

Stripping '.' elements would be fine, e.g. commonpath('/usr/./bin/ls', '/usr/bin/sh') could be '/usr/bin'. 

> 
> * Relative paths that don't share a prefix should raise an exception
> 
> Why? Why is an empty path not a reasonable result?

An empty string is not a valid path.  Now that I reconsider this question: "." would be a valid path, and would have a sane meaning.
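
To make the answers given so far concrete, here is a minimal sketch of the proposed POSIX behaviour. The name common_path_sketch is hypothetical, and raising ValueError when an absolute and a relative path are mixed is an assumption of this sketch, not something decided in the thread. It compares whole path components, drops empty and '.' components, leaves '..' alone, keeps a trailing separator only for the root, and returns "." for relative paths with nothing in common.

```python
import posixpath


def common_path_sketch(path1, path2):
    """Hypothetical helper: component-wise common path of two POSIX paths."""
    absolute = path1.startswith("/")
    if absolute != path2.startswith("/"):
        # Assumption of this sketch: mixing the two kinds is an error.
        raise ValueError("cannot mix absolute and relative paths")

    def components(path):
        # Drop empty components (from '//' or a trailing '/') and '.',
        # but keep '..' as-is: resolving it could cross a symlink.
        return [c for c in path.split("/") if c and c != posixpath.curdir]

    common = []
    for a, b in zip(components(path1), components(path2)):
        if a != b:
            break
        common.append(a)

    if absolute:
        # Only the root keeps its separator.
        return "/" + "/".join(common)
    # Relative paths with nothing in common yield '.' rather than ''.
    return "/".join(common) or posixpath.curdir


print(common_path_sketch("/var/log/apache2", "/var//log/mysql"))  # /var/log
print(common_path_sketch("/usr", "/var"))                         # /
print(common_path_sketch("/usr/bin/..", "/usr/bin"))              # /usr/bin
print(common_path_sketch("src/a.py", "docs/b.txt"))               # .
```

Callers that do want '..' resolved can pass each argument through posixpath.normpath first.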

>  
> * On windows two paths that don't have the same drive should raise an exception
> 
> I disagree. On unix systems, should two paths that don't have the same drive also raise an exception? What if I'm using this function on windows to compare two http paths or two paths to a remote unix system? Raising an exception in either case would be wrong.

The paths in URLs don't have a drive, hence both URL paths would have the "same" drive. More importantly: posixpath.commonpath would be better for comparing two http or remote unix paths, as that function uses the correct separator (ntpath.commonpath uses a backslash as separator).
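
The separator point can be illustrated as follows. This is a sketch that assumes a Python version in which commonpath is available in both posixpath and ntpath (3.5 and later); when this message was written the function was still only a proposal. The example.com URLs are made up for illustration.

```python
import ntpath
import posixpath
from urllib.parse import urlsplit

# Hypothetical URLs, used only for illustration.
url1 = "http://example.com/docs/api/index.html"
url2 = "http://example.com/docs/guide/intro.html"

path1 = urlsplit(url1).path  # '/docs/api/index.html'
path2 = urlsplit(url2).path  # '/docs/guide/intro.html'

# posixpath always uses '/', whatever platform the code runs on.
print(posixpath.sep)                         # /
print(posixpath.commonpath([path1, path2]))  # /docs

# ntpath uses the backslash, which is wrong for a URL path.
print(ntpath.sep)                            # \
print(ntpath.commonpath([path1, path2]))     # \docs
```

Importing posixpath explicitly, rather than going through os.path, keeps the result the same on Windows and Unix.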

Also: when two paths have a different drive letter or UNC share name there is no way to have a value for the prefix that allows for the construction of a path from the common prefix to one of those paths.

That is,

     path1 = r"c:\windows"
     path2 = r"d:\data"

     pfx = commonpath(path1, path2)

The only value of pfx that would result in there being a value of 'sfx' such that os.path.join(pfx, sfx) == path1 is the empty string, but that value does not refer to a filesystem location. That means you have to explicitly test whether commonpath returns the empty string, because you likely have to behave differently when there is no shared prefix. I'd then prefer that commonpath raise an exception, because it would be too easy to forget to check for this (especially when developing on a Unix-based platform and later porting to Windows). An exception would mean code blows up, instead of giving unexpected results (leading to questions like "Why is your program writing junk in my home directory?").
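
A small sketch of that argument, using ntpath so it runs on any platform. The helper name require_same_drive is hypothetical, and choosing ValueError as the exception type is an assumption of this sketch. It first shows what an empty prefix silently turns into when joined, then the check that fails loudly instead.

```python
import ntpath


def require_same_drive(path1, path2):
    """Hypothetical check: return the shared drive or raise ValueError."""
    drive1, _ = ntpath.splitdrive(path1)
    drive2, _ = ntpath.splitdrive(path2)
    # Drive letters are compared case-insensitively in this sketch.
    if drive1.lower() != drive2.lower():
        raise ValueError(
            "paths are on different drives: %r and %r" % (drive1, drive2)
        )
    return drive1


# If an empty prefix were returned instead, joining it with a suffix
# silently produces a path relative to the current directory.
print(ntpath.join("", "data"))  # data

try:
    require_same_drive(r"c:\windows", r"d:\data")
except ValueError as exc:
    print("refused:", exc)
```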

> 
> 
> The alternative is to return some arbitrary value (like None) that you have to test for, which would IMHO make it too easy to accidentally pass a useless value to some other API and get a confusing exception later on.
> 
> Yes, don't return a useless value. An empty string is useful in the relative path case, and '/' is useful when the paths are absolute but have no common prefix at all.

"/" *is* the common prefix for absolute paths on Unix that don't share any path elements.  As mentioned above "." (or rather os.path.curdir) would be a sane result for relative paths.

Ronald 
> 
> 
> --- Bruce
