On Sat, 27 Dec 2008 10:58:07 am Nick Coghlan wrote:
The doc for os.path.commonprefix states:
Return the longest path prefix (taken character-by-character) that is a prefix of all paths in list. If list is empty, return the empty string (''). Note that this may return invalid paths because it works a character at a time.
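To make the documented behaviour concrete, here is a small illustration. The example paths are made up, and posixpath is used instead of os.path only so that the result does not depend on the platform running it:

```python
import posixpath

# Two made-up paths that share "/usr/" plus the first letter of the
# next directory name.
paths = ["/usr/lib/python2.5", "/usr/local/lib"]

# The comparison is character by character, so the result stops in the
# middle of a directory name and is not a path that exists.
prefix = posixpath.commonprefix(paths)
print(prefix)  # /usr/l

# An empty list gives the empty string, as the doc says.
print(repr(posixpath.commonprefix([])))  # ''
```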
I remember encountering this in an earlier version of Python 2.x (maybe 2.2 or 2.3?) and "fixed" it to work by pathname components instead of by characters. That had to be reverted because it was a behavior change and broke code which used it for strings which didn't represent paths. After the reversion I then forgot about it.
I just stumbled upon it again. It seems to me this would have been a good thing to fix in 3.0. Is this something which could change in 3.1 (or be deprecated in 3.1 with deletion in 3.2)?
Why can't we add an "allow_fragment" keyword that defaults to True? Then "allow_fragment=False" will stop at the last full directory name and ignore any partial matches on the filenames or the next subdirectory (depending on where the common prefix ends).
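A rough sketch of what that keyword might look like, written as a standalone wrapper. Nothing here is an existing stdlib signature: the allow_fragment and sep parameters and the choice to keep the trailing separator are all assumptions made for illustration.

```python
import posixpath

def commonprefix(paths, allow_fragment=True, sep="/"):
    """Hypothetical wrapper; not the signature of os.path.commonprefix."""
    paths = list(paths)
    prefix = posixpath.commonprefix(paths)
    if allow_fragment or not prefix:
        # Current behaviour: plain character-by-character prefix.
        return prefix

    def on_boundary(p):
        # True if the prefix is followed by a separator, or by nothing,
        # in this path.
        return len(p) == len(prefix) or p[len(prefix)] == sep

    if prefix.endswith(sep) or all(on_boundary(p) for p in paths):
        # The prefix already ends on a full component.
        return prefix
    # Otherwise drop the partial component, keeping the separator.
    return prefix[: prefix.rfind(sep) + 1]

paths = ["/usr/lib/python2.5", "/usr/local/lib"]
print(commonprefix(paths))                        # /usr/l
print(commonprefix(paths, allow_fragment=False))  # /usr/
```

Note that simply cutting back to the last separator is not quite enough: for "/usr/lib" and "/usr/lib/python" the character prefix "/usr/lib" is already a full directory name, which is why the sketch checks the boundary before trimming.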
For what it's worth, I think that the two pieces of functionality are different enough that in an ideal world they should be two different functions rather than one function with a switch. I think os.path.commonprefix should only operate on path components, and if character-by-character prefix matching on general strings is useful, then it should be a string method.
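Something along these lines is what I have in mind by two separate functions. The names common_path_components and common_string_prefix are invented for this sketch, and it only splits on a single separator without any normalisation, so treat it as an illustration rather than a proposal for the actual implementation:

```python
from itertools import takewhile

def _all_equal(items):
    # True when every element of the tuple is the same value.
    return len(set(items)) == 1

def common_path_components(paths, sep="/"):
    """Hypothetical: longest run of leading components shared by all paths."""
    split = [p.split(sep) for p in paths]
    shared = takewhile(_all_equal, zip(*split))
    return sep.join(parts[0] for parts in shared)

def common_string_prefix(strings):
    """Hypothetical: character-by-character prefix for arbitrary strings."""
    shared = takewhile(_all_equal, zip(*strings))
    return "".join(chars[0] for chars in shared)

print(common_path_components(["/usr/lib/python2.5", "/usr/local/lib"]))  # /usr
print(common_string_prefix(["interstate", "internet"]))                  # inter
```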