
[Rob Cliffe]
The os.path.commonprefix function basically returns the common initial characters (if any) shared by a sequence of strings, e.g. ... It seems to me that this function (or something similar) should not be in os.path, but somewhere else where it would be more visible.
It's certainly in a strange place ;-)
(1) a str.commonprefix function. Unfortunately it would have to be used like this str.commonprefix(<sequence of strings>) which would make it [I think] the only str function which couldn't be called as a string method.
Sure it could. like astr.commonprefix(*strs) to find the longest common prefix among `astr` and the (zero or more) optional arguments.
... One wrinkle: os.patch.commonprefix, if passed an empty sequence, returns an empty STRING. If this function were designed from scratch, it should probably return None.
Don't think so. An empty string is a valid string too, and is the "obviously correct" common prefix of "abc" and "xyz".
I also suspect that this function could be made more efficient. It sorts the sequence. While sorting is very fast (thanks, Uncle Tim!) it seems a bit OTT in this case.
It doesn't sort. It finds the min and the max of all the inputs, as separate operations, and finds the longest common prefix of those teo alone. Apart from the initial min/max calls, the other inputs are never looked at again. It's a matter of logical deduction that if S <= L have K initial characters in common, then every string T with S <= T <= L must have the same K initial characters. As a micro-optimization, you might think the min'max could be skipped if there were only two input strings. But then for i, c in enumerate(s1): if c != s2[i]: return s1[:i] could blow up with an IndexError if len(s2) < len(s1). As is, that can't happen, because s1 <= s2 is known. At worst, the `for` loop can run to exhaustion (in which case s2.startswith(s1)). So. to my eye, there are surprisingly few possibilities for speeding this.