[Python-ideas] os.path.join

Andrew Barnert abarnert at yahoo.com
Thu Oct 31 17:26:28 CET 2013


On Oct 31, 2013, at 5:56, Paul Moore <p.f.moore at gmail.com> wrote:

> On 31 October 2013 10:30, anatoly techtonik <techtonik at gmail.com> wrote:
>>> I agree it might be confusing but it's pretty explicitly documented.
>> 
>> Yes. It is confusing.
>> 
>> 1. How often the operations to join absolute paths is needed?
> 
> Infrequently, but occasionally. Usually the first argument will be a
> fixed value which is a "base path" and the second will be a
> user-supplied (or similar) value which is to be interpreted as
> relative to the base, unless it's an absolute path when it's to be
> used unchanged.

Agreed. Any command-line tool that takes an optional base path in a flag argument and file paths in positional arguments works that way--or it works by first chdir-ing to the base path and then using the paths, which names the same files. It would be very surprising if it didn't. Even the way base URLs and href URLs on web pages are combined is based on this behavior.
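
As a rough illustration of that convention (a minimal sketch; the helper name resolve and the example paths and URLs are made up for this post, and posixpath is used so the result is the same on every platform):

```python
import posixpath
from urllib.parse import urljoin


def resolve(base, arg):
    """Interpret arg relative to base, unless arg is already absolute.

    Hypothetical helper: it only wraps posixpath.join to give the
    behavior a name.
    """
    return posixpath.join(base, arg)


# A relative argument is interpreted under the base path.
print(resolve("/srv/data", "reports/q3.csv"))  # /srv/data/reports/q3.csv

# An absolute argument is used unchanged; the base is discarded.
print(resolve("/srv/data", "/etc/hosts"))      # /etc/hosts

# Base URLs and href values combine along the same lines.
page = "http://example.com/docs/index.html"
print(urljoin(page, "img/logo.png"))   # http://example.com/docs/img/logo.png
print(urljoin(page, "/img/logo.png"))  # http://example.com/img/logo.png
```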

Languages that don't do it this way are surprising. For example, Ruby's File.join always treats every argument after the first as a relative path. So I had to write my own method that did the equivalent of "p2 if p2.startswith(os.sep) else p1.join(p2)". (Perl of course has at least 5 ways to do it, 2 that act like Python, 1 that acts like Ruby, and 2 that double up the separator with whatever meaning that happens to have on each platform--but that isn't surprising; it's Perl.)
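
To make the difference concrete, here is a sketch in Python of the two behaviors. Both function names are hypothetical, and ruby_style_join only approximates what File.join does for two POSIX-style arguments; it is not Ruby's implementation.

```python
import posixpath


def ruby_style_join(p1, p2):
    """Always treat p2 as relative to p1 (approximation of File.join)."""
    # Collapse the separators at the seam so "/base" + "/etc" gives "/base/etc".
    return p1.rstrip("/") + "/" + p2.lstrip("/")


def python_style_join(p1, p2):
    """The workaround: an absolute p2 wins, otherwise join as usual."""
    return p2 if p2.startswith("/") else ruby_style_join(p1, p2)


print(ruby_style_join("/base", "/etc"))    # /base/etc
print(python_style_join("/base", "/etc"))  # /etc
print(python_style_join("/base", "etc"))   # /base/etc

# The workaround agrees with what posixpath.join already does.
print(posixpath.join("/base", "/etc"))     # /etc
```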

>> 2. What is expected result of this operation?
> 
> Exactly what Python currently does.
> 
>> For me, as a user, the answer to 1 is 'never', for 2 I'd expect 2nd
>> path to be treated as relative one. Thinking about this as 2nd path is
>> an absolute path from the mountpoint specified in the 1st.
> 
> I would never want this behaviour in any real application I have encountered.

Agreed.

I've never seen anyone argue that the other behavior would be more "natural". I _have_ seen an argument that it's more "secure", but this seems like a silly argument. After all, Ruby's File.join doesn't stop you from joining "../../../etc", so why should it stop you from joining "/etc"? And, if there _is_ a good reason to stop you, why does it return a path that's likely to silently work (but not in the way the user intended) rather than raise an error? And what if you're writing a command-line tool intended for system administration rather than a web app? Even with a web app, if you run inside a chroot or similar jail, how do you provide access to the entire jail? It seems like the kind of "security" feature that PHP hacks would devise, like using extra quoting so an attacker has to throw an extra quote in if he wants to inject SQL...
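
A small sketch of why the always-relative join buys nothing, and of what raising instead might look like. Both functions are hypothetical and exist only for illustration; the check is purely lexical (it does not resolve symlinks), so it is not a complete sandbox.

```python
import posixpath


def always_relative_join(base, user_path):
    """Treat user_path as relative no matter what, then normalize."""
    return posixpath.normpath(base.rstrip("/") + "/" + user_path.lstrip("/"))


def strict_join(base, user_path):
    """Raise instead of silently reinterpreting a path that leaves base."""
    if posixpath.isabs(user_path):
        raise ValueError("absolute path not allowed: %r" % user_path)
    base = posixpath.normpath(base)
    joined = posixpath.normpath(posixpath.join(base, user_path))
    prefix = base if base.endswith("/") else base + "/"
    if joined != base and not joined.startswith(prefix):
        raise ValueError("path escapes base: %r" % user_path)
    return joined


# "/etc/passwd" is quietly turned into a different file under the base...
print(always_relative_join("/var/www", "/etc/passwd"))       # /var/www/etc/passwd

# ...while ".." components still walk straight out of it.
print(always_relative_join("/var/www", "../../etc/passwd"))  # /etc/passwd

# The strict variant refuses both instead of guessing.
for candidate in ("/etc/passwd", "../../etc/passwd"):
    try:
        strict_join("/var/www", candidate)
    except ValueError as exc:
        print("rejected:", exc)
```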


