[Python-Dev] Path object design
Phillip J. Eby
pje at telecommunity.com
Sat Nov 4 03:09:47 CET 2006
At 01:56 AM 11/4/2006 +0100, Andrew Dalke wrote:
>os.join assumes the base is a directory
>name when used in a join: "inserting '/' as needed" while RFC
>1808 says
>
> The last segment of the base URL's path (anything
> following the rightmost slash "/", or the entire path if no
> slash is present) is removed
>
>Is my intuition wrong in thinking those should be the same?
Yes. :)
Path combining and URL absolutization(?) are inherently different
operations with only superficial similarities. One reason for this is that
a trailing / on a URL has an actual meaning, whereas in filesystem paths a
trailing / is an aberration and likely an actual error.
The path combining operation says, "treat the following as a subpath of the
base path, unless it is absolute". The URL join operation says,
"treat the following as a subpath of the location the base URL is
*contained in*".
Because of this, os.path.join assumes a path with a trailing separator is
equivalent to a path without one, since that is the only reasonable way to
interpret treating the joined path as a subpath of the base path.
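As a rough illustration of the above (a sketch, assuming a current Python 3
and using posixpath so that the separator is '/' on any platform; the
variable names are just for the example):

```python
import posixpath

# A trailing separator on the base makes no difference to the result.
without_slash = posixpath.join("/foo", "bar.html")
with_slash = posixpath.join("/foo/", "bar.html")

# An absolute second argument is not treated as a subpath; it replaces the base.
absolute = posixpath.join("/foo", "/bar.html")

print(without_slash)  # /foo/bar.html
print(with_slash)     # /foo/bar.html
print(absolute)       # /bar.html
```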
But for a URL join, the path /foo and the path /foo/ are not only
*different paths* referring to distinct objects, but the operation wants to
refer to the *container* of the referenced object. /foo might refer to a
directory, while /foo/ refers to some default content (e.g.
index.html). This is actually why Apache normally redirects you from /foo
to /foo/ before it serves up the index.html; relative URLs based on a base
URL of /foo won't work right.
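The same comparison for URLs can be sketched with urljoin (assuming Python
3's urllib.parse; example.com is only a placeholder host):

```python
from urllib.parse import urljoin

# Without the trailing slash, "foo" is the last segment and gets removed,
# so the relative reference resolves against the container of /foo.
from_foo = urljoin("http://example.com/foo", "bar.html")

# With the trailing slash, the last segment is empty, so /foo/ itself
# is the container.
from_foo_slash = urljoin("http://example.com/foo/", "bar.html")

print(from_foo)        # http://example.com/bar.html
print(from_foo_slash)  # http://example.com/foo/bar.html
```

This is the case the redirect protects against: a page served at /foo whose
links were written for /foo/ would resolve them one level too high.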
The URL approach is designed to make peer-to-peer linking in a given
directory convenient. Instead of referring to './foo.html' (as one would
have to do with filenames), you can simply refer to 'foo.html'. But the
cost of saving those characters in every link is that joining always takes
place on the parent, never the tail-end. Thus directory URLs normally end
in a trailing /, and most tools tend to automatically redirect when
somebody leaves it off. (Because otherwise the links would be wrong.)
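A small sketch of the peer-linking case, again assuming Python 3's
urllib.parse and posixpath, with made-up page names:

```python
import posixpath
from urllib.parse import urljoin

page = "http://example.com/docs/a.html"

# Joining happens on the parent of a.html, so a bare name reaches a sibling,
# and the explicit './' form resolves to the same place.
sibling = urljoin(page, "b.html")
dotted = urljoin(page, "./b.html")

# The path-combining operation instead treats the whole base as a directory.
as_path = posixpath.join("/docs/a.html", "b.html")

print(sibling)  # http://example.com/docs/b.html
print(dotted)   # http://example.com/docs/b.html
print(as_path)  # /docs/a.html/b.html
```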