[Python-Dev] Path object design

"Martin v. Löwis" martin at v.loewis.de
Sun Nov 5 20:22:13 CET 2006


Andrew Dalke wrote:
> I have looked at the spec, and can't figure out how its explanation
> matches the observed urljoin results.  Steve's excerpt trimmed out
> the strangest example.

Unfortunately, you didn't say which of these you want explained.
As it is tedious to write down even a single one, I will restrict
myself to the one with the "What?!" remark.

>>>> urlparse.urljoin("http://blah.com/a/b/c", "../../../..")  # What?!
> 'http://blah.com/'

Please follow me through section 5 of

http://www.ietf.org/rfc/rfc3986.txt

5.2.1: Pre-parse the Base URI
 B.scheme = "http"
 B.authority = "blah.com"
 B.path = "/a/b/c"
 B.query = undefined
 B.fragment = undefined
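
The pre-parse step can be sketched in Python with the generic regular
expression given in Appendix B of the RFC. This is only an illustration:
the names URI_RE and parse are hypothetical helpers, not part of urlparse,
and None stands in for "undefined".

```python
import re

# Generic URI regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def parse(uri):
    """Split a URI reference into five components; None means undefined."""
    m = URI_RE.match(uri)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

B = parse("http://blah.com/a/b/c")
print(B)
```

Applied to a relative reference, the same helper leaves scheme and
authority as None and puts the whole string into the path.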

5.2.2: Transform References
 parse("../../../..")
 R.scheme = R.authority = R.query = R.fragment = undefined
 R.path = "../../../.."
 (strictness not relevant, R.scheme is already undefined)
 R.scheme is not defined
 R.authority is not defined
 R.path is not ""
 R.path does not start with /
 T.path = merge("/a/b/c", "../../../..")
 T.path = remove_dot_segments(T.path)
 T.authority = "blah.com"
 T.scheme = "http"
 T.fragment = undefined

5.2.3: Merge Paths
 merge("/a/b/c", "../../../..") =
 (base URI has a non-empty path, so everything after its last "/"
 is dropped and the reference path is appended to "/a/b/")
 "/a/b/../../../.."

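A minimal Python sketch of the merge routine from section 5.2.3, for
illustration only. The function name follows the RFC's pseudocode; the
base_has_authority parameter is an assumption added here to cover the
RFC's special case of a base URI with an authority and an empty path.

```python
def merge(base_path, ref_path, base_has_authority=True):
    """Merge a relative-path reference with the base path (RFC 3986, 5.2.3)."""
    if base_has_authority and base_path == "":
        # Base has an authority but no path: result is "/" + reference.
        return "/" + ref_path
    # Keep the base path up to and including its last "/"; if there is
    # no "/", rfind() returns -1 and the whole base path is dropped.
    return base_path[: base_path.rfind("/") + 1] + ref_path

merged = merge("/a/b/c", "../../../..")
print(merged)
```
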
5.2.4: Remove Dot Segments
 remove_dot_segments("/a/b/../../../..")
 1. I = "/a/b/../../../.."
    O = ""
 2. A (does not apply)
    B (does not apply)
    C (does not apply)
    D (does not apply)
    E O="/a" I="/b/../../../.."
 2. E O="/a/b" I="/../../../.."
 2. C O="/a" I="/../../.."
 2. C O="" I="/../.."
 2. C O="" I="/.."
 2. C O="" I="/"
 2. E O="/" I=""
 3. Result: "/"
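
The trace above can be reproduced with a short Python sketch of the
algorithm. The comments A to E refer to the rule labels in section 5.2.4;
the variable names inp and out are illustrative, and the output buffer is
kept as a list of segments rather than a string, which is an
implementation choice made here and not something the RFC prescribes.

```python
def remove_dot_segments(path):
    """Remove "." and ".." segments from a path (RFC 3986, 5.2.4)."""
    inp, out = path, []
    while inp:
        if inp.startswith("../"):        # A
            inp = inp[3:]
        elif inp.startswith("./"):       # A
            inp = inp[2:]
        elif inp.startswith("/./"):      # B
            inp = inp[2:]
        elif inp == "/.":                # B
            inp = "/"
        elif inp.startswith("/../"):     # C
            inp = inp[3:]
            if out:
                out.pop()
        elif inp == "/..":               # C
            inp = "/"
            if out:
                out.pop()
        elif inp in (".", ".."):         # D
            inp = ""
        else:                            # E
            # Move the first segment (with its leading "/", if any)
            # to the output, up to but not including the next "/".
            end = inp.find("/", 1)
            if end == -1:
                end = len(inp)
            out.append(inp[:end])
            inp = inp[end:]
    return "".join(out)

print(remove_dot_segments("/a/b/../../../.."))
```

Rule C simply does nothing to the output once it is empty, which is why
the surplus ".." segments cannot climb above the root.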

5.3: Component Recomposition
 result = ""
 (scheme is defined)
 result = "http:"
 (authority is defined)
 result = "http://blah.com"
 (append path)
 result = "http://blah.com/"
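
The recomposition step, sketched in Python under the assumption that
undefined components are represented as None. The helper name recompose
is hypothetical. The last line cross-checks against the library, using
urllib.parse, which is where the urlparse module's urljoin lives in
Python 3.

```python
from urllib.parse import urljoin

def recompose(scheme, authority, path, query, fragment):
    """Reassemble a URI from its components (RFC 3986, 5.3)."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path  # the path is always defined, though it may be empty
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result

result = recompose("http", "blah.com", "/", None, None)
print(result)

# Cross-check with the standard library.
print(urljoin("http://blah.com/a/b/c", "../../../.."))
```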

HTH,
Martin

