urllib interpretation of URL with ".."
"Martin v. Löwis"
martin at v.loewis.de
Sat Jun 23 09:14:39 CEST 2007
John Nagle wrote:
> Here's a URL, found in a link, which gives us trouble
> when we try to follow the link:
> Browsers immediately turn this into
> and go from there, but urllib tries to open it explicitly, which
> results in an HTTP error 400.
> Is "urllib" wrong?
I can't see how. HTTP 1.1 says that the parameter to the GET
request should be an abs_path; RFC 2396 says that
/../acatalog/shop.html is indeed an abs_path, as .. is a valid
segment. That RFC also has a section on relative identifiers
and normalization; it defines what .. means *in a relative path*.
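To make the distinction concrete, here is a minimal sketch in present-day Python 3 (urllib.parse rather than the 2007-era urllib). The host www.example.com is a placeholder, since the original URL is not shown above, and collapse_dot_segments and normalize_url are hypothetical helper names, not part of urllib. The parser leaves the ".." segment in the path untouched; the helper collapses it the way browsers appear to, in the spirit of the dot-segment removal that the later RFC 3986 describes. This is an assumption about browser behaviour, not something the RFC quoted here requires.

```python
from urllib.parse import urlsplit, urlunsplit


def collapse_dot_segments(path):
    """Collapse "." and ".." segments in an absolute path (illustrative only)."""
    segments = path.split("/")
    out = []
    for seg in segments:
        if seg == "..":
            # Drop the previous segment, but never climb above the root.
            if len(out) > 1:
                out.pop()
        elif seg != ".":
            out.append(seg)
    # A trailing "." or ".." names a directory, so keep the trailing slash.
    if segments[-1] in (".", ".."):
        out.append("")
    return "/".join(out)


def normalize_url(url):
    """Return url with the dot segments of its path collapsed."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=collapse_dot_segments(parts.path)))


# Placeholder URL; the real one from the original post is not available.
url = "http://www.example.com/../acatalog/shop.html"

# urlsplit keeps ".." as an ordinary path segment.
print(urlsplit(url).path)

# Collapsing the segment before fetching mimics what the browsers do.
print(normalize_url(url))
```

A caller who wants browser-like behaviour could pass the result of normalize_url to urlopen instead of the raw link; whether that is the right thing to do is exactly the question discussed here.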
Section 4 is explicit about .. in absolute URIs:
# The syntax for relative URI is a shortened form of that for absolute
# URI, where some prefix of the URI is missing and certain path
# components ("." and "..") have a special meaning when, and only when,
# interpreting a relative path.
Notice the "and only when": the browsers that modify the above
URL before sending it seem to be in clear violation of