[Python-Dev] bug in urlparse
Mike Brown
mike at skew.org
Thu Sep 8 20:41:39 CEST 2005
jepler at unpythonic.net wrote:
> According to RFC 2396[1] section 5.2:
RFC 2396 is obsolete. It was superseded by RFC 3986 / STD 66 early this year.
In particular, the procedure for removing dot-segments from the path component
of a URI reference -- a procedure that is only supposed to be done when
'resolving' a reference to absolute form (i.e., merging it with a base URI,
which, being a URI, not a URI reference, is not allowed to contain
dot-segments) -- has received a significant overhaul.
The implementation guidance you quoted from RFC 2396 is no longer relevant.
Technically, it never was relevant, since urlparse only claims to implement
RFC 1808 (2396's predecessor, now ten years old).
The new procedure, as described in RFC 3986, says:
"...dot-segments are intended for use in URI references to
express an identifier relative to the hierarchy of names in the base
URI. The remove_dot_segments algorithm respects that hierarchy by
removing extra dot-segments rather than treat them as an error or
leaving them to be misinterpreted by dereference implementations."
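For illustration, here is a rough Python sketch of the remove_dot_segments procedure as I read RFC 3986 section 5.2.4. It is not what urlparse does, and the function name and the list-based output buffer are my own choices for the example, not anything in the standard library. It takes an already-merged path and returns it with the dot-segments removed.

```python
def remove_dot_segments(path):
    """Sketch of RFC 3986 section 5.2.4; illustrative only, not urlparse code."""
    output = []  # each entry is one segment plus its leading "/", if it had one
    while path:
        # Step A: drop a leading "../" or "./"
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        # Step B: replace a leading "/./" (or a final "/.") with "/"
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        # Step C: replace a leading "/../" (or a final "/..") with "/" and
        # discard the last output segment; with nothing left to discard,
        # the extra ".." is simply dropped rather than treated as an error
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        # Step D: a path that is only "." or ".." becomes empty
        elif path in (".", ".."):
            path = ""
        # Step E: move the first segment (with its leading "/") to the output
        else:
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)
```

With this sketch, "/a/b/c/./../../g" comes out as "/a/g", and a path with more ".." segments than there are names to climb, such as "/b/c/../../../g", comes out as "/g".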
-Mike