[Python-bugs-list] [ python-Bugs-450225 ] urljoin fails RFC tests
noreply@sourceforge.net
Mon, 18 Mar 2002 06:22:19 -0800
Bugs item #450225, was opened at 2001-08-12 06:10
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=450225&group_id=5470
Category: Python Library
Group: Python 2.1.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Aaron Swartz (aaronsw)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: urljoin fails RFC tests
Initial Comment:
I've put together a test suite for Python's urlparse
module, based on the tests in Appendix C of
RFC 2396 (the URI RFC). They're available at:
http://lists.w3.org/Archives/Public/uri/2001Aug/0013.html
The major problem seems to be that it treats
queries and parameters as special components
(not just normal parts of the path), making this
related to:
http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=210834
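A minimal sketch of the splitting described above, assuming the six-field result of urlparse() (the import fallback is only there so the snippet also runs on interpreters where the module lives in urllib.parse). The parameters and query of the base URL come back as separate fields rather than as part of the path:

```python
try:
    from urlparse import urlparse  # module name used in this report
except ImportError:
    from urllib.parse import urlparse  # assumption: newer interpreters

base = 'http://a/b/c/d;p?q'

# Fields: scheme, netloc, path, params, query, fragment.
parts = tuple(urlparse(base))
print(parts)

# ';p' and '?q' are split off; the path field is only '/b/c/d'.
path, params, query = parts[2], parts[3], parts[4]
```

Any join logic that works on the path field alone therefore never sees ';p' or '?q' as path text.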
----------------------------------------------------------------------
Comment By: Jon Ribbens (jribbens)
Date: 2002-03-18 14:22
Message:
Logged In: YES
user_id=76089
I think it would be better btw if '..' components taking
you 'off the top' were stripped. RFC 2396 says this is
valid behaviour, and it's what 'real' browsers do.
i.e.
http://a/b/ + ../../../d == http://a/d
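A rough sketch of that stripping, not taken from urlparse itself: join_paths is a hypothetical helper that merges a relative path onto an absolute base path and silently drops any '..' that would climb above the root.

```python
def join_paths(base_path, rel_path):
    """Merge rel_path onto an absolute base_path (hypothetical helper).

    '..' segments that would go above the root are discarded.
    """
    # Drop the last segment of the base (its "file" part), then append
    # the segments of the relative path.
    segments = base_path.split('/')[:-1] + rel_path.split('/')
    # A trailing '.' or '..' names a directory, so keep a trailing slash.
    if segments[-1] in ('.', '..'):
        segments.append('')
    out = []
    for seg in segments:
        if seg == '.':
            continue
        if seg == '..':
            # out[0] is the empty string before the leading '/', so only
            # pop when a real segment is left; otherwise drop the '..'.
            if len(out) > 1:
                out.pop()
            continue
        out.append(seg)
    return '/'.join(out)


result = 'http://a' + join_paths('/b/', '../../../d')
print(result)  # http://a/d
```

With this rule the two "off the top" cases in the test list would give http://a/g instead of http://a/../g and http://a/../../g.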
----------------------------------------------------------------------
Comment By: Aaron Swartz (aaronsw)
Date: 2001-11-05 18:34
Message:
Logged In: YES
user_id=122141
Oops, meant to attach it...
----------------------------------------------------------------------
Comment By: Aaron Swartz (aaronsw)
Date: 2001-11-05 18:30
Message:
Logged In: YES
user_id=122141
Sure, here they are:
import urlparse
base = 'http://a/b/c/d;p?q'
assert urlparse.urljoin(base, 'g:h') == 'g:h'
assert urlparse.urljoin(base, 'g') == 'http://a/b/c/g'
assert urlparse.urljoin(base, './g') == 'http://a/b/c/g'
assert urlparse.urljoin(base, 'g/') == 'http://a/b/c/g/'
assert urlparse.urljoin(base, '/g') == 'http://a/g'
assert urlparse.urljoin(base, '//g') == 'http://g'
assert urlparse.urljoin(base, '?y') == 'http://a/b/c/?y'
assert urlparse.urljoin(base, 'g?y') == 'http://a/b/c/g?y'
assert urlparse.urljoin(base, '#s') == 'http://a/b/c/d;p?q#s'
assert urlparse.urljoin(base, 'g#s') == 'http://a/b/c/g#s'
assert urlparse.urljoin(base, 'g?y#s') == 'http://a/b/c/g?y#s'
assert urlparse.urljoin(base, ';x') == 'http://a/b/c/;x'
assert urlparse.urljoin(base, 'g;x') == 'http://a/b/c/g;x'
assert urlparse.urljoin(base, 'g;x?y#s') == 'http://a/b/c/g;x?y#s'
assert urlparse.urljoin(base, '.') == 'http://a/b/c/'
assert urlparse.urljoin(base, './') == 'http://a/b/c/'
assert urlparse.urljoin(base, '..') == 'http://a/b/'
assert urlparse.urljoin(base, '../') == 'http://a/b/'
assert urlparse.urljoin(base, '../g') == 'http://a/b/g'
assert urlparse.urljoin(base, '../..') == 'http://a/'
assert urlparse.urljoin(base, '../../') == 'http://a/'
assert urlparse.urljoin(base, '../../g') == 'http://a/g'
assert urlparse.urljoin(base, '') == base
assert urlparse.urljoin(base, '../../../g') == 'http://a/../g'
assert urlparse.urljoin(base, '../../../../g') == 'http://a/../../g'
assert urlparse.urljoin(base, '/./g') == 'http://a/./g'
assert urlparse.urljoin(base, '/../g') == 'http://a/../g'
assert urlparse.urljoin(base, 'g.') == 'http://a/b/c/g.'
assert urlparse.urljoin(base, '.g') == 'http://a/b/c/.g'
assert urlparse.urljoin(base, 'g..') == 'http://a/b/c/g..'
assert urlparse.urljoin(base, '..g') == 'http://a/b/c/..g'
assert urlparse.urljoin(base, './../g') == 'http://a/b/g'
assert urlparse.urljoin(base, './g/.') == 'http://a/b/c/g/'
assert urlparse.urljoin(base, 'g/./h') == 'http://a/b/c/g/h'
assert urlparse.urljoin(base, 'g/../h') == 'http://a/b/c/h'
assert urlparse.urljoin(base, 'g;x=1/./y') == 'http://a/b/c/g;x=1/y'
assert urlparse.urljoin(base, 'g;x=1/../y') == 'http://a/b/c/y'
assert urlparse.urljoin(base, 'g?y/./x') == 'http://a/b/c/g?y/./x'
assert urlparse.urljoin(base, 'g?y/../x') == 'http://a/b/c/g?y/../x'
assert urlparse.urljoin(base, 'g#s/./x') == 'http://a/b/c/g#s/./x'
assert urlparse.urljoin(base, 'g#s/../x') == 'http://a/b/c/g#s/../x'
----------------------------------------------------------------------
Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-11-05 18:05
Message:
Logged In: YES
user_id=3066
This looks like it's probably related to #478038; I'll try to
tackle them together. Can you attach your tests to the bug
report on SF? Thanks!
----------------------------------------------------------------------