[Python-bugs-list] urljoin() bug with odd no of '..' (PR#194)

DrMalte@ddd.de DrMalte@ddd.de
Sun, 30 Jan 2000 19:40:47 -0500 (EST)


Full_Name: Malte John
Version: 1.5.2 and 1.4
OS: Linux 
Submission from: router.ddd.de (212.105.193.65)


While playing with linbot I noticed some failed requests to 
'http://xxx.xxx.xx/../img/xxx.gif'
for a document in the root directory containing
<IMG SRC="../img/xxx.gif">.

The cause is in urlparse.urljoin():
urljoin() fails to remove an odd number of leading '../' segments from the path.

Demonstration:

from urlparse import urljoin

print urljoin( 'http://127.0.0.1/', '../imgs/logo.gif' )
# gives       'http://127.0.0.1/../imgs/logo.gif'
# should give 'http://127.0.0.1/imgs/logo.gif'

print urljoin( 'http://127.0.0.1/', '../../imgs/logo.gif' )
# gives 'http://127.0.0.1/imgs/logo.gif'
# works

# '../../../imgs/logo.gif' gives 'http://127.0.0.1/../imgs/logo.gif' and so on
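
The odd/even pattern in the demonstration above can be reproduced with a
small standalone model. This is only a sketch written in current Python
syntax, not the library source: it assumes that urljoin() resolves '..' by
repeatedly deleting a '..' together with the non-empty segment in front of
it. The names collapse_dotdot and model_join are made up for illustration.

```python
def collapse_dotdot(segments):
    """Repeatedly drop a '..' together with the non-empty segment before it."""
    segments = list(segments)
    changed = True
    while changed:
        changed = False
        # Only interior positions are examined, as assumed for the original loop.
        for i in range(1, len(segments) - 1):
            if segments[i] == '..' and segments[i - 1]:
                # Note: a preceding '..' also counts as "non-empty",
                # so two leading '..' cancel each other out.
                del segments[i - 1:i + 1]
                changed = True
                break
    return segments


def model_join(base_path, rel_path):
    """Join two paths the way the model assumes urljoin() joins them."""
    segments = base_path.split('/')[:-1] + rel_path.split('/')
    return '/'.join(collapse_dotdot(segments))


for n in (1, 2, 3, 4):
    print(n, model_join('/', '../' * n + 'imgs/logo.gif'))
```

Under this model, leading '..' segments are removed in pairs, because the
first '..' after the empty root segment is never matched on its own. An odd
count therefore always leaves one '..' behind.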

A patch for 1.5.2
(I'm not sure it works in general, but tests with linbot looked good):

*** /usr/local/lib/python1.5/urlparse.py        Sat Jun 26 19:11:59 1999
--- urlparse.py Mon Jan 31 01:31:45 2000
***************
*** 170,175 ****
--- 170,180 ----
                segments[-1] = ''
        elif len(segments) >= 2 and segments[-1] == '..':
                segments[-2:] = ['']
+
+       if segments[0] == '':
+               while segments[1] == '..':      # remove all leading '..'
+                       del segments[1]
+
        return urlunparse((scheme, netloc, joinfields(segments, '/'),
                           params, query, fragment))
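
The step the patch adds can also be tried in isolation. The following is a
minimal sketch in current Python syntax, not part of the patch itself;
strip_leading_dotdot is a hypothetical helper name, and the length check in
the loop is an extra guard that the patch does not contain.

```python
def strip_leading_dotdot(segments):
    """Remove every '..' that directly follows the empty root segment."""
    segments = list(segments)
    # A leading '' means the path is absolute (it starts with '/').
    if segments and segments[0] == '':
        # Assumption: guard the index so a one-element list cannot fail.
        while len(segments) > 1 and segments[1] == '..':
            del segments[1]
    return segments


print('/'.join(strip_leading_dotdot(['', '..', 'imgs', 'logo.gif'])))
print('/'.join(strip_leading_dotdot(['', '..', '..', '..', 'imgs', 'logo.gif'])))
```

Relative paths that do not start at the root (first segment not empty) are
left unchanged, which matches the condition used in the patch.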