[Python-checkins] cpython (3.4): Issue #24657: Prevent CGIRequestHandler from collapsing the URL query

martin.panter python-checkins at python.org
Sat Oct 3 02:44:33 EDT 2015


https://hg.python.org/cpython/rev/634fe6a90e0c
changeset:   98510:634fe6a90e0c
branch:      3.4
user:        Martin Panter <vadmium+py at gmail.com>
date:        Sat Oct 03 05:55:46 2015 +0000
summary:
  Issue #24657: Prevent CGIRequestHandler from collapsing the URL query

Initial patch from Xiang Zhang. Also fix the out-of-date _url_collapse_path()
docstring.
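
To illustrate the effect of the change, here is a minimal standalone sketch,
not the code in Lib/http/server.py itself. The function names
collapse_segments() and collapse_keeping_query() are hypothetical; the logic
is modelled on the hunks in the diff below. It contrasts unquoting and
collapsing the whole request target (as the removed line in is_cgi did) with
splitting off the query first and collapsing only the path.

```python
import urllib.parse


def collapse_segments(path):
    """Collapse empty, '.' and '..' segments of a URL path.

    Hypothetical helper: it has no notion of a query component.
    """
    parts = path.split('/')
    head = []
    for part in parts[:-1]:
        if part == '..':
            head.pop()  # IndexError if '..' climbs above the root
        elif part and part != '.':
            head.append(part)
    tail = parts[-1]
    if tail == '..':
        head.pop()
        tail = ''
    elif tail == '.':
        tail = ''
    return '/' + '/'.join(head) + '/' + tail


def collapse_keeping_query(url):
    """Collapse only the path; pass the query through unchanged."""
    # Split at the first '?' so the query is never unquoted or collapsed.
    path, _, query = url.partition('?')
    collapsed = collapse_segments(urllib.parse.unquote(path))
    return collapsed + '?' + query if query else collapsed


# Request target taken from the new test case in this changeset.
url = '/cgi-bin/file4.py?k=aa%2F%2Fbb&//q//p//=//a//b//'

# Unquoting and collapsing the whole target mangles the query.
before = collapse_segments(urllib.parse.unquote(url))

# Collapsing only the path leaves the query intact.
after = collapse_keeping_query(url)

print(before)
print(after)
```

In this sketch, `before` loses the repeated slashes and the percent-encoding
in the query, while `after` is identical to the original request target.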

files:
  Lib/http/server.py           |  13 +++++++++----
  Lib/test/test_httpservers.py |   7 +++++++
  Misc/NEWS                    |   3 +++
  3 files changed, 19 insertions(+), 4 deletions(-)


diff --git a/Lib/http/server.py b/Lib/http/server.py
--- a/Lib/http/server.py
+++ b/Lib/http/server.py
@@ -887,13 +887,15 @@
     The utility of this function is limited to is_cgi method and helps
     preventing some security attacks.
 
-    Returns: A tuple of (head, tail) where tail is everything after the final /
-    and head is everything before it.  Head will always start with a '/' and,
-    if it contains anything else, never have a trailing '/'.
+    Returns: The reconstituted URL, which will always start with a '/'.
 
     Raises: IndexError if too many '..' occur within the path.
 
     """
+    # Query component should not be involved.
+    path, _, query = path.partition('?')
+    path = urllib.parse.unquote(path)
+
     # Similar to os.path.split(os.path.normpath(path)) but specific to URL
     # path semantics rather than local operating system semantics.
     path_parts = path.split('/')
@@ -914,6 +916,9 @@
     else:
         tail_part = ''
 
+    if query:
+        tail_part = '?'.join((tail_part, query))
+
     splitpath = ('/' + '/'.join(head_parts), tail_part)
     collapsed_path = "/".join(splitpath)
 
@@ -995,7 +1000,7 @@
         (and the next character is a '/' or the end of the string).
 
         """
-        collapsed_path = _url_collapse_path(urllib.parse.unquote(self.path))
+        collapsed_path = _url_collapse_path(self.path)
         dir_sep = collapsed_path.find('/', 1)
         head, tail = collapsed_path[:dir_sep], collapsed_path[dir_sep+1:]
         if head in self.cgi_directories:
diff --git a/Lib/test/test_httpservers.py b/Lib/test/test_httpservers.py
--- a/Lib/test/test_httpservers.py
+++ b/Lib/test/test_httpservers.py
@@ -565,6 +565,13 @@
             (b'a=b?c=d' + self.linesep, 'text/html', 200),
             (res.read(), res.getheader('Content-type'), res.status))
 
+    def test_query_with_continuous_slashes(self):
+        res = self.request('/cgi-bin/file4.py?k=aa%2F%2Fbb&//q//p//=//a//b//')
+        self.assertEqual(
+            (b'k=aa%2F%2Fbb&//q//p//=//a//b//' + self.linesep,
+             'text/html', 200),
+            (res.read(), res.getheader('Content-type'), res.status))
+
 
 class SocketlessRequestHandler(SimpleHTTPRequestHandler):
     def __init__(self):
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -93,6 +93,9 @@
 - Issue #25232: Fix CGIRequestHandler to split the query from the URL at the
   first question mark (?) rather than the last. Patch from Xiang Zhang.
 
+- Issue #24657: Prevent CGIRequestHandler from collapsing slashes in the
+  query part of the URL as if it were a path. Patch from Xiang Zhang.
+
 - Issue #22958: Constructor and update method of weakref.WeakValueDictionary
   now accept the self and the dict keyword arguments.
 

-- 
Repository URL: https://hg.python.org/cpython
