Bug in SimpleHTTPRequestHandler.send_head?
Hi all,

I'm new to this group and the Python language as such. I stumbled on it when I joined a project to build a rich network library for C++, which in turn uses Python and its CGI HTTP server implementation as part of its unit test suite.

We're having a little trouble when serving a text file containing Windows line endings (CRLF) -- the resulting content contains Unix line endings only (LF). This breaks our tests, because we can't verify that the body, as parsed by our HTTP client, is the same as the source file we're serving through the Python HTTP server.

I've isolated it to the SimpleHTTPRequestHandler.send_head method in SimpleHTTPServer.py:

--
ctype = self.guess_type(path)
if ctype.startswith('text/'):
    mode = 'r'
else:
    mode = 'rb'
try:
    f = open(path, mode)
except IOError:
    self.send_error(404, "File not found")
    return None
--

The f object is returned from this method, and used with shutil.copyfileobj to copy the contents to the output stream.

This is easily fixed by omitting the content-type check entirely, and blindly using mode 'rb', and I think that makes sense, because the server should not be concerned with the contents of the body, so treating it as a binary stream seems right.

This also fixes another issue, where the actual body size differs from what's specified in the Content-Length header, because CR characters are stripped when the body is served, but Content-Length contains the source file's binary size.
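The size mismatch described above can be reproduced in a few lines. This is a minimal sketch, not the server's code: it assumes Python 3, where text mode translates CRLF to LF on every platform (the report itself concerns Python 2 text mode on Windows), and the file name `crlf.txt` and the helper `compare_modes` are made up for illustration.

```python
import os
import tempfile


def compare_modes(path):
    """Return (on-disk size, bytes read in binary mode, chars read in text mode)."""
    with open(path, 'rb') as f:
        # send_head takes Content-Length from os.fstat on the open file.
        size_on_disk = os.fstat(f.fileno()).st_size
        binary_body = f.read()
    # Text mode with default newline handling turns each CRLF into LF.
    with open(path, 'r', encoding='ascii') as f:
        text_body = f.read()
    return size_on_disk, len(binary_body), len(text_body)


# Hypothetical sample file with two CRLF-terminated lines.
tmpdir = tempfile.mkdtemp()
sample = os.path.join(tmpdir, 'crlf.txt')
with open(sample, 'wb') as f:
    f.write(b'first line\r\nsecond line\r\n')

size_on_disk, binary_len, text_len = compare_modes(sample)
print(size_on_disk, binary_len, text_len)  # prints: 25 25 23
```

The two missing characters are the stripped CRs, one per line, which is the gap between the advertised Content-Length and the body that a text-mode read would deliver.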
I'm not sure which source control system you're using, so I won't try to provide a patch, but I believe the code should read:

--
if os.path.isdir(path):
    if not self.path.endswith('/'):
        # redirect browser - doing basically what apache does
        self.send_response(301)
        self.send_header("Location", self.path + "/")
        self.end_headers()
        return None
    for index in "index.html", "index.htm":
        index = os.path.join(path, index)
        if os.path.exists(index):
            path = index
            break
    else:
        return self.list_directory(path)
#patch: removed content-type check
try:
    f = open(path, 'rb') #patch: always open in binary mode
except IOError:
    self.send_error(404, "File not found")
    return None
self.send_response(200)
self.send_header("Content-type", self.guess_type(path)) #patch: content-type check here instead
fs = os.fstat(f.fileno())
--

My changes marked with "#patch[...]".

Grateful for any comments!

Best wishes,
- Kim
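The effect of the proposed change can be checked outside the server with a short sketch. It is an illustration under assumptions, not the actual patch: `open_for_response` is a hypothetical helper, it uses `mimetypes.guess_type` in place of the handler's own `guess_type` method, and it assumes Python 3.

```python
import mimetypes
import os
import shutil
import tempfile
from io import BytesIO


def open_for_response(path):
    """Open path in binary mode; return (file, content type, content length).

    Hypothetical helper mirroring the patched logic: the content type is
    used only for the header, never to choose the open() mode.
    """
    f = open(path, 'rb')  # always binary, so no newline translation
    ctype = mimetypes.guess_type(path)[0] or 'application/octet-stream'
    length = os.fstat(f.fileno()).st_size
    return f, ctype, length


# Hypothetical CRLF text file standing in for the served document.
tmpdir = tempfile.mkdtemp()
sample = os.path.join(tmpdir, 'crlf.txt')
original = b'alpha\r\nbeta\r\n'
with open(sample, 'wb') as f:
    f.write(original)

f, ctype, length = open_for_response(sample)
body = BytesIO()  # stands in for the socket's output stream
with f:
    shutil.copyfileobj(f, body)

print(ctype, length, body.getvalue() == original)
```

With the file opened in binary mode, the copied body is byte-for-byte identical to the source file and its length agrees with the value that would go into Content-Length.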
Hello Kim,

Thanks for your post. The source code control used for Python is Subversion.

Patches submitted to this list will unfortunately get lost. Please post the bug report along with your comments and patch to the Python bug tracker:

http://bugs.python.org/

Michael Foord

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
At 1:19 PM +0100 9/5/08, Michael Foord wrote:
Hello Kim,
Thanks for your post. The source code control used for Python is Subversion.
Patches submitted to this list will unfortunately get lost. Please post the bug report along with your comments and patch to the Python bug tracker:
Patches are usually applied with patch, using the output of diff -u. bugs.python.org links to the Python wiki under Help : Tracker Documentation, and searching the wiki can turn up some info on bug submission, but I don't see any step-by-step instructions for newbies.

If you're not yet confident that this is really a bug or don't want to wrestle with the bug tracker just now, you might get more discussion on the newsgroup comp.lang.python. The subject should probably not say "bug", or you might only get suggestions to submit a bug; use something like "Should SimpleHTTPRequestHandler.send_head() change text line endings?", or whatever you think might provoke discussion.

FWIW, Python 2.6 and 3.0 are near release, so any accepted patch would at the earliest go into the version after those: 2.7 or 3.1. Patches often languish and need a champion to push them through. Helping review other patches or bugs is one way to contribute.

--
TonyN.:' <mailto:tonynelson@georgeanelson.com>
      ' <http://www.georgeanelson.com/>
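For readers unfamiliar with the diff -u format mentioned above, the following sketch generates a unified diff from Python with the standard `difflib` module. It only illustrates the format: the file labels and the two one-line snippets are illustrative assumptions, and a real patch would be produced against the actual source checkout.

```python
import difflib

# Illustrative before/after snippets, not the full source file.
before = ["f = open(path, mode)\n"]
after = ["f = open(path, 'rb')\n"]

patch_lines = list(difflib.unified_diff(
    before,
    after,
    fromfile='SimpleHTTPServer.py.orig',  # assumed label
    tofile='SimpleHTTPServer.py',         # assumed label
))
patch_text = ''.join(patch_lines)
print(patch_text)
```

Lines starting with `-` are removed and lines starting with `+` are added; this is the format the patch tool consumes.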
participants (3): Kim Gräsman, Michael Foord, Tony Nelson