[Python-Dev] Bug in SimpleHTTPRequestHandler.send_head?
Kim Gräsman
kim.grasman at gmail.com
Fri Sep 5 13:56:02 CEST 2008
Hi all,
I'm new to this group and to the Python language. I came across
Python when I joined a project to build a rich network library for C++,
which in turn uses Python and its CGI HTTP server implementation as
part of its unit test suite.
We're having a little trouble when serving a text file containing
Windows line endings (CRLF) -- the resulting content contains Unix
line endings only (LF). This breaks our tests, because we can't verify
that the body, as parsed by our HTTP client, is the same as the source
file we're serving through the Python HTTP server.
I've isolated it to the SimpleHTTPRequestHandler.send_head method in
SimpleHTTPServer.py:
--
ctype = self.guess_type(path)
if ctype.startswith('text/'):
    mode = 'r'
else:
    mode = 'rb'
try:
    f = open(path, mode)
except IOError:
    self.send_error(404, "File not found")
    return None
--
The f object is returned from this method, and used with
shutil.copyfileobj to copy the contents to the output stream.
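
To illustrate the effect, here is a minimal sketch, not the server code itself. It is written for Python 3, where text mode translates CRLF to LF on every platform (in the Python 2 of this report, mode 'r' does so on Windows). The helper name copy_file and the temporary file are made up for the demonstration.

```python
import io
import os
import shutil
import tempfile


def copy_file(path, mode):
    """Copy the file at `path` into an in-memory buffer, opened with `mode`.

    Hypothetical helper: it stands in for the open() + shutil.copyfileobj()
    pair used by the request handler.
    """
    with open(path, mode) as src:
        # A binary source needs a bytes buffer, a text source a str buffer.
        out = io.BytesIO() if 'b' in mode else io.StringIO()
        shutil.copyfileobj(src, out)
        return out.getvalue()


# Create a small file with Windows line endings (assumed sample content).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b"line one\r\nline two\r\n")

as_text = copy_file(demo_path, 'r')     # newline translation applied
as_binary = copy_file(demo_path, 'rb')  # bytes copied untouched
os.remove(demo_path)

print(repr(as_text))
print(repr(as_binary))
```

Only the 'rb' copy still contains the CR characters that were written to disk.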
This is easily fixed by omitting the content-type check entirely and
always using mode 'rb'. I think that makes sense: the server should
not be concerned with the contents of the body, so treating it as a
binary stream seems right.
This also fixes another issue, where the actual body size differs from
what's specified in the Content-Length header, because CR characters
are stripped when the body is served, but Content-Length contains the
source file's binary size.
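
The mismatch can be reproduced outside the server with a short sketch. It assumes Python 3 (where text mode drops the CR on every platform) and ASCII content, so that one character equals one byte. The function name declared_and_actual_length is hypothetical.

```python
import os
import tempfile


def declared_and_actual_length(path, mode):
    """Return (size reported by fstat, length of the data actually read).

    The first value is what the handler would put in Content-Length;
    the second is what would really be sent for the given open mode.
    """
    with open(path, mode) as f:
        declared = os.fstat(f.fileno()).st_size
        actual = len(f.read())
    return declared, actual


# Sample file with two CRLF-terminated lines (assumed content).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b"line one\r\nline two\r\n")

text_lengths = declared_and_actual_length(demo_path, 'r')
binary_lengths = declared_and_actual_length(demo_path, 'rb')
os.remove(demo_path)

print(text_lengths)
print(binary_lengths)
```

In text mode the body is two characters shorter than the declared size, one for each stripped CR; in binary mode the two numbers agree.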
I'm not sure which source control system you're using, so I won't try
to provide a patch, but I believe the code should read:
--
if os.path.isdir(path):
    if not self.path.endswith('/'):
        # redirect browser - doing basically what apache does
        self.send_response(301)
        self.send_header("Location", self.path + "/")
        self.end_headers()
        return None
    for index in "index.html", "index.htm":
        index = os.path.join(path, index)
        if os.path.exists(index):
            path = index
            break
    else:
        return self.list_directory(path)
#patch: removed content-type check
try:
    f = open(path, 'rb') #patch: always open in binary mode
except IOError:
    self.send_error(404, "File not found")
    return None
self.send_response(200)
self.send_header("Content-type", self.guess_type(path))
#patch: content-type check here instead
fs = os.fstat(f.fileno())
--
My changes are marked with "#patch[...]".
Grateful for any comments!
Best wishes,
- Kim