[issue20719] test_robotparser failure on "SPARC Solaris 10 (cc%2C 64b) [SB] 3.x" buildbot

STINNER Victor report at bugs.python.org
Sat Feb 22 00:19:41 CET 2014


STINNER Victor added the comment:

> It looks like the new python.org web server configuration was just changed to no longer gzip robots.txt so the test is no longer failing for me.

When I check the HTTP headers of http://www.python.org/robots.txt with a small Python script that sends "GET /robots.txt HTTP/1.0" and "Host: www.python.org" (but no Accept-Encoding header), I still see "Content-Encoding: gzip".
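
A check of this kind can be sketched roughly as follows. This is not the exact script used for the comment above; the helper names (build_request, parse_headers, fetch_raw) are made up for illustration, and the server's response may differ from what was observed at the time.

```python
import socket


def build_request(host, path="/robots.txt"):
    """Build a bare HTTP/1.0 GET request: a Host header, no Accept-Encoding."""
    lines = ["GET %s HTTP/1.0" % path, "Host: %s" % host, "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_headers(raw):
    """Return (status_line, headers) from raw response bytes.

    Header names are lower-cased; the body after the blank line is ignored.
    """
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers


def fetch_raw(host, path="/robots.txt", port=80, timeout=10.0):
    """Send the request over a plain TCP socket and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Example use (needs network access):
#   status, headers = parse_headers(fetch_raw("www.python.org"))
#   print(status, headers.get("content-encoding"))
```

Using a raw socket instead of urllib keeps the request to exactly the two lines quoted above, so any Content-Encoding in the reply cannot be blamed on headers added by the client library.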

It looks like a bug in the HTTP server serving www.python.org, because my client didn't send "Accept-Encoding: gzip, deflate".

RFC 2616 (HTTP/1.1) says: "If no Accept-Encoding field is present in a request, the server MAY assume that the client will accept any content coding."
http://www.w3.org/Protocols/rfc2616/rfc2616.html

See also:

"HTTP/1.1 (unlike HTTP/1.0) carefully specifies the Accept-Encoding header, used by a client to indicate what content-codings it can handle, and which ones it prefers."
http://www8.org/w8-papers/5c-protocols/key/key.html

The best solution would be to implement #1508475: support gzip in urllib.
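
Until something like that exists, a caller can decompress the body by hand. The following is only a sketch of such a workaround, not the patch proposed in #1508475 and not how urllib behaves; decode_body and fetch_text are hypothetical helper names, and only the gzip coding is handled.

```python
import gzip
import urllib.request


def decode_body(body, content_encoding):
    """Decompress body if the server labelled it gzip, else return it unchanged."""
    if content_encoding and content_encoding.strip().lower() in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    return body


def fetch_text(url):
    """Fetch url with urllib and undo a gzip Content-Encoding if one was applied."""
    with urllib.request.urlopen(url) as response:
        body = response.read()
        encoding = response.headers.get("Content-Encoding")
    # Assumption for this sketch: the decoded payload is UTF-8 text.
    return decode_body(body, encoding).decode("utf-8", errors="replace")


# Example use (needs network access):
#   print(fetch_text("http://www.python.org/robots.txt"))
```

Checking the response header rather than sniffing the gzip magic bytes keeps the helper from corrupting a file that is legitimately served as a .gz download without any content coding.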

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20719>
_______________________________________

