[issue12455] urllib2 Request() forces capitalize() on header names, breaking some requests

Cal Leeming report at bugs.python.org
Thu Jun 30 22:17:21 CEST 2011


Cal Leeming <cal.leeming at simplicitymedialtd.co.uk> added the comment:

(Short answer: I found the cause and a suitable monkey patch. Below are the details of how I did it and the steps I took.)

-----

Okay, so I forked AbstractHTTPHandler() and patched do_request_(); at that point both request.headers and request.header_items() have the correct header name (Content-MD5).

So I tried this:
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
opener.addheaders = [("Content-TE5", 'test'), ]

However the headers came back capitalized, so the problem is happening somewhere after addheaders. 
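
For reference, a minimal sketch of what str.capitalize() does to mixed-case header names. It uses only built-in string methods; the header names are the ones mentioned in this report, and nothing here calls urllib2 itself.

```python
# str.capitalize() upper-cases the first character and lower-cases all the rest,
# so any capitals after the first letter are lost.
names = ["Content-MD5", "Content-TE5", "User-Agent"]
capitalized = [name.capitalize() for name in names]

for before, after in zip(names, capitalized):
    print("%-12s -> %s" % (before, after))
```

Anything that passes a header name through capitalize() will therefore send "Content-md5" rather than "Content-MD5".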


> grep -R "addheaders" *.py
urllib.py:        self.addheaders = [('User-Agent', self.version)]
urllib.py:        self.addheaders.append(args)
urllib.py:        for args in self.addheaders: h.putheader(*args)
urllib.py:            for args in self.addheaders: h.putheader(*args)
urllib2.py:        self.addheaders = [('User-agent', client_version)]
urllib2.py:        for name, value in self.parent.addheaders:

> grep -R "def putheader" *.py
httplib.py:    def putheader(self, header, value):
httplib.py:    def putheader(self, header, *values):

I also then found: http://stackoverflow.com/questions/3278418/testing-urllib2-application-http-responses-loaded-from-files

I then patched this:

            class HTTPConnection(httplib.HTTPConnection):
                def putheader(self, header, value):
                    print [header, value]

This in turn brought back:
['Content-Md5', 'nts0yj7AdzJALyNOxafDyA==']

Which means it's happening before putheader(). So I patched _send_request() on HTTPConnection(), and that also brought back 'Content-Md5'. Exception trace shows:

  File "/ddcms/dev/webapp/../webapp/sites/ma/management/commands/ddcms.py", line 147, in _send_request
    _res = opener.open(req)
  -- CORRECT --
  File "/usr/local/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  -- CORRECT --
  File "/usr/local/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  -- CORRECT --
  File "/usr/local/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  -- CORRECT --
  File "/ddcms/dev/webapp/../webapp/sites/ma/management/commands/ddcms.py", line 126, in http_open
    return self.do_open(HTTPConnection, req)
  -- CORRECT --
  File "/usr/local/lib/python2.6/urllib2.py", line 1142, in do_open
    h.request(req.get_method(), req.get_selector(), req.data, headers)
  -- INVALID --
  File "/usr/local/lib/python2.6/httplib.py", line 914, in request
    self._send_request(method, url, body, headers)
  File "/ddcms/dev/webapp/../webapp/sites/ma/management/commands/ddcms.py", line 122, in _send_request
    raise


The line that causes it?

                    headers = dict(
                        (name.title(), val) for name, val in headers.items())
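
To see the effect of that expression in isolation, here is a small sketch that applies the same dict construction to a sample headers dict. The header value is the one shown in the debug output above; the variable names are only illustrative.

```python
# Same construction as the line quoted above, applied to a sample dict.
headers = {"Content-MD5": "nts0yj7AdzJALyNOxafDyA=="}

# str.title() upper-cases the first letter of each run of letters and
# lower-cases the remaining letters in that run, so "MD5" becomes "Md5".
titled = dict((name.title(), val) for name, val in headers.items())

print(titled)
```

This matches the 'Content-Md5' seen in putheader(): the value is untouched, only the name is rewritten.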
                    
So it appears that title() also needs monkey patching. I patched it to use:


# Patch for case-sensitive headers (the API being called is not RFC compliant,
# and urllib2 gives no option to send header names unmodified)
class _str(str):
    # Return the header name unchanged instead of normalizing its case
    def capitalize(self):
        print "capitalize() bypassed: sending value: %s" % (self)
        return self

    def title(self):
        print "title() bypassed: sending value: %s" % (self)
        return self

# _md5_content is defined elsewhere in the calling code
_headers = {_str('Content-MD5'): _md5_content}

capitalize() bypassed: sending value: Content-MD5
title() bypassed: sending value: Content-MD5
send: 'POST /url/api HTTP/1.1\r\nContent-MD5: nts0yj7AdzJALyNOxafDyA==\r\n\r\n'
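
The bypass can also be checked without a network connection. The following is a sketch under stated assumptions: normalize_header_names() is a hypothetical stand-in for the capitalize() and title() calls traced above (it is not a urllib2 function), and CaseSensitiveHeader is an illustrative name for the same str-subclass trick.

```python
class CaseSensitiveHeader(str):
    """str subclass whose capitalize() and title() return the name unchanged."""

    def capitalize(self):
        return self

    def title(self):
        return self


def normalize_header_names(headers):
    # Hypothetical stand-in for the two normalization steps traced above:
    # first capitalize() on each name, then title() on each name.
    step1 = dict((name.capitalize(), val) for name, val in headers.items())
    return dict((name.title(), val) for name, val in step1.items())


plain = normalize_header_names({"Content-MD5": "test"})
patched = normalize_header_names({CaseSensitiveHeader("Content-MD5"): "test"})

print(sorted(plain))    # name rewritten by capitalize()/title()
print(sorted(patched))  # name preserved by the subclass
```

The trick relies on the library calling capitalize() and title() as methods on the key object; any code path that rebuilds the name as a plain str first would defeat it.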

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12455>
_______________________________________

