Incorrect 'Host' header when using urllib2 to access a server by its IPv6 link-local address

slmnhq salman.haq at gmail.com
Tue Feb 9 19:02:45 CET 2010


I have a snippet of Python code that makes an HTTP GET request to an
Apache web server (v2.2.3) using urllib2. The server responds with an
HTTP 400 error, presumably because of a malformed 'Host' header.

The snippet is quite simple: it builds a URL using the bracketed IPv6
literal syntax, then creates a Request object and calls urlopen.

import urllib2

def via_urllib2(addr="fe80::207:b8ff:fedc:636b", scope_id="eth4"):
    # host becomes "[fe80::207:b8ff:fedc:636b%eth4]:80"
    host = "[%s%%%s]:80" % (addr, scope_id)
    url = "http://" + host + "/system/status/"
    req = urllib2.Request(url)
    f = urllib2.urlopen(req)
    print f.read()

The urlopen() call raises: HTTPError: HTTP Error 400: Bad Request

The Apache error_log reports "Client sent malformed Host header".

Googling for "apache Client sent malformed Host header ipv6", I came
across the following bug:

https://issues.apache.org/bugzilla/show_bug.cgi?id=35122

The problem is that Apache does not handle the scope id in the host
header field very well and reports a 400 error. So I tried to override
that field by creating my own header in the above snippet:

   ...
   req.add_header('Host', "["+urllib.quote(addr)+"]")
   ...

Now the header is simply "Host: [fe80::207:b8ff:fedc:636b]" (notice
the lack of "%eth4"). However, this still results in the same error.
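
One thing that may be worth checking is the exact value the override
produces. With its default arguments, quote() percent-encodes ':' as
well, so the quoted address may not be the plain literal shown above.
The sketch below compares the two forms. plain_host_header is a
hypothetical helper of mine, not part of urllib2, and whether Apache
accepts the plain value is an assumption that has not been tested here.

```python
try:
    from urllib import quote          # Python 2, as used in the post
except ImportError:
    from urllib.parse import quote    # Python 3 equivalent


def quoted_host_header(addr):
    """Build the value the same way the override above does."""
    return "[" + quote(addr) + "]"


def plain_host_header(literal):
    """Hypothetical helper: drop any '%scope' suffix, keep the colons."""
    addr = literal.split("%", 1)[0]
    return "[" + addr + "]"


if __name__ == "__main__":
    addr = "fe80::207:b8ff:fedc:636b"
    print(quoted_host_header(addr))           # [fe80%3A%3A207%3Ab8ff%3Afedc%3A636b]
    print(plain_host_header(addr + "%eth4"))  # [fe80::207:b8ff:fedc:636b]
```

If the first form is what actually goes on the wire, building the
header by plain string concatenation, as in plain_host_header, would
be the next thing to try.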

I know this is not a problem with urllib2 per se, but I'm posting
here in the hope that some Python coder may be knowledgeable enough
about HTTP and IPv6 to provide an answer.

Thank you,

Salman


