Thanks for the reply. I just sent another mail in the thread.
"Glyph" == Glyph Lefkowitz email@example.com writes:
Glyph> Well, I know this isn't terribly helpful, but "a bug in getPage" is
Glyph> really the only thing that comes to mind. Or, some
Glyph> legal-but-unusual behavior in getPage which triggers a bug on the
Glyph> EC2 side of things.
The error arose from a combination of things (signing a string that included a host:port but then only sending a host in the Host header). Turns out you can resolve it either way - using a port in both, or omitting the port from both.
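To make the mismatch concrete, here is a minimal sketch of the fix, assuming an HMAC-signed request in which the host value is part of the string being signed. The string-to-sign layout, the `sign` and `build_request` names, and the query string are all made up for illustration; the real service defines its own format. The only point is that the host value is computed once and used for both the signature and the Host header.

```python
import base64
import hashlib
import hmac


def sign(secret, method, host_value, path, query):
    # Hypothetical string-to-sign layout, for illustration only.
    string_to_sign = "\n".join([method, host_value.lower(), path, query])
    digest = hmac.new(secret, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def build_request(secret, host, port, include_port):
    # Decide once whether the port is included, then reuse the same
    # value for the signature and for the Host header.
    host_value = "%s:%d" % (host, port) if include_port else host
    signature = sign(secret, "GET", host_value, "/", "Action=DescribeInstances")
    headers = {"Host": host_value}
    return headers, signature
```

Either choice of `include_port` works, as long as the signed string and the header agree.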
BTW, in reading about the Host header, it seems like getPage (more specifically HTTPPageGetter) should be sending a port number in the header, at least when the port is not 80. I base that remark on these:
That's a 1.1 spec as you surely know, and http.py sends an HTTP/1.0 header, so you could argue that sending the Host is therefore just a nicety and there's no need for a port. But the Host header isn't described in the HTTP 1.0 RFC, so it seems more like if you're going to send it you may as well conform to HTTP 1.1.
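As a sketch of the rule I'm arguing for (not what HTTPPageGetter currently does), the Host value would carry the port only when it differs from the scheme's default. The helper name and the `DEFAULT_PORTS` table below are my own, purely for illustration.

```python
# Default ports per scheme; an illustrative table, not taken from any library.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_header_value(scheme, host, port=None):
    # Omit the port when it is unspecified or is the scheme's default;
    # otherwise append it as host:port.
    if port is None or port == DEFAULT_PORTS.get(scheme):
        return host
    return "%s:%d" % (host, port)
```

With this, `host_header_value("http", "example.com", 8080)` gives `"example.com:8080"`, while port 80 gives just `"example.com"`.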
But I guess that argument is somehow incorrect. I say that because a comment in some other code I'm looking at, which uses httplib, says that prior to 2.6 httplib *used* to append a ":443" to SSL requests, but that it no longer does. I guess sending the port was dropped from httplib for good reason, and so HTTPPageGetter shouldn't add it. But I don't know.
I'm very far from being an expert on HTTP headers. Not as far as I'd like to be, though :-)
Thanks again for the reply.