[Python-Dev] Bug? http.client assumes iso-8859-1 encoding of HTTP headers

martin at v.loewis.de martin at v.loewis.de
Sat Jan 4 21:43:57 CET 2014


Quoting Chris Angelico <rosuav at gmail.com>:

> I'm pretty sure that server is in violation of the spec, so all bets
> are off as to what any other server will do. If you know you're
> dealing with this one server, you can probably hack around this, but I
> don't think it belongs in core code. Unless, of course, I'm completely
> wrong about the spec, or if there's a de facto spec that lots of
> servers follow, in which case maybe it would be worth doing.

It would be possible to support this better by using "ascii" with
"surrogateescape" when receiving the redirect, and using the same
for all URLs coming into http.client. This would implement a
best-effort strategy at preserving the bogus URL, and still maintain
the notion that URLs are text (with the other path being to also
allow bytes as URLs, and always parsing Location as bytes).

Regards,
Martin



More information about the Python-Dev mailing list