[Web-SIG] HTTP headers encoding
manlio_perillo at libero.it
Thu Dec 3 15:49:08 CET 2009
I'm doing some tests to try to understand how HTTP headers are encoded.
I have written a simple WSGI application that asks for authentication
credentials, prints them on the terminal, and returns the data in the
response as raw bytes.
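
A minimal sketch of such a test application follows. It is not the original
code: it assumes HTTP Basic authentication and Python 3, and the helper name
parse_basic_credentials and the realm "test" are made up for illustration.
With Basic authentication the credentials arrive base64-encoded, so decoding
them yields exactly the bytes the client chose for the user name.

```python
import base64


def parse_basic_credentials(header_value):
    """Return (username, password) as raw bytes, or None if unusable.

    Hypothetical helper: only the Basic scheme is handled.
    """
    scheme, _, payload = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(payload.strip())
    except ValueError:
        # Malformed base64 (binascii.Error is a ValueError subclass).
        return None
    username, sep, password = decoded.partition(b":")
    if not sep:
        return None
    return username, password


def application(environ, start_response):
    credentials = parse_basic_credentials(environ.get("HTTP_AUTHORIZATION", ""))
    if credentials is None:
        # No usable credentials yet: make the client prompt for them.
        start_response("401 Unauthorized", [
            ("WWW-Authenticate", 'Basic realm="test"'),
            ("Content-Type", "text/plain"),
        ])
        return [b"authentication required\n"]

    username, password = credentials
    # Show the undecoded bytes on the terminal, e.g. b'\x80'.
    print(repr(username), repr(password))

    # Echo the same raw bytes back without applying any charset.
    body = username + b":" + password
    start_response("200 OK", [
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

One way to run it locally is wsgiref.simple_server.make_server("", 8000,
application).serve_forever() from the standard library.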
Then I used some browsers to try to send a username with non-ASCII
characters.
When I try with simple characters in the iso-8859-1 charset, things
work well; the data is encoded using this charset.
However, when I try to use a character outside that charset, like the Euro
sign, there are problems.
Firefox (Iceweasel 3.0.14, Linux Debian Squeeze) sends me a
I don't know where \xac comes from, but it is the last byte of the UTF-8
encoded Euro: '\xe2\x82\xac'
Internet Explorer 6.0 sends me a
and this is the Euro character encoded using cp1252 (and I suspect
that it always uses this encoding, instead of iso-8859-1).
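
For comparison, here is a small sketch (Python 3 assumed; the helper name
encode_or_none is made up) that prints how the Euro sign encodes in a few
charsets. It also shows that 0xAC is the low byte of the code point U+20AC.
That might be another explanation for a lone \xac, but this is only a guess
and not something the tests confirm.

```python
EURO = "\u20ac"  # the Euro sign


def encode_or_none(text, charset):
    """Encode text with charset, or return None if it cannot be represented."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None


for charset in ("utf-8", "cp1252", "iso-8859-1"):
    print(charset, encode_or_none(EURO, charset))
# utf-8 b'\xe2\x82\xac'
# cp1252 b'\x80'
# iso-8859-1 None

# The code point truncated to its low 8 bits is also 0xAC.
print(hex(ord(EURO) & 0xFF))  # 0xac
```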
Unfortunately, I cannot test with IE 7 and 8.
With a browser working on a terminal, like lynx, things get worse.
If I enter the string "àè" as the user name, lynx sends me
This happens in a GNOME terminal, with an it_IT.utf8 locale.
wget and curl do the same.
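
To help compare results from different clients, here is a small sketch
(Python 3 assumed; matching_charsets is a made-up helper, and the candidate
list is only a starting point) that reports which charsets decode the
received bytes back to the string that was typed.

```python
def matching_charsets(raw, expected,
                      candidates=("utf-8", "cp1252", "iso-8859-1")):
    """Return the candidate charsets under which raw decodes to expected."""
    matches = []
    for charset in candidates:
        try:
            if raw.decode(charset) == expected:
                matches.append(charset)
        except UnicodeDecodeError:
            # The bytes are not valid in this charset; skip it.
            pass
    return matches


typed = "\u00e0\u00e8"  # the string "àè"

# Bytes as a UTF-8 terminal would produce them.
print(matching_charsets(typed.encode("utf-8"), typed))
# ['utf-8']

# Single-byte form: cp1252 and iso-8859-1 agree on these two characters.
print(matching_charsets(b"\xe0\xe8", typed))
# ['cp1252', 'iso-8859-1']
```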
Can someone else reproduce this?