[Web-SIG] FW: Closing #63: RFC2047 encoded words
James Y Knight
foom at fuhm.net
Wed Apr 8 20:14:10 CEST 2009
On Apr 8, 2009, at 12:57 PM, Robert Brewer wrote:
> Yes, but parsers need to continue decoding them for many years to come.
> IMO WSGI origin servers should do this so we can write the decoding
> logic once and forget about it (assuming middleware and apps far
> outnumber origin servers).
Decoding RFC 2047 encoded words is rather trivial compared to correctly
parsing all the HTTP headers. Plus, as I said before, you can't even
*do* the RFC 2047 decoding without parsing the headers at the same time
to figure out which pieces need to be decoded! And furthermore, nobody
needs to "continue" decoding them for years to come, *because nobody
decodes them now*!
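To make the point concrete, here is a minimal sketch using Python's
standard `email.header.decode_header`. The decoding of a single encoded
word is the easy part; knowing *where* it may be applied is not. The
helper names and the `QUOTED_STRING` regex are hypothetical, and the
sketch assumes (following my reading of RFC 2616's TEXT rule) that
encoded words are only meaningful inside quoted-strings. A real parser
would also have to handle comments, per-header grammars, and malformed
input.

```python
import re
from email.header import decode_header

# Hypothetical, simplified pattern for an HTTP quoted-string:
# a double quote, then any run of non-quote chars or backslash escapes.
QUOTED_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')


def decode_words(text):
    """Decode any RFC 2047 encoded words found in `text`."""
    parts = []
    for chunk, charset in decode_header(text):
        if isinstance(chunk, bytes):
            # Unencoded fragments come back with charset None.
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


def decode_quoted_strings(header_value):
    """Apply decode_words() only inside quoted-strings.

    Finding the quoted-strings is already a (crude) parsing step;
    text outside them is passed through untouched.
    """
    return QUOTED_STRING.sub(
        lambda m: '"' + decode_words(m.group(1)) + '"', header_value
    )


raw = 'attachment; filename="=?iso-8859-1?q?p=F6stal?="'
decoded = decode_quoted_strings(raw)
```

Even this toy version has to locate the quoted-strings before it can
decode anything, which is the parsing work a server doing only a
"partial" decode would be skipping.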
WSGI is intentionally exposing a fairly low-level view of the world.
So my opinion is that the headers in the dict should be byte strings
and that anyone who wants decoded headers also probably really wants
(or ought to want!) parsed headers, and thus should be using an HTTP
header parsing library. That can expose values as unicode strings if
it wants to.
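As a rough illustration of that split, assuming the environ carried
header values as byte strings in the way suggested here, a parsing
library could look something like the sketch below. `parse_content_type`
is a hypothetical helper, not part of WSGI or any existing library, and
its naive split on ";" would mishandle a quoted value containing a
semicolon.

```python
def parse_content_type(raw):
    """Parse a raw Content-Type value (bytes) into (media_type, params).

    Both results are unicode strings; the caller never sees the bytes.
    """
    # Latin-1 maps every byte to a code point, so this cannot fail.
    text = raw.decode("iso-8859-1")
    media_type, _, rest = text.partition(";")
    params = {}
    for item in rest.split(";"):
        name, sep, value = item.partition("=")
        if not sep:
            continue  # skip empty or malformed parameters
        value = value.strip()
        # Strip the surrounding quotes of a quoted-string value.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        params[name.strip().lower()] = value
    return media_type.strip().lower(), params


# Assumed shape of the low-level view: raw bytes in the environ dict.
environ = {"CONTENT_TYPE": b'Text/HTML; charset="UTF-8"'}
media_type, params = parse_content_type(environ["CONTENT_TYPE"])
```

The server stays dumb and hands over bytes; the library decides what
each header means and which pieces become text.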
If you want to start a discussion about having a standard parsed-
header object in WSGI, that's another thing, but saying that WSGI
servers should *partially* decode the headers seems rather silly to me.