[Web-SIG] Python 3.0 and WSGI 1.0.
James Y Knight
foom at fuhm.net
Thu Apr 2 20:09:19 CEST 2009
On Apr 2, 2009, at 1:40 PM, Tres Seaver wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> James Y Knight wrote:
>> On Apr 2, 2009, at 7:33 AM, Graham Dumpleton wrote:
>>> """When running under Python 3, servers MUST provide CGI HTTP
>>> variables as strings, decoded from the headers using HTTP standard
>>> encodings (i.e. latin-1 + RFC 2047)"""
>>> Which is fair enough and basically what the RFCs say. At the moment I
>>> don't apply RFC 2047 rules in Python 3.0 support in mod_wsgi, so
>>> need to do that.
>> I'd really *really* like to recommend that any mention of RFC 2047 is
>> stricken from the WSGI server requirements. I cannot imagine that
>> decoding actually accomplishing anything other than opening security
>> holes (think of a filter in an upstream proxy that doesn't know how to do
>> 2047-decoding passing something through that you now decode).
>> Also, you have to only do the decoding on TEXT words according to the
>> spec, so the WSGI container now needs an HTTP header parser just in
>> order to determine where it should decode RFC2047 words and where not
>> to? I don't think so...
> Couldn't the spec mandate that decoding RFC 2047 headers is the
> responsibility of the non-middleware WSGI server? I agree that
> middleware and applications shouldn't know or care about that.
> Under Python 2.x, the server would transcode those values to the
> "common" encoding used for all values in the WSGI environment; under
> Python 3.x, it would just decode them to unicode.
I think you're saying you agree with exactly the opposite of what I
meant. The server/gateway (aka apache mod_wsgi) *must not* be required
to handle RFC2047 decoding. Only the application (or a header parsing
library that the application uses) can possibly handle this properly.
That's why I think it should not be mentioned at all in the WSGI
requirements for the server.
Furthermore, although they certainly can if they want, I'd recommend
that no applications actually bother with doing such decoding, since
RFC2047 words in http headers are essentially never used.
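
As a rough illustration of that split (a sketch only, not something taken
from the WSGI spec): the server side does nothing beyond the latin-1
decode, and an application that really wants RFC2047 decoding runs it
itself, here with the standard library's email.header.decode_header. The
helper names server_decode and app_decode_rfc2047 are made up for this
example, and the sketch ignores the rule that only TEXT words may be
decoded, which is exactly the part that needs a real header parser.

```python
from email.header import decode_header


def server_decode(raw_value):
    """Hypothetical server-side step: bytes off the wire -> str, latin-1 only.

    No RFC 2047 handling happens here; encoded-words pass through verbatim.
    """
    return raw_value.decode("latin-1")


def app_decode_rfc2047(value):
    """Hypothetical application-side step: decode RFC 2047 encoded-words.

    Assumes the caller already knows this header is one where
    encoded-words are legal; this function does not check that.
    """
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Unencoded runs come back as bytes with charset None.
            parts.append(chunk.decode(charset or "latin-1", errors="replace"))
        else:
            # No encoded-words at all: decode_header returns the str as-is.
            parts.append(chunk)
    return "".join(parts)


# What the server would place in environ: the encoded-word is left untouched.
environ_value = server_decode(b"=?utf-8?q?caf=C3=A9?=")

# Only an application that opts in ever sees the decoded form.
decoded = app_decode_rfc2047(environ_value)
```

An application that never calls app_decode_rfc2047 just sees the literal
"=?utf-8?q?...?=" text, which is the behaviour I'm arguing the server
should be limited to.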