[Web-SIG] Newline values in WSGI response header values.
sh at defuze.org
Thu Jun 12 10:58:09 CEST 2008
> 2008/6/12 Sylvain Hellegouarch <sh at defuze.org>:
>>> Can anyone confirm for me what the behaviour should be if someone
>>> includes a newline in the value of a WSGI response header?
>>> CGI specification would seem to disallow it and thus WSGI adapter
>>> should by rights possibly produce an error if user code does it.
>>> At the moment I know of no WSGI adapter implementation which validates
>>> whether a newline appears in the value of a WSGI response header. For
>>> many WSGI adapters this means that a header of:
>>> Key1: "Value1\r\nKey2: Value2"
>>> will actually translate into two separate headers being sent back to
>>> the client.
>>> For a header of:
>>> Key3: "Value3a\r\nValue3b"
>>> in a WSGI adapter which simply passes things through, the client would
>>> get an invalid header line, which in general it would ignore. If
>>> however this was generated when hosted with a CGI-WSGI adapter, for
>>> Apache at least, Apache would generate a 500 error itself due to
>>> detecting a header line of invalid format.
>>> Thus, is an embedded newline in value invalid? Would it be reasonable
>>> for a WSGI adapter to flag it as an error?
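To make the two examples above concrete, here is a minimal sketch of a pass-through serializer. The function name serialize_headers is hypothetical and does not come from any particular WSGI adapter; it only assumes the adapter joins each (name, value) pair with ": " and terminates it with CRLF, without inspecting the value.

```python
def serialize_headers(headers):
    """Naively join (name, value) pairs into a header block.

    Hypothetical pass-through behaviour: values are not inspected,
    so any CRLF inside a value ends up on the wire unchanged.
    """
    return "".join(f"{name}: {value}\r\n" for name, value in headers)


def wire_lines(headers):
    """Return the header lines as a client would see them."""
    return serialize_headers(headers).split("\r\n")[:-1]


# First example: the single WSGI header becomes two well-formed lines.
injected = wire_lines([("Key1", "Value1\r\nKey2: Value2")])

# Second example: the second line has no colon, so it is not a valid header.
broken = wire_lines([("Key3", "Value3a\r\nValue3b")])
```

In the first case a client cannot tell that Key2 was never set as a header of its own; in the second case the stray line is what a stricter front end such as Apache's CGI handling would reject.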
>> I might be reading the spec wrong but it doesn't seem to be forbidden by
>> RFC 2616.
>> Section 4.2 says:
>>> Any LWS that occurs between field-content MAY be replaced with a single
>> SP before interpreting the field value or forwarding the message
>> downstream.
>> Then a look at the definition of separators shows us that SP is a valid
>> separator.
>> Since section 2.1 says:
>>> Except where noted otherwise, linear white space (LWS) can be included
>> between any two adjacent words (token or quoted-string), and between
>> adjacent words and separators, without changing the interpretation of a
>> field.
>> It sounds to me that this is a valid construct but a WSGI adapter might
>> consider converting those CRLF into simple SP as said in 2.1 again:
>>> A recipient MAY replace any linear white space with a single SP before
>> interpreting the field value or forwarding the message downstream.
> A LWS is:
> LWS = [CRLF] 1*( SP | HT )
> I.e., not just a single CRLF, but a CRLF followed by a space or tab.
> Thus, one can't simply replace a bare CRLF with a space.
> Anyway, the wording of my question and reference to CGI was a bit
> wrong, as WSGI response headers are probably more governed by HTTP
> To clarify, what we really have is two cases, the first is return of a
> value with a valid LWS as specified by HTTP RFC.
> If the WSGI adapter is mapping direct to HTTP, then it can pass it
> straight through. If however the WSGI adapter is hosted on top of an
> interface with CGI-like semantics, then it should translate LWS to a
> single space
> as described.
> The second case is an embedded CRLF which isn't followed by space or
> tab and thus isn't a LWS. This is the case which causes problems, and I
> am asking whether it should be detected and flagged as an erroneous
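The two cases described above could be handled in an adapter roughly as follows. This is only a sketch under stated assumptions: the function name normalize_header_value is hypothetical, raising ValueError is one possible way to flag the error and is not something WSGI mandates, and only folds (a CRLF followed by spaces or tabs) are collapsed, leaving other whitespace untouched.

```python
import re

# A fold is the CRLF form of LWS = [CRLF] 1*( SP | HT ):
# a CRLF that is followed by at least one space or tab.
_FOLD = re.compile(r"\r\n[ \t]+")

# Any CR or LF still present once folds are collapsed is not part of a LWS.
_BARE_BREAK = re.compile(r"[\r\n]")


def normalize_header_value(value):
    """Collapse valid folds to a single SP and reject bare line breaks.

    Hypothetical helper: the ValueError is an assumption about how an
    adapter might choose to report the problem.
    """
    unfolded = _FOLD.sub(" ", value)
    if _BARE_BREAK.search(unfolded):
        raise ValueError("header value contains CR or LF outside a LWS fold")
    return unfolded
```

An adapter mapping directly to HTTP could skip the substitution and only perform the check, while one sitting on a CGI-like interface would send the unfolded value.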
You might want to take the question to the HTTP-BIS charter and follow up
on that issue: