[Web-SIG] HTTP header canonicalization?
mnot at mnot.net
Mon Aug 23 00:10:40 CEST 2004
The only problem I'm aware of is Set-Cookie, which can have an unquoted
expires date in it; e.g.,
Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT
If you have two of these combined into a single comma-separated header,
the comma after the day (here, "Wednesday") makes parsing problematic.
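
To make the ambiguity concrete, here is a minimal Python sketch. It assumes
two Netscape-style cookies have been joined into one field value; the second
cookie and the naive_split helper are made up for illustration.

```python
# Two Netscape-style cookies joined into one comma-separated field value.
# The second cookie (PART_NUMBER) is an illustrative assumption.
combined = (
    "CUSTOMER=WILE_E_COYOTE; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT, "
    "PART_NUMBER=ROCKET_LAUNCHER_0001; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT"
)


def naive_split(value):
    """Split a combined field value on every comma (hypothetical helper)."""
    return [part.strip() for part in value.split(",")]


pieces = naive_split(combined)

# Two cookies went in, but the commas inside the expires dates yield four
# fragments, and the date fragments are no longer attached to their cookies.
for piece in pieces:
    print(repr(piece))
```

A parser that splits on commas cannot tell the separator between cookies
from the comma inside each date without knowing the cookie syntax.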
Note that this is only specified in the original Netscape cookie spec,
not the State Management RFC. See section 10.1.2 of  for
more discussion of this issue.
So, you *shouldn't* see these, especially since WSGI is about the
server side. All the same, I'll ask around to see how often they're
still seen in the wild.
It would also be interesting to hear from people working on WSGI
application frameworks to find out how many expect to set multiple
cookies with expires (as opposed to max-age) in at least one; it might
be best to simply disallow doing so, or to require quoting.
Regarding ordering of headers with different names: I don't think it
matters. Note that HTTP says
"""it is "good practice" to send general-header fields first, followed
by request-header or response-header fields, and ending with the
entity-header fields."""
This isn't very strict, though.
WRT header length limitations, most people start to get nervous when
headers get larger than 2048 characters; some proxies (esp. older ones)
did limit there, or even at 1024 characters.
Note that headers can be split into multiple lines as well as multiple
headers; e.g.,
  Example: foo, bar
is equivalent to
  Example: foo,
    bar
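
The equivalence between the one-line, folded, and repeated forms can be
checked with a rough Python sketch. The unfold and combine helpers are
hypothetical, and the sketch assumes a continuation line is any line that
starts with a space or tab.

```python
def unfold(raw_lines):
    """Join continuation lines (leading space or tab) onto the previous line."""
    unfolded = []
    for line in raw_lines:
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line.rstrip())
    return unfolded


def combine(raw_lines):
    """Merge repeated field names into one comma-separated value.

    Keys are lowercased because field names are case-insensitive.
    """
    merged = {}
    for line in unfold(raw_lines):
        name, _, value = line.partition(":")
        key = name.strip().lower()
        value = value.strip()
        if key in merged:
            merged[key] = merged[key] + ", " + value
        else:
            merged[key] = value
    return merged


one_line = ["Example: foo, bar"]
folded = ["Example: foo,", "    bar"]
repeated = ["Example: foo", "Example: bar"]

# All three spellings normalise to the same dictionary entry.
print(combine(one_line), combine(folded), combine(repeated))
```

This is only safe for headers whose values are defined as comma-separated
lists, which is why Set-Cookie with an unquoted expires date is the odd
one out.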
Overall, I think that modelling headers as a dictionary in the
application and passing them in that form to a server is a good thing,
as long as the Set-Cookie issue is kept in mind. Servers might have to
modify their serialisation on the wire to account for line lengths and
aesthetics (generally, the only time you run into line length problems
is when you're extending HTTP to do non-browsing things), but that
doesn't need to be exposed to the application.
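
As a sketch of what that could look like (not an API from any WSGI draft;
the serialise function, the header values, and the "example-gateway/0.1"
server string are all assumptions), the application hands over a dict and
the server decides how to write it on the wire:

```python
def serialise(headers):
    """Turn a header dict into wire lines (hypothetical server-side helper).

    Each value may be a single string or a list of strings.
    """
    lines = []
    for name, value in headers.items():
        values = value if isinstance(value, list) else [value]
        if name.lower() == "set-cookie":
            # One line per cookie, so an unquoted expires date is never
            # comma-joined with another cookie.
            lines.extend(f"{name}: {v}" for v in values)
        else:
            # Other multi-value headers can be joined with commas.
            lines.append(f"{name}: {', '.join(values)}")
    return lines


headers = {
    "Content-Type": "text/plain",
    "Vary": ["Accept-Encoding", "Accept-Language"],
    "Set-Cookie": [
        "a=1; expires=Wednesday, 09-Nov-99 23:12:40 GMT",
        "b=2; expires=Wednesday, 09-Nov-99 23:12:40 GMT",
    ],
}

# The server or gateway fills in a missing header without clobbering one
# the application already set.
headers.setdefault("Server", "example-gateway/0.1")

wire = serialise(headers)
print("\r\n".join(wire))
```

Line-length handling (folding long values) could live in the same place,
hidden from the application.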
On Aug 22, 2004, at 11:16 AM, Phillip J. Eby wrote:
> Does anybody see any issues with this? The upside is that it makes it
> easy for servers/gateways to add missing headers (using
> 'headerdict.setdefault()'), and it should also be easier for
> application/framework developers to build up their headers
> incrementally in the same way.
> The only downsides I see that could possibly come up are:
> * There's some reason to have headers with different names in a
> specific order, even though the spec is adamant that such an ordering
> is insignificant and not to be relied upon.
> * There's some reason to split multi-value headers into separate
> header lines, even though the spec is adamant that the forms are
> equivalent, and that HTTP has no limitations on line length.
> Does anybody know whether any HTTP clients in practice are affected by
> these matters?
Mark Nottingham http://www.mnot.net/