[Web-SIG] HTTP header canonicalization?

Mark Nottingham mnot at mnot.net
Mon Aug 23 00:10:40 CEST 2004

The only problem I'm aware of is Set-Cookie, which can have an unquoted 
expires date in it; e.g.,

   Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT

If you have two of these, the comma after the day (here, "Wednesday") 
makes parsing problematic.
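
To make the ambiguity concrete, here is a minimal Python sketch of what happens when two such headers are combined into one comma-separated value and then split again. The helper name naive_split and the second cookie are made up for illustration; they are not taken from either spec.

```python
def naive_split(combined):
    """Split a comma-joined header value back into its parts (naively)."""
    return [part.strip() for part in combined.split(",")]


# Two cookies, each with an unquoted Netscape-style expires date.
# The second cookie is a hypothetical example.
cookies = [
    "CUSTOMER=WILE_E_COYOTE; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT",
    "PART=ROCKET_LAUNCHER; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT",
]

combined = ", ".join(cookies)
parts = naive_split(combined)

# Two cookies went in, but the commas inside the dates produce four fragments.
print(len(parts))
```

A receiver cannot tell which commas separate cookies and which belong to a date without knowing the cookie syntax.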

Note that this is only specified in the original Netscape cookie spec 
[1], not the State Management RFC [2]. See section 10.1.2 of [2] for 
more discussion of this issue.

So, you *shouldn't* see these, especially since WSGI is about the 
server side. All the same, I'll ask around to see how often they're 
still seen in the wild.

It would also be interesting to hear from people working on WSGI 
application frameworks to find out how many expect to set multiple 
cookies with expires (as opposed to max-age) in at least one; it might 
be best to simply disallow doing so, or to require quoting.

Regarding the ordering of headers with different names: I don't think 
that is a problem. Note that HTTP says

"""it is "good practice" to send general-header fields first, followed 
by request-header or response-header fields, and ending with the 
entity-header fields."""

This isn't very strict, though.

WRT header length limitations, most people start to get nervous when 
headers get larger than 2048 characters; some proxies (esp. older ones) 
did limit them there, or even at 1024 characters.
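
A server that wants to be cautious could flag long headers before sending them. This is only a sketch: the name oversized, the constant, and the (name, value) tuple layout are assumptions for illustration, and the threshold is a rule of thumb rather than anything HTTP mandates.

```python
# Rule-of-thumb threshold from the paragraph above; HTTP itself sets no limit.
CAUTIOUS_LIMIT = 2048


def oversized(headers, limit=CAUTIOUS_LIMIT):
    """Return the names of headers whose 'Name: value' line exceeds limit."""
    return [name for name, value in headers
            if len(f"{name}: {value}") > limit]


# Hypothetical headers; X-Big is deliberately far too long.
headers = [("Content-Type", "text/html"), ("X-Big", "x" * 3000)]
flagged = oversized(headers)
print(flagged)
```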

Note that headers can be split into multiple lines as well as multiple 
instances; e.g.,

Example: foo, bar

is equivalent to

Example: foo
Example: bar

and to

Example: foo,
  bar
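
A rough Python sketch of how a receiver might normalise the single-line, repeated, and folded (continuation-line) forms to the same value. The function name parse_headers is hypothetical, and the sketch ignores the Set-Cookie problem described earlier.

```python
def parse_headers(raw):
    """Unfold continuation lines and merge repeated header names."""
    unfolded = []
    for line in raw.splitlines():
        if line[:1] in (" ", "\t") and unfolded:
            # A line starting with whitespace continues the previous header.
            unfolded[-1] += " " + line.strip()
        elif line:
            unfolded.append(line)

    merged = {}
    for line in unfolded:
        name, _, value = line.partition(":")
        name, value = name.strip().lower(), value.strip()
        # Repeated names are combined into one comma-separated value.
        merged[name] = merged[name] + ", " + value if name in merged else value
    return merged


single = parse_headers("Example: foo, bar")
repeated = parse_headers("Example: foo\nExample: bar")
folded = parse_headers("Example: foo,\n  bar")
print(single == repeated == folded)
```

All three inputs end up as the same mapping, which is what lets an application work with one value per header name.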

Overall, I think that modelling headers as a dictionary in the 
application and passing them in that form to a server is a good thing, 
as long as the Set-Cookie issue is kept in mind. Servers might have to 
modify their serialisation on the wire to account for line lengths and 
aesthetics (generally, the only time you run into line length problems 
is when you're extending HTTP to do non-browsing things), but that 
doesn't need to be exposed to the application.
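
As a sketch of how that split of responsibilities could look, assuming a hypothetical mapping of header names to lists of values (this layout and the serialise function are illustrations, not part of any WSGI draft): the application fills in the dictionary, the server adds defaults with setdefault(), and only the server decides how values appear on the wire.

```python
def serialise(headers):
    """Turn a {name: [values]} mapping into wire-format header lines."""
    lines = []
    for name, values in headers.items():
        if name.lower() == "set-cookie":
            # Never comma-join cookies: an unquoted expires date has a comma.
            lines.extend(f"{name}: {value}" for value in values)
        else:
            lines.append(f"{name}: {', '.join(values)}")
    return lines


# Headers as the application might build them (values are made up).
headers = {
    "Content-Type": ["text/html"],
    "Set-Cookie": [
        "a=1; expires=Wednesday, 09-Nov-99 23:12:40 GMT",
        "b=2; expires=Wednesday, 09-Nov-99 23:12:40 GMT",
    ],
}

# The server adds what is missing without overwriting the application's choice.
headers.setdefault("Server", ["example-server/0.1"])
headers.setdefault("Content-Type", ["text/plain"])

wire = serialise(headers)
print("\n".join(wire))
```

The Set-Cookie special case stays inside the server's serialiser, so the application never has to think about it.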


1. http://wp.netscape.com/newsref/std/cookie_spec.html
2. http://rfc2109.x42.com/

On Aug 22, 2004, at 11:16 AM, Phillip J. Eby wrote:

> Does anybody see any issues with this?  The upside is that it makes it 
> easy for servers/gateways to add missing headers (using 
> 'headerdict.setdefault()'), and it should also be easier for 
> application/framework developers to build up their headers 
> incrementally in the same way.
> The only downsides I see that could possibly come up are:
>  * There's some reason to have headers with different names in a 
> specific order, even though the spec is adamant that such an ordering 
> is insignificant and not to be relied upon.
>  * There's some reason to split multi-value headers into separate 
> header lines, even though the spec is adamant that the forms are 
> equivalent, and that HTTP has no limitations on line length.
> Does anybody know whether any HTTP clients in practice are affected by 
> these matters?

Mark Nottingham     http://www.mnot.net/
